# Deployment Verification - VPSBenchmarks Scraper Upgrade
## Deployment Status: ✅ SUCCESSFUL

**Deployment Time**: 2026-01-24T01:11:30Z

**Version ID**: 1fc24577-b50b-4f46-83ad-23d60f5fe7d3

**Production URL**: https://server-recommend.kappa-d8e.workers.dev

## Changes Deployed
### 1. Cloudflare Browser Rendering API Integration

- ✅ BROWSER binding configured in wrangler.toml
- ✅ `BROWSER: Fetcher` added to the Env interface
- ✅ Successfully deployed and recognized by Workers
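For reference, the resulting Env shape might look like the sketch below. The binding names match the "Bindings Verified" list; the concrete types come from `@cloudflare/workers-types` in the real project and are stubbed here so the snippet stands alone.

```typescript
// Sketch of the Worker's Env after this deployment. The stub types are
// placeholders for the real D1Database/Fetcher/Ai types from
// @cloudflare/workers-types.
type D1Database = { prepare(query: string): unknown };
type Fetcher = { fetch(input: string): Promise<unknown> };
type Ai = { run(model: string, input: unknown): Promise<unknown> };

interface Env {
  DB: D1Database;   // cloud-instances-db
  BROWSER: Fetcher; // Browser Rendering binding
  AI: Ai;           // Workers AI binding
}
```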
### 2. Scraper Rewrite

- ✅ Browser Rendering API integration complete
- ✅ Multiple parsing strategies implemented
- ✅ Error handling and fallback logic in place
- ✅ Database deduplication logic updated

### 3. Database Schema

- ✅ Unique constraint for deduplication ready
- ✅ ON CONFLICT clause aligned with schema
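As a hedged illustration, a deduplicating D1 upsert could look like the statement below. The column names mirror the queries in "Database Verification" later in this document, but the `(provider_name, plan_name)` conflict target is an assumption, not confirmed from the actual schema.

```typescript
// Hypothetical dedup upsert for D1. The (provider_name, plan_name) conflict
// target is an assumed unique key; adjust to the real constraint if it differs.
const UPSERT_BENCHMARK = `
  INSERT INTO vps_benchmarks (provider_name, plan_name, geekbench_single, geekbench_multi)
  VALUES (?1, ?2, ?3, ?4)
  ON CONFLICT (provider_name, plan_name) DO UPDATE SET
    geekbench_single = excluded.geekbench_single,
    geekbench_multi  = excluded.geekbench_multi
`;
```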
### 4. Cron Trigger

- ✅ Daily schedule: `0 9 * * *` (9:00 AM UTC)
- ✅ Auto-trigger configured and deployed
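As a quick sanity check, the fixed `0 9 * * *` schedule can be modeled in a few lines of TypeScript (a simplified matcher for this one expression, not a general cron parser):

```typescript
// Returns true when a UTC timestamp falls on the "0 9 * * *" schedule:
// minute 0, hour 9, any day, month, or weekday.
function matchesDailySchedule(at: Date): boolean {
  return at.getUTCMinutes() === 0 && at.getUTCHours() === 9;
}
```

For example, `matchesDailySchedule(new Date(Date.UTC(2026, 0, 25, 9, 0)))` is `true`, matching the next scheduled run noted in "Next Automatic Scrape".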
## Bindings Verified

```
env.DB (cloud-instances-db)    D1 Database    ✅
env.BROWSER                    Browser        ✅
env.AI                         AI             ✅
```

## Next Automatic Scrape

**Next Run**: Tomorrow at 9:00 AM UTC (2026-01-25 09:00:00 UTC)
## Manual Testing

### Option 1: Wait for the Cron Trigger

The scraper will automatically run daily at 9:00 AM UTC.

### Option 2: Test Locally

```bash
# Terminal 1: start the dev server
npm run dev

# Terminal 2: trigger the scraper manually
curl "http://localhost:8793/__scheduled?cron=0+9+*+*+*"
```
### Option 3: Monitor Logs

```bash
# Watch production logs in real time
npx wrangler tail
# Then wait for the next cron trigger to see output
```

## Expected Log Output

When the scraper runs (locally or in production), you should see:
```
[Scraper] Starting VPSBenchmarks.com scrape with Browser Rendering API
[Scraper] Fetching rendered HTML from vpsbenchmarks.com
[Scraper] Rendered HTML length: XXXXX
[Scraper] Extracted X benchmarks from HTML
[Scraper] Found X benchmark entries
[DB] Inserted/Updated: Provider PlanName
[Scraper] Completed in XXXms: X inserted, X skipped, X errors
```
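The final summary line has a fixed shape, so a sketch of a formatter producing it looks like this (`ScrapeStats` and `formatCompletionLog` are illustrative names, not the project's actual code):

```typescript
// Builds the "[Scraper] Completed in ..." summary line shown above.
interface ScrapeStats {
  elapsedMs: number;
  inserted: number;
  skipped: number;
  errors: number;
}

function formatCompletionLog(s: ScrapeStats): string {
  return `[Scraper] Completed in ${s.elapsedMs}ms: ` +
    `${s.inserted} inserted, ${s.skipped} skipped, ${s.errors} errors`;
}
```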
## Verification Checklist

- [x] Code deployed successfully
- [x] All bindings configured (DB, BROWSER, AI)
- [x] Health check endpoint responding
- [x] Cron trigger scheduled
- [ ] First scrape run (waiting for next cron or manual test)
- [ ] Benchmarks extracted from vpsbenchmarks.com
- [ ] New records inserted into D1 database
## Database Verification

After the first scrape run, verify the data:

```bash
# Check the total benchmark count
npx wrangler d1 execute cloud-instances-db --command="SELECT COUNT(*) as total FROM vps_benchmarks"

# View the latest benchmarks
npx wrangler d1 execute cloud-instances-db --command="SELECT provider_name, plan_name, geekbench_single, geekbench_multi, created_at FROM vps_benchmarks ORDER BY created_at DESC LIMIT 10"
```
## Browser Rendering API Usage

### Free Tier Limits

- **Quota**: 10 minutes per day
- **Current Usage**: 0 minutes (first run pending)
- **Estimated per Run**: < 1 minute
- **Daily Capacity**: ~10 runs (only 1 scheduled)
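The daily-capacity figure is simple budget arithmetic; as a sketch:

```typescript
// Free-tier capacity estimate: quota minutes per day divided by the cost of
// one run. With 10 minutes/day and ~1 minute/run this gives the ~10 runs
// quoted above.
function estimatedDailyRuns(quotaMinutes: number, minutesPerRun: number): number {
  if (minutesPerRun <= 0) throw new Error("minutesPerRun must be positive");
  return Math.floor(quotaMinutes / minutesPerRun);
}
```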
### Monitor Usage

Check the Cloudflare dashboard for Browser Rendering API usage metrics.

## Troubleshooting

### If No Benchmarks Found

1. **Check Logs**
   ```bash
   npx wrangler tail
   ```
2. **Inspect Rendered HTML**
   - Look for "Rendered HTML length" in the logs
   - If the length is very small (< 10 KB), the page may not be loading
   - If the length is large but no data is extracted, the selectors may need adjustment

3. **Test Locally**
   - Run `npm run dev`
   - Trigger the scraper manually with curl
   - Debug with the detailed logs

4. **Update Selectors**
   - Visit https://www.vpsbenchmarks.com/
   - Inspect the actual HTML structure
   - Update the CSS selectors in `scrapeBenchmarksWithScrapeAPI()`
   - Adjust the parsing patterns in `extractBenchmarksFromHTML()`
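The size heuristics in step 2 can be condensed into a small triage helper (the function name and thresholds are illustrative, not taken from the project source):

```typescript
// Rough triage of a scrape attempt, mirroring the heuristics above: a tiny
// rendered document suggests the page never loaded; a full document with zero
// extracted entries points at stale selectors. The 10 KB cutoff is a
// heuristic, not a hard rule.
function triageScrapeResult(renderedHtml: string, extractedCount: number): string {
  if (renderedHtml.length < 10 * 1024) return "page may not be loading";
  if (extractedCount === 0) return "selectors may need adjustment";
  return "ok";
}
```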
### If Browser Rendering API Errors Occur

1. **Check Quota**
   - Verify you are not exceeding 10 minutes/day
   - Check the Cloudflare dashboard

2. **Check Network**
   - Ensure the Browser Rendering API is reachable
   - Check for any Cloudflare service issues

3. **Adjust Timeout**
   - Current: 30 seconds
   - Increase it if pages load slowly
   - Decrease it to fail faster if needed
### If Database Errors Occur

1. **Verify Schema**
   ```bash
   npx wrangler d1 execute cloud-instances-db --command="PRAGMA table_info(vps_benchmarks)"
   ```

2. **Check Unique Constraint**
   ```bash
   npx wrangler d1 execute cloud-instances-db --command="SELECT * FROM sqlite_master WHERE type='index' AND tbl_name='vps_benchmarks'"
   ```

3. **Test Insertion**
   - Check the logs for specific SQL errors
   - Verify that field types match the schema
   - Ensure no NULL in NOT NULL columns
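The NOT NULL check in step 3 can be performed before a row ever reaches D1. A minimal sketch, assuming `provider_name` and `plan_name` are the NOT NULL columns (an assumption, not confirmed from the schema):

```typescript
// Pre-insert guard: report which required fields are missing so a row can be
// skipped (and logged) instead of triggering a NOT NULL constraint error.
// The required-column list is an assumption about the schema.
interface BenchmarkRow {
  provider_name: string | null;
  plan_name: string | null;
  geekbench_single: number | null;
  geekbench_multi: number | null;
}

function missingRequiredFields(row: BenchmarkRow): string[] {
  const required: (keyof BenchmarkRow)[] = ["provider_name", "plan_name"];
  return required.filter((k) => row[k] === null || row[k] === undefined);
}
```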
## Performance Monitoring

### Key Metrics

1. **Scraper Execution Time**
   - Target: < 60 seconds per run
   - Check the "Completed in XXms" log message

2. **Benchmarks Found**
   - Current database: 269 records
   - Expected: gradual growth from daily scrapes
   - If consistently 0, the parsing logic needs adjustment

3. **Error Rate**
   - Parse errors: structure changes on the source site
   - API errors: quota or connectivity issues
   - DB errors: schema mismatches

4. **Browser Rendering Usage**
   - Free tier: 10 min/day
   - Expected: < 1 min/day
   - Monitor the Cloudflare dashboard
## Success Criteria

- ✅ Deployment successful
- ✅ All bindings configured
- ✅ Cron trigger active
- ⏳ First scrape pending (manual test or wait for cron)
- ⏳ Data extraction verified
- ⏳ Database updated with new records
## Next Steps

1. **Wait for the Next Cron Run** (2026-01-25 09:00 UTC)
2. **Monitor Logs** (`npx wrangler tail`)
3. **Verify the Database** (check for new records)
4. **Adjust if Needed** (update selectors based on actual results)
## Rollback Instructions

If critical issues occur:

```bash
# 1. Check out the previous version
git checkout <previous-commit>

# 2. Redeploy
npm run deploy

# 3. Verify
curl https://server-recommend.kappa-d8e.workers.dev/api/health
```
## Contact & Support

- **Project**: /Users/kaffa/server-recommend
- **Logs**: `npx wrangler tail`
- **Database**: `npx wrangler d1 execute cloud-instances-db`
- **Documentation**: see UPGRADE_SUMMARY.md and test-scraper.md

---

**Status**: ✅ Deployed and Ready

**Next Action**: Monitor the first cron run or test manually
|