feat: add bandwidth estimation and DAU display

- Automatic monthly bandwidth estimation based on concurrent users
- DAU (daily active users) estimate display (concurrent users × 10-14)
- Automatic Linode/Vultr selection logic based on estimated bandwidth
- Bandwidth cost included in the cost analysis
- Seoul/Tokyo/Osaka/Singapore shown by default when no region is selected
- Servers displayed separately per region (GROUP BY instance + region)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author: kappa
Date: 2026-01-25 09:40:36 +09:00
Commit: 4cb9da06dc
3337 changed files with 1048645 additions and 0 deletions

DEPLOYMENT_VERIFICATION.md (new file, 232 lines)
# Deployment Verification - VPSBenchmarks Scraper Upgrade
## Deployment Status: ✅ SUCCESSFUL
**Deployment Time**: 2026-01-24T01:11:30Z
**Version ID**: 1fc24577-b50b-4f46-83ad-23d60f5fe7d3
**Production URL**: https://server-recommend.kappa-d8e.workers.dev
## Changes Deployed
### 1. Cloudflare Browser Rendering API Integration
- ✅ BROWSER binding configured in wrangler.toml
- ✅ BROWSER: Fetcher added to Env interface
- ✅ Successfully deployed and recognized by Workers
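For reference, the binding surfaces in Worker code roughly as follows. This is a minimal sketch of the Env interface only; the type declarations are stand-ins for the real @cloudflare/workers-types definitions, and only the binding names (DB, BROWSER, AI) come from this deployment:

```typescript
// Minimal stand-in types; in the real project these come from
// @cloudflare/workers-types.
interface D1Database { prepare(query: string): unknown; }
interface Fetcher { fetch(url: string): Promise<unknown>; }
interface Ai { run(model: string, inputs: Record<string, unknown>): Promise<unknown>; }

// Env interface with the new BROWSER binding alongside the existing ones.
export interface Env {
  DB: D1Database;   // cloud-instances-db
  BROWSER: Fetcher; // Browser Rendering API binding
  AI: Ai;           // Workers AI binding
}
```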
### 2. Scraper Rewrite
- ✅ Browser Rendering API integration complete
- ✅ Multiple parsing strategies implemented
- ✅ Error handling and fallback logic in place
- ✅ Database deduplication logic updated
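The multi-strategy approach can be sketched as a function that tries a primary parser and falls back to a secondary one before giving up. This is an illustrative stand-in for `extractBenchmarksFromHTML()`, not the deployed code; the regexes and the `data-benchmarks` attribute are hypothetical, since the real selectors target vpsbenchmarks.com markup:

```typescript
interface Benchmark {
  provider: string;
  plan: string;
}

// Hypothetical two-strategy extractor with a safe empty fallback.
function extractBenchmarksFromHTML(html: string): Benchmark[] {
  // Strategy 1: table rows with provider/plan cells.
  const rows = [...html.matchAll(/<tr[^>]*>\s*<td>([^<]+)<\/td>\s*<td>([^<]+)<\/td>/g)];
  if (rows.length > 0) {
    return rows.map(m => ({ provider: m[1].trim(), plan: m[2].trim() }));
  }
  // Strategy 2 (fallback): an embedded JSON payload in a data attribute.
  const json = html.match(/data-benchmarks="([^"]+)"/);
  if (json) {
    try {
      return JSON.parse(json[1].replace(/&quot;/g, '"')) as Benchmark[];
    } catch {
      // malformed payload: fall through to the empty result
    }
  }
  return []; // nothing matched: the caller logs 0 benchmarks and skips DB writes
}
```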
### 3. Database Schema
- ✅ Unique constraint for deduplication ready
- ✅ ON CONFLICT clause aligned with schema
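The deduplicating upsert looks roughly like the statement below. This is a sketch: the unique index on `(provider_name, plan_name)` is an assumption, and only the column names that appear in the queries under "Database Verification" are taken from this document:

```typescript
// Assumed unique index: (provider_name, plan_name).
const UPSERT_SQL = `
  INSERT INTO vps_benchmarks (provider_name, plan_name, geekbench_single, geekbench_multi)
  VALUES (?1, ?2, ?3, ?4)
  ON CONFLICT (provider_name, plan_name) DO UPDATE SET
    geekbench_single = excluded.geekbench_single,
    geekbench_multi  = excluded.geekbench_multi
`;
// In the Worker this would run via D1's prepared-statement API, e.g.:
//   await env.DB.prepare(UPSERT_SQL).bind(b.provider, b.plan, b.gbSingle, b.gbMulti).run();
```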
### 4. Cron Trigger
- ✅ Daily schedule: 0 9 * * * (9:00 AM UTC)
- ✅ Auto-trigger configured and deployed
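As a sanity check on the schedule, the next firing of `0 9 * * *` is just the next 09:00 UTC strictly after a given instant. A small helper for that arithmetic (pure date math, not part of the deployed Worker):

```typescript
// Next occurrence of hourUTC:00 strictly after `from`, mirroring the
// daily 0 9 * * * cron expression.
function nextDailyRunUTC(from: Date, hourUTC = 9): Date {
  const next = new Date(Date.UTC(
    from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate(), hourUTC, 0, 0,
  ));
  if (next.getTime() <= from.getTime()) next.setUTCDate(next.getUTCDate() + 1);
  return next;
}
```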
## Bindings Verified
```
env.DB (cloud-instances-db) D1 Database ✅
env.BROWSER Browser ✅
env.AI AI ✅
```
## Next Automatic Scrape
**Next Run**: Tomorrow at 9:00 AM UTC (2026-01-25 09:00:00 UTC)
## Manual Testing
### Option 1: Wait for Cron Trigger
The scraper will automatically run daily at 9:00 AM UTC.
### Option 2: Test Locally
```bash
# Terminal 1: Start dev server
npm run dev
# Terminal 2: Trigger scraper manually
curl "http://localhost:8793/__scheduled?cron=0+9+*+*+*"
```
### Option 3: Monitor Logs
```bash
# Watch production logs in real-time
npx wrangler tail
# Wait for next cron trigger to see output
```
## Expected Log Output
When the scraper runs (locally or in production), you should see:
```
[Scraper] Starting VPSBenchmarks.com scrape with Browser Rendering API
[Scraper] Fetching rendered HTML from vpsbenchmarks.com
[Scraper] Rendered HTML length: XXXXX
[Scraper] Extracted X benchmarks from HTML
[Scraper] Found X benchmark entries
[DB] Inserted/Updated: Provider PlanName
[Scraper] Completed in XXXms: X inserted, X skipped, X errors
```
## Verification Checklist
- [x] Code deployed successfully
- [x] All bindings configured (DB, BROWSER, AI)
- [x] Health check endpoint responding
- [x] Cron trigger scheduled
- [ ] First scrape run (waiting for next cron or manual test)
- [ ] Benchmarks extracted from vpsbenchmarks.com
- [ ] New records inserted into D1 database
## Database Verification
After the first scrape run, verify data:
```bash
# Check total benchmark count
npx wrangler d1 execute cloud-instances-db --command="SELECT COUNT(*) as total FROM vps_benchmarks"
# View latest benchmarks
npx wrangler d1 execute cloud-instances-db --command="SELECT provider_name, plan_name, geekbench_single, geekbench_multi, created_at FROM vps_benchmarks ORDER BY created_at DESC LIMIT 10"
```
## Browser Rendering API Usage
### Free Tier Limits
- **Quota**: 10 minutes per day
- **Current Usage**: 0 minutes (first run pending)
- **Estimated per Run**: < 1 minute
- **Daily Capacity**: ~10 runs (only 1 scheduled)
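The capacity figure above is simple quota arithmetic, using the conservative one-minute-per-run upper bound:

```typescript
const quotaMinutesPerDay = 10; // Browser Rendering free-tier quota
const minutesPerRun = 1;       // conservative upper bound per scrape
const maxRunsPerDay = Math.floor(quotaMinutesPerDay / minutesPerRun);
// One scheduled run per day therefore uses roughly a tenth of the quota.
```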
### Monitor Usage
Check Cloudflare dashboard for Browser Rendering API usage metrics.
## Troubleshooting
### If No Benchmarks Found
1. **Check Logs**
```bash
npx wrangler tail
```
2. **Inspect Rendered HTML**
- Look for "Rendered HTML length" in logs
- If length is very small (< 10KB), page may not be loading
- If length is large but no data extracted, selectors may need adjustment
3. **Test Locally**
- Run `npm run dev`
- Trigger manually with curl
- Debug with detailed logs
4. **Update Selectors**
- Visit https://www.vpsbenchmarks.com/
- Inspect actual HTML structure
- Update CSS selectors in `scrapeBenchmarksWithScrapeAPI()`
- Adjust parsing patterns in `extractBenchmarksFromHTML()`
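The heuristic in step 2 can be expressed as a small triage function, run on the rendered HTML before (or alongside) the parsers. The 10 KB threshold is the rule of thumb above, not a hard limit, and the function itself is a debugging sketch rather than deployed code:

```typescript
// Classify a render before digging into selectors:
// tiny payload → page likely blocked or not loading;
// large payload but zero extractions → selectors are stale.
function classifyRender(html: string, extractedCount: number): string {
  if (html.length < 10_000) return "page-not-loading";
  if (extractedCount === 0) return "selectors-stale";
  return "ok";
}
```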
### If Browser Rendering API Errors
1. **Check Quota**
- Verify not exceeding 10 minutes/day
- Check Cloudflare dashboard
2. **Check Network**
- Ensure Browser Rendering API is accessible
- Check for any Cloudflare service issues
3. **Adjust Timeout**
- Current: 30 seconds
- Increase if pages load slowly
- Decrease to fail faster if needed
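The timeout behavior amounts to racing the fetch against a timer. A generic sketch, with the work abstracted to any promise so the behavior is easy to exercise (in the Worker this would wrap the Browser Rendering fetch, and 30 000 ms is the current value mentioned above):

```typescript
// Reject with a timeout error if `work` does not settle within timeoutMs.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
    work.then(
      v => { clearTimeout(timer); resolve(v); },
      e => { clearTimeout(timer); reject(e); },
    );
  });
}
```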
### If Database Errors
1. **Verify Schema**
```bash
npx wrangler d1 execute cloud-instances-db --command="PRAGMA table_info(vps_benchmarks)"
```
2. **Check Unique Constraint**
```bash
npx wrangler d1 execute cloud-instances-db --command="SELECT * FROM sqlite_master WHERE type='index' AND tbl_name='vps_benchmarks'"
```
3. **Test Insertion**
- Check logs for specific SQL errors
- Verify field types match schema
- Ensure no NULL in NOT NULL columns
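Step 3's NOT NULL check can be done before the insert ever reaches D1. A sketch of such a guard; the required-column list here is an assumption based on the fields queried elsewhere in this document, not the actual schema:

```typescript
// Columns assumed NOT NULL (illustrative, verify against PRAGMA table_info).
const REQUIRED = ["provider_name", "plan_name"] as const;

// Return the names of required columns that are null/undefined in a record,
// so the caller can log and skip it instead of hitting a SQL error.
function missingRequiredFields(record: Record<string, unknown>): string[] {
  return REQUIRED.filter(col => record[col] === null || record[col] === undefined);
}
```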
## Performance Monitoring
### Key Metrics
1. **Scraper Execution Time**
- Target: < 60 seconds per run
- Check the "Completed in XXXms" log message
2. **Benchmarks Found**
- Current database: 269 records
- Expected: Gradual growth from daily scrapes
- If consistently 0, parsing logic needs adjustment
3. **Error Rate**
- Parse errors: Structure changes on source site
- API errors: Quota or connectivity issues
- DB errors: Schema mismatches
4. **Browser Rendering Usage**
- Free tier: 10 min/day
- Expected: < 1 min/day
- Monitor Cloudflare dashboard
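To check runs against the 60-second target automatically, the completion log line can be parsed for its duration. A tiny parser matching the format shown under "Expected Log Output":

```typescript
// Extract the millisecond duration from a "[Scraper] Completed in XXXms: ..."
// log line; returns null for lines that do not match.
function completionMs(logLine: string): number | null {
  const m = logLine.match(/Completed in (\d+)ms/);
  return m ? Number(m[1]) : null;
}
```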
## Success Criteria
✅ Deployment successful
✅ All bindings configured
✅ Cron trigger active
⏳ First scrape pending (manual test or wait for cron)
⏳ Data extraction verified
⏳ Database updated with new records
## Next Steps
1. **Wait for Next Cron** (2026-01-25 09:00 UTC)
2. **Monitor Logs** (`npx wrangler tail`)
3. **Verify Database** (check for new records)
4. **Adjust if Needed** (update selectors based on actual results)
## Rollback Instructions
If critical issues occur:
```bash
# 1. Checkout previous version
git checkout <previous-commit>
# 2. Redeploy
npm run deploy
# 3. Verify
curl https://server-recommend.kappa-d8e.workers.dev/api/health
```
## Contact & Support
- **Project**: /Users/kaffa/server-recommend
- **Logs**: `npx wrangler tail`
- **Database**: `npx wrangler d1 execute cloud-instances-db`
- **Documentation**: See UPGRADE_SUMMARY.md and test-scraper.md
---
**Status**: ✅ Deployed and Ready
**Next Action**: Monitor first cron run or test manually