feat: add bandwidth estimation and DAU display

- Automatic monthly bandwidth estimation based on concurrent users
- DAU (daily active users) estimate display (concurrent users × 10-14)
- Automatic Linode/Vultr selection logic based on estimated bandwidth
- Bandwidth cost included in the cost analysis
- Seoul/Tokyo/Osaka/Singapore shown by default when no region is selected
- Servers displayed separately per region (GROUP BY instance + region)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author: kappa
Date: 2026-01-25 09:40:36 +09:00
Commit: 4cb9da06dc
3337 changed files with 1048645 additions and 0 deletions

DEPLOYMENT_VERIFICATION.md (new file, 232 lines)
# Deployment Verification - VPSBenchmarks Scraper Upgrade
## Deployment Status: ✅ SUCCESSFUL
**Deployment Time**: 2026-01-24T01:11:30Z
**Version ID**: 1fc24577-b50b-4f46-83ad-23d60f5fe7d3
**Production URL**: https://server-recommend.kappa-d8e.workers.dev
## Changes Deployed
### 1. Cloudflare Browser Rendering API Integration
- ✅ BROWSER binding configured in wrangler.toml
- ✅ BROWSER: Fetcher added to Env interface
- ✅ Successfully deployed and recognized by Workers
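For reference, the binding surfaces in Worker code roughly as follows. This is a minimal sketch of the Env interface only; the type declarations are stand-ins for the real @cloudflare/workers-types definitions, and only the binding names (DB, BROWSER, AI) come from this deployment:

```typescript
// Minimal stand-in types; in the real project these come from
// @cloudflare/workers-types.
interface D1Database { prepare(query: string): unknown; }
interface Fetcher { fetch(url: string): Promise<unknown>; }
interface Ai { run(model: string, inputs: Record<string, unknown>): Promise<unknown>; }

// Env interface with the new BROWSER binding alongside the existing ones.
export interface Env {
  DB: D1Database;   // cloud-instances-db
  BROWSER: Fetcher; // Browser Rendering API binding
  AI: Ai;           // Workers AI binding
}
```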
### 2. Scraper Rewrite
- ✅ Browser Rendering API integration complete
- ✅ Multiple parsing strategies implemented
- ✅ Error handling and fallback logic in place
- ✅ Database deduplication logic updated
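The multi-strategy approach can be sketched as a function that tries a primary parser and falls back to a secondary one before giving up. This is an illustrative stand-in for `extractBenchmarksFromHTML()`, not the deployed code; the regexes and the `data-benchmarks` attribute are hypothetical, since the real selectors target vpsbenchmarks.com markup:

```typescript
interface Benchmark {
  provider: string;
  plan: string;
}

// Hypothetical two-strategy extractor with a safe empty fallback.
function extractBenchmarksFromHTML(html: string): Benchmark[] {
  // Strategy 1: table rows with provider/plan cells.
  const rows = [...html.matchAll(/<tr[^>]*>\s*<td>([^<]+)<\/td>\s*<td>([^<]+)<\/td>/g)];
  if (rows.length > 0) {
    return rows.map(m => ({ provider: m[1].trim(), plan: m[2].trim() }));
  }
  // Strategy 2 (fallback): an embedded JSON payload in a data attribute.
  const json = html.match(/data-benchmarks="([^"]+)"/);
  if (json) {
    try {
      return JSON.parse(json[1].replace(/&quot;/g, '"')) as Benchmark[];
    } catch {
      // malformed payload: fall through to the empty result
    }
  }
  return []; // nothing matched: the caller logs 0 benchmarks and skips DB writes
}
```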
### 3. Database Schema
- ✅ Unique constraint for deduplication ready
- ✅ ON CONFLICT clause aligned with schema
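The deduplicating upsert looks roughly like the statement below. This is a sketch: the unique index on `(provider_name, plan_name)` is an assumption, and only the column names that appear in the queries under "Database Verification" are taken from this document:

```typescript
// Assumed unique index: (provider_name, plan_name).
const UPSERT_SQL = `
  INSERT INTO vps_benchmarks (provider_name, plan_name, geekbench_single, geekbench_multi)
  VALUES (?1, ?2, ?3, ?4)
  ON CONFLICT (provider_name, plan_name) DO UPDATE SET
    geekbench_single = excluded.geekbench_single,
    geekbench_multi  = excluded.geekbench_multi
`;
// In the Worker this would run via D1's prepared-statement API, e.g.:
//   await env.DB.prepare(UPSERT_SQL).bind(b.provider, b.plan, b.gbSingle, b.gbMulti).run();
```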
### 4. Cron Trigger
- ✅ Daily schedule: 0 9 * * * (9:00 AM UTC)
- ✅ Auto-trigger configured and deployed
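As a sanity check on the schedule, the next firing of `0 9 * * *` is just the next 09:00 UTC strictly after a given instant. A small helper for that arithmetic (pure date math, not part of the deployed Worker):

```typescript
// Next occurrence of hourUTC:00 strictly after `from`, mirroring the
// daily 0 9 * * * cron expression.
function nextDailyRunUTC(from: Date, hourUTC = 9): Date {
  const next = new Date(Date.UTC(
    from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate(), hourUTC, 0, 0,
  ));
  if (next.getTime() <= from.getTime()) next.setUTCDate(next.getUTCDate() + 1);
  return next;
}
```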
## Bindings Verified
```
env.DB (cloud-instances-db) D1 Database ✅
env.BROWSER Browser ✅
env.AI AI ✅
```
## Next Automatic Scrape
**Next Run**: Tomorrow at 9:00 AM UTC (2026-01-25 09:00:00 UTC)
## Manual Testing
### Option 1: Wait for Cron Trigger
The scraper will automatically run daily at 9:00 AM UTC.
### Option 2: Test Locally
```bash
# Terminal 1: Start dev server
npm run dev
# Terminal 2: Trigger scraper manually
curl "http://localhost:8793/__scheduled?cron=0+9+*+*+*"
```
### Option 3: Monitor Logs
```bash
# Watch production logs in real-time
npx wrangler tail
# Wait for next cron trigger to see output
```
## Expected Log Output
When the scraper runs (locally or in production), you should see:
```
[Scraper] Starting VPSBenchmarks.com scrape with Browser Rendering API
[Scraper] Fetching rendered HTML from vpsbenchmarks.com
[Scraper] Rendered HTML length: XXXXX
[Scraper] Extracted X benchmarks from HTML
[Scraper] Found X benchmark entries
[DB] Inserted/Updated: Provider PlanName
[Scraper] Completed in XXXms: X inserted, X skipped, X errors
```
## Verification Checklist
- [x] Code deployed successfully
- [x] All bindings configured (DB, BROWSER, AI)
- [x] Health check endpoint responding
- [x] Cron trigger scheduled
- [ ] First scrape run (waiting for next cron or manual test)
- [ ] Benchmarks extracted from vpsbenchmarks.com
- [ ] New records inserted into D1 database
## Database Verification
After the first scrape run, verify data:
```bash
# Check total benchmark count
npx wrangler d1 execute cloud-instances-db --command="SELECT COUNT(*) as total FROM vps_benchmarks"
# View latest benchmarks
npx wrangler d1 execute cloud-instances-db --command="SELECT provider_name, plan_name, geekbench_single, geekbench_multi, created_at FROM vps_benchmarks ORDER BY created_at DESC LIMIT 10"
```
## Browser Rendering API Usage
### Free Tier Limits
- **Quota**: 10 minutes per day
- **Current Usage**: 0 minutes (first run pending)
- **Estimated per Run**: < 1 minute
- **Daily Capacity**: ~10 runs (only 1 scheduled)
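The capacity figure above is simple quota arithmetic, using the conservative one-minute-per-run upper bound:

```typescript
const quotaMinutesPerDay = 10; // Browser Rendering free-tier quota
const minutesPerRun = 1;       // conservative upper bound per scrape
const maxRunsPerDay = Math.floor(quotaMinutesPerDay / minutesPerRun);
// One scheduled run per day therefore uses roughly a tenth of the quota.
```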
### Monitor Usage
Check Cloudflare dashboard for Browser Rendering API usage metrics.
## Troubleshooting
### If No Benchmarks Found
1. **Check Logs**
```bash
npx wrangler tail
```
2. **Inspect Rendered HTML**
- Look for "Rendered HTML length" in logs
- If length is very small (< 10KB), page may not be loading
- If length is large but no data extracted, selectors may need adjustment
3. **Test Locally**
- Run `npm run dev`
- Trigger manually with curl
- Debug with detailed logs
4. **Update Selectors**
- Visit https://www.vpsbenchmarks.com/
- Inspect actual HTML structure
- Update CSS selectors in `scrapeBenchmarksWithScrapeAPI()`
- Adjust parsing patterns in `extractBenchmarksFromHTML()`
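The heuristic in step 2 can be expressed as a small triage function, run on the rendered HTML before (or alongside) the parsers. The 10 KB threshold is the rule of thumb above, not a hard limit, and the function itself is a debugging sketch rather than deployed code:

```typescript
// Classify a render before digging into selectors:
// tiny payload → page likely blocked or not loading;
// large payload but zero extractions → selectors are stale.
function classifyRender(html: string, extractedCount: number): string {
  if (html.length < 10_000) return "page-not-loading";
  if (extractedCount === 0) return "selectors-stale";
  return "ok";
}
```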
### If Browser Rendering API Errors
1. **Check Quota**
- Verify not exceeding 10 minutes/day
- Check Cloudflare dashboard
2. **Check Network**
- Ensure Browser Rendering API is accessible
- Check for any Cloudflare service issues
3. **Adjust Timeout**
- Current: 30 seconds
- Increase if pages load slowly
- Decrease to fail faster if needed
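The timeout behavior amounts to racing the fetch against a timer. A generic sketch, with the work abstracted to any promise so the behavior is easy to exercise (in the Worker this would wrap the Browser Rendering fetch, and 30 000 ms is the current value mentioned above):

```typescript
// Reject with a timeout error if `work` does not settle within timeoutMs.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
    work.then(
      v => { clearTimeout(timer); resolve(v); },
      e => { clearTimeout(timer); reject(e); },
    );
  });
}
```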
### If Database Errors
1. **Verify Schema**
```bash
npx wrangler d1 execute cloud-instances-db --command="PRAGMA table_info(vps_benchmarks)"
```
2. **Check Unique Constraint**
```bash
npx wrangler d1 execute cloud-instances-db --command="SELECT * FROM sqlite_master WHERE type='index' AND tbl_name='vps_benchmarks'"
```
3. **Test Insertion**
- Check logs for specific SQL errors
- Verify field types match schema
- Ensure no NULL in NOT NULL columns
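Step 3's NOT NULL check can be done before the insert ever reaches D1. A sketch of such a guard; the required-column list here is an assumption based on the fields queried elsewhere in this document, not the actual schema:

```typescript
// Columns assumed NOT NULL (illustrative, verify against PRAGMA table_info).
const REQUIRED = ["provider_name", "plan_name"] as const;

// Return the names of required columns that are null/undefined in a record,
// so the caller can log and skip it instead of hitting a SQL error.
function missingRequiredFields(record: Record<string, unknown>): string[] {
  return REQUIRED.filter(col => record[col] === null || record[col] === undefined);
}
```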
## Performance Monitoring
### Key Metrics
1. **Scraper Execution Time**
- Target: < 60 seconds per run
- Check the "Completed in XXXms" log message
2. **Benchmarks Found**
- Current database: 269 records
- Expected: Gradual growth from daily scrapes
- If consistently 0, parsing logic needs adjustment
3. **Error Rate**
- Parse errors: Structure changes on source site
- API errors: Quota or connectivity issues
- DB errors: Schema mismatches
4. **Browser Rendering Usage**
- Free tier: 10 min/day
- Expected: < 1 min/day
- Monitor Cloudflare dashboard
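To check runs against the 60-second target automatically, the completion log line can be parsed for its duration. A tiny parser matching the format shown under "Expected Log Output":

```typescript
// Extract the millisecond duration from a "[Scraper] Completed in XXXms: ..."
// log line; returns null for lines that do not match.
function completionMs(logLine: string): number | null {
  const m = logLine.match(/Completed in (\d+)ms/);
  return m ? Number(m[1]) : null;
}
```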
## Success Criteria
✅ Deployment successful
✅ All bindings configured
✅ Cron trigger active
⏳ First scrape pending (manual test or wait for cron)
⏳ Data extraction verified
⏳ Database updated with new records
## Next Steps
1. **Wait for Next Cron** (2026-01-25 09:00 UTC)
2. **Monitor Logs** (`npx wrangler tail`)
3. **Verify Database** (check for new records)
4. **Adjust if Needed** (update selectors based on actual results)
## Rollback Instructions
If critical issues occur:
```bash
# 1. Checkout previous version
git checkout <previous-commit>
# 2. Redeploy
npm run deploy
# 3. Verify
curl https://server-recommend.kappa-d8e.workers.dev/api/health
```
## Contact & Support
- **Project**: /Users/kaffa/server-recommend
- **Logs**: `npx wrangler tail`
- **Database**: `npx wrangler d1 execute cloud-instances-db`
- **Documentation**: See UPGRADE_SUMMARY.md and test-scraper.md
---
**Status**: ✅ Deployed and Ready
**Next Action**: Monitor first cron run or test manually