# Deployment Verification - VPSBenchmarks Scraper Upgrade

Deployment Status: ✅ SUCCESSFUL

- Deployment Time: 2026-01-24T01:11:30Z
- Version ID: 1fc24577-b50b-4f46-83ad-23d60f5fe7d3
- Production URL: https://server-recommend.kappa-d8e.workers.dev
## Changes Deployed

### 1. Cloudflare Browser Rendering API Integration

- ✅ `BROWSER` binding configured in `wrangler.toml`
- ✅ `BROWSER: Fetcher` added to the `Env` interface
- ✅ Successfully deployed and recognized by Workers
### 2. Scraper Rewrite
- ✅ Browser Rendering API integration complete
- ✅ Multiple parsing strategies implemented
- ✅ Error handling and fallback logic in place
- ✅ Database deduplication logic updated
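To make the "multiple parsing strategies" point concrete, here is a minimal sketch of a strategy-fallback extractor. All names, the embedded-JSON pattern, and the table-row regex are illustrative assumptions, not the deployed code:

```typescript
interface Benchmark {
  provider: string;
  plan: string;
  geekbenchSingle: number;
}

// Strategy 1 (hypothetical): parse a JSON blob embedded in a <script> tag.
function parseJsonBlob(html: string): Benchmark[] {
  const m = html.match(/<script id="bench-data"[^>]*>([\s\S]*?)<\/script>/);
  if (!m) return [];
  try {
    const parsed = JSON.parse(m[1]);
    return Array.isArray(parsed) ? (parsed as Benchmark[]) : [];
  } catch {
    return [];
  }
}

// Strategy 2 (hypothetical): regex over simple table rows.
function parseTableRows(html: string): Benchmark[] {
  const rowRe = /<tr><td>(.*?)<\/td><td>(.*?)<\/td><td>(\d+)<\/td><\/tr>/g;
  const out: Benchmark[] = [];
  let m: RegExpExecArray | null;
  while ((m = rowRe.exec(html)) !== null) {
    out.push({ provider: m[1], plan: m[2], geekbenchSingle: Number(m[3]) });
  }
  return out;
}

// Try each strategy in order; return the first non-empty result.
function extractBenchmarks(html: string): Benchmark[] {
  for (const strategy of [parseJsonBlob, parseTableRows]) {
    const found = strategy(html);
    if (found.length > 0) return found;
  }
  return [];
}
```

The ordered-fallback shape is what matters: if the site ships structured data, prefer it; only fall back to brittle HTML parsing when it is missing.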
### 3. Database Schema

- ✅ Unique constraint for deduplication ready
- ✅ `ON CONFLICT` clause aligned with the schema
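For reference, a dedup-safe upsert of the kind described above might look like the following. The conflict target must match the table's unique index exactly; the column names and conflict columns here are assumptions inferred from the queries elsewhere in this document, not the real schema:

```typescript
// Builds an illustrative D1/SQLite upsert statement. On a duplicate
// (provider_name, plan_name) pair, the existing row is updated in place
// instead of a failed insert.
function buildUpsertSql(): string {
  return `INSERT INTO vps_benchmarks (provider_name, plan_name, geekbench_single, geekbench_multi)
VALUES (?1, ?2, ?3, ?4)
ON CONFLICT (provider_name, plan_name) DO UPDATE SET
  geekbench_single = excluded.geekbench_single,
  geekbench_multi = excluded.geekbench_multi`;
}
```

If the `ON CONFLICT` column list drifts from the actual unique index, SQLite rejects the statement, which is why the clause must stay aligned with the schema.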
### 4. Cron Trigger

- ✅ Daily schedule: `0 9 * * *` (9:00 AM UTC)
- ✅ Auto-trigger configured and deployed
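The handler wired to that trigger has roughly the following shape. This is a placeholder sketch, not the deployed implementation; the `Env` members mirror the bindings listed in this document, typed loosely so the sketch stays self-contained:

```typescript
// Bindings as loose types; in the real Worker these are D1Database,
// Fetcher (Browser Rendering), and the Workers AI binding.
interface Env {
  DB: unknown;
  BROWSER: unknown;
  AI: unknown;
}

const worker = {
  async scheduled(controller: { cron: string }, env: Env): Promise<string> {
    // Dispatching on the cron expression leaves room for extra schedules later.
    if (controller.cron === "0 9 * * *") {
      return "daily-scrape"; // the real handler would run the scraper here
    }
    return "ignored";
  },
};
```

Keying on `controller.cron` is the usual pattern when a Worker may later gain more than one trigger.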
## Bindings Verified

| Binding | Type | Status |
| --- | --- | --- |
| `env.DB` (cloud-instances-db) | D1 Database | ✅ |
| `env.BROWSER` | Browser | ✅ |
| `env.AI` | AI | ✅ |
## Next Automatic Scrape
Next Run: Tomorrow at 9:00 AM UTC (2026-01-25 09:00:00 UTC)
## Manual Testing

### Option 1: Wait for Cron Trigger
The scraper will automatically run daily at 9:00 AM UTC.
### Option 2: Test Locally

```shell
# Terminal 1: Start dev server
npm run dev

# Terminal 2: Trigger scraper manually
curl "http://localhost:8793/__scheduled?cron=0+9+*+*+*"
```
### Option 3: Monitor Logs

```shell
# Watch production logs in real-time
npx wrangler tail

# Wait for next cron trigger to see output
```
## Expected Log Output
When the scraper runs (locally or in production), you should see:
```
[Scraper] Starting VPSBenchmarks.com scrape with Browser Rendering API
[Scraper] Fetching rendered HTML from vpsbenchmarks.com
[Scraper] Rendered HTML length: XXXXX
[Scraper] Extracted X benchmarks from HTML
[Scraper] Found X benchmark entries
[DB] Inserted/Updated: Provider PlanName
[Scraper] Completed in XXXms: X inserted, X skipped, X errors
```
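The final summary line can be pinned down with a small formatter like the one below. This is illustrative; the deployed scraper's exact logging code may differ:

```typescript
// Produces the completion log line shown above from the run counters.
function formatSummary(ms: number, inserted: number, skipped: number, errors: number): string {
  return `[Scraper] Completed in ${ms}ms: ${inserted} inserted, ${skipped} skipped, ${errors} errors`;
}
```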
## Verification Checklist
- Code deployed successfully
- All bindings configured (DB, BROWSER, AI)
- Health check endpoint responding
- Cron trigger scheduled
- First scrape run (waiting for next cron or manual test)
- Benchmarks extracted from vpsbenchmarks.com
- New records inserted into D1 database
## Database Verification
After the first scrape run, verify data:
```shell
# Check total benchmark count
npx wrangler d1 execute cloud-instances-db --command="SELECT COUNT(*) as total FROM vps_benchmarks"

# View latest benchmarks
npx wrangler d1 execute cloud-instances-db --command="SELECT provider_name, plan_name, geekbench_single, geekbench_multi, created_at FROM vps_benchmarks ORDER BY created_at DESC LIMIT 10"
```
## Browser Rendering API Usage

### Free Tier Limits
- Quota: 10 minutes per day
- Current Usage: 0 minutes (first run pending)
- Estimated per Run: < 1 minute
- Daily Capacity: ~10 runs (only 1 scheduled)
### Monitor Usage
Check Cloudflare dashboard for Browser Rendering API usage metrics.
## Troubleshooting

### If No Benchmarks Found
1. Check Logs
   - Run `npx wrangler tail`

2. Inspect Rendered HTML
   - Look for "Rendered HTML length" in the logs
   - If the length is very small (< 10 KB), the page may not be loading
   - If the length is large but no data is extracted, the selectors may need adjustment

3. Test Locally
   - Run `npm run dev`
   - Trigger the scraper manually with curl
   - Debug with the detailed logs

4. Update Selectors
   - Visit https://www.vpsbenchmarks.com/
   - Inspect the actual HTML structure
   - Update the CSS selectors in `scrapeBenchmarksWithScrapeAPI()`
   - Adjust the parsing patterns in `extractBenchmarksFromHTML()`
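Those checks amount to a simple triage rule, which can be sketched as a helper. The 10 KB threshold is the rule of thumb from this checklist, not a hard limit, and the function name is hypothetical:

```typescript
type RenderHealth = "page-not-loading" | "selectors-stale" | "ok";

// Classify a scrape attempt by rendered-HTML length and extracted-row count.
function triageRender(htmlLength: number, extractedCount: number): RenderHealth {
  if (htmlLength < 10 * 1024) return "page-not-loading"; // page likely failed to render
  if (extractedCount === 0) return "selectors-stale";    // page loaded but parsing found nothing
  return "ok";
}
```

Logging the triage result alongside the raw numbers makes it obvious at a glance whether a failed run is a rendering problem or a parsing problem.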
### If Browser Rendering API Errors

1. Check Quota
   - Verify you are not exceeding 10 minutes/day
   - Check the Cloudflare dashboard

2. Check Network
   - Ensure the Browser Rendering API is accessible
   - Check for any Cloudflare service issues

3. Adjust Timeout
   - Current: 30 seconds
   - Increase it if pages load slowly
   - Decrease it to fail faster if needed
### If Database Errors

1. Verify Schema
   - Run `npx wrangler d1 execute cloud-instances-db --command="PRAGMA table_info(vps_benchmarks)"`

2. Check Unique Constraint
   - Run `npx wrangler d1 execute cloud-instances-db --command="SELECT * FROM sqlite_master WHERE type='index' AND tbl_name='vps_benchmarks'"`

3. Test Insertion
   - Check the logs for specific SQL errors
   - Verify that field types match the schema
   - Ensure no NULL values land in NOT NULL columns
## Performance Monitoring

### Key Metrics

1. Scraper Execution Time
   - Target: < 60 seconds per run
   - Check the "Completed in XXms" log message

2. Benchmarks Found
   - Current database: 269 records
   - Expected: gradual growth from daily scrapes
   - If the count is consistently 0, the parsing logic needs adjustment

3. Error Rate
   - Parse errors: structure changes on the source site
   - API errors: quota or connectivity issues
   - DB errors: schema mismatches

4. Browser Rendering Usage
   - Free tier: 10 min/day
   - Expected: < 1 min/day
   - Monitor via the Cloudflare dashboard
## Success Criteria

- ✅ Deployment successful
- ✅ All bindings configured
- ✅ Cron trigger active
- ⏳ First scrape pending (manual test or wait for cron)
- ⏳ Data extraction verified
- ⏳ Database updated with new records
## Next Steps

1. Wait for the next cron run (2026-01-25 09:00 UTC)
2. Monitor the logs (`npx wrangler tail`)
3. Verify the database (check for new records)
4. Adjust if needed (update selectors based on actual results)
## Rollback Instructions

If critical issues occur:

```shell
# 1. Checkout previous version
git checkout <previous-commit>

# 2. Redeploy
npm run deploy

# 3. Verify
curl https://server-recommend.kappa-d8e.workers.dev/api/health
```
## Contact & Support

- Project: /Users/kaffa/server-recommend
- Logs: `npx wrangler tail`
- Database: `npx wrangler d1 execute cloud-instances-db`
- Documentation: see UPGRADE_SUMMARY.md and test-scraper.md
Status: ✅ Deployed and Ready
Next Action: Monitor the first cron run or test manually