cloud-orchestrator/CLAUDE.md
kappa 411cde4801 feat: add region diversity, HTML report, and transfer pricing
Region Diversity:
- No region specified → same spec from 3 different regions
- Cache key now includes region_preference
- Fixed server_id to use ap.id (pricing) instead of ai.id (instance)

HTML Report:
- New /api/recommend/report endpoint for printable reports
- Supports multi-language (en, ko, ja, zh)
- Displays bandwidth_info with proper KRW formatting

Transfer Pricing:
- bandwidth_info includes overage costs from anvil_transfer_pricing
- available_regions shows alternative regions with prices

Code Quality:
- Extracted region-utils.ts for flexible region matching
- Cleaned up AI prompt (removed obsolete provider references)
- Renamed project to cloud-orchestrator

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 02:49:24 +09:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Cloudflare Worker-based AI server recommendation service. Uses OpenAI GPT-4o-mini (via AI Gateway), D1 database, KV cache, and VPS benchmark data to recommend cost-effective servers based on natural language requirements.

Production URL: https://cloud-orchestrator.kappa-d8e.workers.dev

Commands

# Development
npm run dev              # Start local development server (wrangler dev)
npm run deploy           # Deploy to Cloudflare Workers
npm run typecheck        # TypeScript type checking

# Database operations (D1)
npx wrangler d1 execute cloud-instances-db --file=schema.sql        # Apply schema
npx wrangler d1 execute cloud-instances-db --file=seed.sql          # Seed data
npx wrangler d1 execute cloud-instances-db --file=fix-tech-specs.sql # Update tech specs
npx wrangler d1 execute cloud-instances-db --command="SELECT ..."   # Ad-hoc queries

# View logs
npx wrangler tail

Architecture

src/
├── index.ts              # Main router, CORS, request handling
├── config.ts             # Configuration constants
├── types.ts              # TypeScript type definitions
├── utils.ts              # Utilities (bandwidth, response, AI, benchmarks, candidates, techSpecs)
├── region-utils.ts       # Region matching utilities (flexible region conditions)
└── handlers/
    ├── health.ts         # GET /api/health
    ├── servers.ts        # GET /api/servers - List servers with filtering
    ├── recommend.ts      # POST /api/recommend - AI-powered recommendations
    └── report.ts         # GET /api/recommend/report - HTML report generation

Key Data Flow

  1. User sends request (tech_stack, expected_users, use_case, region_preference)
  2. Tech specs calculation with DB workload multiplier based on use_case
  3. Candidate filtering with flexible region matching
  4. VPS benchmarks retrieval (Geekbench 6), prioritizing same provider
  5. AI analysis returns 3 tiers: Budget, Balanced, Premium
  6. Results cached in KV (5 min TTL, empty results not cached)
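
The caching step above can be sketched as follows. This is a minimal illustration, not the code in recommend.ts: the key layout, the `KVLike` interface, and the `buildCacheKey` / `cachedRecommend` names are assumptions. It shows region_preference taking part in the cache key and empty results being skipped.

```typescript
// Hypothetical request shape; field names follow the data flow above.
interface RecommendRequest {
  tech_stack: string[];
  expected_users: number;
  use_case: string;
  region_preference?: string[];
}

// Minimal stand-in for a KV namespace so the sketch is self-contained.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

const CACHE_TTL_SECONDS = 300; // 5 minutes

// Normalize and sort list fields so equivalent requests share one key.
function buildCacheKey(req: RecommendRequest): string {
  const norm = (xs: string[] = []): string =>
    xs.map((x) => x.trim().toLowerCase()).sort().join(",");
  return [
    "recommend",
    norm(req.tech_stack),
    String(req.expected_users),
    req.use_case.trim().toLowerCase(),
    norm(req.region_preference) || "any", // region is part of the key
  ].join("|");
}

async function cachedRecommend<T>(
  kv: KVLike,
  req: RecommendRequest,
  compute: () => Promise<T[]>,
): Promise<T[]> {
  const key = buildCacheKey(req);
  const hit = await kv.get(key);
  if (hit !== null) return JSON.parse(hit) as T[];
  const results = await compute();
  // Empty results are not cached, so a transient miss is retried next time.
  if (results.length > 0) {
    await kv.put(key, JSON.stringify(results), { expirationTtl: CACHE_TTL_SECONDS });
  }
  return results;
}
```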

D1 Database Tables (cloud-instances-db)

Primary tables (Anvil pricing):

  • anvil_instances - Anvil server specifications (vcpus, memory_gb, disk_gb, transfer_tb, etc.)
  • anvil_regions - Anvil data center regions (name, display_name, country_code)
  • anvil_pricing - Anvil pricing data (monthly_price in USD)
  • anvil_transfer_pricing - Transfer/bandwidth overage pricing by region (price_per_gb in USD)

Support tables:

  • tech_specs - Resource requirements per technology (vcpu_per_users, min_memory_mb)
  • vps_benchmarks - Geekbench 6 benchmark data (269 records)
  • benchmark_results / benchmark_types / processors - Phoronix benchmark data

Legacy tables (no longer used):

  • providers, instance_types, pricing, regions - Old Linode/Vultr data

Key Implementation Details

DB Workload Multiplier (recommend.ts)

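Database resource requirements vary by workload type, not just user count. The multiplier scales vcpu_per_users (users served per vCPU), so a lower multiplier yields more vCPUs: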

| Workload Type | Multiplier | Example Use Cases |
|---|---|---|
| Heavy | 0.3x | analytics, log server, reporting, dashboard |
| Medium-Heavy | 0.5x | e-commerce, ERP, CRM, community forum |
| Medium | 0.7x | API, SaaS, app backend |
| Light | 1.0x | blog, portfolio, documentation, wiki |

Example: PostgreSQL (vcpu_per_users: 200) with 1000 users

  • Analytics dashboard: 1000 / (200 × 0.3) ≈ 16.7, rounded up to 17 vCPU
  • E-commerce: 1000 / (200 × 0.5) = 10 vCPU
  • Personal blog: 1000 / (200 × 1.0) = 5 vCPU
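
The arithmetic above can be reproduced with a small helper. This is a sketch only: the `requiredDbVcpus` name, the workload keys, and the round-up-with-minimum-of-1 behavior are assumptions rather than the actual code in recommend.ts.

```typescript
type DbWorkload = "heavy" | "medium-heavy" | "medium" | "light";

// Multipliers from the workload list above.
const WORKLOAD_MULTIPLIER: Record<DbWorkload, number> = {
  heavy: 0.3,
  "medium-heavy": 0.5,
  medium: 0.7,
  light: 1.0,
};

// vcpuPerUsers is the number of users one vCPU can serve (tech_specs.vcpu_per_users).
function requiredDbVcpus(
  expectedUsers: number,
  vcpuPerUsers: number,
  workload: DbWorkload,
): number {
  const effectiveUsersPerVcpu = vcpuPerUsers * WORKLOAD_MULTIPLIER[workload];
  // Round up to whole vCPUs; never return less than 1 (assumed floor).
  return Math.max(1, Math.ceil(expectedUsers / effectiveUsersPerVcpu));
}

console.log(requiredDbVcpus(1000, 200, "heavy"));        // analytics dashboard
console.log(requiredDbVcpus(1000, 200, "medium-heavy")); // e-commerce
console.log(requiredDbVcpus(1000, 200, "light"));        // personal blog
```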

Bandwidth Estimation (utils.ts)

Estimates monthly bandwidth based on use_case patterns:

| Pattern | Page Size | Pages/Day | Active Ratio |
|---|---|---|---|
| E-commerce | 2.5MB | 20 | 40% |
| Streaming | 50MB | 5 | 20% |
| Analytics | 0.7MB | 30 | 50% |
| Blog/Content | 1.5MB | 4 | 30% |

Heavy bandwidth (>1TB/month) triggers a warning about overage costs.
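
One plausible way to turn these patterns into a monthly figure is sketched below. The formula (users × active ratio × pages/day × page size × 30 days), the decimal MB-to-TB conversion, and every name here are assumptions for illustration; the real estimator may weigh things differently.

```typescript
interface UsagePattern {
  pageSizeMb: number;
  pagesPerDay: number;
  activeRatio: number; // fraction of expected users active per day
}

// Values taken from the pattern list above; the keys are hypothetical.
const PATTERNS: Record<string, UsagePattern> = {
  ecommerce: { pageSizeMb: 2.5, pagesPerDay: 20, activeRatio: 0.4 },
  streaming: { pageSizeMb: 50, pagesPerDay: 5, activeRatio: 0.2 },
  analytics: { pageSizeMb: 0.7, pagesPerDay: 30, activeRatio: 0.5 },
  blog: { pageSizeMb: 1.5, pagesPerDay: 4, activeRatio: 0.3 },
};

const HEAVY_THRESHOLD_TB = 1;

// Assumed formula: daily traffic in MB times days, converted to decimal TB.
function estimateMonthlyTb(
  expectedUsers: number,
  pattern: UsagePattern,
  daysPerMonth: number = 30,
): number {
  const mbPerDay =
    expectedUsers * pattern.activeRatio * pattern.pagesPerDay * pattern.pageSizeMb;
  return (mbPerDay * daysPerMonth) / 1e6;
}

function isHeavyBandwidth(monthlyTb: number): boolean {
  return monthlyTb > HEAVY_THRESHOLD_TB;
}
```

Under these assumptions, 1000 e-commerce users come to roughly 0.6 TB/month, which stays below the warning threshold.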

Flexible Region Matching

Both /api/recommend and /api/servers use buildFlexibleRegionConditionsAnvil() for anvil_regions:

LOWER(ar.name) = ? OR
LOWER(ar.name) LIKE ? OR
LOWER(ar.display_name) LIKE ? OR
LOWER(ar.country_code) = ?

Valid inputs: "korea", "KR", "seoul", "tokyo", "japan", "ap-northeast-2", "icn"

Country names are auto-expanded via COUNTRY_NAME_TO_REGIONS mapping.
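
A rough sketch of how such conditions and their bound parameters might be assembled is shown below. It does not reproduce buildFlexibleRegionConditionsAnvil(), whose signature is not shown in this file; `buildRegionClauseSketch` and the sample mapping entries are hypothetical.

```typescript
// Illustrative entries only; the real COUNTRY_NAME_TO_REGIONS contents are not shown here.
const COUNTRY_TO_REGIONS_SKETCH: Record<string, string[]> = {
  korea: ["seoul"],
  japan: ["tokyo", "osaka"],
};

interface RegionClause {
  sql: string;
  params: string[];
}

const ONE_TERM_SQL =
  "(LOWER(ar.name) = ? OR LOWER(ar.name) LIKE ? OR " +
  "LOWER(ar.display_name) LIKE ? OR LOWER(ar.country_code) = ?)";

function buildRegionClauseSketch(inputs: string[]): RegionClause {
  // Collect lowercase terms, expanding country names into region names.
  const terms = new Set<string>();
  for (const raw of inputs) {
    const term = raw.trim().toLowerCase();
    if (term === "") continue;
    terms.add(term);
    for (const expanded of COUNTRY_TO_REGIONS_SKETCH[term] ?? []) {
      terms.add(expanded);
    }
  }

  const groups: string[] = [];
  const params: string[] = [];
  terms.forEach((term) => {
    groups.push(ONE_TERM_SQL);
    // Values are bound as parameters, never concatenated into the SQL text.
    params.push(term, `%${term}%`, `%${term}%`, term);
  });
  return { sql: groups.join(" OR "), params };
}
```

Each term contributes four placeholders, so the parameter array always has four times as many entries as there are expanded terms.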

Exchange Rate Handling (utils.ts)

  • Korean users (lang=ko) see prices in KRW
  • Exchange rate fetched from open.er-api.com with 1-hour KV cache
  • Fallback rate: 1450 KRW/USD if API unavailable
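
The fallback behavior can be illustrated with a pure helper that validates an exchange-rate payload before use. The `{ rates: { KRW: number } }` response shape and the function names are assumptions; fetching and KV caching are left out so the sketch stays self-contained.

```typescript
const FALLBACK_KRW_PER_USD = 1450;

// Returns the KRW rate from an API payload, or the fallback if the
// payload is missing, malformed, or carries a non-positive rate.
function extractKrwRate(payload: unknown): number {
  if (typeof payload === "object" && payload !== null) {
    const rates = (payload as { rates?: unknown }).rates;
    if (typeof rates === "object" && rates !== null) {
      const krw = (rates as Record<string, unknown>)["KRW"];
      if (typeof krw === "number" && Number.isFinite(krw) && krw > 0) {
        return krw;
      }
    }
  }
  return FALLBACK_KRW_PER_USD;
}

// Converts a USD amount to whole KRW with the given rate.
function usdToKrw(amountUsd: number, krwPerUsd: number): number {
  return Math.round(amountUsd * krwPerUsd);
}
```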

Transfer Pricing (bandwidth_info)

Each recommendation includes bandwidth_info with transfer/bandwidth cost details:

| Field | Description | KRW Rounding |
|---|---|---|
| included_transfer_tb | Free bandwidth included in plan (TB/month) | - |
| overage_cost_per_gb | Overage cost per GB | Nearest 1 KRW |
| overage_cost_per_tb | Overage cost per TB | Nearest 100 KRW |
| estimated_monthly_tb | Estimated monthly usage (TB) | - |
| estimated_overage_tb | Estimated overage (TB) | - |
| estimated_overage_cost | Estimated overage charges | Nearest 100 KRW |
| total_estimated_cost | Server + overage total | Nearest 100 KRW |
| currency | "USD" or "KRW" | - |

Data sources:

  • included_transfer_tb: From anvil_instances.transfer_tb
  • overage_cost_per_gb: From anvil_transfer_pricing.price_per_gb
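
How these fields might fit together is sketched below. The field names come from the list above; the `buildBandwidthInfo` helper, its inputs, and the 1 TB = 1000 GB conversion are assumptions, and USD values are left unrounded for simplicity.

```typescript
interface BandwidthInfo {
  included_transfer_tb: number;
  overage_cost_per_gb: number;
  overage_cost_per_tb: number;
  estimated_monthly_tb: number;
  estimated_overage_tb: number;
  estimated_overage_cost: number;
  total_estimated_cost: number;
  currency: "USD" | "KRW";
}

interface BandwidthInput {
  monthlyPriceUsd: number;    // anvil_pricing.monthly_price
  includedTransferTb: number; // anvil_instances.transfer_tb
  pricePerGbUsd: number;      // anvil_transfer_pricing.price_per_gb
  estimatedMonthlyTb: number;
  krwPerUsd?: number;         // set only for KRW output
}

function buildBandwidthInfo(input: BandwidthInput): BandwidthInfo {
  const overageTb = Math.max(0, input.estimatedMonthlyTb - input.includedTransferTb);
  const perTbUsd = input.pricePerGbUsd * 1000; // assumes 1 TB = 1000 GB
  const overageUsd = overageTb * perTbUsd;
  const totalUsd = input.monthlyPriceUsd + overageUsd;

  const rate = input.krwPerUsd;
  // KRW rounding: per-GB to the nearest 1 KRW, everything else to the nearest 100 KRW.
  const perGb = rate ? Math.round(input.pricePerGbUsd * rate) : input.pricePerGbUsd;
  const money = (usd: number): number =>
    rate ? Math.round((usd * rate) / 100) * 100 : usd;

  return {
    included_transfer_tb: input.includedTransferTb,
    overage_cost_per_gb: perGb,
    overage_cost_per_tb: money(perTbUsd),
    estimated_monthly_tb: input.estimatedMonthlyTb,
    estimated_overage_tb: overageTb,
    estimated_overage_cost: money(overageUsd),
    total_estimated_cost: money(totalUsd),
    currency: rate ? "KRW" : "USD",
  };
}
```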

HTML Report Endpoint (handlers/report.ts)

GET /api/recommend/report?data={base64}&lang={en|ko|ja|zh}

Generates printable/PDF-friendly HTML report from recommendation results.

Parameters:

  • data: Base64-encoded JSON of recommendation response
  • lang: Language (en, ko, ja, zh) - defaults to 'en'

Usage:

// Get recommendations
const result = await fetch('/api/recommend', {...});
const data = await result.json();

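// Generate report URL. encodeURIComponent keeps base64's '+', '/', and '=' intact in the query string.
// Note: btoa() throws on non-Latin-1 characters (e.g. Korean text); encode the JSON as UTF-8 bytes first if needed.
const reportUrl = `/api/recommend/report?data=${encodeURIComponent(btoa(JSON.stringify(data)))}&lang=ko`;
window.open(reportUrl); // Opens printable HTML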

Region-Based Recommendation Strategy (recommend.ts)

When region IS specified (e.g., region_preference: ["seoul"]):

  • Returns 3 spec tiers (Budget/Balanced/Premium) within that region
  • Example: Seoul 1 - Standard 4GB, Standard 8GB, Pro 16GB

When NO region specified:

  • Returns same/similar spec from 3 DIFFERENT regions for location comparison
  • Example: Standard 4GB from Osaka 2, Seoul 1, Singapore 1
  • Implemented by sending only 1 server per region to AI (forces diversity)
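
The one-server-per-region step can be sketched as below. Which server represents each region is not stated here, so picking the cheapest one is an assumption, as are the `Candidate` shape and function names.

```typescript
interface Candidate {
  server_id: number; // pricing ID (ap.id), unique per instance+region
  region: string;
  name: string;
  monthly_price: number;
}

// Keep a single candidate per region (assumed: the cheapest one).
function cheapestPerRegion(candidates: Candidate[]): Candidate[] {
  const best = new Map<string, Candidate>();
  for (const c of candidates) {
    const current = best.get(c.region);
    if (current === undefined || c.monthly_price < current.monthly_price) {
      best.set(c.region, c);
    }
  }
  return Array.from(best.values()).sort((a, b) => a.monthly_price - b.monthly_price);
}

// With a region preference, all candidates pass through so the AI can pick
// three spec tiers; without one, the list is collapsed to one per region.
function candidatesForAI(candidates: Candidate[], regionPreference?: string[]): Candidate[] {
  const regionSpecified = regionPreference !== undefined && regionPreference.length > 0;
  return regionSpecified ? candidates : cheapestPerRegion(candidates);
}
```

Because the AI only ever sees one entry per region in the no-preference case, any three picks it makes necessarily come from three different regions.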

AI Prompt Strategy (recommend.ts)

  • Uses OpenAI GPT-4o-mini via Cloudflare AI Gateway (bypasses regional restrictions)
  • Server list format: [server_id=XXXX] Provider Name... for accurate ID extraction
  • server_id uses ap.id (pricing ID, unique per instance+region combination)
  • Scoring: Cost efficiency (40%) + Capacity fit (30%) + Scalability (30%)
  • Capacity response in Korean for Korean users
  • Prompt injection protection: User inputs sanitized via sanitizeForAIPrompt()
  • Token optimization: Candidates limited to top 15 by price
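
The candidate cap and the bracketed ID convention can be illustrated as follows. Only the `[server_id=XXXX]` prefix comes from this file; the rest of each prompt line, the `PromptCandidate` shape, and the helper names are hypothetical.

```typescript
interface PromptCandidate {
  server_id: number;
  provider: string;
  name: string;
  monthly_price: number;
}

const MAX_AI_CANDIDATES = 15;

// Keep only the cheapest candidates to limit prompt size.
function topCandidatesByPrice(
  candidates: PromptCandidate[],
  limit: number = MAX_AI_CANDIDATES,
): PromptCandidate[] {
  return candidates
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.monthly_price - b.monthly_price)
    .slice(0, limit);
}

// Hypothetical line layout after the bracketed ID.
function formatServerLine(c: PromptCandidate): string {
  return `[server_id=${c.server_id}] ${c.provider} ${c.name}`;
}

// Pull every bracketed ID back out of free-form AI text.
function extractServerIds(text: string): number[] {
  const ids: number[] = [];
  const pattern = /\[server_id=(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    ids.push(Number(match[1]));
  }
  return ids;
}
```

An explicit bracketed token gives the response parser something unambiguous to match, instead of inferring the server from its display name.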

Security Features

  • Input validation: Comprehensive checks with length limits (tech_stack ≤20, use_case ≤500 chars)
  • Rate limiting: 60 req/min per IP with in-memory fallback when KV unavailable
  • SQL injection prevention: All queries use parameterized statements
  • Security headers: CSP, HSTS, X-Frame-Options, X-Content-Type-Options
  • API key protection: Keys never logged, sanitized from error messages
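
The in-memory fallback for rate limiting could look roughly like this. A fixed one-minute window per IP is an assumption (the actual windowing strategy is not described), and the class name is hypothetical.

```typescript
interface WindowState {
  start: number; // window start, in epoch milliseconds
  count: number; // requests seen in this window
}

class InMemoryRateLimiter {
  private windows: Map<string, WindowState> = new Map();
  private maxRequests: number;
  private windowMs: number;

  constructor(maxRequests: number = 60, windowMs: number = 60000) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed; `now` is injectable for testing.
  allow(ip: string, now: number = Date.now()): boolean {
    const state = this.windows.get(ip);
    if (state === undefined || now - state.start >= this.windowMs) {
      // First request from this IP, or the previous window has expired.
      this.windows.set(ip, { start: now, count: 1 });
      return true;
    }
    if (state.count >= this.maxRequests) return false;
    state.count += 1;
    return true;
  }
}
```

State held this way is per Worker isolate and is lost on restart, which is why it only makes sense as a fallback when KV is unavailable.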

Configuration (config.ts)

Centralized limits and constants:

LIMITS = {
  MAX_REQUEST_BODY_BYTES: 10240,    // 10KB
  CACHE_TTL_SECONDS: 300,            // 5 minutes
  RATE_LIMIT_MAX_REQUESTS: 60,       // per minute
  MAX_AI_CANDIDATES: 15,             // reduce API cost
  MAX_TECH_STACK: 20,
  MAX_USE_CASE_LENGTH: 500,
}
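
A sketch of how request validation might consume these limits is shown below. It treats MAX_TECH_STACK as a maximum item count, which is an assumption, and `validateRecommendBody` is a hypothetical name rather than the project's validator.

```typescript
// Subset of LIMITS needed for this sketch.
const VALIDATION_LIMITS = {
  MAX_TECH_STACK: 20,
  MAX_USE_CASE_LENGTH: 500,
};

// Returns a list of validation errors; an empty list means the body is acceptable.
function validateRecommendBody(body: unknown): string[] {
  if (typeof body !== "object" || body === null) {
    return ["body must be a JSON object"];
  }
  const fields = body as Record<string, unknown>;
  const errors: string[] = [];

  const stack = fields["tech_stack"];
  if (!Array.isArray(stack) || stack.length === 0 || !stack.every((t) => typeof t === "string")) {
    errors.push("tech_stack must be a non-empty array of strings");
  } else if (stack.length > VALIDATION_LIMITS.MAX_TECH_STACK) {
    errors.push(`tech_stack must have at most ${VALIDATION_LIMITS.MAX_TECH_STACK} items`);
  }

  const users = fields["expected_users"];
  if (typeof users !== "number" || !Number.isInteger(users) || users <= 0) {
    errors.push("expected_users must be a positive integer");
  }

  const useCase = fields["use_case"];
  if (typeof useCase !== "string" || useCase.trim() === "") {
    errors.push("use_case must be a non-empty string");
  } else if (useCase.length > VALIDATION_LIMITS.MAX_USE_CASE_LENGTH) {
    errors.push(`use_case must be at most ${VALIDATION_LIMITS.MAX_USE_CASE_LENGTH} characters`);
  }

  return errors;
}
```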

Bindings (wrangler.toml)

[[kv_namespaces]]
binding = "CACHE"
id = "c68cdb477022424cbe4594f491390c8a"

[[d1_databases]]
binding = "DB"
database_name = "cloud-instances-db"
database_id = "bbcb472d-b25e-4e48-b6ea-112f9fffb4a8"

[vars]
OPENAI_API_KEY = "sk-..." # Placeholder only: set the real key with "npx wrangler secret put OPENAI_API_KEY"; never commit it under [vars]

Testing

Note: Use single-line curl commands. Backslash line continuation (\) may not work in some environments.

# Health check
curl -s https://cloud-orchestrator.kappa-d8e.workers.dev/api/health | jq .

# Recommendation - nodejs/redis real-time chat (Japan)
curl -s -X POST https://cloud-orchestrator.kappa-d8e.workers.dev/api/recommend -H "Content-Type: application/json" -d '{"tech_stack":["nodejs","redis"],"expected_users":1000,"use_case":"real-time chat","region_preference":["japan"]}' | jq .

# Recommendation - php/mysql community forum (Korea)
curl -s -X POST https://cloud-orchestrator.kappa-d8e.workers.dev/api/recommend -H "Content-Type: application/json" -d '{"tech_stack":["php","mysql"],"expected_users":800,"use_case":"community forum","region_preference":["korea"]}' | jq .

# Recommendation - analytics dashboard (heavier DB workload)
curl -s -X POST https://cloud-orchestrator.kappa-d8e.workers.dev/api/recommend -H "Content-Type: application/json" -d '{"tech_stack":["postgresql"],"expected_users":500,"use_case":"analytics dashboard","region_preference":["japan"]}' | jq .

# Server list with filters (supports flexible region: korea, seoul, tokyo, etc.)
curl -s "https://cloud-orchestrator.kappa-d8e.workers.dev/api/servers?region=korea&minCpu=4" | jq .

# HTML Report (encode recommendation result as base64)
# 1. Get recommendation and save to variable
RESULT=$(curl -s -X POST https://cloud-orchestrator.kappa-d8e.workers.dev/api/recommend -H "Content-Type: application/json" -d '{"tech_stack":["nodejs"],"expected_users":500,"use_case":"simple api","lang":"ko"}')
# 2. Generate report URL with base64-encoded data (URL-encoded, because base64 output can contain +, /, and =)
REPORT_URL="https://cloud-orchestrator.kappa-d8e.workers.dev/api/recommend/report?data=$(printf '%s' "$RESULT" | base64 | tr -d '\n' | jq -sRr @uri)&lang=ko"
# 3. Open in browser or fetch
echo "$REPORT_URL"

Recent Changes

Region Diversity & Bug Fixes (Latest)

  • Region diversity: No region specified → same spec from 3 different regions for comparison
  • Cache key fix: region_preference now included in cache key
  • Server ID fix: Changed from ai.id (instance) to ap.id (pricing) for unique region+instance identification
  • Prompt cleanup: Removed obsolete Linode/Vultr/DigitalOcean references (Anvil only)

Transfer Pricing & Reporting

  • Transfer pricing: Added anvil_transfer_pricing table data to recommendations
  • bandwidth_info: Each recommendation includes transfer costs (included_tb, overage costs)
  • available_regions: Lists other regions where same server spec is available with prices
  • HTML report: New /api/recommend/report endpoint for printable reports
  • KRW conversion: Bandwidth costs converted to KRW for Korean users (per-GB rounded to the nearest 1 KRW; per-TB and totals rounded to the nearest 100 KRW)

Anvil Pricing Migration

  • New tables: Migrated from pricing to anvil_pricing tables
  • Provider: Now uses "Anvil" as single provider (previously Linode/Vultr)
  • Exchange rate: Real-time USD→KRW conversion via open.er-api.com
  • Removed: provider_filter parameter no longer supported
  • Currency handling: Korean users see KRW, others see USD

Architecture

  • Modular architecture: Split from single 2370-line file into organized modules
  • Centralized config: All magic numbers moved to LIMITS in config.ts
  • Type safety: parseAIResponse(unknown) with proper type guards

Features

  • DB workload multiplier: Database resource calculation based on use_case
  • Bandwidth estimation: Automatic bandwidth category detection for provider filtering
  • Tech specs update: Realistic vcpu_per_users values for 150+ technologies

Security

  • Prompt injection protection: sanitizeForAIPrompt() filters malicious patterns
  • Rate limiting fallback: In-memory Map when KV unavailable
  • Input sanitization: All user inputs validated and length-limited

Performance

  • O(1) VPS lookup: Map-based benchmark matching (was O(n×m))
  • AI token optimization: Candidates limited to 15 (was 50)
  • KV caching: 5-minute TTL, empty results not cached
  • Parallel queries: Promise.all for independent DB operations

Code Quality

  • Dead code removed: Unused queryVPSBenchmarks function deleted
  • Flexible region matching: Both endpoints support country/city/code inputs (korea, seoul, icn)