cloud-orchestrator/CLAUDE.md
kappa b682abc45d refactor: modularize codebase and add DB workload multiplier
- Split monolithic index.ts (2370 lines) into modular structure:
  - src/handlers/ for route handlers
  - src/utils.ts for shared utilities
  - src/config.ts for configuration
  - src/types.ts for TypeScript definitions

- Add DB workload multiplier for smarter database resource calculation:
  - Heavy (analytics, logs): 0.3x multiplier
  - Medium-heavy (e-commerce, transactional): 0.5x
  - Medium (API, SaaS): 0.7x
  - Light (blog, portfolio): 1.0x

- Fix tech_specs with realistic vcpu_per_users values (150+ technologies)
- Fix "blog" matching "log" regex bug
- Update documentation to reflect new architecture

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 17:46:16 +09:00

# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Cloudflare Worker-based AI server recommendation service. Uses OpenAI GPT-4o-mini (via AI Gateway), D1 database, KV cache, and VPS benchmark data to recommend cost-effective servers based on natural language requirements.
**Production URL**: `https://server-recommend.kappa-d8e.workers.dev`
## Commands
```bash
# Development
npm run dev # Start local development server (wrangler dev)
npm run deploy # Deploy to Cloudflare Workers
npm run typecheck # TypeScript type checking
# Database operations (D1)
npx wrangler d1 execute cloud-instances-db --file=schema.sql # Apply schema
npx wrangler d1 execute cloud-instances-db --file=seed.sql # Seed data
npx wrangler d1 execute cloud-instances-db --file=fix-tech-specs.sql # Update tech specs
npx wrangler d1 execute cloud-instances-db --command="SELECT ..." # Ad-hoc queries
# View logs
npx wrangler tail
```
## Architecture
```
src/
├── index.ts # Main router, CORS, request handling
├── config.ts # Configuration constants
├── types.ts # TypeScript type definitions
├── utils.ts # Utilities (bandwidth, response, AI, benchmarks, candidates, techSpecs)
└── handlers/
    ├── health.ts # GET /api/health
    ├── servers.ts # GET /api/servers - List servers with filtering
    └── recommend.ts # POST /api/recommend - AI-powered recommendations
```
### Key Data Flow
1. User sends request (`tech_stack`, `expected_users`, `use_case`, `region_preference`)
2. Tech specs calculation with **DB workload multiplier** based on use_case
3. Candidate filtering with **flexible region matching**
4. VPS benchmarks retrieval (Geekbench 6), **prioritizing same provider**
5. AI analysis returns 3 tiers: Budget, Balanced, Premium
6. Results cached in KV (5 min TTL, empty results not cached)
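The caching step (6) can be sketched as follows; the key format and helper names are illustrative assumptions, not the actual code:

```typescript
// Hypothetical sketch of the KV caching policy: 5-minute TTL,
// empty results never cached (so they can recover immediately).
type Recommendation = { server_id: string; tier: string };

interface RecommendRequest {
  tech_stack: string[];
  expected_users: number;
  use_case: string;
}

// Derive a deterministic cache key from the request payload (assumed format).
function cacheKey(body: RecommendRequest): string {
  return `rec:${body.tech_stack.join(",")}:${body.expected_users}:${body.use_case}`;
}

// Only non-empty result sets are worth caching.
function shouldCache(results: Recommendation[]): boolean {
  return results.length > 0;
}

export const CACHE_TTL_SECONDS = 300; // 5 minutes, passed as expirationTtl to KV put
```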
### D1 Database Tables (cloud-instances-db)
- `providers` - Cloud providers (50+)
- `instance_types` - Server specifications
- `pricing` - Regional pricing
- `regions` - Geographic regions
- `tech_specs` - Resource requirements per technology (vcpu_per_users, min_memory_mb)
- `vps_benchmarks` - Geekbench 6 benchmark data (269 records)
- `benchmark_results` / `benchmark_types` / `processors` - Phoronix benchmark data
## Key Implementation Details
### DB Workload Multiplier (`recommend.ts`)
Database resource requirements vary by workload type, not just user count:
| Workload Type | Multiplier | Example Use Cases |
|---------------|------------|-------------------|
| Heavy | 0.3x | analytics, log server, reporting, dashboard |
| Medium-Heavy | 0.5x | e-commerce, ERP, CRM, community forum |
| Medium | 0.7x | API, SaaS, app backend |
| Light | 1.0x | blog, portfolio, documentation, wiki |
**Example**: PostgreSQL (vcpu_per_users: 200) with 1000 users
- Analytics dashboard: 1000 / (200 × 0.3) = 17 vCPU
- E-commerce: 1000 / (200 × 0.5) = 10 vCPU
- Personal blog: 1000 / (200 × 1.0) = 5 vCPU
### Bandwidth Estimation (`bandwidth.ts`)
Estimates monthly bandwidth based on use_case patterns:
| Pattern | Page Size | Pages/Day | Active Ratio |
|---------|-----------|-----------|--------------|
| E-commerce | 2.5MB | 20 | 40% |
| Streaming | 50MB | 5 | 20% |
| Analytics | 0.7MB | 30 | 50% |
| Blog/Content | 1.5MB | 4 | 30% |
Heavy-bandwidth workloads (>1TB/month) prefer Linode for its included bandwidth allowance.
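Assuming the estimate is users × active ratio × pages/day × page size over a 30-day month (the formula is inferred from the table, not confirmed by the source), the calculation can be sketched as:

```typescript
// Hypothetical sketch of monthly bandwidth estimation from use_case profiles.
type TrafficProfile = { pageSizeMB: number; pagesPerDay: number; activeRatio: number };

const PROFILES: Record<string, TrafficProfile> = {
  ecommerce: { pageSizeMB: 2.5, pagesPerDay: 20, activeRatio: 0.4 },
  streaming: { pageSizeMB: 50, pagesPerDay: 5, activeRatio: 0.2 },
  analytics: { pageSizeMB: 0.7, pagesPerDay: 30, activeRatio: 0.5 },
  blog: { pageSizeMB: 1.5, pagesPerDay: 4, activeRatio: 0.3 },
};

function monthlyBandwidthGB(users: number, profile: TrafficProfile): number {
  // Daily transfer: active users × page views × page size, then × 30 days.
  const dailyMB = users * profile.activeRatio * profile.pagesPerDay * profile.pageSizeMB;
  return (dailyMB * 30) / 1024;
}
```

For 1000 e-commerce users this yields roughly 586 GB/month, under the 1TB threshold.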
### Flexible Region Matching (`candidates.ts`)
Region matching supports multiple input formats:
```sql
LOWER(r.region_code) = ? OR
LOWER(r.region_code) LIKE ? OR
LOWER(r.region_name) LIKE ? OR
LOWER(r.country_code) = ?
```
Valid inputs: `"korea"`, `"KR"`, `"seoul"`, `"ap-northeast-2"`, `"icn"`
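The four bind parameters for that `WHERE` clause can be derived from one normalized input; this helper is an illustrative assumption about how `candidates.ts` prepares them:

```typescript
// Sketch: build bind params for the flexible region match, in clause order:
// region_code =, region_code LIKE, region_name LIKE, country_code =.
function regionMatchParams(input: string): string[] {
  const q = input.toLowerCase();
  return [q, `%${q}%`, `%${q}%`, q];
}

// Usage with a D1 prepared statement (sketch):
//   env.DB.prepare(sql).bind(...regionMatchParams("korea")).all();
```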
### AI Prompt Strategy (`ai.ts`)
- Uses OpenAI GPT-4o-mini via Cloudflare AI Gateway (bypasses regional restrictions)
- Server list format: `[server_id=XXXX] Provider Name...` for accurate ID extraction
- Scoring: Cost efficiency (40%) + Capacity fit (30%) + Scalability (30%)
- Responses are returned in Korean for Korean users
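The 40/30/30 weighting can be expressed as a one-line scoring function; the component scores here are placeholders on a 0–100 scale, not the prompt's actual rubric:

```typescript
// Sketch of the weighted score: cost efficiency 40%, capacity fit 30%, scalability 30%.
function serverScore(costEfficiency: number, capacityFit: number, scalability: number): number {
  return 0.4 * costEfficiency + 0.3 * capacityFit + 0.3 * scalability;
}
```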
## Bindings (wrangler.toml)
```toml
[[kv_namespaces]]
binding = "CACHE"
id = "c68cdb477022424cbe4594f491390c8a"
[[d1_databases]]
binding = "DB"
database_name = "cloud-instances-db"
database_id = "bbcb472d-b25e-4e48-b6ea-112f9fffb4a8"
[vars]
OPENAI_API_KEY = "sk-..." # Set via wrangler secret
```
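The bindings above imply a Worker `Env` shape roughly like the following (a sketch; the real types come from `@cloudflare/workers-types` and are stubbed here for illustration):

```typescript
// Stubs standing in for @cloudflare/workers-types.
type KVNamespace = unknown;
type D1Database = unknown;

interface Env {
  CACHE: KVNamespace;     // [[kv_namespaces]] binding
  DB: D1Database;         // [[d1_databases]] binding
  OPENAI_API_KEY: string; // set via `wrangler secret put`, never committed in [vars]
}
```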
## Testing
```bash
# Health check
curl https://server-recommend.kappa-d8e.workers.dev/api/health
# Recommendation (e-commerce)
curl -X POST https://server-recommend.kappa-d8e.workers.dev/api/recommend \
  -H "Content-Type: application/json" \
  -d '{
    "tech_stack": ["php", "mysql"],
    "expected_users": 1000,
    "use_case": "e-commerce shopping mall",
    "region_preference": ["korea"]
  }'
# Recommendation (analytics - heavier DB workload)
curl -X POST https://server-recommend.kappa-d8e.workers.dev/api/recommend \
  -H "Content-Type: application/json" \
  -d '{
    "tech_stack": ["postgresql"],
    "expected_users": 500,
    "use_case": "analytics dashboard",
    "region_preference": ["japan"]
  }'
```
## Recent Changes
- **Modular architecture**: Split from single 2370-line file into 7 modules
- **DB workload multiplier**: Database resource calculation based on use_case
- **KV caching**: 5-minute cache with smart invalidation (empty results not cached)
- **OpenAI integration**: GPT-4o-mini via AI Gateway for better recommendations
- **Bandwidth estimation**: Automatic bandwidth category detection for provider filtering
- **Tech specs update**: Realistic vcpu_per_users values for 150+ technologies