Initial commit: CrowdSec BunnyCDN bouncer

- sync/bouncer.py: CrowdSec LAPI → bloom filter → BunnyCDN Edge Script
- edge/middleware.ts: BunnyCDN edge middleware with bloom filter + Turnstile CAPTCHA
- README.md: architecture and deployment docs
2026-03-15 13:13:54 +09:00
parent 06af81c198
commit 1dcf2f448e
4 changed files with 855 additions and 68 deletions
--- a/README.md
+++ b/README.md
@@ -1,101 +1,132 @@
-# CrowdSec Bunny Bouncer
+# CrowdSec BunnyCDN Bouncer

-[CrowdSec](https://crowdsec.net) bouncer for [Bunny CDN](https://bunny.net). Integrates the CrowdSec blocklist with a Bunny CDN Edge Script to block malicious IPs at the CDN edge.
+A bouncer that preemptively blocks malicious IPs detected by the CrowdSec LAPI at the BunnyCDN edge.

-## Architecture
+## Architecture

 ```
-CrowdSec LAPI ──stream──▶ bouncer.py ──libSQL──▶ Bunny Database
-                                                       │
-                                               Edge Script (index.ts)
-                                               in onOriginRequest:
-                                               DB lookup → 403 / pass
+[CrowdSec LAPI (jp1, 10.253.100.240:8080)]
+       ↓ cron job (every 3 minutes)
+       ↓ /v1/decisions/stream API
+       ↓
+[bouncer.py]
+  → IP list → build Bloom filter
+  → update the Edge Script code via the BunnyCDN Compute API
+       ↓
+[BunnyCDN Edge Script (middleware.ts)]
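+  → extract the client IP from the X-Real-Ip header
+  → match the IP against the Bloom filter
+  → on a match, issue a Turnstile CAPTCHA challenge
+  → on CAPTCHA success, treat the IP as verified for 4 hours (verified_ips DB)
+  → if the IP is not in the Bloom filter, pass through to the origin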
 ```

- **bouncer.py**: polls the CrowdSec LAPI Streaming API and syncs ban decisions to Bunny Database (libSQL HTTP API)
- **edge-script/index.ts**: Bunny CDN Edge Script middleware. Queries the DB on every request and responds with 403 for blocked IPs
- **setup.py**: uploads and publishes the Edge Script code to Bunny CDN
- **monitor.py**: monitors Edge Script usage, CDN traffic/errors, and WAF event logs
+## Components

-## Features
+### sync/bouncer.py

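- Real-time sync based on the CrowdSec Streaming API (default interval: 60 seconds)
- Handles 27,000+ IPs, including the CAPI (Community Blocklist)
- Edge Script in-memory cache (TTL: 5 minutes for clean IPs, 1 minute for blocked IPs)
- Fail-open design: requests are allowed when the DB is unavailable
- Full resync every 6 hours
- Container monitoring based on a health-check file
+Script that syncs CrowdSec LAPI decisions to the BunnyCDN Edge Script.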

-## Prerequisites
+- Fetches ban decisions from the CrowdSec LAPI stream endpoint
+- Converts the IP list into a Bloom filter (FNV-1a hash)
+- Replaces the `BLOOM_B64` constant in the Edge Script code via the BunnyCDN Compute API
+- Publishes the script

- Access to a CrowdSec LAPI (bouncer API key required)
- Bunny CDN account + Pull Zone
- Bunny Database (libSQL)
- Bunny Edge Scripting enabled
- Python 3.12+
- Docker/Podman (production environment)
+### edge/middleware.ts

-## Quick Start
+BunnyCDN Edge Script middleware (currently deployed to the inouter pull zone, script ID: 64811).

-### 1. Prepare the Bunny Database and Edge Script
+- Uses `@bunny.net/edgescript-sdk`
+- Bloom-filter-based IP blocking
+- Cloudflare Turnstile CAPTCHA to handle false positives
+- Caches verified IPs for 4 hours in a LibSQL DB
+- Keeps the session with an HMAC-signed cookie
+- Clean IP cache (negative cache, up to 50K entries)

-In the Bunny dashboard:
-1. Create a Database → note the URL and Token
-2. Create a script in Edge Scripting → note the Script ID
-3. Set the `BUNNY_DATABASE_URL` and `BUNNY_DATABASE_AUTH_TOKEN` environment variables on the Edge Script
-4. Attach the Edge Script to the Pull Zone
+## Environment Variables

-### 2. Configure environment variables
+### bouncer.py
+
+| Variable | Description | Default |
+|------|------|--------|
+| `CROWDSEC_LAPI_URL` | CrowdSec LAPI URL | `http://10.253.100.240:8080` |
+| `CROWDSEC_BOUNCER_KEY` | Bouncer API key | (required) |
+| `BUNNY_API_KEY` | BunnyCDN account API key | (required) |
+| `BUNNY_SCRIPT_ID` | Edge Script ID | `64811` |
+| `STATE_FILE` | State file path | `/var/lib/crowdsec-bouncer/state.json` |
+
+### Edge Script Variables
+
+| Variable | Description |
+|------|------|
+| `TURNSTILE_SITE_KEY` | Cloudflare Turnstile site key |
+| `TURNSTILE_SECRET_KEY` | Cloudflare Turnstile secret key |
+| `CACHE_MODE` | Cache mode (auto) |
+| `BUNNY_DATABASE_URL` | LibSQL database URL |
+| `BUNNY_DATABASE_AUTH_TOKEN` | LibSQL auth token |
+
+## Deployment
+
+### 1. Register the CrowdSec bouncer

 ```bash
-cp .env.example .env
-# Edit the .env file
+# In the jp1 crowdsec container
+cscli bouncers add bunny-cdn-bouncer
+# Store the printed API key in Vault
 ```

-| Variable | Required | Description |
-|------|------|------|
-| `CROWDSEC_LAPI_URL` | Yes | CrowdSec LAPI address |
-| `CROWDSEC_LAPI_KEY` | Yes | Bouncer API key |
-| `BUNNY_DB_URL` | Yes | Bunny Database libSQL URL |
-| `BUNNY_DB_TOKEN` | Yes | Bunny Database auth token |
-| `SYNC_INTERVAL` | | Sync interval in seconds (default: 60) |
-| `INCLUDE_CAPI` | | Include the CAPI blocklist (default: true) |
-| `FULL_RESYNC_INTERVAL` | | Full resync interval in seconds (default: 21600) |
-| `LOG_LEVEL` | | Log level (default: INFO) |
-| `BUNNY_API_KEY` | | Bunny API key (for setup.py and monitor.py) |
-| `BUNNY_SCRIPT_ID` | | Edge Script ID (for setup.py) |
-
-### 3. Deploy the Edge Script
+### 2. Store secrets in Vault

 ```bash
-python setup.py
+vault kv put secret/infra/crowdsec-bunny-bouncer \
+  bouncer_key=<CROWDSEC_BOUNCER_KEY>
 ```

-### 4. Run the bouncer
+### 3. Deploy to the Incus container

 ```bash
-# Docker/Podman
-podman compose up -d
-
-# Or run directly
-pip install -r requirements.txt
-python bouncer.py
+# jp1 infra-tool container
+pip3 install -r sync/requirements.txt
+cp sync/bouncer.py /opt/crowdsec-bouncer/bouncer.py
 ```

-## Monitoring
+### 4. Configure cron

 ```bash
-# Default (last 30 days)
-python monitor.py
-
-# Specify a period
-python monitor.py 7
+# /etc/cron.d/crowdsec-bunny-bouncer
+*/3 * * * * root CROWDSEC_BOUNCER_KEY="..." BUNNY_API_KEY="..." /usr/bin/python3 /opt/crowdsec-bouncer/bouncer.py >> /var/log/crowdsec-bouncer.log 2>&1
+# Minute 3 of every hour: full sync (startup mode)
+3 * * * * root CROWDSEC_BOUNCER_KEY="..." BUNNY_API_KEY="..." /usr/bin/python3 /opt/crowdsec-bouncer/bouncer.py --startup >> /var/log/crowdsec-bouncer.log 2>&1
 ```

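-Output:
- **Edge Script usage**: monthly request count, CPU time, share of the free limit (25M)
- **CDN statistics**: total requests, cache hit rate, bandwidth, 3xx/4xx/5xx errors
- **WAF / Bunny Shield**: Shield status, WAF rule status, event log summary (by attack type, IP, and country)
+## Testing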
+
+```bash
+# Add a test IP
+ssh incus-jp1 "incus exec crowdsec -- cscli decisions add -i 198.51.100.1 -d 10m -R 'test'"
+
+# Run the sync
+python3 sync/bouncer.py --startup -v
+
+# Remove the test IP
+ssh incus-jp1 "incus exec crowdsec -- cscli decisions delete -i 198.51.100.1"
+python3 sync/bouncer.py -v
+```
+
+## Pull Zone Rollout Status
+
+| Pull zone | ID | Middleware | Status |
+|------|-----|---------|------|
+| inouter | 5316471 | 64811 | Applied |
+| actions | 5330178 | - | Not applied |
+
+## Bloom Filter Details
+
+- Hash function: FNV-1a (32-bit) with double hashing
+- False positive rate: 0.1%
+- Header: 8 bytes (m: uint32 LE, k: uint32 LE)
+- Payload: bit array
+- Version: first 16 hex characters of the MD5 hash (for change detection)
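
As a rough illustration of the Bloom filter parameters listed above, the sketch below reuses the sizing formulas from `BloomFilter._optimal_m` and `BloomFilter._optimal_k` in sync/bouncer.py and round-trips the 8-byte header. The item count of 27,000 is an assumption picked for illustration, and `bloom_params`, `pack_header`, and `unpack_header` are hypothetical helper names that do not exist in the commit.

```python
import math
import struct


def bloom_params(n: int, p: float) -> tuple[int, int]:
    """Return (m, k): number of bits and number of hash probes for n items."""
    m = int(-n * math.log(p) / (math.log(2) ** 2))
    k = max(int((m / n) * math.log(2)), 1)
    return m, k


def pack_header(m: int, k: int) -> bytes:
    """8-byte header: m then k, both uint32 little-endian."""
    return struct.pack("<II", m, k)


def unpack_header(payload: bytes) -> tuple[int, int]:
    """Read m and k back from the first 8 bytes of a payload."""
    return struct.unpack_from("<II", payload, 0)


# Illustrative item count (an assumption), with the 0.1% target rate.
m, k = bloom_params(27_000, 0.001)
total_bytes = 8 + (m + 7) // 8  # header plus bit array
print(f"m={m} bits, k={k} probes, payload={total_bytes} bytes before base64")
```

Because `m` and `k` travel in the header, a reader of the payload does not need to know the sizing formulas; it only has to use the same hash scheme.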

 ## Edge Script Pricing

--- a/edge/middleware.ts
+++ b/edge/middleware.ts
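
The body of edge/middleware.ts is not shown in this view. As a hedged sketch of what any reader of `BLOOM_B64` has to do to agree with `BloomFilter.add` in sync/bouncer.py, the Python below decodes the header and probes the same bit positions. It is not the middleware's code; `positions`, `build`, and `contains` are hypothetical names, and the filter size used in the demo is arbitrary.

```python
import base64
import struct


def fnv1a32(data: bytes) -> int:
    """32-bit FNV-1a, same constants as sync/bouncer.py."""
    h = 0x811C9DC5
    for b in data:
        h ^= b
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


def positions(ip: str, m: int, k: int) -> list[int]:
    """Double hashing: h1 over the IP string, h2 over the IP string + 0xff."""
    raw = ip.encode("utf-8")
    h1 = fnv1a32(raw)
    h2 = fnv1a32(raw + b"\xff")
    return [(h1 + i * h2) % m for i in range(k)]


def build(ips: list[str], m: int, k: int) -> str:
    """Writer side: set k bits per IP, prepend the header, base64-encode."""
    bits = bytearray((m + 7) // 8)
    for ip in ips:
        for pos in positions(ip, m, k):
            bits[pos >> 3] |= 1 << (pos & 7)
    return base64.b64encode(struct.pack("<II", m, k) + bytes(bits)).decode("ascii")


def contains(bloom_b64: str, ip: str) -> bool:
    """Reader side: True if every probed bit is set (may be a false positive)."""
    payload = base64.b64decode(bloom_b64)
    m, k = struct.unpack_from("<II", payload, 0)
    bits = payload[8:]
    return all(bits[pos >> 3] & (1 << (pos & 7)) for pos in positions(ip, m, k))


# Arbitrary demo size; real values come from the sizing formulas.
blob = build(["198.51.100.1", "203.0.113.7"], m=1437, k=9)
print(contains(blob, "198.51.100.1"))
```

A TypeScript port would need to keep the FNV multiplication within 32 bits (for instance with `Math.imul`), since the plain product can exceed the range in which JavaScript numbers are exact.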
--- a/sync/bouncer.py
+++ b/sync/bouncer.py
@@ -0,0 +1,411 @@
+#!/usr/bin/env python3
+"""
+CrowdSec → BunnyCDN Bouncer Sync
+
+Fetches ban decisions from the CrowdSec LAPI, converts them into a bloom
+filter, embeds it in the BunnyCDN Edge Script code, and deploys it.
+
+Usage:
+    python3 bouncer.py                   # stream mode (delta)
+    python3 bouncer.py --startup         # full sync
+    python3 bouncer.py --dry-run         # print only, deploy nothing
+
+Environment:
+    CROWDSEC_LAPI_URL       CrowdSec LAPI URL (default: http://10.253.100.240:8080)
+    CROWDSEC_BOUNCER_KEY    Bouncer API key
+    BUNNY_API_KEY           BunnyCDN account API key
+    BUNNY_SCRIPT_ID         Edge Script ID (default: 64811)
+"""
+
+import argparse
+import base64
+import hashlib
+import json
+import logging
+import math
+import os
+import re
+import struct
+import sys
+import time
+from ipaddress import IPv4Network, ip_address
+from typing import Optional
+
+import requests
+
+logging.basicConfig(
+    level=logging.INFO,
+    format="%(asctime)s [%(levelname)s] %(message)s",
+)
+log = logging.getLogger("bouncer")
+
+# ---------------------------------------------------------------------------
+# Config
+# ---------------------------------------------------------------------------
+
+CROWDSEC_LAPI_URL = os.environ.get("CROWDSEC_LAPI_URL", "http://10.253.100.240:8080")
+CROWDSEC_BOUNCER_KEY = os.environ.get("CROWDSEC_BOUNCER_KEY", "")
+BUNNY_API_KEY = os.environ.get("BUNNY_API_KEY", "")
+BUNNY_SCRIPT_ID = int(os.environ.get("BUNNY_SCRIPT_ID", "64811"))
+
+# Bloom filter parameters
+BLOOM_FP_RATE = 0.001  # 0.1% false positive rate
+BLOOM_MIN_ITEMS = 100  # minimum expected items (avoid tiny filters)
+
+# State file for stream cursor
+STATE_FILE = os.environ.get("STATE_FILE", "/var/lib/crowdsec-bouncer/state.json")
+
+# ---------------------------------------------------------------------------
+# Bloom filter (FNV-1a — must match edge script)
+# ---------------------------------------------------------------------------
+
+
+def fnv1a32(data: bytes) -> int:
+    h = 0x811C9DC5
+    for b in data:
+        h ^= b
+        h = (h * 0x01000193) & 0xFFFFFFFF
+    return h
+
+
+class BloomFilter:
+    def __init__(self, expected_items: int, fp_rate: float = 0.001):
+        n = max(expected_items, BLOOM_MIN_ITEMS)
+        self.m = max(self._optimal_m(n, fp_rate), 64)  # bits
+        self.k = max(self._optimal_k(self.m, n), 1)    # hash functions
+        byte_count = (self.m + 7) // 8
+        self.bits = bytearray(byte_count)
+
+    @staticmethod
+    def _optimal_m(n: int, p: float) -> int:
+        return int(-n * math.log(p) / (math.log(2) ** 2))
+
+    @staticmethod
+    def _optimal_k(m: int, n: int) -> int:
+        return max(int((m / n) * math.log(2)), 1)
+
+    def add(self, item: str) -> None:
+        ip_bytes = item.encode("utf-8")
+        ip_bytes_ff = ip_bytes + b"\xff"
+        h1 = fnv1a32(ip_bytes)
+        h2 = fnv1a32(ip_bytes_ff)
+        for i in range(self.k):
+            pos = (h1 + i * h2) % self.m
+            self.bits[pos >> 3] |= 1 << (pos & 7)
+
+    def to_base64(self) -> str:
+        # Header: 4 bytes m (little-endian), 4 bytes k (little-endian)
+        header = struct.pack("<II", self.m, self.k)
+        payload = header + bytes(self.bits)
+        return base64.b64encode(payload).decode("ascii")
+
+    def version_hash(self) -> str:
+        return hashlib.md5(self.bits).hexdigest()[:16]
+
+
+# ---------------------------------------------------------------------------
+# CIDR expansion
+# ---------------------------------------------------------------------------
+
+
+def expand_cidr(value: str) -> list[str]:
+    """Expand a CIDR range to individual IPs (max /16)."""
+    try:
+        net = IPv4Network(value, strict=False)
+        if net.prefixlen < 16:
+            log.warning("Skipping too-large range: %s", value)
+            return []
+        if net.prefixlen == 32:
+            return [str(net.network_address)]
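+        # Iterate the network itself: hosts() skips the network and broadcast
+        # addresses of ranges wider than /31, but the ban covers the whole range.
+        return [str(ip) for ip in net]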
+    except ValueError:
+        # Not a CIDR, treat as single IP
+        try:
+            ip_address(value)
+            return [value]
+        except ValueError:
+            log.warning("Invalid IP/CIDR: %s", value)
+            return []
+
+
+# ---------------------------------------------------------------------------
+# CrowdSec LAPI client
+# ---------------------------------------------------------------------------
+
+
+def load_state() -> Optional[str]:
+    """Load last stream cursor timestamp."""
+    try:
+        with open(STATE_FILE) as f:
+            data = json.load(f)
+            return data.get("last_pull")
+    except (FileNotFoundError, json.JSONDecodeError):
+        return None
+
+
+def save_state(timestamp: str) -> None:
+    """Save stream cursor timestamp."""
+    # dirname is "" for a bare filename; fall back to the current directory.
+    os.makedirs(os.path.dirname(STATE_FILE) or ".", exist_ok=True)
+    with open(STATE_FILE, "w") as f:
+        json.dump({"last_pull": timestamp}, f)
+
+
+def fetch_decisions(startup: bool = False) -> tuple[list[str], list[str]]:
+    """Fetch decisions from CrowdSec LAPI stream endpoint.
+
+    Returns (new_ips, deleted_ips).
+    """
+    url = f"{CROWDSEC_LAPI_URL}/v1/decisions/stream"
+    params = {"startup": "true"} if startup else {}
+
+    # Without startup=true the stream endpoint returns only the decisions
+    # added or deleted since this bouncer's previous pull, so no cursor is sent.
+
+    headers = {"X-Api-Key": CROWDSEC_BOUNCER_KEY}
+
+    try:
+        resp = requests.get(url, headers=headers, params=params, timeout=30)
+        resp.raise_for_status()
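+    except requests.RequestException as e:
+        log.error("Failed to fetch decisions: %s", e)
+        # Abort instead of returning empty lists: with --startup, an empty
+        # result would replace the saved blocklist and deploy an empty filter.
+        sys.exit(1)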
+
+    data = resp.json()
+    new_ips = []
+    deleted_ips = []
+
+    for decision in data.get("new") or []:
+        value = decision.get("value", "")
+        if decision.get("type", "").lower() == "ban":
+            new_ips.extend(expand_cidr(value))
+
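+    for decision in data.get("deleted") or []:
+        value = decision.get("value", "")
+        # Only a lifted ban should remove an IP from the blocked set.
+        if decision.get("type", "").lower() == "ban":
+            deleted_ips.extend(expand_cidr(value))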
+
+    save_state(time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()))
+    return new_ips, deleted_ips
+
+
+def fetch_all_active_decisions() -> list[str]:
+    """Fetch all currently active ban decisions."""
+    url = f"{CROWDSEC_LAPI_URL}/v1/decisions"
+    headers = {"X-Api-Key": CROWDSEC_BOUNCER_KEY}
+
+    try:
+        resp = requests.get(url, headers=headers, params={"type": "ban"}, timeout=30)
+        resp.raise_for_status()
+    except requests.RequestException as e:
+        log.error("Failed to fetch all decisions: %s", e)
+        return []
+
+    decisions = resp.json() or []
+    ips = []
+    for d in decisions:
+        value = d.get("value", "")
+        ips.extend(expand_cidr(value))
+    return ips
+
+
+# ---------------------------------------------------------------------------
+# IP state management
+# ---------------------------------------------------------------------------
+
+
+class IPState:
+    """Manages the set of currently blocked IPs."""
+
+    def __init__(self, state_file: str):
+        # Keep the IP list next to the state file, whatever that file is named.
+        self._file = os.path.join(os.path.dirname(state_file), "blocked_ips.json")
+        self.ips: set[str] = set()
+        self._load()
+
+    def _load(self) -> None:
+        try:
+            with open(self._file) as f:
+                data = json.load(f)
+                self.ips = set(data.get("ips", []))
+        except (FileNotFoundError, json.JSONDecodeError):
+            self.ips = set()
+
+    def save(self) -> None:
+        os.makedirs(os.path.dirname(self._file) or ".", exist_ok=True)
+        with open(self._file, "w") as f:
+            json.dump({"ips": sorted(self.ips)}, f)
+
+    def apply_delta(self, new_ips: list[str], deleted_ips: list[str]) -> bool:
+        """Apply changes and return True if the set changed."""
+        before = set(self.ips)
+        self.ips.update(new_ips)
+        self.ips -= set(deleted_ips)
+        # Compare contents, not sizes: re-sent or unknown IPs are not a change.
+        changed = self.ips != before
+        if changed:
+            self.save()
+        return changed
+
+    def full_sync(self, all_ips: list[str]) -> bool:
+        """Full replacement. Returns True if changed."""
+        new_set = set(all_ips)
+        if new_set == self.ips:
+            return False
+        self.ips = new_set
+        self.save()
+        return True
+
+    def build_bloom(self) -> BloomFilter:
+        bf = BloomFilter(len(self.ips))
+        for ip in self.ips:
+            bf.add(ip)
+        return bf
+
+
+# ---------------------------------------------------------------------------
+# BunnyCDN Edge Script updater
+# ---------------------------------------------------------------------------
+
+BLOOM_PATTERN = re.compile(
+    r'(const BLOOM_B64\s*=\s*")([^"]*)(";)',
+    re.DOTALL,
+)
+VERSION_PATTERN = re.compile(
+    r'(const BLOOM_VERSION\s*=\s*")([^"]*)(";)',
+)
+
+
+def get_current_script() -> Optional[str]:
+    """Get current Edge Script source code."""
+    url = f"https://api.bunny.net/compute/script/{BUNNY_SCRIPT_ID}/code"
+    headers = {"AccessKey": BUNNY_API_KEY}
+
+    try:
+        resp = requests.get(url, headers=headers, timeout=30)
+        resp.raise_for_status()
+        return resp.json().get("Code", "")
+    except requests.RequestException as e:
+        log.error("Failed to get script: %s", e)
+        return None
+
+
+def update_script(code: str) -> bool:
+    """Update Edge Script source code and publish."""
+    url = f"https://api.bunny.net/compute/script/{BUNNY_SCRIPT_ID}/code"
+    headers = {
+        "AccessKey": BUNNY_API_KEY,
+        "Content-Type": "application/json",
+    }
+
+    try:
+        resp = requests.post(url, headers=headers, json={"Code": code}, timeout=60)
+        resp.raise_for_status()
+    except requests.RequestException as e:
+        log.error("Failed to update script: %s", e)
+        return False
+
+    # Publish
+    pub_url = f"https://api.bunny.net/compute/script/{BUNNY_SCRIPT_ID}/publish"
+    try:
+        resp = requests.post(pub_url, headers=headers, json={}, timeout=60)
+        resp.raise_for_status()
+        log.info("Script published successfully")
+        return True
+    except requests.RequestException as e:
+        log.error("Failed to publish script: %s", e)
+        return False
+
+
+def embed_bloom_in_script(code: str, bloom: BloomFilter) -> Optional[str]:
+    """Replace BLOOM_B64 and BLOOM_VERSION in script code."""
+    b64 = bloom.to_base64()
+    version = bloom.version_hash()
+
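+    # Use the substitution count to detect a missing constant; comparing the
+    # strings would also flag an unchanged filter as "not found".
+    new_code, count = BLOOM_PATTERN.subn(rf'\g<1>{b64}\g<3>', code)
+    if count == 0:
+        log.error("Could not find BLOOM_B64 in script")
+        return None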
+
+    new_code = VERSION_PATTERN.sub(rf'\g<1>{version}\g<3>', new_code)
+    return new_code
+
+
+# ---------------------------------------------------------------------------
+# Main
+# ---------------------------------------------------------------------------
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(description="CrowdSec → BunnyCDN Bouncer")
+    parser.add_argument("--startup", action="store_true", help="Full sync (startup mode)")
+    parser.add_argument("--dry-run", action="store_true", help="Don't deploy changes")
+    parser.add_argument("--verbose", "-v", action="store_true")
+    args = parser.parse_args()
+
+    if args.verbose:
+        logging.getLogger().setLevel(logging.DEBUG)
+
+    if not CROWDSEC_BOUNCER_KEY:
+        log.error("CROWDSEC_BOUNCER_KEY not set")
+        return 1
+    if not BUNNY_API_KEY:
+        log.error("BUNNY_API_KEY not set")
+        return 1
+
+    state = IPState(STATE_FILE)
+    log.info("Current blocked IPs: %d", len(state.ips))
+
+    if args.startup:
+        log.info("Full sync mode (startup)")
+        new_ips, _ = fetch_decisions(startup=True)
+        # startup=true returns all active decisions in "new"
+        changed = state.full_sync(new_ips)
+    else:
+        log.info("Delta sync mode")
+        new_ips, deleted_ips = fetch_decisions(startup=False)
+        if new_ips:
+            log.info("New bans: %d IPs", len(new_ips))
+        if deleted_ips:
+            log.info("Removed bans: %d IPs", len(deleted_ips))
+        changed = state.apply_delta(new_ips, deleted_ips)
+
+    log.info("Total blocked IPs after sync: %d", len(state.ips))
+
+    if not changed:
+        log.info("No changes — skipping deploy")
+        return 0
+
+    bloom = state.build_bloom()
+    b64 = bloom.to_base64()
+    version = bloom.version_hash()
+    log.info("Bloom filter: m=%d, k=%d, size=%d bytes, version=%s",
+             bloom.m, bloom.k, len(b64), version)
+
+    if args.dry_run:
+        log.info("[DRY RUN] Would update script with %d IPs", len(state.ips))
+        log.info("[DRY RUN] Bloom b64 length: %d chars", len(b64))
+        return 0
+
+    code = get_current_script()
+    if code is None:
+        return 1
+
+    new_code = embed_bloom_in_script(code, bloom)
+    if new_code is None:
+        return 1
+
+    # Check if bloom actually changed
+    if new_code == code:
+        log.info("Bloom filter unchanged — skipping deploy")
+        return 0
+
+    if update_script(new_code):
+        log.info("Successfully deployed bloom filter with %d IPs", len(state.ips))
+        return 0
+
+    return 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
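
The constant replacement in `embed_bloom_in_script` has two outcomes that look alike when only the resulting strings are compared: the `BLOOM_B64` constant is missing, or the new filter equals the deployed one. One way to tell them apart, sketched here under the assumption of a two-line script source invented for the demo, is to read the substitution count from `re.subn`. The `embed` helper is a hypothetical name.

```python
import re

# Same pattern shape as BLOOM_PATTERN in sync/bouncer.py.
BLOOM_PATTERN = re.compile(r'(const BLOOM_B64\s*=\s*")([^"]*)(";)')


def embed(code: str, b64: str) -> tuple[str, int]:
    """Return (new_code, number_of_replacements)."""
    # A function replacement inserts b64 literally, with no template parsing.
    return BLOOM_PATTERN.subn(lambda mt: mt.group(1) + b64 + mt.group(3), code)


# Hypothetical script source, for illustration only.
sample = 'const BLOOM_B64 = "AAAA";\nconst BLOOM_VERSION = "old";\n'

updated, count = embed(sample, "BBBB")            # constant found and changed
same, count_same = embed(updated, "BBBB")         # constant found, value unchanged
missing, count_missing = embed("export default {};", "BBBB")  # constant absent

print(count, count_same, count_missing)  # → 1 1 0
```

With the count in hand, a caller can treat `count == 0` as an error and compare the old and new source separately to decide whether a deploy is needed.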
--- a/sync/requirements.txt
+++ b/sync/requirements.txt
@@ -0,0 +1 @@
+requests>=2.31.0