obsidian/history/2026-04-08-anomaly-detect-iterations.md

---
date: 2026-04-08
topic: anomaly-detect 3차 재설계 과정 (gemma → cohort → agentic Grok-4)
areas:
  - infra/anomaly-detect.md
  - infra/crowdsec-safeline.md
tags: [history, anomaly-detect, crowdsec, ai, grok]
---

anomaly-detect 시스템의 3번의 설계 반복과 최종 agentic 구조 확정 과정.

## 1차 구현 (2026-04-08 초반) — gemma4:e4b + stats 파이프라인

per-IP 통계 게이트(count/4xx/5xx/499/distinct paths) → 후보 N개 → ollama gemma4:e4b(Q4_K_M, 8.0B) yes/no 분류. Python이 축을 정의하고 LLM은 판정만 수행하는 구조.

## 2차 구현 (같은 날) — cohort 탐지 추가

`_cohort_path_candidates`, `_cohort_ua_candidates` 등 집단 축 탐지 로직 추가. 여전히 "Python이 탐지, LLM이 분류"라는 본질적 한계 동일.

## 코드 리뷰 수정 2회

### 초기 리뷰 (커밋 d5310f0)
- LAPI POST 실패 시 dedup 선기록 버그 수정
- XFF CSV 파싱 버그 수정 (`extract_client_ip()` 헬퍼)
- 사설망/Tailscale 필터 개선 (`ipaddress.ip_address().is_private`)
- dedup.json 원자적 쓰기 (`tempfile` + `os.replace`)

### 2차 리뷰 (커밋 b0e3c68) — High 5건 / Medium 4건
- H1: `events_count` 프로토콜 오용 → sample 10건을 events에 풀어서 넣기
- H2: `limit=20000` raw 로그 pull → `| stats by (remote_addr)` 서버측 집계
- H4: prompt injection via path → `json.dumps` + 경고 삽입
- H5: `num_predict=80` 한국어 truncation → 256
- M1~M4, M7: start_at/stop_at, 파일 핸들 누수, ratio에 499 포함, set cap, housekeeping

### 남은 설계 이슈
- H3: 분산 봇넷 대비 게이트 사각지대 (per-IP 게이트의 한계)

## 3차 재설계 — agentic 구조 (최종)

사용자 원래 의도 "시계열 DB를 AI에 연결하면 AI가 알아서 찾는다"에 부합하도록 전면 재설계. OpenRouter `x-ai/grok-4-fast`에 tool 2개(logsql_query, ban_ips)만 노출하는 agentic 구조.

### 모델 벤치마크 결과

| 모델 | 턴 | 비용 | 결과 |
|------|-----|------|------|
| x-ai/grok-4-fast | 4 | $0.0036 | 정답 |
| qwen/qwen3-235b-a22b-2507 | 7 | $0.0012 | 정답 |
| google/gemini-2.0-flash-001 | 10(max) | $0.0026 | 결론 없음 |
| deepseek/deepseek-chat-v3.1 | 10(max) | $0.0127 | 결론 없음 |

### E2E 검증
- VictoriaLogs에 270 rows 주입, Grok-4-fast 5턴에 31개 공격 IP 정확 식별, 정상 IP 0건 ban
- DRY_RUN=0 활성화, 실운영 개시

## 전임자 폐기

- `ddos-detect` (Go, jp1 crowdsec 컨테이너, 60s 폴링, Claude CLI sonnet 호출) — 60s 폴링 + 동기 Claude CLI 구조 한계로 폐기
- 제거 항목: `ddos-detect.service`, Go 바이너리+소스, `ddos-detect.sh`, `extract_behavior.py`, `ddos-logs/`

## 참조

- `infra/anomaly-detect.md` — 현재 agentic 구조
- Gitea: `kaffa/anomaly-detect`, `kaffa/ddos-detect`(보존)