Commit Graph

7 Commits

Author SHA1 Message Date
kaffa
a28d7fd646 Add time-period models, outlier filtering, and background retrain
- Split single IsolationForest into 4 period models (night/morning/afternoon/evening)
- Each period trained independently on its time window data
- Filter attack samples during retrain using existing model scores (threshold -0.5)
- Retrain runs in background thread with lock, inference continues uninterrupted
- New pickle format 'period_models' with automatic old format detection
- SIGUSR1 and auto-retrain both use background mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 10:43:38 +09:00
kaffa
3d1e353b1a Replace CSV traffic log with SQLite for better performance
- traffic_log.csv → traffic_log.db (SQLite with indexed timestamp)
- INSERT instead of CSV append, DELETE instead of file rewrite
- CLI queries use SQL (GROUP BY for traffic, LIMIT for log)
- retrain_from_log() uses read-only connection with time range query
- Config key: traffic_log_file → traffic_log_db

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 10:30:10 +09:00
kaffa
11c1ab0134 Add time-aware traffic logger and auto-retrain system
- Log traffic features with timestamps to CSV every 5s
- Add hour_sin/hour_cos time features (15 → 17 feature vector)
- Auto-retrain from traffic log at configurable interval (default 24h)
- Detect old 15-feature models and switch to learning mode
- SIGUSR1 now retrains from traffic log first, falls back to collect mode
- Add CLI: `ai traffic` (time-bucketed summary), `ai log` (recent entries)
- Add config keys: traffic_log_file, retention_days, retrain_window

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 10:14:07 +09:00
kaffa
667c6eac81 Fix 12 code review issues (4 MEDIUM + 8 LOW)
MEDIUM:
- M1: Whitelist direct IP/CIDR additions now persist to direct.txt
- M2: get_map_id() uses 5s TTL cache (single bpftool call for all maps)
- M3: IPv6 extension header parsing in xdp_ddos.c (hop-by-hop/routing/frag/dst)
- M4: Shell injection prevention - sanitize_input() + sys.argv[] for all Python calls

LOW:
- L1: Remove redundant self.running (uses _stop_event only)
- L2: Remove unused config values (rate_limit_after, cooldown_multiplier, retrain_interval)
- L3: Thread poll intervals reloaded on SIGHUP
- L4: batch_map_operation counts only successfully written entries
- L5: Clarify unique_ips_approx comment (per-packet counter)
- L6: Document LRU_HASH multi-CPU race condition as acceptable
- L7: Download Cloudflare IPv6 ranges in whitelist preset
- L8: Fix file handle leak in xdp_country.py list_countries()

Also: SIGHUP now preserves EWMA/violation state, daemon skips whitelisted
IPs in EWMA/AI escalation, deep copy for default config, IHL validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 09:23:41 +09:00
kaffa
dbfcb62cdf Add README.md with full documentation in Korean
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 08:50:59 +09:00
kaffa
069d09339c Support direct IP/CIDR in whitelist add/del commands
Previously whitelist only accepted named presets (cloudflare, aws, etc).
Now `xdp-defense whitelist add 8.8.8.8/32` works directly for both
IPv4 and IPv6 addresses, writing to the shared pinned BPF map.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 08:44:14 +09:00
kaffa
1bcaddce25 Unify xdp-blocker and xdp-ddos into single xdp-defense project
Chain two XDP programs via libxdp dispatcher on the same interface:
xdp_blocker (priority 10) handles CIDR/country/whitelist blocking,
xdp_ddos (priority 20) handles rate limiting, EWMA analysis, and AI
anomaly detection. Whitelist maps are shared via BPF map pinning so
whitelisted IPs bypass both blocklist checks and DDoS rate limiting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 08:39:21 +09:00