Implementation Plan: Engineering Backlog & Schedule
This report translates the MVP architecture into concrete engineering tasks, grouped by milestone, with clear acceptance criteria and difficulty estimates (S = small, M = medium, L = large). It also provides a realistic 60‑day schedule assuming a small team (1–2 engineers).
---
Backlog by Milestone
Milestone 1: CDC Ingestion (Days 1–10)
Goal: Establish raw change capture from MySQL and verify delivery to overlay.
Tasks:
1. Set up MySQL test instance (S)
- Install MySQL 8 on a VM or in Docker; enable the binlog (`log_bin=ON`, `binlog_format=ROW`, `binlog_row_image=FULL`; these are largely MySQL 8 defaults, but set them explicitly).
- Create `pharmacy` schema (from Report 3 DDL).
- Acceptance: `mysql -u root -e "SHOW BINARY LOGS"` shows output; schema exists.
2. Install Debezium Server (HTTP sink) (M)
- Download `debezium-server-2.x.x.Final.tar.gz`; configure `application.properties` with the MySQL source connector and the HTTP sink pointing at `http://localhost:8080/ingest`.
- Start Debezium; monitor log for connector start.
- Acceptance: Debezium's log shows a successful MySQL connection and events being delivered to the HTTP endpoint.
3. Implement `/ingest` HTTP endpoint in Go (S)
- POST handler decodes a JSON array into `RawChangeEvent` structs, validates required fields, and forwards each event to an internal channel (see the sketch after this task list).
- Returns `200 OK` with count; `400` on bad payload.
- Acceptance: `curl -X POST -d '[{...}]' http://localhost:8080/ingest` returns 200; logs show received.
4. End‑to‑end connectivity test (S)
- Make a change in MySQL (insert/update); confirm Debezium posts to `/ingest` and Go service receives.
- Acceptance: Log timestamp matches; payload contains expected table.
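
To make Task 3 concrete, here is a minimal sketch of the `/ingest` handler. The `RawChangeEvent` fields are illustrative placeholders; the real struct must mirror whatever the Debezium HTTP sink actually posts.

```go
package ingest

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// RawChangeEvent is an illustrative shape; the real struct must mirror
// whatever the Debezium HTTP sink actually posts.
type RawChangeEvent struct {
	Source    string          `json:"source"`
	Table     string          `json:"table"`
	Operation string          `json:"op"`
	Payload   json.RawMessage `json:"payload"`
}

// Handler decodes a JSON array of raw events, validates required fields,
// and forwards each event to the processing pipeline via a channel.
func Handler(events chan<- RawChangeEvent) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var batch []RawChangeEvent
		if err := json.NewDecoder(r.Body).Decode(&batch); err != nil {
			http.Error(w, "bad payload: "+err.Error(), http.StatusBadRequest)
			return
		}
		for _, ev := range batch {
			if ev.Table == "" || ev.Operation == "" {
				http.Error(w, "missing required fields", http.StatusBadRequest)
				return
			}
		}
		for _, ev := range batch {
			events <- ev // hand off to the normalization stage
		}
		w.WriteHeader(http.StatusOK)
		fmt.Fprintf(w, `{"received": %d}`, len(batch))
	}
}
```

Wiring it up is then `http.Handle("/ingest", ingest.Handler(ch))` plus `http.ListenAndServe(":8080", nil)` in the overlay's main.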
---
Milestone 2: Normalization & Storage (Days 5–12)
Goal: Persist canonical events with deduplication and transaction support.
Tasks:
5. Define CanonicalChangeEvent and conversion (M)
- Implement `Normalize(raw RawChangeEvent) (CanonicalChangeEvent, error)`.
- Generate `EventID` as UUID v5 with `namespace = "cdc"` and `name = source.table + pk + txid + seq`, so replayed events produce identical IDs (see the sketch after this task list).
- Unit test with sample raw events.
- Acceptance: `go test ./normalize` passes; deterministic IDs.
6. SQLite schema and repository (S)
- SQL: `CREATE TABLE events (...)` with indexes (timestamp, entity_type+entity_id, transaction_id).
- Go repository: `Append(ctx, canonical CanonicalChangeEvent) error`.
- Acceptance: `INSERT` works; indexes created.
7. Idempotent ingestion (S)
- `INSERT OR IGNORE` by `event_id`; if conflict, skip.
- Acceptance: Duplicate `Append` does not increase row count.
8. Transaction atomicity (M)
- Ensure all canonical events belonging to the same database transaction are stored in one SQLite transaction. If any fails, roll back whole batch.
- Acceptance: Simulate failure on second event; no events persisted.
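
The following sketch covers Tasks 5, 7, and 8 together: a deterministic UUIDv5 `EventID` and an `AppendBatch` that writes one source transaction's events inside a single SQLite transaction with `INSERT OR IGNORE`. Struct fields and column names are illustrative placeholders for the Task 6 schema; the UUID helper assumes `github.com/google/uuid`.

```go
package store

import (
	"context"
	"database/sql"
	"fmt"

	"github.com/google/uuid"
)

// CanonicalChangeEvent mirrors the fields named in Task 5; illustrative only.
type CanonicalChangeEvent struct {
	EventID       string
	EntityType    string
	EntityID      string
	TransactionID string
	Operation     string
	Payload       []byte
	Timestamp     int64
}

// cdcNamespace is a fixed namespace derived from the string "cdc", so the
// same (table, pk, txid, seq) tuple always yields the same EventID.
var cdcNamespace = uuid.NewSHA1(uuid.NameSpaceOID, []byte("cdc"))

// EventID returns a deterministic UUIDv5 for a canonical event.
func EventID(table, pk, txID string, seq int) string {
	name := fmt.Sprintf("%s|%s|%s|%d", table, pk, txID, seq)
	return uuid.NewSHA1(cdcNamespace, []byte(name)).String()
}

// AppendBatch stores all canonical events of one source transaction
// atomically; duplicates (same event_id) are silently skipped.
func AppendBatch(ctx context.Context, db *sql.DB, events []CanonicalChangeEvent) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op once Commit has succeeded

	const q = `INSERT OR IGNORE INTO events
		(event_id, entity_type, entity_id, transaction_id, op, payload, ts)
		VALUES (?, ?, ?, ?, ?, ?, ?)`
	for _, e := range events {
		if _, err := tx.ExecContext(ctx, q,
			e.EventID, e.EntityType, e.EntityID, e.TransactionID,
			e.Operation, e.Payload, e.Timestamp); err != nil {
			return err // the deferred Rollback drops the whole batch
		}
	}
	return tx.Commit()
}
```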
---
Milestone 3: Inference Engine (Days 10–20)
Goal: Produce BusinessEvents from canonical changes.
Tasks:
9. Rule engine core (M)
- Accept a slice of `CanonicalChangeEvent`s that share the same `TransactionID`; apply all registered rules; return zero or more `BusinessEvent`s (see the sketch after this task list).
- Acceptance: Given known transaction, returns expected types.
10. Sale completion inference (S)
- Detect `sales` insert + multiple `sales_items` inserts within same transaction.
- Build `sale.completed` event with aggregated items.
- Acceptance: Transaction produces correct sale with line totals.
11. Inventory adjustment inference (S)
- Detect `inventory` update where `quantity` changes. Derive reason from `adjustment_reason` column; if absent, mark as `OTHER`.
- Acceptance: Adjustment event includes `old_qty`, `new_qty`, `reason`.
12. User role change inference (S)
- Detect `users` update where `role_id` value changes.
- Acceptance: `user.role_changed` emitted with old/new.
13. Unit test suite (M)
- Cover all rule patterns; target 80%+ coverage.
- Acceptance: `go test ./... -cover` ≥ 80%.
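
A sketch of the Task 9 rule-engine core, with the Task 10 sale-completion rule as an example. It reuses the illustrative `CanonicalChangeEvent` from the Milestone 2 sketch (assumed to be in scope here); the `BusinessEvent` fields are likewise placeholders.

```go
package infer

// BusinessEvent is an illustrative shape for inferred events.
type BusinessEvent struct {
	Type    string // e.g. "sale.completed"
	Payload map[string]interface{}
}

// Rule inspects all canonical events of one transaction and may emit
// zero or more business events.
type Rule interface {
	Apply(txEvents []CanonicalChangeEvent) []BusinessEvent
}

// Engine runs every registered rule over one transaction's events.
type Engine struct{ rules []Rule }

func NewEngine(rules ...Rule) *Engine { return &Engine{rules: rules} }

func (e *Engine) Infer(txEvents []CanonicalChangeEvent) []BusinessEvent {
	var out []BusinessEvent
	for _, r := range e.rules {
		out = append(out, r.Apply(txEvents)...)
	}
	return out
}

// SaleCompletedRule sketches Task 10: a `sales` insert plus one or more
// `sales_items` inserts in the same transaction yields sale.completed.
type SaleCompletedRule struct{}

func (SaleCompletedRule) Apply(txEvents []CanonicalChangeEvent) []BusinessEvent {
	var sale *CanonicalChangeEvent
	var items []CanonicalChangeEvent
	for i, ev := range txEvents {
		switch {
		case ev.EntityType == "sales" && ev.Operation == "insert":
			sale = &txEvents[i]
		case ev.EntityType == "sales_items" && ev.Operation == "insert":
			items = append(items, ev)
		}
	}
	if sale == nil || len(items) == 0 {
		return nil
	}
	return []BusinessEvent{{
		Type: "sale.completed",
		Payload: map[string]interface{}{
			"sale_id":    sale.EntityID,
			"item_count": len(items),
		},
	}}
}
```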
---
Milestone 4: AI Layer (Days 15–30)
Goal: Add summaries, anomaly detection, and LLM‑based explanations.
Tasks:
14. Ollama client integration (S)
- HTTP client to `http://localhost:11434/api/generate`; model `llama3.2:3b` (see the sketch after this task list).
- Timeout 30s; retry 3×.
- Acceptance: “Hello world” returns completion.
15. Daily summary batch job (M)
- A cron‑like scheduler inside the overlay (or an external cron job) runs at 23:30.
- Query today’s `BusinessEvent`s; summarize via Ollama with prompt: “Summarize in 3 paragraphs…”.
- Store result in table `daily_summaries(date, text)`.
- Acceptance: Next day summary available via API.
16. Heuristic anomaly detection (M)
- Compute rolling statistics for inventory adjustments, branch sales, and user activity. Flag outliers (z‑score > 3 or activity outside business hours).
- Persist to `anomalies` table with `severity` and `reason`.
- Acceptance: Known injected anomaly appears in `anomalies`.
17. LLM‑based anomaly explanations (M)
- For each new anomaly, call Ollama with context (event payload, history) to produce a short explanation; store.
- Cache explanations to avoid repeated calls.
- Acceptance: Explanations are coherent and reference data.
18. Caching layer (S)
- In‑memory LRU cache for summaries and anomaly explanations (keyed by date or anomaly ID). TTL 24h.
- Acceptance: Repeated calls within TTL hit cache (log message).
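
A sketch of the Task 14 client, assuming Ollama's documented `/api/generate` request shape (`model`, `prompt`, `stream`) and `response` field; the timeout and retry counts follow the task description.

```go
package llm

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

type Client struct {
	BaseURL string // e.g. "http://localhost:11434"
	Model   string // e.g. "llama3.2:3b"
	HTTP    *http.Client
}

func New() *Client {
	return &Client{
		BaseURL: "http://localhost:11434",
		Model:   "llama3.2:3b",
		HTTP:    &http.Client{Timeout: 30 * time.Second},
	}
}

// Generate sends a prompt to Ollama and returns the completion, retrying
// up to 3 times on transport or server errors.
func (c *Client) Generate(ctx context.Context, prompt string) (string, error) {
	body, err := json.Marshal(map[string]interface{}{
		"model":  c.Model,
		"prompt": prompt,
		"stream": false, // ask for a single JSON response instead of a stream
	})
	if err != nil {
		return "", err
	}

	var lastErr error
	for attempt := 1; attempt <= 3; attempt++ {
		req, err := http.NewRequestWithContext(ctx, http.MethodPost,
			c.BaseURL+"/api/generate", bytes.NewReader(body))
		if err != nil {
			return "", err
		}
		req.Header.Set("Content-Type", "application/json")

		resp, err := c.HTTP.Do(req)
		if err != nil {
			lastErr = err
			continue
		}
		if resp.StatusCode != http.StatusOK {
			resp.Body.Close()
			lastErr = fmt.Errorf("ollama: unexpected status %s", resp.Status)
			continue
		}
		var out struct {
			Response string `json:"response"`
		}
		err = json.NewDecoder(resp.Body).Decode(&out)
		resp.Body.Close()
		if err != nil {
			lastErr = err
			continue
		}
		return out.Response, nil
	}
	return "", fmt.Errorf("ollama: all retries failed: %w", lastErr)
}
```

The "Hello world" acceptance check is then a one-liner against `Generate`.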
---
Milestone 5: Interfaces (Days 25–35)
Goal: Provide usable CLI and MCP access.
Tasks:
19. CLI `auditctl` – summary (S)
- Command: `auditctl summary --date YYYY-MM-DD`.
- Fetches from `daily_summaries` or triggers on‑the‑fly generation.
- Acceptance: Prints summary to stdout.
20. CLI – anomalies (S)
- `auditctl anomalies --start ... --end ...` prints list.
- Acceptance: Shows stored anomalies.
21. CLI – trace (S)
- `auditctl trace <entity_type> <entity_id>` prints event timeline.
- Acceptance: Correct ordering and content.
22. CLI – export (S)
- `auditctl export --format csv --type sales --start ... --end ...` writes file.
- Acceptance: CSV file opens with headers.
23. MCP server implementation (M)
- Subcommand `audit-overlay mcp` runs a JSON‑RPC loop on stdin/stdout (see the sketch after this task list).
- Implement all five tools; proper error handling.
- Acceptance: Connect with an MCP‑capable client (e.g., ChatGPT or a local test client); tool calls succeed.
24. (Stretch) Web UI (L)
- Single‑page React app (Vite) served via `embed.FS`; pages: Dashboard (recent events, summary), Anomalies list, Entity trace.
- Acceptance: UI loads at `http://localhost:8081`; data displays.
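
A sketch of the Task 23 stdio loop, assuming newline-delimited JSON-RPC 2.0 framing as used by MCP's stdio transport; tool dispatch is delegated to a handler callback, and the MCP method and field names must be checked against the spec.

```go
package mcp

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

type request struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Method  string          `json:"method"`
	Params  json.RawMessage `json:"params"`
}

type response struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Result  interface{}     `json:"result,omitempty"`
	Error   *rpcError       `json:"error,omitempty"`
}

type rpcError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

// Serve reads newline-delimited JSON-RPC requests from stdin and writes one
// response per line to stdout; tool dispatch is left to the handle callback.
func Serve(handle func(method string, params json.RawMessage) (interface{}, error)) error {
	in := bufio.NewScanner(os.Stdin)
	in.Buffer(make([]byte, 0, 1<<20), 1<<20) // allow messages up to 1 MiB
	out := json.NewEncoder(os.Stdout)

	for in.Scan() {
		var req request
		if err := json.Unmarshal(in.Bytes(), &req); err != nil {
			continue // skip malformed lines
		}
		result, err := handle(req.Method, req.Params)
		if len(req.ID) == 0 {
			continue // JSON-RPC notification: no response expected
		}
		resp := response{JSONRPC: "2.0", ID: req.ID}
		if err != nil {
			resp.Error = &rpcError{Code: -32000, Message: err.Error()}
		} else {
			resp.Result = result
		}
		if err := out.Encode(resp); err != nil {
			return fmt.Errorf("write response: %w", err)
		}
	}
	return in.Err()
}
```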
---
Milestone 6: Security & Polish (Days 30–40)
Goal: Harden installation and meet enterprise concerns.
Tasks:
25. Data masking config (S)
- Define a `maskColumns` map (table → columns to mask); apply it before any LLM submission (see the sketch after this task list).
- Acceptance: Masked fields replaced with `[REDACTED]` in prompts.
26. Secrets management (S)
- Support environment variables and Docker secrets; fallback to `.env` with strict file perm check (`0600`).
- Acceptance: Service starts with secrets loaded; logs do not contain them.
27. Self‑audit logging (S)
- Structured logs (JSON) to rotating file; include timestamps, levels, component.
- Acceptance: Log rotates after size limit; old files compressed.
28. Docker Compose bundle (M)
- `docker-compose.yml` with services: `mysql` (optional), `debezium`, `audit-overlay`.
- Volumes for data persistence; healthchecks.
- Acceptance: `docker compose up -d` brings all up; overlay healthy.
29. Installation guide (M)
- Markdown doc: prerequisites, step‑by‑step, verification, troubleshooting, uninstall.
- Acceptance: Internally reviewed for clarity.
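
A sketch of the Task 25 masking step, applied to a row payload before it is folded into any LLM prompt; the table and column names below are placeholders for the real config.

```go
package mask

// maskColumns maps table name -> columns that must never reach the LLM.
// These entries are placeholders; the real list is loaded from config.
var maskColumns = map[string][]string{
	"users":     {"password_hash", "email"},
	"customers": {"tax_id", "phone"},
}

// Apply replaces masked columns in a row payload with "[REDACTED]".
// It returns a copy so the stored canonical event stays untouched.
func Apply(table string, row map[string]interface{}) map[string]interface{} {
	masked := make(map[string]interface{}, len(row))
	for k, v := range row {
		masked[k] = v
	}
	for _, col := range maskColumns[table] {
		if _, ok := masked[col]; ok {
			masked[col] = "[REDACTED]"
		}
	}
	return masked
}
```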
---
Milestone 7: Synthetic Data & Demo (Days 35–50)
Goal: Deliver a compelling demo out of the box.
Tasks:
30. Synthetic data generator development (M)
- Go program that creates branches, products, users, and suppliers, then simulates 90 days of activity with configurable anomaly rates (see the sketch after this task list).
- Acceptance: Database populated with approximately 1M events; realistic distributions.
31. Integrate generator into Docker Compose (S)
- Add `init-generator` service that runs once against MySQL and then exits.
- Acceptance: `docker compose up` automatically seeds data; overlay starts with data available.
32. Demo script and slide notes (S)
- Documented demo flow: show daily summary, trace an entity, highlight an anomaly, use MCP query.
- Acceptance: Sales team can run demo without engineering help.
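
A sketch of the Task 30 generator's core loop: day-by-day simulation with a configurable anomaly rate. Table names, columns, and the `GEN_DSN` environment variable are placeholders; the real inserts must follow the Report 3 schema (sales line items are omitted here for brevity).

```go
package main

import (
	"database/sql"
	"fmt"
	"log"
	"math/rand"
	"os"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

type Config struct {
	Days        int     // simulated calendar days, e.g. 90
	SalesPerDay int     // average sales per branch per day
	AnomalyRate float64 // probability a branch-day contains an injected anomaly
	Branches    int
}

// simulate writes synthetic sales into MySQL, occasionally injecting an
// after-hours stock adjustment as a known, findable anomaly.
func simulate(db *sql.DB, cfg Config, start time.Time) error {
	rng := rand.New(rand.NewSource(42)) // fixed seed => reproducible demo data
	for day := 0; day < cfg.Days; day++ {
		date := start.AddDate(0, 0, day)
		for b := 1; b <= cfg.Branches; b++ {
			n := cfg.SalesPerDay/2 + rng.Intn(cfg.SalesPerDay) // vary daily volume
			for i := 0; i < n; i++ {
				ts := date.Add(time.Duration(8+rng.Intn(12)) * time.Hour) // business hours
				if _, err := db.Exec(
					`INSERT INTO sales (branch_id, total, created_at) VALUES (?, ?, ?)`,
					b, 5+rng.Float64()*200, ts); err != nil {
					return fmt.Errorf("day %d branch %d: %w", day, b, err)
				}
			}
			if rng.Float64() < cfg.AnomalyRate {
				// injected anomaly: large negative adjustment at 02:00
				if _, err := db.Exec(
					`INSERT INTO inventory_adjustments (branch_id, quantity_delta, reason, created_at)
					 VALUES (?, ?, 'OTHER', ?)`,
					b, -(50 + rng.Intn(100)), date.Add(2*time.Hour)); err != nil {
					return err
				}
			}
		}
	}
	return nil
}

func main() {
	db, err := sql.Open("mysql", os.Getenv("GEN_DSN")) // e.g. user:pass@tcp(localhost:3306)/pharmacy
	if err != nil {
		log.Fatal(err)
	}
	// ~90 * 5 * 2000 = 900k sales plus line items, i.e. roughly 1M events.
	cfg := Config{Days: 90, SalesPerDay: 2000, AnomalyRate: 0.02, Branches: 5}
	if err := simulate(db, cfg, time.Now().AddDate(0, 0, -cfg.Days)); err != nil {
		log.Fatal(err)
	}
}
```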
---
Milestone 8: Pilot Prep & Documentation (Days 45–60)
Goal: Finalize package for customer pilot.
Tasks:
33. End‑to‑end testing & tuning (M)
- Run overlay on synthetic data for full 2‑week simulated period; tune anomaly thresholds to reduce false positives.
- Acceptance: Anomaly list matches expectations; summary quality acceptable.
34. Operator’s manual (M)
- Config reference, monitoring, log inspection, updating, support contacts.
- Acceptance: Document covers all CLI flags and config options.
35. Pilot proposal template (S)
- Editable Markdown → PDF template with scope, success criteria, timeline, exit clause.
- Acceptance: Can generate customer‑specific proposal in <1 hour.
36. Code cleanup & release tagging (M)
- Remove debug flags; ensure graceful shutdown; add basic Prometheus metrics (events ingested, errors; see the sketch after this task list).
- Tag v0.1.0; build Docker image.
- Acceptance: the v0.1.0 image builds and `docker run localhost/audit-overlay:0.1.0` starts cleanly.
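
For the Task 36 metrics, a sketch using `prometheus/client_golang`; the metric names are suggestions, not final.

```go
package metrics

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	// EventsIngested counts raw change events accepted by /ingest.
	EventsIngested = promauto.NewCounter(prometheus.CounterOpts{
		Name: "audit_overlay_events_ingested_total",
		Help: "Raw change events accepted by the ingest endpoint.",
	})
	// Errors counts processing errors, labelled by component.
	Errors = promauto.NewCounterVec(prometheus.CounterOpts{
		Name: "audit_overlay_errors_total",
		Help: "Errors by component.",
	}, []string{"component"})
)

// Expose registers the Prometheus scrape endpoint on the given mux.
func Expose(mux *http.ServeMux) {
	mux.Handle("/metrics", promhttp.Handler())
}
```

Call `metrics.EventsIngested.Inc()` in the ingest handler and `metrics.Errors.WithLabelValues("inference").Inc()` where rule evaluation fails.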
---
Summary Statistics
- Total tasks: 36 (mix of S/M/L).
- Estimated effort: roughly 1.5–2 engineer‑months, assuming 5–6 tasks can proceed in parallel; this fits the 60‑day timeline with slack.
---
60‑Day Schedule
| Days | Focus | Deliverables |
|------|-------|--------------|
| 1–10 | Ingestion & connectivity | Tasks 1–4; raw events flowing into the Go service. |
| 11–20 | Normalization + inference core | Tasks 5–13; BusinessEvents stored; unit tests passing. |
| 21–30 | AI layer + basic CLI | Tasks 14–18, plus CLI tasks 19–21 from Day 25; daily summary + anomaly detection; summary/anomalies/trace commands working. |
| 31–40 | MCP + security hardening | Tasks 22–29; export command; MCP server; Docker Compose; data masking; logging. |
| 41–50 | Synthetic data + demo | Tasks 30–32; generator integrated; demo script. |
| 51–60 | Polishing + pilot kit | Tasks 33–36; final testing; operator manual; release. |
Critical path: Ingestion → Normalization → Inference → AI → CLI. Parallelizable: security, docs, generator.
---
Risks & Mitigations
- LLM quality on‑prem: If the local Ollama model is too weak, summaries may be poor. Mitigation: Keep a cloud fallback ready; try a slightly larger model (7B) if hardware permits.
- Performance: High event volume could overwhelm SQLite. Mitigation: Enable WAL mode and tune the page cache (see the sketch after this list); monitor ingest throughput; keep a Postgres upgrade path in reserve.
- Synthetic data realism: Over‑tuned generator may not reflect real anomalies. Mitigation: involve domain expert (Antonio) to review and adjust patterns.
- Team bandwidth: 36 tasks across 60 days is aggressive with 1 engineer. Mitigation: prioritize MVP features (Report 4 core set); push stretch features (Web UI, advanced MCP tools) to post‑pilot.
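
For the SQLite mitigation above, a sketch of the startup pragmas, assuming a `database/sql` SQLite driver such as `mattn/go-sqlite3`; exact cache sizing depends on the host.

```go
package store

import "database/sql"

// tune applies SQLite pragmas that help sustain a high ingest rate: WAL keeps
// readers from blocking the writer, NORMAL sync trades a little durability
// for throughput, and a larger page cache reduces disk I/O.
func tune(db *sql.DB) error {
	pragmas := []string{
		"PRAGMA journal_mode=WAL",
		"PRAGMA synchronous=NORMAL",
		"PRAGMA cache_size=-64000", // negative value = size in KiB (~64 MB)
		"PRAGMA busy_timeout=5000", // wait up to 5 s on locks instead of failing
	}
	for _, p := range pragmas {
		if _, err := db.Exec(p); err != nil {
			return err
		}
	}
	return nil
}
```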
---
Conclusion
This plan provides a realistic path to a working MVP by Day 60. The backlog is broken into small, testable increments, and the schedule builds in slack for iteration. Success depends on maintaining the architecture boundaries (canonical events, rule engine, AI abstraction) so that improvements can be made post‑MVP without rewrites.