
Architecture Sketches



High-level diagrams and component breakdowns for the AI Audit Overlay MVP.

Mental Model (Initial)



Legacy enterprise app → MySQL database (read-only replica or CDC-enabled primary)

We attach a CDC connector (e.g., Debezium) to the database's binary log. It streams row-level change events either through a message broker such as Kafka or directly to our consumer (still undecided). Our service consumes these events, normalizes them into a canonical schema, infers business events (e.g., "inventory adjustment", "sale", "user role change"), stores them in an append-only event log, and provides AI-powered summaries and anomaly detection on top of that log.

Key idea: We are an overlay — no app code changes. We observe data mutations and reconstruct business processes.
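To make the "normalize into a canonical schema" step concrete, here is a minimal sketch in Python. It assumes the consumer receives Debezium's JSON change-event envelope (`before`, `after`, `source`, `op`, `ts_ms`). The `normalize` function, the `pk_columns` argument, and the exact field names on the dataclass are illustrative assumptions, not a settled design. Falling back to the MySQL GTID when Debezium transaction metadata is absent is likewise just one possible choice.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Debezium op codes: c = create, u = update, d = delete, r = snapshot read.
_OPS = {"c": "insert", "r": "insert", "u": "update", "d": "delete"}


@dataclass(frozen=True)
class CanonicalChangeEvent:
    table: str                      # "<db>.<table>"
    operation: str                  # "insert" | "update" | "delete"
    pk: tuple                       # primary-key values, in pk_columns order
    old: Optional[dict]             # row image before the change (None on insert)
    new: Optional[dict]             # row image after the change (None on delete)
    timestamp_ms: Optional[int]
    transaction_id: Optional[str]


def normalize(envelope: dict, pk_columns: list[str]) -> CanonicalChangeEvent:
    """Map one Debezium change event to the canonical shape.

    pk_columns is a hypothetical parameter: the caller is assumed to know
    the primary-key columns of the table the event came from.
    """
    # Accept both the schema-wrapped form ({"schema":..., "payload":...})
    # and the bare payload.
    payload = envelope.get("payload", envelope)
    source = payload["source"]
    before, after = payload.get("before"), payload.get("after")
    row = after if after is not None else before
    if row is None:
        raise ValueError("change event carries neither a before nor an after image")
    txn = payload.get("transaction") or {}
    return CanonicalChangeEvent(
        table=f'{source["db"]}.{source["table"]}',
        operation=_OPS[payload["op"]],
        pk=tuple(row[col] for col in pk_columns),
        old=before,
        new=after,
        timestamp_ms=source.get("ts_ms", payload.get("ts_ms")),
        # Assumption: prefer Debezium transaction metadata, else the GTID.
        transaction_id=txn.get("id") or source.get("gtid"),
    )
```

Keeping the canonical event a frozen dataclass means downstream stages (inference, storage) cannot mutate what was captured, which fits the append-only intent.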

Component Map (Draft)




[MySQL binlog]
↓ (Debezium MySQL connector)
[CDC events] → Kafka (optional intermediary; could be direct)
↓ (ingestion service)
[RawChangeEvent] → Normalizer → [CanonicalChangeEvent]
↓ (inference engine)
[BusinessEvent] → Append-only Event Store (Postgres/SQLite)

[AI Layer] ← query/index service

[CLI + Web UI + MCP]


Simpler for MVP: Debezium → direct gRPC/HTTP consumer → normalization → inference → SQLite/Postgres → FastAPI/Go service → CLI/Web/MCP.
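For the SQLite option, one way to get an append-only event store is to let the database itself reject updates and deletes. The sketch below uses only Python's standard `sqlite3` module. The table name `business_events`, its columns, and the helper functions are hypothetical names chosen for illustration.

```python
import json
import sqlite3

# Triggers make the table append-only at the database level:
# any UPDATE or DELETE that touches a row is aborted.
SCHEMA = """
CREATE TABLE IF NOT EXISTS business_events (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    event_type  TEXT    NOT NULL,
    entity      TEXT    NOT NULL,
    occurred_ms INTEGER NOT NULL,
    payload     TEXT    NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_events_entity
    ON business_events (entity, occurred_ms);
CREATE INDEX IF NOT EXISTS idx_events_type
    ON business_events (event_type, occurred_ms);
CREATE TRIGGER IF NOT EXISTS business_events_no_update
BEFORE UPDATE ON business_events
BEGIN
    SELECT RAISE(ABORT, 'business_events is append-only');
END;
CREATE TRIGGER IF NOT EXISTS business_events_no_delete
BEFORE DELETE ON business_events
BEGIN
    SELECT RAISE(ABORT, 'business_events is append-only');
END;
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the event store and ensure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def append_event(conn: sqlite3.Connection, event_type: str, entity: str,
                 occurred_ms: int, payload: dict) -> int:
    """Insert one event and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO business_events (event_type, entity, occurred_ms, payload) "
            "VALUES (?, ?, ?, ?)",
            (event_type, entity, occurred_ms, json.dumps(payload, sort_keys=True)),
        )
    return cur.lastrowid


def events_for_entity(conn: sqlite3.Connection, entity: str) -> list:
    """Return (event_type, occurred_ms, payload) tuples for one entity, oldest first."""
    rows = conn.execute(
        "SELECT event_type, occurred_ms, payload FROM business_events "
        "WHERE entity = ? ORDER BY occurred_ms, id",
        (entity,),
    )
    return [(etype, ts, json.loads(body)) for etype, ts, body in rows]
```

The two indexes correspond to lookups by entity and by type within a time range. Postgres could enforce the same rule with triggers or by revoking UPDATE/DELETE privileges, but that variant is not shown here.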

Data Flow



1. Capture: CDC connector reads binlog, produces change events (insert/update/delete with before/after images).
2. Normalize: Convert DB-specific event format to `CanonicalChangeEvent` (table, operation, pk, old, new, timestamp, transaction id).
3. Infer: Apply rules/ML to map sequences of changes to higher-level business events (e.g., "stock transfer" detected from inventory adjustments + location change).
4. Store: Persist events immutably; build secondary indexes for fast query (by entity, time, type).
5. AI: Precompute daily summaries; detect anomalies via heuristics, with LLM-generated explanations; provide semantic search over the event corpus.
6. Serve: CLI commands, optional web dashboard, MCP tools for external agents.
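The Infer step (step 3) can start as plain rules before any ML is involved. Below is a rough sketch of one such rule: within a single database transaction, a quantity decrease at one location paired with an equal increase of the same item at another location is reported as a stock transfer. Everything specific here is an assumption for illustration: the `inventory` table name, the `sku`, `location_id`, and `qty` columns, the `Change` stand-in type, and the matching rule itself.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Change:
    """Minimal stand-in for CanonicalChangeEvent (only the fields this rule needs)."""
    table: str
    old: Optional[dict]
    new: Optional[dict]
    transaction_id: Optional[str]


def _qty_delta(change: Change) -> int:
    # A missing image (insert or delete) counts as quantity 0.
    old_qty = (change.old or {}).get("qty", 0)
    new_qty = (change.new or {}).get("qty", 0)
    return new_qty - old_qty


def infer_stock_transfers(changes: Iterable[Change]) -> list[dict]:
    """Pair offsetting inventory changes that share a transaction id."""
    by_txn: dict[str, list[Change]] = defaultdict(list)
    for ch in changes:
        if ch.table == "inventory" and ch.transaction_id is not None:
            by_txn[ch.transaction_id].append(ch)

    events = []
    for txn_id, group in by_txn.items():
        debits = [c for c in group if _qty_delta(c) < 0]
        credits = [c for c in group if _qty_delta(c) > 0]
        for debit in debits:
            d_row = debit.new or debit.old
            for credit in credits:
                c_row = credit.new or credit.old
                same_item = d_row["sku"] == c_row["sku"]
                moved = d_row["location_id"] != c_row["location_id"]
                balanced = _qty_delta(debit) + _qty_delta(credit) == 0
                if same_item and moved and balanced:
                    events.append({
                        "type": "stock_transfer",
                        "transaction_id": txn_id,
                        "sku": d_row["sku"],
                        "from_location": d_row["location_id"],
                        "to_location": c_row["location_id"],
                        "quantity": _qty_delta(credit),
                    })
    return events
```

Grouping by transaction id is the simplest correlation key. A legacy app that performs a transfer across several transactions would need a time-window or entity-based grouping instead, which this sketch does not attempt.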


Constraints & Assumptions



Open Questions (to move to scrapbook)



Keep this updated as the architecture evolves.