MCP Tooling Specification & Security/Compliance Checklist
This report details the Model Context Protocol (MCP) interface that allows external AI assistants to query the audit overlay, and provides a security/compliance checklist for IT approval. It also outlines a low‑risk pilot proposal.
---
1. MCP Server Interface
MCP (Model Context Protocol) is a JSON‑RPC‑based standard for tools that AI agents can invoke. Running the overlay as an MCP server enables direct integration with ChatGPT, Claude, or other MCP‑aware clients.
Transport
- Stdio: The server runs as a subprocess; communication via stdin/stdout. This is simplest and most secure (no network exposure).
- Alternative: TCP socket (disabled by default for security).
Available Tools
| Tool | Description | Parameters | Returns |
|------|-------------|------------|---------|
| `audit.search_events` | Free‑text search over event payloads | `query` (string) | list of matching events (id, type, timestamp, entity, snippet) |
| `audit.get_daily_summary` | Natural‑language summary for a date | `date` (YYYY‑MM‑DD) | `summary` (string) |
| `audit.get_anomalies` | Retrieve flagged anomalies | `start_date`, `end_date` (YYYY‑MM‑DD), optional `severity` (string) | list of anomalies with explanation |
| `audit.trace_entity` | Full timeline for an entity | `entity_type` (string), `entity_id` (string) | list of events sorted chronologically |
| `audit.export_report` | Export data in various formats | `format` ("json"|"csv"|"md"), `type` (e.g., "sales"), `start`, `end` dates | file content (base64 or direct) |
All tools are read‑only and execute in milliseconds for cached data. `audit.get_daily_summary` may trigger on‑the‑fly generation if the summary is not pre‑computed; that is the only tool call that may invoke the LLM and take seconds.
Example MCP Call (JSON‑RPC)
```json
{
  "jsonrpc": "2.0",
  "method": "audit.trace_entity",
  "params": { "entity_type": "product", "entity_id": "SKU12345" },
  "id": 42
}
```
Response:
```json
{
  "jsonrpc": "2.0",
  "result": {
    "events": [
      {
        "event_id": "uuid...",
        "type": "inventory.adjustment",
        "timestamp": "2026-02-15T10:30:00Z",
        "payload": { "reason": "SALE", "qty_change": -2 }
      },
      ...
    ]
  },
  "id": 42
}
```
Implementation in Go
The overlay binary will support a `mcp` subcommand:
```bash
audit-overlay mcp --stdio
```
It will:
- Initialize storage (SQLite).
- Register tool handlers.
- Loop: read JSON‑RPC request from stdin, dispatch, write response to stdout.
- Respect cancellation via context.
No external dependencies beyond the core library.
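The read/dispatch/write loop described above can be sketched as follows. This is a minimal illustration, not the final implementation: the handler set, struct names, and stub result are placeholders, and real handlers would query the SQLite store.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// request and response mirror JSON-RPC 2.0 framing, one object per line.
type request struct {
	JSONRPC string          `json:"jsonrpc"`
	Method  string          `json:"method"`
	Params  json.RawMessage `json:"params"`
	ID      int             `json:"id"`
}

type rpcError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

type response struct {
	JSONRPC string      `json:"jsonrpc"`
	Result  interface{} `json:"result,omitempty"`
	Error   *rpcError   `json:"error,omitempty"`
	ID      int         `json:"id"`
}

// handlers maps tool names to read-only handlers (stubbed here).
var handlers = map[string]func(json.RawMessage) (interface{}, error){
	"audit.get_daily_summary": func(p json.RawMessage) (interface{}, error) {
		return map[string]string{"summary": "stub"}, nil
	},
}

// dispatch routes one request to its handler and wraps the outcome
// in a JSON-RPC response, using the standard "method not found" code.
func dispatch(req request) response {
	h, ok := handlers[req.Method]
	if !ok {
		return response{JSONRPC: "2.0", Error: &rpcError{Code: -32601, Message: "method not found"}, ID: req.ID}
	}
	res, err := h(req.Params)
	if err != nil {
		return response{JSONRPC: "2.0", Error: &rpcError{Code: -32000, Message: err.Error()}, ID: req.ID}
	}
	return response{JSONRPC: "2.0", Result: res, ID: req.ID}
}

func main() {
	sc := bufio.NewScanner(os.Stdin)
	out := json.NewEncoder(os.Stdout)
	for sc.Scan() { // one JSON-RPC request per line on stdin
		var req request
		if err := json.Unmarshal(sc.Bytes(), &req); err != nil {
			fmt.Fprintln(os.Stderr, "parse error:", err)
			continue
		}
		out.Encode(dispatch(req))
	}
}
```

Because stdio is the only transport, the process has no listening sockets at all; a supervisor (or the MCP client itself) owns its lifecycle via the pipe.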
---
2. Security & Compliance Checklist
When presenting to a customer’s IT or security team, address these points:
2.1 Privileges
Database:
- MySQL: Create a dedicated user with `REPLICATION SLAVE` privilege and `SELECT` on all tables to be captured. Use a read‑only replica if possible to avoid any performance impact on the primary.
- SQL Server: Enable CDC on required tables (one‑time DBA task). Create a login with `SELECT` on the CDC change tables and `EXECUTE` on the `cdc.fn_cdc_get_all_changes_*` functions.
- No `INSERT`/`UPDATE`/`DELETE` rights on production tables.
- The overlay never writes to the source database (offset storage is internal).
2.2 Network Layout
- All components run on a dedicated host or VM inside the customer’s network.
- Docker network isolates services; no inbound ports from the internet unless a UI is explicitly exposed behind the corporate firewall.
- Optional: Deploy behind an internal reverse proxy with TLS termination if remote access is needed (e.g., auditors from another office).
2.3 Encryption
- In transit:
- Database connections use TLS if the DB server supports it (MySQL `--require-secure-transport`, SQL Server `Encrypt=yes`).
- Internal HTTP between Debezium and overlay can be plaintext on localhost; if crossing hosts, enable HTTPS with self‑signed certs.
- At rest:
- SQLite database file stored with `0600` permissions (owner read/write only).
- Customer may enable disk‑level encryption (LUKS, BitLocker, etc.) as part of their standard.
- Secrets: Stored in environment variables or Docker secrets; never written to logs.
2.4 Audit Logging of the Overlay Itself
- The overlay logs all major actions (ingest errors, startup/shutdown, MCP requests) to a rotating file (`audit-overlay.log`) with size‑based rotation (e.g., 10 MB max, keep 5).
- Logs include timestamps but do not log raw event payloads (to avoid PII duplication). They may log metadata counts.
- Log file is readable only by the service account.
2.5 Data Minimization
Before any data leaves the host (e.g., if using a cloud LLM), we apply a masking filter:
- Maintain a configurable list of columns to redact (e.g., `users.username`, `customers.ssn`, `sales.credit_card_last4`).
- When preparing data for LLM summarization, those fields are replaced with `[REDACTED]` or hashed.
- By default, the MVP uses on‑prem LLM (Ollama), so no outbound traffic is required. Cloud LLM is opt‑in only after customer signs a data processing agreement.
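A minimal sketch of the masking filter, assuming row payloads arrive as string maps (the column names below are the examples from the list above, not a fixed schema):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// redactColumns lists fully-qualified columns to mask before any payload
// is handed to an LLM; in the real overlay this list is configurable.
var redactColumns = map[string]bool{
	"users.username":          true,
	"customers.ssn":           true,
	"sales.credit_card_last4": true,
}

// maskPayload returns a copy of the row with sensitive fields replaced.
// With hash=true, values become a truncated SHA-256 digest, so equal
// values remain correlatable without revealing the original.
func maskPayload(table string, row map[string]string, hash bool) map[string]string {
	out := make(map[string]string, len(row))
	for col, val := range row {
		if redactColumns[table+"."+col] {
			if hash {
				sum := sha256.Sum256([]byte(val))
				out[col] = hex.EncodeToString(sum[:8])
			} else {
				out[col] = "[REDACTED]"
			}
			continue
		}
		out[col] = val
	}
	return out
}

func main() {
	row := map[string]string{"ssn": "123-45-6789", "city": "Berlin"}
	fmt.Println(maskPayload("customers", row, false))
}
```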
2.6 On‑Prem vs Cloud LLM
- On‑prem (default): Ollama running locally (`localhost:11434`). Models like `llama3.2:3b` or `mistral:7b` give adequate summarization for demo purposes. No data leaves network; CPU usage acceptable on a modern server.
- Cloud (optional): OpenRouter or Anthropic API endpoints. Higher quality but requires outbound internet and a data‑processing agreement. We allow this via configuration (`LLM_BACKEND=cloud`, `LLM_API_KEY=…`). Secure transmission via HTTPS with API key.
Recommendation for pilot: Start on‑prem; if summaries are too weak, discuss cloud option with customer.
---
3. Pilot Proposal Outline (Low‑Risk Engagement)
Goal: Prove value in 2 weeks on a representative environment with minimal friction.
Installation
- Provide a single `docker-compose.yml` that starts Debezium, the overlay, and (optionally) a sample MySQL instance with synthetic pharmacy data.
- If the customer wants to use their own data, we take a fresh backup or replica, scrub PII (automated script), and point the overlay at it.
- One‑command install: `docker compose up -d`. Uninstall: `docker compose down -v` (removes all data).
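The compose file could look roughly like this; service names, images, and tags are illustrative and will be pinned in the actual deliverable:

```yaml
# Sketch only — not the shipped file.
services:
  debezium:
    image: debezium/server          # version pinned in the deliverable
    env_file: debezium.env          # DB credentials via env, never in the file
    depends_on: [overlay]
  overlay:
    image: audit-overlay:latest
    volumes:
      - overlay-data:/var/lib/audit-overlay   # SQLite state survives restarts
    # no published ports by default; the optional UI stays behind the firewall
volumes:
  overlay-data:
```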
Support
- Daily check‑ins for the first week.
- Remote monitoring via logs (customer can share log snippets).
Success Criteria
- Functional: All core features (daily summary, inventory anomaly, traceability) run without errors for the full 2‑week period.
- Auditability: The audit team reports they can answer at least 3 typical questions (e.g., “What happened to product X last Tuesday?”) faster than before.
- Anomaly detection: At least one true positive anomaly is identified that the customer acknowledges would have been missed manually.
- Usability: The MCP interface allows their internal AI assistant (if they have one) to retrieve information without writing custom SQL.
Exit
If the customer decides not to proceed, they simply stop the containers. The overlay never modified the source database (access was strictly read‑only), so it can be removed without leaving any trace.
---
4. Recommended MVP Stack (Recap from Report 2)
- CDC Ingestion: Debezium Server (HTTP sink) for MySQL; future: native SQL Server poller.
- Normalization: Go service, canonical event structs.
- Storage: SQLite (upgradeable to Postgres).
- AI: Ollama on‑prem (default), cloud LLM as fallback.
- Interfaces: CLI (`auditctl`), MCP server (stdio), optional Web UI.
- Deployment: Docker Compose, single binary for Go service.
This stack meets the “easy install, on‑prem friendly, value in days” mandate.
---