Audit Intelligence MVP Features

The AI Audit Overlay must produce immediate, tangible value for a pilot customer. These eight features are designed to be sellable: they solve real audit and operational intelligence problems with minimal setup. Each combines a simple heuristic baseline with LLM‑powered explanations, making them both transparent (“why was I alerted?”) and scalable.

---

1. Daily Audit Summary

What: At the end of each day (configurable time), generate a natural‑language overview of the day’s business activity, highlighting anomalies and significant changes.

Signals: All `BusinessEvent`s for the day, plus any flagged anomalies.

Heuristic baseline: Collect the day’s events, group by type, count per branch. Emit a plain bullet list: “X sales, Y inventory adjustments, Z user role changes.”
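
The baseline above could be sketched roughly as follows. This is a minimal illustration, not the overlay's actual code: the `branch_id` and `event_type` keys are assumed names for fields on a `BusinessEvent`, and `baseline_summary` is a hypothetical helper.

```python
from collections import Counter

def baseline_summary(events):
    """Count events per (branch, event type) and render a plain bullet list.

    Assumption: each event is a dict with `branch_id` and `event_type` keys;
    the real BusinessEvent shape may differ.
    """
    counts = Counter((e["branch_id"], e["event_type"]) for e in events)
    lines = []
    for (branch, event_type), n in sorted(counts.items()):
        lines.append(f"- Branch {branch}: {n} x {event_type}")
    return "\n".join(lines)
```

Sorting by (branch, type) keeps the list stable from day to day, which makes consecutive reports easier to compare by eye.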

AI improvement: Feed aggregated stats and top‑N events (by impact) into an LLM with a prompt: “You are an audit assistant. Write a 3‑paragraph summary of today’s activity. Start with the most critical points (anomalies, high‑value events). Then give a brief overview of normal operations. End with any open questions.” The output is human‑readable, insightful, and can be posted to a channel or saved as a report.
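
One way to assemble that prompt, shown only as a sketch: the instruction text is the one quoted above, while `build_summary_prompt`, the `impact` and `description` keys, and the default `top_n` are assumptions made for illustration. The actual LLM call is deliberately left out.

```python
PROMPT_HEADER = (
    "You are an audit assistant. Write a 3-paragraph summary of today's activity. "
    "Start with the most critical points (anomalies, high-value events). "
    "Then give a brief overview of normal operations. "
    "End with any open questions."
)

def build_summary_prompt(stats, events, top_n=5):
    """Assemble prompt text from aggregated stats and the top-N events by impact.

    Assumptions: `stats` maps a label to a count, and each event dict carries
    hypothetical `impact` (numeric) and `description` keys.
    """
    top = sorted(events, key=lambda e: e["impact"], reverse=True)[:top_n]
    stat_lines = [f"- {label}: {count}" for label, count in stats.items()]
    event_lines = [f"- {e['description']} (impact {e['impact']})" for e in top]
    return "\n".join(
        [PROMPT_HEADER, "", "Aggregated stats:", *stat_lines, "", "Top events:", *event_lines]
    )
```

Capping the event list at `top_n` keeps the prompt size bounded regardless of how busy the day was.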

Pilot metrics:

% reduction in time auditors spend assembling daily reports (baseline vs overlay).

Qualitative feedback on usefulness of narrative.

Number of anomalies surfaced that were previously missed.

---

2. Inventory Anomaly Detection

What: Detect suspicious changes to inventory levels, which often indicate theft, fraud, or process breakdowns.

Signals: `inventory.adjustment` events (reason_code includes DAMAGE, THEFT, COUNT, SUPPLIER_ERROR) and also adjustments linked to sales (reason_code=SALE) but out of expected pattern.

Heuristic baseline:

Track daily adjustment volume per (branch, product). Compute rolling 30‑day average and standard deviation. Flag if today’s count > µ+3σ.

Flag adjustments made outside business hours (e.g., between 22:00 and 06:00).

Flag any `reason_code='THEFT'` automatically.

Flag adjustments where `notes` contain keywords (“lost”, “broken”) but no corresponding sale or PO.
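
The four rules above might look like this in code. It is a sketch under stated assumptions: the event is a dict with hypothetical `occurred_at`, `reason_code`, and `notes` keys; the lookup for a corresponding sale or PO is reduced to a boolean the caller supplies; and the population standard deviation is used because the text does not say which variant is intended.

```python
from datetime import datetime
from statistics import mean, pstdev

KEYWORDS = ("lost", "broken")

def volume_outlier(history, today_count, k=3.0):
    """True if today's count exceeds mean + k * std of the trailing window."""
    if len(history) < 2:
        return False  # too little history to estimate spread
    return today_count > mean(history) + k * pstdev(history)

def after_hours(ts, start_hour=22, end_hour=6):
    """True if the timestamp falls in the overnight window [start_hour, end_hour)."""
    return ts.hour >= start_hour or ts.hour < end_hour

def flag_reasons(event, history, today_count, has_matching_sale_or_po):
    """Return the list of baseline rules that this adjustment trips."""
    reasons = []
    if volume_outlier(history, today_count):
        reasons.append("volume_outlier")
    if after_hours(event["occurred_at"]):
        reasons.append("after_hours")
    if event["reason_code"] == "THEFT":
        reasons.append("theft_reason_code")
    notes = (event.get("notes") or "").lower()
    if any(word in notes for word in KEYWORDS) and not has_matching_sale_or_po:
        reasons.append("keyword_without_match")
    return reasons
```

Returning rule names rather than a single boolean gives the LLM explanation step something concrete to cite. Note that a perfectly flat history has zero standard deviation, so any increase at all would be flagged; a minimum-spread floor may be worth adding.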

AI improvement: For each flagged event, generate a concise explanation referencing related events if any. Example: “Large negative adjustment of 50 units of Product X at Branch 5 after hours, with reason ‘COUNT’. No matching sale in the last 24h. Possible data entry error or theft.” The LLM can also suggest further investigation steps.

Pilot metrics:

Precision: % of alerts that, upon manual review, indicate a real issue.

Time to detect a known injected anomaly (we’ll inject test thefts).

Reduction in month‑end inventory variance.

---

3. Suspicious User Activity

What: Identify risky behavior by system users (employees), such as unauthorized role changes, excessive data modifications, or after‑hours activity.

Signals: `user.role_changed`, `user.created`, `user.deactivated`, and also high‑frequency `inventory.adjustment` or `sale` events by the same `adjusted_by_user_id` or `cashier_user_id`.

Heuristic baseline:

Role changes not performed by admin roles → flag.

New user assigned a privileged role immediately → flag.

Count of modifications (inventory adjustments, sales deletions) per user within a sliding time window; if the count exceeds a threshold (e.g., 50 changes in 10 minutes) → flag.

Activity between 23:00–05:00 local time → highlight.
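
The modification-burst rule can be sketched with a per-user sliding window. This assumes events arrive as `(user_id, timestamp)` pairs; `burst_users` and its defaults (50 changes, 10 minutes, taken from the example threshold above) are illustrative rather than a fixed design.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

def burst_users(events, threshold=50, window=timedelta(minutes=10)):
    """Return user IDs with more than `threshold` modifications inside any window.

    Assumption: `events` is an iterable of (user_id, timestamp) pairs.
    """
    recent = defaultdict(deque)  # user_id -> timestamps still inside the window
    flagged = set()
    for user_id, ts in sorted(events, key=lambda e: e[1]):
        q = recent[user_id]
        q.append(ts)
        # Drop timestamps that have fallen out of the window ending at `ts`.
        while ts - q[0] > window:
            q.popleft()
        if len(q) > threshold:
            flagged.add(user_id)
    return flagged
```

Because each timestamp is appended and removed at most once, the scan stays linear in the number of events after the initial sort, which fits a nightly batch.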

AI improvement: LLM creates a short behavioral note: “User ‘jdoe’ made 72 inventory adjustments in 8 minutes during off‑hours, all with reason ‘OTHER’. Recommend review.” It can also correlate with other events (e.g., “followed by a sale void”).

Pilot metrics:

Number of policy violations detected that were previously unknown.

False positive share (% of alerts found to be benign after review).

Time saved per audit cycle.

---

4. Branch‑to‑Branch Variance

What: Compare key metrics across branches to spot outliers that may need attention (fraud, training gaps, supply issues).

Signals: Daily aggregated data per branch: total sales, number of transactions, average basket size, inventory adjustment rate, number of users changed.

Heuristic baseline: For each metric, compute branch‑level z‑scores against the branch population for that day. Flag any branch with |z| > 2 on any metric. Also track week‑over‑week changes.
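
A minimal sketch of the z-score rule for a single metric on a single day. It assumes the input is a mapping from branch ID to metric value and uses the population standard deviation; `branch_outliers` is a hypothetical name.

```python
from statistics import mean, pstdev

def branch_outliers(metric_by_branch, z_threshold=2.0):
    """Return {branch: z} for branches whose |z| exceeds the threshold.

    Assumption: `metric_by_branch` maps branch ID -> that day's metric value.
    """
    values = list(metric_by_branch.values())
    if len(values) < 2:
        return {}
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return {}  # all branches identical, nothing stands out
    scores = {b: (v - mu) / sigma for b, v in metric_by_branch.items()}
    return {b: z for b, z in scores.items() if abs(z) > z_threshold}
```

One caveat for small pilots: with the population standard deviation, |z| cannot exceed sqrt(n - 1) for n branches, so a strict |z| > 2 rule can only fire once at least six branches report. For a smaller fleet, a lower threshold or the week-over-week comparison may be the more useful signal.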

AI improvement: LLM summarizes: “Branch 3 sales dropped 35% compared to last week while its inventory adjustment rate doubled. Possible staffing issue or local supply problem.” It can suggest follow‑up questions.

Pilot metrics:

Share of flagged branches that had an identifiable issue (e.g., staff shortage, system outage).

Management uptake: number of actions taken based on alerts.

---

5. Entity Traceability Report

What: Given an entity ID (product SKU, user, branch), produce a chronological audit trail of all events affecting that entity.

Signals: Any `BusinessEvent` where `entity_id` matches the query.

Heuristic baseline: Simple timeline listing: timestamp, event type, payload summary, source transaction ID.
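
The timeline listing could be produced along these lines. The dict keys (`entity_id`, `occurred_at`, `event_type`, `summary`, `source_tx_id`) are assumed names chosen to mirror the columns listed above, not a confirmed schema.

```python
def entity_timeline(events, entity_id):
    """Return chronologically ordered one-line entries for a single entity.

    Assumption: each event is a dict with `entity_id`, `occurred_at`
    (a datetime), `event_type`, `summary`, and `source_tx_id` keys.
    """
    matching = [e for e in events if e["entity_id"] == entity_id]
    matching.sort(key=lambda e: e["occurred_at"])
    return [
        f"{e['occurred_at'].isoformat()} | {e['event_type']} | "
        f"{e['summary']} | tx={e['source_tx_id']}"
        for e in matching
    ]
```

The same ordered list can double as the context handed to the LLM for the narrative version.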

AI improvement: LLM narrativizes the timeline: “Product P123 saw three price changes on Feb 10, followed by an unusual inventory adjustment. This could indicate price‑testing or manipulation.” The LLM can also answer natural‑language follow‑up questions (“Was there a sale around that time?”) by recalling events.

Pilot metrics:

Time saved during ad‑hoc investigations (e.g., “Why did inventory of drug X vanish?”).

Completeness: % of relevant events captured in timeline.

---

6. Sales Anomaly Detection (Stretch)

What: Detect patterns in sales that may indicate fraud or gaming.

Signals: `sale.completed` events.

Heuristic:

Flag sales with extreme `total_amount` (> 3σ above branch average).

Flag rapid sequence of small sales followed by a large refund (look for refund transactions if we model them; for MVP we may skip refunds).

Flag clusters of sales recorded under a “test” cashier account, and self‑checkout anomalies.

AI improvement: LLM contextualizes: “Three sales over $1000 occurred just after a manual inventory adjustment of the same products. Verify for potential collusion.”

Pilot metrics: Fraud detection yield; false positives.

---

7. Purchase Order Fraud Detection (Stretch)

What: Spot suspicious supplier interactions.

Signals: `purchase_order.received` with `quantity_received` much higher than ordered, new suppliers suddenly receiving large orders, or repeated over‑billing.

Heuristic:

New supplier (first PO) > $X threshold → flag.

`(quantity_received - quantity_ordered) / quantity_ordered > 0.2` → flag.

Frequent supplier changes shortly after a role change in procurement → flag.
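
The first two rules could be sketched as below. The dollar threshold for a first PO is not specified in this document, so it is left as a required argument; `po_flags`, `total_amount`, and the `is_first_po` flag are hypothetical names. The supplier-churn rule needs cross-event history and is not covered here.

```python
def over_receipt_ratio(quantity_ordered, quantity_received):
    """(received - ordered) / ordered, or None when nothing was ordered."""
    if quantity_ordered <= 0:
        return None  # ratio undefined; such POs need separate handling
    return (quantity_received - quantity_ordered) / quantity_ordered

def po_flags(po, is_first_po, first_po_amount_threshold, ratio_threshold=0.2):
    """Return the names of the baseline rules a received PO trips.

    Assumption: `po` is a dict with `total_amount`, `quantity_ordered`,
    and `quantity_received` keys.
    """
    flags = []
    if is_first_po and po["total_amount"] > first_po_amount_threshold:
        flags.append("large_first_po")
    ratio = over_receipt_ratio(po["quantity_ordered"], po["quantity_received"])
    if ratio is not None and ratio > ratio_threshold:
        flags.append("over_receipt")
    return flags
```

Guarding the zero-ordered case matters because the ratio in the rule above divides by `quantity_ordered`.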

AI improvement: LLM generates a summary: “Supplier ‘ABC’ received 150% of ordered quantity on first order. Investigate potential invoice fraud.”

Pilot metrics: Cost savings from overbilling detection; supplier quality insights.

---

8. Role Change Alert (Stretch)

What: Immediate notification when a user’s role is escalated to a privileged level.

Signals: `user.role_changed` events.

Heuristic: Compare the new role’s privilege level with the old one (we maintain a simple ranking: admin > manager > cashier > warehouse). If the privilege level increases, flag the change unless the actor is an admin.
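
A sketch of that comparison, assuming role names are stored as the lowercase strings used above; `ROLE_RANK` and `is_suspicious_escalation` are illustrative names.

```python
# Higher number = more privilege, following admin > manager > cashier > warehouse.
ROLE_RANK = {"warehouse": 0, "cashier": 1, "manager": 2, "admin": 3}

def is_suspicious_escalation(old_role, new_role, actor_role):
    """True if the role was raised and the actor performing it is not an admin."""
    escalated = ROLE_RANK[new_role] > ROLE_RANK[old_role]
    return escalated and actor_role != "admin"
```

An unknown role name raises `KeyError` here, which is arguably the safer failure mode for an audit rule than silently treating it as low privilege.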

AI improvement: LLM produces a one‑sentence alert: “User ‘asmith’ promoted from cashier to manager by ‘admin1’ at 02:14 AM. Verify authorization.”

Pilot metrics: Detection of unauthorized promotions; reduction in privilege creep.

---

Implementation Notes

Caching: Daily summary is computed once per day and stored; subsequent requests use cached version.

Anomaly detection can run as a nightly batch over the previous day’s events (simpler for MVP). Real‑time alerts are stretch.

AI calls: Batch multiple explanations together to reduce LLM API usage. For on‑prem Ollama, throughput may be limited; schedule during off‑peak.

Data minimization: Before sending data to the LLM, strip PII fields (e.g., customer names and personal details from `users`). We’ll maintain a column blacklist per table.
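
A per-table column blacklist could be applied like this. The table and column names in `PII_COLUMNS` are placeholders invented for the example; the real list would come from the pilot customer's schema.

```python
# Hypothetical blacklist: table name -> columns that must never reach the LLM.
PII_COLUMNS = {
    "users": {"full_name", "email"},
    "sales": {"customer_name"},
}

def redact(table, row):
    """Return a copy of `row` with the table's blacklisted columns removed."""
    blocked = PII_COLUMNS.get(table, set())
    return {key: value for key, value in row.items() if key not in blocked}
```

Note that tables missing from the mapping pass through unchanged in this sketch; an allowlist that drops unknown tables by default would be the stricter alternative.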

Delivery: CLI commands (`auditctl summary`, `auditctl anomalies`), optional Web UI dashboard, MCP tools for external agents.

---

Recommended MVP Feature Set (Day 60)

To stay within 60 days, prioritize:

1. Daily Audit Summary (core)
2. Inventory Anomaly Detection (high business value)
3. Suspicious User Activity (security)
4. Entity Traceability (investigation aid)
5. Branch‑to‑Branch Variance (operational insight)

The remaining three (Sales, PO, Role Change) are valuable but can be post‑MVP extensions. The architecture is built to accommodate them with minimal changes.
