Scrapbook: CDC Landscape Kickoff
Date: 2026-02-16T15:50:00Z
Task
Research practical, MVP-oriented CDC options for MySQL, SQL Server, Postgres, and Oracle, with a focus on:
- Open source accessibility
- On-prem deployment simplicity
- Operational risks and performance
- Required privileges
- Offset management and failure recovery
Starting Hypotheses
- MySQL MVP front-runner: Debezium (mature; usually run on Kafka Connect, but can target other sinks via Debezium Server or the embedded engine). Maxwell is simpler to deploy but less actively maintained.
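- SQL Server: built-in CDC is viable on its own; the Debezium connector reads the built-in CDC change tables, so it adds a layer on top rather than replacing them.
- Postgres: logical decoding via pgoutput (built in since PostgreSQL 10) or the wal2json plugin; Debezium connector is solid.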
- Oracle: GoldenGate is expensive; the Debezium Oracle connector exists (LogMiner-based) but may be less mature; custom redo log parsing is a possible fallback (Oracle has redo logs, not a binlog).
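Since Debezium shows up in all four hypotheses, a quick sketch of what a consumer has to do with its change events may help when comparing options. This assumes the standard Debezium envelope fields (`op`, `before`, `after`) and a hypothetical primary key column called `id`; the in-memory dict stands in for whatever the real sink would be.

```python
def apply_change_event(table, event, key_field="id"):
    """Apply one Debezium-style change event to an in-memory table (dict keyed by primary key)."""
    # The envelope may arrive wrapped as {"schema": ..., "payload": ...} or as the bare payload.
    payload = event.get("payload", event)
    op = payload["op"]
    if op in ("c", "r", "u"):
        # create, snapshot read, update: upsert the new row image
        row = payload["after"]
        table[row[key_field]] = row
    elif op == "d":
        # delete: remove the row identified by the old row image
        row = payload["before"]
        table.pop(row[key_field], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")
    return table


# Hypothetical events for a "customers" table (key column name is an assumption).
events = [
    {"op": "r", "before": None, "after": {"id": 1, "name": "alice"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "bob"}},
    {"op": "u", "before": {"id": 1, "name": "alice"}, "after": {"id": 1, "name": "alicia"}},
    {"op": "d", "before": {"id": 2, "name": "bob"}, "after": None},
]

customers = {}
for e in events:
    apply_change_event(customers, e)
print(customers)  # {1: {'id': 1, 'name': 'alicia'}}
```

Upserting on `c`, `r`, and `u` alike keeps the apply step idempotent, which matters if events get redelivered.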
Sources to Consult
- Debezium documentation (debezium.io)
- MySQL Reference Manual (binlog, replication)
- SQL Server Docs (CDC feature)
- PostgreSQL Documentation (logical decoding)
- Oracle GoldenGate docs (overview)
- Engineering blogs: Confluent, Red Hat, Percona, Severalnines
- GitHub repos: debezium/debezium, zendesk/maxwell
Open Questions
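- Can Debezium run without Kafka? Yes: via the embedded Debezium Engine, or via Debezium Server, which streams to non-Kafka sinks (e.g. HTTP, Kinesis, Pub/Sub, Redis). Both are less common than Kafka Connect deployments.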
- What are minimal MySQL privileges? Debezium reads the binlog as a replication client, so it needs `REPLICATION SLAVE` and `REPLICATION CLIENT`. The initial snapshot additionally needs `SELECT` on the captured tables, plus `RELOAD` and `SHOW DATABASES`. Still to confirm against the Debezium docs.
- Performance impact: binlog volume and retention, network throughput; is a read replica needed to keep load off the primary?
- Offset storage: a Kafka topic by default under Kafka Connect; the embedded engine and Debezium Server can use file-based, JDBC, or other offset stores.
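- Failure recovery: the connector resumes from its stored offsets, which gives at-least-once delivery (duplicates are possible after a crash). Is exactly-once achievable, or do consumers simply need to be idempotent?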
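To make the offset and recovery questions concrete, here is a minimal simulation, not Debezium code. `FileOffsetStore` and `run` are hypothetical names, the "log" is a plain list standing in for a binlog or WAL, and the sink is a dict. It shows why committing the offset after applying a change gives at-least-once delivery, and why an idempotent sink makes the replay harmless.

```python
import json
import os
import tempfile


class FileOffsetStore:
    """Hypothetical stand-in for a connector offset store: persists the last applied log position."""

    def __init__(self, path):
        self.path = path

    def load(self):
        if not os.path.exists(self.path):
            return -1  # nothing processed yet
        with open(self.path) as f:
            return json.load(f)["position"]

    def save(self, position):
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"position": position}, f)
        os.replace(tmp, self.path)  # atomic rename: a crash never leaves a half-written offset


def run(log, store, sink, crash_after=None):
    """Apply log entries after the stored offset; optionally simulate a crash between apply and commit."""
    start = store.load() + 1
    for position in range(start, len(log)):
        key, value = log[position]
        sink[key] = value  # idempotent upsert: replaying the same entry is harmless
        if crash_after is not None and position == crash_after:
            raise RuntimeError("simulated crash before offset commit")
        store.save(position)


log = [("a", 1), ("b", 2), ("a", 3)]
sink = {}
with tempfile.TemporaryDirectory() as d:
    store = FileOffsetStore(os.path.join(d, "offsets.json"))
    try:
        run(log, store, sink, crash_after=1)
    except RuntimeError:
        pass
    resumed_from = store.load() + 1  # entry 1 was applied but not committed, so it is replayed
    run(log, store, sink)
    final_offset = store.load()

print(resumed_from, final_offset, sink)  # 1 2 {'a': 3, 'b': 2}
```

Committing the offset before applying would flip this to at-most-once (a crash could lose entry 1), which is usually the worse trade for CDC.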
---
This is a thinking trace. Will evolve as research proceeds.