# MDDB Replication (Leader-Follower)
MDDB supports leader-follower replication for horizontal read scaling and high availability. A single leader node handles all writes and streams changes to one or more read-only follower nodes via a binary replication log (binlog).
## Architecture

```mermaid
graph LR
    C1[Clients<br/>reads + writes] --> L[Leader<br/>read-write]
    C2[Clients<br/>reads only] --> F1[Follower 1<br/>read-only]
    C3[Clients<br/>reads only] --> F2[Follower 2<br/>read-only]
    L -- "gRPC StreamBinlog" --> F1
    L -- "gRPC StreamBinlog" --> F2
    L -- "gRPC Snapshot" --> F1
    L -- "gRPC Snapshot" --> F2
```
## Replication Flow

```mermaid
sequenceDiagram
    participant C as Client
    participant L as Leader
    participant BL as Binlog
    participant F as Follower
    C->>L: POST /v1/add (write)
    L->>L: BoltDB commit
    L->>BL: Append entry (LSN=42)
    BL-->>F: StreamBinlog (LSN=42)
    F->>F: Apply to local BoltDB
    F->>L: AcknowledgeLSN(42)
    Note over F: Follower is now up to date
    C->>F: POST /v1/get (read)
    F-->>C: Document (from local BoltDB)
```
## Model
- Single-leader: One node accepts writes, followers are read-only
- Pull-based: Followers pull changes from the leader via gRPC streaming
- Eventual consistency: Followers lag behind the leader by milliseconds (typically <50ms on LAN)
- Automatic catch-up: Followers reconnect and catch up after disconnects
- Full snapshot sync: New followers receive a full database snapshot before switching to incremental replication
## Quick Start

### 1. Start the Leader

```bash
MDDB_REPLICATION_ROLE=leader \
MDDB_DB_PATH=/data/leader.db \
MDDB_HTTP_PORT=11023 \
MDDB_GRPC_PORT=11024 \
./mddbd
```
The leader automatically enables the binlog and exposes the `MDDBReplication` gRPC service.
### 2. Start a Follower

```bash
MDDB_REPLICATION_ROLE=follower \
MDDB_REPLICATION_LEADER_ADDR=leader-host:11024 \
MDDB_DB_PATH=/data/follower.db \
MDDB_HTTP_PORT=11033 \
MDDB_GRPC_PORT=11034 \
./mddbd
```
The follower:
- Automatically switches to read-only mode
- Connects to the leader's gRPC port
- Downloads a full snapshot (if starting fresh)
- Tails the binlog for real-time updates
### 3. Verify Replication

```bash
# Write a document on the leader
curl -X POST http://leader-host:11023/v1/add \
  -H "Content-Type: application/json" \
  -d '{
    "collection": "blog",
    "key": "hello",
    "lang": "en_US",
    "contentMd": "# Hello from leader"
  }'

# Read it back from a follower
curl -X POST http://follower-host:11033/v1/get \
  -d '{"collection":"blog","key":"hello","lang":"en_US"}'

# Inspect replication status on the leader
curl http://leader-host:11023/v1/replication/status | jq
```
## Docker Compose

```yaml
services:
  leader:
    image: tradik/mddb:latest
    ports:
      - "11023:11023"
      - "11024:11024"
    volumes:
      - leader-data:/data
    environment:
      MDDB_REPLICATION_ROLE: leader
      MDDB_DB_PATH: /data/mddb.db

  follower-1:
    image: tradik/mddb:latest
    ports:
      - "11033:11023"
      - "11034:11024"
    volumes:
      - follower1-data:/data
    environment:
      MDDB_REPLICATION_ROLE: follower
      MDDB_REPLICATION_LEADER_ADDR: leader:11024
      MDDB_DB_PATH: /data/mddb.db
    depends_on:
      leader:
        condition: service_healthy

  follower-2:
    image: tradik/mddb:latest
    ports:
      - "11043:11023"
      - "11044:11024"
    volumes:
      - follower2-data:/data
    environment:
      MDDB_REPLICATION_ROLE: follower
      MDDB_REPLICATION_LEADER_ADDR: leader:11024
      MDDB_DB_PATH: /data/mddb.db
    depends_on:
      leader:
        condition: service_healthy

  panel:
    image: tradik/mddb:panel
    ports:
      - "3000:80"
    environment:
      MDDB_API_URL: http://leader:11023

volumes:
  leader-data:
  follower1-data:
  follower2-data:
```
## Configuration

### Environment Variables

| Variable | Default | Description |
|---|---|---|
| `MDDB_REPLICATION_ROLE` | `""` (standalone) | Node role: `leader`, `follower`, or empty for standalone |
| `MDDB_REPLICATION_LEADER_ADDR` | - | Follower only: gRPC address of the leader (e.g. `leader:11024`) |
| `MDDB_NODE_ID` | auto-generated | Unique node identifier |
| `MDDB_BINLOG_ENABLED` | `false` | Enable the binlog without replication (standalone mode); auto-enabled for the `leader` role |
| `MDDB_BINLOG_PATH` | alongside DB file | Custom binlog file path |
| `MDDB_BINLOG_MAX_SIZE` | 256MB | Maximum binlog segment size |
| `MDDB_BINLOG_MAX_AGE` | 24h | Maximum binlog retention time |
| `MDDB_REPLICATION_RETRY_INTERVAL` | 5s | Follower reconnection interval |
| `MDDB_REPLICATION_MAX_LAG` | 30s | Maximum lag before the follower reports unhealthy |
### Role Behavior

| Setting | Writes | Reads | Binlog | Replication |
|---|---|---|---|---|
| (empty / standalone) | Yes | Yes | Off | Off |
| `leader` | Yes | Yes | On | Serves followers |
| `follower` | No (read-only) | Yes | Off | Pulls from leader |
### Follower: Disabled Subsystems
When running as a follower, these subsystems are automatically disabled (they come from the leader's binlog):
- Embedding Worker - Embeddings are replicated from the leader
- TTL Cleanup - TTL deletions arrive via binlog
- Index Queue - Metadata indexing arrives via binlog
## Binlog
The binary replication log records every write operation (Put/Delete) to BoltDB. It serves as the change stream that followers consume.
### Entry Format

```
[lsn:8][type:1][timestamp:8][bucketNameLen:2][bucketName:N][keyLen:4][key:N][valueLen:4][value:N][checksum:4]
```
| Field | Size | Description |
|---|---|---|
| LSN | 8 bytes | Monotonically increasing Log Sequence Number |
| Type | 1 byte | 1=Put, 2=Delete |
| Timestamp | 8 bytes | Unix nanoseconds |
| BucketName | 2+N bytes | Target BoltDB bucket |
| Key | 4+N bytes | Document key |
| Value | 4+N bytes | Document value (empty for Delete) |
| Checksum | 4 bytes | CRC32 of the entry |
### Performance
- Buffered I/O: 256KB write buffer with periodic flush (100ms)
- Async flush: Writes are buffered; fsync happens periodically or on demand
- Compact format: Binary serialization (~30% smaller than JSON)
- Subscriber channels: Real-time push to followers via Go channels
## gRPC Replication Service
The leader exposes these gRPC methods on the standard gRPC port:
```protobuf
service MDDBReplication {
  rpc RequestSnapshot(SnapshotRequest) returns (stream SnapshotChunk);
  rpc StreamBinlog(StreamBinlogRequest) returns (stream BinlogEntryProto);
  rpc ReplicationStatus(ReplicationStatusRequest) returns (ReplicationStatusResponse);
  rpc AcknowledgeLSN(AcknowledgeLSNRequest) returns (AcknowledgeLSNResponse);
}
```

| Method | Description |
|---|---|
| `RequestSnapshot` | Full BoltDB snapshot streamed in 1MB chunks (uses `bolt.Tx.WriteTo()`) |
| `StreamBinlog` | Historical entries from a given LSN, then real-time tailing |
| `ReplicationStatus` | Node role, LSN, lag, follower info |
| `AcknowledgeLSN` | Follower confirms applied LSN (for retention) |
## Follower Sync Flow

```mermaid
flowchart TD
    A[Follower starts] --> B{Has local DB?}
    B -->|No| C[RequestSnapshot from leader]
    B -->|Yes| D[StreamBinlog from last LSN]
    C --> E[Write snapshot to temp file]
    E --> F[Replace local DB]
    F --> G[Reload in-memory state]
    G --> D
    D --> H{LSN too old?}
    H -->|Yes| C
    H -->|No| I[Apply entries to local BoltDB]
    I --> J[AcknowledgeLSN]
    J --> D
    D --> K{Connection lost?}
    K -->|Yes| L[Wait retry interval]
    L --> D
```
### Initial Sync

- Follower connects with `StreamBinlog(fromLSN=0)`
- If the leader responds with `FailedPrecondition` (LSN too old), the follower requests a full snapshot
- The leader streams a BoltDB snapshot via `bolt.Tx.WriteTo()` in 1MB chunks (non-blocking read-only transaction)
- The follower saves it to a temp file, replaces the local DB, and reloads in-memory state
- The follower resumes `StreamBinlog` from the snapshot's LSN
### Reconnection

- The follower retries every `MDDB_REPLICATION_RETRY_INTERVAL` (default 5s)
- If the binlog still has the needed LSN: incremental catch-up
- If the LSN is too old (binlog rotated): full snapshot re-sync
## Follower State Machine

```mermaid
stateDiagram-v2
    [*] --> STARTING : Initialization
    STARTING --> CONNECTING : Connect to gRPC Leader
    CONNECTING --> REQUEST_SYNC : Connected
    CONNECTING --> CONNECTING : Retry on failure
    REQUEST_SYNC --> SYNC_SNAPSHOT : Has no DB or LSN too old
    REQUEST_SYNC --> SYNC_BINLOG : Has DB and LSN valid
    SYNC_SNAPSHOT --> RELOAD_DB : Snapshot downloaded
    RELOAD_DB --> SYNC_BINLOG : DB reloaded in memory
    SYNC_BINLOG --> APPLY_TRANSACTIONS : Receive StreamBinlog chunk
    APPLY_TRANSACTIONS --> ACKNOWLEDGE : Saved to BoltDB
    ACKNOWLEDGE --> SYNC_BINLOG : Loop
    SYNC_BINLOG --> DISCONNECTED : Connection lost
    DISCONNECTED --> CONNECTING : Wait retry interval
```
## Monitoring

### HTTP Endpoint

```
GET /v1/replication/status
```
Leader response:
```json
{
  "node_id": "leader-1",
  "role": "leader",
  "current_lsn": 45230,
  "binlog_oldest_lsn": 40000,
  "binlog_size_bytes": 15728640,
  "healthy": true,
  "followers": [
    {
      "follower_id": "follower-1",
      "address": "10.0.0.2:11034",
      "confirmed_lsn": 45228,
      "lag_ms": 12,
      "last_seen_at": 1709500000,
      "status": "healthy"
    }
  ],
  "uptime_seconds": 86400
}
```
Follower response:
```json
{
  "node_id": "follower-1",
  "role": "follower",
  "current_lsn": 45228,
  "leader_addr": "leader:11024",
  "replication_lag_ms": 12,
  "healthy": true,
  "followers": [],
  "uptime_seconds": 3600
}
```
### Prometheus Metrics

| Metric | Type | Description |
|---|---|---|
| `mddb_replication_role` | gauge | 1=leader, 2=follower, 0=standalone |
| `mddb_replication_lsn` | gauge | Current LSN on this node |
| `mddb_replication_lag_ms` | gauge | Follower replication lag in ms |
| `mddb_binlog_entries_total` | counter | Total binlog entries written |
| `mddb_binlog_size_bytes` | gauge | Current binlog file size |
### Health Check

The `/health` endpoint includes replication status. A follower reports unhealthy if its lag exceeds `MDDB_REPLICATION_MAX_LAG` (default 30s).
## Web Panel
The admin panel includes a Cluster tab (under Administration) that shows:
- Node role and status badges
- Real-time LSN and binlog statistics
- Connected followers with lag indicators
- Lag history chart (last 5 minutes)
## What Gets Replicated
| Subsystem | Replicated | Notes |
|---|---|---|
| Documents (CRUD) | Yes | All Put/Delete to docs bucket |
| Revisions | Yes | All revision entries |
| Metadata indices | Yes | idxmeta, bykey buckets |
| Vector embeddings | Yes | vectors bucket + in-memory index reload |
| Full-text index | Yes | FTS tokens in BoltDB |
| Webhooks | Yes | Config in BoltDB, follower reloads in-memory |
| Schemas | Yes | Config in BoltDB, follower reloads in-memory |
| Auth (users/groups) | Yes | Auth data in BoltDB |
| TTL metadata | Yes | TTL expiry stored with documents |
## Limitations
- Single leader: Only one node can accept writes. Multi-leader is not supported.
- Eventual consistency: Followers may serve slightly stale reads during replication lag.
- No automatic failover: Promoting a follower to leader requires manual reconfiguration.
- Binlog retention: If the leader's binlog is rotated before a follower catches up, a full snapshot re-sync is required.
- gRPC only: Replication uses gRPC (port 11024). Both leader and follower must have gRPC ports accessible.
## Examples

### Load Balancer Setup (Nginx)
Route writes to leader, reads to any node:
```nginx
upstream mddb_read {
    server leader:11023;
    server follower-1:11033;
    server follower-2:11043;
}

upstream mddb_write {
    server leader:11023;
}

server {
    listen 80;

    # Write endpoints -> leader only
    location ~ ^/v1/(add|delete|import-url|set-ttl|webhooks|schema|truncate|restore) {
        proxy_pass http://mddb_write;
    }

    # Read endpoints -> any node
    location /v1/ {
        proxy_pass http://mddb_read;
    }
}
```
### Kubernetes StatefulSet

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mddb-follower
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mddb-follower
  template:
    metadata:
      labels:
        app: mddb-follower
    spec:
      containers:
        - name: mddb
          image: tradik/mddb:latest
          env:
            - name: MDDB_REPLICATION_ROLE
              value: "follower"
            - name: MDDB_REPLICATION_LEADER_ADDR
              value: "mddb-leader.default.svc.cluster.local:11024"
            - name: MDDB_DB_PATH
              value: "/data/mddb.db"
          ports:
            - containerPort: 11023
            - containerPort: 11024
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```
### Manual Failover

If the leader goes down, promote a follower by restarting it with the leader role against its existing data directory:
```bash
MDDB_REPLICATION_ROLE=leader \
MDDB_DB_PATH=/data/mddb.db \
MDDB_HTTP_PORT=11023 \
MDDB_GRPC_PORT=11024 \
./mddbd
```

Remaining followers must then be restarted with `MDDB_REPLICATION_LEADER_ADDR` pointing at the new leader.

## Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Follower stuck at LSN 0 | Can't connect to leader | Check `MDDB_REPLICATION_LEADER_ADDR` and firewall rules |
| "full snapshot required" | Binlog rotated | Follower will auto-snapshot; increase `MDDB_BINLOG_MAX_SIZE` |
| High replication lag | Slow network or overloaded follower | Check network, reduce write load, add followers |
| Follower reports unhealthy | Lag > `MDDB_REPLICATION_MAX_LAG` | Investigate lag cause; increase max lag if acceptable |
| "binlog not enabled" | Leader not configured | Set `MDDB_REPLICATION_ROLE=leader` |