The Knowledge Base
Your AI Agent Needs

Built-in MCP server, vector search, RAG, and hybrid retrieval. Plugs into Claude, ChatGPT, Cursor, Windsurf, and any MCP client — single ~29MB binary, zero config

77 MCP Tools v2.10.0 RAG-Ready UDS + mTLS Docker Ready

Download v2.10.0 View on GitHub

Works with

Claude ChatGPT Cursor Windsurf Ollama DeepSeek Manus Bielik.ai

Why MDDB? #

One binary. Every protocol. AI-native from day one.

MDDB is an AI-native embedded document database written in Go, using BoltDB for storage. It accepts Markdown, plain text, HTML, PDF, DOCX, ODT, RTF, LaTeX, YAML, and Wikipedia XML dump documents, auto-converting everything to Markdown. All documents are stored with full revision history.

Core capabilities

Built-in MCP server with 77 tools — connects to Claude, Cursor, Windsurf, ChatGPT, Ollama, DeepSeek, Manus via stdio and HTTP transport
Geospatial search: R-tree and geohash indexes, radius and bounding-box queries, composable with full-text and vector search, optional postcode lookup
Transport options: TCP host:port or Unix Domain Socket (unix:/path.sock) for zero-network local deployments (PHP-FPM sidecars, cron, same-host panel)
TLS and mutual TLS (mTLS): user-supplied server cert/key, optional client CA bundle for certificate-based client authentication (MDDB_TLS_CLIENT_CA)
Vector/semantic search with 7 index algorithms: Flat (exact), HNSW, IVF, PQ, OPQ, SQ, BQ — supports OpenAI, Ollama, Cohere, Voyage AI embeddings + per-collection int8/int4 quantization (4-8x compression)
Full-text search with 7 modes (simple, boolean, phrase, wildcard, proximity, range, fuzzy), TF-IDF, BM25, BM25F, PMISparse scoring, multi-language stemming (18 languages), typo tolerance, metadata pre-filtering
Hybrid search combining sparse (BM25) and dense (vector) with alpha blending or Reciprocal Rank Fusion
Zero-shot document classification using embedding cosine similarity
Custom MCP tools defined in YAML for domain-specific AI workflows
RAG pipeline: auto-embed on ingest, semantic retrieval, MCP exposure to LLMs
Memory RAG: conversational memory with session management, semantic recall, and summarization
Multi-protocol: HTTP/JSON REST API, gRPC/Protobuf, GraphQL, WebSocket streaming (via mddb-chat), MCP (Streamable HTTP / SSE / stdio)
Multi-format upload: .md, .txt, .html, .pdf, .docx, .odt, .rtf, .tex, .yaml with auto-conversion to Markdown
Wikipedia XML dump import: stream multi-GB MediaWiki dumps with wikitext-to-Markdown conversion, namespace filtering, batch processing
URL import: fetch and store documents from any URL
Document TTL with auto-expiry and background cleanup
Full revision history for every document
JWT authentication and RBAC authorization (opt-in)
Compliance-first (v2.9.15+): ISO 27001 / SOC 2 hardening built-in — audit log, AES-256-GCM at-rest encryption, HTTP+gRPC rate limiter, incident webhooks, and a single MDDB_PRODUCTION=true switch that refuses to boot without every control in place
Leader-follower replication with binlog streaming
Automation: triggers, crons, webhooks, sentiment analysis, template variables
Multi-language support: same document key, multiple locales
Prometheus metrics and Grafana dashboard support
~29MB single binary, embedded BoltDB, zero external dependencies
Docker image: ~29MB Alpine-based with health checks
Aggregations: metadata facets (value counts) and date histograms with optional pre-filtering
Per-collection storage backends: BoltDB (default), in-memory (ephemeral), S3/MinIO — configurable via API and web panel
Web admin panel (React-based) with REST/GraphQL API toggle
PHP and Python client libraries

MDDB

Claude

Cursor

ChatGPT

Windsurf

Ollama

AI Integration

77 Built-in MCP Tools

MCP 2025-11-25 compliant. Tool annotations for auto-approve, 5 built-in prompts, completion, structured output, logging. Three transports: stdio, Streamable HTTP, SSE.

MCP 2025-11-25
Tool Annotations
Prompts
Streamable HTTP

BM25TF-IDFPMISparse

FlatHNSWIVFPQOPQSQBQ

Hybrid RAG

Hybrid Search Engine

Full-text with 7 search modes (simple, boolean, phrase, wildcard, proximity, range, fuzzy) + vector search with 7 index types. Multi-language stemming for 18 languages. Combine with alpha blending or Reciprocal Rank Fusion. Metadata pre-filtering, typo tolerance.

Vector Search
Full-Text (7 modes, 18 langs)
Hybrid RAG
Zero-Shot Classification

RESTHTTP/JSON

gRPCProtobuf

GraphQLFlexible queries

WebSocketmddb-chat

Protocols

Multi-Protocol Access

REST for easy debugging, gRPC for high-throughput pipelines, GraphQL for flexible frontend queries, and WebSocket streaming via mddb-chat for LLM chat pipelines. All endpoints, one stack.

REST API
gRPC
GraphQL
WebSocket

.md

.txt

.html

.pdf

.docx

.odt

.rtf

.tex

.yaml

URL

Wiki XML

Markdown

Ingestion

Multi-Format Pipeline

Upload .md, .txt, .html, .pdf, .docx, .odt, .rtf, .tex, .yaml, import from any URL, or stream Wikipedia XML dumps (.xml.bz2). Everything auto-converts to Markdown. Background embedding worker indexes documents as they arrive.

13 Formats
Wikipedia Import
URL Import
Auto-Embed
TTL Expiry

Geosearch R-tree + geohash, radius/bbox

Unix Socket Zero-network local transport

mTLS Mutual TLS with client CAs

Revisions Full document history

Replication Leader-follower binlog

Auth & RBAC JWT, API keys, per-collection

ISO 27001 / SOC 2 Audit log, AES-256-GCM, rate limiter, incident webhooks (v2.9.15+)

Automations Triggers, crons, webhooks

Aggregations Facets & histograms

Multi-Language Same key, many locales

Docker ~29MB Alpine image

Zero Config Single binary, no deps

Storage Backends BoltDB, In-Memory, S3

Telemetry Prometheus + Grafana

🎨 Web Admin Panel

Modern React-based UI for managing documents, users, and search with REST/GraphQL API toggle

📊 Dashboard Real-time statistics and metrics

🔍 Search Metadata, vector, and full-text search

✏️ Editor Live markdown preview and editing

👥 Users User and group management

🔄 API Toggle Switch between REST and GraphQL

🔐 Auth JWT tokens and API keys

Quick Start #

bash — docker

# Run MDDB server
docker run -d \
  -p 11023:11023 \
  -p 11024:11024 \
  -v mddb-data:/data \
  tradik/mddb:latest

# Test the API
curl http://localhost:11023/health

bash — docker compose

# Clone repository
git clone https://github.com/tradik/mddb.git
cd mddb

# Start all services (MDDB + Panel + MCP)
make docker-up

# Services available:
# - MDDB HTTP: http://localhost:11023
# - MDDB gRPC: localhost:11024
# - Web Panel: http://localhost:3000
# - MCP Server: http://localhost:9000

bash — binary

# Download binary
wget https://github.com/tradik/mddb/releases/download/v2.10.0/mddbd-v2.10.0-linux-amd64.tar.gz
tar -xzf mddbd-v2.10.0-linux-amd64.tar.gz

# Run server
./mddbd

# Server starts on http://localhost:11023

bash — build from source

# Clone repository
git clone https://github.com/tradik/mddb.git
cd mddb

# Build
make build

# Run
./bin/mddbd

Download #

Latest Release

v{v}

Released: June 13, 2026

Linux AMD64 ~29MB Linux ARM64 ~29MB macOS AMD64 ~29MB macOS ARM64 ~29MB FreeBSD AMD64 ~29MB

📦 Debian/Ubuntu (.deb) 📦 RedHat/CentOS (.rpm) 🐧 Snap (Linux)

Package Managers

🍺 Homebrew macOS & Linux

brew tap tradik/tap
brew install tradik/tap/mddb

View Tap on GitHub →

🐧 Snap Linux

sudo snap install mddbd
sudo snap install mddb-cli

View on Snap Store →

🐳 Docker All platforms

docker pull tradik/mddb:latest

View on Docker Hub →

What's New in v2.10.0

Encryption key rotation (ISO 27001 A.8.24 / SOC 2 CC6.7) — new wire format V2 prefixes ciphertext with a 1-byte keyID so the encryptor can hold a primary plus any number of read-only previous keys. MDDB_ENCRYPTION_KEY_ID stamps new writes; MDDB_ENCRYPTION_KEYS_PREVIOUS ([{"id":1,"key":"..."}]) keeps historical entries readable. Admin endpoint POST /v1/encryption/rotate re-seals every encrypted document under the new primary in the background. V1 (2.9.15) ciphertexts continue to decrypt — non-breaking upgrade.
Audit log export to SIEM / syslog (ISO 27001 A.8.15 / SOC 2 CC7.2) — MDDB_AUDIT_EXPORT_WEBHOOK_URL mirrors every audit event to a SIEM (Splunk HEC, Datadog Logs, ELK) with custom auth headers from MDDB_AUDIT_EXPORT_WEBHOOK_HEADER. MDDB_AUDIT_EXPORT_SYSLOG_ADDR sends RFC 5424 framed messages over UDP/TCP to a syslog collector. Per-sink delivered/failed/dropped counters at GET /v1/audit/exporters; rendered in the Security panel.
Backup path jail — /v1/backup and /v1/restore now confine every user-supplied path to MDDB_BACKUP_DIR (default ./backups). Path traversal (../, absolute paths, symlink escape) is rejected with 400. Closes a path-traversal gap that previously allowed admins to read or overwrite arbitrary files.
Encryption admin panel — new sidebar entry Encryption rendering primary keyID, configured previous keyIDs, per-collection coverage (withPrimary / withLegacy / plaintext), the rotation job table, and a "Start rotation" control with optional collection scope. Security dashboard adds an Audit log export card with masked webhook URLs and per-sink health counters.
Bug fix — SetCollectionConfigRequest was missing the encrypted field, so REST/gRPC/MCP clients flipping the flag silently dropped it. Now plumbed through correctly.

View Full Changelog →

v2.9.12 highlights

Per-query boost / demote for FTS and Hybrid search — attach a boost map keyed by "metaKey:metaValue" to multiply or divide matching documents' scores at query time. Positive values boost (5.0 = 5×), negative values demote (-2.0 = ½×). Combines multiplicatively across matches with a 0.001 floor. No reindex. Wired through HTTP, gRPC, MCP tools and the web panel (collapsible UI in both FTS and Hybrid views).
Async bulk ingest with job tracking — four new endpoints under /v1/bulk-ingest-job* for long-running imports: submit returns HTTP 202 with a job ID, GET polls status, DELETE cancels pending, list returns all jobs newest-first. Single FIFO worker drains 500-doc chunks; orphan jobs from a crashed run are flipped to failed on startup. Optional callbackUrl fires a webhook on completion. Four companion MCP tools (bulk_ingest_submit / _status / _list / _cancel).
Prefix autocomplete — GET /v1/autocomplete returns top-N terms starting with the given prefix, ranked by document frequency. Scans the existing FTS inverted index (no new storage), with optional field scoping via field and a 10 000-entry scan cap to keep pathological prefixes fast. The FTS search input in the panel gains a debounced (150 ms) suggestion dropdown with doc-count badges. MCP autocomplete tool mirrors the HTTP API.
Proto backwards-compatible bump — FTSRequest gets field boost = 8 and HybridSearchRequest gets field boost = 15. Regenerated via buf for Go, Python, Node.js and PHP. Pre-2.9.12 clients keep working — they simply don't set the new field.
Coverage >90% on the new surface — fts_boost.go at 100% across all six functions; fts_autocomplete.go ~90% average; bulk_ingest_job.go ~87% average (shutdown-only paths untested by design).

View Full Changelog →

Documentation #

Getting Started

Quick StartGet up and running in 5 minutes
API DocumentationInteractiveSwagger UI for all endpoints
OpenAPI SpecMachine-readable API specification
ExamplesCode examples and patterns
Docker GuideContainer deployment
ArchitectureSystem design and internals

Search & AI

Search AlgorithmsNewTF-IDF, BM25, BM25F, Flat, HNSW, IVF, PQ, OPQ, SQ, BQ
PMISparseNewBM25 + PPMI query expansion (Tradik Limited)
GeosearchNewR-tree + geohash radius/bbox queries
RAG PipelineNewWordPress → MDDB → LLM guide
LLM ConnectionsNewClaude, ChatGPT, Ollama, DeepSeek, Manus, Bielik.ai
MCP Server ConfigNewAPI keys, rate limits, logging, access modes
Custom MCP ToolsNewYAML-defined website-specific AI tools
IntegrationsNewDocling, Langflow, OpenSearch pipelines

Security & Operations

Authentication & RBACNewJWT tokens, API keys, role-based access
Auth Quick Start5-minute authentication setup
ReplicationNewLeader-follower, Docker Compose, monitoring
Telemetry & MonitoringNewPrometheus metrics, Grafana dashboards
Health ChecksDocker & Kubernetes monitoring
License AuditDependency license compliance (Trivy)

Protocols & Advanced

gRPC GuideHigh-performance protocol
Schema ValidationEnforce metadata structure per collection
AutomationsNewTriggers, crons, webhooks, sentiment
Temporal TrackingNewEvent history, hot-docs leaderboard, activity histograms
Spell CorrectionNewSymSpell FTS corrections, custom dictionaries
WordPress AI AgentNewBuild a chatbot for your WP site
Chat WidgetNewEmbeddable AI chatbot powered by MDDB

Code Examples #

bash — mddb

# Add document with TTL (expires in 1 hour)
curl -X POST http://localhost:11023/v1/add \
  -H "Content-Type: application/json" \
  -d '{
    "collection": "blog",
    "key": "hello-world",
    "lang": "en_US",
    "ttl": 3600,
    "meta": {
      "category": ["tutorial"],
      "author": ["John Doe"]
    },
    "contentMd": "# Hello World\n\nWelcome to MDDB!"
  }'

# Import directly from URL
curl -X POST http://localhost:11023/v1/import-url \
  -d '{"collection":"docs", "url":"https://example.com/guide.md", "lang":"en_US"}'

bash — vector search

# Semantic search - flat (exact, default)
curl -X POST http://localhost:11023/v1/vector-search \
  -H "Content-Type: application/json" \
  -d '{
    "collection": "kb",
    "query": "how do I cancel my subscription?",
    "topK": 5,
    "threshold": 0.7,
    "includeContent": true
  }'

# HNSW - fast approximate nearest neighbor
curl -X POST http://localhost:11023/v1/vector-search \
  -d '{"collection":"kb", "query":"refund", "topK":5, "algorithm":"hnsw"}'

# IVF - clustered search for large collections
curl -X POST http://localhost:11023/v1/vector-search \
  -d '{"collection":"kb", "query":"refund", "topK":5, "algorithm":"ivf"}'

# PQ - compressed, memory-efficient search
curl -X POST http://localhost:11023/v1/vector-search \
  -d '{"collection":"kb", "query":"refund", "topK":5, "algorithm":"pq"}'

bash — full-text search

# Simple search with BM25 scoring
curl -X POST http://localhost:11023/v1/fts \
  -H "Content-Type: application/json" \
  -d '{
    "collection": "blog",
    "query": "markdown database tutorial",
    "limit": 10,
    "algorithm": "bm25"
  }'

# Boolean search (AND, OR, NOT, +required, -excluded)
curl -X POST http://localhost:11023/v1/fts \
  -H "Content-Type: application/json" \
  -d '{
    "collection": "blog",
    "query": "rust AND performance NOT garbage",
    "mode": "boolean"
  }'

# Phrase search — exact word sequence
curl -X POST http://localhost:11023/v1/fts \
  -H "Content-Type: application/json" \
  -d '{
    "collection": "blog",
    "query": "\"machine learning\"",
    "mode": "phrase"
  }'

# Proximity search — terms within 5 words
curl -X POST http://localhost:11023/v1/fts \
  -H "Content-Type: application/json" \
  -d '{
    "collection": "blog",
    "query": "\"database performance\"~5",
    "mode": "proximity",
    "distance": 5
  }'

bash — metadata search

curl -X POST http://localhost:11023/v1/search \
  -H "Content-Type: application/json" \
  -d '{
    "collection": "blog",
    "filterMeta": {
      "category": ["tutorial"],
      "status": ["published"]
    },
    "sort": "updatedAt",
    "limit": 10
  }'

python3 — mddb client

from mddb import MDDB

# TCP connection
db = MDDB.connect('localhost:11023', 'write').collection('kb')

# Unix Domain Socket (MDDB 2.9.14+, zero-network local transport)
# db = MDDB.connect('unix:/tmp/mddb.sock', 'write').collection('kb')

# Add document
db.add('faq', 'en_US', {'category': ['billing']}, '# Billing FAQ\n\nCancel in Settings.')

# Vector search (RAG pipeline step 1)
results = db.vector_search('how to cancel?', top_k=3, include_content=True)

# Full-text search
results = db.fts_search('billing refund', limit=10)

# Webhooks
db.register_webhook('https://app.com/hook', ['doc.added'], 'kb')

# Import from URL
db.import_url('https://example.com/docs.md', 'en_US')

# TTL
db.set_ttl('temp-key', 'en', 3600)  # expires in 1h

php — mddb client

<?php
require_once 'mddb.php';

// TCP connection
$db = mddb::connect('localhost:11023', 'write');

// Unix Domain Socket (MDDB 2.9.14+)
// $db = mddb::connect('unix:/tmp/mddb.sock', 'write');

// Add document
$db->collection('blog')->add('hello', 'en_US', ['category' => ['tutorial']], '# Hello');

// Vector search
$results = $db->collection('kb')->vectorSearch('cancel subscription', 5, 0.7, true);

// Full-text search
$results = $db->collection('blog')->ftsSearch('database tutorial', 10);

// Webhooks
$db->registerWebhook('https://app.com/hook', ['doc.added', 'doc.updated']);

// Import from URL
$db->collection('docs')->importUrl('https://example.com/post.md', 'en_US');

// TTL
$db->collection('cache')->setTtl('temp', 'en', 3600);

json — mcp.json

# Docker (Easiest) - for Windsurf / Claude Desktop
# Configure ~/.windsurf/mcp.json:
{
  "mcpServers": {
    "mddb": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm", "--network", "host",
        "-e", "MDDB_GRPC_ADDRESS=localhost:11024",
        "-e", "MDDB_REST_BASE_URL=http://localhost:11023",
        "-e", "MDDB_MCP_STDIO=true",
        "tradik/mddb:latest"
      ]
    }
  }
}

# MCP provides 77 built-in tools including:
# - add_document, search_documents, semantic_search
# - full_text_search, hybrid_search, cross_search
# - classify_document, find_duplicates
# - geo_search, geo_within, geo_encode, geo_decode, geo_stats
# - automation, webhooks, schemas, backups
# - collection config, revisions, aggregate, and more

🔄 Leader-Follower Replication #

Scale reads horizontally with binlog-based streaming replication. A single leader handles writes and streams transactions in real-time to read-only followers via gRPC.

Read Replication Guide

Community & Support #

🐙