Skip to main content

The Knowledge Base
Your AI Agent Needs

Built-in MCP server, vector search, RAG, and hybrid retrieval. Plugs into Claude, ChatGPT, Cursor, Windsurf, and any MCP client — single ~29MB binary, zero config

67 MCP Toolsv2.9.11RAG-ReadyUDS + mTLSDocker Ready
Works with
ClaudeChatGPTCursorWindsurfOllamaDeepSeekManusBielik.ai

Why MDDB? #

One binary. Every protocol. AI-native from day one.

MDDB is an AI-native embedded document database written in Go, using BoltDB for storage. It accepts Markdown, plain text, HTML, PDF, DOCX, ODT, RTF, LaTeX, YAML, and Wikipedia XML dump documents, auto-converting everything to Markdown. All documents are stored with full revision history.

Core capabilities

  • Built-in MCP server with 67 tools — connects to Claude, Cursor, Windsurf, ChatGPT, Ollama, DeepSeek, Manus via stdio and HTTP transport
  • Geospatial search: R-tree and geohash indexes, radius and bounding-box queries, composable with full-text and vector search, optional postcode lookup
  • Transport options: TCP host:port or Unix Domain Socket (unix:/path.sock) for zero-network local deployments (PHP-FPM sidecars, cron, same-host panel)
  • TLS and mutual TLS (mTLS): user-supplied server cert/key, optional client CA bundle for certificate-based client authentication (MDDB_TLS_CLIENT_CA)
  • Vector/semantic search with 7 index algorithms: Flat (exact), HNSW, IVF, PQ, OPQ, SQ, BQ — supports OpenAI, Ollama, Cohere, Voyage AI embeddings + per-collection int8/int4 quantization (4-8x compression)
  • Full-text search with 7 modes (simple, boolean, phrase, wildcard, proximity, range, fuzzy), TF-IDF, BM25, BM25F, PMISparse scoring, multi-language stemming (18 languages), typo tolerance, metadata pre-filtering
  • Hybrid search combining sparse (BM25) and dense (vector) with alpha blending or Reciprocal Rank Fusion
  • Zero-shot document classification using embedding cosine similarity
  • Custom MCP tools defined in YAML for domain-specific AI workflows
  • RAG pipeline: auto-embed on ingest, semantic retrieval, MCP exposure to LLMs
  • Memory RAG: conversational memory with session management, semantic recall, and summarization
  • Triple protocol: HTTP/JSON REST API, gRPC/Protobuf, GraphQL
  • Multi-format upload: .md, .txt, .html, .pdf, .docx, .odt, .rtf, .tex, .yaml with auto-conversion to Markdown
  • Wikipedia XML dump import: stream multi-GB MediaWiki dumps with wikitext-to-Markdown conversion, namespace filtering, batch processing
  • URL import: fetch and store documents from any URL
  • Document TTL with auto-expiry and background cleanup
  • Full revision history for every document
  • JWT authentication and RBAC authorization (opt-in)
  • Leader-follower replication with binlog streaming
  • Automation: triggers, crons, webhooks, sentiment analysis, template variables
  • Multi-language support: same document key, multiple locales
  • Prometheus metrics and Grafana dashboard support
  • ~29MB single binary, embedded BoltDB, zero external dependencies
  • Docker image: ~29MB Alpine-based with health checks
  • Aggregations: metadata facets (value counts) and date histograms with optional pre-filtering
  • Per-collection storage backends: BoltDB (default), in-memory (ephemeral), S3/MinIO — configurable via API and web panel
  • Web admin panel (React-based) with REST/GraphQL API toggle
  • PHP and Python client libraries
MDDB
Claude
Cursor
ChatGPT
Windsurf
Ollama
AI Integration

67 Built-in MCP Tools

MCP 2025-11-25 compliant. Tool annotations for auto-approve, 5 built-in prompts, completion, structured output, logging. Three transports: stdio, Streamable HTTP, SSE.

  • MCP 2025-11-25
  • Tool Annotations
  • Prompts
  • Streamable HTTP
BM25TF-IDFPMISparse
+
FlatHNSWIVFPQOPQSQBQ
=
Hybrid RAG
Search

Hybrid Search Engine

Full-text with 7 search modes (simple, boolean, phrase, wildcard, proximity, range, fuzzy) + vector search with 7 index types. Multi-language stemming for 18 languages. Combine with alpha blending or Reciprocal Rank Fusion. Metadata pre-filtering, typo tolerance.

  • Vector Search
  • Full-Text (7 modes, 18 langs)
  • Hybrid RAG
  • Zero-Shot Classification
RESTHTTP/JSON
gRPCProtobuf
GraphQLFlexible queries
Protocols

Triple Protocol Access

REST for easy debugging, gRPC for high-throughput pipelines, GraphQL for flexible frontend queries. All endpoints, one server.

  • REST API
  • gRPC
  • GraphQL
.md
.txt
.html
.pdf
.docx
.odt
.rtf
.tex
.yaml
URL
Wiki XML
Markdown
Ingestion

Multi-Format Pipeline

Upload .md, .txt, .html, .pdf, .docx, .odt, .rtf, .tex, .yaml, import from any URL, or stream Wikipedia XML dumps (.xml.bz2). Everything auto-converts to Markdown. Background embedding worker indexes documents as they arrive.

  • 13 Formats
  • Wikipedia Import
  • URL Import
  • Auto-Embed
  • TTL Expiry
Geosearch R-tree + geohash, radius/bbox
Unix Socket Zero-network local transport
mTLS Mutual TLS with client CAs
Revisions Full document history
Replication Leader-follower binlog
Auth & RBAC JWT, API keys, per-collection
Automations Triggers, crons, webhooks
Aggregations Facets & histograms
Multi-Language Same key, many locales
Docker ~29MB Alpine image
Zero Config Single binary, no deps
Storage Backends BoltDB, In-Memory, S3
Telemetry Prometheus + Grafana

🎨 Web Admin Panel

Modern React-based UI for managing documents, users, and search with REST/GraphQL API toggle

MDDB Web Admin Panel
📊 DashboardReal-time statistics and metrics
🔍 SearchMetadata, vector, and full-text search
✏️ EditorLive markdown preview and editing
👥 UsersUser and group management
🔄 API ToggleSwitch between REST and GraphQL
🔐 AuthJWT tokens and API keys

Quick Start #

bash — docker
# Run MDDB server
docker run -d \ -p 11023:11023 \ -p 11024:11024 \ -v mddb-data:/data \ tradik/mddb:latest # Test the API
curl http://localhost:11023/health
bash — docker compose
# Clone repository
git clone https://github.com/tradik/mddb.git
cd mddb # Start all services (MDDB + Panel + MCP)
make docker-up # Services available:
# - MDDB HTTP: http://localhost:11023
# - MDDB gRPC: localhost:11024
# - Web Panel: http://localhost:3000
# - MCP Server: http://localhost:9000
bash — binary
# Download binary
wget https://github.com/tradik/mddb/releases/download/v2.9.11/mddbd-v2.9.11-linux-amd64.tar.gz
tar -xzf mddbd-v2.9.11-linux-amd64.tar.gz # Run server
./mddbd # Server starts on http://localhost:11023
bash — build from source
# Clone repository
git clone https://github.com/tradik/mddb.git
cd mddb # Build
make build # Run
./bin/mddbd

Download #

Package Managers

🍺 Homebrew macOS & Linux

brew tap tradik/tap
brew install tradik/tap/mddb
View Tap on GitHub →

🐧 Snap Linux

sudo snap install mddbd
sudo snap install mddb-cli
View on Snap Store →

🐳 Docker All platforms

docker pull tradik/mddb:latest
View on Docker Hub →

What's New in v2.9.11

  • Unix Domain Socket transport — HTTP and gRPC servers now accept MDDB_HTTP_ADDR=unix:/tmp/mddb.sock / MDDB_GRPC_ADDR=unix:/tmp/mddb-grpc.sock. Zero-network local transport with 0600 filesystem perms. Python and PHP clients gain matching unix: scheme support — stdlib-only on Python, libcurl CURLOPT_UNIX_SOCKET_PATH on PHP.
  • Mutual TLS (mTLS) — new MDDB_TLS_CLIENT_CA + MDDB_TLS_CLIENT_AUTH=require|request configure client-cert verification on the HTTPS listener. tls.Config.MinVersion pinned to TLS 1.2.
  • MCP tool count audit — discovered the long-claimed "72 tools" was a documentation-only number that never matched code. Audited mcpBuiltinTools() vs the dispatch switch and surfaced one orphan tool (aggregate) that had a working handler but no declaration, so MCP clients could not see or call it. Declared it. Authoritative count is now 67 tools.
  • Landing page Mermaid fix — replication diagram was rendering "Syntax error in text" on mermaid 10.9.5. Switched to <div class="mermaid">, upgraded to mermaid@11, explicit mermaid.run().
  • Geosearch docs 404 fixdocs/GEOSEARCH.md was missing SSG frontmatter, so /docs/geosearch/ never built. Frontmatter added.
  • Landing page split into includes — the monolithic index.html (~1500 lines) is now composed of index-section-*.html files via Go template {{template ...}}. Easier to edit, easier to review.
View Full Changelog →

Documentation #

Getting Started

Search & AI

Security & Operations

Protocols & Advanced

Code Examples #

bash — mddb
# Add document with TTL (expires in 1 hour)
curl -X POST http://localhost:11023/v1/add \ -H "Content-Type: application/json" \ -d '{ "collection": "blog", "key": "hello-world", "lang": "en_US", "ttl": 3600, "meta": { "category": ["tutorial"], "author": ["John Doe"] }, "contentMd": "# Hello World\n\nWelcome to MDDB!" }' # Import directly from URL
curl -X POST http://localhost:11023/v1/import-url \ -d '{"collection":"docs", "url":"https://example.com/guide.md", "lang":"en_US"}'
bash — vector search
# Semantic search - flat (exact, default)
curl -X POST http://localhost:11023/v1/vector-search \ -H "Content-Type: application/json" \ -d '{ "collection": "kb", "query": "how do I cancel my subscription?", "topK": 5, "threshold": 0.7, "includeContent": true }' # HNSW - fast approximate nearest neighbor
curl -X POST http://localhost:11023/v1/vector-search \ -d '{"collection":"kb", "query":"refund", "topK":5, "algorithm":"hnsw"}' # IVF - clustered search for large collections
curl -X POST http://localhost:11023/v1/vector-search \ -d '{"collection":"kb", "query":"refund", "topK":5, "algorithm":"ivf"}' # PQ - compressed, memory-efficient search
curl -X POST http://localhost:11023/v1/vector-search \ -d '{"collection":"kb", "query":"refund", "topK":5, "algorithm":"pq"}'
bash — full-text search
# Simple search with BM25 scoring
curl -X POST http://localhost:11023/v1/fts \ -H "Content-Type: application/json" \ -d '{ "collection": "blog", "query": "markdown database tutorial", "limit": 10, "algorithm": "bm25" }' # Boolean search (AND, OR, NOT, +required, -excluded)
curl -X POST http://localhost:11023/v1/fts \ -H "Content-Type: application/json" \ -d '{ "collection": "blog", "query": "rust AND performance NOT garbage", "mode": "boolean" }' # Phrase search — exact word sequence
curl -X POST http://localhost:11023/v1/fts \ -H "Content-Type: application/json" \ -d '{ "collection": "blog", "query": "\"machine learning\"", "mode": "phrase" }' # Proximity search — terms within 5 words
curl -X POST http://localhost:11023/v1/fts \ -H "Content-Type: application/json" \ -d '{ "collection": "blog", "query": "\"database performance\"~5", "mode": "proximity", "distance": 5 }'
python3 — mddb client
from mddb import MDDB # TCP connection
db = MDDB.connect('localhost:11023', 'write').collection('kb') # Unix Domain Socket (MDDB 2.9.11+, zero-network local transport)
# db = MDDB.connect('unix:/tmp/mddb.sock', 'write').collection('kb') # Add document
db.add('faq', 'en_US', {'category': ['billing']}, '# Billing FAQ\n\nCancel in Settings.') # Vector search (RAG pipeline step 1)
results = db.vector_search('how to cancel?', top_k=3, include_content=True) # Full-text search
results = db.fts_search('billing refund', limit=10) # Webhooks
db.register_webhook('https://app.com/hook', ['doc.added'], 'kb') # Import from URL
db.import_url('https://example.com/docs.md', 'en_US') # TTL
db.set_ttl('temp-key', 'en', 3600) # expires in 1h
php — mddb client
<?php
require_once 'mddb.php'; // TCP connection
$db = mddb::connect('localhost:11023', 'write'); // Unix Domain Socket (MDDB 2.9.11+)
// $db = mddb::connect('unix:/tmp/mddb.sock', 'write'); // Add document
$db->collection('blog')->add('hello', 'en_US', ['category' => ['tutorial']], '# Hello'); // Vector search
$results = $db->collection('kb')->vectorSearch('cancel subscription', 5, 0.7, true); // Full-text search
$results = $db->collection('blog')->ftsSearch('database tutorial', 10); // Webhooks
$db->registerWebhook('https://app.com/hook', ['doc.added', 'doc.updated']); // Import from URL
$db->collection('docs')->importUrl('https://example.com/post.md', 'en_US'); // TTL
$db->collection('cache')->setTtl('temp', 'en', 3600);
json — mcp.json
# Docker (Easiest) - for Windsurf / Claude Desktop
# Configure ~/.windsurf/mcp.json:
{ "mcpServers": { "mddb": { "command": "docker", "args": [ "run", "-i", "--rm", "--network", "host", "-e", "MDDB_GRPC_ADDRESS=localhost:11024", "-e", "MDDB_REST_BASE_URL=http://localhost:11023", "-e", "MDDB_MCP_STDIO=true", "tradik/mddb:latest" ] } }
} # MCP provides 67 built-in tools including:
# - add_document, search_documents, semantic_search
# - full_text_search, hybrid_search, cross_search
# - classify_document, find_duplicates
# - geo_search, geo_within, geo_encode, geo_decode, geo_stats
# - automation, webhooks, schemas, backups
# - collection config, revisions, aggregate, and more

🔄 Leader-Follower Replication #

Scale reads horizontally with binlog-based streaming replication. A single leader handles writes and streams transactions in real-time to read-only followers via gRPC.

graph LR C[Clients] -->|Writes/Reads| L[Leader] C -->|Reads| F1[Follower 1] C -->|Reads| F2[Follower 2] L -->|gRPC binlog stream| F1 L -->|gRPC binlog stream| F2
Read Replication Guide