MDDB API Documentation
Note: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Table of Contents
- Overview
- Configuration
- Endpoints
- POST /v1/add
- POST /v1/add-batch
- POST /v1/ingest
- POST /v1/upload
- POST /v1/get
- POST /v1/search
- POST /v1/vector-search
- POST /v1/vector-reindex
- GET /v1/vector-stats
- POST /v1/classify
- PATCH /v1/update
- GET /v1/doc-meta
- POST /v1/fts
- POST /v1/fts-reindex
- GET /v1/fts-languages
- POST /v1/synonyms
- GET /v1/synonyms
- DELETE /v1/synonyms
- POST /v1/export
- GET /v1/backup
- POST /v1/restore
- POST /v1/truncate
- GET /v1/stats
- POST /v1/schema/set
- POST /v1/schema/get
- POST /v1/schema/delete
- POST /v1/schema/list
- POST /v1/validate
- POST /v1/auth/login
- POST /v1/auth/api-key
- GET /v1/auth/api-keys
- DELETE /v1/auth/api-keys/:keyHash
- POST /v1/delete
- POST /v1/delete-batch
- POST /v1/delete-collection
- POST /v1/hybrid-search
- POST /v1/cross-search
- POST /v1/find-duplicates
- POST /v1/aggregate
- GET /v1/collection-config
- GET /v1/collection-configs
- GET /v1/embedding-configs
- GET/PUT/DELETE /v1/embedding-configs/:id
- POST /v1/embedding-configs/set-default
- GET/POST/DELETE /v1/stopwords
- GET/POST /v1/webhooks
- POST /v1/webhooks/delete
- POST /v1/revisions
- POST /v1/revisions/restore
- GET/POST /v1/automation
- GET/PUT/DELETE /v1/automation/:id
- GET /v1/automation-logs
- POST /v1/import-url
- POST /v1/import-wiki
- POST /v1/set-ttl
- GET /v1/meta-keys
- GET /v1/checksum
- GET /v1/system/info
- GET /v1/config
- GET /v1/endpoints
- POST /v1/memory/session
- POST /v1/memory/message
- POST /v1/memory/recall
- POST /v1/memory/summarize
- POST /v1/memory/sessions
- POST /v1/memory/history
- GET /health
- POST /v1/auth/register
- GET /v1/auth/me
- GET/POST /v1/auth/permissions
- GET /v1/auth/users
- DELETE /v1/auth/users/:username
- GET/POST /v1/auth/groups
- GET/PUT/DELETE /v1/auth/groups/:name
- GET/POST /v1/auth/group-permissions
- Data Models
- Error Handling
Overview
MDDB is a lightweight markdown database server built with Go and BoltDB. It provides a RESTful API for storing, retrieving, and managing markdown documents with metadata.
Base URL: http://localhost:11023
API Version: v1
Configuration
The server can be configured using environment variables:
| Variable | Default | Description |
|---|---|---|
| MDDB_ADDR | :11023 | Server address and port |
| MDDB_MODE | wr | Access mode: read, write, or wr (read+write). Also: --mode flag, database.mode in YAML |
| MDDB_PATH | mddb.db | Path to the BoltDB database file. Also: --db flag, database.path in YAML |
| MDDB_EMBEDDING_PROVIDER | none | Embedding provider: openai, ollama, voyage, or none |
| MDDB_EMBEDDING_API_KEY | | API key for OpenAI or Voyage AI |
| MDDB_EMBEDDING_API_URL | (per provider) | API base URL (see Vector Search) |
| MDDB_EMBEDDING_MODEL | (per provider) | Embedding model name |
| MDDB_EMBEDDING_DIMENSIONS | (per provider) | Vector dimensions |
| MDDB_FTS_STEMMING | true | Enable stemming for FTS |
| MDDB_FTS_DEFAULT_LANG | en | Default language for FTS stemming and stop words (18 languages supported) |
| MDDB_FTS_SYNONYMS | true | Enable synonym expansion for FTS |
| MDDB_COMPRESSION_ENABLED | true | Enable adaptive compression (Snappy/Zstd) |
| MDDB_COMPRESSION_SMALL_THRESHOLD | 1024 | Snappy compression threshold (bytes) |
| MDDB_COMPRESSION_MEDIUM_THRESHOLD | 10240 | Zstd compression threshold (bytes) |
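The two compression thresholds suggest a size-based codec choice. The sketch below is one possible reading, not confirmed MDDB behavior: it assumes payloads below the small threshold are stored uncompressed, payloads below the medium threshold use Snappy, and larger payloads use Zstd. The function name `choose_codec` is hypothetical.

```python
def choose_codec(size_bytes, small_threshold=1024, medium_threshold=10240, enabled=True):
    """Pick a codec name for a payload of the given size.

    Assumed policy (not taken from the MDDB source):
    tiny payloads are stored as-is, mid-sized ones use Snappy, large ones Zstd.
    """
    if not enabled or size_bytes < small_threshold:
        return "none"
    if size_bytes < medium_threshold:
        return "snappy"
    return "zstd"


for size in (512, 2048, 20480):
    print(size, choose_codec(size))
```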
Access Modes
- read: Read-only mode. Write operations return 403 Forbidden
- write: Write-only mode (not commonly used)
- wr: Read and write mode (recommended for most use cases)
Endpoints
POST /v1/add
Add or update a markdown document in a collection.
Request Body:
{ "collection": "blog", "key": "homepage", "lang": "en_GB", "meta": { "category": ["blog", "featured"], "author": ["John Doe"], "tags": ["golang", "database"] }, "contentMd": "# Welcome\n\nThis is the homepage content."
}
Response:
{ "id": "blog|homepage|en_gb", "key": "homepage", "lang": "en_GB", "meta": { "category": ["blog", "featured"], "author": ["John Doe"], "tags": ["golang", "database"] }, "contentMd": "# Welcome\n\nThis is the homepage content.", "addedAt": 1699296000, "updatedAt": 1699296000
}
Features:
- Creates a new document or updates an existing one
- Automatically generates a deterministic ID based on collection, key, and lang
- Maintains revision history
- Updates metadata indices
- Tracks addedAt (first creation) and updatedAt (last modification) timestamps
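The deterministic ID can be illustrated with a small sketch. It is inferred only from the example above, where collection "blog", key "homepage", and lang "en_GB" yield "blog|homepage|en_gb"; the pipe separator and the lowercasing of lang are assumptions drawn from that single example, and `make_doc_id` is a hypothetical helper name.

```python
def make_doc_id(collection: str, key: str, lang: str) -> str:
    """Join collection, key, and lowercased lang with '|' (assumed format)."""
    return "|".join([collection, key, lang.lower()])


print(make_doc_id("blog", "homepage", "en_GB"))
```

Because the ID depends only on these three fields, adding the same collection, key, and lang again addresses the same document, which is what makes the endpoint an upsert.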
cURL Example:
curl -X POST http://localhost:11023/v1/add \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "key": "homepage",
    "lang": "en_GB",
    "meta": { "category": ["blog"] },
    "contentMd": "# Welcome to my blog"
  }'
POST /v1/add-batch
Add multiple documents to a collection in a single request. Uses the optimized batch processor for high throughput. Fires all post-commit hooks (embedding, FTS indexing, webhooks, TTL, automation triggers).
Request Body:
{ "collection": "blog", "documents": [ { "key": "post1", "lang": "en", "contentMd": "# Post 1\n\nFirst post content.", "meta": { "category": ["blog"], "author": ["John Doe"] }, "saveRevision": true }, { "key": "post2", "lang": "en", "contentMd": "# Post 2\n\nSecond post content.", "meta": { "category": ["tutorial"] } } ]
}
Parameters:
- collection (required): Collection name
- documents (required): Array of documents to add
  - key (required): Document key
  - lang (required): Language code
  - contentMd (required): Markdown content
  - meta (optional): Metadata key-value pairs
  - saveRevision (optional): Whether to save a revision for this document
Response:
{ "added": 1, "updated": 1, "failed": 0, "errors": []
}
cURL Example:
curl -X POST http://localhost:11023/v1/add-batch \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "documents": [
      {"key": "p1", "lang": "en", "contentMd": "# Hello"},
      {"key": "p2", "lang": "en", "contentMd": "# World"}
    ]
  }'
POST /v1/ingest
Bulk ingest endpoint with advanced features for scraping pipelines and data import workflows. Supports URL key derivation, YAML frontmatter extraction, content deduplication, auto-metadata injection, and collection auto-configuration.
Request Body:
{ "collection": "imported", "documents": [ { "url": "https://example.com/page1", "lang": "en", "contentMd": "# Page 1\n\nContent here.", "scraper": "my-crawler", "scrapedAt": 1709500000, "ttl": 86400 }, { "url": "https://example.com/page2", "lang": "en", "contentMd": "---\ntitle: Page 2\ncategory: tutorial\n---\n# Page 2\n\nMore content.", "extractFrontmatter": true } ], "options": { "skipDuplicates": true, "autoConfigureCollection": true }
}
Parameters:
- collection (required): Collection name
- documents (required): Array of documents to ingest
  - url (optional): Source URL, used for key derivation and auto-injected as source_url metadata
  - key (optional): Document key. If empty, derived from the URL
  - lang (required): Language code
  - contentMd (required): Markdown content
  - meta (optional): Metadata key-value pairs
  - extractFrontmatter (optional): Parse YAML frontmatter from content and merge into metadata
  - scrapedAt (optional): Unix timestamp of when the content was scraped, auto-injected as scraped_at metadata
  - scraper (optional): Scraper identifier, auto-injected as scraper metadata
  - ttl (optional): Time-to-live in seconds
- options (optional): Ingest options
  - skipDuplicates (optional): Skip documents whose content hasn't changed (CRC32 hash comparison)
  - skipEmbeddings (optional): Skip embedding generation for this batch
  - skipFts (optional): Skip FTS indexing for this batch
  - skipWebhooks (optional): Skip webhook firing for this batch
  - autoConfigureCollection (optional): Auto-configure collection as "scraping" type if it doesn't exist
  - saveRevision (optional): Save revision history for all documents in this batch
Response:
{ "added": 2, "updated": 0, "skipped": 0, "failed": 0, "errors": [], "collection": "imported", "durationMs": 45
}
Features:
- URL key derivation: If key is empty, a deterministic key is derived from the URL path
- Frontmatter extraction: When extractFrontmatter is true, YAML frontmatter is parsed from content and merged into metadata (request metadata takes priority over frontmatter)
- Auto-metadata injection: source_url, scraped_at, and scraper fields are auto-injected into document metadata
- Content deduplication: With skipDuplicates, existing documents with identical content (CRC32 hash) are skipped
- Collection auto-configuration: With autoConfigureCollection, the collection is created with type "scraping" if it doesn't exist
- Selective hook control: Skip embeddings, FTS, or webhooks per batch via options
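The content deduplication feature compares CRC32 hashes of document content. A minimal sketch of that comparison follows, assuming the hash is taken over the UTF-8 bytes of contentMd; the exact bytes hashed by the server and the helper names `content_hash` and `should_skip` are assumptions.

```python
import zlib


def content_hash(content_md: str) -> int:
    """CRC32 of the UTF-8 encoded content (encoding is an assumption)."""
    return zlib.crc32(content_md.encode("utf-8"))


def should_skip(existing_content, new_content: str) -> bool:
    """Return True when a stored document already has identical content.

    existing_content is None when no document is stored under that key yet.
    """
    if existing_content is None:
        return False
    return content_hash(existing_content) == content_hash(new_content)


print(should_skip("# Page 1", "# Page 1"))  # unchanged content
print(should_skip("# Page 1", "# Page 2"))  # changed content
```

CRC32 is a fast checksum rather than a cryptographic hash, so it suits change detection in a scraping pipeline but gives no protection against deliberately crafted collisions.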
cURL Example:
curl -X POST http://localhost:11023/v1/ingest \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "imported",
    "documents": [
      {"url": "https://example.com/page1", "lang": "en", "contentMd": "# Hello", "scraper": "my-crawler"},
      {"url": "https://example.com/page2", "lang": "en", "contentMd": "# World", "extractFrontmatter": true}
    ],
    "options": {"autoConfigureCollection": true, "skipDuplicates": true}
  }'
POST /v1/upload
Upload files via multipart/form-data. Files are auto-converted to Markdown and stored as documents. Supports single and batch upload.
Content-Type: multipart/form-data
Form Fields:
- file or files[] (required): One or more files to upload. Supported formats: .md, .txt, .html, .htm, .pdf, .docx, .odt, .rtf, .yaml, .yml, .log, .lex, .tex, .latex
- collection (required): Target collection name
- lang (required): Document language code (e.g. en_US, pl_PL)
- key (optional): Document key. If empty, derived from the filename (lowercase, spaces replaced with hyphens, extension stripped)
- meta (optional): JSON-encoded metadata map, e.g. {"category":["docs"]}
- ttl (optional): Time-to-live in seconds (0 = no expiry)
- maxSize (optional): Per-file size limit in bytes (default: 10MB, max: 100MB)
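The filename-to-key rule described for the key field (lowercase, spaces to hyphens, extension stripped) can be sketched as below. The helper name `key_from_filename` is hypothetical, and any further normalization the server may apply (for example to punctuation) is not covered.

```python
import os


def key_from_filename(filename: str) -> str:
    """Derive a document key from an uploaded filename.

    Follows the three documented steps only: strip the extension,
    lowercase the stem, replace spaces with hyphens.
    """
    stem, _extension = os.path.splitext(os.path.basename(filename))
    return stem.lower().replace(" ", "-")


print(key_from_filename("User Manual.PDF"))
```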
Format Conversion:
| Format | Extension | Conversion |
|---|---|---|
| Markdown | .md | Stored as-is, frontmatter extracted |
| Plain text | .txt | Stored as-is, frontmatter extracted |
| HTML | .html, .htm | Converted to Markdown (headings, links, lists, bold/italic preserved) |
| PDF | .pdf | Text extracted (text-based PDFs only; scanned/image PDFs are not supported, use Docling for those) |
| DOCX | .docx | Text extracted with headings and list structure preserved |
| ODT | .odt | OpenDocument text extracted with headings preserved |
| RTF | .rtf | Rich Text Format β text extracted, formatting stripped |
| LaTeX | .tex, .latex | Converted to Markdown (sections, formatting, environments, math preserved) |
| YAML | .yaml, .yml | Wrapped in code block for structured data |
| Log | .log | Wrapped in code block |
| LEX | .lex | Wrapped in code block |
Auto-injected Metadata:
- upload_format: Original file format (e.g. pdf, html, docx)
- upload_filename: Original filename
- upload_converted: "true" if the file was converted from a non-markdown format
Single File Response:
{ "key": "report-2026-q1", "format": "pdf", "converted": true, "document": { "id": "doc|docs|report-2026-q1", "key": "report-2026-q1", "lang": "en_US", "meta": { "upload_format": ["pdf"], "upload_filename": ["report-2026-q1.pdf"], "upload_converted": ["true"] }, "contentMd": "# Q1 2026 Report\n\nExtracted text content...", "addedAt": 1710000000, "updatedAt": 1710000000 }
}
Batch Response (multiple files):
{ "added": 3, "updated": 0, "failed": 0, "errors": [], "results": [ {"key": "doc1", "format": "pdf", "converted": true, "document": {...}}, {"key": "doc2", "format": "html", "converted": true, "document": {...}}, {"key": "doc3", "format": "txt", "converted": false, "document": {...}} ]
}
cURL Examples:
curl -X POST http://localhost:11023/v1/upload \
  -F "file=@<path-to-file>" \
  -F "collection=docs" \
  -F "lang=en_US"

curl -X POST http://localhost:11023/v1/upload \
  -F "file=@<path-to-file>" \
  -F "collection=docs" \
  -F "key=user-manual" \
  -F "lang=en_US" \
  -F 'meta={"category":["documentation"],"type":["manual"]}'

curl -X POST http://localhost:11023/v1/upload \
  -F "files[]=@<path-to-file-1>" \
  -F "files[]=@<path-to-file-2>" \
  -F "files[]=@<path-to-file-3>" \
  -F "collection=docs" \
  -F "lang=en_US"

curl -X POST http://localhost:11023/v1/upload \
  -F "file=@<path-to-file>" \
  -F "collection=docs" \
  -F "lang=en_US" \
  -F "maxSize=52428800"
MCP Tool: upload_file. Accepts base64-encoded file content with a filename for format detection.
POST /v1/get
Retrieve a specific document by collection, key, and language.
Request Body:
{ "collection": "blog", "key": "homepage", "lang": "en_GB", "env": { "year": "2024", "siteName": "My Blog" }
}
Response:
{ "id": "blog|homepage|en_gb", "key": "homepage", "lang": "en_GB", "meta": { "category": ["blog"] }, "contentMd": "# Welcome to My Blog in 2024", "addedAt": 1699296000, "updatedAt": 1699296000
}
Features:
- Retrieves the latest version of a document
- Supports templating via the env parameter
- Template variables in content are replaced: %%varName%% becomes the value of varName from env
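The %%varName%% substitution can be sketched in a few lines. This is an illustration of the documented placeholder syntax, not the server's implementation; in particular, leaving unknown variables untouched is an assumption, and `render_template` is a hypothetical name.

```python
import re

# Matches placeholders such as %%year%% or %%siteName%%.
_PLACEHOLDER = re.compile(r"%%(\w+)%%")


def render_template(content_md: str, env: dict) -> str:
    """Replace each %%name%% with env[name]; unknown names are left as-is (assumed)."""
    def substitute(match):
        name = match.group(1)
        return env.get(name, match.group(0))

    return _PLACEHOLDER.sub(substitute, content_md)


print(render_template("Copyright %%year%% %%siteName%%", {"year": "2024", "siteName": "My Blog"}))
```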
Template Example:
If your content contains:
# Welcome to %%siteName%% in %%year%%
And you provide:
{ "env": { "year": "2024", "siteName": "My Blog" }
}
The response will contain:
# Welcome to My Blog in 2024
cURL Example:
curl -X POST http://localhost:11023/v1/get \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "key": "homepage",
    "lang": "en_GB",
    "env": {"year": "2024"}
  }'
POST /v1/search
Search for documents in a collection with optional metadata filtering and sorting.
Request Body:
{ "collection": "blog", "filterMeta": { "category": ["blog", "tutorial"], "author": ["John Doe"] }, "sort": "updatedAt", "asc": false, "limit": 10, "offset": 0
}
Parameters:
- collection (required): Collection name
- filterMeta (optional): Metadata filters (AND between keys, OR between values)
- sort (optional): Sort field: addedAt, updatedAt, or key
- asc (optional): Sort order: true for ascending, false for descending
- limit (optional): Maximum number of results (default: 50)
- offset (optional): Number of results to skip (default: 0)
Response:
[ { "id": "blog|post1|en_gb", "key": "post1", "lang": "en_GB", "meta": { "category": ["blog"], "author": ["John Doe"] }, "contentMd": "# Post 1", "addedAt": 1699296000, "updatedAt": 1699296100 }, { "id": "blog|post2|en_gb", "key": "post2", "lang": "en_GB", "meta": { "category": ["tutorial"], "author": ["John Doe"] }, "contentMd": "# Post 2", "addedAt": 1699295000, "updatedAt": 1699296200 }
]
Filtering Logic:
- Multiple values for the same key are combined with OR
- Multiple keys are combined with AND
- Example: {"category": ["blog", "tutorial"], "author": ["John"]} means:
  - (category = "blog" OR category = "tutorial") AND (author = "John")
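The AND-between-keys, OR-between-values rule can be expressed as a small predicate. This is a sketch of the documented logic using plain dictionaries; `matches_filter` is a hypothetical name, and treating an empty value list as "no match" is an assumption.

```python
def matches_filter(meta: dict, filter_meta: dict) -> bool:
    """Return True when meta satisfies every key in filter_meta.

    For each filter key, at least one of the wanted values must appear
    in the document's values for that key (OR); all keys must pass (AND).
    """
    for key, wanted_values in filter_meta.items():
        doc_values = meta.get(key, [])
        if not any(value in doc_values for value in wanted_values):
            return False
    return True


query = {"category": ["blog", "tutorial"], "author": ["John"]}
print(matches_filter({"category": ["tutorial"], "author": ["John"]}, query))
print(matches_filter({"category": ["news"], "author": ["John"]}, query))
```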
cURL Example:
curl -X POST http://localhost:11023/v1/search \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "filterMeta": {"category": ["blog"]},
    "sort": "addedAt",
    "asc": true,
    "limit": 10
  }'
POST /v1/vector-search
Perform semantic (vector) search using natural language queries. Documents are automatically embedded when added (if an embedding provider is configured). The search finds documents by meaning, not just exact metadata matches.
Request Body:
{ "collection": "docs", "query": "how to authenticate users", "topK": 5, "threshold": 0.3, "filterMeta": { "category": ["tutorial"] }, "includeContent": true
}
Parameters:
- collection (required): Collection name
- query (required*): Natural language search query (will be embedded server-side)
- queryVector (optional*): Pre-computed embedding vector (use instead of query)
- topK (optional): Maximum results to return (default: 5)
- threshold (optional): Minimum similarity score 0.0-1.0 (default: 0.0)
- filterMeta (optional): Metadata pre-filter (same logic as /v1/search)
- includeContent (optional): Include contentMd in results (default: false)
* Either query or queryVector is required.
Response:
{ "results": [ { "document": { "id": "docs|auth-guide|en_us", "key": "auth-guide", "lang": "en_US", "meta": {"category": ["tutorial"]}, "contentMd": "# Authentication Guide\n...", "addedAt": 1709136000, "updatedAt": 1709136000 }, "score": 0.89, "rank": 1 }, { "document": { "id": "docs|login-flow|en_us", "key": "login-flow", "lang": "en_US", "meta": {"category": ["tutorial"]}, "contentMd": "# Login Flow\n...", "addedAt": 1709135000, "updatedAt": 1709135000 }, "score": 0.74, "rank": 2 } ], "total": 2, "model": "text-embedding-3-small", "dimensions": 1536
}
Response Fields:
- results: Array of matched documents with similarity scores
  - document: Full document object
  - score: Cosine similarity score (0.0-1.0, higher = more similar)
  - rank: Position in results (1-based)
- total: Number of results returned
- model: Embedding model used
- dimensions: Vector dimensionality
How It Works:
- When a document is added via /v1/add, its content is automatically embedded in the background
- The query text is embedded using the same model
- Cosine similarity is computed between the query vector and all document vectors
- Results are ranked by similarity score
- If filterMeta is provided, only documents matching the metadata filter are searched (hybrid search)
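The scoring and ranking steps above can be sketched as a brute-force scan. This is an illustration under simple assumptions (vectors as plain Python lists, an in-memory dict of document vectors); the function names are hypothetical and the server's actual index structure is not described here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_search(query_vector, doc_vectors, top_k=5, threshold=0.0):
    """Score every document, drop those below threshold, return the top_k ranked."""
    scored = [(doc_id, cosine_similarity(query_vector, vec)) for doc_id, vec in doc_vectors.items()]
    scored = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [
        {"id": doc_id, "score": score, "rank": position + 1}
        for position, (doc_id, score) in enumerate(scored[:top_k])
    ]


docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
for hit in vector_search([1.0, 0.0], docs, top_k=2, threshold=0.3):
    print(hit)
```

A scan like this costs time proportional to the number of documents times the vector dimensions, which is why narrowing the candidate set with a metadata pre-filter first reduces search time.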
cURL Example:
curl -X POST http://localhost:11023/v1/vector-search \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "docs",
    "query": "how to authenticate users",
    "topK": 5,
    "includeContent": true
  }'
POST /v1/vector-reindex
Re-embed all documents in a collection. Useful after changing the embedding provider/model, or for initial indexing of existing documents.
Request Body:
{ "collection": "docs", "force": false
}
Parameters:
- collection (required): Collection name
- force (optional): If true, re-embed all documents regardless of content changes. If false, skip documents whose content hasn't changed (default: false)
Response:
{ "embedded": 42, "skipped": 8, "failed": 0, "errors": []
}
cURL Example:
curl -X POST http://localhost:11023/v1/vector-reindex \
  -H 'Content-Type: application/json' \
  -d '{"collection": "docs", "force": false}'
GET /v1/vector-stats
Get embedding/vector search statistics.
Response:
{ "enabled": true, "provider": "text-embedding-3-small", "model": "text-embedding-3-small", "dimensions": 1536, "index_ready": true, "collections": { "docs": { "total_documents": 50, "embedded_documents": 48 }, "blog": { "total_documents": 120, "embedded_documents": 120 } }
}
cURL Example:
curl http://localhost:11023/v1/vector-stats
Vector Search Configuration
Embedding Providers
| Provider | MDDB_EMBEDDING_PROVIDER | Default Model | Default Dimensions | API Key Required |
|---|---|---|---|---|
| OpenAI | openai | text-embedding-3-small | 1536 | Yes |
| Voyage AI (Anthropic) | voyage | voyage-3 | 1024 | Yes |
| Ollama (local) | ollama | nomic-embed-text | 768 | No |
| Disabled | none or empty | - | - | - |
Provider-Specific Configuration
OpenAI:
MDDB_EMBEDDING_PROVIDER=openai
MDDB_EMBEDDING_API_KEY=sk-...
MDDB_EMBEDDING_API_URL=https://api.openai.com/v1 # default
MDDB_EMBEDDING_MODEL=text-embedding-3-small # default
MDDB_EMBEDDING_DIMENSIONS=1536 # default
Voyage AI (Anthropic):
MDDB_EMBEDDING_PROVIDER=voyage
MDDB_EMBEDDING_API_KEY=pa-...
MDDB_EMBEDDING_API_URL=https://api.voyageai.com/v1 # default
MDDB_EMBEDDING_MODEL=voyage-3 # default
MDDB_EMBEDDING_DIMENSIONS=1024 # default
Ollama (local, no API key needed):
MDDB_EMBEDDING_PROVIDER=ollama
MDDB_EMBEDDING_API_URL=http://localhost:11434 # default
MDDB_EMBEDDING_MODEL=nomic-embed-text # default
MDDB_EMBEDDING_DIMENSIONS=768 # default
Performance Benchmarks (Apple M2)
| Documents | Dimensions | Search Latency | Throughput |
|---|---|---|---|
| 1,000 | 768 | ~0.9 ms | ~1,064 qps |
| 1,000 | 1,536 | ~1.8 ms | ~544 qps |
| 5,000 | 768 | ~4.8 ms | ~210 qps |
| 10,000 | 768 | ~9.7 ms | ~104 qps |
| 10,000 | 1,536 | ~19 ms | ~52 qps |
| 50,000 | 768 | ~50 ms | ~20 qps |
| 50,000 | 1,536 | ~96 ms | ~10 qps |
Metadata pre-filtering significantly reduces search time (e.g., filtering to 10% of 10K docs: ~1.1 ms vs ~9.7 ms).
POST /v1/fts
Perform full-text search across document content. Supports multiple search modes: simple, boolean, phrase, wildcard, proximity, and range filtering. Uses TF-IDF, BM25, BM25F, or PMISparse scoring with optional stemming, synonyms, and typo tolerance.
Request Body:
{ "collection": "blog", "query": "markdown database tutorial", "limit": 10, "algorithm": "bm25f", "fuzzy": 1, "mode": "auto", "disableStem": false, "disableSynonyms": false, "fieldWeights": { "content": 1.0, "meta.title": 3.0, "meta.tags": 2.0 }, "rangeMeta": [ {"field": "addedAt", "gte": "2024-01-01", "lte": "2024-12-31"} ]
}
Parameters:
- collection (required): Collection name
- query (required): Search query text
- limit (optional): Maximum results (default: 50)
- algorithm (optional): "tfidf" (default), "bm25", "bm25f", or "pmisparse". Used for simple mode
- mode (optional): Search mode: "auto" (default), "simple", "boolean", "phrase", "wildcard", "proximity"
- distance (optional): Proximity distance in words (default: 5). Only used with mode=proximity
- fuzzy (optional): Typo tolerance: 0 (off, default), 1 (1 edit), 2 (2 edits). Used for simple mode
- lang (optional): Language code for query tokenization (e.g., "pl", "de", "fr"). Uses language-specific stemmer and stop words. Falls back to server default if omitted (default: "en", configurable via MDDB_FTS_DEFAULT_LANG)
- disableStem (optional): Disable stemming for this query (default: false)
- disableSynonyms (optional): Disable synonym expansion for this query (default: false)
- fieldWeights (optional, BM25F only): Map of field name to weight. Defaults: content=1.0, meta.title=3.0, meta.tags=2.0, meta.category=2.0, meta.description=1.5
- filterMeta (optional): Metadata pre-filter, e.g. {"key": ["value1", "value2"]}
- rangeMeta (optional): Array of range filters on metadata or timestamps
Search Modes:
- simple: Standard full-text search with TF-IDF/BM25/BM25F/PMISparse scoring
- boolean: Boolean operators, e.g. rust AND performance, rust OR golang, NOT java, +required -excluded
- phrase: Exact phrase matching, e.g. "machine learning algorithms" (consecutive terms)
- wildcard: Pattern matching, e.g. prog* (any suffix), te?t (single char)
- proximity: Terms within N words, e.g. "rust systems" with distance: 5
- auto: Auto-detects mode from query syntax (default)
Range Filter Object:
- field (required): Metadata key name, or "addedAt" / "updatedAt" for timestamps
- gte (optional): Greater than or equal (supports unix timestamps, ISO dates, numeric strings)
- lte (optional): Less than or equal
- gt (optional): Greater than (strict)
- lt (optional): Less than (strict)
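How the four bounds combine can be sketched for the numeric case. The sketch below handles only numeric strings; ISO date parsing, which the endpoint also accepts, is left out, and `in_range` is a hypothetical helper name.

```python
import operator

# Each bound name maps to the comparison the value must satisfy.
_BOUND_CHECKS = {
    "gte": operator.ge,
    "lte": operator.le,
    "gt": operator.gt,
    "lt": operator.lt,
}


def in_range(value, range_filter: dict) -> bool:
    """Return True when value satisfies every bound present in range_filter."""
    number = float(value)
    for bound_name, compare in _BOUND_CHECKS.items():
        bound = range_filter.get(bound_name)
        if bound is not None and not compare(number, float(bound)):
            return False
    return True


price_filter = {"field": "price", "gte": "10", "lte": "100"}
print(in_range("55.5", price_filter))
print(in_range("100.01", price_filter))
```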
Response:
{ "results": [ { "document": { "id": "blog|post1|en_gb", "key": "post1", "lang": "en_GB", "meta": {"category": ["tutorial"]}, "contentMd": "# Markdown Database Tutorial..." }, "score": 2.3456, "matchedTerms": ["markdown", "databas", "tutori"] } ], "total": 1, "algorithm": "bm25", "mode": "simple", "lang": "en", "stemmingActive": true, "synonymsActive": true
}
cURL Examples:
curl -X POST http://localhost:11023/v1/fts \
  -H 'Content-Type: application/json' \
  -d '{"collection":"blog","query":"markdown database","algorithm":"bm25","limit":10}'

curl -X POST http://localhost:11023/v1/fts \
  -H 'Content-Type: application/json' \
  -d '{"collection":"blog","query":"rust AND performance NOT java","mode":"boolean"}'

curl -X POST http://localhost:11023/v1/fts \
  -H 'Content-Type: application/json' \
  -d '{"collection":"blog","query":"\"machine learning\"","mode":"phrase"}'

curl -X POST http://localhost:11023/v1/fts \
  -H 'Content-Type: application/json' \
  -d '{"collection":"blog","query":"prog*","mode":"wildcard"}'

curl -X POST http://localhost:11023/v1/fts \
  -H 'Content-Type: application/json' \
  -d '{"collection":"blog","query":"rust systems","mode":"proximity","distance":5}'

curl -X POST http://localhost:11023/v1/fts \
  -H 'Content-Type: application/json' \
  -d '{"collection":"shop","query":"widget","rangeMeta":[{"field":"price","gte":"10","lte":"100"}]}'

curl -X POST http://localhost:11023/v1/fts \
  -H 'Content-Type: application/json' \
  -d '{"collection":"articles","query":"programowanie wydajne","lang":"pl","algorithm":"bm25"}'
POST /v1/fts-reindex
Reindex all documents in a collection using their stored lang field for language-aware FTS processing.
Query Parameters:
- collection (required): Collection name to reindex
cURL Example:
curl -X POST "http://localhost:11023/v1/fts-reindex?collection=articles"
Response:
{ "reindexed": 150, "collection": "articles"
}
GET /v1/fts-languages
Returns all supported languages for multi-language FTS.
cURL Example:
curl http://localhost:11023/v1/fts-languages
Response:
{ "languages": [ {"code": "ar", "name": "Arabic"}, {"code": "da", "name": "Danish"}, {"code": "de", "name": "German"}, {"code": "en", "name": "English"} ], "defaultLang": "en"
}
POST /v1/synonyms
Add or update synonyms for a term in a collection.
Request Body:
{ "collection": "docs", "term": "big", "synonyms": ["large", "huge", "enormous"]
}
Response:
{ "status": "ok"
}
cURL Example:
curl -X POST http://localhost:11023/v1/synonyms \
  -H 'Content-Type: application/json' \
  -d '{"collection":"docs","term":"big","synonyms":["large","huge","enormous"]}'
GET /v1/synonyms
List all synonyms for a collection.
Query Parameters:
- collection (required): Collection name
Response:
{ "collection": "docs", "synonyms": { "big": ["large", "huge", "enormous"], "fast": ["quick", "rapid", "swift"] }
}
cURL Example:
curl "http://localhost:11023/v1/synonyms?collection=docs"
DELETE /v1/synonyms
Delete all synonyms for a term in a collection.
Request Body:
{ "collection": "docs", "term": "big"
}
Response:
{ "status": "ok"
}
cURL Example:
curl -X DELETE http://localhost:11023/v1/synonyms \
  -H 'Content-Type: application/json' \
  -d '{"collection":"docs","term":"big"}'
POST /v1/export
Export documents from a collection in NDJSON or ZIP format.
Request Body:
{ "collection": "blog", "filterMeta": { "category": ["blog"] }, "format": "ndjson"
}
Parameters:
- collection (required): Collection name
- filterMeta (optional): Metadata filters (same as search)
- format (required): Export format: ndjson or zip
Response (NDJSON):
{"id":"blog|post1|en_gb","key":"post1","lang":"en_GB","meta":{"category":["blog"]},"contentMd":"# Post 1","addedAt":1699296000,"updatedAt":1699296100}
{"id":"blog|post2|en_gb","key":"post2","lang":"en_GB","meta":{"category":["blog"]},"contentMd":"# Post 2","addedAt":1699295000,"updatedAt":1699296200}
Response (ZIP):
Binary ZIP file containing markdown files named as {key}.{lang}.md
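A client consuming an export might handle the two formats as sketched below. The NDJSON parsing follows directly from the one-JSON-object-per-line format; the ZIP entry name follows the {key}.{lang}.md pattern, with the assumption that lang keeps its original casing. Both helper names are hypothetical.

```python
import json


def parse_ndjson(text: str) -> list:
    """Parse an NDJSON export: one JSON document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def zip_entry_name(doc: dict) -> str:
    """Expected file name of a document inside a ZIP export ({key}.{lang}.md)."""
    return f"{doc['key']}.{doc['lang']}.md"


sample = (
    '{"id":"blog|post1|en_gb","key":"post1","lang":"en_GB","contentMd":"# Post 1"}\n'
    '{"id":"blog|post2|en_gb","key":"post2","lang":"en_GB","contentMd":"# Post 2"}\n'
)
for doc in parse_ndjson(sample):
    print(zip_entry_name(doc), len(doc["contentMd"]))
```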
cURL Examples:
NDJSON export:
curl -X POST http://localhost:11023/v1/export \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "filterMeta": {"category": ["blog"]},
    "format": "ndjson"
  }' > export.ndjson
ZIP export:
curl -X POST http://localhost:11023/v1/export \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "format": "zip"
  }' > export.zip
GET /v1/backup
Create a backup of the database file.
Query Parameters:
- to (optional): Backup file name (default: backup-{timestamp}.db)
Response:
{ "backup": "backup-1699296000.db"
}
cURL Example:
curl "http://localhost:11023/v1/backup?to=backup-$(date +%s).db"
Notes:
- Creates a copy of the entire BoltDB database file
- Backup is created in the same directory as the database
- Does not interrupt server operations
POST /v1/restore
Restore the database from a backup file.
Request Body:
{ "from": "backup-1699296000.db"
}
Response:
{ "restored": "backup-1699296000.db"
}
cURL Example:
curl -X POST http://localhost:11023/v1/restore \
  -H 'Content-Type: application/json' \
  -d '{"from": "backup-1699296000.db"}'
⚠️ Warning:
- This operation replaces the current database
- The server briefly closes and reopens the database connection
- All current data will be replaced with the backup
POST /v1/truncate
Truncate revision history and optionally clear cache.
Request Body:
{ "collection": "blog", "keepRevs": 3, "dropCache": true
}
Parameters:
- collection (required): Collection name
- keepRevs (required): Number of recent revisions to keep per document (0 = delete all history)
- dropCache (optional): Whether to drop cache (placeholder for future use)
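The effect of keepRevs on a single document's history can be sketched as a list slice. The sketch assumes revisions are ordered oldest to newest; `truncate_revisions` is a hypothetical name, and rejecting negative values is an assumption about input handling.

```python
def truncate_revisions(revisions: list, keep_revs: int) -> list:
    """Keep only the most recent keep_revs revisions (0 deletes all history)."""
    if keep_revs < 0:
        raise ValueError("keep_revs must be non-negative")
    if keep_revs == 0:
        # Handled separately: revisions[-0:] would return the whole list.
        return []
    return revisions[-keep_revs:]


history = ["rev1", "rev2", "rev3", "rev4", "rev5"]
print(truncate_revisions(history, 3))
```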
Response:
{ "status": "truncated"
}
cURL Example:
curl -X POST http://localhost:11023/v1/truncate \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "keepRevs": 3,
    "dropCache": true
  }'
Use Cases:
- Reduce database size by removing old revisions
- Keep only recent history for auditing
- Clean up after bulk imports
GET /v1/stats
Get server and database statistics.
Request: No body required (GET request)
Response:
{ "databasePath": "mddb.db", "databaseSize": 16384, "mode": "wr", "collections": [ { "name": "blog", "documentCount": 42, "revisionCount": 156, "metaIndexCount": 84 } ], "totalDocuments": 42, "totalRevisions": 156, "totalMetaIndices": 84, "uptime": ""
}
Response Fields:
- databasePath: Path to the database file
- databaseSize: Database file size in bytes
- mode: Access mode (read, write, wr)
- collections: Array of collection statistics
  - name: Collection name
  - documentCount: Number of documents in collection
  - revisionCount: Number of revisions in collection
  - metaIndexCount: Number of metadata indices in collection
- totalDocuments: Total documents across all collections
- totalRevisions: Total revisions across all collections
- totalMetaIndices: Total metadata indices across all collections
cURL Example:
curl http://localhost:11023/v1/stats
CLI Example:
mddb-cli stats
Use Cases:
- Monitor database growth
- Check collection sizes before operations
- Verify indexing status
- Performance monitoring and capacity planning
POST /v1/schema/set
Set or update the validation schema for a collection. Schema validation is opt-in per collection. See the Schema Validation Guide for full details on supported rules.
Request Body:
{ "collection": "blog", "schema": { "required": ["category", "author"], "properties": { "category": { "type": "string", "enum": ["blog", "tutorial", "news"] }, "author": { "type": "string" }, "tags": { "type": "string", "minItems": 1, "maxItems": 5 } } }
}
Response:
{ "status": "ok"
}
cURL Example:
curl -X POST http://localhost:11023/v1/schema/set \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "schema": {
      "required": ["category"],
      "properties": {
        "category": { "type": "string", "enum": ["blog", "tutorial"] }
      }
    }
  }'
POST /v1/schema/get
Retrieve the current validation schema for a collection.
Request Body:
{ "collection": "blog"
}
Response (schema exists):
{ "collection": "blog", "schema": { "required": ["category", "author"], "properties": { "category": { "type": "string", "enum": ["blog", "tutorial", "news"] }, "author": { "type": "string" }, "tags": { "type": "string", "minItems": 1, "maxItems": 5 } } }
}
Response (no schema):
{ "collection": "blog", "schema": null
}
cURL Example:
curl -X POST http://localhost:11023/v1/schema/get \
  -H 'Content-Type: application/json' \
  -d '{"collection": "blog"}'
POST /v1/schema/delete
Delete the validation schema for a collection, disabling validation. Existing documents are not affected.
Request Body:
{ "collection": "blog"
}
Response:
{ "status": "ok"
}
cURL Example:
curl -X POST http://localhost:11023/v1/schema/delete \
  -H 'Content-Type: application/json' \
  -d '{"collection": "blog"}'
POST /v1/schema/list
List all collections that have a validation schema defined.
Request Body: Empty or {}.
Response:
{ "schemas": [ { "collection": "blog", "schema": { "required": ["category", "author"], "properties": { "category": { "type": "string", "enum": ["blog", "tutorial", "news"] }, "author": { "type": "string" } } } }, { "collection": "products", "schema": { "required": ["price", "sku"], "properties": { "price": { "type": "number" }, "sku": { "type": "string", "pattern": "^SKU-[0-9]+$" } } } } ]
}
cURL Example:
curl -X POST http://localhost:11023/v1/schema/list \
  -H 'Content-Type: application/json' \
  -d '{}'
POST /v1/validate
Validate a document's metadata against the collection schema without persisting anything. Useful for dry-run checks.
Request Body:
{ "collection": "blog", "meta": { "category": ["blog"], "author": ["Jane Doe"], "tags": ["golang", "tutorial"] }
}
Response (valid):
{ "valid": true, "errors": []
}
Response (invalid):
{ "valid": false, "errors": [ "value \"pending\" for key \"status\" is not in allowed enum values [draft, published, archived]", "key \"tags\" has 6 values, exceeds maxItems 5" ]
}
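The checks behind these error messages can be approximated on the client before calling the endpoint. The following is a minimal sketch, assuming the schema supports only the `required`, `enum`, `minItems`, `maxItems`, and `pattern` rules that appear in the examples in this document. The function name `validate_meta` and the wording of the messages are illustrative; the server's actual rules may differ.

```python
import re

def validate_meta(meta, schema):
    """Return a list of error strings; an empty list means the metadata passed."""
    errors = []
    # Every key listed under "required" must be present with at least one value.
    for key in schema.get("required", []):
        if not meta.get(key):
            errors.append(f'missing required key "{key}"')
    for key, rules in schema.get("properties", {}).items():
        values = meta.get(key)
        if values is None:
            continue
        # minItems / maxItems bound the number of values stored under a key.
        if "minItems" in rules and len(values) < rules["minItems"]:
            errors.append(f'key "{key}" has {len(values)} values, below minItems {rules["minItems"]}')
        if "maxItems" in rules and len(values) > rules["maxItems"]:
            errors.append(f'key "{key}" has {len(values)} values, exceeds maxItems {rules["maxItems"]}')
        for value in values:
            if "enum" in rules and value not in rules["enum"]:
                allowed = ", ".join(rules["enum"])
                errors.append(f'value "{value}" for key "{key}" is not in allowed enum values [{allowed}]')
            if "pattern" in rules and not re.search(rules["pattern"], value):
                errors.append(f'value "{value}" for key "{key}" does not match pattern {rules["pattern"]}')
    return errors
```

A local check like this only saves a round trip; /v1/validate remains the authoritative dry run.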
cURL Example:
curl -X POST http://localhost:11023/v1/validate \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "meta": {
      "category": ["blog"],
      "author": ["Jane Doe"]
    }
  }'
POST /v1/auth/login
Authenticate with a username and password to receive a JWT. Include the token in the Authorization header (Authorization: Bearer <token>) on subsequent authenticated requests.
Request Body:
{ "username": "admin", "password": "secret"
}
Response:
{ "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...", "expiresAt": 1709481200
}
cURL Example:
curl -X POST http://localhost:11023/v1/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"username":"admin","password":"secret"}'
Error Responses:
- 401 Unauthorized - Invalid credentials
- 400 Bad Request - Invalid request format
POST /v1/auth/api-key
Create a new API key for programmatic access. Requires JWT authentication via Authorization header.
Authentication: JWT token required
Request Body:
{ "description": "CI/CD pipeline", "expiresAt": 0
}
Parameters:
- description (string, optional): Human-readable label for the API key
- expiresAt (int64, optional): Unix timestamp when the key expires (0 = never expires)
Response:
{ "key": "mddb_live_abc123def456...", "description": "CI/CD pipeline", "createdAt": 1709394600, "expiresAt": 0
}
cURL Example:
TOKEN=$(curl -s -X POST http://localhost:11023/v1/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"username":"admin","password":"secret"}' | jq -r .token)

curl -X POST http://localhost:11023/v1/auth/api-key \
  -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"description":"Production deployment","expiresAt":0}'
Important Notes:
- The full API key is shown only once, in this response
- Save the key securely; it cannot be retrieved again
- API keys are hashed with SHA-256 before storage
- Use the key in subsequent requests via the X-API-Key header
Error Responses:
- 401 Unauthorized - Missing or invalid JWT token
- 400 Bad Request - Invalid request format
- 500 Internal Server Error - Failed to create API key
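Because only a hash of the key is stored, it can help to see what such a digest looks like. This is a minimal sketch, assuming the stored value is the hex-encoded SHA-256 of the raw key string. The encoding and any salting are not specified in this document, so `hash_api_key` is a hypothetical helper for illustration only.

```python
import hashlib

def hash_api_key(api_key: str) -> str:
    """Hex-encoded SHA-256 digest of the key's UTF-8 bytes (assumed format)."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()
```

The digest is one-way, which is why a lost key cannot be recovered and has to be replaced with a new one.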
GET /v1/auth/api-keys
List all API keys for the authenticated user. Returns metadata about each key (not the actual key values).
Authentication: JWT token required
Response:
{ "keys": [ { "keyHash": "abc123def456...", "description": "Production deployment", "createdAt": 1709394600, "expiresAt": 0 }, { "keyHash": "xyz789ghi012...", "description": "Development testing", "createdAt": 1709395200, "expiresAt": 1740931200 } ]
}
Response Fields:
- keyHash (string): SHA-256 hash of the API key (use this to delete the key)
- description (string): Key description
- createdAt (int64): Unix timestamp of creation
- expiresAt (int64): Unix timestamp of expiry (0 = never expires)
cURL Example:
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:11023/v1/auth/api-keys
Error Responses:
- 401 Unauthorized - Missing or invalid JWT token
- 500 Internal Server Error - Failed to retrieve API keys
DELETE /v1/auth/api-keys/:keyHash
Delete an API key by its hash. Users can only delete their own API keys.
Authentication: JWT token required
URL Parameters:
- keyHash (string, required): The SHA-256 hash of the API key (from GET /v1/auth/api-keys)
Response:
{ "status": "deleted"
}
cURL Example:
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:11023/v1/auth/api-keys

curl -X DELETE -H "Authorization: Bearer $TOKEN" \
  http://localhost:11023/v1/auth/api-keys/abc123def456...
Error Responses:
- 401 Unauthorized - Missing or invalid JWT token
- 403 Forbidden - Attempting to delete another user's API key
- 404 Not Found - API key not found
- 400 Bad Request - Missing keyHash parameter
Using API Keys
Once you have an API key, use it to authenticate requests instead of JWT tokens:
With HTTP Header:
curl -H "X-API-Key: mddb_live_abc123def456..." \
  http://localhost:11023/v1/search \
  -H 'Content-Type: application/json' \
  -d '{"collection":"blog","filterMeta":{"status":["published"]}}'
With CLI:
mddb-cli --api-key mddb_live_abc123def456... search blog -f "status=published"
API Key vs JWT Token:
- JWT Tokens: Short-lived (default 24h), obtained via login, ideal for interactive sessions
- API Keys: Long-lived or permanent, ideal for automation, CI/CD, and third-party integrations
POST /v1/classify
Zero-shot document classification using embedding similarity. Ranks candidate labels by their semantic similarity to a document or text.
Request Body:
| Field | Type | Required | Description |
|---|---|---|---|
| collection | string | No* | Collection name (for doc reference) |
| key | string | No* | Document key (for doc reference) |
| lang | string | No | Language code (default: "en") |
| text | string | No* | Raw text to classify |
| labels | string[] | Yes | Candidate labels (max 100) |
| topK | int | No | Return top K labels (0 = all) |
| multi | bool | No | Return all labels above threshold |
| threshold | float | No | Minimum similarity score (default: 0.0) |
*Provide either text OR collection+key (with optional lang).
Example Request:
curl -X POST http://localhost:11023/v1/classify \
  -H 'Content-Type: application/json' \
  -d '{
    "text": "Go is a statically typed, compiled language designed at Google",
    "labels": ["programming", "cooking", "sports", "music"]
  }'
Example Response:
{ "results": [ {"label": "programming", "score": 0.87}, {"label": "music", "score": 0.21}, {"label": "sports", "score": 0.18}, {"label": "cooking", "score": 0.12} ], "model": "text-embedding-3-small", "dimensions": 1536
}
Notes:
- Requires an embedding provider to be configured
- For document references, reuses existing embedding from vector store if available
- Labels are embedded in a single batch API call for efficiency
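Ranking labels by embedding similarity can be sketched as follows. This assumes cosine similarity over already-computed embedding vectors; the helper names `cosine` and `rank_labels` are illustrative, and how the server combines `multi` with `topK` is not modeled here.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_labels(text_vec, label_vecs, top_k=0, threshold=0.0):
    """Rank labels by similarity to the text embedding, best first."""
    scored = [
        {"label": label, "score": cosine(text_vec, vec)}
        for label, vec in label_vecs.items()
    ]
    # Drop labels below the minimum score, then sort descending.
    scored = [r for r in scored if r["score"] >= threshold]
    scored.sort(key=lambda r: r["score"], reverse=True)
    # top_k = 0 means "return all", mirroring the topK field above.
    return scored[:top_k] if top_k > 0 else scored
```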
PATCH /v1/update
Update a document's metadata, content, or TTL independently, without re-sending the entire document.
Request Body:
| Field | Type | Required | Description |
|---|---|---|---|
| collection | string | Yes | Collection name |
| key | string | Yes | Document key |
| lang | string | Yes | Language code |
| meta | object | No | New metadata (replaces all). Use {} to clear |
| contentMd | string | No | New content (replaces existing) |
| ttl | int | No | New TTL in seconds (0 = remove) |
Example:
curl -X PATCH http://localhost:11023/v1/update \
  -d '{"collection":"blog","key":"p1","lang":"en","meta":{"tag":["go","updated"]}}'

curl -X PATCH http://localhost:11023/v1/update \
  -d '{"collection":"blog","key":"p1","lang":"en","contentMd":"# Updated content"}'
GET /v1/doc-meta
Get document metadata without content. Lightweight read.
Query Parameters:
| Parameter | Required | Description |
|---|---|---|
| collection | Yes | Collection name |
| key | Yes | Document key |
| lang | No | Language code (default: "en") |
Example:
curl "http://localhost:11023/v1/doc-meta?collection=blog&key=p1&lang=en"
POST /v1/delete
Delete a document from a collection.
Request Body:
{ "collection": "blog", "key": "homepage", "lang": "en"
}
Response:
{ "status": "deleted", "collection": "blog", "key": "homepage", "lang": "en"
}
POST /v1/delete-batch
Delete multiple documents in a single request.
Request Body:
{ "collection": "blog", "documents": [ { "key": "post-1", "lang": "en" }, { "key": "post-2", "lang": "en" } ]
}
Response:
{ "deleted": 2, "not_found": 0, "failed": 0, "errors": null
}
POST /v1/delete-collection
Delete all documents in a collection.
Request Body:
{ "collection": "blog"
}
Response:
{ "status": "ok", "collection": "blog"
}
POST /v1/hybrid-search
Hybrid search combining full-text (sparse) and vector (dense) results using alpha blending or reciprocal rank fusion (RRF).
Request Body:
{ "collection": "blog", "query": "how to deploy", "topK": 10, "algorithm": "bm25", "vectorAlgorithm": "flat", "alpha": 0.5, "strategy": "alpha", "rrfK": 60, "fuzzy": 0, "threshold": 0.0, "distanceMetric": "cosine", "filterMeta": { "category": ["tutorial"] }, "includeContent": false, "disableStem": false, "disableSynonyms": false
}
| Field | Type | Default | Description |
|---|---|---|---|
| collection | string | | Required. Collection name |
| query | string | | Required. Search query |
| topK | integer | 10 | Max results |
| algorithm | string | "bm25" | FTS algorithm: bm25, bm25f |
| vectorAlgorithm | string | "flat" | Vector algorithm: flat, hnsw, ivf, pq, sq |
| alpha | number | 0.5 | Weight blending (0 = FTS only, 1 = vector only) |
| strategy | string | "alpha" | Fusion strategy: alpha or rrf |
| rrfK | integer | 60 | RRF parameter k |
| fuzzy | integer | 0 | Typo tolerance: 0, 1, or 2 |
| threshold | number | 0.0 | Min vector similarity, 0-1 |
| distanceMetric | string | "cosine" | cosine, dot_product, euclidean |
| filterMeta | object | | Metadata key-value filter |
| includeContent | boolean | false | Include full content |
| disableStem | boolean | false | Disable stemming |
| disableSynonyms | boolean | false | Disable synonym expansion |
Response:
{ "results": [ { "document": { "id": "...", "key": "...", "lang": "...", "meta": {} }, "combinedScore": 0.85, "ftsScore": 0.7, "vectorScore": 0.95, "matchedTerms": ["deploy"], "rank": 1 } ], "total": 1, "strategy": "alpha", "alpha": 0.5, "ftsAlgorithm": "bm25", "vectorAlgorithm": "flat", "distanceMetric": "cosine", "searchStats": { "durationMs": 12 }
}
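The two fusion strategies can be sketched as below. This is an illustration under assumptions: `alpha_blend` uses the plain weighted sum implied by the `alpha` description (0 = FTS only, 1 = vector only), and `rrf_fuse` uses the standard reciprocal rank fusion formula, score = sum of 1 / (k + rank). How the server normalizes scores before blending is not specified in this document, so results need not match `combinedScore` exactly.

```python
def alpha_blend(fts_score, vector_score, alpha=0.5):
    """Weighted blend: alpha=0 keeps only the FTS score, alpha=1 only the vector score."""
    return (1.0 - alpha) * fts_score + alpha * vector_score

def rrf_fuse(fts_ranked, vector_ranked, k=60):
    """Reciprocal rank fusion over two ranked lists of document keys (best first)."""
    scores = {}
    for ranked in (fts_ranked, vector_ranked):
        for rank, key in enumerate(ranked, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[key] = scores.get(key, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

RRF depends only on rank positions, so it is a reasonable choice when FTS and vector scores are on different scales.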
POST /v1/cross-search
Vector search across multiple collections using a text query, pre-computed vector, or another document's embedding.
Request Body:
{ "query": "machine learning basics", "targetCollections": ["articles", "tutorials"], "topK": 10, "threshold": 0.5, "algorithm": "flat", "distanceMetric": "cosine", "includeContent": false
}
Alternative source modes (use one):
- query (string): text to embed
- sourceCollection + sourceDocID: use an existing document's embedding
- queryVector (array of numbers): pre-computed vector
| Field | Type | Default | Description |
|---|---|---|---|
| targetCollections | string[] | all | Collections to search |
| topK | integer | 10 | Max results |
| threshold | number | 0.0 | Min similarity |
| algorithm | string | "flat" | Vector algorithm |
| distanceMetric | string | "cosine" | Distance metric |
| filterMeta | object | | Metadata filter |
| includeContent | boolean | false | Include content |
Response:
{ "results": [ { "collection": "tutorials", "document": { "key": "ml-intro", "lang": "en", "meta": {} }, "score": 0.92, "rank": 1 } ], "total": 1, "targetCollections": ["articles", "tutorials"], "algorithm": "flat", "distanceMetric": "cosine", "searchStats": { "durationMs": 8, "collectionsSearched": 2 }
}
POST /v1/find-duplicates
Detect exact and similar documents in a collection using content hashing and vector embeddings.
Request Body:
{ "collection": "blog", "mode": "both", "threshold": 0.9, "maxDocs": 5000, "distanceMetric": "cosine", "includeContent": false
}
| Field | Type | Default | Description |
|---|---|---|---|
| collection | string | | Required. Collection name |
| mode | string | "both" | exact, similar, or both |
| threshold | number | 0.9 | Similarity threshold, 0-1 |
| maxDocs | integer | 5000 | Max documents to process |
| distanceMetric | string | "cosine" | Distance metric |
| includeContent | boolean | false | Include document content |
Response:
{ "collection": "blog", "mode": "both", "threshold": 0.9, "distanceMetric": "cosine", "totalDocuments": 150, "totalEmbedded": 148, "exactGroups": [ { "groupId": 1, "type": "exact", "documents": [ { "docId": "blog|p1|en", "key": "p1", "contentHash": "abc123" }, { "docId": "blog|p2|en", "key": "p2", "contentHash": "abc123" } ] } ], "similarGroups": [], "exactDuplicates": 2, "similarPairs": 0
}
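Exact-duplicate detection by content hashing can be sketched as follows. The hash function is an assumption (SHA-256 is used here; the document does not say which hash produces `contentHash`), and `exact_duplicate_groups` is a hypothetical helper. The `similar` mode would additionally compare embeddings against `threshold`, which is not shown.

```python
import hashlib
from collections import defaultdict

def exact_duplicate_groups(docs):
    """Group documents whose content hashes match; docs is a list of (doc_id, content)."""
    by_hash = defaultdict(list)
    for doc_id, content in docs:
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        by_hash[digest].append(doc_id)
    # Only hashes shared by two or more documents form a duplicate group.
    return [ids for ids in by_hash.values() if len(ids) > 1]
```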
POST /v1/aggregate
Compute metadata facets and date histograms for a collection. Supports optional metadata pre-filtering.
Request Body:
| Field | Type | Required | Description |
|---|---|---|---|
| collection | string | Yes | Collection name |
| filterMeta | object | No | Metadata pre-filter (same as /v1/search) |
| facets | array | No | Facet aggregation requests |
| facets[].field | string | Yes | Metadata key to aggregate (e.g. "category") |
| facets[].orderBy | string | No | "count" (default, descending) or "value" (alphabetical) |
| histograms | array | No | Date histogram requests |
| histograms[].field | string | Yes | "addedAt" or "updatedAt" |
| histograms[].interval | string | No | "day", "week", "month" (default), "year" |
| maxFacetSize | int | No | Max values per facet (default: 50) |
Example:
curl -X POST http://localhost:11023/v1/aggregate \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "facets": [
      {"field": "category"},
      {"field": "author", "orderBy": "value"}
    ],
    "histograms": [
      {"field": "addedAt", "interval": "month"}
    ]
  }'
Example with metadata pre-filter:
curl -X POST http://localhost:11023/v1/aggregate \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "filterMeta": {"status": ["published"]},
    "facets": [{"field": "tags"}]
  }'
Response:
{ "collection": "blog", "totalDocs": 42, "facets": { "category": [ {"value": "tutorial", "count": 15}, {"value": "news", "count": 12}, {"value": "release", "count": 8} ], "author": [ {"value": "Alice", "count": 20}, {"value": "Bob", "count": 22} ] }, "histograms": { "addedAt": [ {"key": "2026-01", "from": 1767225600, "to": 1769904000, "count": 10}, {"key": "2026-02", "from": 1769904000, "to": 1772323200, "count": 18}, {"key": "2026-03", "from": 1772323200, "to": 1775001600, "count": 14} ] }, "durationMs": 3
}
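The shape of this response can be reproduced with a short sketch. It assumes facets count every value of a multi-value metadata field and that month buckets are UTC calendar months, which is consistent with the `from`/`to` values in the example (1767225600 is 2026-01-01 00:00:00 UTC). The helper names are illustrative.

```python
from collections import Counter
from datetime import datetime, timezone

def facet_counts(docs, field, order_by="count", max_size=50):
    """Count each value of a multi-value metadata field across documents."""
    counts = Counter(v for doc in docs for v in doc.get("meta", {}).get(field, []))
    if order_by == "value":
        items = sorted(counts.items())  # alphabetical by value
    else:
        items = sorted(counts.items(), key=lambda kv: -kv[1])  # count, descending
    return [{"value": v, "count": c} for v, c in items[:max_size]]

def month_bucket(ts):
    """Return (key, from, to) for the UTC calendar month containing Unix timestamp ts."""
    start = datetime.fromtimestamp(ts, tz=timezone.utc).replace(
        day=1, hour=0, minute=0, second=0, microsecond=0)
    # Roll over to January of the next year after December.
    if start.month == 12:
        nxt = start.replace(year=start.year + 1, month=1)
    else:
        nxt = start.replace(month=start.month + 1)
    return start.strftime("%Y-%m"), int(start.timestamp()), int(nxt.timestamp())
```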
GET /v1/collection-config
Get configuration for a specific collection.
Query Parameters:
| Parameter | Required | Description |
|---|---|---|
| collection | Yes | Collection name |
Example:
curl "http://localhost:11023/v1/collection-config?collection=blog"
Response:
{ "collection": "blog", "config": { "type": "default", "description": "Blog posts", "icon": "", "color": "", "customMeta": {} }, "configured": true
}
Also supports PUT to set config and DELETE to remove config for a collection.
PUT /v1/collection-config
Set or update collection configuration including storage backend.
Request Body:
| Field | Type | Required | Description |
|---|---|---|---|
| collection | string | Yes | Collection name |
| type | string | No | Collection type (default, website, images, audio, documents) |
| description | string | No | Collection description |
| icon | string | No | Emoji icon |
| color | string | No | Hex color code |
| customMeta | object | No | Custom key-value metadata |
| storageBackend | string | No | Storage backend: boltdb (default), memory, s3 |
| storageConfig | object | No | Backend-specific settings (required for s3) |
storageConfig fields (for S3):
| Field | Type | Required | Description |
|---|---|---|---|
| endpoint | string | Yes | S3 endpoint (e.g. s3.amazonaws.com, minio:9000) |
| bucket | string | Yes | S3 bucket name |
| region | string | No | AWS region (e.g. us-east-1) |
| accessKey | string | No | Access key |
| secretKey | string | No | Secret key |
| prefix | string | No | Key prefix within bucket (e.g. mddb/) |
| useTLS | bool | No | Use HTTPS (default: false) |
Example - In-Memory backend:
curl -X PUT http://localhost:11023/v1/collection-config \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "scratch",
    "type": "default",
    "storageBackend": "memory"
  }'
Example - S3 backend:
curl -X PUT http://localhost:11023/v1/collection-config \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "archive",
    "type": "documents",
    "storageBackend": "s3",
    "storageConfig": {
      "endpoint": "s3.amazonaws.com",
      "bucket": "my-mddb-archive",
      "region": "us-east-1",
      "accessKey": "AKIAIOSFODNN7EXAMPLE",
      "secretKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
      "prefix": "mddb/",
      "useTLS": true
    }
  }'
Response:
{ "status": "ok", "collection": "archive"
}
Note: The memory backend is ephemeral; all data is lost on server restart. The s3 backend requires endpoint and bucket in storageConfig. The default boltdb backend uses the embedded database.
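The backend rules in this note can be checked before sending the PUT request. This is a minimal sketch using only the constraints stated in this section; `check_storage_config` is a hypothetical client-side helper, not part of MDDB.

```python
def check_storage_config(backend="boltdb", storage_config=None):
    """Return a list of problems with a collection's storage settings."""
    problems = []
    if backend not in ("boltdb", "memory", "s3"):
        problems.append(f"unknown storageBackend: {backend}")
    if backend == "s3":
        cfg = storage_config or {}
        # endpoint and bucket are the two fields marked required for S3.
        for field in ("endpoint", "bucket"):
            if not cfg.get(field):
                problems.append(f"storageConfig.{field} is required for s3")
    return problems
```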
GET /v1/collection-configs
List all collection configurations.
Example:
curl http://localhost:11023/v1/collection-configs
Response:
{ "configs": [ { "collection": "blog", "config": { "type": "default", "description": "Blog posts" } } ], "total": 1
}
GET /v1/embedding-configs
List all configured embedding models.
Example:
curl http://localhost:11023/v1/embedding-configs
Response:
{ "configs": [ { "id": "cfg_abc123", "name": "OpenAI Ada", "provider": "openai", "model": "text-embedding-3-small", "dimensions": 1536, "apiKey": "sk-...", "apiUrl": "", "isDefault": true, "createdAt": 1709100000 } ]
}
Also supports POST to create a new embedding config.
GET/PUT/DELETE /v1/embedding-configs/:id
Manage a specific embedding configuration.
- GET returns the config object
- PUT updates the config (fields: name, provider, model, dimensions, apiKey, apiUrl, isDefault)
- DELETE removes the config (returns 204 No Content)
Example:
curl http://localhost:11023/v1/embedding-configs/cfg_abc123
POST /v1/embedding-configs/set-default
Set a specific embedding configuration as default.
Request Body:
{ "id": "cfg_abc123"
}
Response:
{ "message": "default config updated"
}
GET/POST/DELETE /v1/stopwords
Manage FTS stop words for a collection.
GET - List stop words:
curl "http://localhost:11023/v1/stopwords?collection=blog"
Response:
{ "collection": "blog", "entries": [ { "word": "the", "isDefault": true }, { "word": "myword", "isDefault": false } ], "total": 2, "defaults": 1, "custom": 1
}
POST - Add stop words:
{ "collection": "blog", "words": ["myword", "another"]
}
DELETE - Remove stop words:
{ "collection": "blog", "words": ["myword"]
}
GET/POST /v1/webhooks
List or register webhooks.
GET - List all webhooks:
curl http://localhost:11023/v1/webhooks
POST - Register a webhook:
{ "url": "https://example.com/hook", "events": ["doc.added", "doc.updated", "doc.deleted"], "collection": "blog"
}
Response (webhook object):
{ "id": "wh_abc123", "url": "https://example.com/hook", "events": ["doc.added", "doc.updated", "doc.deleted"], "collection": "blog", "createdAt": 1709100000
}
POST /v1/webhooks/delete
Delete a webhook by ID.
Request Body:
{ "id": "wh_abc123"
}
Response:
{ "status": "deleted", "id": "wh_abc123"
}
POST /v1/revisions
List document revision history.
Request Body:
{ "collection": "blog", "key": "homepage", "lang": "en"
}
Response:
{ "collection": "blog", "key": "homepage", "lang": "en", "revisions": [ { "timestamp": 1709200000, "updatedAt": 1709200000, "contentMd": "# Old content", "meta": { "author": ["Jane"] } } ], "total": 1
}
POST /v1/revisions/restore
Restore a document to a previous revision.
Request Body:
{ "collection": "blog", "key": "homepage", "lang": "en", "timestamp": 1709200000
}
Response: Restored document object.
GET/POST /v1/automation
List or create automation rules (triggers, crons, webhooks).
GET - List rules:
curl http://localhost:11023/v1/automation
Response:
{ "rules": [ { "id": "auto_abc", "name": "Alert on new docs", "type": "trigger", "searchType": "fts", "query": "urgent", "threshold": 0.8, "webhookUrl": "https://example.com/alert" } ], "total": 1
}
POST - Create a rule:
{ "name": "Daily report", "type": "cron", "schedule": "0 9 * * *", "searchType": "vector", "query": "status report", "webhookUrl": "https://example.com/report"
}
Response: Created rule object (201 Created).
GET/PUT/DELETE /v1/automation/:id
Manage a specific automation rule.
- GET returns the rule object
- PUT updates the rule
- DELETE removes the rule
POST /v1/automation/:id/test - Test a trigger rule. Example response:
{ "trigger": { "id": "auto_abc", "name": "...", "searchType": "fts", "query": "urgent" }, "matches": [...], "total": 3
}
GET /v1/automation-logs
Get automation execution logs with pagination.
Query Parameters:
| Parameter | Required | Default | Description |
|---|---|---|---|
| limit | No | 50 | Max results per page |
| cursor | No | | Pagination cursor |
| ruleId | No | | Filter by rule ID |
| status | No | | Filter by status |
Example:
curl "http://localhost:11023/v1/automation-logs?limit=10&ruleId=auto_abc"
Response:
{ "logs": [...], "total": 25, "nextCursor": "...", "hasMore": true
}
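Walking all pages with the cursor can be sketched as below. `fetch_page` is a hypothetical callable supplied by the caller that performs the GET request with the given `limit` and `cursor` and returns the decoded JSON body; passing `nextCursor` back as the `cursor` parameter is an assumption based on the field names.

```python
def iter_logs(fetch_page, limit=50):
    """Yield log entries page by page until the server reports no more."""
    cursor = None
    while True:
        page = fetch_page(limit=limit, cursor=cursor)
        yield from page.get("logs", [])
        # Stop when the server says there is nothing further to fetch.
        if not page.get("hasMore") or not page.get("nextCursor"):
            break
        cursor = page["nextCursor"]
```

Keeping the HTTP call behind `fetch_page` makes the loop easy to test with canned pages.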
POST /v1/import-url
Import a markdown document from a URL. Automatically extracts YAML frontmatter.
Request Body:
{ "collection": "articles", "url": "https://example.com/post.md", "key": "imported-post", "lang": "en", "meta": { "source": ["web"] }, "ttl": 86400
}
| Field | Type | Description |
|---|---|---|
| collection | string | Required. Target collection |
| url | string | Required. URL to fetch |
| key | string | Document key (derived from URL path if empty) |
| lang | string | Required. Language code |
| meta | object | Metadata (merged with frontmatter) |
| ttl | integer | Time-to-live in seconds |
Response: Saved document object.
POST /v1/import-wiki
Import Wikipedia (MediaWiki) XML dumps. Supports .xml files and bzip2-compressed .xml.bz2 files. The XML is streamed, so the entire file is never loaded into memory.
Multipart Form Upload:
curl -X POST http://localhost:11023/v1/import-wiki \
  -F "[email protected]" \
  -F "collection=wikipedia" \
  -F "lang=en" \
  -F "skipRedirects=true" \
  -F "skipFts=true"
Raw Stream (octet-stream):
curl -X POST "http://localhost:11023/v1/import-wiki?collection=wikipedia&lang=en&skipRedirects=true&skipFts=true" \
  -H "Content-Type: application/x-bzip2" \
  --data-binary @enwiki-20260101-pages-articles.xml.bz2
| Field | Type | Description |
|---|---|---|
| collection | string | Required. Target collection |
| lang | string | Required. Language code (e.g. en, de, pl) |
| namespaces | string | Comma-separated namespace IDs to import (default: 0 = articles only) |
| skipRedirects | bool | Skip redirect pages (default: false) |
| skipFts | bool | Skip FTS indexing during import for speed (default: false). Run /v1/fts-reindex after. |
| maxPages | int | Maximum pages to import (default: unlimited) |
| batchSize | int | Pages per batch commit (default: 500) |
Response:
{ "imported": 1234567, "skipped": 456789, "failed": 0, "collection": "wikipedia", "durationMs": 3600000
}
Metadata stored per document: source=wikipedia, wiki_id, wiki_title, wiki_ns, wiki_rev_id, wiki_timestamp, wiki_contributor, wiki_redirect (if applicable).
POST /v1/set-ttl
Set or remove document time-to-live.
Request Body:
{ "collection": "blog", "key": "temp-post", "lang": "en", "ttl": 3600
}
| Field | Type | Description |
|---|---|---|
| collection | string | Required. Collection name |
| key | string | Required. Document key |
| lang | string | Required. Language code |
| ttl | integer | Required. Seconds until expiry; 0 to remove TTL |
Response: Updated document object with expiresAt field.
GET /v1/meta-keys
List all unique metadata keys and their distinct values for a collection.
Query Parameters:
| Parameter | Required | Description |
|---|---|---|
| collection | Yes | Collection name |
Example:
curl "http://localhost:11023/v1/meta-keys?collection=blog"
Response:
{ "meta": { "author": ["John", "Jane"], "category": ["blog", "tutorial"], "tags": ["golang", "database"] }
}
GET /v1/checksum
Get CRC32 checksum of a collection for integrity verification.
Query Parameters:
| Parameter | Required | Description |
|---|---|---|
| collection | Yes | Collection name |
Example:
curl "http://localhost:11023/v1/checksum?collection=blog"
Response:
{ "collection": "blog", "checksum": "a1b2c3d4", "documentCount": 42
}
GET /v1/system/info
Returns system information including OS, memory, CPU, and network details.
Example:
curl http://localhost:11023/v1/system/info
Response:
{ "hostname": "server-1", "os": "linux", "arch": "amd64", "numCPU": 4, "goVersion": "go1.26.2", "version": "2.9.11", "uptimeSeconds": 3600, "memoryTotal": 134217728, "memoryUsed": 67108864, "numGoroutines": 12, "cpuUsagePercent": 15.3
}
GET /v1/config
Returns server configuration overview.
Example:
curl http://localhost:11023/v1/config
Response:
{ "version": "2.9.11", "databasePath": "mddb.db", "mode": "wr", "protocols": { "http": { "enabled": true, "addr": ":11023" }, "grpc": { "enabled": true, "addr": ":11024" }, "mcp": { "enabled": true, "addr": ":11025" } }, "authEnabled": false, "metricsEnabled": true, "vectorConfig": { "enabled": true, "provider": "openai", "model": "text-embedding-3-small", "dimensions": 1536 }, "automationsEnabled": true, "searchStatsEnabled": true
}
GET /v1/endpoints
Returns list of all available endpoints across HTTP, gRPC, and MCP protocols.
Example:
curl http://localhost:11023/v1/endpoints
Response:
{ "http": [ { "method": "POST", "path": "/v1/add", "description": "Add document", "requiresAuth": true } ], "grpc": [ { "name": "AddDocument", "description": "Add a document" } ], "mcp": [ { "name": "add_document", "description": "Add a document" } ]
}
GET /health
Health check endpoint. Also available at /v1/health.
Response:
{ "status": "healthy", "mode": "wr"
}
Returns 503 with "status": "unhealthy" if the database is not accessible.
POST /v1/auth/register
Register a new user. Requires admin privileges.
Request Body:
{ "username": "newuser", "password": "secret123"
}
Response:
{ "username": "newuser", "createdAt": 1709100000
}
GET /v1/auth/me
Get current authenticated user information.
Response:
{ "username": "admin", "admin": true, "createdAt": 1709000000
}
GET/POST /v1/auth/permissions
Get or set user permissions on collections.
GET - Query parameter username:
curl "http://localhost:11023/v1/auth/permissions?username=john"
POST - Set permission:
{ "username": "john", "collection": "blog", "read": true, "write": true, "admin": false
}
Response (POST): { "status": "ok" }
GET /v1/auth/users
List all users. Requires admin privileges.
Response:
{ "users": [ { "username": "admin", "createdAt": 1709000000, "disabled": false, "admin": true, "groups": ["admins"] } ]
}
DELETE /v1/auth/users/:username
Delete a user account. Requires admin privileges.
Example:
curl -X DELETE http://localhost:11023/v1/auth/users/john
Response: { "status": "deleted" }
GET/POST /v1/auth/groups
List all groups (GET) or create a new group (POST). Requires admin privileges.
POST - Create group:
{ "name": "editors", "description": "Content editors", "members": ["john", "jane"]
}
Response (POST): Created group object (201 Created).
GET/PUT/DELETE /v1/auth/groups/:name
Manage a specific group. Requires admin privileges.
- GET returns the group object
- PUT updates description and members
- DELETE removes the group
GET/POST /v1/auth/group-permissions
Get or set permissions for a group on collections.
GET - Query parameter group:
curl "http://localhost:11023/v1/auth/group-permissions?group=editors"
POST - Set group permission:
{ "group": "editors", "collection": "blog", "read": true, "write": true, "admin": false
}
Response (POST): { "status": "permission set" }
Data Models
Document
{
  "id": string,          // Auto-generated: "collection|key|lang"
  "key": string,         // Document key (e.g., "homepage")
  "lang": string,        // Language code (e.g., "en_GB")
  "meta": {              // Metadata (multi-value)
    "key1": ["value1", "value2"],
    "key2": ["value3"]
  },
  "contentMd": string,   // Markdown content
  "addedAt": int64,      // Unix timestamp (first creation)
  "updatedAt": int64     // Unix timestamp (last update)
}
Metadata
- Metadata is stored as map[string][]string (key -> array of values)
- Each metadata key can have multiple values
- Metadata is automatically indexed for fast searching
- Common metadata keys: category, author, tags, status, etc.
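The two conventions in this data model, the `collection|key|lang` identifier and multi-value metadata, can be mirrored on the client. This is a minimal sketch; `make_doc_id` and `normalize_meta` are hypothetical helpers, and the server generates the real `id` itself.

```python
def make_doc_id(collection, key, lang):
    """Build the "collection|key|lang" identifier shown in the Document model."""
    return f"{collection}|{key}|{lang}"

def normalize_meta(meta):
    """Coerce metadata into key -> list of strings, wrapping single values in a list."""
    normalized = {}
    for key, value in meta.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        normalized[key] = [str(v) for v in values]
    return normalized
```

Normalizing before sending keeps request bodies in the array-of-strings shape that every example in this document uses.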
Error Handling
Error Response Format
{ "error": "error message description"
}
HTTP Status Codes
| Code | Description |
|---|---|
| 200 | Success |
| 400 | Bad Request - Invalid JSON or missing required fields |
| 403 | Forbidden - Write operation in read-only mode |
| 404 | Not Found - Document doesn't exist |
| 500 | Internal Server Error |
Common Errors
Missing required fields:
{ "error": "missing fields"
}
Document not found:
{ "error": "not found"
}
Read-only mode:
{ "error": "read-only mode"
}
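A client can turn this error format into exceptions. The sketch below assumes error bodies are JSON objects with a single `error` field, as shown above; `MDDBError` and `raise_for_error` are hypothetical names, not part of any official client.

```python
import json

class MDDBError(Exception):
    """Hypothetical client-side exception carrying the HTTP status and message."""
    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message

def raise_for_error(status, body):
    """Raise MDDBError for non-2xx responses; return the decoded JSON otherwise."""
    try:
        payload = json.loads(body) if body else {}
    except json.JSONDecodeError:
        payload = {}
    if 200 <= status < 300:
        return payload
    # Fall back to a generic message when the body has no "error" field.
    message = payload.get("error") if isinstance(payload, dict) else None
    raise MDDBError(status, message or "unknown error")
```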
Best Practices
1. Document Keys
- Use descriptive, URL-friendly keys
- Keep keys consistent within a collection
- Example: homepage, about-us, blog-post-1
2. Language Codes
- Use standard language codes (ISO 639-1 + ISO 3166-1)
- Examples: en_US, en_GB, pl_PL, de_DE
3. Metadata
- Keep metadata keys consistent across documents
- Use arrays even for single values (for consistency)
- Index frequently queried fields
4. Collections
- Group related documents in collections
- Use collections like database tables
- Examples: blog, pages, products, docs
5. Revisions
- Regularly truncate old revisions to save space
- Keep enough history for your audit requirements
- Consider keeping 5-10 recent revisions
6. Backups
- Schedule regular backups
- Store backups in a different location
- Test restore procedures periodically
Performance Tips
- Indexing: Metadata is automatically indexed - use it for filtering
- Pagination: Always use limit and offset for large result sets
- Batch Operations: Use export/import for bulk operations
- Revisions: Truncate old revisions regularly to keep database size manageable
- Read Mode: Use read-only mode for read-heavy workloads with separate write instances
Memory RAG Endpoints
Conversational memory system for RAG applications. Store, search, and recall conversation history with semantic search.
POST /v1/memory/session
Create a new memory/conversation session.
Request:
{ "userId": "user-1", "scenario": "customer_support", "title": "Session about search API", "meta": {"department": "engineering"}, "ttl": 2592000
}
| Field | Type | Required | Description |
|---|---|---|---|
| userId | string | Yes | User identifier |
| scenario | string | No | Session context/scenario |
| title | string | No | Human-readable title (auto-generated if empty) |
| meta | object | No | Additional metadata key-value pairs |
| ttl | int | No | TTL in seconds (default: 30 days) |
Response:
{ "sessionId": "a1b2c3d4e5f6...", "userId": "user-1", "scenario": "customer_support", "title": "Session about search API", "createdAt": 1743400000, "expiresAt": 1745992000
}
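The relationship between `createdAt`, `ttl`, and `expiresAt` in this example is simple addition: 1743400000 + 2592000 = 1745992000. A minimal sketch, assuming an omitted `ttl` falls back to the 30-day default; `session_expires_at` is an illustrative helper.

```python
DEFAULT_SESSION_TTL = 30 * 24 * 60 * 60  # 30 days in seconds

def session_expires_at(created_at, ttl=None):
    """Expiry timestamp for a session: creation time plus TTL in seconds."""
    return created_at + (DEFAULT_SESSION_TTL if ttl is None else ttl)
```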
POST /v1/memory/message
Add a message to an existing session. Messages are automatically embedded for semantic recall.
Request:
{ "sessionId": "a1b2c3d4e5f6...", "role": "user", "content": "How does vector search work?", "meta": {"topic": "search", "source": "docs"}
}
| Field | Type | Required | Description |
|---|---|---|---|
| sessionId | string | Yes | Session ID from /v1/memory/session |
| role | string | Yes | user, assistant, system, or tool |
| content | string | Yes | Message content (markdown supported) |
| meta | object | No | Extra metadata (topic, source, tool_call, etc.) |
Response:
{ "messageId": "memory_messages|...", "sessionId": "a1b2c3d4e5f6...", "role": "user", "createdAt": 1743400100, "embedded": true
}
POST /v1/memory/recall
Semantically recall relevant messages from past conversations using hybrid search (vector + keyword).
Request:
{ "query": "How does vector search work?", "userId": "user-1", "sessionId": "", "role": "assistant", "topK": 10, "threshold": 0.5, "strategy": "hybrid", "alpha": 0.5, "includeContent": true, "filterMeta": {}
}
| Field | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Natural language recall query |
| userId | string | No | Filter to sessions belonging to this user |
| sessionId | string | No | Filter to a specific session |
| role | string | No | Filter by message role |
| topK | int | No | Number of results (default: 10) |
| threshold | float | No | Min similarity score 0-1 |
| strategy | string | No | hybrid (default), semantic, keyword |
| alpha | float | No | Weight 0-1 (0=keyword, 1=semantic) |
| includeContent | bool | No | Include full message content |
| filterMeta | object | No | Additional metadata filters |
Response:
{ "results": [ { "document": {"id": "...", "key": "...", "meta": {...}, "contentMd": "..."}, "score": 0.87, "rank": 1, "sessionId": "a1b2c3d4e5f6...", "role": "assistant", "matchStrategy": "hybrid" } ], "total": 5, "strategy": "hybrid", "query": "How does vector search work?"
}
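In a RAG application, recalled messages are typically folded into the prompt sent to the model. The following is a minimal sketch, assuming the request set `includeContent` to true so each result carries `document.contentMd`. The `build_context` helper, its `[role] content` line format, and the character budget are illustrative choices, not part of the API.

```python
def build_context(results, max_chars=2000):
    """Join recalled messages into a prompt context block, best-ranked first."""
    lines = []
    used = 0
    for item in sorted(results, key=lambda r: r.get("rank", 0)):
        content = item.get("document", {}).get("contentMd", "")
        line = f'[{item.get("role", "unknown")}] {content}'
        # Stop once the next message would exceed the character budget.
        if used + len(line) > max_chars:
            break
        lines.append(line)
        used += len(line) + 1  # +1 for the newline separator
    return "\n".join(lines)
```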
POST /v1/memory/summarize
Generate and store a summary of a session's conversation.
Request:
{ "sessionId": "a1b2c3d4e5f6...", "userId": "user-1"
}
Response:
{ "summaryId": "memory_summaries|...", "sessionId": "a1b2c3d4e5f6...", "summary": "# Session Summary: a1b2c3d4\n\nMessages: 5\n\n## Conversation\n\n...", "createdAt": 1743401000, "messages": 5
}
POST /v1/memory/sessions
List memory sessions with optional filtering.
Request:
{ "userId": "user-1", "scenario": "customer_support", "limit": 50, "offset": 0, "sort": "createdAt", "asc": false
}
Response:
{ "sessions": [ { "sessionId": "a1b2c3d4e5f6...", "userId": "user-1", "scenario": "customer_support", "title": "Session about search API", "createdAt": 1743400000, "updatedAt": 1743401000, "expiresAt": 1745992000, "messageCount": 12 } ], "total": 3
}
POST /v1/memory/history
Get the full message history for a session, ordered chronologically.
Request:
{ "sessionId": "a1b2c3d4e5f6...", "limit": 100, "offset": 0
}
Response:
{ "messages": [ {"id": "...", "key": "...", "meta": {"role": ["user"], "sessionId": ["..."]}, "contentMd": "How does vector search work?", "addedAt": 1743400100}, {"id": "...", "key": "...", "meta": {"role": ["assistant"], "sessionId": ["..."]}, "contentMd": "Vector search uses embeddings...", "addedAt": 1743400110} ], "total": 2
}