Schema Validation
Note: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Overview
Schema Validation lets you enforce structure on document metadata within a collection. It is opt-in per collection and disabled by default -- a collection without a schema accepts any metadata.
Schemas use a JSON Schema subset focused on metadata validation. When a schema is set for a collection, every call to /v1/add (or the gRPC Add RPC) validates the document's meta field against that schema before persisting. If validation fails, the request is rejected with a descriptive error and nothing is stored.
Key points:
- Schemas are scoped to a single collection.
- Only the meta field is validated; contentMd, key, and lang are unaffected.
- Deleting a schema immediately disables validation for that collection.
- Schemas are stored in the database and survive restarts.
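The opt-in behaviour in the key points above can be pictured with a small in-memory model. This is only a sketch, not MDDB's implementation: the `SchemaRegistry` class and its method names are hypothetical, storage is a plain dictionary rather than the database, and only the `required` rule is checked to keep the example short.

```python
class SchemaRegistry:
    """Hypothetical in-memory stand-in for per-collection schema storage."""

    def __init__(self) -> None:
        self._schemas: dict[str, dict] = {}

    def set_schema(self, collection: str, schema: dict) -> None:
        # Schemas are scoped to a single collection.
        self._schemas[collection] = schema

    def delete_schema(self, collection: str) -> None:
        # Deleting the schema disables validation for that collection.
        self._schemas.pop(collection, None)

    def validate(self, collection: str, meta: dict[str, list[str]]) -> list[str]:
        schema = self._schemas.get(collection)
        if schema is None:
            return []  # no schema set: any metadata is accepted
        return [
            f'missing required metadata key "{key}"'
            for key in schema.get("required", [])
            if key not in meta
        ]


registry = SchemaRegistry()
registry.set_schema("blog", {"required": ["category"]})
print(registry.validate("blog", {"tags": ["golang"]}))   # one error: category is missing
print(registry.validate("notes", {"tags": ["golang"]}))  # no errors: "notes" has no schema
registry.delete_schema("blog")
print(registry.validate("blog", {"tags": ["golang"]}))   # no errors: validation disabled
```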
Quick Start
Step 1 -- Set a schema
curl -X POST http://localhost:11023/v1/schema/set \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "schema": {
      "required": ["category", "author"],
      "properties": {
        "category": { "type": "string", "enum": ["blog", "tutorial", "news"] },
        "author": { "type": "string" },
        "tags": { "type": "string", "minItems": 1, "maxItems": 5 }
      }
    }
  }'
Step 2 -- Add a valid document
curl -X POST http://localhost:11023/v1/add \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "key": "hello",
    "lang": "en_US",
    "meta": {
      "category": ["blog"],
      "author": ["John Doe"],
      "tags": ["golang", "database"]
    },
    "contentMd": "# Hello World"
  }'
Response: the document is stored successfully.
Step 3 -- See a validation error
curl -X POST http://localhost:11023/v1/add \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "key": "bad-post",
    "lang": "en_US",
    "meta": {
      "tags": ["golang"]
    },
    "contentMd": "# Missing required fields"
  }'
Response:
{ "error": "schema validation failed: missing required metadata key \"category\"; missing required metadata key \"author\""
}
Supported Rules
required
An array of metadata key names that MUST be present in every document.
{ "required": ["category", "author"]
}
If a document is added without the category or author key in its meta, validation fails with one error per missing key.
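As a rough illustration of the `required` rule, the check amounts to looking up each listed key in the document's meta. The function name `check_required` is hypothetical, and whether a key that is present with an empty value list counts as "present" is an assumption here (this sketch treats it as present).

```python
def check_required(schema: dict, meta: dict[str, list[str]]) -> list[str]:
    """Return one error message per required key that is absent from meta."""
    errors = []
    for key in schema.get("required", []):
        if key not in meta:
            errors.append(f'missing required metadata key "{key}"')
    return errors


schema = {"required": ["category", "author"]}
# Both required keys are absent, so two errors are reported.
print(check_required(schema, {"tags": ["golang"]}))
```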
properties
Defines per-key rules. Each key maps to an object that MAY contain type, enum, pattern, minItems, and maxItems.
{ "properties": { "status": { "type": "string" }, "priority": { "type": "integer" }, "score": { "type": "number" }, "featured": { "type": "boolean" } }
}
Supported types:
| Type | Description | Example valid values |
|---|---|---|
| string | Any string value | ["hello"], ["foo", "bar"] |
| number | Numeric value (integer or float) | ["3.14"], ["42"] |
| integer | Integer value only | ["42"], ["-1"] |
| boolean | Boolean value | ["true"], ["false"] |
Since metadata values in MDDB are always stored as []string, type validation checks that each value in the array can be parsed as the declared type.
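A minimal sketch of this parse-based type check follows. It is not MDDB's code: `parses_as` and `check_type` are hypothetical names, the exact spellings MDDB accepts are not specified here, and Python's `int()` and `float()` are somewhat more lenient than a strict parser (for example, they tolerate surrounding whitespace). The error wording follows the integer example in the Error Messages section and is assumed to generalise to the other types.

```python
def parses_as(value: str, type_name: str) -> bool:
    """Report whether a stored string value can be read as the declared type."""
    if type_name == "string":
        return True  # every stored value is already a string
    if type_name == "boolean":
        return value in ("true", "false")  # assumption: only these two spellings
    parser = {"integer": int, "number": float}.get(type_name)
    if parser is None:
        raise ValueError(f"unsupported type: {type_name}")
    try:
        parser(value)
        return True
    except ValueError:
        return False


def check_type(key: str, values: list[str], type_name: str) -> list[str]:
    """Return one error per value that does not parse as type_name."""
    return [
        f'value "{v}" for key "{key}" is not a valid {type_name}'
        for v in values
        if not parses_as(v, type_name)
    ]


print(check_type("priority", ["42", "not-a-number"], "integer"))
```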
enum
Restricts a key's values to a fixed set of allowed strings.
{ "properties": { "status": { "type": "string", "enum": ["draft", "published", "archived"] } }
}
A document with "status": ["pending"] would fail validation because "pending" is not in the allowed set.
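The enum rule can be sketched as a membership test applied to every value of the key. The function name `check_enum` is hypothetical; the message format mirrors the one shown under Error Messages.

```python
def check_enum(key: str, values: list[str], allowed: list[str]) -> list[str]:
    """Return one error per value that is not in the allowed set."""
    listed = ", ".join(allowed)
    return [
        f'value "{v}" for key "{key}" is not in allowed enum values [{listed}]'
        for v in values
        if v not in allowed
    ]


allowed = ["draft", "published", "archived"]
print(check_enum("status", ["pending"], allowed))  # one error
print(check_enum("status", ["draft"], allowed))    # no errors
```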
pattern
A regular expression that every value for the key MUST match.
{ "properties": { "slug": { "type": "string", "pattern": "^[a-z0-9-]+$" } }
}
A document with "slug": ["Hello World"] would fail because the value does not match the pattern.
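The pattern rule can be approximated as follows. Two assumptions are worth flagging: the regular expression dialect MDDB uses is not stated in this document (Python's `re` is used purely for illustration), and the sketch uses an unanchored search, as JSON Schema does, so patterns need explicit `^` and `$` to constrain the whole value. `check_pattern` is a hypothetical name.

```python
import re


def check_pattern(key: str, values: list[str], pattern: str) -> list[str]:
    """Return one error per value that the pattern does not match."""
    compiled = re.compile(pattern)
    return [
        f'value "{v}" for key "{key}" does not match pattern "{pattern}"'
        for v in values
        if compiled.search(v) is None
    ]


slug_pattern = "^[a-z0-9-]+$"
print(check_pattern("slug", ["Hello World"], slug_pattern))  # one error
print(check_pattern("slug", ["hello-world"], slug_pattern))  # no errors
```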
minItems / maxItems
Controls the number of values allowed per metadata key.
{ "properties": { "tags": { "type": "string", "minItems": 1, "maxItems": 5 }, "category": { "type": "string", "minItems": 1, "maxItems": 1 } }
}
- minItems: Minimum number of values (e.g., at least 1 tag).
- maxItems: Maximum number of values (e.g., at most 5 tags). Setting both minItems and maxItems to 1 requires exactly 1 value, as for category above.
A document with "tags": [] would fail the minItems: 1 check. A document with "category": ["blog", "news"] would fail the maxItems: 1 check.
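The count check can be sketched as two comparisons against the length of the value list. `check_item_count` is a hypothetical name, and the rule is assumed to be passed as a plain dictionary taken from the schema's `properties` entry for that key.

```python
def check_item_count(key: str, values: list[str], rule: dict) -> list[str]:
    """Return errors for a value list that is shorter or longer than allowed."""
    errors = []
    count = len(values)
    min_items = rule.get("minItems")
    max_items = rule.get("maxItems")
    if min_items is not None and count < min_items:
        errors.append(f'key "{key}" has {count} values, below minItems {min_items}')
    if max_items is not None and count > max_items:
        errors.append(f'key "{key}" has {count} values, exceeds maxItems {max_items}')
    return errors


print(check_item_count("tags", [], {"minItems": 1, "maxItems": 5}))
print(check_item_count("category", ["blog", "news"], {"minItems": 1, "maxItems": 1}))
```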
HTTP API
POST /v1/schema/set
Set or update the validation schema for a collection.
Request Body:
{
  "collection": "blog",
  "schema": {
    "required": ["category", "author"],
    "properties": {
      "category": { "type": "string", "enum": ["blog", "tutorial", "news"] },
      "author": { "type": "string" },
      "tags": { "type": "string", "minItems": 1, "maxItems": 5 }
    }
  }
}
Response:
{ "status": "ok"
}
cURL Example:
curl -X POST http://localhost:11023/v1/schema/set \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "schema": {
      "required": ["category"],
      "properties": {
        "category": { "type": "string", "enum": ["blog", "tutorial"] }
      }
    }
  }'
POST /v1/schema/get
Retrieve the current schema for a collection.
Request Body:
{ "collection": "blog"
}
Response (schema exists):
{
  "collection": "blog",
  "schema": {
    "required": ["category", "author"],
    "properties": {
      "category": { "type": "string", "enum": ["blog", "tutorial", "news"] },
      "author": { "type": "string" },
      "tags": { "type": "string", "minItems": 1, "maxItems": 5 }
    }
  }
}
Response (no schema):
{ "collection": "blog", "schema": null
}
cURL Example:
curl -X POST http://localhost:11023/v1/schema/get \
  -H 'Content-Type: application/json' \
  -d '{"collection": "blog"}'
POST /v1/schema/delete
Delete the schema for a collection, disabling validation.
Request Body:
{ "collection": "blog"
}
Response:
{ "status": "ok"
}
cURL Example:
curl -X POST http://localhost:11023/v1/schema/delete \
  -H 'Content-Type: application/json' \
  -d '{"collection": "blog"}'
POST /v1/schema/list
List all collections that have a schema defined.
Request Body: Empty or {}.
Response:
{
  "schemas": [
    {
      "collection": "blog",
      "schema": {
        "required": ["category", "author"],
        "properties": {
          "category": { "type": "string", "enum": ["blog", "tutorial", "news"] },
          "author": { "type": "string" }
        }
      }
    },
    {
      "collection": "products",
      "schema": {
        "required": ["price", "sku"],
        "properties": {
          "price": { "type": "number" },
          "sku": { "type": "string", "pattern": "^SKU-[0-9]+$" }
        }
      }
    }
  ]
}
cURL Example:
curl -X POST http://localhost:11023/v1/schema/list \
  -H 'Content-Type: application/json' \
  -d '{}'
POST /v1/validate
Validate a document's metadata against the collection schema without persisting anything. Useful for dry-run checks before adding documents.
Request Body:
{ "collection": "blog", "meta": { "category": ["blog"], "author": ["Jane Doe"], "tags": ["golang", "tutorial"] }
}
Response (valid):
{ "valid": true, "errors": []
}
Response (invalid):
{
  "valid": false,
  "errors": [
    "value \"pending\" for key \"status\" is not in allowed enum values [draft, published, archived]",
    "key \"tags\" has 6 values, exceeds maxItems 5"
  ]
}
Response (no schema set for collection):
{ "valid": true, "errors": []
}
cURL Example:
curl -X POST http://localhost:11023/v1/validate \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "meta": {
      "category": ["blog"],
      "author": ["Jane Doe"]
    }
  }'
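For scripted dry runs, the same request can be issued from Python's standard library. This is a sketch under a few assumptions: the server listens on the default address used throughout this document, the response has the `valid` and `errors` fields shown above, and the helper names `build_validate_request` and `dry_run` are hypothetical. Building the request is separated from sending it so the payload can be inspected without a running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11023"  # HTTP address used in this document's examples


def build_validate_request(collection: str, meta: dict[str, list[str]]) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/validate request."""
    body = json.dumps({"collection": collection, "meta": meta}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/validate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def dry_run(collection: str, meta: dict[str, list[str]]) -> list[str]:
    """Send the request and return the reported errors (empty when valid)."""
    with urllib.request.urlopen(build_validate_request(collection, meta)) as resp:
        result = json.load(resp)
    return [] if result["valid"] else result["errors"]


req = build_validate_request("blog", {"category": ["blog"], "author": ["Jane Doe"]})
print(req.get_method(), req.full_url)
```

Calling `dry_run("blog", {...})` against a running server would return an empty list for valid metadata, which makes it easy to gate a bulk import on the result.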
gRPC API
The following RPCs are available on the mddb.MDDB service (port 11024):
| RPC | Request | Response | Description |
|---|---|---|---|
| SetSchema | SetSchemaRequest | SetSchemaResponse | Set or update a collection schema |
| GetSchema | GetSchemaRequest | GetSchemaResponse | Retrieve a collection schema |
| DeleteSchema | DeleteSchemaRequest | DeleteSchemaResponse | Delete a collection schema |
| ListSchemas | ListSchemasRequest | ListSchemasResponse | List all collection schemas |
| ValidateDocument | ValidateDocumentRequest | ValidateDocumentResponse | Validate metadata against schema |
grpcurl Examples
grpcurl -plaintext -d '{
  "collection": "blog",
  "schema": {
    "required": ["category"],
    "properties": {
      "category": {"type": "string", "enum_values": ["blog", "tutorial"]}
    }
  }
}' localhost:11024 mddb.MDDB/SetSchema

grpcurl -plaintext -d '{"collection": "blog"}' \
  localhost:11024 mddb.MDDB/GetSchema

grpcurl -plaintext -d '{"collection": "blog"}' \
  localhost:11024 mddb.MDDB/DeleteSchema

grpcurl -plaintext -d '{}' \
  localhost:11024 mddb.MDDB/ListSchemas

grpcurl -plaintext -d '{
  "collection": "blog",
  "meta": {
    "category": {"values": ["blog"]},
    "author": {"values": ["John Doe"]}
  }
}' localhost:11024 mddb.MDDB/ValidateDocument
CLI
The mddb-cli tool provides schema management commands:
Set a schema
mddb-cli schema set blog '{
  "required": ["category", "author"],
  "properties": {
    "category": {"type": "string", "enum": ["blog", "tutorial", "news"]},
    "author": {"type": "string"}
  }
}'
Get a schema
mddb-cli schema get blog
Delete a schema
mddb-cli schema delete blog
List all schemas
mddb-cli schema list
Validate metadata
mddb-cli validate blog '{"category": ["blog"], "author": ["John Doe"]}'
MCP Tools
The following MCP tools are available for schema validation:
| Tool | Description |
|---|---|
| set_schema | Set or update the validation schema for a collection |
| get_schema | Retrieve the current schema for a collection |
| delete_schema | Delete the schema for a collection |
| list_schemas | List all collections with schemas |
| validate_document | Validate metadata against a collection schema |
Example MCP Usage
When connected to an LLM via MCP, you can use natural language:
- "Set a schema for the blog collection that requires category and author fields"
- "Show me the schema for the products collection"
- "Validate this metadata against the blog schema: category=tutorial, author=Jane"
- "List all collections that have schemas"
- "Remove the schema from the blog collection"
Disabling Validation
To disable validation for a collection, delete its schema:
curl -X POST http://localhost:11023/v1/schema/delete \
  -H 'Content-Type: application/json' \
  -d '{"collection": "blog"}'

mddb-cli schema delete blog

grpcurl -plaintext -d '{"collection": "blog"}' \
  localhost:11024 mddb.MDDB/DeleteSchema
Once the schema is deleted, all documents will be accepted regardless of their metadata content. Existing documents are NOT re-validated or removed.
Error Messages
When validation fails on /v1/add, the response includes a descriptive error:
Missing required key
{ "error": "schema validation failed: missing required metadata key \"category\""
}
Invalid type
{ "error": "schema validation failed: value \"not-a-number\" for key \"priority\" is not a valid integer"
}
Enum violation
{ "error": "schema validation failed: value \"pending\" for key \"status\" is not in allowed enum values [draft, published, archived]"
}
Pattern mismatch
{ "error": "schema validation failed: value \"Hello World\" for key \"slug\" does not match pattern \"^[a-z0-9-]+$\""
}
minItems / maxItems violation
{ "error": "schema validation failed: key \"tags\" has 0 values, below minItems 1"
}
{ "error": "schema validation failed: key \"category\" has 3 values, exceeds maxItems 1"
}
Multiple errors
When multiple rules are violated, all errors are combined:
{ "error": "schema validation failed: missing required metadata key \"author\"; value \"invalid\" for key \"status\" is not in allowed enum values [draft, published, archived]; key \"tags\" has 6 values, exceeds maxItems 5"
}
See Also
- API Documentation - Full HTTP/JSON API reference
- gRPC Guide - gRPC API reference
- Examples - More code examples