Schema Validation

Note: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Overview

Schema Validation lets you enforce structure on document metadata within a collection. It is opt-in per collection and disabled by default -- a collection without a schema accepts any metadata.

Schemas use a subset of JSON Schema focused on metadata validation. When a schema is set for a collection, every call to /v1/add (or the gRPC Add RPC) validates the document's meta field against the schema before persisting. If validation fails, the request is rejected with a descriptive error and nothing is stored.

Key points:

  • Schemas are scoped to a single collection.
  • Only the meta field is validated; contentMd, key, and lang are unaffected.
  • Deleting a schema immediately disables validation for that collection.
  • Schemas are stored in the database and survive restarts.
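
The opt-in, per-collection behavior described above can be pictured with a small in-memory sketch. This is not MDDB's implementation: SketchStore, its methods, and require_author are hypothetical names used only for illustration, and the validator is reduced to a single required-key check.

```python
from typing import Callable, Dict, List, Tuple

Meta = Dict[str, List[str]]
Validator = Callable[[Meta], List[str]]  # returns a list of error messages


class SketchStore:
    """Hypothetical in-memory stand-in; not MDDB's storage or API."""

    def __init__(self) -> None:
        self.validators: Dict[str, Validator] = {}    # collection -> validator
        self.docs: Dict[Tuple[str, str], Meta] = {}   # (collection, key) -> meta

    def set_schema(self, collection: str, validator: Validator) -> None:
        self.validators[collection] = validator

    def delete_schema(self, collection: str) -> None:
        # Removing the schema disables validation; stored documents are untouched.
        self.validators.pop(collection, None)

    def add(self, collection: str, key: str, meta: Meta) -> None:
        validator = self.validators.get(collection)
        if validator is not None:
            errors = validator(meta)
            if errors:
                # Reject before anything is persisted.
                raise ValueError("schema validation failed: " + "; ".join(errors))
        self.docs[(collection, key)] = meta


def require_author(meta: Meta) -> List[str]:
    # Minimal validator used for the illustration: only checks one required key.
    return [] if "author" in meta else ['missing required metadata key "author"']
```

A collection with no validator accepts any metadata, a collection with one rejects invalid documents before storing them, and deleting the validator leaves previously stored documents in place.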

Quick Start

Step 1 -- Set a schema

curl -X POST http://localhost:11023/v1/schema/set \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "schema": {
      "required": ["category", "author"],
      "properties": {
        "category": { "type": "string", "enum": ["blog", "tutorial", "news"] },
        "author": { "type": "string" },
        "tags": { "type": "string", "minItems": 1, "maxItems": 5 }
      }
    }
  }'

Step 2 -- Add a valid document

curl -X POST http://localhost:11023/v1/add \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "key": "hello",
    "lang": "en_US",
    "meta": {
      "category": ["blog"],
      "author": ["John Doe"],
      "tags": ["golang", "database"]
    },
    "contentMd": "# Hello World"
  }'

Response: the document is stored successfully.

Step 3 -- See a validation error

curl -X POST http://localhost:11023/v1/add \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "key": "bad-post",
    "lang": "en_US",
    "meta": {
      "tags": ["golang"]
    },
    "contentMd": "# Missing required fields"
  }'

Response:

{ "error": "schema validation failed: missing required metadata key \"category\"; missing required metadata key \"author\""
}

Supported Rules

required

An array of metadata key names that MUST be present in every document.

{ "required": ["category", "author"]
}

If a document is added without the category key in its meta, validation fails.
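
As a rough illustration of the required rule, the sketch below reports one message per missing key, using the message wording from the error examples in this document. The function name missing_required is hypothetical. The sketch treats a key with an empty value list as present, which is an assumption; this document does not say how MDDB handles that case.

```python
from typing import Dict, List


def missing_required(meta: Dict[str, List[str]], required: List[str]) -> List[str]:
    """Return one message per required key that is absent from meta."""
    return [
        f'missing required metadata key "{key}"'
        for key in required
        if key not in meta  # assumption: presence only, the value list is not inspected
    ]
```

For the Quick Start document that only sets tags, this yields the two messages for category and author, in the order the schema lists them.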

properties

Defines per-key rules. Each key maps to an object that MAY contain type, enum, pattern, minItems, and maxItems.

{
  "properties": {
    "status": { "type": "string" },
    "priority": { "type": "integer" },
    "score": { "type": "number" },
    "featured": { "type": "boolean" }
  }
}

Supported types:

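| Type    | Description                        | Example valid values      |
|---------|------------------------------------|---------------------------|
| string  | Any string value                   | ["hello"], ["foo", "bar"] |
| number  | Numeric value (integer or float)   | ["3.14"], ["42"]          |
| integer | Integer value only                 | ["42"], ["-1"]            |
| boolean | Boolean value                      | ["true"], ["false"]       |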

Since metadata values in MDDB are always stored as []string, type validation checks that each value in the array can be parsed as the declared type.
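
To make the "can be parsed as the declared type" idea concrete, here is a minimal sketch that checks each string value against a declared type. The function names are hypothetical, and the exact spellings accepted here are assumptions: the sketch accepts only plain decimal integers, decimal or exponent numbers, and the literals true and false, while the server's parser may accept more or fewer forms. The message wording follows the "Invalid type" example in this document.

```python
import re
from typing import List

# Assumed grammars for the sketch; MDDB's actual parsing rules may differ.
_INTEGER = re.compile(r"[+-]?[0-9]+")
_NUMBER = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?")


def value_matches_type(value: str, declared: str) -> bool:
    """Report whether a single string value parses as the declared type."""
    if declared == "string":
        return True
    if declared == "integer":
        return _INTEGER.fullmatch(value) is not None
    if declared == "number":
        return _NUMBER.fullmatch(value) is not None
    if declared == "boolean":
        return value in ("true", "false")
    raise ValueError(f"unsupported type: {declared}")


def type_errors(key: str, values: List[str], declared: str) -> List[str]:
    """Return one message per value that does not parse as the declared type."""
    return [
        f'value "{v}" for key "{key}" is not a valid {declared}'
        for v in values
        if not value_matches_type(v, declared)
    ]
```

Because every value in the array is checked, a key such as "priority": ["1", "high"] would produce one error, for "high".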

enum

Restricts a key's values to a fixed set of allowed strings.

{ "properties": { "status": { "type": "string", "enum": ["draft", "published", "archived"] } }
}

A document with "status": ["pending"] would fail validation because "pending" is not in the allowed set.
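
A minimal sketch of the enum check, with the hypothetical helper name enum_errors and the message wording taken from the "Enum violation" example in this document:

```python
from typing import List


def enum_errors(key: str, values: List[str], allowed: List[str]) -> List[str]:
    """Return one message per value that is not in the allowed set."""
    listing = ", ".join(allowed)
    return [
        f'value "{v}" for key "{key}" is not in allowed enum values [{listing}]'
        for v in values
        if v not in allowed
    ]
```

The comparison in this sketch is an exact, case-sensitive string match; whether MDDB normalizes case or whitespace is not stated here.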

pattern

A regular expression that every value for the key MUST match.

{ "properties": { "slug": { "type": "string", "pattern": "^[a-z0-9-]+$" } }
}

A document with "slug": ["Hello World"] would fail because the value does not match the pattern.
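
The pattern rule can be sketched as follows. The helper name pattern_errors is hypothetical, and the sketch uses Python's re module with an unanchored search, so the pattern itself must carry ^ and $ if the whole value has to match. The regular expression dialect the server uses is not stated in this document, so patterns that rely on dialect-specific syntax may behave differently there.

```python
import re
from typing import List


def pattern_errors(key: str, values: List[str], pattern: str) -> List[str]:
    """Return one message per value that the pattern does not match."""
    compiled = re.compile(pattern)
    return [
        f'value "{v}" for key "{key}" does not match pattern "{pattern}"'
        for v in values
        if compiled.search(v) is None  # assumption: unanchored search semantics
    ]
```

With the slug pattern above, "hello-world" passes while "Hello World" is rejected because of the uppercase letters and the space.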

minItems / maxItems

Controls the number of values allowed per metadata key.

{
  "properties": {
    "tags": { "type": "string", "minItems": 1, "maxItems": 5 },
    "category": { "type": "string", "minItems": 1, "maxItems": 1 }
  }
}

  • minItems: Minimum number of values (e.g., at least 1 tag).
  • maxItems: Maximum number of values (e.g., at most 5 tags). Setting minItems and maxItems to the same number requires exactly that many values (e.g., exactly 1 category).

A document with "tags": [] would fail the minItems: 1 check. A document with "category": ["blog", "news"] would fail the maxItems: 1 check.
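
The count rules amount to two bounds checks on the length of the value array. The sketch below uses the hypothetical helper name item_count_errors and the message wording from the error examples in this document. It only covers keys that are present; how an absent key interacts with minItems is not specified here.

```python
from typing import List, Optional


def item_count_errors(
    key: str,
    values: List[str],
    min_items: Optional[int] = None,
    max_items: Optional[int] = None,
) -> List[str]:
    """Return messages for value counts outside the [min_items, max_items] range."""
    errors: List[str] = []
    count = len(values)
    if min_items is not None and count < min_items:
        errors.append(f'key "{key}" has {count} values, below minItems {min_items}')
    if max_items is not None and count > max_items:
        errors.append(f'key "{key}" has {count} values, exceeds maxItems {max_items}')
    return errors
```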

HTTP API

POST /v1/schema/set

Set or update the validation schema for a collection.

Request Body:

{
  "collection": "blog",
  "schema": {
    "required": ["category", "author"],
    "properties": {
      "category": { "type": "string", "enum": ["blog", "tutorial", "news"] },
      "author": { "type": "string" },
      "tags": { "type": "string", "minItems": 1, "maxItems": 5 }
    }
  }
}

Response:

{ "status": "ok"
}

cURL Example:

curl -X POST http://localhost:11023/v1/schema/set \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "schema": {
      "required": ["category"],
      "properties": {
        "category": { "type": "string", "enum": ["blog", "tutorial"] }
      }
    }
  }'

POST /v1/schema/get

Retrieve the current schema for a collection.

Request Body:

{ "collection": "blog"
}

Response (schema exists):

{
  "collection": "blog",
  "schema": {
    "required": ["category", "author"],
    "properties": {
      "category": { "type": "string", "enum": ["blog", "tutorial", "news"] },
      "author": { "type": "string" },
      "tags": { "type": "string", "minItems": 1, "maxItems": 5 }
    }
  }
}

Response (no schema):

{ "collection": "blog", "schema": null
}

cURL Example:

curl -X POST http://localhost:11023/v1/schema/get \
  -H 'Content-Type: application/json' \
  -d '{"collection": "blog"}'

POST /v1/schema/delete

Delete the schema for a collection, disabling validation.

Request Body:

{ "collection": "blog"
}

Response:

{ "status": "ok"
}

cURL Example:

curl -X POST http://localhost:11023/v1/schema/delete \
  -H 'Content-Type: application/json' \
  -d '{"collection": "blog"}'

POST /v1/schema/list

List all collections that have a schema defined.

Request Body: Empty or {}.

Response:

{
  "schemas": [
    {
      "collection": "blog",
      "schema": {
        "required": ["category", "author"],
        "properties": {
          "category": { "type": "string", "enum": ["blog", "tutorial", "news"] },
          "author": { "type": "string" }
        }
      }
    },
    {
      "collection": "products",
      "schema": {
        "required": ["price", "sku"],
        "properties": {
          "price": { "type": "number" },
          "sku": { "type": "string", "pattern": "^SKU-[0-9]+$" }
        }
      }
    }
  ]
}

cURL Example:

curl -X POST http://localhost:11023/v1/schema/list \
  -H 'Content-Type: application/json' \
  -d '{}'

POST /v1/validate

Validate a document's metadata against the collection schema without persisting anything. Useful for dry-run checks before adding documents.

Request Body:

{
  "collection": "blog",
  "meta": {
    "category": ["blog"],
    "author": ["Jane Doe"],
    "tags": ["golang", "tutorial"]
  }
}

Response (valid):

{ "valid": true, "errors": []
}

Response (invalid):

{
  "valid": false,
  "errors": [
    "value \"pending\" for key \"status\" is not in allowed enum values [draft, published, archived]",
    "key \"tags\" has 6 values, exceeds maxItems 5"
  ]
}

Response (no schema set for collection):

{ "valid": true, "errors": []
}

cURL Example:

curl -X POST http://localhost:11023/v1/validate \
  -H 'Content-Type: application/json' \
  -d '{
    "collection": "blog",
    "meta": {
      "category": ["blog"],
      "author": ["Jane Doe"]
    }
  }'
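
For a dry-run check from a script, the request and response shapes shown above can be wrapped in two small helpers. This is a sketch using only Python's standard library; the helper names build_validate_request and parse_validate_response are hypothetical, and the base URL is the localhost address used in the examples in this document.

```python
import json
import urllib.request
from typing import Dict, List, Tuple


def build_validate_request(
    base_url: str, collection: str, meta: Dict[str, List[str]]
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/validate."""
    body = json.dumps({"collection": collection, "meta": meta}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/validate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_validate_response(raw: bytes) -> Tuple[bool, List[str]]:
    """Extract the valid flag and the error list from a /v1/validate response body."""
    result = json.loads(raw)
    return bool(result["valid"]), list(result.get("errors") or [])
```

To send the request against a running server, pass the result of build_validate_request to urllib.request.urlopen and feed the response body to parse_validate_response.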

gRPC API

The following RPCs are available on the mddb.MDDB service (port 11024):

| RPC              | Request                 | Response                 | Description                       |
|------------------|-------------------------|--------------------------|-----------------------------------|
| SetSchema        | SetSchemaRequest        | SetSchemaResponse        | Set or update a collection schema |
| GetSchema        | GetSchemaRequest        | GetSchemaResponse        | Retrieve a collection schema      |
| DeleteSchema     | DeleteSchemaRequest     | DeleteSchemaResponse     | Delete a collection schema        |
| ListSchemas      | ListSchemasRequest      | ListSchemasResponse      | List all collection schemas       |
| ValidateDocument | ValidateDocumentRequest | ValidateDocumentResponse | Validate metadata against schema  |

grpcurl Examples

grpcurl -plaintext -d '{
  "collection": "blog",
  "schema": {
    "required": ["category"],
    "properties": {
      "category": {"type": "string", "enum_values": ["blog", "tutorial"]}
    }
  }
}' localhost:11024 mddb.MDDB/SetSchema

grpcurl -plaintext -d '{"collection": "blog"}' \
  localhost:11024 mddb.MDDB/GetSchema

grpcurl -plaintext -d '{"collection": "blog"}' \
  localhost:11024 mddb.MDDB/DeleteSchema

grpcurl -plaintext -d '{}' \
  localhost:11024 mddb.MDDB/ListSchemas

grpcurl -plaintext -d '{
  "collection": "blog",
  "meta": {
    "category": {"values": ["blog"]},
    "author": {"values": ["John Doe"]}
  }
}' localhost:11024 mddb.MDDB/ValidateDocument
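
Note that the gRPC examples use a slightly different JSON shape from the HTTP API: each metadata key maps to an object with a values array rather than to a bare array, and the schema example spells the enum rule as enum_values. The sketch below converts metadata between the two shapes. It is inferred only from the examples in this document, not from the service's proto definition, and the helper names are hypothetical.

```python
from typing import Dict, List


def meta_to_grpc_json(meta: Dict[str, List[str]]) -> Dict[str, Dict[str, List[str]]]:
    """Wrap each HTTP-style value list as {"values": [...]} for grpcurl."""
    return {key: {"values": list(values)} for key, values in meta.items()}


def meta_from_grpc_json(meta: Dict[str, Dict[str, List[str]]]) -> Dict[str, List[str]]:
    """Unwrap the gRPC-style shape back into plain value lists."""
    # Assumption: a key without a "values" field is treated as an empty list.
    return {key: list(entry.get("values", [])) for key, entry in meta.items()}
```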

CLI

The mddb-cli tool provides schema management commands:

Set a schema

mddb-cli schema set blog '{
  "required": ["category", "author"],
  "properties": {
    "category": {"type": "string", "enum": ["blog", "tutorial", "news"]},
    "author": {"type": "string"}
  }
}'

Get a schema

mddb-cli schema get blog

Delete a schema

mddb-cli schema delete blog

List all schemas

mddb-cli schema list

Validate metadata

mddb-cli validate blog '{"category": ["blog"], "author": ["John Doe"]}'

MCP Tools

The following MCP tools are available for schema validation:

| Tool              | Description                                          |
|-------------------|------------------------------------------------------|
| set_schema        | Set or update the validation schema for a collection |
| get_schema        | Retrieve the current schema for a collection         |
| delete_schema     | Delete the schema for a collection                   |
| list_schemas      | List all collections with schemas                    |
| validate_document | Validate metadata against a collection schema        |

Example MCP Usage

When connected to an LLM via MCP, you can use natural language:

  • "Set a schema for the blog collection that requires category and author fields"
  • "Show me the schema for the products collection"
  • "Validate this metadata against the blog schema: category=tutorial, author=Jane"
  • "List all collections that have schemas"
  • "Remove the schema from the blog collection"

Disabling Validation

To disable validation for a collection, delete its schema:

curl -X POST http://localhost:11023/v1/schema/delete \
  -H 'Content-Type: application/json' \
  -d '{"collection": "blog"}'

mddb-cli schema delete blog

grpcurl -plaintext -d '{"collection": "blog"}' \
  localhost:11024 mddb.MDDB/DeleteSchema

Once the schema is deleted, all documents will be accepted regardless of their metadata content. Existing documents are NOT re-validated or removed.


Error Messages

When validation fails on /v1/add, the response includes a descriptive error:

Missing required key

{ "error": "schema validation failed: missing required metadata key \"category\""
}

Invalid type

{ "error": "schema validation failed: value \"not-a-number\" for key \"priority\" is not a valid integer"
}

Enum violation

{ "error": "schema validation failed: value \"pending\" for key \"status\" is not in allowed enum values [draft, published, archived]"
}

Pattern mismatch

{ "error": "schema validation failed: value \"Hello World\" for key \"slug\" does not match pattern \"^[a-z0-9-]+$\""
}

minItems / maxItems violation

{ "error": "schema validation failed: key \"tags\" has 0 values, below minItems 1"
}
{ "error": "schema validation failed: key \"category\" has 3 values, exceeds maxItems 1"
}

Multiple errors

When multiple rules are violated, all errors are combined:

{ "error": "schema validation failed: missing required metadata key \"author\"; value \"invalid\" for key \"status\" is not in allowed enum values [draft, published, archived]; key \"tags\" has 6 values, exceeds maxItems 5"
}
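
A client that wants to show each violation separately can split the combined string back into individual messages. The sketch below assumes the format shown in the examples above: a fixed "schema validation failed: " prefix followed by messages joined with "; ". The helper name split_validation_errors is hypothetical, and the split is naive: a metadata value that itself contains "; " would be cut in two.

```python
from typing import List

PREFIX = "schema validation failed: "


def split_validation_errors(error: str) -> List[str]:
    """Split a combined /v1/add validation error into individual messages."""
    if not error.startswith(PREFIX):
        # Not a validation error in the expected format; return it unchanged.
        return [error]
    return error[len(PREFIX):].split("; ")
```

For structured error lists without string parsing, the /v1/validate endpoint already returns errors as an array.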

See Also