Embedding Providers Guide
MDDB supports multiple embedding providers for vector search functionality. You can configure embeddings either through environment variables or via the Admin Panel UI.
Supported Providers
1. OpenAI
API URL: https://api.openai.com/v1
Authentication: API Key required
Documentation: https://platform.openai.com/docs/guides/embeddings
Popular Models
| Model | Dimensions | Use Case | Cost |
|---|---|---|---|
| text-embedding-3-small | 1536 | Fast, cost-effective, general purpose | $ |
| text-embedding-3-large | 3072 | Highest quality, best performance | $$$ |
| text-embedding-ada-002 | 1536 | Legacy model (v2) | $$ |
Environment Variables
export MDDB_EMBEDDING_PROVIDER=openai
export MDDB_EMBEDDING_API_KEY=sk-...
export MDDB_EMBEDDING_MODEL=text-embedding-3-small
export MDDB_EMBEDDING_DIMENSIONS=1536
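To make these settings more concrete, the sketch below builds (but does not send) a request to OpenAI's embeddings endpoint using only the Python standard library. The helper name `build_openai_request` is hypothetical, and how MDDB issues this call internally is not described in this guide.

```python
import json
import urllib.request


def build_openai_request(text, api_key, model="text-embedding-3-small",
                         base_url="https://api.openai.com/v1"):
    """Build a POST request for the OpenAI embeddings endpoint (not sent here)."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending the request would look roughly like this:
#   req = build_openai_request("hello world", api_key)
#   with urllib.request.urlopen(req) as resp:
#       vector = json.load(resp)["data"][0]["embedding"]
```

The length of the returned vector should match the value you set in MDDB_EMBEDDING_DIMENSIONS.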
2. Cohere
API URL: https://api.cohere.ai/v1
Authentication: API Key required
Documentation: https://docs.cohere.com/docs/embeddings
Popular Models
| Model | Dimensions | Use Case | Languages |
|---|---|---|---|
| embed-english-v3.0 | 1024 | English text | English only |
| embed-multilingual-v3.0 | 1024 | Multilingual support | 100+ languages |
| embed-english-light-v3.0 | 384 | Fast, smaller embeddings | English only |
| embed-multilingual-light-v3.0 | 384 | Fast, smaller, multilingual | 100+ languages |
Features
- ✅ Best multilingual support
- ✅ Semantic search optimized
- ✅ Built-in compression options
Environment Variables
export MDDB_EMBEDDING_PROVIDER=cohere
export MDDB_EMBEDDING_API_KEY=cohere_api_key...
export MDDB_EMBEDDING_MODEL=embed-english-v3.0
export MDDB_EMBEDDING_DIMENSIONS=1024
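Cohere's v3 embedding models distinguish between text embedded for indexing and text embedded as a search query. The sketch below builds (without sending) a request to Cohere's embed endpoint to show that difference. The helper name `build_cohere_request` is hypothetical, and whether MDDB sets `input_type` on your behalf is not stated in this guide.

```python
import json
import urllib.request


def build_cohere_request(texts, api_key, for_query=False,
                         model="embed-english-v3.0",
                         base_url="https://api.cohere.ai/v1"):
    """Build a POST request for Cohere's embed endpoint (not sent here)."""
    payload = {
        "model": model,
        "texts": list(texts),
        # Documents are embedded at index time, queries at search time.
        "input_type": "search_query" if for_query else "search_document",
    }
    return urllib.request.Request(
        f"{base_url}/embed",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```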
3. Voyage AI
API URL: https://api.voyageai.com/v1
Authentication: API Key required
Documentation: https://docs.voyageai.com/
Popular Models
| Model | Dimensions | Use Case | Specialty |
|---|---|---|---|
| voyage-3 | 1024 | Latest, best quality | General purpose |
| voyage-large-2 | 1536 | High accuracy | Long documents |
| voyage-code-2 | 1536 | Code embeddings | Programming code |
| voyage-law-2 | 1024 | Legal documents | Legal text |
Features
- ✅ Specialized in embeddings (not general LLM)
- ✅ Very high quality
- ✅ Domain-specific models (code, law)
- ✅ Competitive pricing
Environment Variables
export MDDB_EMBEDDING_PROVIDER=voyage
export MDDB_EMBEDDING_API_KEY=pa-...
export MDDB_EMBEDDING_MODEL=voyage-3
export MDDB_EMBEDDING_DIMENSIONS=1024
4. Ollama (Local)
API URL: http://localhost:11434 (default)
Authentication: None (local server)
Documentation: https://ollama.ai/
Popular Models
| Model | Dimensions | Size | Quality |
|---|---|---|---|
| nomic-embed-text | 768 | ~275MB | Good, fast |
| mxbai-embed-large | 1024 | ~670MB | Better quality |
| all-minilm | 384 | ~45MB | Small, very fast |
| snowflake-arctic-embed | 1024 | ~669MB | High quality |
Features
- ✅ Fully local, no API costs
- ✅ Privacy-focused
- ✅ Works offline
- ✅ Multiple open-source models
Setup
- Install Ollama: https://ollama.ai/download
- Pull model: ollama pull nomic-embed-text
- Run server: ollama serve (usually auto-starts)
Environment Variables
export MDDB_EMBEDDING_PROVIDER=ollama
export MDDB_EMBEDDING_API_URL=http://localhost:11434
export MDDB_EMBEDDING_MODEL=nomic-embed-text
export MDDB_EMBEDDING_DIMENSIONS=768
Configuration Methods
Method 1: Environment Variables (Legacy)
Set environment variables before starting mddbd:
export MDDB_EMBEDDING_PROVIDER=openai
export MDDB_EMBEDDING_API_KEY=sk-...
export MDDB_EMBEDDING_MODEL=text-embedding-3-small
export MDDB_EMBEDDING_DIMENSIONS=1536
./mddbd
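If you want to catch typos before starting the server, a small pre-flight check can read the same variables and fail early. The following is a minimal sketch, not mddbd's own parsing logic: the `EmbeddingConfig` class and `load_embedding_config` function are hypothetical, and the rule that only Ollama may omit the API key is inferred from the provider sections above.

```python
import os
from dataclasses import dataclass

PROVIDERS_REQUIRING_KEY = {"openai", "cohere", "voyage"}
KNOWN_PROVIDERS = PROVIDERS_REQUIRING_KEY | {"ollama"}


@dataclass
class EmbeddingConfig:
    provider: str
    model: str
    dimensions: int
    api_key: str = ""
    api_url: str = ""


def load_embedding_config(env=None):
    """Read MDDB_EMBEDDING_* variables and fail early on obvious mistakes."""
    env = os.environ if env is None else env
    provider = env.get("MDDB_EMBEDDING_PROVIDER", "").strip().lower()
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    api_key = env.get("MDDB_EMBEDDING_API_KEY", "")
    if provider in PROVIDERS_REQUIRING_KEY and not api_key:
        raise ValueError(f"{provider} requires MDDB_EMBEDDING_API_KEY")
    model = env.get("MDDB_EMBEDDING_MODEL", "")
    if not model:
        raise ValueError("MDDB_EMBEDDING_MODEL is not set")
    try:
        dimensions = int(env.get("MDDB_EMBEDDING_DIMENSIONS", ""))
    except ValueError:
        dimensions = 0
    if dimensions <= 0:
        raise ValueError("MDDB_EMBEDDING_DIMENSIONS must be a positive integer")
    return EmbeddingConfig(provider, model, dimensions, api_key,
                           env.get("MDDB_EMBEDDING_API_URL", ""))
```

Calling `load_embedding_config()` with no arguments reads the current process environment; passing a dictionary makes the check easy to test.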
Method 2: Admin Panel (Recommended)
1. Open mddb-panel: http://localhost:11024
2. Navigate to Administration → Embedding Models
3. Click Add Model
4. Fill in configuration:
   - ID: Unique identifier (e.g., openai-small, cohere-multilingual)
   - Name: Display name (e.g., OpenAI Small, Cohere Multilingual)
   - Provider: Select from dropdown
   - Model: Model name
   - Dimensions: Vector dimensions
   - API Key: Your API key (for OpenAI, Cohere, Voyage)
   - API URL: Custom URL or leave empty for default
   - Set as default: Check to make this the active model
5. Click Create
Import Current Config
If you're using environment variables and want to migrate to database config:
- Open Administration → Embedding Models
- If no configs exist, you'll see "Import Current Configuration"
- Click Import Current Config to save your env var config to the database
Comparison Matrix
| Provider | Cost | Quality | Speed | Multilingual | Local | API Key Required |
|---|---|---|---|---|---|---|
| OpenAI | $$ | Excellent | Fast | Good | No | Yes |
| Cohere | $$ | Excellent | Fast | Best | No | Yes |
| Voyage | $$ | Excellent | Fast | Good | No | Yes |
| Ollama | Free | Good | Fastest | Fair | Yes | No |
Choosing a Provider
Use OpenAI if:
- ✅ You want the most popular, well-supported option
- ✅ You're already using OpenAI for other services
- ✅ You need reliable, high-quality embeddings
- ✅ English is your primary language
Use Cohere if:
- ✅ You need best-in-class multilingual support (100+ languages)
- ✅ You're working with non-English content
- ✅ You want semantic search optimized embeddings
- ✅ You need smaller models (light versions)
Use Voyage AI if:
- ✅ You want specialized, domain-specific models (code, law)
- ✅ You need the highest quality embeddings
- ✅ You're working with technical or legal documents
- ✅ You value a company focused solely on embeddings
Use Ollama if:
- ✅ You want 100% free, no API costs
- ✅ Privacy is critical (data never leaves your server)
- ✅ You need to work offline
- ✅ You have sufficient local compute resources
- ✅ You prefer open-source solutions
API Pricing (Approximate)
| Provider | Model | Price per 1M tokens |
|---|---|---|
| OpenAI | text-embedding-3-small | $0.02 |
| OpenAI | text-embedding-3-large | $0.13 |
| Cohere | embed-english-v3.0 | $0.10 |
| Cohere | embed-multilingual-v3.0 | $0.10 |
| Voyage | voyage-3 | $0.10 |
| Voyage | voyage-large-2 | $0.12 |
| Ollama | any model | FREE |
Prices as of 2026-03. Check provider websites for current pricing.
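To turn the table into a budget figure, multiply your total token count by the per-million price. The sketch below uses the approximate prices listed above; the function name `estimate_cost` and the example corpus size (10,000 documents of roughly 500 tokens each) are illustrative assumptions.

```python
# USD per 1M tokens, copied from the pricing table in this guide (approximate).
PRICE_PER_MILLION = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "embed-english-v3.0": 0.10,
    "embed-multilingual-v3.0": 0.10,
    "voyage-3": 0.10,
    "voyage-large-2": 0.12,
}


def estimate_cost(model, total_tokens):
    """Estimated USD cost of embedding `total_tokens` tokens with `model`."""
    return PRICE_PER_MILLION[model] * total_tokens / 1_000_000


if __name__ == "__main__":
    tokens = 10_000 * 500  # assumed corpus: 10,000 documents, ~500 tokens each
    for model in PRICE_PER_MILLION:
        print(f"{model}: ${estimate_cost(model, tokens):.2f}")
```

For that assumed corpus of 5 million tokens, text-embedding-3-small comes to about $0.10 and text-embedding-3-large to about $0.65. Re-embedding after a model change incurs the same cost again.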
Best Practices
1. Choose Consistent Dimensions
- Once you embed documents with a specific dimension, stick with it
- Changing dimensions requires re-embedding all documents
- Higher dimensions generally improve quality, but they increase storage and slow down search
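The storage side of that trade-off is simple arithmetic. The sketch below assumes each vector component is stored as a 4-byte float32 and ignores any index overhead; how MDDB actually stores vectors is not covered in this guide, so treat the numbers as a lower bound.

```python
def vector_storage_bytes(num_docs, dimensions, bytes_per_value=4):
    """Raw size of the embedding vectors alone (assumes float32 by default)."""
    return num_docs * dimensions * bytes_per_value


if __name__ == "__main__":
    for dims in (384, 768, 1024, 1536, 3072):
        size_gb = vector_storage_bytes(1_000_000, dims) / 1e9
        print(f"{dims} dimensions, 1M documents: {size_gb:.1f} GB")
```

Under these assumptions, one million documents need roughly 1.5 GB at 384 dimensions and roughly 6.1 GB at 1536 dimensions.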
2. Monitor Costs
- Track API usage via provider dashboards
- Consider caching embeddings for frequently accessed documents
- Use smaller models for development/testing
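One way to cache embeddings on the client side is to key them by a hash of the model name and the text, so identical content is never sent to the provider twice. This is a minimal in-memory sketch; the `EmbeddingCache` class and the `embed_fn` callable are hypothetical, and MDDB may already apply its own caching.

```python
import hashlib


class EmbeddingCache:
    """In-memory cache keyed by model name and text content."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn  # callable: (model, text) -> list of floats
        self._store = {}
        self.misses = 0

    @staticmethod
    def _key(model, text):
        # Include the model so vectors from different models never collide.
        return hashlib.sha256(f"{model}\x00{text}".encode("utf-8")).hexdigest()

    def get(self, model, text):
        key = self._key(model, text)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(model, text)
        return self._store[key]
```

Including the model name in the key matters because vectors from different models are not interchangeable.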
3. Test Before Production
- Compare quality across providers with your specific data
- Measure search relevance for your use case
- Benchmark performance (speed vs quality)
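A simple way to compare providers on your own data is recall@k: for each test query, the fraction of known-relevant documents that appear in the top k results. The sketch below assumes you have hand-labelled relevant document IDs for a handful of queries; the function name and the example IDs are illustrative.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant documents that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant) / len(relevant)


# Example: two relevant documents, one of which is ranked in the top 2.
score = recall_at_k(["doc-a", "doc-b", "doc-c", "doc-d"], {"doc-a", "doc-d"}, k=2)
```

Averaging this score over the same query set for each candidate model gives a like-for-like relevance comparison.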
4. Security
- Never commit API keys to git
- Use environment variables or secure secret management
- Rotate API keys regularly
5. Switching Providers
- Database configs allow easy switching between models
- Test new provider with subset of data first
- Re-embed all documents when switching providers
Troubleshooting
"No active embedding configuration"
- Set environment variables OR configure in Admin Panel
- Ensure API key is valid and has credits
- Check server logs for detailed error messages
"Dimensions mismatch"
- All documents in a collection must use same dimensions
- Clear existing embeddings before switching models
- Consider creating new collection for different model
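When you are debugging a mismatch, it can help to check the vectors themselves against the configured dimension count. The helper below is a hypothetical diagnostic for vectors you have exported or generated yourself; it is not part of MDDB.

```python
def find_dimension_mismatches(vectors, expected_dimensions):
    """Return (index, actual_length) for every vector with the wrong length."""
    return [
        (index, len(vector))
        for index, vector in enumerate(vectors)
        if len(vector) != expected_dimensions
    ]


# Example: the second vector came from a 1536-dimension model by mistake.
bad = find_dimension_mismatches([[0.0] * 768, [0.0] * 1536], expected_dimensions=768)
```

An empty result means every vector matches the configured dimensions.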
"API rate limit exceeded"
- Slow down embedding worker (reduce batch size)
- Upgrade API plan with provider
- Consider switching to local Ollama
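If you call a provider from your own scripts, retrying with exponential backoff is a common way to stay under rate limits. The sketch below is a generic pattern, not MDDB's embedding worker: `RateLimitError` is a hypothetical exception that your own client code would raise on an HTTP 429 response.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised by the caller's client on HTTP 429."""


def with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run `call`, retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Delays of 1s, 2s, 4s, ... plus up to 25% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.25))
```

The jitter spreads out retries so that several workers hitting the limit at once do not all retry at the same moment.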
"Ollama connection failed"
- Ensure Ollama is running: ollama serve
- Check API URL is correct (default: http://localhost:11434)
- Verify model is pulled: ollama list
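The same checks can be scripted against Ollama's HTTP API, which lists locally available models at `/api/tags`. The sketch below is a hypothetical diagnostic, not part of MDDB; it assumes the default URL shown above and treats a model name with and without a `:tag` suffix as the same model.

```python
import json
import urllib.request


def model_is_pulled(tags_response, model):
    """Check a parsed /api/tags response for a model, ignoring the ':tag' suffix."""
    names = {entry["name"].split(":")[0] for entry in tags_response.get("models", [])}
    return model.split(":")[0] in names


def fetch_tags(api_url="http://localhost:11434", timeout=5):
    """Ask a running Ollama server which models are available locally."""
    with urllib.request.urlopen(f"{api_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


# Usage, with Ollama running:
#   model_is_pulled(fetch_tags(), "nomic-embed-text")
```

If `fetch_tags` raises a connection error, the server is not reachable at that URL; if it succeeds but `model_is_pulled` returns False, the model still needs to be pulled.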
Examples
OpenAI Configuration
{
  "id": "openai-small",
  "name": "OpenAI Small",
  "provider": "openai",
  "model": "text-embedding-3-small",
  "dimensions": 1536,
  "apiKey": "sk-...",
  "apiUrl": "https://api.openai.com/v1",
  "isDefault": true
}
Cohere Multilingual
{
  "id": "cohere-multi",
  "name": "Cohere Multilingual",
  "provider": "cohere",
  "model": "embed-multilingual-v3.0",
  "dimensions": 1024,
  "apiKey": "cohere_api_key...",
  "apiUrl": "https://api.cohere.ai/v1",
  "isDefault": true
}
Voyage for Code
{
  "id": "voyage-code",
  "name": "Voyage Code",
  "provider": "voyage",
  "model": "voyage-code-2",
  "dimensions": 1536,
  "apiKey": "pa-...",
  "apiUrl": "https://api.voyageai.com/v1",
  "isDefault": true
}
Ollama Local
{
  "id": "ollama-nomic",
  "name": "Ollama Nomic",
  "provider": "ollama",
  "model": "nomic-embed-text",
  "dimensions": 768,
  "apiKey": "",
  "apiUrl": "http://localhost:11434",
  "isDefault": true
}
Classification
All embedding providers support zero-shot classification via POST /v1/classify. This feature embeds the candidate labels and computes their similarity to your documents, so no training data is required. See Search Algorithms for details.
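The idea behind embedding-based zero-shot classification can be sketched in a few lines. The example below assumes cosine similarity as the measure and uses tiny hand-written vectors in place of real embeddings; the request and response format of POST /v1/classify is not documented in this guide, so this illustrates the technique rather than the endpoint.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(doc_vector, label_vectors):
    """Return (label, score) pairs sorted by descending similarity."""
    scores = {label: cosine(doc_vector, vec) for label, vec in label_vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy 3-dimensional vectors standing in for real label and document embeddings.
labels = {"sports": [1.0, 0.0, 0.0], "finance": [0.0, 1.0, 0.0]}
ranking = classify([0.9, 0.1, 0.0], labels)
best_label = ranking[0][0]
```

In practice both the labels and the documents would be embedded with the same model, since vectors from different models are not comparable.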
Related Documentation
Support
- GitHub Issues: https://github.com/tradik/mddb/issues
- Discussions: https://github.com/tradik/mddb/discussions
- Documentation: https://github.com/tradik/mddb/docs
Last updated: 2026-03-02