Bulk Import Guide
Overview
The load-md-folder.sh script allows you to bulk import markdown files from a folder into MDDB. It's perfect for migrating existing documentation, importing blog posts, or loading large collections of markdown content.
Features
- Automatic Key Generation - Creates unique keys from filenames
- Frontmatter Support - Extracts YAML-style metadata from file headers
- Recursive Scanning - Process entire directory trees
- Progress Tracking - Real-time progress with statistics
- Dry Run Mode - Preview imports without making changes
- Error Handling - Graceful failure handling with detailed reporting
- Metadata Enrichment - Add custom metadata to all imported files
- Multi-language Support - Specify language code for all documents
Installation
The script is located in the scripts/ directory and requires:
- Bash shell
- mddb-cli command available in PATH
- Running MDDB server
chmod +x scripts/load-md-folder.sh
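The requirements above can be verified before a first import. The following is a minimal sketch, not part of the script itself: the function name and return codes are illustrative, and it assumes that `mddb-cli stats` exits with a non-zero status when the server cannot be reached.

```shell
# Hypothetical pre-flight check; name and return codes are illustrative.
check_prereqs() {
  local cli
  cli=${MDDB_CLI:-mddb-cli}

  # 1. The CLI must be resolvable, either on PATH or as an explicit path.
  if ! command -v "$cli" >/dev/null 2>&1; then
    echo "CLI not found: $cli" >&2
    return 1
  fi

  # 2. Assumption: `mddb-cli stats` fails when the server is unreachable.
  if ! "$cli" stats >/dev/null 2>&1; then
    echo "Cannot reach MDDB server" >&2
    return 2
  fi
}
```

Calling `check_prereqs && ./scripts/load-md-folder.sh ./docs blog` would then skip the import when either requirement is missing.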
Basic Usage
Simple Import
Import all .md files from a folder:
./scripts/load-md-folder.sh ./docs blog
This will:
- Scan ./docs for .md files
- Import them into the blog collection
- Use default language en_US
- Generate keys from filenames
Recursive Import
Process all subfolders:
./scripts/load-md-folder.sh ./content articles --recursive
Or use the short form:
./scripts/load-md-folder.sh ./content articles -r
Custom Language
Specify a different language code:
./scripts/load-md-folder.sh ./docs-pl blog --lang pl_PL
Short form:
./scripts/load-md-folder.sh ./docs-pl blog -l pl_PL
Advanced Usage
Adding Metadata
Add custom metadata to all imported files:
./scripts/load-md-folder.sh ./posts blog \
  --meta "author=John Doe" \
  --meta "status=published" \
  --meta "category=tutorial"
Short form:
./scripts/load-md-folder.sh ./posts blog \
  -m "author=John Doe" \
  -m "status=published"
Dry Run
Preview what would be imported without making changes:
./scripts/load-md-folder.sh ./docs blog --dry-run
This shows:
- Which files would be imported
- Generated keys
- Extracted metadata
- Final metadata combination
Verbose Output
See detailed information during import:
./scripts/load-md-folder.sh ./docs blog --verbose
Shows:
- Each file being processed
- Generated key for each file
- Metadata for each file
- Success/failure status
Custom Server
Connect to a different MDDB server:
./scripts/load-md-folder.sh ./docs blog \
  --server http://production-server:11023
Or use environment variable:
MDDB_SERVER=http://production-server:11023 \
  ./scripts/load-md-folder.sh ./docs blog
Batch Size
Control progress update frequency:
./scripts/load-md-folder.sh ./docs blog --batch-size 50
Default is 10 files per progress update.
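To make the effect of the batch size concrete, the sketch below lists the file counts at which a progress line would be printed. It is an illustration only: the helper name is hypothetical, and it assumes that a final update is always printed after the last file.

```shell
# Hypothetical helper: prints the file counts that trigger a progress update.
progress_points() {
  local total batch_size i
  total=$1
  batch_size=${2:-10}   # 10 is the default stated in this guide
  i=1
  while [ "$i" -le "$total" ]; do
    # Report every batch_size files, and once more at the very end (assumption).
    if [ $((i % batch_size)) -eq 0 ] || [ "$i" -eq "$total" ]; then
      echo "$i"
    fi
    i=$((i + 1))
  done
}
```

With 25 files and a batch size of 10, updates would appear after files 10, 20, and 25.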
Frontmatter Support
The script automatically extracts metadata from YAML-style frontmatter:
---
title: Getting Started
author: John Doe
tags: tutorial, beginner
category: documentation
date: 2024-01-15
---
Your content here...
This frontmatter will be converted to metadata:
title=Getting Started
author=John Doe
tags=tutorial, beginner
category=documentation
date=2024-01-15
Frontmatter Format
Supported format:
---
key: value
another_key: another value
tags: value1, value2
---
Requirements:
- Must start with --- on first line
- Must end with --- on its own line
- Use key: value format
- Values can contain spaces (quotes optional)
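The format requirements above can be sketched as a small parser. This is an assumption about how such extraction might work, not the script's actual code; the function name is hypothetical, and only optional double quotes around values are stripped.

```shell
# Hypothetical sketch: print frontmatter as key=value lines for one file.
extract_frontmatter() {
  awk '
    NR == 1 && $0 != "---" { exit }   # no frontmatter: first line is not ---
    NR == 1 { next }                  # skip the opening ---
    $0 == "---" { exit }              # closing --- ends the block
    {
      sep = index($0, ":")            # split at the first colon only
      if (sep == 0) next
      key = substr($0, 1, sep - 1)
      val = substr($0, sep + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      gsub(/^"|"$/, "", val)          # quotes around the value are optional
      if (key != "") print key "=" val
    }
  ' "$1"
}
```

Splitting at the first colon keeps values such as timestamps or URLs intact, since any later colon stays part of the value.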
Key Generation
Keys are automatically generated from filenames:
| Filename | Generated Key |
|---|---|
| Getting Started.md | getting-started |
| API_Reference.md | api-reference |
| 2024-01-15-blog-post.md | 2024-01-15-blog-post |
| My Document (v2).md | my-document-v2 |
Rules:
- Convert to lowercase
- Replace spaces and special characters with hyphens
- Remove consecutive hyphens
- Trim leading/trailing hyphens
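The rules above can be sketched as a short pipeline. This is an illustrative reimplementation under the assumption that "special characters" means anything other than ASCII letters and digits; the function name is hypothetical and the script's actual code may differ.

```shell
# Hypothetical sketch of the key-generation rules.
generate_key() {
  local name
  name=$(basename -- "$1" .md)              # drop directory and .md extension
  printf '%s' "$name" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]/-/g' \
          -e 's/--*/-/g' \
          -e 's/^-//' \
          -e 's/-$//'
  # sed steps: replace spaces/special characters, collapse runs of hyphens,
  # then trim a leading and a trailing hyphen.
}
```

For example, `generate_key "My Document (v2).md"` yields `my-document-v2`, matching the table.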
Metadata Combination
Metadata is combined from multiple sources:
- Automatic metadata: source=folder-import, filename=original-filename.md
- Frontmatter metadata (extracted from file)
- Custom metadata (from --meta flags)
Example:
Frontmatter in tutorial.md:
---
author: Jane
category: tutorial
---
Command:
./scripts/load-md-folder.sh ./docs blog -m "status=published"
Resulting metadata:
source=folder-import,filename=tutorial.md,author=Jane,category=tutorial,status=published
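The combination order in this example (automatic, then frontmatter, then custom) can be sketched as follows. The function name and argument layout are hypothetical, and the sketch simply concatenates the pairs; how the script resolves duplicate keys is not covered here.

```shell
# Hypothetical sketch: join metadata from the three sources in order.
# $1 = filename, $2 = frontmatter as key=value lines, remaining args = --meta pairs
combine_metadata() {
  local filename frontmatter result line pair
  filename=$1
  frontmatter=$2
  shift 2

  # 1. Automatic metadata
  result="source=folder-import,filename=$filename"

  # 2. Frontmatter metadata, one pair per line
  while IFS= read -r line; do
    if [ -n "$line" ]; then
      result="$result,$line"
    fi
  done <<EOF
$frontmatter
EOF

  # 3. Custom metadata from the command line
  for pair in "$@"; do
    result="$result,$pair"
  done

  printf '%s\n' "$result"
}
```

Values that themselves contain commas, such as `tags=tutorial, beginner`, would be ambiguous in a comma-joined string, so a real implementation would likely need escaping or a different separator.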
Examples
Migrate Documentation
./scripts/load-md-folder.sh ./docs documentation \
  --recursive \
  --meta "version=2.0" \
  --meta "status=published" \
  --verbose
Import Blog Posts
./scripts/load-md-folder.sh ./blog-posts blog \
  --lang en_US \
  --meta "author=John Doe" \
  --meta "type=blog-post"
Multi-language Content
./scripts/load-md-folder.sh ./content/en articles -l en_US -r
./scripts/load-md-folder.sh ./content/pl articles -l pl_PL -r
./scripts/load-md-folder.sh ./content/de articles -l de_DE -r
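When the number of languages grows, the per-language invocations can be generated in a loop. The sketch below assumes the folder layout used in this example, where each language lives in ./content/ under the part of its code before the underscore. The `LOADER` variable and the function name are hypothetical conveniences.

```shell
# Hypothetical wrapper; LOADER defaults to the script described in this guide.
LOADER=${LOADER:-./scripts/load-md-folder.sh}

import_languages() {
  local collection lang
  collection=$1
  shift
  for lang in "$@"; do
    # pl_PL -> ./content/pl (strip everything from the first underscore)
    "$LOADER" "./content/${lang%%_*}" "$collection" -l "$lang" -r || return 1
  done
}

# Usage: import_languages articles en_US pl_PL de_DE
```

This version stops at the first language whose import reports a failure; removing `|| return 1` would let it continue with the remaining languages.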
Preview Before Import
./scripts/load-md-folder.sh ./docs blog --dry-run
./scripts/load-md-folder.sh ./docs blog
Large Import with Progress
./scripts/load-md-folder.sh ./large-docs blog \
  --recursive \
  --batch-size 100 \
  --verbose
Output
Progress Display
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  MDDB Folder Loader
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Checking server connectivity...
✓ Server is running

Configuration:
  Folder: ./docs
  Collection: blog
  Language: en_US
  Server: http://localhost:11023
  Recursive: true

Scanning for markdown files...
Found 150 markdown file(s)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Loading Files
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Progress: [##########################                        ] 52% (78/150 files)
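The progress line follows from simple integer arithmetic: 78 of 150 files is 52%, which fills 26 cells of a bar. The sketch below reproduces that calculation. The function name and the bar width of 50 are assumptions inferred from the sample output, and the total is expected to be greater than zero.

```shell
# Hypothetical sketch of the progress line; width 50 is inferred, not confirmed.
render_progress() {
  local current total width percent filled bar i
  current=$1
  total=$2
  width=${3:-50}
  percent=$((current * 100 / total))   # integer percentage
  filled=$((current * width / total))  # number of '#' cells
  bar=""
  i=0
  while [ "$i" -lt "$width" ]; do
    if [ "$i" -lt "$filled" ]; then
      bar="$bar#"
    else
      bar="$bar "
    fi
    i=$((i + 1))
  done
  printf 'Progress: [%s] %d%% (%d/%d files)\n' "$bar" "$percent" "$current" "$total"
}
```

Calling `render_progress 78 150` prints a bar with 26 filled cells followed by `52% (78/150 files)`.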
Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Results:
  Total files: 150
  Successful: 148
  Failed: 2
  Duration: 45s
  Throughput: 3.29 files/sec

⚠ Import completed with some failures
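The throughput figure in this sample matches the number of successful files divided by the duration (148 / 45 ≈ 3.29), rather than the total file count (150 / 45 ≈ 3.33). That reading is an inference from the numbers shown, not a documented rule. A minimal sketch of the calculation, with a hypothetical function name:

```shell
# Hypothetical sketch: files per second, rounded to two decimals.
# $1 = number of files counted, $2 = duration in seconds
throughput() {
  awk -v n="$1" -v s="$2" 'BEGIN {
    if (s > 0) printf "%.2f\n", n / s
    else print "n/a"                  # avoid dividing by a zero duration
  }'
}
```

Here `throughput 148 45` prints `3.29`.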
Error Handling
Common Errors
Server not running:
✗ Cannot connect to MDDB server at http://localhost:11023
Make sure the server is running
Folder not found:
Error: Folder does not exist: ./nonexistent
No markdown files:
No markdown files found in ./empty-folder
Failed Imports
If some files fail to import:
- The script continues processing remaining files
- Failed files are counted in the summary
- Exit code is 1 (failure) if any imports failed
- Exit code is 0 (success) if all imports succeeded
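The behavior described above (continue after a failure, count it, and reflect it in the exit code) can be sketched as a loop. This is an illustration of the pattern, not the script's code; the function name is hypothetical, and the importer is passed in as a command so the sketch stays self-contained.

```shell
# Hypothetical sketch: $1 is a command that imports one file, the rest are files.
import_all() {
  local importer ok failed f
  importer=$1
  shift
  ok=0
  failed=0
  for f in "$@"; do
    if "$importer" "$f"; then
      ok=$((ok + 1))
    else
      failed=$((failed + 1))   # keep going; one bad file does not stop the run
    fi
  done
  echo "Successful: $ok"
  echo "Failed: $failed"
  [ "$failed" -eq 0 ]          # status 0 only if nothing failed, otherwise 1
}
```

Because the exit code is non-zero whenever any file fails, a calling script or CI job can gate later steps on it, for example `./scripts/load-md-folder.sh ./docs blog || echo "import had failures"`.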
Environment Variables
| Variable | Description | Default |
|---|---|---|
| MDDB_SERVER | Server URL | http://localhost:11023 |
| MDDB_CLI | CLI command path | mddb-cli |
Example:
export MDDB_SERVER=http://production:11023
export MDDB_CLI=/usr/local/bin/mddb-cli
./scripts/load-md-folder.sh ./docs blog
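A common way to resolve such a setting is a fallback chain using shell parameter expansion. The sketch below assumes the precedence is the --server flag first, then MDDB_SERVER, then the default from the table. That ordering is an assumption rather than documented behavior, and the function name is hypothetical.

```shell
# Hypothetical sketch: $1 is the --server value, or empty if the flag was omitted.
resolve_server() {
  # Assumed precedence: flag, then environment variable, then built-in default.
  printf '%s\n' "${1:-${MDDB_SERVER:-http://localhost:11023}}"
}
```

With neither the flag nor the variable set, `resolve_server` prints `http://localhost:11023`.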
Performance Tips
- Batch Size: Increase for large imports to reduce output
  ./scripts/load-md-folder.sh ./docs blog -b 100
- Disable Verbose: For faster imports
  ./scripts/load-md-folder.sh ./docs blog
- Use Extreme Mode: Enable on server for better performance
  MDDB_EXTREME=true mddbd
- Local Server: Import to local server, then backup/restore to production
Troubleshooting
Script not executable
chmod +x scripts/load-md-folder.sh
CLI not found
Build and install the CLI:
make build-cli
make install-all
Or point the script at an existing binary:
MDDB_CLI=/path/to/mddb-cli ./scripts/load-md-folder.sh ./docs blog
Server connection refused
Check whether the server is reachable:
mddb-cli stats
If it is not, start it with either of:
make docker-up
make run
Frontmatter not parsed
Ensure frontmatter format:
- Starts with --- on line 1
- Ends with --- on its own line
- Uses key: value format
Integration with CI/CD
GitHub Actions
name: Import Documentation
on:
  push:
    paths:
      - 'docs/**/*.md'
jobs:
  import:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Install MDDB CLI
        run: |
          wget https://github.com/tradik/mddb/releases/latest/download/mddb-cli-latest-linux-amd64.tar.gz
          tar xzf mddb-cli-latest-linux-amd64.tar.gz
          sudo mv mddb-cli /usr/local/bin/
      - name: Import Documentation
        env:
          MDDB_SERVER: ${{ secrets.MDDB_SERVER }}
        run: |
          ./scripts/load-md-folder.sh ./docs documentation -r -m "version=${{ github.sha }}"
GitLab CI
import-docs:
  stage: deploy
  script:
    - chmod +x scripts/load-md-folder.sh
    - ./scripts/load-md-folder.sh ./docs documentation -r
  only:
    - main
Best Practices
- Always do a dry run first when importing production data
- Use meaningful collection names that reflect content type
- Add version metadata for tracking changes
- Use recursive mode for organized folder structures
- Include frontmatter in markdown files for rich metadata
- Test with small batches before large imports
- Monitor server resources during large imports
- Back up the database before major imports