Geosearch
MDDB 2.9.11 ships a full geospatial search subsystem: radius and bounding-box queries, two pluggable index algorithms, composition with full-text and vector search, and a Leaflet-backed panel UI. This document describes the data model, the HTTP/MCP/gRPC surfaces, and the operational trade-offs.
Scale envelope: tested up to 100 000 points per collection with sub-millisecond R-tree queries. Fine for "venues/posts near X" workloads. Not a replacement for PostGIS on multi-million-point datasets.
Data model
Coordinates attach to documents via reserved metadata keys. The index extracts them in priority order at write time:
- Explicit lat/lng: geo_lat + geo_lng as float64 strings (decimal degrees, WGS84). Canonical and fastest.
- Geohash: geo_hash (1-12 character geohash string). MDDB decodes it to the centroid of the cell. Useful when your upstream system already has a geohash.
- Postcode: geo_postcode + geo_country, resolved through an opt-in in-memory postcode → (lat, lng) lookup loaded from a CSV file per country. Silent no-op if the lookup is not populated.
---
title: "Joe's Coffee"
geo_lat: "52.5200"
geo_lng: "13.4050"
---
Great espresso in the heart of Berlin.
Or, equivalently:
---
title: "Joe's Coffee"
geo_hash: "u33d8s7"
---
Or, with postcode fallback (requires geo-reindex with loadPostcodes):
---
title: "Joe's Coffee"
geo_postcode: "10115"
geo_country: "DE"
---
Reserved keys: geo_lat, geo_lng, geo_hash, geo_postcode,
geo_country. Do not use these for unrelated metadata or the index
will pick them up.
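The priority order above can be sketched in Go as follows. This is a minimal illustration, not MDDB's actual extraction code: the point and resolver types, the injected geohash decoder, and the choice to fall through to the next source when geo_lat/geo_lng fail to parse are all assumptions.

```go
package main

import (
	"fmt"
	"strconv"
)

// point is a hypothetical resolved coordinate in decimal degrees (WGS84).
type point struct{ Lat, Lng float64 }

// resolver bundles the two lookups this sketch does not implement:
// a geohash decoder and a country -> postcode -> point table.
type resolver struct {
	decodeGeohash func(hash string) (point, bool)
	postcodes     map[string]map[string]point
}

// resolve applies the documented priority: explicit lat/lng, then geohash,
// then postcode. It reports false when no source yields a coordinate.
func (r resolver) resolve(meta map[string]string) (point, bool) {
	if latS, lngS := meta["geo_lat"], meta["geo_lng"]; latS != "" && lngS != "" {
		lat, errLat := strconv.ParseFloat(latS, 64)
		lng, errLng := strconv.ParseFloat(lngS, 64)
		if errLat == nil && errLng == nil && lat >= -90 && lat <= 90 && lng >= -180 && lng <= 180 {
			return point{lat, lng}, true
		}
		// Assumption: unparsable or out-of-range values fall through.
	}
	if h := meta["geo_hash"]; h != "" && r.decodeGeohash != nil {
		if p, ok := r.decodeGeohash(h); ok {
			return p, true
		}
	}
	if pc, cc := meta["geo_postcode"], meta["geo_country"]; pc != "" && cc != "" {
		// Indexing a nil map yields the zero value, so an unpopulated
		// lookup is a silent no-op.
		if p, ok := r.postcodes[cc][pc]; ok {
			return p, true
		}
	}
	return point{}, false
}

func main() {
	// The postcode coordinate below is illustrative sample data.
	r := resolver{postcodes: map[string]map[string]point{"DE": {"10115": {52.5323, 13.3846}}}}
	p, ok := r.resolve(map[string]string{"geo_lat": "52.5200", "geo_lng": "13.4050"})
	fmt.Println(p, ok)
	p, ok = r.resolve(map[string]string{"geo_postcode": "10115", "geo_country": "DE"})
	fmt.Println(p, ok)
}
```

A document that carries both explicit coordinates and a postcode resolves to the explicit coordinates, which matches the priority order in the list.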
Index algorithms
Two algorithms share the same geo BoltDB bucket. They are two
in-memory views of the same persisted data, rebuilt independently at
startup and kept in sync by the write hooks in main.go.
Pick between them at query time via the algorithm field on
/v1/geo-search.
rtree (default)
Implementation: tidwall/rtree, a
pure-Go R-tree with [2]float64 bounding boxes. Each point is stored
as a zero-area bbox keyed by docID. The in-memory structure mirrors
the vector index pattern: an RWMutex-protected per-collection tree
plus a secondary map[docID]geoPoint so we can delete by docID
without scanning.
Strong for: radius queries, bounding-box queries, moderate update frequency. Haversine scoring stays accurate near the poles and the anti-meridian, but the index itself does not wrap across ±180° longitude (see Limitations).
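A radius query against a box index is typically a two-step process: search a bounding box that encloses the circle, then keep only candidates whose haversine distance is within the radius. The sketch below shows both pieces using only the standard library. The function names, the Earth radius constant, and the {lng, lat} box layout are assumptions, not MDDB's code.

```go
package main

import (
	"fmt"
	"math"
)

// Mean Earth radius in meters. An assumption; MDDB may use another value.
const earthRadiusMeters = 6371008.8

// haversineMeters returns the great-circle distance between two points
// given in decimal degrees.
func haversineMeters(lat1, lng1, lat2, lng2 float64) float64 {
	const toRad = math.Pi / 180
	dLat := (lat2 - lat1) * toRad
	dLng := (lng2 - lng1) * toRad
	a := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(lat1*toRad)*math.Cos(lat2*toRad)*math.Sin(dLng/2)*math.Sin(dLng/2)
	return 2 * earthRadiusMeters * math.Asin(math.Min(1, math.Sqrt(a)))
}

// radiusBBox returns a {lng, lat} box that encloses the search circle.
// Longitudes are not wrapped, so the box can extend past ±180°.
func radiusBBox(lat, lng, radiusMeters float64) (lo, hi [2]float64) {
	angular := radiusMeters / earthRadiusMeters // radius as an angle in radians
	dLat := angular * 180 / math.Pi
	minLat, maxLat := lat-dLat, lat+dLat
	if minLat <= -90 || maxLat >= 90 {
		// The circle reaches a pole, so every longitude is a candidate.
		return [2]float64{-180, math.Max(minLat, -90)}, [2]float64{180, math.Min(maxLat, 90)}
	}
	// Exact half-width in longitude of a spherical cap centered at lat.
	dLng := math.Asin(math.Sin(angular)/math.Cos(lat*math.Pi/180)) * 180 / math.Pi
	return [2]float64{lng - dLng, minLat}, [2]float64{lng + dLng, maxLat}
}

func main() {
	lo, hi := radiusBBox(52.52, 13.405, 5000)
	fmt.Println("candidate box:", lo, hi)
	// Distance scoring is unaffected by the ±180° seam.
	fmt.Printf("%.0f m\n", haversineMeters(0, 179.5, 0, -179.5))
}
```

The second call in main illustrates why scoring is safe at the anti-meridian even though the candidate box is not: haversine depends only on the sine of half the longitude difference.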
geohash
Implementation: geo_hash.go + geohash_index.go. Points are encoded
at a fixed precision (geohashIndexPrecision = 8, ~40 m cell) and
kept in a sorted slice per collection. Queries walk the precision
down until the cell is larger than the search radius, binary-search
the slice for that prefix range, and haversine-filter the candidates.
Strong for: BoltDB-native workloads that want to compose with prefix scans on the same hash. Slightly slower than the R-tree for bbox queries (falls back to a linear scan). Useful as a sanity check against the R-tree results on the same data.
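The prefix lookup on the sorted slice can be sketched with two binary searches. This is an illustration under assumptions: the entry type and prefixRange are hypothetical names, and the sketch covers only the prefix-range step. It does not choose the precision or apply the haversine filter.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// entry is a hypothetical element of the per-collection slice.
type entry struct {
	Hash  string // geohash at the fixed index precision
	DocID string
}

// prefixRange returns the half-open range [lo, hi) of entries whose hash
// starts with prefix. The slice must be sorted by Hash.
func prefixRange(entries []entry, prefix string) (lo, hi int) {
	// First entry that sorts at or after the prefix.
	lo = sort.Search(len(entries), func(i int) bool { return entries[i].Hash >= prefix })
	// From lo onward, matching entries are contiguous, so the first
	// non-matching entry ends the range.
	hi = lo + sort.Search(len(entries)-lo, func(i int) bool {
		return !strings.HasPrefix(entries[lo+i].Hash, prefix)
	})
	return lo, hi
}

func main() {
	entries := []entry{
		{"u4pruydq", "doc-d"}, {"u33dc1j2", "doc-c"},
		{"u33d8s7a", "doc-a"}, {"u33dc0cp", "doc-b"},
	}
	sort.Slice(entries, func(i, j int) bool { return entries[i].Hash < entries[j].Hash })
	lo, hi := prefixRange(entries, "u33dc")
	for _, e := range entries[lo:hi] {
		fmt.Println(e.Hash, e.DocID)
	}
}
```

Both searches are O(log n), so the cost of a query is dominated by the number of candidates in the prefix range rather than by the collection size.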
The encoding is the canonical 32-char alphabet
0123456789bcdefghjkmnpqrstuvwxyz, compatible with
geohash.org and most client libraries.
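For reference, the standard geohash encoding over this alphabet can be written in a few lines. The sketch below is a generic implementation of the public algorithm, not a copy of geo_hash.go; the function name and the clamping of precision to 1-12 are assumptions.

```go
package main

import "fmt"

const geohashAlphabet = "0123456789bcdefghjkmnpqrstuvwxyz"

// encodeGeohash interleaves longitude and latitude bisection bits,
// longitude first, and emits one base-32 character per 5 bits.
func encodeGeohash(lat, lng float64, precision int) string {
	if precision < 1 {
		precision = 1
	}
	if precision > 12 {
		precision = 12
	}
	latLo, latHi := -90.0, 90.0
	lngLo, lngHi := -180.0, 180.0
	out := make([]byte, 0, precision)
	ch, bits := 0, 0
	lngTurn := true // even-numbered bits refine longitude
	for len(out) < precision {
		if lngTurn {
			mid := (lngLo + lngHi) / 2
			if lng >= mid {
				ch = ch<<1 | 1
				lngLo = mid
			} else {
				ch <<= 1
				lngHi = mid
			}
		} else {
			mid := (latLo + latHi) / 2
			if lat >= mid {
				ch = ch<<1 | 1
				latLo = mid
			} else {
				ch <<= 1
				latHi = mid
			}
		}
		lngTurn = !lngTurn
		bits++
		if bits == 5 {
			out = append(out, geohashAlphabet[ch])
			ch, bits = 0, 0
		}
	}
	return string(out)
}

func main() {
	fmt.Println(encodeGeohash(42.6, -5.6, 5)) // ezs42
}
```

Because each extra character only appends bits, a shorter hash is always a prefix of the longer hash for the same point, which is what makes prefix scans on the sorted slice meaningful.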
Endpoints
All endpoints accept JSON and return JSON. Write endpoints are gated
by the usual read-only mode middleware.
POST /v1/geo-search
Radius search. Returns results sorted by ascending distance.
{
  "collection": "venues",
  "lat": 52.52,
  "lng": 13.405,
  "radiusMeters": 5000,
  "topK": 10,
  "algorithm": "rtree",
  "filterMeta": {"category": ["coffee"]}
}
Response:
{
  "results": [
    {
      "document": {"id": "...", "key": "joes-coffee", "meta": {...}},
      "distanceMeters": 342.7,
      "rank": 1
    }
  ],
  "total": 1,
  "radiusMeters": 5000,
  "algorithm": "rtree"
}
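A Go client for this endpoint might look like the sketch below. The struct and function names are hypothetical, and the field types are inferred from the example payloads above rather than taken from MDDB's source. The document meta is kept as raw JSON because its exact shape is not specified here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// GeoSearchRequest mirrors the documented request body.
type GeoSearchRequest struct {
	Collection   string              `json:"collection"`
	Lat          float64             `json:"lat"`
	Lng          float64             `json:"lng"`
	RadiusMeters float64             `json:"radiusMeters"`
	TopK         int                 `json:"topK,omitempty"`
	Algorithm    string              `json:"algorithm,omitempty"`
	FilterMeta   map[string][]string `json:"filterMeta,omitempty"`
}

// GeoSearchResult is one hit, ordered by ascending distance.
type GeoSearchResult struct {
	Document struct {
		ID   string          `json:"id"`
		Key  string          `json:"key"`
		Meta json.RawMessage `json:"meta"`
	} `json:"document"`
	DistanceMeters float64 `json:"distanceMeters"`
	Rank           int     `json:"rank"`
}

// GeoSearchResponse mirrors the documented response body.
type GeoSearchResponse struct {
	Results      []GeoSearchResult `json:"results"`
	Total        int               `json:"total"`
	RadiusMeters float64           `json:"radiusMeters"`
	Algorithm    string            `json:"algorithm"`
}

// decodeGeoSearchResponse parses a response body.
func decodeGeoSearchResponse(data []byte) (*GeoSearchResponse, error) {
	var out GeoSearchResponse
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, err
	}
	return &out, nil
}

// postGeoSearch sends the request to baseURL + /v1/geo-search.
func postGeoSearch(client *http.Client, baseURL string, req GeoSearchRequest) (*GeoSearchResponse, error) {
	body, err := json.Marshal(req)
	if err != nil {
		return nil, err
	}
	resp, err := client.Post(baseURL+"/v1/geo-search", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("geo-search: unexpected status %s", resp.Status)
	}
	var out GeoSearchResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return &out, nil
}

func main() {
	sample := []byte(`{"results":[{"document":{"id":"1","key":"joes-coffee","meta":{}},
		"distanceMeters":342.7,"rank":1}],"total":1,"radiusMeters":5000,"algorithm":"rtree"}`)
	resp, err := decodeGeoSearchResponse(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Results[0].Document.Key, resp.Results[0].DistanceMeters)
}
```

A caller would invoke postGeoSearch(http.DefaultClient, baseURL, req) with the base URL of its own MDDB instance and treat any non-200 status as an error.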
POST /v1/geo-within
Axis-aligned bbox search. No ordering is applied.
{
  "collection": "venues",
  "minLat": 52.5,
  "maxLat": 52.6,
  "minLng": 13.3,
  "maxLng": 13.5
}
POST /v1/geo-reindex
Force-rebuild both in-memory indexes from the persisted geo
bucket, optionally loading one or more postcode CSVs first.
Write-gated.
{
  "collection": "venues",
  "loadPostcodes": [
    {"country": "PL", "csvPath": "/var/lib/mddb/postcodes/pl.csv"},
    {"country": "GB", "csvPath": "/var/lib/mddb/postcodes/gb.csv"}
  ]
}
CSV format: postcode,lat,lng (three columns, no header, UTF-8).
MDDB never ships postcode datasets; operators provide their own.
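To check a CSV before pointing geo-reindex at it, a small validator along these lines may help. It is a sketch of a parser for the three-column format described above, not MDDB's loader. The function names are hypothetical and the sample rows are illustrative values.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// loadPostcodes reads postcode,lat,lng rows (no header) into a map from
// postcode to {lat, lng}. Any malformed row aborts the load.
func loadPostcodes(r io.Reader) (map[string][2]float64, error) {
	cr := csv.NewReader(r)
	cr.FieldsPerRecord = 3 // reject rows with a different column count
	table := make(map[string][2]float64)
	for {
		rec, err := cr.Read()
		if err == io.EOF {
			return table, nil
		}
		if err != nil {
			return nil, err
		}
		lat, err := strconv.ParseFloat(strings.TrimSpace(rec[1]), 64)
		if err != nil {
			return nil, fmt.Errorf("postcode %q: bad lat: %w", rec[0], err)
		}
		lng, err := strconv.ParseFloat(strings.TrimSpace(rec[2]), 64)
		if err != nil {
			return nil, fmt.Errorf("postcode %q: bad lng: %w", rec[0], err)
		}
		table[strings.TrimSpace(rec[0])] = [2]float64{lat, lng}
	}
}

// loadPostcodesFromString is a convenience wrapper for in-memory data.
func loadPostcodesFromString(s string) (map[string][2]float64, error) {
	return loadPostcodes(strings.NewReader(s))
}

func main() {
	table, err := loadPostcodesFromString("10115,52.5323,13.3846\n00-001,52.2297,21.0122\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(table), table["10115"])
}
```

Failing on the first bad row is a design choice of this sketch; a loader that skips bad rows and reports a count would be equally reasonable.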
GET /v1/geo-stats
Per-collection point counts + loaded postcode dataset sizes.
POST /v1/geo-encode · POST /v1/geo-decode
Ad-hoc conversion helpers. Useful for building UIs or debugging.
// geo-encode
{"lat": 52.52, "lng": 13.405, "precision": 8}
// → {"geohash": "u33dc1j2", "precision": 8}

// geo-decode
{"geohash": "u33dc1j2"}
// → {"lat": 52.5199..., "lng": 13.4049..., "minLat": ..., "maxLat": ..., "minLng": ..., "maxLng": ...}
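The decode response pairs a centroid with the bounds of the cell. The sketch below shows how both fall out of the standard decoding algorithm. It is a generic implementation with hypothetical names (cell, decodeGeohash), and it accepts only lowercase hashes of 1-12 characters as an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

const geohashAlphabet = "0123456789bcdefghjkmnpqrstuvwxyz"

// cell is the rectangle a geohash denotes, in decimal degrees.
type cell struct{ MinLat, MaxLat, MinLng, MaxLng float64 }

// centroid returns the midpoint of the cell.
func (c cell) centroid() (lat, lng float64) {
	return (c.MinLat + c.MaxLat) / 2, (c.MinLng + c.MaxLng) / 2
}

// decodeGeohash narrows the world rectangle one bit at a time,
// alternating longitude and latitude, longitude first.
func decodeGeohash(hash string) (cell, bool) {
	if hash == "" || len(hash) > 12 {
		return cell{}, false
	}
	c := cell{MinLat: -90, MaxLat: 90, MinLng: -180, MaxLng: 180}
	lngTurn := true
	for i := 0; i < len(hash); i++ {
		idx := strings.IndexByte(geohashAlphabet, hash[i])
		if idx < 0 {
			return cell{}, false // character outside the alphabet
		}
		for bit := 4; bit >= 0; bit-- {
			upper := idx>>uint(bit)&1 == 1
			if lngTurn {
				mid := (c.MinLng + c.MaxLng) / 2
				if upper {
					c.MinLng = mid
				} else {
					c.MaxLng = mid
				}
			} else {
				mid := (c.MinLat + c.MaxLat) / 2
				if upper {
					c.MinLat = mid
				} else {
					c.MaxLat = mid
				}
			}
			lngTurn = !lngTurn
		}
	}
	return c, true
}

func main() {
	c, ok := decodeGeohash("ezs42")
	lat, lng := c.centroid()
	fmt.Println(ok, lat, lng, c)
}
```

The centroid is only as precise as the cell is small, which is why short hashes make coarse coordinates when used as a document's geo_hash.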
Composition with FTS and vector search
POST /v1/hybrid-search grows an optional geo field that spatially
pre-filters the FTS + vector candidate set before rank fusion. This is
the easiest way to write a query like "coffee shops within 5 km of me,
ranked by semantic relevance".
{ "collection": "venues", "query": "coffee", "geo": {"lat": 52.52, "lng": 13.405, "radiusMeters": 5000}, "strategy": "alpha", "alpha": 0.6
}
Each result item gains a distanceMeters field in the composed
response.
GraphQL
GraphQL is not a supported protocol for geosearch in 2.9.10. The
GraphQL subsystem in the project is currently a pre-existing stub
(every query resolver panics with not implemented), and wiring it
up is tracked separately. Until that follow-up PR lands, use REST,
gRPC, or MCP for geo queries.
MCP tools
The read-only geo endpoints are exposed to LLM clients via MCP. Tool names:
geo_search, geo_within, geo_stats, geo_encode, geo_decode.
All are annotated readOnlyHint: true and work in read-only mode.
The write-gated geo-reindex endpoint is not in this tool list.
Panel UI
The panel ships a "Geo Search" tab with a Leaflet + OpenStreetMap map. Click the map to set the query center, drag the slider to change the radius, pick the algorithm and hit Search. Results are drawn as pins and listed to the right; clicking a pin opens the document in the shared viewer.
No map-provider key is needed: OpenStreetMap tiles are used directly
with their public attribution. If you need a different tile source
(Mapbox, Stamen, Carto), edit the tileLayer URL in
services/mddb-panel/src/components/GeoPanel.jsx.
Operational notes
- Startup latency: both indexes load asynchronously from the geo bucket. Queries return HTTP 503 until IsReady() flips. Startup time is roughly linear in the number of points; 100 000 points take ~250 ms on a modern laptop.
- Replication: the geo bucket participates in the standard Binlog replication stream. Follower nodes receive geo upserts and deletes automatically; no extra wiring needed.
- Memory: each point costs ~80 bytes in the R-tree plus ~40 bytes in the geohash slice. 100 k points ≈ 12 MB RSS for both indexes combined.
- Benchmark: go test -bench BenchmarkGeoIndex -benchmem. See services/mddbd/geo_index_test.go for the harness.
Limitations
- Anti-meridian crossing is not supported. Queries that would cross ±180° longitude should be split into two halves by the caller.
- 3D / altitude is not supported. MDDB is strictly 2D.
- Automatic postcode downloads are not supported. MDDB does not ship or fetch any postcode datasets. Bring your own CSV.
- Scale ceiling: beyond ~500 000 points per collection the in-memory R-tree starts to dominate process RSS. For bigger datasets, use PostGIS or a dedicated spatial DB.
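For the anti-meridian limitation, a caller-side split might look like the sketch below. It assumes the caller represents a crossing box with MinLng greater than MaxLng (the convention GeoJSON uses for bounding boxes). The bbox type and splitAntimeridian are hypothetical names. Each returned box maps to one /v1/geo-within request, and the caller merges the two result sets.

```go
package main

import "fmt"

// bbox is an axis-aligned box in decimal degrees. MinLng > MaxLng is
// taken to mean that the box crosses ±180° longitude.
type bbox struct{ MinLat, MaxLat, MinLng, MaxLng float64 }

// splitAntimeridian returns one box when no crossing is involved, or the
// western-hemisphere and eastern-hemisphere halves when the box wraps.
func splitAntimeridian(b bbox) []bbox {
	if b.MinLng <= b.MaxLng {
		return []bbox{b}
	}
	return []bbox{
		{MinLat: b.MinLat, MaxLat: b.MaxLat, MinLng: b.MinLng, MaxLng: 180},
		{MinLat: b.MinLat, MaxLat: b.MaxLat, MinLng: -180, MaxLng: b.MaxLng},
	}
}

func main() {
	// A box around Fiji that spans 177°E to 178°W.
	for _, part := range splitAntimeridian(bbox{MinLat: -20, MaxLat: -15, MinLng: 177, MaxLng: -178}) {
		fmt.Printf("%+v\n", part)
	}
}
```

A point that sits exactly on the seam could be stored as either 180 or -180, so a caller that merges the two result sets may want to deduplicate by document ID.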