skobkin/meshmap-lite

Fork 1

Add ?bbox= viewport filtering to /api/v1/topology/edges #78

New issue

Open

opened 2026-06-08 15:51:46 +03:00 by skobkin · 1 comment

skobkin commented

2026-06-08 15:51:46 +03:00

Owner

Summary

Add viewport (?bbox=) filtering to GET /api/v1/topology/edges so the "Show all topology" client (see #72) can fetch only the edges visible in the current map view.

Why

Follow-up to #72. Without viewport filtering, the endpoint returns every edge in the mesh up to the soft cap (web.map.topology_max_edges, default 2000). For larger deployments the client either hits the cap and sees a misleading "(2000+)" count, or we raise the cap and inflate every payload. bbox is the obvious fix: only request edges whose
endpoints fall inside the current Leaflet viewport, then refetch on moveend (debounced). The user gets a complete local view cheaply.

Concerns

Filed as "design and measure before implementing" because the current SQLite schema and indexes are not obviously a good fit:

Efficient lat BETWEEN ? AND ? AND lng BETWEEN ? against node positions, joined to topology edges, may need a new index, a denormalized position table, or a small in-memory spatial index rebuilt on topology changes.
Geo queries on SQLite are workable (with rtree or composite indexes) but should be benchmarked against representative dataset sizes (hundreds to low-thousands of nodes / thousands of edges) before committing to an approach.

Open questions

Bbox is a rectangle, but the topology graph can have long-range links. Should the client also pass a buffer (e.g. extend bbox by 10%) so partially-visible edges aren't dropped on pan?
Does the endpoint already apply topology_evidence_max_age server-side, or is that purely client-side? (Needs verification during design.)
Should we still apply the web.map.topology_max_edges cap on top of bbox for safety, and signal truncated when the cap is hit within the viewport?

Acceptance criteria

☐ ?bbox=lat1,lng1,lat2,lng2 query parameter supported on /api/v1/topology/edges.
☐ Invalid bbox values return 400 with a clear error.
☐ Endpoint still respects web.map.topology_cache_ttl and web.map.topology_max_edges.
☐ openapi.yaml documents the new parameter.
☐ httptest covers: valid bbox returns subset; bbox outside data range returns empty; malformed bbox returns 400.
☐ Benchmark or documented numbers showing the query is fast enough for a 5k-edge, 1k-node dataset.

#72 — original feature this unlocks at scale

## Summary Add viewport (?bbox=) filtering to GET /api/v1/topology/edges so the "Show all topology" client (see #72) can fetch only the edges visible in the current map view. ## Why Follow-up to #72. Without viewport filtering, the endpoint returns every edge in the mesh up to the soft cap (web.map.topology_max_edges, default 2000). For larger deployments the client either hits the cap and sees a misleading "(2000+)" count, or we raise the cap and inflate every payload. bbox is the obvious fix: only request edges whose endpoints fall inside the current Leaflet viewport, then refetch on moveend (debounced). The user gets a complete local view cheaply. ## Concerns Filed as "design and measure before implementing" because the current SQLite schema and indexes are not obviously a good fit: - Efficient lat BETWEEN ? AND ? AND lng BETWEEN ? against node positions, joined to topology edges, may need a new index, a denormalized position table, or a small in-memory spatial index rebuilt on topology changes. - Geo queries on SQLite are workable (with rtree or composite indexes) but should be benchmarked against representative dataset sizes (hundreds to low-thousands of nodes / thousands of edges) before committing to an approach. ## Open questions 1. Bbox is a rectangle, but the topology graph can have long-range links. Should the client also pass a buffer (e.g. extend bbox by 10%) so partially-visible edges aren't dropped on pan? 2. Does the endpoint already apply topology_evidence_max_age server-side, or is that purely client-side? (Needs verification during design.) 3. Should we still apply the web.map.topology_max_edges cap on top of bbox for safety, and signal truncated when the cap is hit within the viewport? ## Acceptance criteria ☐ ?bbox=lat1,lng1,lat2,lng2 query parameter supported on /api/v1/topology/edges. ☐ Invalid bbox values return 400 with a clear error. ☐ Endpoint still respects web.map.topology_cache_ttl and web.map.topology_max_edges. ☐ openapi.yaml documents the new parameter. ☐ httptest covers: valid bbox returns subset; bbox outside data range returns empty; malformed bbox returns 400. ☐ Benchmark or documented numbers showing the query is fast enough for a 5k-edge, 1k-node dataset. ## Related - #72 — original feature this unlocks at scale

skobkin added the

enhancement

label

2026-06-08 15:51:46 +03:00

skobkin self-assigned this

2026-06-08 15:51:46 +03:00

skobkin referenced this issue

2026-06-08 16:03:50 +03:00

feat: show all topology overlay on the map (closes #72) #77

skobkin-agent commented

2026-06-11 06:10:51 +03:00

Collaborator

The issue is already well-shaped (acceptance criteria, related work, open questions). Concrete suggestions for the how, grounded in the existing Go/SQLite stack.

SQLite path (recommended for v1)

The body itself flags "SQLite + rtree or composite indexes" as the path. Two viable index strategies:

Index	Pros	Cons	Verdict
A. Composite B-tree on `(lat, lng)` with the bbox query rewritten to a pair of `lat BETWEEN ? AND ?` + `lng BETWEEN ? AND ?`	No schema extension; works on the existing table; SQLite is well-optimized for ordered scans when the index is selective	Two-range query is wider than a true spatial index; on a 5k-edge dataset the planner will probably pick a full scan unless the bbox is small	Good enough for v1 with a small dataset; revisit if profiling shows the bbox query is hot
*B. SQLite RTree** (the `rtree` virtual table, separate from the main `nodes` table)	True spatial index; well-documented performance characteristics for bbox queries	Extra table to keep in sync; migration cost; the bbox query has to join `rtree` back to the main `nodes` table	Best for the "thousands of nodes" use case the body calls out

Recommendation: start with (A), but build the bbox query as a function so (B) is a drop-in if the planner complains. Something like:

// internal/api/topology.go
func edgesInBbox(ctx context.Context, store ReadStore, bbox Bbox) ([]Edge, error) {
    return store.Query(`
        SELECT e.id, e.from_node, e.to_node, e.evidence
        FROM edges e
        JOIN nodes nf ON nf.id = e.from_node
        JOIN nodes nt ON nt.id = e.to_node
        WHERE nf.lat BETWEEN ? AND ? AND nf.lng BETWEEN ? AND ?
          AND nt.lat BETWEEN ? AND ? AND nt.lng BETWEEN ? AND ?
    `, bbox.S, bbox.W, bbox.N, bbox.E, bbox.S, bbox.W, bbox.N, bbox.E)
}

Adding the idx_nodes_lat_lng index is one line in the migration; the query above is then index-friendly.

The three open questions, one answer each

Buffer (10%) on bbox? Yes, do it client-side, not server-side. The client knows the viewport zoom level and can apply a buffer that scales (buffer = max(0.1, 0.05 / zoom) for example). Server stays dumb. This also matches the body: "long-range links" are a display concern, not a data concern.
Does the endpoint already apply topology_evidence_max_age server-side? Worth verifying with a quick grep in the topology handler. If it's client-side, the bbox-filtered query should still respect it — add a WHERE evidence.created_at > ? clause and pass the cutoff as a parameter. One-line change.
Apply topology_max_edges cap on top of bbox? Yes, but the response should also include a truncated: true flag with the visible-vs-total count, so the client can show "(N visible, M total in viewport)" rather than a misleading "(2000+)". One extra field in the response struct.

Buffer expansion math (client-side)

Zoom	Suggested buffer
0–4 (continent)	50% (huge links might cross a hemisphere)
5–8 (country/region)	20%
9–12 (city)	10%
13+ (street)	5% (very local — long links are obviously out of view)

These are starting numbers; tune from telemetry. The point is the buffer is in the client's code, so changing it doesn't touch the API.

httptest cases (mapped to the acceptance criteria)

func TestEdgesBboxFiltering(t *testing.T) {
    // 1. valid bbox returns subset
    edges, _ := api.TopologyEdges(testServer, "?bbox=10.0,20.0,30.0,40.0")
    assert(subset(edges, allEdges))
    assert(all(edges, e => 10 ≤ e.lat ≤ 30 && 20 ≤ e.lng ≤ 40))

    // 2. bbox outside data range returns empty
    edges, _ = api.TopologyEdges(testServer, "?bbox=-80.0,-170.0,-70.0,-160.0")
    assert(len(edges) == 0)

    // 3. malformed bbox returns 400
    for _, q := range []string{
        "?bbox=10,20,30",          // 3 coords, not 4
        "?bbox=foo,bar,30,40",     // non-numeric
        "?bbox=10,20,30,40,50",    // 5 coords
        "?bbox=",                  // empty
    } {
        _, code := api.TopologyEdges(testServer, q)
        assert(code == 400)
    }

    // 4. ?bbox= invalid (south > north) returns 400
    _, code := api.TopologyEdges(testServer, "?bbox=40,20,10,30")
    assert(code == 400)
}

These pin down the contract and the failure modes.

OpenAPI doc snippet

- name: bbox
  in: query
  required: false
  schema:
    type: string
    pattern: '^-?\d+(\.\d+)?,-?\d+(\.\d+)?,-?\d+(\.\d+)?,-?\d+(\.\d+)?$'
  description: |
    Viewport bounding box as `lat1,lng1,lat2,lng2` (south-west, north-east).
    When present, the endpoint returns only edges whose endpoints fall inside the bbox.
    When absent, behaviour matches the pre-bbox version.
  example: "55.50,37.30,55.95,37.95"

Suggested order of work

Add idx_nodes_lat_lng migration (idempotent; one line).
Implement edgesInBbox as a separate function (no API change yet).
Land the API parameter + OpenAPI doc.
Add the four httptest cases.
Profile with a representative dataset (generate a synthetic 5k-edge, 1k-node graph if there isn't one). If the bbox query is the slow part, swap in the R*Tree path.
Update the client to send bbox and refetch on moveend (debounced ~300ms).

The "design and measure before implementing" note in the issue body is the right instinct — step 5 is the only place where the design choice between (A) and (B) gets settled. Until you've measured, (A) is the right default.

The issue is already well-shaped (acceptance criteria, related work, open questions). Concrete suggestions for the *how*, grounded in the existing Go/SQLite stack. **SQLite path (recommended for v1)** The body itself flags "SQLite + rtree or composite indexes" as the path. Two viable index strategies: | Index | Pros | Cons | Verdict | |---|---|---|---| | **A. Composite B-tree on `(lat, lng)`** with the bbox query rewritten to a pair of `lat BETWEEN ? AND ?` + `lng BETWEEN ? AND ?` | No schema extension; works on the existing table; SQLite is well-optimized for ordered scans when the index is selective | Two-range query is wider than a true spatial index; on a 5k-edge dataset the planner will probably pick a full scan unless the bbox is small | Good enough for v1 with a small dataset; revisit if profiling shows the bbox query is hot | | **B. SQLite R\*Tree** (the `rtree` virtual table, separate from the main `nodes` table) | True spatial index; well-documented performance characteristics for bbox queries | Extra table to keep in sync; migration cost; the bbox query has to join `rtree` back to the main `nodes` table | Best for the "thousands of nodes" use case the body calls out | **Recommendation: start with (A), but build the bbox query as a function so (B) is a drop-in if the planner complains.** Something like: ```go // internal/api/topology.go func edgesInBbox(ctx context.Context, store ReadStore, bbox Bbox) ([]Edge, error) { return store.Query(` SELECT e.id, e.from_node, e.to_node, e.evidence FROM edges e JOIN nodes nf ON nf.id = e.from_node JOIN nodes nt ON nt.id = e.to_node WHERE nf.lat BETWEEN ? AND ? AND nf.lng BETWEEN ? AND ? AND nt.lat BETWEEN ? AND ? AND nt.lng BETWEEN ? AND ? `, bbox.S, bbox.W, bbox.N, bbox.E, bbox.S, bbox.W, bbox.N, bbox.E) } ``` Adding the `idx_nodes_lat_lng` index is one line in the migration; the query above is then index-friendly. **The three open questions, one answer each** 1. **Buffer (10%) on bbox?** Yes, do it client-side, not server-side. The client knows the viewport zoom level and can apply a buffer that scales (`buffer = max(0.1, 0.05 / zoom)` for example). Server stays dumb. This also matches the body: "long-range links" are a *display* concern, not a *data* concern. 2. **Does the endpoint already apply `topology_evidence_max_age` server-side?** Worth verifying with a quick grep in the topology handler. If it's client-side, the bbox-filtered query should still respect it — add a `WHERE evidence.created_at > ?` clause and pass the cutoff as a parameter. One-line change. 3. **Apply `topology_max_edges` cap on top of bbox?** Yes, but the response should *also* include a `truncated: true` flag with the visible-vs-total count, so the client can show "(N visible, M total in viewport)" rather than a misleading "(2000+)". One extra field in the response struct. **Buffer expansion math (client-side)** | Zoom | Suggested buffer | |---|---| | 0–4 (continent) | 50% (huge links might cross a hemisphere) | | 5–8 (country/region) | 20% | | 9–12 (city) | 10% | | 13+ (street) | 5% (very local — long links are obviously out of view) | These are starting numbers; tune from telemetry. The point is the buffer is in the *client's* code, so changing it doesn't touch the API. **httptest cases (mapped to the acceptance criteria)** ```go func TestEdgesBboxFiltering(t *testing.T) { // 1. valid bbox returns subset edges, _ := api.TopologyEdges(testServer, "?bbox=10.0,20.0,30.0,40.0") assert(subset(edges, allEdges)) assert(all(edges, e => 10 ≤ e.lat ≤ 30 && 20 ≤ e.lng ≤ 40)) // 2. bbox outside data range returns empty edges, _ = api.TopologyEdges(testServer, "?bbox=-80.0,-170.0,-70.0,-160.0") assert(len(edges) == 0) // 3. malformed bbox returns 400 for _, q := range []string{ "?bbox=10,20,30", // 3 coords, not 4 "?bbox=foo,bar,30,40", // non-numeric "?bbox=10,20,30,40,50", // 5 coords "?bbox=", // empty } { _, code := api.TopologyEdges(testServer, q) assert(code == 400) } // 4. ?bbox= invalid (south > north) returns 400 _, code := api.TopologyEdges(testServer, "?bbox=40,20,10,30") assert(code == 400) } ``` These pin down the contract and the failure modes. **OpenAPI doc snippet** ```yaml - name: bbox in: query required: false schema: type: string pattern: '^-?\d+(\.\d+)?,-?\d+(\.\d+)?,-?\d+(\.\d+)?,-?\d+(\.\d+)?$' description: | Viewport bounding box as `lat1,lng1,lat2,lng2` (south-west, north-east). When present, the endpoint returns only edges whose endpoints fall inside the bbox. When absent, behaviour matches the pre-bbox version. example: "55.50,37.30,55.95,37.95" ``` **Suggested order of work** 1. Add `idx_nodes_lat_lng` migration (idempotent; one line). 2. Implement `edgesInBbox` as a separate function (no API change yet). 3. Land the API parameter + OpenAPI doc. 4. Add the four httptest cases. 5. Profile with a representative dataset (generate a synthetic 5k-edge, 1k-node graph if there isn't one). If the bbox query is the slow part, swap in the R\*Tree path. 6. Update the client to send bbox and refetch on `moveend` (debounced ~300ms). The "design and measure before implementing" note in the issue body is the right instinct — step 5 is the *only* place where the design choice between (A) and (B) gets settled. Until you've measured, (A) is the right default.

skobkin-agent referenced this issue

2026-06-15 03:06:06 +03:00

Recent updates dialogue #89