Add ?bbox= viewport filtering to /api/v1/topology/edges #78

Open
opened 2026-06-08 15:51:46 +03:00 by skobkin · 1 comment
Owner

Summary

Add viewport (?bbox=) filtering to GET /api/v1/topology/edges so the "Show all topology" client (see #72) can fetch only the edges visible in the current map view.

Why

Follow-up to #72. Without viewport filtering, the endpoint returns every edge in the mesh up to the soft cap (web.map.topology_max_edges, default 2000). For larger deployments the client either hits the cap and sees a misleading "(2000+)" count, or we raise the cap and inflate every payload. bbox is the obvious fix: only request edges whose
endpoints fall inside the current Leaflet viewport, then refetch on moveend (debounced). The user gets a complete local view cheaply.

Concerns

Filed as "design and measure before implementing" because the current SQLite schema and indexes are not obviously a good fit:

  • Efficient lat BETWEEN ? AND ? AND lng BETWEEN ? against node positions, joined to topology edges, may need a new index, a denormalized position table, or a small in-memory spatial index rebuilt on topology changes.
  • Geo queries on SQLite are workable (with rtree or composite indexes) but should be benchmarked against representative dataset sizes (hundreds to low-thousands of nodes / thousands of edges) before committing to an approach.

Open questions

  1. Bbox is a rectangle, but the topology graph can have long-range links. Should the client also pass a buffer (e.g. extend bbox by 10%) so partially-visible edges aren't dropped on pan?
  2. Does the endpoint already apply topology_evidence_max_age server-side, or is that purely client-side? (Needs verification during design.)
  3. Should we still apply the web.map.topology_max_edges cap on top of bbox for safety, and signal truncated when the cap is hit within the viewport?

Acceptance criteria

☐ ?bbox=lat1,lng1,lat2,lng2 query parameter supported on /api/v1/topology/edges.
☐ Invalid bbox values return 400 with a clear error.
☐ Endpoint still respects web.map.topology_cache_ttl and web.map.topology_max_edges.
☐ openapi.yaml documents the new parameter.
☐ httptest covers: valid bbox returns subset; bbox outside data range returns empty; malformed bbox returns 400.
☐ Benchmark or documented numbers showing the query is fast enough for a 5k-edge, 1k-node dataset.

  • #72 — original feature this unlocks at scale
## Summary Add viewport (?bbox=) filtering to GET /api/v1/topology/edges so the "Show all topology" client (see #72) can fetch only the edges visible in the current map view. ## Why Follow-up to #72. Without viewport filtering, the endpoint returns every edge in the mesh up to the soft cap (web.map.topology_max_edges, default 2000). For larger deployments the client either hits the cap and sees a misleading "(2000+)" count, or we raise the cap and inflate every payload. bbox is the obvious fix: only request edges whose endpoints fall inside the current Leaflet viewport, then refetch on moveend (debounced). The user gets a complete local view cheaply. ## Concerns Filed as "design and measure before implementing" because the current SQLite schema and indexes are not obviously a good fit: - Efficient lat BETWEEN ? AND ? AND lng BETWEEN ? against node positions, joined to topology edges, may need a new index, a denormalized position table, or a small in-memory spatial index rebuilt on topology changes. - Geo queries on SQLite are workable (with rtree or composite indexes) but should be benchmarked against representative dataset sizes (hundreds to low-thousands of nodes / thousands of edges) before committing to an approach. ## Open questions 1. Bbox is a rectangle, but the topology graph can have long-range links. Should the client also pass a buffer (e.g. extend bbox by 10%) so partially-visible edges aren't dropped on pan? 2. Does the endpoint already apply topology_evidence_max_age server-side, or is that purely client-side? (Needs verification during design.) 3. Should we still apply the web.map.topology_max_edges cap on top of bbox for safety, and signal truncated when the cap is hit within the viewport? ## Acceptance criteria ☐ ?bbox=lat1,lng1,lat2,lng2 query parameter supported on /api/v1/topology/edges. ☐ Invalid bbox values return 400 with a clear error. ☐ Endpoint still respects web.map.topology_cache_ttl and web.map.topology_max_edges. ☐ openapi.yaml documents the new parameter. ☐ httptest covers: valid bbox returns subset; bbox outside data range returns empty; malformed bbox returns 400. ☐ Benchmark or documented numbers showing the query is fast enough for a 5k-edge, 1k-node dataset. ## Related - #72 — original feature this unlocks at scale
skobkin self-assigned this 2026-06-08 15:51:46 +03:00
Collaborator

The issue is already well-shaped (acceptance criteria, related work, open questions). Concrete suggestions for the how, grounded in the existing Go/SQLite stack.

SQLite path (recommended for v1)

The body itself flags "SQLite + rtree or composite indexes" as the path. Two viable index strategies:

Index Pros Cons Verdict
A. Composite B-tree on (lat, lng) with the bbox query rewritten to a pair of lat BETWEEN ? AND ? + lng BETWEEN ? AND ? No schema extension; works on the existing table; SQLite is well-optimized for ordered scans when the index is selective Two-range query is wider than a true spatial index; on a 5k-edge dataset the planner will probably pick a full scan unless the bbox is small Good enough for v1 with a small dataset; revisit if profiling shows the bbox query is hot
B. SQLite R*Tree (the rtree virtual table, separate from the main nodes table) True spatial index; well-documented performance characteristics for bbox queries Extra table to keep in sync; migration cost; the bbox query has to join rtree back to the main nodes table Best for the "thousands of nodes" use case the body calls out

Recommendation: start with (A), but build the bbox query as a function so (B) is a drop-in if the planner complains. Something like:

// internal/api/topology.go
func edgesInBbox(ctx context.Context, store ReadStore, bbox Bbox) ([]Edge, error) {
    return store.Query(`
        SELECT e.id, e.from_node, e.to_node, e.evidence
        FROM edges e
        JOIN nodes nf ON nf.id = e.from_node
        JOIN nodes nt ON nt.id = e.to_node
        WHERE nf.lat BETWEEN ? AND ? AND nf.lng BETWEEN ? AND ?
          AND nt.lat BETWEEN ? AND ? AND nt.lng BETWEEN ? AND ?
    `, bbox.S, bbox.W, bbox.N, bbox.E, bbox.S, bbox.W, bbox.N, bbox.E)
}

Adding the idx_nodes_lat_lng index is one line in the migration; the query above is then index-friendly.

The three open questions, one answer each

  1. Buffer (10%) on bbox? Yes, do it client-side, not server-side. The client knows the viewport zoom level and can apply a buffer that scales (buffer = max(0.1, 0.05 / zoom) for example). Server stays dumb. This also matches the body: "long-range links" are a display concern, not a data concern.
  2. Does the endpoint already apply topology_evidence_max_age server-side? Worth verifying with a quick grep in the topology handler. If it's client-side, the bbox-filtered query should still respect it — add a WHERE evidence.created_at > ? clause and pass the cutoff as a parameter. One-line change.
  3. Apply topology_max_edges cap on top of bbox? Yes, but the response should also include a truncated: true flag with the visible-vs-total count, so the client can show "(N visible, M total in viewport)" rather than a misleading "(2000+)". One extra field in the response struct.

Buffer expansion math (client-side)

Zoom Suggested buffer
0–4 (continent) 50% (huge links might cross a hemisphere)
5–8 (country/region) 20%
9–12 (city) 10%
13+ (street) 5% (very local — long links are obviously out of view)

These are starting numbers; tune from telemetry. The point is the buffer is in the client's code, so changing it doesn't touch the API.

httptest cases (mapped to the acceptance criteria)

func TestEdgesBboxFiltering(t *testing.T) {
    // 1. valid bbox returns subset
    edges, _ := api.TopologyEdges(testServer, "?bbox=10.0,20.0,30.0,40.0")
    assert(subset(edges, allEdges))
    assert(all(edges, e => 10  e.lat  30 && 20  e.lng  40))

    // 2. bbox outside data range returns empty
    edges, _ = api.TopologyEdges(testServer, "?bbox=-80.0,-170.0,-70.0,-160.0")
    assert(len(edges) == 0)

    // 3. malformed bbox returns 400
    for _, q := range []string{
        "?bbox=10,20,30",          // 3 coords, not 4
        "?bbox=foo,bar,30,40",     // non-numeric
        "?bbox=10,20,30,40,50",    // 5 coords
        "?bbox=",                  // empty
    } {
        _, code := api.TopologyEdges(testServer, q)
        assert(code == 400)
    }

    // 4. ?bbox= invalid (south > north) returns 400
    _, code := api.TopologyEdges(testServer, "?bbox=40,20,10,30")
    assert(code == 400)
}

These pin down the contract and the failure modes.

OpenAPI doc snippet

- name: bbox
  in: query
  required: false
  schema:
    type: string
    pattern: '^-?\d+(\.\d+)?,-?\d+(\.\d+)?,-?\d+(\.\d+)?,-?\d+(\.\d+)?$'
  description: |
    Viewport bounding box as `lat1,lng1,lat2,lng2` (south-west, north-east).
    When present, the endpoint returns only edges whose endpoints fall inside the bbox.
    When absent, behaviour matches the pre-bbox version.
  example: "55.50,37.30,55.95,37.95"

Suggested order of work

  1. Add idx_nodes_lat_lng migration (idempotent; one line).
  2. Implement edgesInBbox as a separate function (no API change yet).
  3. Land the API parameter + OpenAPI doc.
  4. Add the four httptest cases.
  5. Profile with a representative dataset (generate a synthetic 5k-edge, 1k-node graph if there isn't one). If the bbox query is the slow part, swap in the R*Tree path.
  6. Update the client to send bbox and refetch on moveend (debounced ~300ms).

The "design and measure before implementing" note in the issue body is the right instinct — step 5 is the only place where the design choice between (A) and (B) gets settled. Until you've measured, (A) is the right default.

The issue is already well-shaped (acceptance criteria, related work, open questions). Concrete suggestions for the *how*, grounded in the existing Go/SQLite stack. **SQLite path (recommended for v1)** The body itself flags "SQLite + rtree or composite indexes" as the path. Two viable index strategies: | Index | Pros | Cons | Verdict | |---|---|---|---| | **A. Composite B-tree on `(lat, lng)`** with the bbox query rewritten to a pair of `lat BETWEEN ? AND ?` + `lng BETWEEN ? AND ?` | No schema extension; works on the existing table; SQLite is well-optimized for ordered scans when the index is selective | Two-range query is wider than a true spatial index; on a 5k-edge dataset the planner will probably pick a full scan unless the bbox is small | Good enough for v1 with a small dataset; revisit if profiling shows the bbox query is hot | | **B. SQLite R\*Tree** (the `rtree` virtual table, separate from the main `nodes` table) | True spatial index; well-documented performance characteristics for bbox queries | Extra table to keep in sync; migration cost; the bbox query has to join `rtree` back to the main `nodes` table | Best for the "thousands of nodes" use case the body calls out | **Recommendation: start with (A), but build the bbox query as a function so (B) is a drop-in if the planner complains.** Something like: ```go // internal/api/topology.go func edgesInBbox(ctx context.Context, store ReadStore, bbox Bbox) ([]Edge, error) { return store.Query(` SELECT e.id, e.from_node, e.to_node, e.evidence FROM edges e JOIN nodes nf ON nf.id = e.from_node JOIN nodes nt ON nt.id = e.to_node WHERE nf.lat BETWEEN ? AND ? AND nf.lng BETWEEN ? AND ? AND nt.lat BETWEEN ? AND ? AND nt.lng BETWEEN ? AND ? `, bbox.S, bbox.W, bbox.N, bbox.E, bbox.S, bbox.W, bbox.N, bbox.E) } ``` Adding the `idx_nodes_lat_lng` index is one line in the migration; the query above is then index-friendly. **The three open questions, one answer each** 1. **Buffer (10%) on bbox?** Yes, do it client-side, not server-side. The client knows the viewport zoom level and can apply a buffer that scales (`buffer = max(0.1, 0.05 / zoom)` for example). Server stays dumb. This also matches the body: "long-range links" are a *display* concern, not a *data* concern. 2. **Does the endpoint already apply `topology_evidence_max_age` server-side?** Worth verifying with a quick grep in the topology handler. If it's client-side, the bbox-filtered query should still respect it — add a `WHERE evidence.created_at > ?` clause and pass the cutoff as a parameter. One-line change. 3. **Apply `topology_max_edges` cap on top of bbox?** Yes, but the response should *also* include a `truncated: true` flag with the visible-vs-total count, so the client can show "(N visible, M total in viewport)" rather than a misleading "(2000+)". One extra field in the response struct. **Buffer expansion math (client-side)** | Zoom | Suggested buffer | |---|---| | 0–4 (continent) | 50% (huge links might cross a hemisphere) | | 5–8 (country/region) | 20% | | 9–12 (city) | 10% | | 13+ (street) | 5% (very local — long links are obviously out of view) | These are starting numbers; tune from telemetry. The point is the buffer is in the *client's* code, so changing it doesn't touch the API. **httptest cases (mapped to the acceptance criteria)** ```go func TestEdgesBboxFiltering(t *testing.T) { // 1. valid bbox returns subset edges, _ := api.TopologyEdges(testServer, "?bbox=10.0,20.0,30.0,40.0") assert(subset(edges, allEdges)) assert(all(edges, e => 10 ≤ e.lat ≤ 30 && 20 ≤ e.lng ≤ 40)) // 2. bbox outside data range returns empty edges, _ = api.TopologyEdges(testServer, "?bbox=-80.0,-170.0,-70.0,-160.0") assert(len(edges) == 0) // 3. malformed bbox returns 400 for _, q := range []string{ "?bbox=10,20,30", // 3 coords, not 4 "?bbox=foo,bar,30,40", // non-numeric "?bbox=10,20,30,40,50", // 5 coords "?bbox=", // empty } { _, code := api.TopologyEdges(testServer, q) assert(code == 400) } // 4. ?bbox= invalid (south > north) returns 400 _, code := api.TopologyEdges(testServer, "?bbox=40,20,10,30") assert(code == 400) } ``` These pin down the contract and the failure modes. **OpenAPI doc snippet** ```yaml - name: bbox in: query required: false schema: type: string pattern: '^-?\d+(\.\d+)?,-?\d+(\.\d+)?,-?\d+(\.\d+)?,-?\d+(\.\d+)?$' description: | Viewport bounding box as `lat1,lng1,lat2,lng2` (south-west, north-east). When present, the endpoint returns only edges whose endpoints fall inside the bbox. When absent, behaviour matches the pre-bbox version. example: "55.50,37.30,55.95,37.95" ``` **Suggested order of work** 1. Add `idx_nodes_lat_lng` migration (idempotent; one line). 2. Implement `edgesInBbox` as a separate function (no API change yet). 3. Land the API parameter + OpenAPI doc. 4. Add the four httptest cases. 5. Profile with a representative dataset (generate a synthetic 5k-edge, 1k-node graph if there isn't one). If the bbox query is the slow part, swap in the R\*Tree path. 6. Update the client to send bbox and refetch on `moveend` (debounced ~300ms). The "design and measure before implementing" note in the issue body is the right instinct — step 5 is the *only* place where the design choice between (A) and (B) gets settled. Until you've measured, (A) is the right default.
Sign in to join this conversation.
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
skobkin/meshmap-lite#78
No description provided.