Node view cache #28

opened 2026-03-07 03:55:10 +03:00 by skobkin · 0 comments

Proposal: Canonical Node View Cache for Realtime Enrichment

Context

This document proposes an optimization for realtime node-data enrichment.

The idea is to reduce repeated SQLite reads in hot live paths by introducing a canonical node-view projection and, optionally, a small in-memory cache behind it.

The main motivation is reducing database load and improving realtime-path efficiency. A secondary benefit is that snapshot and realtime paths can share the same view-building logic.

Problem Summary

Current shape:

                     BOOTSTRAP / SNAPSHOT PATH
+---------+      +-------------+      +------------------+      +-----------+
| Browser | ---> | HTTP API    | ---> | ReadStore/SQLite | ---> | JSON DTOs |
+---------+      +-------------+      +------------------+      +-----------+


                       LIVE / REALTIME PATH
+------+      +----------------+      +------------------+      +--------+
| MQTT | ---> | Ingest Service | ---> | WriteStore/SQLite| ---> | WS Hub |
+------+      +----------------+      +------------------+      +--------+
                    |
                    v
           build WS payload directly
           from parsed packet / partial data

This creates two different ways to build the same logical node view:

  • Snapshot path: merged and enriched state from persistence.
  • Realtime path: sparse event-derived state.

That split is not automatically wrong, but it makes enrichment behavior harder to centralize and can force extra point reads whenever realtime payloads need canonical node information.

Goal

Introduce a single canonical node-view path that can be used by:

  • snapshot REST endpoints
  • realtime WebSocket payload generation

Optional secondary goal:

  • reduce per-event SQLite enrichment reads in hot realtime paths

Non-Goal

This is not a proposal to make an in-memory store the source of truth.

The database/repository must remain authoritative.

Option 1: Post-Write Read-Through Cache

Flow

packet arrives
  -> persist write
  -> evict cache[node_id]
  -> if live payload needs canonical node view:
       projector.get(node_id)
         -> cache miss
         -> read DB
         -> fill cache
         -> emit WS payload

Schema

+------+      +----------------+      +------------------+
| MQTT | ---> | Ingest Service | ---> | WriteStore/SQLite|
+------+      +----------------+      +------------------+
                    |                         |
                    | evict cache[node_id]    |
                    v                         |
             +------------------+             |
             | Node View Cache  | <-----------+
             +------------------+
                    |
                    v
             +------------------+
             | ReadStore        |
             | GetNodeDetails   |
             +------------------+
                    |
                    v
                 WS Hub

Pros

  • simplest and safest implementation
  • keeps merge semantics authoritative in persistence layer
  • low risk of cache/DB divergence
  • enough to remove repeated reads for hot nodes after first miss
  • useful first step if we want fast delivery with low refactor risk

Cons

  • still performs one read after writes when canonical live payload is needed
  • realtime latency still depends on DB on cache miss
  • does not eliminate serialized read-after-write pressure on single-connection SQLite
  • merge/projection logic remains split between persistence and live service
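The read-through flow above can be sketched in Go. This is a minimal illustration, not the repo's actual API: `NodeView`, `NodeViewCache`, and the loader callback are all hypothetical names, and the loader stands in for a `ReadStore` point read.

```go
package main

import (
	"fmt"
	"sync"
)

// NodeView is a stand-in for the canonical node-view shape; the real
// projection would carry whatever GetNodeDetails returns today.
type NodeView struct {
	ID          string
	DisplayName string
}

// NodeViewCache memoizes reads by node ID. The database stays
// authoritative: entries are filled on miss and evicted on write.
type NodeViewCache struct {
	mu    sync.Mutex
	views map[string]*NodeView
}

func NewNodeViewCache() *NodeViewCache {
	return &NodeViewCache{views: make(map[string]*NodeView)}
}

// Get returns a cached view or falls through to load (a SQLite point
// read in practice) and caches the result.
func (c *NodeViewCache) Get(id string, load func(string) (*NodeView, error)) (*NodeView, error) {
	c.mu.Lock()
	if v, ok := c.views[id]; ok {
		c.mu.Unlock()
		return v, nil
	}
	c.mu.Unlock()

	v, err := load(id)
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.views[id] = v
	c.mu.Unlock()
	return v, nil
}

// Evict is called after any successful write affecting the node, so the
// next read re-projects from the authoritative store.
func (c *NodeViewCache) Evict(id string) {
	c.mu.Lock()
	delete(c.views, id)
	c.mu.Unlock()
}

func main() {
	cache := NewNodeViewCache()
	dbReads := 0
	load := func(id string) (*NodeView, error) {
		dbReads++ // stands in for a ReadStore point read
		return &NodeView{ID: id, DisplayName: "node-" + id}, nil
	}

	cache.Get("a1", load) // miss -> DB read
	cache.Get("a1", load) // hit  -> served from memory
	cache.Evict("a1")     // packet persisted -> invalidate
	cache.Get("a1", load) // miss -> DB read again

	fmt.Println(dbReads) // 2
}
```

A production version would likely also collapse concurrent misses for the same node (e.g. with `golang.org/x/sync/singleflight`) so a burst does not trigger several identical reads.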

Option 2: Canonical Projector with Immediate Cache Update

Flow

packet arrives
  -> merge into canonical node view
  -> persist canonical result
  -> update cache with same canonical result
  -> emit WS payload from canonical result

Schema

                           SHARED CANONICAL PATH
                 +-----------------------------------+
                 | Node Projector / Node Aggregator  |
                 | owns canonical node-view rules    |
                 +-----------------------------------+
                          |                  |
                          v                  v
                   persist canonical      update cache
                      state/result         same result


+------+      +----------------+      +-----------------------+      +--------+
| MQTT | ---> | Ingest Service | ---> | Projector + Repository| ---> | WS Hub |
+------+      +----------------+      +-----------------------+      +--------+
                                               |
                                               v
                                          SQLite store

Pros

  • best architectural consistency model
  • no post-write enrichment read required
  • live and snapshot payloads can use the same projection rules
  • best latency under bursty traffic
  • avoids duplicated view-building logic across layers

Cons

  • highest implementation cost
  • requires careful extraction of merge logic from current SQL-only behavior
  • must preserve existing semantics for:
    • UpsertNode field preservation
    • telemetry merge behavior
    • timestamp update rules
  • larger refactor with more surface area to test
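The shape of the Option 2 flow can be sketched in Go. Everything here is an illustrative placeholder under stated assumptions: `Projector`, `Repo`, `NodePacket`, and the single merge rule are not the repo's real interfaces, only the "merge once, persist and cache the same result" pattern is the point.

```go
package main

import "fmt"

// NodePacket is an illustrative partial update parsed from MQTT.
type NodePacket struct {
	ID          string
	DisplayName string // may be empty: packets are sparse
}

// NodeView is the canonical projected state.
type NodeView struct {
	ID          string
	DisplayName string
	Updates     int
}

// Repo is a placeholder for the authoritative store.
type Repo interface {
	Load(id string) *NodeView
	Save(v *NodeView)
}

type memRepo struct{ m map[string]*NodeView }

func (r *memRepo) Load(id string) *NodeView { return r.m[id] }
func (r *memRepo) Save(v *NodeView)         { r.m[v.ID] = v }

// Projector owns the canonical merge rules. Apply merges a packet into
// the current view, persists the result, refreshes the cache with the
// same result, and returns it for WS fanout -- no post-write read.
type Projector struct {
	repo  Repo
	cache map[string]*NodeView
}

func (p *Projector) Apply(pkt NodePacket) *NodeView {
	cur := p.repo.Load(pkt.ID)
	if cur == nil {
		cur = &NodeView{ID: pkt.ID}
	}
	next := *cur
	// The canonical merge rule lives in exactly one place:
	if pkt.DisplayName != "" {
		next.DisplayName = pkt.DisplayName
	}
	next.Updates++

	p.repo.Save(&next)      // persist canonical result
	p.cache[pkt.ID] = &next // update cache with the same result
	return &next            // emit WS payload from this
}

func main() {
	p := &Projector{repo: &memRepo{m: map[string]*NodeView{}}, cache: map[string]*NodeView{}}
	p.Apply(NodePacket{ID: "a1", DisplayName: "Alpha"})
	v := p.Apply(NodePacket{ID: "a1"}) // sparse packet omits the name
	fmt.Println(v.DisplayName, v.Updates) // Alpha 2
}
```

Because persistence, cache, and WS payload all come from the same merged value, the cache can never drift from what was written, which is exactly the consistency property the option is after.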

Option 3: Patch Cache Directly from Partial Ingest Payloads

Flow

packet arrives
  -> write DB
  -> patch cache from incoming partial payload
  -> emit WS payload from patched cache

Pros

  • low latency
  • avoids follow-up reads
  • implementation may look small initially

Cons

  • highest correctness risk
  • duplicates merge logic across ingest paths
  • easy to regress on field-preservation semantics
  • creates another place where canonical node-view rules must be maintained
  • hard to reason about once more packet types are added

Recommendation

Do not implement this option.

Why This Matters in This Repo

Current repository behavior is not plain replacement:

  • node upserts preserve existing non-empty fields
  • telemetry uses merge semantics rather than overwrite-all
  • some node-level timestamps are updated from related writes

That means "just patch the cache from the packet" is not equivalent to persisted state.

If cache is introduced, it should reflect canonical merged state, not raw ingest fragments.
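To make the divergence concrete, here is a hedged Go sketch contrasting a preserve-non-empty upsert (as the bullet list above describes) with a naive "patch the cache from the packet" approach. The field names are illustrative, not the repo's schema.

```go
package main

import "fmt"

// Node holds two illustrative fields; real nodes carry many more.
type Node struct {
	DisplayName string
	HWModel     string
}

// upsert mirrors preserve-non-empty semantics: incoming non-empty
// fields win, existing non-empty fields survive when the packet
// omits them.
func upsert(existing, incoming Node) Node {
	out := existing
	if incoming.DisplayName != "" {
		out.DisplayName = incoming.DisplayName
	}
	if incoming.HWModel != "" {
		out.HWModel = incoming.HWModel
	}
	return out
}

func main() {
	stored := Node{DisplayName: "Alpha", HWModel: "TBEAM"}
	packet := Node{DisplayName: "Alpha-2"} // sparse: no HWModel

	db := upsert(stored, packet) // what persistence actually keeps
	patched := packet            // naive "patch cache from packet"

	fmt.Println(db.HWModel)      // TBEAM
	fmt.Println(patched.HWModel) // "" -- cache has diverged from DB
}
```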

Rough Performance Expectations

These are rough estimates only.

Assumptions:

  • SQLite uses a single open connection in current config
  • point reads are cheap in isolation but serialized with writes
  • main value comes from reducing read-after-write enrichment in realtime paths

If cache is used only for occasional display-name enrichment

  • likely small benefit
  • maybe low single-digit or low-teens percentage reduction in DB work on busy live paths
  • probably not user-visible by itself

If cache backs canonical live payload generation for node.*, log.event, and similar events

  • moderate benefit is plausible on busy instances
  • rough expectation: ~20-40% reduction in DB operations on those paths
  • p95 live-event latency may improve noticeably during bursts because enrichment reads stop competing with writes

Where the real value is

  • fewer enrichment reads in realtime paths
  • lower read-after-write pressure on single-connection SQLite
  • one clear path for canonical node views
  • simpler reasoning about what data shape is emitted live

Consistency is an important design benefit, but the primary reason to do this work is performance and architectural simplification around realtime reads.

Recommended Rollout

Phase 1: Define Canonical Projection

  • define a canonical node-view shape used by both REST and WS
  • centralize node display-name and other view-level fallback rules

Phase 2: Add Safe Cache

  • add a small in-memory cache keyed by node_id
  • use read-through behavior on cache miss
  • invalidate on successful writes affecting the node

Phase 3: Consider Immediate Cache Updates

  • only after canonical projection/merge logic is centralized
  • update cache from canonical merged result, not from raw packets
  • then use cached canonical state for WS fanout

Suggested Design Constraints

  • DB/repository remains source of truth
  • cache is an optimization, not an authority
  • do not introduce ad hoc write hooks in multiple layers
  • avoid separate business rules for SQL, cache, and WS payload builders
  • keep node-view projection explicit and testable

Open Questions

  • Should the canonical live payload be identical to GetNodeDetails, or a smaller dedicated WS node view?
  • Should the projector live in internal/ingest, internal/domain, or a dedicated projection package?
  • Do we want simple invalidation first, or is the expected event volume high enough to justify immediate cache updates from day one?
  • Should frontend stores also be adjusted so partial events merge safely instead of replacing richer state?

Recommendation Summary

Recommended direction:

  1. Do not introduce a parallel in-memory node storage system with ad hoc synchronization hooks.
  2. Introduce a canonical node-view projection path shared by snapshot and realtime flows.
  3. If optimization is needed, add a read-through cache behind that projection path.
  4. Only update cache immediately after writes if the update uses canonical merged state.

Short version:

Good idea:
  cache as an optimization behind canonical node-view projection

Bad idea:
  second mutable node store patched directly from partial ingest payloads
Reference
skobkin/meshmap-lite#28