Concurrent websocket write bug #21

Closed
opened 2026-03-01 00:27:19 +03:00 by skobkin · 0 comments
Owner
  • The actual unsafe code is in hub.go:63: Hub.Emit calls c.WriteMessage(...) directly on shared *websocket.Conn values with no per-connection write serialization.
  • gorilla/websocket requires one concurrent writer per connection. If two goroutines call Emit at the same time, they can write to the same socket concurrently and panic with concurrent write to websocket connection.
  • 3f5d913 added a second periodic emitter path by splitting stats and heartbeat into separate goroutines in server.go:221 and app.go:100. That makes overlapping hub.Emit calls more likely.
  • The current refactor (91fad39066) did not change the WebSocket hub logic. It still passes hub.Emit from the same places and does not add any locking around writes.

One nuance:

  • Even before 3f5d913, ingest could already emit from MQTT handling while the stats ticker emitted from another goroutine, via service.go:255 and the other s.emitter.Emit(...) calls. So the bug existed already.
  • 3f5d913 likely increased the reproduction rate by adding another independent emitter.

So my assessment is:

  • Not caused by the current refactor.
  • Likely surfaced by 3f5d913.
  • Fundamentally caused by internal/api/ws/hub.go not enforcing single-writer semantics.

Natural next steps:

  1. Add a package-owned WS client type with a write mutex or dedicated write loop/channel.
  2. Add a regression test that calls Emit concurrently and verifies no panic / dropped-client handling.
- The actual unsafe code is in `hub.go:63`: Hub.Emit calls `c.WriteMessage(...)` directly on shared `*websocket.Conn` values with no per-connection write serialization. - `gorilla/websocket` requires one concurrent writer per connection. If two goroutines call `Emit` at the same time, they can write to the same socket concurrently and panic with concurrent write to websocket connection. - 3f5d913 added a second periodic emitter path by splitting stats and heartbeat into separate goroutines in `server.go:221` and `app.go:100`. That makes overlapping `hub.Emit` calls more likely. - The current refactor (91fad390661391817126e64ed41e2086cb1c3349) did not change the WebSocket hub logic. It still passes `hub.Emit` from the same places and does not add any locking around writes. One nuance: - Even before 3f5d913, ingest could already emit from MQTT handling while the stats ticker emitted from another goroutine, via `service.go:255` and the other `s.emitter.Emit(...)` calls. So the bug existed already. - 3f5d913 likely increased the reproduction rate by adding another independent emitter. So my assessment is: - Not caused by the current refactor. - Likely surfaced by 3f5d913. - Fundamentally caused by `internal/api/ws/hub.go` not enforcing single-writer semantics. Natural next steps: 1. Add a package-owned WS client type with a write mutex or dedicated write loop/channel. 2. Add a regression test that calls Emit concurrently and verifies no panic / dropped-client handling.
skobkin self-assigned this 2026-03-01 00:27:19 +03:00
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
skobkin/meshmap-lite#21
No description provided.