NATS subject taxonomy and event envelope (DATAVERKET 007)

Status: Proposed
Tags: integration, messaging, NATS, events, JetStream

Defines the shared NATS subject taxonomy, message envelope, and reliability baseline for Dataverket control-plane communication.

Author: Lars Solem
Updated: 2026-03-14

Dataverket will use NATS for communication between platform components. The repository already points toward an event-based architecture, but it does not yet define the concrete contract that services must follow.

Without a shared NATS taxonomy, each service will invent its own subject naming, payload style, and reliability model. That would make orchestration, auditing, SDK integration, and cross-service debugging unnecessarily fragile.

Decision

Dataverket uses NATS as the internal command and event backbone with a shared subject taxonomy and a standard message envelope.

NATS is also the standard communication mechanism between Dataverket datacenters.

NATS is an internal control-plane transport between Dataverket services and workers. It is not the primary external integration surface for tenant or operator clients.

NATS is used for:

commands between control plane services and workers
domain events emitted by services
task lifecycle events
request/reply where synchronous service coordination is justified
control-plane communication within each datacenter
inter-datacenter control-plane communication

Public clients integrate through the Sentral-owned HTTP API and task resources. NATS subjects and envelopes are internal platform contracts unless a later ADR explicitly promotes a specific stream or bridge to a supported external interface.

NATS is not the long-term system of record. Desired state remains in service databases, primarily PostgreSQL.

Inter-datacenter model

Dataverket supports two or more datacenters, and NATS is the standard communication path between those sites for platform coordination.

This means:

each datacenter uses NATS as its standard internal control-plane transport
site-local services publish and consume through NATS within their datacenter
cross-site coordination also happens through NATS subjects rather than through ad hoc custom protocols
datacenter identity must be explicit in topology, placement, and operational tracing

The inter-datacenter design should prefer site-local streams and explicit cross-site coordination flows over treating every subject as globally shared by default.

NATS is therefore both an intra-datacenter and inter-datacenter control-plane transport.

Subject taxonomy

All NATS subjects use the dv. prefix.

The standard subject families are:

dv.<service>.cmd.<action>
dv.<service>.evt.<entity>.<verb>
dv.task.evt.<verb>
dv.rpc.<service>.<operation>

Examples:

dv.maskin.cmd.provision
dv.maskin.evt.server.provisioned
dv.nett.cmd.apply
dv.plattform.evt.cluster.ready
dv.task.evt.updated
dv.rpc.identitet.introspect_token
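
The four subject families above can be built and checked in one place so that services do not hand-assemble subject strings. The following is a minimal sketch, not a prescribed library: the function names and the token pattern (lowercase letters, digits, underscores) are assumptions made for illustration, and the service list is taken from the naming guidance in this ADR.

```python
import re

# Service identities accepted by this ADR (see "Naming guidance").
SERVICES = frozenset(
    {"sentral", "identitet", "maskin", "plattform", "tjeneste", "objekt", "nett"}
)

# Assumption: a subject token is lowercase ASCII letters, digits, underscores.
# This also keeps NATS wildcards (* and >) and extra dots out of tokens.
_TOKEN = re.compile(r"^[a-z][a-z0-9_]*$")


def _check_tokens(*tokens: str) -> None:
    for token in tokens:
        if not _TOKEN.match(token):
            raise ValueError(f"invalid subject token: {token!r}")


def _check_service(service: str) -> None:
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")


def command_subject(service: str, action: str) -> str:
    """dv.<service>.cmd.<action>"""
    _check_service(service)
    _check_tokens(action)
    return f"dv.{service}.cmd.{action}"


def event_subject(service: str, entity: str, verb: str) -> str:
    """dv.<service>.evt.<entity>.<verb>"""
    _check_service(service)
    _check_tokens(entity, verb)
    return f"dv.{service}.evt.{entity}.{verb}"


def task_event_subject(verb: str) -> str:
    """dv.task.evt.<verb>"""
    _check_tokens(verb)
    return f"dv.task.evt.{verb}"


def rpc_subject(service: str, operation: str) -> str:
    """dv.rpc.<service>.<operation>"""
    _check_service(service)
    _check_tokens(operation)
    return f"dv.rpc.{service}.{operation}"
```

For instance, command_subject("maskin", "provision") yields dv.maskin.cmd.provision, while a service name outside the accepted list raises ValueError instead of producing a subject nobody consumes.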

Semantics by subject family

Commands

Commands express desired work. They are addressed to one service domain and handled by a responsible consumer group.

Commands:

must be durable
must be idempotent
must carry correlation and actor metadata
may produce zero or more follow-up events

Commands should normally be backed by JetStream.
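
One common way to meet the idempotency requirement is to remember which message IDs have already been handled and to skip redelivered copies. The sketch below assumes the command arrives as a decoded envelope dict with id and data fields; the class name and the in-memory set are illustrative only. A real service would more likely record handled IDs in its own database, ideally in the same transaction as the effect.

```python
from typing import Callable, Dict, Set


class IdempotentCommandHandler:
    """Apply a command's effect at most once per message ID (illustrative)."""

    def __init__(self, effect: Callable[[dict], None]) -> None:
        self._effect = effect
        # Assumption: an in-memory set stands in for a durable store here.
        self._handled: Set[str] = set()

    def handle(self, envelope: Dict) -> bool:
        """Return True if the effect ran, False if the message was a duplicate."""
        message_id = envelope["id"]
        if message_id in self._handled:
            return False  # redelivery: acknowledge without repeating the work
        self._effect(envelope["data"])
        # Recorded only after the effect succeeded, so a failed attempt
        # remains eligible for retry.
        self._handled.add(message_id)
        return True
```

Because the ID is recorded after the effect, a crash between the two steps can still repeat the effect once, which is why the effect itself should also be safe to re-run.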

Events

Events are facts about something that already happened inside a service domain.

Events:

are immutable
may be consumed by multiple downstream services
must not be repurposed as hidden RPC responses
should describe domain state transitions, not log spam

Important integration events should be published through JetStream so they can be replayed by consumers.

Task events

Long-running operations must expose task lifecycle through dv.task.evt.*.

The minimum task verbs are:

created
queued
started
progress
succeeded
failed
cancelled
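
A service can check its own task implementation against this minimum set. The sketch below is illustrative: the helper names are hypothetical, and treating succeeded, failed, and cancelled as terminal is an assumption that this ADR does not state explicitly.

```python
from typing import Iterable, List

# Minimum verbs required by this ADR; services may publish additional verbs.
MINIMUM_TASK_VERBS = (
    "created", "queued", "started", "progress",
    "succeeded", "failed", "cancelled",
)

# Assumption: these three verbs end a task's lifecycle.
TERMINAL_TASK_VERBS = frozenset({"succeeded", "failed", "cancelled"})


def task_subject(verb: str) -> str:
    """Build the dv.task.evt.<verb> subject for a lifecycle verb."""
    return f"dv.task.evt.{verb}"


def missing_minimum_verbs(supported: Iterable[str]) -> List[str]:
    """Return the required verbs a service does not yet emit, in list order."""
    have = set(supported)
    return [verb for verb in MINIMUM_TASK_VERBS if verb not in have]


def is_terminal(verb: str) -> bool:
    """True if no further lifecycle events are expected after this verb."""
    return verb in TERMINAL_TASK_VERBS
```

A conformance test could assert that missing_minimum_verbs(...) returns an empty list for every service that runs long operations.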

RPC

dv.rpc.* exists for narrow synchronous interactions where request/reply is materially better than an asynchronous workflow.

RPC should be used sparingly. If the operation changes infrastructure state or may take more than a few seconds, it should be modeled as a command plus task instead.

Standard message envelope

All commands and events must use a common envelope.

The envelope fields are:

specversion: envelope version, initially 1.0
id: unique message ID
type: logical message type, such as maskin.server.provision.requested
source: emitting service, such as sentral or maskin
subject: resource identifier within the emitting domain
time: UTC timestamp in RFC 3339 format
datacontenttype: usually application/json
tenant_id: tenant or organization identifier when applicable
project_id: project identifier when applicable
environment_id: environment identifier when applicable
datacenter_id: datacenter or site identifier when applicable
actor: identity that initiated the action
correlation_id: stable ID shared across a workflow
causation_id: parent message ID that triggered this message
data: message payload

This is intentionally close to CloudEvents structure, but tailored to Dataverket’s control-plane needs.
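
The field list above maps naturally onto a small value type. The following is a sketch under stated assumptions rather than a reference implementation: the class and method names are hypothetical, the actor is simplified to a string, optional identifiers are omitted from the JSON when unset, and follow_up is one possible way to carry correlation_id forward while pointing causation_id at the parent message.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def _utc_now() -> str:
    # RFC 3339 UTC timestamp with a trailing "Z".
    now = datetime.now(timezone.utc)
    return now.isoformat(timespec="milliseconds").replace("+00:00", "Z")


@dataclass(frozen=True)
class Envelope:
    type: str
    source: str
    subject: str
    actor: str  # assumption: a plain string identity
    correlation_id: str
    data: Dict[str, Any]
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    time: str = field(default_factory=_utc_now)
    specversion: str = "1.0"
    datacontenttype: str = "application/json"
    tenant_id: Optional[str] = None
    project_id: Optional[str] = None
    environment_id: Optional[str] = None
    datacenter_id: Optional[str] = None
    causation_id: Optional[str] = None  # assumption: unset on a root message

    def to_json(self) -> str:
        # Assumption: "when applicable" fields are left out when unset.
        return json.dumps({k: v for k, v in asdict(self).items() if v is not None})

    def follow_up(self, *, type: str, source: str, subject: str,
                  data: Dict[str, Any],
                  datacenter_id: Optional[str] = None) -> "Envelope":
        """Create a message caused by this one within the same workflow."""
        return Envelope(
            type=type, source=source, subject=subject, data=data,
            actor=self.actor,
            correlation_id=self.correlation_id,  # stable across the workflow
            causation_id=self.id,                # parent message ID
            tenant_id=self.tenant_id,
            project_id=self.project_id,
            environment_id=self.environment_id,
            datacenter_id=datacenter_id or self.datacenter_id,
        )
```

A command issued by sentral and the event that maskin emits in response would then share one correlation_id, with the event's causation_id set to the command's id.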

Payload rules

Payloads must be JSON objects
Payload schemas must be versioned
Consumers must ignore unknown fields
Producers must not silently change field meaning without a schema version bump
Opaque binary blobs must not be embedded directly in normal message payloads

Large artifacts should be stored elsewhere and referenced by URI or object ID.
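
On the consumer side, these rules amount to: reject anything that is not a JSON object, check the schema version, and keep only the fields the consumer knows. The sketch below is illustrative. This ADR does not say where the schema version is carried, so the schema_version field is an assumption, as are the ProvisionServer payload type and its fields.

```python
import json
from dataclasses import dataclass, fields
from typing import Type, TypeVar

T = TypeVar("T")

# Assumption: the consumer supports exactly these payload schema versions.
SUPPORTED_VERSIONS = frozenset({1})


@dataclass
class ProvisionServer:
    """Hypothetical payload for a provisioning command."""
    schema_version: int
    server_id: str
    image_ref: str  # reference (URI or object ID) to a large artifact


def decode_payload(cls: Type[T], raw: bytes) -> T:
    obj = json.loads(raw)
    if not isinstance(obj, dict):
        raise ValueError("payload must be a JSON object")
    if obj.get("schema_version") not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported schema version: {obj.get('schema_version')!r}")
    known = {f.name for f in fields(cls)}
    # Unknown fields are dropped so newer producers do not break this consumer.
    return cls(**{k: v for k, v in obj.items() if k in known})
```

Note that image_ref carries a reference rather than the image bytes, in line with the rule against embedding binary blobs.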

Reliability model

The platform reliability model is:

PostgreSQL stores desired state and authoritative resource state
JetStream stores durable commands and important integration events
consumers are responsible for idempotent handling
at-least-once delivery is assumed
workflow correlation is mandatory
cross-datacenter links must be treated as failure-prone and partitionable

Cross-site NATS usage must therefore distinguish between:

site-local workflow streams
cross-site coordination subjects
replicated or restorable durable state needed for recovery

No service may assume exactly-once processing.

Services must also not assume permanent low-latency connectivity between datacenters.

Ordering model

Ordering is guaranteed only within a single stream, and only to the extent that consumer configuration and redelivery preserve it. Services must therefore:

not rely on global ordering
tolerate duplicate delivery
validate current state before applying effects
use correlation and causation IDs for workflow reconstruction
tolerate delayed or temporarily partitioned cross-datacenter delivery
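
Validating current state before applying effects can be as simple as a transition table that turns duplicates and stale messages into no-ops. The sketch below is illustrative only: the state names and allowed transitions are hypothetical and are not defined by this ADR.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical resource states and the transitions a consumer will accept.
ALLOWED_TRANSITIONS: Dict[str, FrozenSet[str]] = {
    "requested": frozenset({"provisioning"}),
    "provisioning": frozenset({"ready", "failed"}),
}


def apply_transition(current: str, target: str) -> Tuple[str, bool]:
    """Return (new_state, applied).

    A duplicate delivery (target equals current) or a stale, out-of-order
    message (transition not allowed from the current state) leaves the
    state unchanged instead of raising.
    """
    if target in ALLOWED_TRANSITIONS.get(current, frozenset()):
        return target, True
    return current, False
```

In a real consumer the current state would be read from the service database, which remains authoritative, rather than inferred from message order.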

Failure-handling baseline

The first production implementation must include a baseline policy for retries, poison messages, and replay.

The minimum baseline is:

transient failures may be retried with bounded backoff
repeated permanent failures must transition work into a visible failed task state
poison messages must be diverted to a dead-letter path after bounded retry exhaustion
operators must be able to inspect dead-lettered work with correlation context intact
replay must be an explicit operational action, not an accidental side effect of restart

This does not replace a later detailed ADR, but it establishes the minimum safety bar for building on NATS.
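
The retry and dead-letter parts of this baseline could look roughly like the sketch below. It is not a proposed policy: the function and exception names, the attempt limit, and the delays are placeholders. Only errors marked as transient are retried, and other exceptions propagate to the caller, which is expected to publish the failed task event.

```python
import time
from typing import Callable


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def process_with_retry(
    handler: Callable[[dict], None],
    envelope: dict,
    dead_letter: Callable[..., None],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run handler with bounded exponential backoff.

    Returns "succeeded", or "failed" once retries are exhausted and the
    message has been handed to the dead-letter path.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            handler(envelope)
            return "succeeded"
        except TransientError as exc:
            if attempt == max_attempts:
                # The whole envelope is passed on, so correlation_id and
                # causation_id stay available to operators.
                dead_letter(envelope, reason=str(exc), attempts=attempt)
                return "failed"
            # Delay doubles per attempt and is capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
    return "failed"  # unreachable; kept for type checkers
```

With max_attempts=3 and the default base delay, a handler that always fails transiently is tried three times with waits of 0.5 and 1.0 seconds before the message is dead-lettered.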

Naming guidance

Service segment names should use accepted Dataverket service identities:

sentral
identitet
maskin
plattform
tjeneste
objekt
nett

Additional service names must be introduced by ADR before becoming part of the stable taxonomy.

Consequences

every service now has a uniform NATS contract
observability and auditing become easier because workflows can be correlated consistently
JetStream becomes an infrastructure dependency for orchestration
teams must design idempotent consumers from the start
NATS remains the transport layer, not the authoritative data store
cross-datacenter coordination now shares the same transport and message model as intra-site orchestration

Decision Outcome

Proposed. This ADR records the current preferred direction and still needs acceptance before it becomes binding.

More Information

Topics left for follow-up work:

stream and consumer layout in JetStream
schema registry or schema publication workflow
retry and dead-letter policy
inter-datacenter NATS topology and failure handling

Audit

2026-03-14: ADR proposed.