adr/ADR-56.md (159 additions, 0 deletions)

| Revision | Date | Author | Info |
|----------|------------|-----------------------------|---------------------------------------------------|
| 1 | 2025-09-12 | @ripienaar, @MauriceVanVeen | Initial document for R1 `async` persistence model |
| 2 | 2025-10-28 | @MauriceVanVeen | Add read consistencies |
| 3 | 2025-12-05 | @MauriceVanVeen | Add design for linearizable reads |

## Context and Problem Statement

The interactions between `PersistMode:async` and `sync:always` are as follows:
* When the user provides no value for `PersistMode`, the implied default is `default`, but the server will not set this in the configuration; the result of INFO requests will also have it unset
* Setting `PersistMode` to anything other than empty/absent will require API Level 2

## Read Consistencies

The table below describes the read consistencies currently supported by the JetStream API, from the highest consistency
level to the lowest. (A request sketch for these APIs follows the table.)

| Stream configuration | JetStream API | Description | Level of consistency |
|:-----------------------------------------|:----------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `AllowDirect` disabled. | `$JS.API.STREAM.MSG.GET.<stream>` | An API meant only for management outside of the hot path. The request goes to every server, but is normally only answered by the stream leader. | Current highest level of read consistency. Only the stream leader answers, but stale reads are technically possible after leader changes or during network partitions since an old leader could still answer before the current leader does. |
| `AllowDirect` enabled. | `$JS.API.DIRECT.GET.<stream>` | If the stream is replicated, the followers will also answer read requests. | Higher availability read responses but with lower consistency. A read request will be randomly served by a server hosting the stream. Recently written data is not guaranteed to be returned on a subsequent read request. |
| `MirrorDirect` enabled on mirror stream. | `$JS.API.DIRECT.GET.<stream>` | If the stream is mirrored, the mirror can also answer read requests. For example a mirror stream in a different cluster or on a leaf node. | Higher availability with potential of fast local read responses but with lowest consistency. Mirrors can be in any relative state to the source. |
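
For illustration, the sketch below issues the same lookup against both of these APIs using plain core NATS requests. The stream name `ORDERS` and subject `orders.new` are examples only; `last_by_subj` selects the latest message on a subject. The Msg Get response is a JSON API envelope, while the Direct Get response is the raw message with metadata carried in headers.

```go
package main

import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Close()

	// Both APIs accept a JSON body selecting a message, e.g. by sequence or
	// by the last message on a subject.
	req := []byte(`{"last_by_subj":"orders.new"}`)

	// Management-oriented Msg Get: normally answered by the stream leader,
	// response is a JSON API envelope.
	if msg, err := nc.Request("$JS.API.STREAM.MSG.GET.ORDERS", req, 2*time.Second); err == nil {
		fmt.Printf("msg get: %s\n", msg.Data)
	}

	// Direct Get (requires AllowDirect): any peer hosting the stream may answer,
	// response is the raw message with metadata in headers such as Nats-Sequence.
	if msg, err := nc.Request("$JS.API.DIRECT.GET.ORDERS", req, 2*time.Second); err == nil {
		fmt.Printf("direct get: seq=%s body=%s\n", msg.Header.Get("Nats-Sequence"), msg.Data)
	}
}
```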

Additionally, if a stream is replicated and a consumer is created, there is no guarantee that the consumer can
immediately observe all the messages written at that time. For example, an R1 consumer might be created on a follower
that is not yet up to date on all writes. The consumer will eventually observe all the writes as it keeps fetching new
messages as they come in.

Reviewer:

This example feels a little ambiguous, and it is not clear if you are describing an actual problem or not.

Member Author:

It is an actual problem. Imagine a replicated object store where you write a new file. A watcher (using a consumer) will be notified of this file, and a new consumer could be created to read in the chunks of that just-written file. However, since there are no guarantees about what the consumer might see, the consumer could get only halfway through the file and the server then says "no more pending", because it is a stream follower that does not yet have all the writes.

## Proposal to add linearizability

Newer server versions, such as 2.14+, should support more configurability and, in general, higher levels of consistency
as an opt-in. For example, higher read consistency for consumers can be achieved by having consumer CRUD operations go
through the stream's Raft log instead of the Meta Raft log, which ensures that a consumer created at time X in the
stream log can observe all the stream writes up to time X.

Specifically, higher-consistency message read requests would roughly require:

- `AllowDirect` should not need to be disabled. The `$JS.API.STREAM.MSG.GET.<stream>` API, used when `AllowDirect` is
disabled, has significant overhead since these requests go to ALL servers, not just the servers hosting the stream.
- Batch requests should also be supported; Direct Get allows them, but the Msg Get API above does not.

Linearizable reads would be desirable, but at a minimum users should be able to opt in to session-level guarantees such
as read-your-writes and monotonic reads.

Reviewer:

Would it make sense to have an additional section for session guarantees? If I remember correctly, you implemented session guarantees, but it introduced some complexity. So I wonder if it would make sense to document your attempt and your conclusions. And maybe we could think of alternative ways to achieve session guarantees. Or maybe you did that somewhere else already?

Member Author:

That prior attempt and design is included here: #358

I'm tempted to solely focus on fully linearizable reads though, and not "intermediate" session guarantees. Mostly because there are use cases involving multiple processes which require linearizable reads, and those can't be solved by (single-process/client) session guarantees.


### Discussion

#### What would be expected given different topologies?
Member Author:

Starting a thread for thoughts around this topic:

Location transparency is a big topic for NATS. With that in mind, "it should not matter" where you are connected in the topology. If you need a high-consistency read, you shouldn't need to care where you are connected and whether the stream happens to be in the same place.

I'd say that a client connected to a leaf node, connected to a cluster, gateway-ed to another cluster that hosts the stream, should be able not only to write to that stream but also to get a consistent read when it requires one. We shouldn't require the client to be connected directly to the cluster hosting the stream.


For a replicated stream, the minimum to expect would be that, within the cluster, the read response will always be aware
of the writes performed up to that point: either only the writes performed by the process doing the read request, or
all writes performed by all processes on the same stream (to be discussed later).

But what about when the stream is mirrored on a leaf node and a client is connected to the leaf node? Or similarly if
the client is connected to another cluster in a super cluster, and the stream is mirrored there?

Writes still only go to the source, which will be aware of all writes to the stream up to that point. So a new write may
be immediately reflected in a read request when the client is connected directly to the cluster, but perhaps not when
connected to the leaf node. Is that okay given the topology, or would the expectation be that the client can always get
the most consistent view of a stream without being "location-specific"?

#### Should all read requests get higher consistency, or only a few?
Member Author:

Starting a thread for thoughts around this topic:

On the one hand, having all reads share a certain consistency level is nice and clear; on the other hand, that means reads become all-or-nothing, valuing availability OR consistency with no in-between.

I'd say there should be an option for a hybrid approach. If a leaf node has a mirror of a given stream and N clients all perform reads, it's fine for these to value availability. But if a single process happens to write to that mirrored stream, with the write going to the origin, then it should also have the option of such a consistent read. That would mean a hybrid approach, where client-side you can decide between availability and consistency on a case-by-case basis.


For example, if this were a stream setting like `ReadConsistency: weak/high`, setting it to `high` would mean that ALL
read requests are served by the stream leader only. This has the side effect of lower availability when there's no
leader available at a given time.

But does this actually need to be for ALL read requests, or only a select few?

If high availability is valued, then the current Direct Get API could still be used while high consistency read requests
could be served by the stream leader only. Would such a hybrid approach even be desirable, given that now the app
developer will need to decide per process or app which consistency level to use? Is this additional complexity worth the
flexibility?
Collaborator:

As discussed during the previous attempts, it's critical that one should be able to know that all access to a certain stream is done safely. That is, if the stream is a financial wallet, the stream should be able to enforce that all access to it must be done in a consistent manner.

That does not mean it always has to be that way; we can have a mode where a hybrid approach is used and both consistent and eventually consistent reads are allowed.

But it must be possible to configure a stream to only accept consistent access and reject other attempts against it.

Member Author:

> ... one should be able to know that all access to a certain stream is done safely

Agree, and that could perhaps be achieved by a stream setting like `ReadConsistency: high`, where all reads are ensured to be linearizable.

The hybrid could be an opt-in on a per-request basis to enable this high consistency for that particular request, even if the stream is not enforcing `ReadConsistency: high` because it's set to `weak`, for example.

> ... the stream should be able to enforce that all access to it must be done in a consistent manner.

The only doubt I have is around "what if on a separate cluster/leaf somebody makes a mirror of that stream?". The act of adding a mirror will violate this guarantee, regardless of the source having `ReadConsistency: high`, since the mirror can't know the source was configured that way.

Collaborator:

Yes, we discussed that we will reasonably be able to have in-cluster consistency only. If possible, we might say a high-consistency stream cannot have a mirror-direct-enabled mirror?

> The hybrid could be on an opt-in per request basis

This is exactly what we want to avoid when supporting a mode where the stream is in a strict high-consistency mode only.

Member Author:

> This is exactly what we want to avoid when supporting a mode where the stream is in a strict high consistency mode only

Indeed, if the stream is in strict high-consistency mode only, there's no way to disable that.
I meant that if the stream is in weak consistency, you can on a per-request basis still opt in to high consistency when required.

> if possible we might say a high consistency stream cannot have a mirror direct enabled mirror?

Not sure if this is possible. Mirrors can be made on a leaf node while not even having a connection to what they mirror from. Especially if we allow updating the `ReadConsistency` value, there's no way to enforce that a mirror isn't created against it.

Collaborator:

> Indeed, if the stream is in strict high consistency mode only, there's no way to disable that.

Good stuff.

Re mirrors, that's what I expected, so it might be a combination of documentation and, for example, not allowing mirror direct to be set if you enable high consistency on the mirror, or something similar (just an example, probably a bad one). The idea is to have some way to at least communicate that this is weird and shouldn't be allowed.


#### What should the performance considerations be?
Member Author:

Starting a thread for thoughts around this topic:

"Leader leases" or "distributed locks" don't have strict guarantees, especially when considering clock skew/drift/etc. Under this mode, leader elections would need to trigger much more slowly, which would prevent quick leader elections during faults where the previous leader can't signal it's stepping down.

I'd say we take the path taken by others, like etcd, and have reads "go through Raft", as this gives a strict guarantee of linearizable reads when opted in. I realize that reads should be faster than writes, and reads should not generally slow down the throughput of writes. This means the read should not result in an append entry being written into the Raft log, as the throughput would then be shared with writes. Instead, we can optimize this by NOT writing additional data into the log and instead "piggybacking" on writes that are happening in parallel. Meaning, if a replicated write happens and the leader has sent it out to its followers but has not received quorum yet, then N reads can simply wait for this write to have quorum, with no need to write additional data into the log. Reads are then simply "queued" after the latest writes, ensuring the least impact while unlocking the highest guarantees.

Reviewer:

About the leader leases:

> "Leader leases" or "distributed locks" don't have strict guarantees, especially when considering clock skew/drift/etc.

I agree on this one. Leases may violate the guarantees, although it would be unlikely if the lease duration is chosen in a sensible way.

> Under this mode leader elections would need to trigger way slower, which would prevent quick leader elections during faults where the previous leader can't signal it's stepping down.

I'm not sure I follow. A faulty leader can signal it is stepping down? If a leader wants to step down, it can as well stop serving local reads.

In practice, a lease would be held for a time that is less than the election timeout (and some more to account for max clock drift and latency). We currently use a 4-second election timeout, which I think is plenty.

Member Author:

> I'm not sure I follow. A faulty leader can signal it is stepping down? If a leader wants to step down, it can as well stop serving local reads.

Indeed, a leader that has stepped down, i.e. is not the leader anymore, need not serve local reads under the "consistent mode".
Where a faulty leader can't signal it is stepping down is when the fault prevents it. One example could be a hard kill, or a network partition where the current leader doesn't know another leader has been elected.

If we were to use leader leases, that would mean we need to ensure a new leader can't be elected before the lease expires. With the minimum election timeout of 4 seconds, and heartbeats every 1 second (or earlier if a normal append entry gets sent), would a "lease timeout" of, say, 2 seconds suffice? Meaning, as long as we've got a response from a quorum of servers and the monotonic timestamps we've seen are within this "lease timeout", then we can serve a local read?

Reviewer:

A partition or a hard kill would prevent sending a step-down request regardless of whether leases are used or not. The system has to rely on the election timeouts anyway.

If one can implement leases while keeping the same election timeout we have now, it won't make a difference. Keep in mind that even if you renew a lease every second or so, you can serve lots and lots of local reads in that time.
I'm not necessarily pushing for leases, just want to put it in the right perspective.

Reviewer:

About reads through Raft:

> Instead, we can optimize this by NOT writing additional data in the log

It is a good idea to optimize reads by not logging any additional data. However, piggybacking reads onto writes would not perform well in read-heavy workloads (I suppose that KV workloads could be read-heavy, for example).
I think it is possible to avoid logging read requests. I would do it like this: when a read request is received, the leader marks the request with the current value of the index, that is, the index at which we want to read from. Then the leader sends heartbeats to the followers and waits for responses from a majority. The responses will confirm that the leader is in fact still the leader, and we can serve the read as soon as the read request's index has been applied in the upper layer. The Raft paper also hints at such a solution. Towards the end of section 8 in https://raft.github.io/raft.pdf:

> Read-only operations can be handled without writing anything into the log. However, with no additional measures, this
> would run the risk of returning stale data, since the leader responding to the request might have been superseded by a
> newer leader of which it is unaware. Linearizable reads must not return stale data, and Raft needs two extra
> precautions to guarantee this without using the log. First, a leader must have the latest information on which entries
> are committed. The Leader Completeness Property guarantees that a leader has all committed entries, but at the start
> of its term, it may not know which those are. To find out, it needs to commit an entry from its term. Raft handles
> this by having each leader commit a blank no-op entry into the log at the start of its term. Second, a leader must
> check whether it has been deposed before processing a read-only request (its information may be stale if a more recent
> leader has been elected). Raft handles this by having the leader exchange heartbeat messages with a majority of the
> cluster before responding to read-only requests.

I think we already have the first requirement in place. We would need to implement the second part.

Member Author (@MauriceVanVeen, Nov 12, 2025):

> A partition or a hard kill would prevent sending step-down request regardless of using leases or not. The system has to rely on the election timeouts anyways.

Indeed, just mentioning it because the current leader will not step down based on election timeouts, but does its own `lostQuorumCheck` and has its own `lostQuorumInterval`. No big deal for leases, but it means the current server will think it's the leader for longer than the 4-second minimum election timeout.

Member Author:

> when a read request is received, the leader marks the request with the current value of the index, that is, the index at which we want to read from. Then the leader sends heartbeats to the followers, and waits responses from a majority.
> ...
> I think we already have the first requirement in place. We would need to implement the second part.

Indeed, we'd only need the second part.

When I mentioned "piggybacking on writes", that assumes there are sufficient writes to piggyback on. If writes are actively happening, there's no need for an additional heartbeat message to be sent. But if the workload is sufficiently read-heavy that there are no active writes, a heartbeat would need to be used instead, exactly as you're describing.

I believe we're fully aligned here then 😄

Reviewer:

Ahh OK, I initially misunderstood your proposal. All clear now.


Tradeoffs can be made regarding performance versus consistency.

Over-simplifying, there are two options:

- Reads go through Raft. This is the simplest way to implement it and to ensure no stale reads happen, but it requires
an additional network round trip for consensus (a sketch of this approach follows the list).
- Reads do not go through Raft. This requires a mechanism like a "leader lease"; read requests can be answered
immediately as before, but it requires timeout tuning and makes a new leader election take much longer to happen.
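
To make the first option concrete, here is an illustrative sketch of the "read index" style approach discussed above. This is not nats-server code; all types and method names are hypothetical. The leader captures its commit index, confirms it is still the leader via a quorum round (a heartbeat, or piggybacking on an in-flight append), waits for that index to be applied, and only then serves the read locally, without appending anything to the Raft log.

```go
// Package readindex sketches a linearizable read that never writes to the log.
package readindex

import (
	"context"
	"errors"
)

// raftNode is a hypothetical, minimal view of what the leader would need.
type raftNode interface {
	IsLeader() bool
	CommitIndex() uint64                                  // highest committed index known to this leader
	ConfirmLeadership(ctx context.Context) error          // quorum heartbeat, or piggyback on an in-flight append
	WaitApplied(ctx context.Context, index uint64) error  // block until the state machine has applied >= index
}

var errNotLeader = errors.New("not leader")

// linearizableRead serves a read without appending to the Raft log:
//  1. capture the current commit index as the "read index",
//  2. confirm we are still leader by reaching a quorum,
//  3. wait until the read index is applied, then read local state.
func linearizableRead(ctx context.Context, n raftNode, read func() ([]byte, error)) ([]byte, error) {
	if !n.IsLeader() {
		return nil, errNotLeader
	}
	readIndex := n.CommitIndex()
	if err := n.ConfirmLeadership(ctx); err != nil { // step 2: heartbeat quorum / piggyback on a write
		return nil, err
	}
	if err := n.WaitApplied(ctx, readIndex); err != nil { // step 3: local state has caught up
		return nil, err
	}
	return read() // safe: no newer leader could have committed beyond readIndex unseen
}
```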

Having reads go through Raft is essentially what etcd also did:
> When we evaluated etcd 0.4.1 in 2014, we found that it exhibited stale reads by default due to an optimization. While
> the Raft paper discusses the need to thread reads through the consensus system to ensure liveness, etcd performed reads
> on any leader, locally, without checking to see whether a newer leader could have more recent state. The etcd team
> implemented an optional quorum flag, and in version 3.0 of the etcd API, made linearizability the default for all
> operations except for watches.
> - https://jepsen.io/analyses/etcd-3.4.3 (2020-01-30)

Having leader leases is essentially what YugabyteDB did:
> Within a shard, Raft ensures linearizability for all operations which go through Raft’s log. However, for performance
> reasons, YugaByte DB does not use Raft’s consensus for reads. Instead, it cheats: reads return the local state from any
> Raft leader immediately, using leader leases to ensure safety. Using `CLOCK_MONOTONIC` for leases (instead of
> `CLOCK_REALTIME`) insulates YugaByte DB from some classes of clock error, such as leap seconds.
> - https://jepsen.io/analyses/yugabyte-db-1.1.9 (2019-03-26)

Generally we hear that users are willing to pay a performance "penalty" for higher consistency. But there are a few
things to consider:

- The earlier point of "Should all read requests get higher consistency, or only a few?": are all reads considered
equal, or should there be a hybrid approach? If hybrid, then going through Raft for only some reads probably makes the
most sense.
- Leader leases are tricky to implement and can (under niche conditions) still result in stale reads. Do we prefer being
able to strictly guarantee no stale reads?
- In some ways NATS KV can be considered similar to etcd's KV; should we make similar choices?

> etcd ensures linearizability for all other operations by default. Linearizability comes with a cost, however, because linearized requests must go through the Raft consensus process. To obtain lower latencies and higher throughput for read requests, clients can configure a request’s consistency mode to serializable, which may access stale data with respect to quorum, but removes the performance penalty of linearized accesses’ reliance on live consensus.
> - https://etcd.io/docs/v3.5/learning/api_guarantees/

### Design

The design introduces 'linearizable reads' to JetStream by adding a new API specifically for this purpose. This allows
location-transparent access: it doesn't matter if a client is connected via a leaf node and several hops away from the
stream leader; if it requires linearizable reads, it can use this new API to get that guarantee. Additionally, this API
is enabled through a new stream setting: `AllowDirectLeader`. If enabled, the leader will also respond to
`$JS.API.DIRECT.GET.<stream>` without requiring `AllowDirect`. This allows clients to migrate away from the
`$JS.API.STREAM.MSG.GET.<stream>` API, which is primarily meant for management purposes.

- Introduce a new API for linearizable reads: `$JS.API.DIRECT_LEADER.GET.<stream>`.
- The new API will be similar to `$JS.API.DIRECT.GET.<stream>` but will go to the stream leader only. If it's a
replicated stream, the read will need to "go through Raft" to ensure linearizability.
- The new API will be enabled by an `AllowDirectLeader` setting on the stream. Once enabled, the new API will be active,
  and the leader will also respond to `$JS.API.DIRECT.GET.<stream>` without requiring `AllowDirect`. This allows clients
  to use the DirectGet API instead of the MsgGet API (a client-side selection sketch follows this list):
- If `AllowDirect` is set, the client should use `$JS.API.DIRECT.GET.<stream>` by default.
- If the user specifies requiring linearizable reads:
- If `AllowDirectLeader` is NOT set, then the client should return an error that 'linearizable reads are not
enabled for this stream'.
- If `AllowDirectLeader` is set, then the client should use the new `$JS.API.DIRECT_LEADER.GET.<stream>` API.
- If the client does not know the current value of `AllowDirectLeader` (since it might not have access to the
stream info), then the client should use the new `$JS.API.DIRECT_LEADER.GET.<stream>` API anyway. A '503 No
Responders' error will be returned to the user, which will either mean there's temporarily no leader
available, or the stream is not configured to allow linearizable reads, but the client can't differentiate
between these two cases.
- If `AllowDirect` is NOT set but `AllowDirectLeader` is set, then the client should use
`$JS.API.DIRECT.GET.<stream>`. The consistency guarantees would be the same as having disabled `AllowDirect` and
the `$JS.API.STREAM.MSG.GET.<stream>` API being used, but without the additional overhead of that API.
- If none are specified, the client falls back to the `$JS.API.STREAM.MSG.GET.<stream>` API.
- The `AllowDirectLeader` setting will require API Level 3.
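
The client-side selection rules above could look roughly like the following sketch. This is hypothetical: `AllowDirectLeader` and the `DIRECT_LEADER` subject are the proposed additions and do not exist in current clients.

```go
// Package jsread sketches how a client might pick the read subject.
package jsread

import "fmt"

// streamInfo is a hypothetical, minimal view of the stream configuration.
type streamInfo struct {
	AllowDirect       bool
	AllowDirectLeader bool // proposed setting, API Level 3
}

// readSubject picks which API a get-message request should use.
// info may be nil if the client has no access to the stream info.
func readSubject(stream string, info *streamInfo, linearizable bool) (string, error) {
	if linearizable {
		// Unknown config: try the leader API anyway; a "503 No Responders" then
		// means either no leader is available or linearizable reads are not enabled.
		if info == nil || info.AllowDirectLeader {
			return fmt.Sprintf("$JS.API.DIRECT_LEADER.GET.%s", stream), nil
		}
		return "", fmt.Errorf("linearizable reads are not enabled for this stream")
	}
	if info != nil && (info.AllowDirect || info.AllowDirectLeader) {
		// With only AllowDirectLeader set, the leader still answers DIRECT.GET.
		return fmt.Sprintf("$JS.API.DIRECT.GET.%s", stream), nil
	}
	// Fall back to the management-oriented Msg Get API.
	return fmt.Sprintf("$JS.API.STREAM.MSG.GET.%s", stream), nil
}
```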

The addition of `AllowDirectLeader` allows for various levels of read consistency, from weakest to strongest (an
illustrative summary in code follows the list):

- Leader/Follower/Mirror reads, high availability with the potential of fast local read responses from a mirror but no
cross-request/session consistency guarantees, weakest consistency: `AllowDirect` and `MirrorDirect` enabled, and a
request to `$JS.API.DIRECT.GET.<stream>`.
- Leader/Follower reads, high availability with no cross-request/session consistency guarantees: `AllowDirect` enabled,
any (up-to-date enough) peer answers requests to `$JS.API.DIRECT.GET.<stream>`.
- Leader reads, high-consistency fast responses, but with the potential of inconsistency during leader changes or
network partitions: `AllowDirect` disabled, and a request to `$JS.API.DIRECT.GET.<stream>` (`AllowDirectLeader`
enabled) or `$JS.API.STREAM.MSG.GET.<stream>`.
- Linearizable reads, highest level of consistency: `AllowDirectLeader` enabled with a request to
`$JS.API.DIRECT_LEADER.GET.<stream>` which is only answered by the stream leader if it can guarantee linearizability.
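
As a compact illustration only (these are not real client or server types), the four levels can be summarised as pairs of stream settings and the request subject used, here for a hypothetical stream named `ORDERS`. `AllowDirectLeader` and the `DIRECT_LEADER` subject are the proposed additions; the other settings exist today.

```go
// Package readlevels lists the read paths described above, weakest to strongest.
package readlevels

type readPath struct {
	AllowDirect       bool
	MirrorDirect      bool // set on the mirror stream
	AllowDirectLeader bool // proposed, API Level 3
	Subject           string
}

var levels = []readPath{
	// Weakest: mirror reads; the subject uses the mirror's own stream name in practice.
	{AllowDirect: true, MirrorDirect: true, Subject: "$JS.API.DIRECT.GET.ORDERS"},
	// Leader/follower reads within the stream's peer set.
	{AllowDirect: true, Subject: "$JS.API.DIRECT.GET.ORDERS"},
	// Leader reads: fast, but stale reads are possible around leader changes.
	{AllowDirectLeader: true, Subject: "$JS.API.DIRECT.GET.ORDERS"},
	// Strongest: linearizable reads, answered by the leader only after a Raft round.
	{AllowDirectLeader: true, Subject: "$JS.API.DIRECT_LEADER.GET.ORDERS"},
}
```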

This design makes linearizability an opt-in and a conscious choice by a user. The server provides all the tools required
for various consistency levels. Clients can ease the user experience by offering:

- Per-request linearizable read opt-in. For example: `js.GetMsg("my-stream", 1, nats.Linearizable())` and
`js.GetLastMsg("stream", "subject", nats.Linearizable())`. This allows the user to value availability by default, but
opt in to linearizable reads for the requests that need it.
- Per-object linearizable read opt-in. Opt-in to linearizable reads for a specific stream, KV or Object Store. All reads
to that 'object' will use linearizable read requests by default, without needing the user to specify this on a
per-request basis. For example:
`js.CreateKeyValue(ctx, jetstream.KeyValueConfig{Bucket: "TEST", Replicas: 3, LinearizableReads: true})`.

The clients are free to implement this in a way that's best for the given language, but should generally provide both
the per-request and per-object options.