Releases: nats-io/nats-server
Release v2.11.0-RC.2
Changelog
This release also includes all fixes and improvements from 2.10.26 and earlier.
Go Version
- 1.24.0 (#6508)
Dependencies
- golang.org/x/crypto v0.36.0 (#6618)
- golang.org/x/sys v0.31.0 (#6618)
- golang.org/x/time v0.11.0 (#6618)
- github.com/google/go-tpm v0.9.3 (#6295)
- github.com/antithesishq/antithesis-sdk-go v0.4.3-default-no-op (#6164)
Added
General
- Distributed message tracing (#5014, #5057)
  - A message with the `Nats-Trace-Dest` header set to a valid subject will receive events representing what happens to the message as it moves through the system
  - Events contain information such as ingress, subject mapping, stream exports, service imports, egress to subscriptions, routes, gateways or leafnodes
  - An additional `Nats-Trace-Only` header, if set to `true`, will produce the same tracing events but will not deliver the message to the final destination
- Configuration state digest (#4325)
  - A hash of the configuration file can be generated using the `-t` option on the command line
  - The hash of the currently running configuration file can be seen in the `config_digest` option in `varz`
- Enable scoped users to have templates that are not limited to a subject token (#5981)
JetStream
- Per-message TTLs (#6272, #6354, #6363, #6370, #6376, #6385, #6400)
  - The `Nats-TTL` header, provided either as a string duration (`1m`, `30s`) or an integer in seconds, will age out the message independently of stream limits
  - More information on this is available in ADR-43
- Subject delete markers on `MaxAge` (#6378, #6389, #6393, #6400, #6404, #6428, #6432)
  - The `SubjectDeleteMarkerTTL` stream configuration option determines whether to place marker messages and how long they should live for
  - The marker message will have a `Nats-Marker-Reason` header explaining which limit caused the marker to be left behind
  - More information on this is available in ADR-43
- Pull consumer priority groups with pinning and overflow (#5814, #6078, #6081)
  - Allows patterns such as one consumer receiving all messages but handing over to a second consumer if the first one fails, or groups of clients accessing the same consumer with different priorities
  - The `PriorityGroups` and `PriorityPolicy` options in the consumer configuration control the policy
  - More information on this is available in ADR-42
- Consumer pausing (#5066)
  - The `PauseUntil` consumer configuration option suspends message delivery to the consumer until the time specified is reached, after which point it will resume automatically
- Asset versioning (#5850, #5855, #5857)
  - More information on this is available in ADR-44
- Multi-get directly from a stream (#5107)
  - More information on this is available in ADR-31
- Pedantic mode (#5245)
  - Ensures that stream and consumer creates or updates will fail if the resulting configuration would differ due to defaults, useful for desired-state configuration
- Stream ingest rate limiting (#5796)
  - New `max_buffered_size` and `max_buffered_msgs` options in the `jetstream` block of the server config control how many publishes should be queued before rate-limiting, making it easier to protect the system against Core NATS publishes into JetStream
  - Where a reply subject is provided, rate-limited messages will receive a 429 “Too Many Requests” response and can retry later
- Support for `Nats-Expected-Last-Subject-Sequence-Subject` header, customising the subject used when paired with `Nats-Expected-Last-Subject-Sequence` (#5281) Thanks to @cchamplin for the contribution!
- Ability to move cluster Raft traffic into the asset account instead of using the system account using the new `cluster_traffic` configuration option (#5466, #5947)
- Ability to specify preferred placement tags or clusters using `preferred` when issuing stepdown requests to the metaleader, streams or consumers (#6282, #6484)
- Implement strict decoding for JetStream API requests with the new `strict` option in the `jetstream` block of the server config (#5858)
- JetStream encryption on Windows can now use the TPM for key storage (#5273)
- The `js_cluster_migrate` option can now be configured with a delay, controlling how long before a failure would result in asset migration (#5903)
Leafnodes
WebSocket
- WebSocket custom response headers (#5230) Thanks to @ramonberrutti for the contribution!
MQTT
- SparkplugB Aware support (#5241)
Improved
General
- A graceful shutdown caused by the `SIGTERM` signal will now return exit code 0 instead of exit code 1 (#6336)
Fixed
General
- Server, cluster and gateway names containing spaces will now be rejected, since these can cause issues (#5676)
JetStream
- Message removals due to acks in clustered interest-based or work queue streams are now proposed through Raft (#6140)
  - Ensures that the removal ordering across all replicas is consistent, but may increase the amount of replication traffic
- Consistency improvements for the metalayer, streams and consumers (#6194, #6485, #6518)
  - A new leader only starts responding to read/write requests once it's initially up-to-date with its Raft log
  - Also fixes issues where KV creates/updates to a key during leader changes could desync the stream
- Replicated consumers should no longer skip redeliveries of unacknowledged messages after a leader change (#6566)
- Consumer starting sequence is now always respected, except for consumers used for sources/mirrors (#6253)
Complete Changes
Release v2.10.26
Changelog
Refer to the 2.10 Upgrade Guide for backwards compatibility notes with 2.9.x.
Go Version
- 1.23.6 (#6452)
Dependencies
- github.com/nats-io/nats.go v1.39.1 (#6574)
- golang.org/x/crypto v0.34.0 (#6574)
- golang.org/x/sys v0.30.0 (#6487)
- golang.org/x/time v0.10.0 (#6487)
- github.com/nats-io/nkeys v0.4.10 (#6494)
- github.com/klauspost/compress v1.18.0 (#6565)
Added
General
- New server option `no_fast_producer_stall` allows disabling the stall gates, preferring to drop messages to consumers that would otherwise have resulted in a stall (#6500)
- New server option `first_info_timeout` to control how long a leafnode connection should wait for the initial connection info, useful for high latency links (#5424)
Monitoring
- The `gatewayz` monitoring endpoint can now return subscription information (#6525)
Improved
General
- The configured write deadline is now applied to only the current batch of write vectors (with a maximum of 64MB), making it easier to configure and reason about (#6471)
- Publishing through a service import to an account with no interest will now generate a "no responders" error instead of silently dropping the message (#6532)
- Adjust the stall gate for producers to be less penalizing (#6568, #6579)
JetStream
- Consumer signaling from streams has been optimized, taking consumer filters into account, significantly reducing CPU usage and overheads when there are a large number of consumers with sparse or non-overlapping interest (#6499)
- Num pending with multiple filters, enforcing per-subject limits and loading the per-subject info now use a faster subject tree lookup with fewer allocations (#6458)
- Optimizations for calculating num pending etc. by handling literal subjects using a faster path (#6446)
- Optimizations for loading the next message with multiple filters by avoiding linear scans in message blocks in some cases, particularly where there are lots of deletes or a small number of subjects (#6448)
- Avoid unnecessary system time calls when ranging a large number of interior deletes, reducing CPU time (#6450)
- Removed unnecessary locking around finding out if Raft groups are leaderless, reducing contention (#6438)
- Improved the error message when trying to change the consumer type (#6408)
- Improved the error messages returned by `healthz` to be more descriptive about why the healthcheck failed (#6416)
- The limit of concurrent disk I/O operations that JetStream can perform simultaneously has been raised (#6449)
- Reduced the number of allocations needed for handling client info headers around the JetStream API and service imports/exports (#6453)
- Calculating the starting sequence for a source consumer has been optimized for streams where there are many interior deletes (#6461)
- Messages used for cluster replication are now correctly accounted for in the statistics of the origin account (#6474)
- Reduce the amount of time taken for cluster nodes to start campaigning in some cases (#6511)
- Reduce memory allocations when writing new messages to the filestore write-through cache (#6576)
Monitoring
- The `routez` endpoint now reports `pending_bytes` (#6476)
Fixed
General
- The `max_closed_clients` option is now parsed correctly from the server configuration file (#6497)
JetStream
- A bug in the subject state tracking that could result in consumers skipping messages on interest or WQ streams has been fixed (#6526)
- A data race between the stream config and looking up streams has been fixed (#6424) Thanks to @evankanderson!
- Fixed an issue where Raft proposals were incorrectly dropped after a peer remove operation, which could result in a stream desync (#6456)
- Stream disk reservations will no longer be counted multiple times after stream reset errors have occurred (#6457)
- Fixed an issue where a stream could desync if the server exited during a catchup (#6459)
- Fixed a deadlock that could occur when cleaning up large numbers of consumers that have reached their inactivity threshold (#6460)
- A bug which could result in stuck consumers after a leader change has been fixed (#6469)
- Fixed an issue where it was not possible to update a stream or consumer if up against the max streams or max consumers limit (#6477)
- The preferred stream leader will no longer respond if it has not completed setting up the Raft node yet, fixing some API timeouts on stream info and other API calls shortly after the stream is created (#6480)
- Auth callouts can now correctly authenticate the username and password or authorization token from a leafnode connection (#6492)
- Stream ingest from an imported subject will now continue to work correctly after an update to imports/exports via a JWT update (#6498)
- Parallel stream creation requests for the same stream will no longer incorrectly return a limits error when max streams is configured (#6502)
- Consumers created or recreated while a cluster node was down are now handled correctly after a snapshot when the node comes back online (#6507)
- Invalidate entries in the pending append entry cache correctly, reducing the chance of an incorrect apply (#6513)
- When compacting or truncating streams or logs, correctly clean up the delete map, fixing potential memory leaks and the potential for `index.db` to not be recovered correctly after a restart (#6515)
- Retry removals from acks if they have been missed due to the consumer ack floor being ahead of the stream applies, correcting a potential stream drift across replicas (#6519)
- When recovering from block files, do not put deleted messages below the first sequence into the delete map (#6521)
- Preserve max delivered messages with interest retention policy using the redelivered state, such that a new consumer will not unexpectedly remove the message (#6575)
Leafnodes
- Do not incorrectly send duplicate messages when a queue group has members across different leafnodes when connected through a gateway (#6517)
WebSockets
- Fixed a couple of cases where memory may not be reclaimed from Flate compressors correctly after a WebSocket client disconnect or error scenario (#6451)
Tests
Complete Changes
Release v2.10.26-RC.7
Changelog
Refer to the 2.10 Upgrade Guide for backwards compatibility notes with 2.9.x.
Go Version
- 1.23.6
Dependencies
Improved
General
JetStream
- Reduce memory allocations when writing new messages to the filestore write-through cache (#6576)
Fixed
JetStream
- Preserve max delivered messages with interest retention policy using the redelivered state, such that a new consumer will not unexpectedly remove the message (#6575)
Complete Changes
Release v2.10.26-RC.5
Changelog
Refer to the 2.10 Upgrade Guide for backwards compatibility notes with 2.9.x.
Go Version
- 1.23.6
Dependencies
- github.com/klauspost/compress v1.18.0 (#6565)
Added
Leafnodes
- New server option `first_info_timeout` to control how long a leafnode connection should wait for the initial connection info, useful for high latency links (#5424)
Monitoring
- The `gatewayz` monitoring endpoint can now return subscription information (#6525)
Fixed
General
- Publishing through a service import to an account with no interest will now generate a "no responders" error instead of silently dropping the message (#6532)
JetStream
- A bug in the subject state tracking that could result in consumers skipping messages on interest or WQ streams has been fixed (#6526)
Tests
- Unit tests have been improved (#6524)
Complete Changes
Release v2.10.26-RC.4
Changelog
Refer to the 2.10 Upgrade Guide for backwards compatibility notes with 2.9.x.
Go Version
- 1.23.6
Improved
Monitoring
- The `routez` endpoint now reports `pending_size` (#6476)
JetStream
- Reduce the amount of time taken for cluster nodes to start campaigning in some cases (#6511)
Fixed
General
- The `max_closed_clients` option is now parsed correctly from the server configuration file (#6497)
JetStream
- Consumers created or recreated while a cluster node was down are now handled correctly after a snapshot when the node comes back online (#6507)
- Invalidate entries in the pending append entry cache correctly, reducing the chance of an incorrect apply (#6513)
- When compacting or truncating streams or logs, correctly clean up the delete map, fixing potential memory leaks and the potential for `index.db` to not be recovered correctly after a restart (#6515)
- Retry removals from acks if they have been missed due to the consumer ack floor being ahead of the stream applies, correcting a potential stream drift across replicas (#6519)
- When recovering from block files, do not put deleted messages below the first sequence into the delete map (#6521)
Leafnodes
- Do not incorrectly send duplicate messages when a queue group has members across different leafnodes when connected through a gateway (#6517)
Complete Changes
Release v2.10.26-RC.3
Changelog
Refer to the 2.10 Upgrade Guide for backwards compatibility notes with 2.9.x.
Go Version
- 1.23.6
Dependencies
- github.com/nats-io/nkeys v0.4.10 (#6494)
Added
General
- New server option `no_fast_producer_stall` allows disabling the stall gates, preferring to drop messages to consumers that would otherwise have resulted in a stall (#6500)
Improved
JetStream
- Consumer signalling from streams has been optimised, taking consumer filters into account, significantly reducing CPU usage and overheads when there are a large number of consumers with sparse or non-overlapping interest (#6499)
Fixed
JetStream
- Auth callouts can now correctly authenticate the username and password or authorization token from a leafnode connection (#6492)
- Stream ingest from an imported subject will now continue to work correctly after an update to imports/exports via a JWT update (#6498)
- Parallel stream creation requests for the same stream will no longer incorrectly return a limits error when max streams is configured (#6502)
Complete Changes
Release v2.10.26-RC.2
Changelog
Refer to the 2.10 Upgrade Guide for backwards compatibility notes with 2.9.x.
Go Version
- 1.23.6
Dependencies
- github.com/nats-io/nats.go v1.39.0 (#6464)
- golang.org/x/crypto v0.33.0 (#6487)
- golang.org/x/sys v0.30.0 (#6487)
- golang.org/x/time v0.10.0 (#6487)
Improved
General
- The configured write deadline is now applied to only the current batch of write vectors (with a maximum of 64MB), making it easier to configure and reason about (#6471)
JetStream
- Messages used for cluster replication are now correctly accounted for in the statistics of the origin account (#6474)
Fixed
JetStream
- A bug which could result in stuck consumers after a leader change has been fixed (#6469)
- Fixed an issue where it was not possible to update a stream or consumer if up against the max streams or max consumers limit (#6477)
- The preferred stream leader will no longer respond if it has not completed setting up the Raft node yet, fixing some API timeouts on stream info and other API calls shortly after the stream is created (#6480)
Tests
- Unit tests have been improved (#6472)
Complete Changes
Release v2.10.26-RC.1
Changelog
Refer to the 2.10 Upgrade Guide for backwards compatibility notes with 2.9.x.
Go Version
- 1.23.6 (#6452)
Improved
JetStream
- Improved the error message when trying to change the consumer type (#6408)
- Improved the error messages returned by `healthz` to be more descriptive about why the healthcheck failed (#6416)
- Removed unnecessary locking around finding out if Raft groups are leaderless, reducing contention (#6438)
- Optimisations for calculating num pending etc by handling literal subjects using a faster path (#6446)
- Optimisations for loading the next message with multiple filters by avoiding linear scans in message blocks in some cases, particularly where there are lots of deletes or a small number of subjects (#6448)
- The limit of concurrent disk I/O operations that JetStream can perform simultaneously has been raised (#6449)
- Avoid unnecessary system time calls when ranging a large number of interior deletes, reducing CPU time (#6450)
- Reduced the number of allocations needed for handling client info headers around the JetStream API and service imports/exports (#6453)
- Num pending with multiple filters, enforcing per-subject limits and loading the per-subject info now use a faster subject tree lookup with fewer allocations (#6458)
- Calculating the starting sequence for a source consumer has been optimised for streams where there are many interior deletes (#6461)
Fixed
JetStream
- A data race between the stream config and looking up streams has been fixed (#6424) Thanks to @evankanderson!
- Fixed an issue where Raft proposals were incorrectly dropped after a peer remove operation, which could result in a stream desync (#6456)
- Stream disk reservations will no longer be counted multiple times after stream reset errors have occurred (#6457)
- Fixed an issue where a stream could desync if the server exited during a catchup (#6459)
- Fixed a deadlock that could occur when cleaning up large numbers of consumers that have reached their inactivity threshold (#6460)
WebSockets
- Fixed a couple of cases where memory may not be reclaimed from Flate compressors correctly after a WebSocket client disconnect or error scenario (#6451)
Tests
Complete Changes
Release v2.10.25
Changelog
Refer to the 2.10 Upgrade Guide for backwards compatibility notes with 2.9.x.
Go Version
- 1.23.5 (#6379)
Dependencies
- golang.org/x/sys v0.29.0 (#6323)
- golang.org/x/time v0.9.0 (#6324)
- golang.org/x/crypto v0.32.0 (#6367)
Improved
JetStream
- Raft groups will no longer snapshot too often in some situations, improving performance (#6277)
- Optimistically perform stream and consumer snapshots on a normal shutdown (#6279)
- The stream snapshot interval has been removed, now relying on the compaction minimum, which improves performance (#6289)
- Raft groups will no longer report current while they are paused with pending commits (#6317)
- Unnecessary client info fields have been removed from stream and consumer assignment proposals, API advisories and stream snapshot/restore advisories (#6326, #6338)
- Reduced lock contention between the JetStream lock and Raft group locks (#6335)
- Advisories will only be encoded and sent when there is interest, reducing CPU usage (#6341)
- Consumers with inactivity thresholds will now start fewer clean-up goroutines, which can reduce load on the goroutine scheduler (#6344)
- Consumer cleanup goroutines will now stop faster when the server shuts down (#6351)
Fixed
JetStream
- Subject state consistency with some message removal patterns (#6226)
- A performance issue has been fixed when updating the per-subject state (#6235)
- Fixed consistency issues with detecting partial writes in the filestore (#6283)
- A race condition between removing peers and updating replica counts has been fixed (#6316)
- Pre-acks for a sequence are now removed when the message is removed, correcting a potential memory leak (#6325)
- Metalayer snapshot errors are now surfaced correctly (#6361)
- Healthchecks no longer re-evaluate stream and consumer assignments, avoiding some streams and consumers being unexpectedly recreated shortly after a deletion (#6362)
- Clients should no longer time out on a retried ack with the `AckAll` policy after a server restart (#6392)
- Replicated consumers should no longer get stuck after leader changes due to incorrect accounting (#6387)
- Consumers will now correctly handle the case where messages queued for delivery have been removed, fixing a delivery slowdown (#6387, #6399)
- The API in-flight metric has been fixed so that it does not drift after the queue has been dropped (#6373)
- Handles for temporary files are now closed correctly if compression errors occur (#6390) Thanks to @deem0n for the contribution!
- JetStream will now shut down correctly when detecting that the filesystem underlying the store directory has become read-only (#6292) Thanks to @souravagrawal for the contribution!
Leafnodes
- Fixed an interest propagation issue that could occur when the hub has a user with subscribe permissions on a literal subject (#6291)
- Fixed a bug where all queue interest across leafnodes could be dropped over gateways in a supercluster deployment after a leafnode connection drops (#6377)
Tests
- A number of unit tests have been improved (#6150, #6278, #6297, #6300, #6343, #6329, #6330, #6331, #6331, #6334, #6364)
Complete Changes
Release v2.10.25-RC.3
Changelog
Refer to the 2.10 Upgrade Guide for backwards compatibility notes with 2.9.x.
Go Version
- 1.23.5 (#6379)
Fixed
JetStream
- Clients should no longer time out on a retried ack with the `AckAll` policy after a server restart (#6392)
- Replicated consumers should no longer get stuck after leader changes due to incorrect accounting (#6387)
- Consumers will now correctly handle the case where messages queued for delivery have been removed, fixing a delivery slowdown (#6387)
- The API in-flight metric has been fixed so that it does not drift after the queue has been dropped (#6373)
- Handles for temporary files are now closed correctly if compression errors occur (#6390) Thanks to @deem0n for the contribution!
- JetStream will now shut down correctly when detecting that the filesystem underlying the store directory has become read-only (#6292) Thanks to @souravagrawal for the contribution!
Leafnodes
- Fixed a bug where all queue interest across leafnodes could be dropped over gateways in a supercluster deployment after a leafnode connection drops (#6377)