forked from cockroachdb/cockroach
master: Update pkg/testutils/release/cockroach_releases.yaml #5
Open: github-actions wants to merge 87 commits into master from crdb-releases-yaml-update-master
github-actions (bot) force-pushed the crdb-releases-yaml-update-master branch from 5decaa5 to d1ebbb3 on September 16, 2023 00:54
Previously the `lenient` flag that allowed errors during microbenchmarks to be tolerated would also result in the exit status being 0 even if errors occurred. The error tolerance should only allow the run to continue, if errors are encountered, but still report the failures by signalling an exit code 1 so that failures can be tracked and reported on. Release Note: None Epic: None
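The intended behaviour can be sketched roughly as follows. This is a minimal illustration, not the roachprod-microbench code: `runAll`, `exitCode`, and `errBoom` are hypothetical names, and the only assumption is that leniency controls whether the run continues while the failure count alone decides the exit status.

```go
package main

import (
	"errors"
	"fmt"
)

// errBoom is a sample error used to simulate a failing benchmark.
var errBoom = errors.New("boom")

// runAll is a hypothetical runner. With lenient=true it keeps going after a
// failure; with lenient=false it stops at the first one. In both cases it
// returns how many benchmarks failed.
func runAll(benchmarks []func() error, lenient bool) int {
	failures := 0
	for i, b := range benchmarks {
		if err := b(); err != nil {
			failures++
			fmt.Printf("benchmark %d failed: %v\n", i, err)
			if !lenient {
				break
			}
		}
	}
	return failures
}

// exitCode maps the failure count to a process exit status. Leniency plays
// no part here: any failure yields exit code 1.
func exitCode(failures int) int {
	if failures > 0 {
		return 1
	}
	return 0
}
```

A caller would pass `exitCode(runAll(benchmarks, lenient))` to `os.Exit`, so a lenient run that hit errors still finishes every benchmark but exits non-zero.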
This PR fixes the test scripts used by developers to quickly set up a multitenant test environment. The changes in cockroachdb since these were created broke them. Epic: none Release note: None
Add `--flaky_test_attempts=4` to the coverage unit test builds. We don't want flaky tests failing these builds often. Epic: none Release note: None
To make it easier to identify the metric being used to generate charts, this commit adds the metric to the tooltip of all charts on the Metrics page. Fixes cockroachdb#109277 This also fixes the metric name for `Schema Registry Registrations`. Fixes cockroachdb#108095 Release note (ui change): On the Metrics page, the tooltip of each chart now shows which metric is used to create it. Release note (bug fix): Fix metric name for `Schema Registry Registrations`.
We recently introduced metrics into the logging package, but unfortunately we did not prefix them properly. All metrics in the logging package should share the same `log.*` prefix, to clearly group them together. Luckily, only 1 log metric exists so far. This patch updates the metric name to have the `log.*` prefix. Release note (ops change): This patch renames the metric `fluent.sink.conn.errors` to `log.fluent.sink.conn.errors`. The addition of the `log.` prefix was to better group together logging-related metrics. The behavior and purpose of the metric remains unchanged.
The Measurement metadata for this metric was incorrect. This patch fixes it to better represent what's being measured. Release note: none
The log.fluent.sink.conn.errors metric's metadata was missing the MetricType. This patch adds it. Release note (ops change): This patch sets the Metric Type on the metric `log.fluent.sink.conn.errors`. Previously, the Metric Type was incorrectly left unset. Note that this is simply an update to the metric's metadata. The behavior and purpose of the metric remains unchanged.
Previously, a LogMetrics implementation was not provided to the logging package in tests. This could cause tests that exercise code paths involving LogMetrics to hit problems such as nil pointer errors. This patch assigns a dummy test implementation in the testing log scope setup, to avoid this case. Release note: none
Previously, the MetricsStruct used by the logmetrics package was protected by the same mutex that protects the map of metric name to counter. However, the MetricsStruct is never written to after initialization. It's only read within `NewRegistry()` to dump the underlying counters into a new registry for in-process tenants. Since concurrent writes are not possible with this MetricsStruct (it's only read from), protection by this mutex is unnecessary.

In fact, the unnecessary mutex protection can cause a deadlock if a metric is incremented in the hot path for logging (e.g. something like once per log message as it passes through `outputLogEntry`). For example:

1. NewRegistry is called.
2. NewRegistry acquires and holds the mutex.
3. NewRegistry initializes a new registry, which eventually [makes a logging call](https://github.com/cockroachdb/cockroach/blob/master/pkg/util/metric/registry.go#L87).
4. The logging call makes its way through the logging code and attempts to increment a logmetrics counter.
5. `IncrementCounter` is called.
6. `IncrementCounter` attempts to acquire the mutex.
7. The mutex is already being held via step 2.
8. Deadlock!

By removing the unnecessary protection of the mutex for the MetricsStruct, we eliminate this possibility.

Release note: none
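A simplified sketch of the fixed shape, under stated assumptions: the types below (`registry`, `logMetrics`) and the `onLog` callback are hypothetical stand-ins, not the actual logmetrics code. The point is only that `newRegistry` reads the write-once field without taking the mutex, so a logging call triggered during registration can still lock it in `incrementCounter`.

```go
package main

import "sync"

// registry is a stand-in for a metrics registry. Adding a metric "logs",
// which re-enters the log metrics through the onLog callback.
type registry struct {
	names []string
	onLog func()
}

func (r *registry) add(name string) {
	r.names = append(r.names, name)
	if r.onLog != nil {
		r.onLog() // simulates the logging call made during registration
	}
}

type logMetrics struct {
	// names is set once in newLogMetrics and never written afterwards,
	// so it is read without holding mu.
	names []string

	mu       sync.Mutex
	counters map[string]int64 // guarded by mu
}

func newLogMetrics(names ...string) *logMetrics {
	return &logMetrics{names: names, counters: map[string]int64{}}
}

func (m *logMetrics) incrementCounter(name string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.counters[name]++
}

func (m *logMetrics) count(name string) int64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.counters[name]
}

// newRegistry deliberately does not take mu. Holding it here would deadlock
// as soon as r.add "logs" and incrementCounter tries to lock mu again,
// because sync.Mutex is not reentrant.
func (m *logMetrics) newRegistry() *registry {
	r := &registry{onLog: func() { m.incrementCounter("log.messages.count") }}
	for _, n := range m.names {
		r.add(n)
	}
	return r
}
```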
Buffered network logging sinks have a `max-buffer-size` attribute, which caps, in bytes, the total size of the log messages that can be buffered. If a writer attempts to append a log message that would push the buffer past `max-buffer-size`, the buffered log sink drops older messages to make room for the new one. Previously, these dropped messages were not tracked in any way. A TODO was left to add a metric tracking them. This patch introduces a metric to do so: `log.buffered.messages.dropped`. It's shared across all buffered log sinks and counts the number of messages dropped from the buffer. Release note (ops change): This patch introduces a new metric, `log.buffered.messages.dropped`. Buffered network logging sinks have a `max-buffer-size` attribute, which caps, in bytes, the total size of the log messages that can be buffered. Any `fluent-server` or `http-server` log sink that makes use of a `buffering` attribute in its configuration (enabled by default) qualifies as a buffered network logging sink. If this buffer is full and an additional log message is sent to the buffered log sink, the buffer would exceed `max-buffer-size`. The buffered log sink therefore drops older messages in the buffer to make room for the new one. `log.buffered.messages.dropped` counts the number of messages dropped from the buffer. Note that the count is shared across all buffered logging sinks.
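The drop-oldest behaviour and its counter could look roughly like this. It is a minimal sketch with hypothetical names (`bufferedSink`, `appendMsg`); the real sink is concurrent and flushes over the network, which is omitted here. The `dropped` field plays the role of `log.buffered.messages.dropped`.

```go
package main

// bufferedSink is a hypothetical, single-goroutine model of a buffered log
// sink with a byte-size cap.
type bufferedSink struct {
	maxBytes int      // plays the role of max-buffer-size
	curBytes int      // total size of the buffered messages
	msgs     []string // oldest message first
	dropped  int64    // plays the role of log.buffered.messages.dropped
}

// appendMsg adds msg to the buffer, first dropping the oldest messages until
// the new one fits. Assumption of this sketch: a message larger than
// maxBytes is still buffered once the buffer has been emptied.
func (b *bufferedSink) appendMsg(msg string) {
	for len(b.msgs) > 0 && b.curBytes+len(msg) > b.maxBytes {
		b.curBytes -= len(b.msgs[0])
		b.msgs = b.msgs[1:]
		b.dropped++
	}
	b.msgs = append(b.msgs, msg)
	b.curBytes += len(msg)
}
```

With `maxBytes` set to 10, appending three 4-byte messages drops the first one when the third arrives, leaving two buffered messages and a drop count of 1.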
Release note: None
Release note: None
Prior to this patch, when a virtual cluster was created without a name, a default name was generated with structure `tenant-NNN`. To avoid emphasizing multi-tenancy, this commit changes this to `cluster-NNN`. (No release note because there is no user-facing way to create a record without a name.) Release note: None
Release note: None
Release note: None
Release note: None
Release note: None
Release note: None
Multiple people have seen this timeout for `race`. Let's bump this timeout only for `race`. Epic: none Release note: None
This change adds a cluster setting, `kv.snapshot_receiver.excise.enabled`, to use IngestAndExcise for the replicated/user-key portion of a replica's contents instead of rangedels. This reduces write-amp as rangedels/rangekeydels have to be compacted while an excise shrinks sstables into virtual sstables to clear out contents of a replica immediately. At the moment, this is an experimental feature and should be used with caution. Epic: none Release note: None
111576: sql: use 'cluster-NNN' for virtual cluster records without a name r=stevendanna a=knz Epic: CRDB-29380 Prior to this patch, when a virtual cluster was created without a name, a default name was generated with structure `tenant-NNN`. To avoid emphasizing multi-tenancy, this commit changes this to `cluster-NNN`. (No release note because there is no user-facing way to create a record without a name.) Release note: None 111586: settings: more guidance r=dt a=knz Epic: CRDB-6671 As requested by `@dt` [here](cockroachdb#111579 (comment)). Release note: None 111587: configprofiles: more clamping down on spurious slice overwrites r=yuzefovich a=knz Epic: CRDB-26691 Suggested by `@yuzefovich` [here](cockroachdb#111569 (review)). Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
When there are multiple shared locks on a key, any active waiters will push the first of the lock holders (aka the claimant). Previously, when the claimant was finalized, we weren't recomputing the waiting state for any active waiters to push the new claimant. As a result, in such a scenario, waiters would end up blocking indefinitely without pushing. This is non-ideal, as it means we're not going to be running deadlock/liveness detection. Waiters would hang indefinitely if there was a deadlock/liveness issue. This patch fixes this behaviour by recomputing new waiting state in cases where a shared lock is released but the key isn't unlocked. Epic: none Release note: None
This patch introduces the `log.messages.count` metric. The metric counts the number of messages logged, recording at the point of `outputLogEntry`, which all logging calls (e.g. `Info`, `Error`, etc.) commonly pass through. This metric will be helpful to better understand log volume and rates. Note that this does not capture the fanout of a single log message to multiple logging sinks. Release note (ops change): This patch introduces the metric, `log.messages.count`. This metric measures the count of messages logged on the node since startup. Note that this does not measure the fan-out of single log messages to the various configured logging sinks. This metric can be helpful in understanding log rates and volumes.
This is a small optimization made to the logmetrics package. The log package previously provided a metric name string when incrementing a metric, which would prompt the logmetrics package to perform a map lookup. By using enum values instead, we can do direct index lookups instead. These log metrics are in the critical logging path, so these types of optimizations are worthwhile, especially when the effort is low (like here). Release note: none
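The idea can be illustrated with a small sketch. The identifiers here (`metricID`, the constant names, `increment`, `value`) are hypothetical and not taken from the logmetrics package; the sketch only shows how an enum-valued identifier turns the per-increment map lookup into a direct array index.

```go
package main

import "sync/atomic"

// metricID enumerates the known log metrics. The names are illustrative.
type metricID int

const (
	fluentSinkConnErrors metricID = iota
	bufferedMessagesDropped
	messagesCount
	numMetrics // must stay last: it sizes the counters array
)

// counters holds one atomic counter per metric, indexed by metricID.
// No hashing and no map access is needed on the hot path.
var counters [numMetrics]atomic.Int64

// increment bumps the counter for id with a direct index lookup.
func increment(id metricID) {
	counters[id].Add(1)
}

// value returns the current count for id.
func value(id metricID) int64 {
	return counters[id].Load()
}
```

This uses `atomic.Int64`, which requires Go 1.19 or later.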
This commit adds a program that takes an output directory path, collects all statements in all logic tests, and writes them, per file, to the provided output directory. Release note: None
This commit adds a nightly task in TC that collects statements in all logic tests and stores them in Google Cloud under `cockroach-corpus/logictest-stmts-corpus/`. Release note: None
111571: tests: silence some warnings r=yuzefovich a=knz This will improve investigations for failures like cockroachdb#111541. Epic: CRDB-18499. 111590: github-pull-request-make: longer overall timeout for `stressrace` r=jlinder a=rickystewart Multiple people have seen this timeout for `race`. Let's bump this timeout only for `race`. Epic: none Release note: None Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net> Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
For some reason after an update GoLand stopped compiling because of this. Epic: None Release note: None
Previously, StartSharedProcessTenant() would hang if it were run on a tenant that was created by a replication stream. This patch fixes this bug by ensuring `ALTER TENANT $1 START SERVICE SHARED` is run even if the tenant was already created. Epic: none Release note: None
Epic: none Release note: None
111613: sqlstats: fix counter for in-memory fingerprints r=j82w a=j82w Problem: The counters used to track the number of unique fingerprints we store in-memory for sql stats were refactored in cockroachdb#110805. Change cockroachdb#110805 introduced a bug where it increases the counts instead of resetting them. This causes the statistics to stop calculating new stats once the limit is hit. Solution: Fix the bug by resetting the counters instead of increasing them. Added a new test for the reset functionality. Fixes: cockroachdb#111583 Release note (sql change): Fix a bug that causes the sql stats to stop collecting new stats. Co-authored-by: j82w <jwilley@cockroachlabs.com>
Release note: None
This renames the setting `kv.raft_log.synchronization.disabled` to `kv.raft_log.synchronization.unsafe.disabled` as per naming guidelines, and marks it as unsafe explicitly. Release note: None
Prior to this patch, it was possible to easily automate `SET CLUSTER SETTING` for unsafe cluster settings. This is undesirable; we want to strongly incentivize a human operator paying attention to changes to these settings. This patch implements an *interlock*: a mechanism through which the operator needs to perform two concurrent, related actions for the change to take effect. This works as follows: 1. the operator attempts to change a cluster setting from a SQL shell, for example: ```sql SET CLUSTER SETTING kv.raft_log.synchronization.unsafe.disabled = true; ``` 2. the server fails the execution, with an error: ``` ERROR: changing cluster setting "kv.raft_log.synchronization.unsafe.disabled" may cause cluster instability or data corruption. To confirm the change, run the following command before trying again: SET unsafe_setting_interlock_key = 'B7TxIA=='; ``` 3. the operator can then perform the recommended action, then try SET CLUSTER SETTING again. Because the key is properly set, the SET CLUSTER SETTING statement succeeds. Also, `RESET` statements (or `SET CLUSTER SETTING ... = DEFAULT`) are not subject to the interlock, as we assume that the default value is safe for use. (No release note because the only unsafe settings as of this writing are not documented to end-users.) Release note: None
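The control flow of the interlock can be sketched as below. This is an illustration only: how the server actually derives the key is not described in the commit message, so `expectedKey` (a truncated SHA-256 over a session ID and the setting name) is an assumption of this sketch, as are the `session` type and the `setClusterSetting` signature.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// session is a hypothetical stand-in for a SQL session.
type session struct {
	id           string
	interlockKey string // value of unsafe_setting_interlock_key
}

// expectedKey derives the key the operator must set. ASSUMPTION: the real
// derivation is not given in the source; this one is purely illustrative.
func expectedKey(sessionID, setting string) string {
	sum := sha256.Sum256([]byte(sessionID + "/" + setting))
	return base64.StdEncoding.EncodeToString(sum[:4])
}

// setClusterSetting refuses to change an unsafe setting unless the session
// already carries the matching interlock key. Resetting to the default is
// exempt, since the default value is assumed safe.
func setClusterSetting(s *session, setting string, isUnsafe, toDefault bool) error {
	if isUnsafe && !toDefault {
		want := expectedKey(s.id, setting)
		if s.interlockKey != want {
			return fmt.Errorf(
				"changing cluster setting %q may cause cluster instability or data corruption; "+
					"to confirm the change, run the following command before trying again: "+
					"SET unsafe_setting_interlock_key = '%s'", setting, want)
		}
	}
	// The actual setting change would be applied here.
	return nil
}
```

The first attempt fails and reveals the key; once the session variable holds that key, the same statement succeeds. Because the key only appears in the error of a failed attempt, a script cannot skip the first step without an operator noticing the warning.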
109801: sql: implement an interlock to modify unsafe settings r=dt a=knz Fixes cockroachdb#109810. Epic: CRDB-28893 As discussed [here](https://docs.google.com/document/d/11mWsfORExZxKqyMJfa6vg7LUzLEYhJ295NkaP1-bvL4/edit?disco=AAAA3lp44WY). Prior to this patch, it was possible to easily automate `SET CLUSTER SETTING` for unsafe cluster settings. This is undesirable; we want to strongly incentivize a human operator paying attention to changes to these settings. This patch implements an *interlock*: a mechanism through which the operator needs to perform two concurrent, related actions for the change to take effect. This works as follows: 1. the operator attempts to change a cluster setting from a SQL shell, for example: ```sql SET CLUSTER SETTING kv.raft_log.synchronization.unsafe.disabled = true; ``` 2. the server fails the execution, with an error: ``` ERROR: changing cluster setting "kv.raft_log.synchronization.unsafe.disabled" may cause cluster instability or data corruption. To confirm the change, run the following command before trying again: SET unsafe_setting_interlock_key = 'B7TxIA=='; ``` 3. the operator can then perform the recommended action, then try SET CLUSTER SETTING again. Because the key is properly set, the SET CLUSTER SETTING statement succeeds. Also, `RESET` statements (or `SET CLUSTER SETTING ... = DEFAULT`) are not subject to the interlock, as we assume that the default value is safe for use. 111336: roachprod-microbench: update error tolerance r=renatolabs,srosenberg a=herkolategan Previously the `lenient` flag that allowed errors during microbenchmarks to be tolerated would also result in the exit status being 0 even if errors occurred. The error tolerance should only allow the run to continue, if errors are encountered, but still report the failures by signalling an exit code 1 so that failures can be tracked and reported on. 
Release Note: None Epic: None 111639: kvserver: skip `TestStoreRangeMergeRaftSnapshot` under metamorphic tests r=erikgrinaker a=erikgrinaker Touches cockroachdb#111624. Epic: none Release note: None Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net> Co-authored-by: Herko Lategan <herko@cockroachlabs.com> Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
111594: sql-telemetry: query_sampling.max_event_frequency public r=maryliag a=emilaleksanteri Epic: none Fixes: cockroachdb#108385 Release note (sql change): make max_event_frequency public for public documentation 111638: kvfollowerreadsccl: use `SystemVisible` for `kv.closed_timestamp.propagation_slack` r=erikgrinaker a=erikgrinaker **changefeedccl: don't use `ALTER TENANT ALL` for closed timestamp setting** This is no longer necessary with the `SystemVisible` class. **kvfollowerreadsccl: use `SystemVisible` for `kv.closed_timestamp.propagation_slack`** It doesn't make any sense to configure this individually per tenant. Epic: none Release note: None Co-authored-by: Emil Lystimaki <emil@circularway.com> Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
Epic: none Release note: None
This test is currently only used for manual benchmarking. It will be revisited later as part of cockroachdb#111614. We should skip it to avoid noise in test failures. Fixes: cockroachdb#111542. Release note: None
Fix test to use correct config for injecting invalid lease indexes. Epic: none Release note: None
111645: roachtest: show running test in teamcity logs r=smg260 a=smg260 In the TC log, we currently show when a test has finished. Now that stderr/out has been cleaned up, it would be useful to also show when a test has begun running. We already do this in GCE (with a grafana link). Epic: none Release note: None Co-authored-by: Miral Gadani <miral@cockroachlabs.com>
Epic: none This change pins `pnpm` to `8.6.10` for the cluster-ui release (and release-next) workflow(s) to prevent out-of-date lockfiles when installing cluster-ui dependencies with pnpm. Release note: None
111584: roachtest: add ruby-pg test to ignorelist r=rafiss a=rafiss fixes cockroachdb#111522 fixes cockroachdb#111508 Release note: None 111588: build: remove uses of `bindata` r=rail,srosenberg a=rickystewart This is deprecated in `rules_go`, and `go:embed` has the same functionality. Closes cockroachdb#111520. Epic: CRDB-8308 Release note: None Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com> Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
Epic: none Release note: None
111655: kvnemesis: use correct probability for invalid lease r=pavelkalinnikov a=aliher1911 Fix test to use correct config for injecting invalid lease indexes. Epic: none Release note: None Co-authored-by: Oleg Afanasyev <oleg@cockroachlabs.com>
Add the ability to choose a color for specific metric series on Metric charts. On the Replication -> Ranges chart, set the color of the `Unavailable` series to red. Otherwise it could be confusing to see another series in red and `Unavailable` in green. Fixes cockroachdb#107637 Release note: None
…ings Fixes cockroachdb#111626 The previous impl assumed input string length <= math.MaxInt32. Go 1.20 added unsafe.StringData (https://pkg.go.dev/unsafe#StringData) which properly handles longer strings. This changes the impl to use unsafe.StringData and adds a unit test. Release note (bug fix): Fixed a panic that could occur if a query uses a string larger than 2^31-1 bytes.
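For reference, a zero-copy string-to-bytes conversion built on `unsafe.StringData` can be written as follows. This is a minimal sketch rather than the actual `UnsafeConvertStringToBytes` implementation; the function name below is hypothetical. `unsafe.StringData` and `unsafe.Slice` both exist in Go 1.20 and later, and the length is passed as a plain `int`, so nothing in the conversion is tied to a 32-bit size.

```go
package main

import "unsafe"

// unsafeStringToBytes returns a byte slice that shares memory with s.
// The result must never be modified: strings are immutable in Go, and
// writing through this slice is undefined behaviour.
func unsafeStringToBytes(s string) []byte {
	if len(s) == 0 {
		// unsafe.StringData is unspecified for the empty string.
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}
```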
When creating a new cluster, this moves the initialisation of the log with retry number to the top of the loop, so that we can pass in the log reference to the `clusterImpl`. Without this, new clusters are susceptible to a nil pointer. This surfaced when testing Azure cloud, since not all provider functions are implemented, and a log statement is issued. Epic: none Release note: none
Release note: None
111615: roachtest: skip admission-control/index-backfill from weekly runs r=sumeerbhola a=aadityasondhi This test is currently only used for manual benchmarking. It will be revisited later as part of cockroachdb#111614. We should skip it to avoid noise in test failures. Fixes: cockroachdb#111542. Release note: None Co-authored-by: Aaditya Sondhi <20070511+aadityasondhi@users.noreply.github.com>
110943: kvserver,storage: ingest small snapshot as writes r=itsbilal,erikgrinaker a=sumeerbhola Small snapshots cause LSM overload by resulting in many tiny memtable flushes, which result in high sub-level count, which then needs to be compensated by running many inefficient compactions from L0 to Lbase. Despite some compaction scoring changes, we have not been able to fully eliminate the impact of this on foreground traffic as discussed in cockroachdb/pebble#2832 (comment). Fixes cockroachdb#109808 Epic: none Release note (ops change): The cluster setting kv.snapshot.ingest_as_write_threshold controls the size threshold below which snapshots are converted to regular writes. It defaults to 100KiB. 111627: encoding: fix UnsafeConvertStringToBytes to work with large input strings r=ecwall a=ecwall Fixes cockroachdb#111626 The previous impl assumed input string length <= math.MaxInt32. Go 1.20 added unsafe.StringData (https://pkg.go.dev/unsafe#StringData) which properly handles longer strings. This changes the impl to use unsafe.StringData and adds a unit test. Release note (bug fix): Fixed a panic that could occur if a query uses a string larger than 2^31-1 bytes. 111656: cluster-ui: pin `pnpm` to `8.6.10` for cluster-ui-release workflow r=THardy98 a=THardy98 Epic: none This change pins `pnpm` to `8.6.10` for the cluster-ui release (and release-next) workflow(s) to prevent out-of-date lockfiles when installing cluster-ui dependencies with pnpm. Release note: None Co-authored-by: sumeerbhola <sumeer@cockroachlabs.com> Co-authored-by: Evan Wall <wall@cockroachlabs.com> Co-authored-by: Thomas Hardy <thardy@cockroachlabs.com>
111467: ui: allow custom color on metric r=maryliag a=maryliag Add the ability to choose a color for specific metric series on Metric charts. On the Replication -> Ranges chart, set the color of the `Unavailable` series to red. Otherwise it could be confusing to see another series in red and `Unavailable` in green. Fixes cockroachdb#107637 Release note: None Before <img width="857" alt="Screenshot 2023-09-28 at 7 49 55 PM" src="https://github.com/cockroachdb/cockroach/assets/1017486/88af7e07-3c58-463d-963c-d47a3dd3f7c3"> After <img width="897" alt="Screenshot 2023-10-03 at 12 27 00 PM" src="https://github.com/cockroachdb/cockroach/assets/1017486/cfe3c2f8-4038-412f-908c-3ca51a82d720"> Release note: None 111598: sql: support SHOW GRANTS ON PROCEDURE r=mgartner a=mgartner Epic: CRDB-25388 Release note: None 111642: kvserver: use `SystemVisible` for `kv.raft.command.max_size` r=erikgrinaker a=erikgrinaker Epic: none Release note: None Co-authored-by: maryliag <marylia@cockroachlabs.com> Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com> Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
111535: build: coverage: retry flaky unit tests r=RaduBerinde a=RaduBerinde Add `--flaky_test_attempts=4` to the coverage unit test builds. We don't want flaky tests failing these builds often. Epic: none Release note: None 111633: roachtest: revert harmonize GCE and AWS machine types r=RaduBerinde,erikgrinaker a=srosenberg Revert the change to machine types in [1] until after 23.2 branch is cut. [1] cockroachdb#111140 Epic: none Release note: None Co-authored-by: Radu Berinde <radu@cockroachlabs.com> Co-authored-by: Stan Rosenberg <stan.rosenberg@gmail.com>
The test regularly took about 7m under race. This commit drops down the size of the test under race so that it runs in about the same time as the non-race test. Epic: None Release note: None
111680: kv: speed up `TestNewVsInvariants` under race r=nvanbenschoten a=nvanbenschoten The test regularly took about 7m under race. This commit drops down the size of the test under race so that it runs in about the same time as the non-race test. Epic: None Release note: None Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
The stats in TeamCity show that the time it takes to run all the tests in this package can frequently get very close to the existing timeout. Release note: None
111519: sqlproxyccl: fix test scripts r=darinpp a=darinpp This PR fixes the test scripts used by developers to quickly set up a multitenant test environment. The changes in cockroachdb since these were created broke them. Epic: none Release note: None 111669: roachtest: avoid nil logger r=renatolabs,srosenberg a=smg260 When creating a new cluster, this moves the initialisation of the log with retry number to the top of the loop, so that we can pass in the log reference to the `clusterImpl`. Without this, new clusters are susceptible to a nil pointer. This surfaced when testing Azure cloud, since not all provider functions are implemented, and a log statement is issued. Epic: none Release note: none 111682: ttljob: increase test timeout r=rafiss a=rafiss The stats in TeamCity show that the time it takes to run all the tests in this package can frequently get very close to the existing timeout. fixes cockroachdb#111364 Release note: None Co-authored-by: Darin Peshev <darinp@gmail.com> Co-authored-by: Miral Gadani <miral@cockroachlabs.com> Co-authored-by: Miral Gadani <25202158+smg260@users.noreply.github.com> Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Update pkg/testutils/release/cockroach_releases.yaml with recent values. Epic: None Release note: None
cameronnunez force-pushed the crdb-releases-yaml-update-master branch from d1ebbb3 to 8013a97 on October 4, 2023 16:26