Avoid noisy exceptions on data nodes when aborting snapshots #88476


original-brownbear
Member

Currently, an abort (especially one triggered by an index delete) can
manifest as an aborted snapshot exception, a missing index exception, or
an NPE. The latter two show up as noise in the logs.
This change handles effectively all of these cleanly as aborted snapshot
exceptions, so they are no longer logged as warnings. It also avoids the NPE
when a shard is concurrently removed from the index service, by looking the
shard up through the API that throws on missing shards.

Seen in this noisy failure #86724 (comment)
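
For readers who don't have the diff open, the idea can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: every class and method below (`AbortHandlingSketch`, the simplified `IndexService`, `indexServiceSafe`, `snapshotShard`, and the local exception types) is a hypothetical stand-in. The sketch assumes the lookup throws on a missing index or shard instead of returning null, and that both cases are rethrown as an aborted snapshot exception.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AbortHandlingSketch {

    // Hypothetical stand-ins for the exception types named in the description.
    static class AbortedSnapshotException extends RuntimeException {
        AbortedSnapshotException(String message, Throwable cause) { super(message, cause); }
    }
    static class IndexNotFoundException extends RuntimeException {
        IndexNotFoundException(String index) { super("no such index [" + index + "]"); }
    }
    static class ShardNotFoundException extends RuntimeException {
        ShardNotFoundException(int shardId) { super("no such shard [" + shardId + "]"); }
    }

    // Hypothetical, heavily simplified index service whose shards may be removed concurrently.
    static class IndexService {
        private final Map<Integer, String> shards = new ConcurrentHashMap<>();

        void addShard(int shardId, String shard) { shards.put(shardId, shard); }
        void removeShard(int shardId) { shards.remove(shardId); }

        // Null-returning lookup: callers that forget the null check hit an NPE later.
        String getShardOrNull(int shardId) { return shards.get(shardId); }

        // Throwing lookup: a missing shard surfaces as a well-defined exception.
        String getShard(int shardId) {
            String shard = shards.get(shardId);
            if (shard == null) {
                throw new ShardNotFoundException(shardId);
            }
            return shard;
        }
    }

    static final Map<String, IndexService> INDICES = new ConcurrentHashMap<>();

    // Throws if the index was deleted, instead of returning null.
    static IndexService indexServiceSafe(String index) {
        IndexService service = INDICES.get(index);
        if (service == null) {
            throw new IndexNotFoundException(index);
        }
        return service;
    }

    // A missing index or shard during a shard snapshot is reported as an abort,
    // so the caller can treat it as expected rather than log it as a warning.
    static String snapshotShard(String index, int shardId) {
        try {
            String shard = indexServiceSafe(index).getShard(shardId);
            return "snapshotted " + shard;
        } catch (IndexNotFoundException | ShardNotFoundException e) {
            throw new AbortedSnapshotException("aborted: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        IndexService service = new IndexService();
        service.addShard(0, "shard-0");
        INDICES.put("my-index", service);
        System.out.println(snapshotShard("my-index", 0));

        INDICES.remove("my-index"); // simulate a concurrent index delete
        try {
            snapshotShard("my-index", 0);
        } catch (AbortedSnapshotException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of using the throwing lookup in this sketch is that a concurrently removed shard produces one well-defined exception at the lookup site, which can be mapped to an abort in a single `catch`, rather than a null that fails somewhere further down.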

@original-brownbear original-brownbear added >non-issue :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.4.0 labels Jul 12, 2022
@original-brownbear original-brownbear marked this pull request as ready for review July 12, 2022 13:45
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Jul 12, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@original-brownbear
Member Author

Thanks Ievgen!

@original-brownbear original-brownbear merged commit ba46bd4 into elastic:master Jul 12, 2022
@original-brownbear original-brownbear deleted the clean-handle-abort-snapshot branch July 12, 2022 14:40
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Jul 13, 2022
* upstream/master: (38 commits)
  Simplify map copying (elastic#88432)
  Make DiffableUtils.diff implementation agnostic (elastic#88403)
  Ingest: Start separating Metadata from IngestSourceAndMetadata (elastic#88401)
  Move runtime fields base scripts out of scripting fields api package. (elastic#88488)
  Enable TRACE Logging for test and increase timeout (elastic#88477)
  Mute ReactiveStorageIT#testScaleDuringSplitOrClone (elastic#88480)
  Track the count of failed invocations since last successful policy snapshot (elastic#88398)
  Avoid noisy exceptions on data nodes when aborting snapshots (elastic#88476)
  Fix ReactiveStorageDeciderServiceTests testNodeSizeForDataBelowLowWatermark (elastic#88452)
  INFO logging of snapshot restore and completion (elastic#88257)
  unmute test (elastic#88454)
  Updatable API keys - noop check (elastic#88346)
  Corrected an incomplete sentence. (elastic#86542)
  Use consistent shard map type in IndexService (elastic#88465)
  Stop registering TestGeoShapeFieldMapperPlugin in ESIntegTestCase (elastic#88460)
  TSDB: RollupShardIndexer logging improvements (elastic#88416)
  Audit API key ID when create or grant API keys (elastic#88456)
  Bound random negative size test in SearchSourceBuilderTests#testNegativeSizeErrors (elastic#88457)
  Updatable API keys - logging audit trail event (elastic#88276)
  Polish reworked LoggedExec task (elastic#88424)
  ...

# Conflicts:
#	x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/v2/RollupShardIndexer.java
@original-brownbear original-brownbear restored the clean-handle-abort-snapshot branch April 18, 2023 20:39
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >non-issue Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.4.0
3 participants