Ensure no ongoing peer recovery in translog yaml test #46476

Merged: 1 commit, Sep 9, 2019

@dnhatn (Member) commented Sep 9, 2019

We leave replicas unassigned until we reroute after the primary shard starts. If a cluster health request with wait_for_no_initializing_shards is executed before that reroute, it returns immediately even though some replicas are about to start initializing. Peer recoveries of those shards can prevent the translog on the primary from being trimmed.

We add wait_for_events to the cluster health request so that it will execute after the reroute.
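
For illustration, the cluster health step in a YAML REST test might look roughly like the sketch below. The `languid` priority and the exact shape of the step are assumptions made for this example, not taken from the diff. `wait_for_events` makes the health request wait until cluster-state tasks at or above the given priority have been processed, so the lowest priority waits for the most tasks, including the pending reroute.

```yaml
# Hypothetical test step: wait for pending cluster-state tasks (such as the
# reroute that assigns replicas) before checking for initializing shards.
- do:
    cluster.health:
      wait_for_no_initializing_shards: true
      wait_for_events: languid   # assumed priority; the lowest level, so it waits for all queued events
```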

Closes #46425

@dnhatn dnhatn added the >test, :Distributed Indexing/Distributed, v8.0.0, v7.5.0, v7.4.1, and v7.3.3 labels Sep 9, 2019
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed

@DaveCTurner (Contributor) left a comment

LGTM, thanks @dnhatn. Relates #44433 so probably doesn't need to go back to v7.3.3.

@dnhatn (Member, Author) commented Sep 9, 2019

Thanks @DaveCTurner.

@dnhatn dnhatn merged commit 2224f86 into elastic:master Sep 9, 2019
@dnhatn dnhatn deleted the fix-yaml-translog-stats branch September 9, 2019 13:38
dnhatn added a commit that referenced this pull request Sep 10, 2019
dnhatn added a commit that referenced this pull request Sep 11, 2019
@colings86 colings86 added v7.4.0 and removed v7.4.1 labels Sep 17, 2019
Labels
:Distributed Indexing/Distributed, >test, v7.4.0, v7.5.0, v8.0.0-alpha1
Development

Successfully merging this pull request may close these issues.

SmokeTestMultiNodeClientYamlTestSuiteIT/indices.stats/20_translog failed retaining too much translog