-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unpromotables skip replication and peer recovery #93210
Unpromotables skip replication and peer recovery #93210
Conversation
For skipping replication: * ReplicationTracker and Group filter promotable * Remove from in sync allocations in metadata Fixes ES-4861 For skipping peer recovery: * Shards pass directly to STARTED skipping intermediate peer recovery stages and messages Fixes ES-5257
Hi @kingherc, I've created a changelog YAML for you. |
Oh BTW, the source branch should read |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, pretty much how I hoped. I left some comments but nothing structural. I suspect this breaks the refresh API tho, ideally we'd address that here (or beforehand) too.
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/PeerRecoveryTargetService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/PeerRecoveryTargetService.java
Outdated
Show resolved
Hide resolved
Did not handle yet the feedback, apart from the comment about Refresh. Just pushed a commit that re-enables Refresh for unpromotable shards. Feel free to review that if you'd like as well. I will continue to handle the remaining feedback, and then invite you to review finally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a handful of comments, mostly tiny stuff.
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Outdated
Show resolved
Hide resolved
...rg/elasticsearch/action/admin/indices/refresh/TransportUnpromotableReplicaRefreshAction.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Iraklis. Just left some minor comments/questions.
server/src/main/java/org/elasticsearch/action/ActionModule.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/action/admin/indices/refresh/TransportShardRefreshAction.java
Show resolved
Hide resolved
...rg/elasticsearch/action/admin/indices/refresh/TransportUnpromotableReplicaRefreshAction.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
Regarding adding a new action: you'd need to add the new action to |
Oh thanks! Totally unaware of this. I do see @DaveCTurner might you know if I should add both [r] and [u] above? |
I believe that registering these actions is only necessary for actions that are exposed to a |
In fact, I think registering these transport actions is not just unnecessary, it's actively forbidden: Lines 117 to 126 in 52e2e37
|
I'm not sure, I understand this. But I think, if the new internal action is not in |
Probably simplest to see what happens if you try it:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi all, I handled your comments and this PR is now ready for review. Please feel free to review and also the previous unresolved conversations.
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Show resolved
Hide resolved
shardRoutings.addAll(routingTable.assignedShards()); // include relocation targets | ||
|
||
// include relocation targets | ||
shardRoutings.addAll(routingTable.assignedShards().stream().filter(ShardRouting::isPromotableToPrimary).toList()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's just pull out the relocation targets in that loop instead of adding all the shards again.
I now realize that iterating the shards above already contains the relocation targets, right? I am not even sure why the old code added again the assigned shards for relocation targets.
So we do not need this line at all in the end, I will simply remove it. Please tell me if I'm assuming wrong here.
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/PeerRecoveryTargetService.java
Outdated
Show resolved
Hide resolved
Pinging @elastic/es-distributed (Team:Distributed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, there's just one more comment (an easy trap to fall into) and a few more nits.
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/ShardRoutingRoleIT.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
.../org/elasticsearch/action/admin/indices/refresh/TransportUnpromotableShardRefreshAction.java
Outdated
Show resolved
Hide resolved
.../org/elasticsearch/action/admin/indices/refresh/TransportUnpromotableShardRefreshAction.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
Hi all, I handled your remaining feedback. Feel free to check latest commits. It'd be great if you could review today if possible! Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
For skipping replication: * ReplicationTracker and Group filter shards that are promotable to primary * Remove unpromotable shards from in sync allocations in metadata * There is a new Refresh action for unpromotable replica shards Fixes ES-4861 For skipping peer recovery: * Unpromotable shards pass directly to STARTED skipping some intermediate peer recovery stages and messages Fixes ES-5257
.stream() | ||
.filter(Predicate.not(ShardRouting::isPromotableToPrimary)) | ||
.map(ShardRouting::currentNodeId) | ||
.collect(Collectors.toUnmodifiableSet()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this intermediate collection? I think it could be replaced with distinct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we dropped this. We don't even need distinct()
, we're looking at node IDs for the copies of a single shard which are necessarily distinct anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we dropped it from another place, but not here. I can handle this in another PR.
For skipping replication:
Fixes ES-4861
For skipping peer recovery:
Fixes ES-5257