Translog pruning based on retention leases #1100
What benchmarks did you run? Can you share the configuration and benchmark results here?
It is not very clear why the translog needs to be pruned based on retention leases in order for operations to be fetched. All you need is a way to define start and end sequence numbers and then fetch those operations from the translog. Do you mind explaining this a bit more?
We will not be using retention leases to fetch the operations from the translog; rather, the existing deletion policy that prunes translog operations will also take retention leases into account.
With translog:
With Lucene soft-deletes:
Now, in the replication use-case, with support for translog pruning based on retention leases:
Ran an indexing load on a single node, single shard on the leader, and fetched the operations to simulate the follower cluster side of things.
Can you share a well-known benchmark like nyc_taxis on a selected instance type, and share the impact on indexing throughput or network transfer rate when you use the translog vs a Lucene snapshot? A general-purpose EC2 instance like m4.4xlarge would help. It would be good to have more than one follower cluster to identify the network impact as well. I would recommend capturing disk size limits after running these tests for a sufficiently long duration (at least up to the default retention lease). A few concerns I have with this:
There may be even more implications of this change that I can't think of at this point. We will need more feedback on this before we proceed with this change. @nknize What do you think about this change?
Updated the benchmark details above with the dataset and instance type used. Regarding some of the concerns listed above:
Let me know if you have any other concerns.
Please update the rally benchmark output with translog vs Lucene, e.g. indexing throughput/latency etc.
@saikaranam-amazon Did you try figuring out a mechanism to improve the indexing throughput on the leader cluster without the translog optimizations, by looking closely at how Lucene stored fields are being read? Is there an opportunity to improve performance at that layer instead of making this change?
I believe that one of the things that has not been tested well is data integrity across the two recovery sources up to the local checkpoint. The old logic either always used translog recovery or retention leases, but never both. Here, we are choosing between the two dynamically, and hence it is imperative that we check the integrity of the data across the two sources.
I agree with @itiyamas on the data integrity testing concern. Has this been addressed? Also, this is based on the document-level replication model, but there's also segment-level replication built into Lucene, which rsyncs the segments rather than relying on the translog-playback replication model. That doesn't require decompression at all and proves to be significantly faster in all cases (product search is taking this approach), though it would require some checkpoint coordination. Have we looked at this approach instead of building on a slower replication model?
Yes, we explored the compression algorithm used for the stored fields: it operates at the block level (on groups of documents), and compression couldn't be turned off for a subset of docs.
Performed these tests via scripts with both Lucene and the translog as the recovery source. I understand the concern; we will extend these scripts into integration tests under the replication plugin.
Attaching the results for esrally with and without applying translog pruning.
Configuration:
Without enabling the setting:
With enabling the setting:
After reviewing the use-case, modifying the setting name to |
Test for peer recovery
With the setting disabled:
With the setting enabled:
…n lease. (#1416) The settings and the corresponding logic for translog pruning by retention lease, which were added as part of #1100, have been deprecated. This commit removes that deprecated code in favor of an extension point for providing a custom TranslogDeletionPolicy. Signed-off-by: Rabi Panda <adnapibar@gmail.com>
…n lease. (#1416) (#1471) The settings and the corresponding logic for translog pruning by retention lease, which were added as part of #1100, have been deprecated. This commit removes that deprecated code in favor of an extension point for providing a custom TranslogDeletionPolicy. Signed-off-by: Rabi Panda <adnapibar@gmail.com>
Is your feature request related to a problem? Please describe.
Background
The Cross Cluster Replication feature follows a logical replication model. Each follower pulls the latest operations from the corresponding shards in the leader index and replays them on the follower side. To retain the latest operations on the leader cluster, retention leases are acquired on the leader index. Once the operations are replayed on the follower side, these retention leases are renewed. The existing peer-recovery infrastructure is leveraged and extended for the cross cluster replication feature.
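To make the flow above concrete, here is a minimal sketch (illustrative names only, not the actual OpenSearch API) of the replication loop: the follower fetches operations past its checkpoint, replays them, and renews its retention lease so the leader keeps retaining what it still needs.

```python
class Leader:
    """Toy leader shard: an append-only op log plus per-follower retention leases."""

    def __init__(self):
        self.ops = []               # doc at index i has sequence number i
        self.retention_leases = {}  # follower_id -> lowest seq_no still required

    def index(self, doc):
        self.ops.append(doc)

    def fetch(self, from_seq_no):
        """Return (seq_no, doc) pairs with seq_no >= from_seq_no."""
        return list(enumerate(self.ops))[from_seq_no:]

    def renew_lease(self, follower_id, retained_seq_no):
        self.retention_leases[follower_id] = retained_seq_no


class Follower:
    """Toy follower shard: pulls, replays, then renews its lease."""

    def __init__(self, follower_id, leader):
        self.id = follower_id
        self.leader = leader
        self.checkpoint = 0  # next seq_no to fetch
        self.docs = []

    def replicate(self):
        for seq_no, doc in self.leader.fetch(self.checkpoint):
            self.docs.append(doc)  # replay the operation locally
            self.checkpoint = seq_no + 1
        # Renewing the lease tells the leader that everything below
        # `checkpoint` is no longer needed by this follower.
        self.leader.renew_lease(self.id, self.checkpoint)
```

For example, after indexing three docs on the leader and running one `replicate()` pass, the follower holds all three docs and the leader's lease for that follower advances to sequence number 3.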
Problem
Retention leases preserve operations for each shard at the Lucene level (used as part of peer recovery within the cluster).
During performance benchmarking of the replication feature (under high indexing workloads), fetching the latest operations from the leader cluster showed a CPU impact of up to ~8-10% due to Lucene stored-fields decompression.
Describe the solution you'd like
Solution
All the latest operations are available in the translog in uncompressed form. Currently, the translog doesn't have a mechanism to prune operations based on retention leases. If translog pruning takes retention leases into account as well, then fetch requests can be served directly from the translog, saving CPU cycles.
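A minimal sketch of the proposed pruning rule (illustrative names, not the actual OpenSearch deletion-policy API): the translog may only discard operations whose sequence number is below every retention lease's retained sequence number, so that any leased range can still be served from the translog.

```python
def min_retained_seq_no(retention_leases, local_checkpoint):
    """Lowest seq_no that must be kept; everything below it is prunable.

    retention_leases: mapping of lease id -> lowest seq_no that lease
    still requires. With no leases, everything up to the local
    checkpoint is prunable.
    """
    if not retention_leases:
        return local_checkpoint + 1
    return min(retention_leases.values())


def prune_translog(translog_ops, retention_leases, local_checkpoint):
    """Keep only operations that some retention lease may still fetch."""
    keep_from = min_retained_seq_no(retention_leases, local_checkpoint)
    return {seq: op for seq, op in translog_ops.items() if seq >= keep_from}
```

For example, with operations 0..3 in the translog and two leases retaining from seq_no 2 and 3 respectively, pruning keeps operations 2 and 3; with no leases at all, everything up to the local checkpoint is pruned as before.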
Details
Describe alternatives you've considered
N/A
Additional context
N/A