Improve aborted transactions cleanup #5669

Closed
spolitov opened this issue Sep 14, 2020 · 0 comments
Labels: area/docdb (YugabyteDB core features), kind/bug (This issue is a bug), priority/high (High Priority)

@spolitov (Contributor):

No description provided.

@spolitov added the kind/bug and area/docdb labels on Sep 14, 2020
@spolitov self-assigned this on Sep 14, 2020
spolitov added a commit that referenced this issue Sep 18, 2020
Summary:
When a transaction is aborted, its intents should be cleaned up from the participating tablets, and the transaction itself should be unloaded from memory.
This diff fixes various scenarios in which transactions were not cleaned up:
1) Added a cleanup cache to the transaction participant, so that it can clean up a transaction when a cleanup request is received before the transaction has been replicated to this node.
2) Fix state check for cleanup when the transaction heartbeat failed.
3) Clean up a transaction that failed to commit.
4) Attempt to clean up tablets that were not marked as having metadata: when a child transaction fails, we could have written intents to them while reporting the overall operation as failed.
5) Clean up tablets involved in a child transaction when it fails.
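
The cleanup cache in item 1 can be illustrated with a minimal sketch. This is not the code from the diff: the names CleanupCache, Participant, HandleCleanup and HandleReplicated are hypothetical, a transaction id is modeled as a string, and the bounded FIFO eviction policy is an assumption made only to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a transaction id.
using TxnId = std::string;

// Bounded FIFO set of transaction ids for which a cleanup request has
// already been seen. The eviction policy is an assumption of this sketch.
class CleanupCache {
 public:
  explicit CleanupCache(std::size_t capacity) : capacity_(capacity) {
    assert(capacity_ > 0);
  }

  // Remember that cleanup was requested for `id`, evicting the oldest
  // entry once the cache grows past its capacity.
  void Insert(const TxnId& id) {
    if (!ids_.insert(id).second) return;  // already remembered
    order_.push_back(id);
    if (order_.size() > capacity_) {
      ids_.erase(order_.front());
      order_.pop_front();
    }
  }

  bool Contains(const TxnId& id) const { return ids_.count(id) != 0; }

 private:
  std::size_t capacity_;
  std::deque<TxnId> order_;
  std::unordered_set<TxnId> ids_;
};

// Hypothetical, heavily simplified participant.
class Participant {
 public:
  explicit Participant(std::size_t cache_capacity)
      : cleanup_cache_(cache_capacity) {}

  // A cleanup request may arrive before the transaction is known here,
  // so it is always recorded in the cache.
  void HandleCleanup(const TxnId& id) {
    cleanup_cache_.Insert(id);
    running_.erase(id);
  }

  // Called when the transaction is replicated to this node. Returns false
  // if the transaction is dropped because cleanup was already requested.
  bool HandleReplicated(const TxnId& id) {
    if (cleanup_cache_.Contains(id)) return false;
    running_.insert(id);
    return true;
  }

  bool IsRunning(const TxnId& id) const { return running_.count(id) != 0; }

 private:
  CleanupCache cleanup_cache_;
  std::unordered_set<TxnId> running_;
};
```

In this sketch a cleanup request for an unknown transaction is not lost: when the transaction later arrives through replication, HandleReplicated refuses to load it.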

Test Plan: ybd --gtest_filter CqlIndexTest.TxnCleanup

Reviewers: timur, mikhail

Reviewed By: mikhail

Subscribers: ybase, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D9358
@kmuthukk added the priority/high label on Sep 18, 2020
spolitov added a commit that referenced this issue Oct 7, 2020
…up aborted ones

Summary:
This diff adds a periodic status check of each running transaction to transaction participant. This is needed to detect transactions that have been aborted and abandoned more proactively. Such cases might happen when the transaction client has crashed, so that there is no one to send a cleanup RPC to the transaction participant. Previously, we would have to wait for a compaction for those transactions' intents to be cleaned up.

The cleanup mechanism works as follows. Every running transaction now has an associated scheduled abort check hybrid time, abort_check_ht, which we set to start time + FLAGS_transaction_abort_check_interval_ms when the transaction starts. We keep resetting it to current time + the same interval FLAGS_transaction_abort_check_interval_ms when we receive a response saying the transaction is still pending. As a result of this, in the normal situation with no network disconnections or slowness, we check the status of each pending transaction once per FLAGS_transaction_abort_check_interval_ms milliseconds on average. In case of slow status request processing, we wait for the previous status request to time out (as per FLAGS_transaction_abort_check_timeout_ms flag) before scheduling a new status check for the same transaction.
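
The scheduling rule described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the diff: hybrid time is modeled as a plain millisecond counter, the two flags are passed in through a hypothetical AbortCheckConfig struct, and all function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical millisecond timestamp standing in for a hybrid time.
using TimeMs = std::int64_t;

struct AbortCheckConfig {
  TimeMs check_interval_ms;  // role of FLAGS_transaction_abort_check_interval_ms
  TimeMs check_timeout_ms;   // role of FLAGS_transaction_abort_check_timeout_ms
};

// Per-transaction abort check state (field names other than
// abort_check_ht are assumptions of this sketch).
struct TxnAbortCheck {
  TimeMs abort_check_ht = 0;
  bool request_in_flight = false;
  TimeMs request_sent_at = 0;
};

// On transaction start, the first check is due one interval after start.
TxnAbortCheck StartTxn(TimeMs start, const AbortCheckConfig& cfg) {
  assert(cfg.check_interval_ms > 0 && cfg.check_timeout_ms > 0);
  TxnAbortCheck t;
  t.abort_check_ht = start + cfg.check_interval_ms;
  return t;
}

// Decide whether a status request should be sent at time `now`.
bool ShouldSendStatusRequest(const TxnAbortCheck& t, TimeMs now,
                             const AbortCheckConfig& cfg) {
  if (t.request_in_flight) {
    // A previous request is outstanding: wait until it has timed out.
    return now >= t.request_sent_at + cfg.check_timeout_ms;
  }
  return now >= t.abort_check_ht;
}

void OnStatusRequestSent(TxnAbortCheck& t, TimeMs now) {
  t.request_in_flight = true;
  t.request_sent_at = now;
}

// The coordinator replied that the transaction is still pending:
// push the next check one interval into the future.
void OnPendingResponse(TxnAbortCheck& t, TimeMs now,
                       const AbortCheckConfig& cfg) {
  t.request_in_flight = false;
  t.abort_check_ht = now + cfg.check_interval_ms;
}
```

With an interval of 5000 ms and a timeout of 30000 ms (values chosen arbitrarily for illustration), a transaction started at t = 1000 is first checked at t = 6000. If the "pending" reply arrives at t = 6100, the next check is due at t = 11100.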

To efficiently implement the above polling mechanism, we use rpc::Poller and rpc::Scheduler to invoke a Poll function every FLAGS_transactions_status_poll_interval_ms milliseconds. This polling interval is much smaller than the per-transaction status check interval FLAGS_transaction_abort_check_interval_ms. The Poll function uses the new sequential index on abort_check_ht, which this diff adds to the transactions_ multi-index container in TransactionParticipant, to obtain the set of transactions that are due for a status check at this iteration.
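
The role of the index on abort_check_ht can be shown with a small sketch. It deliberately swaps the multi-index container for a std::multimap paired with a std::unordered_map, so it is not the structure used in TransactionParticipant. The class name AbortCheckIndex and its methods are hypothetical, and hybrid time is again modeled as a millisecond counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

using TimeMs = std::int64_t;  // hypothetical stand-in for a hybrid time
using TxnId = std::string;    // hypothetical stand-in for a transaction id

// Keeps transactions ordered by their scheduled abort check time, so each
// poll only visits the transactions that are actually due.
class AbortCheckIndex {
 public:
  // Insert a transaction or move it to a new check time.
  void Schedule(const TxnId& id, TimeMs abort_check_ht) {
    Remove(id);
    by_id_[id] = by_time_.emplace(abort_check_ht, id);
  }

  void Remove(const TxnId& id) {
    auto it = by_id_.find(id);
    if (it == by_id_.end()) return;
    by_time_.erase(it->second);
    by_id_.erase(it);
  }

  // Ids whose abort_check_ht is <= now, in increasing order of check time.
  // The scan stops at the first entry that is not yet due.
  std::vector<TxnId> Due(TimeMs now) const {
    std::vector<TxnId> due;
    for (auto it = by_time_.begin();
         it != by_time_.end() && it->first <= now; ++it) {
      due.push_back(it->second);
    }
    return due;
  }

  std::size_t size() const {
    assert(by_id_.size() == by_time_.size());  // both views stay in sync
    return by_id_.size();
  }

 private:
  using TimeIndex = std::multimap<TimeMs, TxnId>;
  TimeIndex by_time_;
  std::unordered_map<TxnId, TimeIndex::iterator> by_id_;
};
```

Because the time-ordered view is sorted, the cost of one poll in this sketch is proportional to the number of due transactions rather than to the number of running ones, which is what makes a short poll interval affordable.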

Also, in this diff we extract the code in TransactionParticipant that loads transaction metadata from intents RocksDB and large-transaction "apply metadata" from regular RocksDB into memory to a new class TransactionLoader.

Test Plan: ybd --gtest_filter CqlIndexTest.TxnPollCleanup

Reviewers: mikhail

Reviewed By: mikhail

Subscribers: bogdan, ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D9427
@spolitov closed this as completed on Oct 7, 2020