Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DocDB] Rollback only the youngest deadlocked transaction needed to make progress #14165

Closed
robertsami opened this issue Sep 23, 2022 · 4 comments
Assignees
Labels
2.20 Backport Required 2024.1_blocker area/docdb YugabyteDB core features kind/bug This issue is a bug priority/medium Medium priority issue

Comments

@robertsami
Copy link
Contributor

robertsami commented Sep 23, 2022

Jira Link: DB-3646
We currently make a best effort to avoid rolling back transactions unnecessarily in the face of deadlock but we do not guarantee that we won't rollback multiple such transactions. If two deadlock detector instances simultaneously discover the same deadlock with probes starting at different waiting transactions, they may concurrently rollback the waiting transaction they manage. We could instead have an approach where we use some deterministic function to choose which transaction to rollback in a deadlock so that distributed actors will make the same decision and we won't rollback more than one deadlocked transaction

@robertsami robertsami self-assigned this Sep 23, 2022
@rthallamko3 rthallamko3 added the area/docdb YugabyteDB core features label Sep 23, 2022
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue status/awaiting-triage Issue awaiting triage and removed status/awaiting-triage Issue awaiting triage labels Sep 23, 2022
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature and removed kind/bug This issue is a bug labels Oct 5, 2022
basavaraj29 added a commit that referenced this issue Jul 12, 2023
Summary:
In the exisiting implementation, we return a generic error message when transactions are aborted due to a deadlock.

This diff addresses the issue by returning a detailed error info when transactions are aborted due to a deadlock. Introduced a new flag `clear_deadlocked_txns_info_older_than_seconds`. The deadlock detector now stores transactions aborted due to a deadlock, and we prune the list every `transaction_deadlock_detection_interval_usec`. The info gets removed if it is older than `clear_deadlocked_txns_info_older_than_seconds`.

The transaction coordinator now checks with the deadlock detector before reporting `ABORTED` for a txn that it is unable to find. If the txn is found at the deadlock detector, `GetTransactionStatusResponsePB` resp is populated with an newly added field `AppStatusPB reason`, which contains detailed info on the deadlock error.

Deadlocked transactions waiting in the wait-queue are aborted in 2 ways:
1. When the txn participant isn't able to find the txn, we directly poll the status tablet for the waiter's status.
2. The txn participant responds with the status it is last aware of (in this case, `ABORTED`).

Since both these codepaths use `GetTransactionStatusResponsePB`, we explicitly check if `reason` is populated and return back the detailed error, if any.

Note: In some cases, it is still possible that we return a generic error. When a waiter txn is notified that all of its blockers have been aborted, it re-runs conflict resolution. When `PrepareMetadata` is executed, the participant returns a generic error message if the txn is not found.
This can happen as the waiter txn too could been aborted as part of the deadlock, but that signal got delayed. If we deterministically abort just one txn when we find a deadlock (#14165), it should probably solve this issue.
Jira: DB-3597

Test Plan:
Reverting back the newly added test because of another issue #18100

Steps to manually test the feature:
Create a cluster using
```
./bin/yb-ctl create --rf=3 --data_dir ~/yugabyte-data --tserver_flags 'ysql_num_shards_per_tserver=1,enable_wait_queues=true,enable_deadlock_detection=true,enable_intentsdb_page=true'
```

Create a table using
```
create table test (k int primary key, v int);
insert into test select generate_series(1,11), 0;
```

Spawn transactions and create deadlocks. Should see error similar to the below
```
ERROR:  Transaction 6640330e-d7b6-4c6e-8fec-8d8e2b3311c3 aborted due to deadlock.
6640330e-d7b6-4c6e-8fec-8d8e2b3311c3 -> 12673b67-3ccb-4a90-8a3c-7ed72d92b8c8 -> 6640330e-d7b6-4c6e-8fec-8d8e2b3311c3
: 40P01
```

Reviewers: rsami, pjain, sergei

Reviewed By: rsami

Subscribers: esheng, ybase, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D26099
@robertsami robertsami moved this from Backlog to Pending in Wait-Queue Based Locking Aug 21, 2023
@robertsami
Copy link
Contributor Author

robertsami commented Jan 17, 2024

workloads which update tables with multiple indexes will run into deadlock frequently without this. we should add a test with a table with six indexes, and parallel updates to that table

cc @pkj415 who is adding a similar test

@robertsami robertsami moved this from Pending to In Progress in Wait-Queue Based Locking Jan 19, 2024
@robertsami robertsami changed the title [dst] Rollback only the minimal set of deadlocked transactions needed to make progress [dst] Rollback only the most recent set of deadlocked transactions needed to make progress Jan 22, 2024
@robertsami robertsami changed the title [dst] Rollback only the most recent set of deadlocked transactions needed to make progress [dst] Rollback only the youngest deadlocked transaction needed to make progress Jan 22, 2024
@rthallamko3 rthallamko3 moved this from In Progress to In Review in Wait-Queue Based Locking Jan 23, 2024
@rthallamko3 rthallamko3 changed the title [dst] Rollback only the youngest deadlocked transaction needed to make progress [DocDB] Rollback only the youngest deadlocked transaction needed to make progress Jan 29, 2024
@rthallamko3
Copy link
Contributor

@Karvy-yb , Do your tests cover this scenario as well? If not, can we add it as a functional test for the next milestone?

@Karvy-yb
Copy link

Karvy-yb commented Feb 1, 2024

@rthallamko3 We have test case running into this issue today but we do not validate that only 1 transaction should be aborted (that too younger). Added this to the milestone along with multi index test case that @robertsami mentioned above.

robertsami added a commit that referenced this issue Feb 7, 2024
…king to transaction coordinator

Summary:
In preparation for the changes being introduced in D31827, this revision moves recently aborted
transaction detection to the transaction coordinator with the following changes:
1. Deadlock detector now always sends an RPC to abort a deadlocked transaction, instead of making
   a local call. This makes the behavior to be introduced in a follow-up diff consistent between
   remote and local deadlock detectors aborting transactions.
2. If the coordinator receives an AbortTransaction request with a deadlock_reason populated, it
   updates the local TransactionState with that data and retains the TransactionState in
   managed_transactions for FLAGS_clear_deadlocked_txns_info_older_than_heartbeats txn heartbeat
   intervals.
3. UpdateTransaction and GetStatus requests now check the TransactionState for deadlock_reason
   instead of making a call to the deadlock detector

**Upgrade/Rollback safety:**
This revision makes an additive change to AbortTransactionRequestPB and does not pose any Upgrade/Downgrade safety issues.
Jira: DB-3646, DB-8053

Test Plan: Jenkins

Reviewers: bkolagani, sergei

Reviewed By: bkolagani

Subscribers: bogdan, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D31937
robertsami added a commit that referenced this issue Feb 13, 2024
Summary:
This revision modifies the deadlock detector to abort only the youngest transaction in a detected deadlock.

Previously, each deadlock detector would add simply a txn_id to the deadlock probe response. We now include information about the status tablet and txn age.

Upon receiving a probe response, the probe origin previously would abort the transaction for which the probe was started, which was guaranteed to be local. We now abort the youngest transaction, which may require a remote AbortTransaction RPC.

**Upgrade/Rollback safety:**
This revision adds a new "deadlock" field to ProbeTransactionDeadlockResponsePB which contains a new custom DeadlockedTxnInfo message type. This field is meant to replace the existing deadlocked_txn_ids field. The code in this revision updates both fields and handles both fields to ensure upgrade/downgrade safety without the use of any flags.
Jira: DB-3646

Test Plan: ybd --cxx-test pgwrapper_pg_wait_on_conflict-test --gtest_filter PgWaitQueuesTest.DeadlockResolvesYoungestTxn

Reviewers: bkolagani, pjain

Reviewed By: bkolagani

Subscribers: yql, ybase, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D31827
@basavaraj29 basavaraj29 moved this from In Review to Backporting in Wait-Queue Based Locking Feb 14, 2024
basavaraj29 added a commit that referenced this issue Feb 15, 2024
…saction in a deadlock"

Summary:
This reverts commit ad37035.

The above commit caused a regression for contentious serializable workloads.

Earlier, when deadlock cycles were detected, we used to abort the transaction that originated the probe. But the above commit changes that logic to abort the youngest transaction instead. The problem with this approach is that when the origin transaction is part of two or more deadlock cycles, our deadlock algorithm ignores the callback for all cycles except the first detected cycle. So with the new logic, we end up in a deadlocked state, but don't detect it, leading to requests being timed out in the wait-queue (and new requests coming in, which trigger the probing algorithm again, which is when another deadlock might be detected). Despite the algorithm ignoring callbacks for subsequent cycle detected for a particular probe, we don't have this problem with the earlier algorithm since it aborts the originating transaction, breaking all possible deadlock cycles.

The test `testSerializableReadDelayWrite` rightly catches this issue.

**Upgrade/Downgrade safety** - The above commit hasn't been part of any official release yet, so reverting the new proto changes for now. This shouldn't have any implications on upgrades/downgrades among stable releases.
Jira: DB-10025, DB-3646

Test Plan: ./yb_build.sh --java-test 'org.yb.pgsql.TestPgTransactions#testSerializableReadDelayWrite'

Reviewers: rthallam, rsami, hsunder

Reviewed By: rthallam, rsami

Subscribers: ybase, yql, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D32461
@basavaraj29 basavaraj29 moved this from Backporting to In Progress in Wait-Queue Based Locking Feb 15, 2024
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug and removed kind/enhancement This is an enhancement of an existing feature labels Mar 14, 2024
@robertsami robertsami moved this from In Progress to In Review in Wait-Queue Based Locking Mar 18, 2024
robertsami added a commit that referenced this issue Mar 20, 2024
… a deadlock

Summary:
Note: this revision was originally reverted in 9e07d79, as it caused a legitimate bug in serializable workloads which was originally masked with a test change. In particular, we could get into a situation where a single transaction ends up creating multiple deadlocks at the same time, but since we were only aborting the youngest txn with the original change, and only processing one deadlock per probe, if the offending transaction was not the youngest transaction in the first deadlock detected, it would not be aborted, and the second deadlock would be ignored until the coordinator triggered another probe up to 60s later.

We are re-introducing the change here with the test change removed to un-mask the original issue, along with the following fixes:
1. When processing probes, do not drop ProbeResponses from old versions which may not have the deadlock field populated. This was a bug in the initial implementation that was unrelated to the revert
2. If a deadlock is detected, and it is determined that a txn other than the originating txn should be aborted, trigger another probe for the originating transaction in case it was involved in another deadlock which would be later discovered (and ignored) by the original probe
3. In IsAnySubtxnActive, if the transaction is aborted, return false. This was a bug in the original implementation as well

Original Summary:

This revision modifies the deadlock detector to abort only the youngest transaction in a detected deadlock.

Previously, each deadlock detector would add simply a txn_id to the deadlock probe response. We now include information about the status tablet and txn age.

Upon receiving a probe response, the probe origin previously would abort the transaction for which the probe was started, which was guaranteed to be local. We now abort the youngest transaction, which may require a remote AbortTransaction RPC.

**Upgrade/Rollback safety:**
This revision adds a new "deadlock" field to ProbeTransactionDeadlockResponsePB which contains a new custom DeadlockedTxnInfo message type. This field is meant to replace the existing deadlocked_txn_ids field. The code in this revision updates both fields and handles both fields to ensure upgrade/downgrade safety without the use of any flags.
Jira: DB-3646

Test Plan: ybd --cxx-test pgwrapper_pg_wait_on_conflict-test --gtest_filter PgWaitQueuesTest.DeadlockResolvesYoungestTxn

Reviewers: bkolagani, pjain

Reviewed By: bkolagani

Subscribers: bogdan, ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D32542
@robertsami robertsami moved this from In Review to Backporting in Wait-Queue Based Locking Mar 20, 2024
robertsami added a commit that referenced this issue Mar 23, 2024
…ransaction tracking to transaction coordinator

Summary:
Original commit: 916da9e / D31937
In preparation for the changes being introduced in D31827, this revision moves recently aborted
transaction detection to the transaction coordinator with the following changes:
1. Deadlock detector now always sends an RPC to abort a deadlocked transaction, instead of making
   a local call. This makes the behavior to be introduced in a follow-up diff consistent between
   remote and local deadlock detectors aborting transactions.
2. If the coordinator receives an AbortTransaction request with a deadlock_reason populated, it
   updates the local TransactionState with that data and retains the TransactionState in
   managed_transactions for FLAGS_clear_deadlocked_txns_info_older_than_heartbeats txn heartbeat
   intervals.
3. UpdateTransaction and GetStatus requests now check the TransactionState for deadlock_reason
   instead of making a call to the deadlock detector

**Upgrade/Rollback safety:**
This revision makes an additive change to AbortTransactionRequestPB and does not pose any Upgrade/Downgrade safety issues.
Jira: DB-3646, DB-8053

Test Plan: Jenkins

Reviewers: bkolagani, sergei

Reviewed By: bkolagani

Subscribers: ybase, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D33390
robertsami added a commit that referenced this issue Mar 26, 2024
…st transaction in a deadlock

Summary:
Original commit: e8e7394 / D32542
Note: this revision was originally reverted in 9e07d79, as it caused a legitimate bug in serializable workloads which was originally masked with a test change. In particular, we could get into a situation where a single transaction ends up creating multiple deadlocks at the same time, but since we were only aborting the youngest txn with the original change, and only processing one deadlock per probe, if the offending transaction was not the youngest transaction in the first deadlock detected, it would not be aborted, and the second deadlock would be ignored until the coordinator triggered another probe up to 60s later.

We are re-introducing the change here with the test change removed to un-mask the original issue, along with the following fixes:
1. When processing probes, do not drop ProbeResponses from old versions which may not have the deadlock field populated. This was a bug in the initial implementation that was unrelated to the revert
2. If a deadlock is detected, and it is determined that a txn other than the originating txn should be aborted, trigger another probe for the originating transaction in case it was involved in another deadlock which would be later discovered (and ignored) by the original probe
3. In IsAnySubtxnActive, if the transaction is aborted, return false. This was a bug in the original implementation as well

Original Summary:
Original commit: e8e7394 / D32542

This revision modifies the deadlock detector to abort only the youngest transaction in a detected deadlock.

Previously, each deadlock detector would add simply a txn_id to the deadlock probe response. We now include information about the status tablet and txn age.

Upon receiving a probe response, the probe origin previously would abort the transaction for which the probe was started, which was guaranteed to be local. We now abort the youngest transaction, which may require a remote AbortTransaction RPC.

**Upgrade/Rollback safety:**
This revision adds a new "deadlock" field to ProbeTransactionDeadlockResponsePB which contains a new custom DeadlockedTxnInfo message type. This field is meant to replace the existing deadlocked_txn_ids field. The code in this revision updates both fields and handles both fields to ensure upgrade/downgrade safety without the use of any flags.
Jira: DB-3646

Test Plan: ybd --cxx-test pgwrapper_pg_wait_on_conflict-test --gtest_filter PgWaitQueuesTest.DeadlockResolvesYoungestTxn

Reviewers: bkolagani, pjain

Reviewed By: bkolagani

Subscribers: yql, ybase, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D33388
@rthallamko3
Copy link
Contributor

Pending 2.20 backport.

jasonyb pushed a commit that referenced this issue Apr 10, 2024
…ter-merge

Merge YB master commit 9e07d79 titled

    [#21058] DocDB: Revert "[#14165] DocDB: Rollback only the newest transaction in a deadlock"

and committed 2024-02-15T17:28:56-06:00 into YB pg15.

YB pg15 initial merge is 55782d5.

- jenkins_jobs.yml: YB master 9f91eeb
  changes clang16 to clang17.  pg15 branch has multiple modifications to
  this file.  Apply both.
- yb_uniqkeys.c: yb_is_const_clause_for_distinct_pushdown: YB master
  9161aec and YB pg15
  40e68e2 touch the same code.  Take
  pg15's version, as suggested by Patnaik.
- slot.c: ReplicationSlotCreate: YB master
  aa3528e adds extra parameter
  CRSSnapshotAction yb_snapshot_action; upstream PG
  19890a064ebf53dedcefed0d8339ed3d449b06e6 adds extra parameter
  two_phase.  Apply both.  Maybe two_phase should be passed down to
  YBCCreateReplicationSlot.
- slotfuncs.c: upstream PG 9f06d79ef831ffa333f908f6d3debdb654292414
  moves two ReplicationSlotCreate calls into
  create_physical_replication_slot and create_logical_replication_slot.
  YB master aa3528e passes
  CRS_NOEXPORT_SNAPSHOT to ReplicationSlotCreate.  Move that and any
  comments.  Note that create_logical_replication_slot and
  create_physical_replication_slot are also additionally called by
  copy_replication_slot, and hopefully CRS_NOEXPORT_SNAPSHOT makes sense
  for that case.
- walsender.c: CreateReplicationSlot: (same as slot.c)
- elog.c:
  - function declarations: YB pg15
    85f8a82 moves declarations up, so
    apply the changes of YB master
    2d0bd35 there.
  - yb_write_status_to_server_log: YB pg15
    85f8a82 adds call of
    yb_message_from_status_data that gets renamed to
    yb_format_and_append by YB master
    2d0bd35.
  - elog_start, elog_finish: upstream PG
    17a28b03645e27d73bf69a95d7569b61e58f06eb deletes these functions, so
    drop changes to elog_finish from YB master
    2d0bd35.
  - EVALUATE_MESSAGE: YB master 2d0bd35
    deletes yb_debug_report_error_stacktrace if condition whereas
    upstream PG d6c55de1f99a9028540516316b95321a7b12a540 removes
    pfree(fmtbuf).  Do both deletions.
  - Check that YB_EVALUATE_MESSAGE_FROM_STATUS is up-to-date with
    EVALUATE_MESSAGE: it is now closer due to PG
    d6c55de1f99a9028540516316b95321a7b12a540 removing formatting code
    that was never present in YB_EVALUATE_MESSAGE_FROM_STATUS.  Update
    the comment for YB_EVALUATE_MESSAGE_FROM_STATUS accordingly.
  - Check that yb_additional_errmsg is called wherever needed as
    explained in the comment.  The only new candidate function between
    PG 11.2 and PG 15.2 is errhint_plural, and that should not get
    yb_additional_errmsg since it's a hint.
- pgbench.c:
  - includes: upstream PG dddf4cdc3300073ec04b2c3e482a4c1fa4b8191b moves
    pgbench.h include higher.  Adjacent conflict with YB master
    6a009b1 adding
    ysql_bench_metrics_handler.h include.
  - variable declarations: upstream PG
    0e39a608ed5545cc6b9d538ac937c3c1ee8cdc36 adds pg_time_usec_t
    epoch_shift in the same location YB master
    6a009b1 adds YsqlBenchMetricEntry
    *ysql_bench_metric_entry = NULL.  Take both.
  - usage: YB pg15 initial merge moves --batch-size, introduced by YB
    master 62fd877, down into "Common
    options".  This is an adjacent conflict with YB master
    6a009b1 adding "Prometheus metrics
    options".
  - main:
    - old YB master 62fd877 adds
      batch-size, old YB master af25ba5
      adds max-tries, incoming YB master
      6a009b1 adds
      yb-metrics-bind-address and yb-metrics-bind-port.  YB pg15 initial
      merge preserves max-tries but deletes batch-size (this might be
      because max-tries was an import that originated from upstream PG
      while batch-size was not).
    - It also handles the max-tries case properly in getopt_long while
      loop but omits the batch-size case.  Ignore batch-size for now and
      try to faithfully preserve the yb-metrics-* options.  This
      requires renumbering the cases.
    - Upstream PG 547f04e7348b6ed992bd4a197d39661fe7c25097 changes a
      line to "conn_total_duration = 0" where YB master
      6a009b1 adjacently adds "Start
      metrics webserver".  Apply both.
  - threadRun: YB master 6a009b1 adds
    "if (ysql_bench_metric_entry)", but upstream PG
    9f75e3772350fb66f20a3d7f33bc94f30300d7eb moves all that code into a
    new function printProgressReport.  Move the changes there.  It is
    nontrivial to find where it belongs: after "fprintf(stderr," because
    that is the original line that was changed by old YB master
    af25ba5.  It appears the changes of
    that old YB master af25ba5 have not
    yet been translated to the new location printProgressReport, but
    ignore this issue for now.
- libpq-be.h: YB master 1b784fe adds
  yb_is_ssl_enabled_in_logical_conn while upstream PG adds authn_id in
  the same place.  Apply both.
- slot.h: (same as slot.c)
- pg15_tests/test_D31368.sh, pg15_tests/passing_tests.tsv: thanks to YB
  master e62fdc8, TestPgRegressProc now
  passes, so move it to passing_tests.tsv.
jasonyb pushed a commit that referenced this issue Apr 10, 2024
…bbc01c4' into pg15

Summary:
Merge YB master commit 9e07d79 titled

    [#21058] DocDB: Revert "[#14165] DocDB: Rollback only the newest transaction in a deadlock"

and committed 2024-02-15T17:28:56-06:00 into YB pg15.

YB pg15 initial merge is 55782d5.

- jenkins_jobs.yml: YB master 9f91eeb
  changes clang16 to clang17.  pg15 branch has multiple modifications to
  this file.  Apply both.
- yb_uniqkeys.c: yb_is_const_clause_for_distinct_pushdown: YB master
  9161aec and YB pg15
  40e68e2 touch the same code.  Take
  pg15's version, as suggested by Patnaik.
- slot.c: ReplicationSlotCreate: YB master
  aa3528e adds extra parameter
  CRSSnapshotAction yb_snapshot_action; upstream PG
  19890a064ebf53dedcefed0d8339ed3d449b06e6 adds extra parameter
  two_phase.  Apply both.  Maybe two_phase should be passed down to
  YBCCreateReplicationSlot.
- slotfuncs.c: upstream PG 9f06d79ef831ffa333f908f6d3debdb654292414
  moves two ReplicationSlotCreate calls into
  create_physical_replication_slot and create_logical_replication_slot.
  YB master aa3528e passes
  CRS_NOEXPORT_SNAPSHOT to ReplicationSlotCreate.  Move that and any
  comments.  Note that create_logical_replication_slot and
  create_physical_replication_slot are also additionally called by
  copy_replication_slot, and hopefully CRS_NOEXPORT_SNAPSHOT makes sense
  for that case.
- walsender.c: CreateReplicationSlot: (same as slot.c)
- elog.c:
  - function declarations: YB pg15
    85f8a82 moves declarations up, so
    apply the changes of YB master
    2d0bd35 there.
  - yb_write_status_to_server_log: YB pg15
    85f8a82 adds call of
    yb_message_from_status_data that gets renamed to
    yb_format_and_append by YB master
    2d0bd35.
  - elog_start, elog_finish: upstream PG
    17a28b03645e27d73bf69a95d7569b61e58f06eb deletes these functions, so
    drop changes to elog_finish from YB master
    2d0bd35.
  - EVALUATE_MESSAGE: YB master 2d0bd35
    deletes yb_debug_report_error_stacktrace if condition whereas
    upstream PG d6c55de1f99a9028540516316b95321a7b12a540 removes
    pfree(fmtbuf).  Do both deletions.
  - Check that YB_EVALUATE_MESSAGE_FROM_STATUS is up-to-date with
    EVALUATE_MESSAGE: it is now closer due to PG
    d6c55de1f99a9028540516316b95321a7b12a540 removing formatting code
    that was never present in YB_EVALUATE_MESSAGE_FROM_STATUS.  Update
    the comment for YB_EVALUATE_MESSAGE_FROM_STATUS accordingly.
  - Check that yb_additional_errmsg is called wherever needed as
    explained in the comment.  The only new candidate function between
    PG 11.2 and PG 15.2 is errhint_plural, and that should not get
    yb_additional_errmsg since it's a hint.
- pgbench.c:
  - includes: upstream PG dddf4cdc3300073ec04b2c3e482a4c1fa4b8191b moves
    pgbench.h include higher.  Adjacent conflict with YB master
    6a009b1 adding
    ysql_bench_metrics_handler.h include.
  - variable declarations: upstream PG
    0e39a608ed5545cc6b9d538ac937c3c1ee8cdc36 adds pg_time_usec_t
    epoch_shift in the same location YB master
    6a009b1 adds YsqlBenchMetricEntry
    *ysql_bench_metric_entry = NULL.  Take both.
  - usage: YB pg15 initial merge moves --batch-size, introduced by YB
    master 62fd877, down into "Common
    options".  This is an adjacent conflict with YB master
    6a009b1 adding "Prometheus metrics
    options".
  - main:
    - old YB master 62fd877 adds
      batch-size, old YB master af25ba5
      adds max-tries, incoming YB master
      6a009b1 adds
      yb-metrics-bind-address and yb-metrics-bind-port.  YB pg15 initial
      merge preserves max-tries but deletes batch-size (this might be
      because max-tries was an import that originated from upstream PG
      while batch-size was not).
    - It also handles the max-tries case properly in getopt_long while
      loop but omits the batch-size case.  Ignore batch-size for now and
      try to faithfully preserve the yb-metrics-* options.  This
      requires renumbering the cases.
    - Upstream PG 547f04e7348b6ed992bd4a197d39661fe7c25097 changes a
      line to "conn_total_duration = 0" where YB master
      6a009b1 adjacently adds "Start
      metrics webserver".  Apply both.
  - threadRun: YB master 6a009b1 adds
    "if (ysql_bench_metric_entry)", but upstream PG
    9f75e3772350fb66f20a3d7f33bc94f30300d7eb moves all that code into a
    new function printProgressReport.  Move the changes there.  It is
    nontrivial to find where it belongs: after "fprintf(stderr," because
    that is the original line that was changed by old YB master
    af25ba5.  It appears the changes of
    that old YB master af25ba5 have not
    yet been translated to the new location printProgressReport, but
    ignore this issue for now.
- libpq-be.h: YB master 1b784fe adds
  yb_is_ssl_enabled_in_logical_conn while upstream PG adds authn_id in
  the same place.  Apply both.
- slot.h: (same as slot.c)
- pg15_tests/test_D31368.sh, pg15_tests/passing_tests.tsv: thanks to YB
  master e62fdc8, TestPgRegressProc now
  passes, so move it to passing_tests.tsv.

Test Plan:
On Almalinux 8:

    #!/usr/bin/env bash
    set -eu
    ./yb_build.sh fastdebug --gcc11
    for _ in {1..50}; do grep RegressProc pg15_tests/passing_tests.tsv; done | pg15_tests/run_tests.sh
    # org.yb.pgsql.TestAlterTableWithConcurrentTxn#testDmlTransactionAfterAlterOnCurrentResourceWithCachedMetadata
    # might be flaky?
    pg15_tests/run_tests.sh

Tests from 1fc7b74:

    ./yb_build.sh fastdebug --gcc11 --enable_ysql_conn_mgr_test --java-test org.yb.pgsql.TestLDAPAuth --sj
    # TestYbRoleProfile without ysql_conn_mgr already fails.
    #./yb_build.sh fastdebug --gcc11 --enable_ysql_conn_mgr_test --java-test org.yb.pgsql.TestYbRoleProfile --sj

Manual test from 2d0bd35:

    #!/usr/bin/env bash
    set -eux
    ./yb_build.sh fastdebug --gcc11
    bin/yb-ctl destroy
    bin/yb-ctl create --ysql_pg_conf_csv yb_debug_report_error_stacktrace=true
    sleep 5
    # errmsg, errdetail, errhint
    count=$(bin/ysqlsh -c 'create table t (i int)' -c 'alter table t add primary key (i)' |& grep -c YBCGetStackTrace)
    [ "$count" -eq 1 ]
    # yb_errmsg_from_status
    count=$(bin/ysqlsh -c 'insert into t values (1), (1)' |& grep -c YBCGetStackTrace)
    [ "$count" -eq 1 ]
    # errmsg_internal
    count=$(bin/ysqlsh -c 'begin' -c 'set transaction_deferrable = false' |& grep -c YBCGetStackTrace)
    [ "$count" -eq 1 ]
    # elog_finish
    count=$(bin/ysqlsh -c 'create temp table tmp (i int)' -c "select yb_get_range_split_clause('tmp'::regclass)" |& grep -c YBCGetStackTrace)
    [ "$count" -eq 1 ]
    # Workaround for #20820 (comment)
    sleep 2
    # errmsg_plural
    count=$(bin/ysqlsh -c 'create table r (i int primary key)' -c 'create table f1 (i int references r)' -c 'create table f2 (i int references r)' -c 'drop table r cascade' |& grep -c YBCGetStackTrace)
    [ "$count" -eq 1 ]

Running other test plan tests, got

    1	2024-04-09T15:50:35-07:00	JAVA	org.yb.pgsql.TestPgRegressDistinctPushdown
    0	2024-04-09T15:50:57-07:00	JAVA	org.yb.ysqlconnmgr.TestPreparedStatements#testPreparedQuery
    0	2024-04-09T15:51:05-07:00	integration-tests_xcluster_ddl_queue_handler-test	XClusterDDLQueueHandlerMockedTest.VerifySafeTimes
    0	2024-04-09T15:51:12-07:00	integration-tests_xcluster_ddl_queue_handler-test	XClusterDDLQueueHandlerMockedTest.VerifyBasicJsonParsing
    0	2024-04-09T15:51:19-07:00	integration-tests_xcluster_ddl_queue_handler-test	XClusterDDLQueueHandlerMockedTest.VerifyAlreadyProcessed
    0	2024-04-09T15:51:27-07:00	integration-tests_xcluster_safe_time_service-test	XClusterSafeTimeServiceTest.ComputeSafeTime
    0	2024-04-09T15:51:35-07:00	integration-tests_xcluster_safe_time_service-test	XClusterSafeTimeServiceTest.ComputeSafeTimeWithFilters
    0	2024-04-09T15:51:43-07:00	integration-tests_xcluster_safe_time_service-test	XClusterSafeTimeServiceTest.ComputeSafeTimeWithFiltersSingleTablet
    0	2024-04-09T15:52:14-07:00	pgwrapper_pg_get_lock_status-test	PgGetLockStatusTest.TestWaiterLockContainingColumnId
    0	2024-04-09T15:52:27-07:00	tserver_clone_operation-test	CloneOperationTest.Hardlink
    8	2024-04-09T15:52:40-07:00	tserver_clone_operation-test	CloneOperationTest.CloneAndRestore
    8	2024-04-09T15:52:52-07:00	tserver_clone_operation-test	CloneOperationTest.CloneOnSameDrive
    0	2024-04-09T15:53:16-07:00	integration-tests_clone-tablet-itest	CloneTabletExternalItest/CloneTabletExternalCrashItest.CrashDuringApply/0
    0	2024-04-09T15:54:29-07:00	integration-tests_xcluster_ysql_index-test	XClusterDbScopedYsqlIndexTest.CreateIndex
    0	2024-04-09T15:55:46-07:00	integration-tests_xcluster_ysql_index-test	XClusterDbScopedYsqlIndexTest.CreateIndexWithWorkload
    0	2024-04-09T15:57:55-07:00	integration-tests_xcluster_db_scoped-test	XClusterDBScopedTest.CreateTable
    0	2024-04-09T15:58:04-07:00	integration-tests_xcluster_ddl_queue_handler-test	XClusterDDLQueueHandlerMockedTest.SkipScanWhenNoNewRecords
    0	2024-04-09T15:59:51-07:00	JAVA	org.yb.pgsql.TestPgWriteRestart
    1	2024-04-09T16:31:44-07:00	JAVA	org.yb.pgsql.TestPgEncryption
    0	2024-04-09T16:32:52-07:00	JAVA	org.yb.pgsql.TestPgDdlConcurrency#testModifiedTableWrite
    0	2024-04-09T16:38:02-07:00	JAVA	org.yb.pgsql.ConcurrentTablespaceTest

- TestPgRegressDistinctPushdown fails because EXPLAIN differences
- The two Clone tests fail with "Bad status: Not found (yb/tserver/ts_tablet_manager.cc:2256): Tablet target_tablet_id not found"
- TestPgEncryption fails with "Connection to 127.0.0.182:27726 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections."

Jenkins: rebase: pg15

Reviewers: aagrawal, tfoucher, xCluster, hsunder

Reviewed By: tfoucher

Subscribers: patnaik.balivada, ybase, ycdcxcluster, yql

Differential Revision: https://phorge.dev.yugabyte.com/D33933
robertsami added a commit that referenced this issue Apr 11, 2024
… transaction in a deadlock

Summary:
Original commit: e8e7394 / D32542
Note: this revision was originally reverted in 9e07d79, as it caused a legitimate bug in serializable workloads which was originally masked with a test change. In particular, we could get into a situation where a single transaction ends up creating multiple deadlocks at the same time, but since we were only aborting the youngest txn with the original change, and only processing one deadlock per probe, if the offending transaction was not the youngest transaction in the first deadlock detected, it would not be aborted, and the second deadlock would be ignored until the coordinator triggered another probe up to 60s later.

We are re-introducing the change here with the test change removed to un-mask the original issue, along with the following fixes:
1. When processing probes, do not drop ProbeResponses from old versions which may not have the deadlock field populated. This was a bug in the initial implementation that was unrelated to the revert
2. If a deadlock is detected, and it is determined that a txn other than the originating txn should be aborted, trigger another probe for the originating transaction in case it was involved in another deadlock which would be later discovered (and ignored) by the original probe
3. In IsAnySubtxnActive, if the transaction is aborted, return false. This was a bug in the original implementation as well

Original Summary:
Original commit: e8e7394 / D32542

This revision modifies the deadlock detector to abort only the youngest transaction in a detected deadlock.

Previously, each deadlock detector would add simply a txn_id to the deadlock probe response. We now include information about the status tablet and txn age.

Upon receiving a probe response, the probe origin previously would abort the transaction for which the probe was started, which was guaranteed to be local. We now abort the youngest transaction, which may require a remote AbortTransaction RPC.

**Upgrade/Rollback safety:**
This revision adds a new "deadlock" field to ProbeTransactionDeadlockResponsePB which contains a new custom DeadlockedTxnInfo message type. This field is meant to replace the existing deadlocked_txn_ids field. The code in this revision updates both fields and handles both fields to ensure upgrade/downgrade safety without the use of any flags.
Jira: DB-3646

Test Plan: ybd --cxx-test pgwrapper_pg_wait_on_conflict-test --gtest_filter PgWaitQueuesTest.DeadlockResolvesYoungestTxn

Reviewers: bkolagani, pjain

Reviewed By: bkolagani

Subscribers: yql, ybase, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D33495
@github-project-automation github-project-automation bot moved this from Backporting to Done in Wait-Queue Based Locking Apr 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.20 Backport Required 2024.1_blocker area/docdb YugabyteDB core features kind/bug This issue is a bug priority/medium Medium priority issue
Projects
Status: Done
Development

No branches or pull requests

4 participants