Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] Update with expression pushdown may cause data inconsistency #11886

Closed
d-uspenskiy opened this issue Mar 25, 2022 · 0 comments
Closed
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/high High Priority

Comments

@d-uspenskiy
Copy link
Contributor

Description

The following test fails on latest master

TEST_F(PgMiniTest, YB_DISABLE_TEST_IN_TSAN(ConcurrentSingleRowUpdate)) {
  auto conn = ASSERT_RESULT(Connect());
  ASSERT_OK(conn.Execute("CREATE TABLE t(k INT PRIMARY KEY, counter INT)"));
  ASSERT_OK(conn.Execute("INSERT INTO t VALUES(1, 0)"));
  const size_t thread_count = 10;
  const size_t increment_per_thread = 5;
  {
    CountDownLatch latch(thread_count);
    TestThreadHolder thread_holder;
    for (size_t i = 0; i < thread_count; ++i) {
      thread_holder.AddThreadFunctor([this, &stop = thread_holder.stop_flag(), &latch] {
        auto thread_conn = ASSERT_RESULT(Connect());
        ASSERT_OK(thread_conn.Execute("SET yb_enable_expression_pushdown TO true"));
        latch.CountDown();
        latch.Wait();
        for (size_t j = 0; j < increment_per_thread; ++j) {
          ASSERT_OK(thread_conn.Execute("UPDATE t SET counter = counter + 1 WHERE k = 1"));
        }
      });
    }
  }
  auto res = ASSERT_RESULT(conn.Fetch("SELECT counter FROM t WHERE k = 1"));
  ASSERT_EQ(1, PQnfields(res.get()));
  ASSERT_EQ(1, PQntuples(res.get()));
  auto counter = ASSERT_RESULT(GetInt32(res.get(), 0, 0));
  ASSERT_EQ(thread_count * increment_per_thread, counter);
}

Reason: Non transactional write operation uses read-time as a result this write operation may see non latest state of the DocDB.

@d-uspenskiy d-uspenskiy added kind/bug This issue is a bug area/ysql Yugabyte SQL (YSQL) labels Mar 25, 2022
@d-uspenskiy d-uspenskiy self-assigned this Mar 25, 2022
@rthallamko3 rthallamko3 added the priority/high High Priority label Apr 1, 2022
@tverona1 tverona1 added the status/awaiting-triage Issue awaiting triage label Apr 6, 2022
d-uspenskiy added a commit that referenced this issue Apr 16, 2022
…w transaction

Summary:
To read latest DocDB state in case of non-distributed-transaction session should not specify read time. Due to the bug in the `PgClientSession` class (introduced by D13244 / c5f5125) session read time has been reset to current time of local tserver for each new non-distributed-transaction.
Solution: session read time for non-distributed-transaction should  be reset only in case it is valid (i.e. was explicitly set before).

Test Plan:
Jenkins
New test case was introduced

```
./yb_build.sh --gtest_filter PgMiniTest.ConcurrentSingleRowUpdate
```

Reviewers: amitanand, sergei

Reviewed By: amitanand, sergei

Subscribers: rthallam, kannan, mbautin, bogdan, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D16201
d-uspenskiy added a commit that referenced this issue Apr 18, 2022
…case of single row transaction

Summary:
To read latest DocDB state in case of non-distributed-transaction session should not specify read time. Due to the bug in the `PgClientSession` class (introduced by D13244 / c5f5125) session read time has been reset to current time of local tserver for each new non-distributed-transaction.
Solution: session read time for non-distributed-transaction should  be reset only in case it is valid (i.e. was explicitly set before).

Original commit: 6dbde2e / D16201

Test Plan:
Jenkins: rebase: 2.13.1
New test case was introduced

```
./yb_build.sh --gtest_filter PgMiniTest.ConcurrentSingleRowUpdate
```

Reviewers: sergei, amitanand

Reviewed By: amitanand

Subscribers: yql, bogdan, mbautin, kannan, rthallam

Differential Revision: https://phabricator.dev.yugabyte.com/D16582
@rthallamko3 rthallamko3 removed the status/awaiting-triage Issue awaiting triage label Apr 28, 2022
d-uspenskiy added a commit that referenced this issue May 17, 2022
Summary:
In context of the #11886 issue optimization for using tserver's read time has been introduced.
This optimization may work incorrectly in case YSQL runs multiple `Perform` RPC simultaneously.

`PgClientService` analyzes number of tablets which will be used by first read request in the transaction. In case of read from single tablet read time will be selected by tablet's tserver and in case of read multiple tablets read time will be selected locally.
Running multiple `Perform` RPCs simultaneously may break the mechanism of choosing read time because each of the `Perform` RPC may read from single tablet by actually read will be performed from multiple tablets.

For optimization purposes in case of foreign key check reading from multiple tables in parallel was introduced in context of the #11470 issue. Current version sends multiple `Perform` RPCs simultaneously.

Current diff preserves reading from multiple tables in parallel by uses single `Perform` RPCs for this purpose.

Test Plan:
New unit test is introduced

```
./yb_build.sh --gtest_filter PgFKeyTest.MultipleFKConstraintRPCCount
./yb_build.sh --gtest_filter PgFKeyTest.InsertWithLargeNumberOfFK
```

Reviewers: sergei, alex, lnguyen

Reviewed By: lnguyen

Subscribers: ssong, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D16751
d-uspenskiy added a commit that referenced this issue Jul 8, 2022
…yb_disable_transactional_writes = 1)

Summary:
In context of #11886 / D16201 read time selection on t-server side was introduced.
But this was done only for read operations as write operation in general case creates a distributed transaction and read time is taken from this transaction.

But postgres has a GUC variable to perform non transactional writes (it is controlled by the `yb_disable_transactional_writes` GUC variable).
In case of such writes (which doesn't start distributed transaction) `PgClientSession` doesn't set the read time neither for first request nor for further. Sending second and further request on a session without read time is prohibited and raises the `IllegalState, "Used read time is not set"` error.

One of the possible solution is to send first write request without read time. In this case remote t-server will select its current time as a read time and it is possible to return it back to client with response. On getting response session should update its read time with read time from the response. This mechanism works for read requests but it is not applicable for write requests. Because it requires that YSQL doesn't call `Perform` method multiple times without waiting for previous response. But due to write operations buffering YSQL may collect huge number of write operations and send them in batches. New batch may be sent before previous one is completed.

 So in context of this diff another solution is implemented. YSQL sets same read time for all non-txn write operations in context of current postgres's transaction in case first operation was non-txn write. This solution is tolerant to possible request processing order change by the `PgClientService`. I.e. if YSQL sends 2 batches with non-txn write operations by calling `Perform` RPC twice it doesn't matter in which order the `PgClientService` will process these requests as both will have same read time.

**Note**
Also this diff contains a fix for a problem with "disappearing" of buffered operations.
For optimization purposes currently buffered operations may be combined with read operation in some cases. This is done via intercepting of buffered operations sending process. Container of operations is passed by value into interceptor by using the move semantic. But even in case interception will not be performed buffered operations will still be empty because of moving.

Solution is to pass container of operations by pointer instead of passing it by value.

Test Plan:
New test case was introduced
```
./yb_build.sh --gtest_filter PgOpBufferingTest.FKCheckWithNonTxnWrites
```

Reviewers: sergei, lnguyen

Reviewed By: sergei, lnguyen

Subscribers: yql, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D18079
d-uspenskiy added a commit that referenced this issue Jul 9, 2022
…on-txn writes (yb_disable_transactional_writes = 1)

Summary:
In context of #11886 / D16201 read time selection on t-server side was introduced.
But this was done only for read operations as write operation in general case creates a distributed transaction and read time is taken from this transaction.

But postgres has a GUC variable to perform non transactional writes (it is controlled by the `yb_disable_transactional_writes` GUC variable).
In case of such writes (which doesn't start distributed transaction) `PgClientSession` doesn't set the read time neither for first request nor for further. Sending second and further request on a session without read time is prohibited and raises the `IllegalState, "Used read time is not set"` error.

One of the possible solution is to send first write request without read time. In this case remote t-server will select its current time as a read time and it is possible to return it back to client with response. On getting response session should update its read time with read time from the response. This mechanism works for read requests but it is not applicable for write requests. Because it requires that YSQL doesn't call `Perform` method multiple times without waiting for previous response. But due to write operations buffering YSQL may collect huge number of write operations and send them in batches. New batch may be sent before previous one is completed.

So in context of this diff another solution is implemented. YSQL sets same read time for all non-txn write operations in context of current postgres's transaction in case first operation was non-txn write. This solution is tolerant to possible request processing order change by the `PgClientService`. I.e. if YSQL sends 2 batches with non-txn write operations by calling `Perform` RPC twice it doesn't matter in which order the `PgClientService` will process these requests as both will have same read time.

**Note**
Also this diff contains a fix for a problem with "disappearing" of buffered operations.
For optimization purposes currently buffered operations may be combined with read operation in some cases. This is done via intercepting of buffered operations sending process. Container of operations is passed by value into interceptor by using the move semantic. But even in case interception will not be performed buffered operations will still be empty because of moving.

Solution is to pass container of operations by pointer instead of passing it by value.

Original commit: b9b764d / D18079

Test Plan:
Jenkins: rebase: 2.14
New test case was introduced
```
./yb_build.sh --gtest_filter PgOpBufferingTest.FKCheckWithNonTxnWrites
```

Reviewers: sergei, lnguyen, smishra

Reviewed By: lnguyen, smishra

Subscribers: bogdan, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D18201
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/high High Priority
Projects
None yet
Development

No branches or pull requests

3 participants