distr txns: for multi-key updates, reads occasionally seeing inconsistent writetime #287

Closed
kmuthukk opened this issue May 15, 2018 · 2 comments
@kmuthukk
Collaborator

For multi-key updates (i.e., updates to keys that may fall on different shards) with distributed transactions enabled, a single SELECT statement that reads both keys sometimes returns inconsistent writetimes for them.

I am seeing this behavior when using the sample app CassandraTransactionalKeyValue.java.

The inserts in this app perform two key updates for the same key prefix. For example, for a key prefix like 100, it updates two keys: 'key:100_1' and 'key:100_2'. These two keys, in the common case, may fall on different shards because the primary key is hash partitioned in the test.

The read portion of the test then reads both keys and verifies that their "writetime" (the hybrid time of the write operation) is the same, using a SELECT statement of the form:

  SELECT k, v, writetime(v)
      FROM <table>
      WHERE k IN ('key:100_1', 'key:100_2');
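
The check described above can be sketched in plain Java, independent of any driver. This is a minimal illustration, not the sample app's actual code: the `WritetimeCheck` and `Row` names are hypothetical, and each `Row` is assumed to hold one row of the SELECT result (`k`, `v`, `writetime(v)`).

```java
import java.util.List;

// Hypothetical helper, not part of yb-sample-apps.
final class WritetimeCheck {
    /** One row of the SELECT result: k, v, and writetime(v). */
    static final class Row {
        final String key;
        final String value;
        final long writetime;

        Row(String key, String value, long writetime) {
            this.key = key;
            this.value = value;
            this.writetime = writetime;
        }
    }

    /**
     * Returns true when every row carries the same writetime.
     * An empty or single-row result is trivially consistent.
     */
    static boolean sameWritetime(List<Row> rows) {
        for (Row row : rows) {
            if (row.writetime != rows.get(0).writetime) {
                return false;
            }
        }
        return true;
    }
}
```

If both keys were written by one transaction and read at one consistent point, `sameWritetime` should return true for the two rows; a false result corresponds to the mismatch reported in this issue.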

Issue:

When running read and write operations concurrently, we occasionally hit a 'writetime' mismatch error for some keys, suggesting that the read sometimes sees an inconsistent cut of the two keys.

Could it be that for the two keys we are reading, we are not using the same "read point"?

Sample errors I see during the run:

2018-05-15 14:58:20,756 [FATAL|com.yugabyte.sample.apps.CassandraKeyValue|CassandraTransactionalKeyValue]
Writetime mismatch for key: Key: 6865, value: val:6865, 1526396271311166 vs 1526396300753247
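
To judge how far apart the two writes were, the writetimes in the error can be decoded. The sketch below assumes that writetime is expressed in microseconds since the Unix epoch (the CQL convention); the `WritetimeDecode` class is a hypothetical helper, not part of the sample app.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper; assumes writetime is microseconds since the Unix epoch.
final class WritetimeDecode {
    /** Converts a microsecond writetime to an Instant. */
    static Instant toInstant(long micros) {
        long seconds = Math.floorDiv(micros, 1_000_000L);
        long nanos = Math.floorMod(micros, 1_000_000L) * 1_000L;
        return Instant.ofEpochSecond(seconds, nanos);
    }

    /** Returns the absolute gap between two microsecond writetimes. */
    static Duration gap(long a, long b) {
        return Duration.between(toInstant(a), toInstant(b)).abs();
    }
}
```

Under that assumption, `gap(1526396271311166L, 1526396300753247L)` is about 29.4 seconds, so the two values would come from writes made by different iterations of the write loop rather than from the same transaction.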

Rough steps to repro.

Note: An RF=1 (replication factor 1) cluster is sufficient to repro this. First perform about 10K writes (each comprising writes to two keys):

% ~/yugabyte-1.0.1.0/bin/yb-ctl destroy
% ~/yugabyte-1.0.1.0/bin/yb-ctl --rf 1 create
% java -jar ~/yugabyte-1.0.1.0/java/yb-sample-apps.jar \
       --workload CassandraTransactionalKeyValue \
       --nodes 127.0.0.1:9042  --num_threads_write 8 --num_threads_read 0 \
       --nouuid --num_unique_keys 10000 --num_writes 10000

Now that the keys have been written, in one window we keep running the write operation (which keeps overwriting the same 10K keys in a loop):

# Window 1
% for i in {1..2}; do \
      java -jar ~/yugabyte-1.0.1.0/java/yb-sample-apps.jar \
         --workload CassandraTransactionalKeyValue --nodes 127.0.0.1:9042  \
         --num_threads_write 8 --num_threads_read 0 --nouuid \
         --num_unique_keys 10000 --num_writes 10000; \
    done

And in a second window, run another instance of the same app but in pure read mode. We need to specify the max key (max_written_key) up to which it is OK to read.

# Window 2: Read indefinitely and at random keys with prefix up to 9999.

% java -jar ~/yugabyte-1.0.1.0/java/yb-sample-apps.jar \
      --workload CassandraTransactionalKeyValue --nodes 127.0.0.1:9042 \
      --read_only --num_threads_read 8 --nouuid \
      --max_written_key 9999

kmuthukk commented May 17, 2018

[capturing offline discussion here]

@robertpang mentioned that when we read the keys from potentially different tablets, we just need to pick a common read-point and send that along with the read request, rather than letting each tablet make a local/independent decision on the read-point.
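
The effect of a common read-point can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the actual fix: each tablet is modeled as a sorted map from commit time to value for a single key, and the `ReadPointSketch` and `Tablet` names are hypothetical.

```java
import java.util.TreeMap;

// Toy model, not YugaByte code: one key per tablet, versions ordered by commit time.
final class ReadPointSketch {
    static final class Tablet {
        private final TreeMap<Long, String> versions = new TreeMap<>();

        /** Records a committed version of the key at the given commit time. */
        void write(long commitTime, String value) {
            versions.put(commitTime, value);
        }

        /** Returns the writetime of the latest version visible at readPoint, or -1 if none. */
        long writetimeAt(long readPoint) {
            Long visible = versions.floorKey(readPoint);
            return visible == null ? -1L : visible;
        }
    }

    /** True when both tablets expose the same writetime at the given read-points. */
    static boolean consistent(Tablet a, long readPointA, Tablet b, long readPointB) {
        return a.writetimeAt(readPointA) == b.writetimeAt(readPointB);
    }
}
```

Suppose two transactions each wrote both keys, committing at times 10 and 20. If tablet A reads at a locally chosen point 15 and tablet B at 25, the reader sees writetime 10 for one key and 20 for the other. If both reads carry the same read-point (15 or 25), the writetimes agree.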

yugabyte-ci pushed a commit that referenced this issue Jun 11, 2018
…ed transaction

Summary: Previously, reads within a transaction were always consistent, but reads outside of a transaction were not. This revision implements consistent reads outside of a transaction as well.

Test Plan: TestTransaction.testReadRestarts

Reviewers: mikhail, sergei

Reviewed By: mikhail, sergei

Subscribers: ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D4943
@robertpang
Contributor

Fixed in commit f20bb98.

jasonyb pushed a commit that referenced this issue Jun 11, 2024
Adding SQL file for PG15. Initial version is a replica of PG14 SQL file.
Changes will be required in this file to fully support newly introduced columns
in PG15.

This currently fixes compilation and make install targets.

Also, updated the first line comments where the SQL version was mentioned as
1.1. The version remains at 1.0 for the time being.
devansh-ism pushed a commit to devansh-ism/yugabyte-db that referenced this issue Jul 17, 2024
Resolves yugabyte#287.

Also removed outdated manual custom schema installation instructions
from the README.