Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[docdb] Speedup local bootstrap: do not do LogReader::Open twice #2124

Closed
bmatican opened this issue Aug 22, 2019 · 2 comments
Closed

[docdb] Speedup local bootstrap: do not do LogReader::Open twice #2124

bmatican opened this issue Aug 22, 2019 · 2 comments
Assignees
Labels
area/docdb YugabyteDB core features kind/enhancement This is an enhancement of an existing feature
Milestone

Comments

@bmatican
Copy link
Contributor

Screen Shot 2019-08-21 at 8 42 20 PM

Did a perf record/report during local bootstrap and noticed the above picture.

It seems that we are opening a LogReader instance twice and taking about the same time under it (whooping ~20%). I suspect we could unify that into one single instance, created in the TabletBootstrap::OpenLogReader path and then passed into the new Log instance explicitly...

cc @kmuthukk @rahuldesirazu @hectorgcr

@bmatican bmatican added kind/enhancement This is an enhancement of an existing feature area/docdb YugabyteDB core features labels Aug 22, 2019
@bmatican bmatican self-assigned this Aug 22, 2019
@kmuthukk
Copy link
Collaborator

@bmatican -
Nice find! Guessing this is an older issue and not related to recent optimization work of trying to optimize which WALs to replay (and in the process somehow ending up opening the WAL twice)?

@bmatican
Copy link
Contributor Author

bmatican commented Sep 1, 2019

Forgot to reply to this @kmuthukk , it's not related to that optimization work, nor a regression from @rahuldesirazu 's work on not rebuilding indexes, as much as before that, the code would need 2 passes to open the logs (rebuild, actually read) and we just never removed that, since his work was still under a feature flag.

@bmatican bmatican added this to the v2.1 milestone Sep 1, 2019
@iSignal iSignal assigned iSignal and unassigned bmatican Sep 17, 2019
@rajukumaryb rajukumaryb assigned rajukumaryb and unassigned iSignal Oct 2, 2019
rajukumaryb added a commit that referenced this issue Oct 15, 2019
…nabled

Summary: Use TabletBootstrap::log_->reader_ instead of instantiating TabletBootstrap::log_reader_

Test Plan:
Diff used for BenchmarkA:

```
diff --git a/bin/test_bsopt.py b/bin/test_bsopt.py
index 81767050f..10c4aaf07 100755
--- a/bin/test_bsopt.py
+++ b/bin/test_bsopt.py
@@ -17,13 +17,13 @@ YUGABYTE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
 YBSAMPLEAPPS_DIR = os.path.normpath(os.path.join(YUGABYTE_DIR, '..', 'yb-sample-apps'))

 # Number of tablets to use
-NUM_TABLETS = 2
+NUM_TABLETS = 1

 # Time to run the CQL test for (seconds)
-CQL_TEST_TIME_SEC = 10 * 60
+CQL_TEST_TIME_SEC = 1 * 60

 # Number of trials, both for the optimization on and off
-NUM_TRIALS = 15
+NUM_TRIALS = 10

 def test_cluster(opt_on):
@@ -32,7 +32,7 @@ def test_cluster(opt_on):
         "java", "-jar", YBSAMPLEAPPS_DIR + "/target/yb-sample-apps.jar", "--workload",
         "CassandraBatchKeyValue",
         "--nodes", "127.0.0.1:9042", "--num_threads_read", "1", "--num_threads_write", "1",
-        "--num_unique_keys", "100000", "--nouuid", "--value_size", "1024", "--batch_size", "64"]
+        "--num_unique_keys", "1000000", "--nouuid", "--value_size", "1024", "--batch_size", "64"]
     proc = subprocess.Popen(args)

     # After time is up, kill the test
@@ -41,17 +41,10 @@ def test_cluster(opt_on):
     proc.wait()
     timer.cancel()

-    # Use yb-admin to flush all writes to RocksDB
-    subprocess.check_call(
-        YUGABYTE_DIR +
-        "/build/latest/bin/yb-admin -master_addresses 127.0.0.1 flush_table " +
-        "ybdemo_keyspace cassandrakeyvalue 60",
-        shell=True)
-
     # Restart the cluster
     subprocess.check_call([YUGABYTE_DIR + "/bin/yb-ctl", "stop"])
     subprocess.check_call(
-        YUGABYTE_DIR + "/bin/yb-ctl start --tserver_flags \"skip_flushed_entries=" +
+        YUGABYTE_DIR + "/bin/yb-ctl start --tserver_flags \"skip_flushed_entries=1,log_min_segments_to_retain=4,skip_log_reader_double_open=" +
         str(opt_on).lower() + "\"",
         shell=True)
     time.sleep(10)
@@ -81,7 +74,7 @@ def remake_cluster():
     subprocess.check_call(YUGABYTE_DIR + "/bin/yb-ctl stop", shell=True)
     subprocess.check_call(YUGABYTE_DIR + "/bin/yb-ctl destroy", shell=True)
     subprocess.check_call(
-        YUGABYTE_DIR + "/bin/yb-ctl --num_shards_per_tserver " + str(NUM_TABLETS) + " create",
+        YUGABYTE_DIR + "/bin/yb-ctl --replication_factor 1 --num_shards_per_tserver " + str(NUM_TABLETS) + " create",
         shell=True)
     time.sleep(5)

diff --git a/src/yb/tablet/tablet_bootstrap.cc b/src/yb/tablet/tablet_bootstrap.cc
index 30620d626..2d7b56414 100644
--- a/src/yb/tablet/tablet_bootstrap.cc
+++ b/src/yb/tablet/tablet_bootstrap.cc
@@ -63,6 +63,11 @@ DEFINE_bool(skip_wal_rewrite, true,
 TAG_FLAG(skip_wal_rewrite, experimental);
 TAG_FLAG(skip_wal_rewrite, runtime);

+DEFINE_bool(skip_log_reader_double_open, true,
+            "Do not open log reader twice if skip_wal_rewrite is on.");
+TAG_FLAG(skip_log_reader_double_open, experimental);
+TAG_FLAG(skip_log_reader_double_open, runtime);
+
 DEFINE_test_flag(double, fault_crash_during_log_replay, 0.0,
                  "Fraction of the time when the tablet will crash immediately "
                  "after processing a log entry during log replay.");
@@ -349,7 +354,7 @@ Status TabletBootstrap::Bootstrap(shared_ptr<TabletClass>* rebuilt_tablet,

   bool needs_recovery;
   RETURN_NOT_OK(PrepareToReplay(&needs_recovery));
-  if (needs_recovery && !skip_wal_rewrite_) {
+  if (needs_recovery && (!skip_wal_rewrite_ || !FLAGS_skip_log_reader_double_open)) {
     RETURN_NOT_OK(OpenLogReader());
   }
```

BenchmarkA Results:
        Without optimization:
                1557139 1610705 1757151 1733478 2099114 1553073 1716388 1882349 1680560 1611217
                Mean = 1720117.4, Stdev = 167126.8 (9.7% of Mean)

        With optimization:
                1557828 2027292 1562075 1810216 1619249 1597930 1592700 1919192 1564048 1744308
                Mean = 1699483.8, Stdev = 168471.8 (9.9% of Mean)

        Change in Mean = -1% (not significant)

Setup: GCP - Ubuntu - Release build - 8 Cores - 16 GB RAM

BenchmarkB: PF-1 Cluster
    Delay in bootstrapping all tablets - without Optimization: 110s
    Delay in bootstrapping all tablets - with Optimization: 102s
    Improvement: 7%

    LogReader::Open() overhead observed using `perf record -ag` and `perf report -i perf.data`: 2.4%

Reviewers: rahuldesirazu

Reviewed By: rahuldesirazu

Subscribers: kannan, hector, ybase, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D7352
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/docdb YugabyteDB core features kind/enhancement This is an enhancement of an existing feature
Projects
None yet
Development

No branches or pull requests

4 participants