diff --git a/debezium/perf/README.md b/debezium/perf/README.md
index 3b5e2d475ec..03ccaefd700 100644
--- a/debezium/perf/README.md
+++ b/debezium/perf/README.md
@@ -3,16 +3,16 @@
 The docker compose file in this directory is similar to
 the one in the ../demo directory, with additional images
 for running Materialize and Deephaven side-by-side
-with the same debezium and Kafka input stream being
+with the same Debezium and Kafka input stream being
 fed to both.
 
 Please see `../demo/README.md` for a general
 reference that also applies to this setup;
 in particular, the 'How to run' section should
-apply verbatim, substituing `debezium/demo`
+apply verbatim, substituting `debezium/demo`
 for this directory, `debezium/perf`.
 
-Please see the "Memmory and CPU requirements"
+Please see the "Memory and CPU requirements"
 section below; as this compose is performance
 analysis oriented, it has considerably larger
 requirements than our other
@@ -23,7 +23,7 @@ feature-oriented demos.
 
 On top of what is required for `../demo` (see `../demo/README.md`),
 the automated testing requires building the Deephaven Java client examples.
-At the toplevel directory of your git clone (`../..`), run:
+At the top level directory of your git clone (`../..`), run:
 
 ```
 ./gradlew java-client-session-examples:installDist
@@ -167,7 +167,7 @@ can make them execute their respective demo scripts.
 
 After running the commands above, both Materialize
 and Deephaven should be running semantically-equivalent
-demo scripts fed from debezium and kafka events
+demo scripts fed from Debezium and Kafka events
 (triggered by the loadgen script).
 
 You can increase or decrease the rate of events using
@@ -187,8 +187,7 @@ for instructions.
 SELECT
     total,
     to_timestamp(max_received_at) max_received_ts,
-    mz_logical_timestamp()/1000.0 AS logical_ts_ms,
-    mz_logical_timestamp()/1000.0 - max_received_at AS dt_ms
+    mz_logical_timestamp - 1000*max_received_at AS dt_ms
 FROM pageviews_summary;' -U materialize -h localhost -p 6875
 ```
 
@@ -196,12 +195,12 @@ for instructions.
 
 The parameters used for images in the docker compose file in this
 directory are geared towards high message throughput. While Deephaven
-itself is running with the same default configuration used for
-general demos (as of this writing, 4 cpus and 4 Gb of memory), the
-configurations for redpanda, mysql, and debezium are tweaked to reduce
-their impact in end-to-end latency and throughput measurements;
-we make extensive use of RAM disks (tmpfs) and increase some
-parameters to ones closer to production (e.g., redpanda's number
+itself is running with the same default configuration used for general
+demos (as of this writing, 4 CPU threads and an update cycle of 1
+second), the configurations for redpanda, MySQL, and Debezium are
+tweaked to reduce their impact in end-to-end latency and throughput
+measurements; we make extensive use of RAM disks (tmpfs) and increase
+some parameters to ones closer to production (e.g., redpanda's number
 of cpus and memory per core). To get a full picture of the
 configuration used, consult the files:
 
@@ -210,13 +209,13 @@ configuration used, consult the files:
 * `docker-compose.yml`
 * `.env`
 
-Once started the compose will take around 6 Gb
-of memory from the host; as events arrive and
-specially if event rates are increased, it
-will increase to 10-16 Gb or more.
+Once started the compose will take around 3 GiB of memory from the
+host; as events arrive and specially if event rates are increased, it
+will increase. To test rates of the order of 100k msg/sec, either of
+Deephaven or Materialize will need on the order of 12 GiB.
 
-For the mild initial rate (same as default demo in `../demo`),
-the compose will consume around 2 CPU threads
-(tested in a Xeon E5 2698 v4 CPU).
-For increased event rates (eg, 50,000 pageviews per second),
-CPU utilization will spike to 14 CPU threads or more.
+For the mild initial rate (same as default demo in `../demo`), once
+the demo code is loaded in both engines the compose will consume
+around 2 full CPUs (tested in a Xeon E5 2698 v4 CPU). At high even
+rates, deephaven will take up to 3 CPU threads. Materialize will take
+as many full cores as configured in their `-w` argument.
diff --git a/debezium/perf/run_experiment.sh b/debezium/perf/run_experiment.sh
index 0f846f0fdac..8d146ec8380 100755
--- a/debezium/perf/run_experiment.sh
+++ b/debezium/perf/run_experiment.sh
@@ -1,10 +1,18 @@
-#!/bin/sh
+#!/bin/bash
 
 set -eu
 
-if [ $# -ne 5 -o \( "$1" != 'dh' -a "$1" != 'mz' \) ]; then
+usage_and_exit() {
     echo "Usage: $0 dh|mz per_second_rate wait_seconds top_samples top_delay_seconds" 1>&2
     exit 1
+}
+
+if [ "$#" -ne 5 ]; then
+    usage_and_exit
+fi
+
+if [ "$1" != 'dh' -a "$1" != 'mz' ]; then
+    usage_and_exit
 fi
 
 engine="$1"
diff --git a/debezium/perf/sample_top.sh b/debezium/perf/sample_top.sh
index 671b6ce2efb..e3f0aed4d6c 100755
--- a/debezium/perf/sample_top.sh
+++ b/debezium/perf/sample_top.sh
@@ -2,8 +2,17 @@
 
 set -eu
 
-if [ $# -ne 3 -o \( "$1" != 'dh' -a "$1" != 'mz' \) ]; then
+usage_and_exit() {
     echo "Usage: $0 mz|dh rate nsamples delay_sec"
+    exit 1
+}
+
+if [ "$#" -ne 3 ]; then
+    usage_and_exit
+fi
+
+if [ "$1" != 'dh' -a "$1" != 'mz' ]; then
+    usage_and_exit
 fi
 
 engine=$1
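
Note on the script changes: the likely motivation for splitting the argument check in two is that, under `set -u`, the old single test expanded `"$1"` before `[` ever ran, so calling the script with no arguments would abort with an unbound-variable error instead of printing the usage line. The sketch below illustrates the same two-step order in isolation. The function name `validate_args` is hypothetical and not part of the patch, and the `case` statement is a swapped-in alternative to the patch's `[ ... -a ... ]` test (POSIX marks the `-a` and `-o` primaries of `test` as obsolescent).

```shell
#!/bin/bash
set -eu

# Hypothetical helper, not part of the patch: mirrors the two-step
# validation order used in run_experiment.sh.
validate_args() {
    # Step 1: check the argument count first, so that "$1" is never
    # expanded when no arguments were given (which set -u would reject).
    if [ "$#" -ne 5 ]; then
        return 1
    fi
    # Step 2: the first argument must name one of the two engines.
    case "$1" in
        dh|mz) return 0 ;;
        *)     return 1 ;;
    esac
}
```

Under these assumptions, a script could call it as `validate_args "$@" || usage_and_exit`, which keeps the usage message reachable for every kind of bad invocation.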