Add more details to E2E benchmarks README (#395)
Add the following to the E2E benchmarks README to make it more consumable to a general audience:

- Instructions to set up Morpheus Dev container
- Info on how to manage Morpheus configs for each workflow
- Add warmup option to account for ONNX->TRT conversion during first run
- Information on fields added to JSON report by custom hook

Authors:
  - Eli Fajardo (https://github.com/efajardo-nv)

Approvers:
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #395
efajardo-nv authored Oct 14, 2022
1 parent 1dcf216 commit 93f8bda
Showing 1 changed file with 111 additions and 14 deletions: `tests/benchmarks/README.md`

@@ -47,39 +47,136 @@ Once Triton server finishes starting up, it will display the status of all loaded models:

### Set up Morpheus Dev Container

If you don't already have the Morpheus Dev container, run the following to build it:
```
./docker/build_container_dev.sh
```

Now run the container:
```
./docker/run_container_dev.sh
```

Note that Morpheus containers are tagged by date. By default, `run_container_dev.sh` uses the current date as the tag, so if you are running a container that was not built today, you must set the `DOCKER_IMAGE_TAG` environment variable. For example:
```
DOCKER_IMAGE_TAG=dev-221003 ./docker/run_container_dev.sh
```

In the `/workspace` directory of the container, run the following to compile Morpheus:
```
./scripts/compile.sh
```

Now install Morpheus:
```
pip install -e /workspace
```

Fetch input data for benchmarks:
```
./scripts/fetch_data.py fetch validation
```


### Run E2E Benchmarks

Benchmarks are run using `pytest-benchmark`. By default, each workflow is measured over five rounds of one iteration each, and the final results such as `min`, `max` and `mean` times are computed from those per-round measurements.

To provide your own calibration or use other `pytest-benchmark` features with these workflows, refer to the [`pytest-benchmark` documentation](https://pytest-benchmark.readthedocs.io/en/latest/).
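For reference, a `pytest-benchmark` test wraps the code under measurement with the `benchmark` fixture. The sketch below is a minimal, hypothetical example; the real tests in `test_bench_e2e_pipelines.py` build full Morpheus pipelines, and `run_pipeline` here is just a stand-in:
```
import time

def run_pipeline():
    # Hypothetical stand-in for building and running a Morpheus pipeline.
    time.sleep(0.01)

def test_example_e2e(benchmark):
    # Matches the defaults described above: five rounds of one iteration each.
    benchmark.pedantic(run_pipeline, rounds=5, iterations=1)
```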

Morpheus configurations for each workflow are managed using `e2e_test_configs.json`. For example, this is the Morpheus configuration for `sid_nlp`:
```
"test_sid_nlp_e2e": {
"file_path": "../../models/datasets/validation-data/sid-validation-data.csv",
"repeat": 10,
"num_threads": 8,
"pipeline_batch_size": 1024,
"model_max_batch_size": 64,
"feature_length": 256,
"edge_buffer_size": 4
},
...
```
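A rough sketch of how these settings can be read from the JSON file (assuming the tests are run from `tests/benchmarks`; the actual test module may map them onto a Morpheus config object differently):
```
import json

# Load the per-workflow settings; keys match the test names.
with open("e2e_test_configs.json", "r") as f:
    E2E_TEST_CONFIGS = json.load(f)

sid_config = E2E_TEST_CONFIGS["test_sid_nlp_e2e"]
print(sid_config["pipeline_batch_size"], sid_config["model_max_batch_size"])
```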

Benchmarks for an individual workflow can be run using the following:

```
cd tests/benchmarks
pytest -s --benchmark-enable --benchmark-warmup=on --benchmark-warmup-iterations=1 --benchmark-autosave test_bench_e2e_pipelines.py::<test-workflow>
```
The `-s` option displays pipeline execution output so you can confirm there are no errors while running your benchmarks.

The `--benchmark-warmup` and `--benchmark-warmup-iterations` options run the workflow(s) once before measurements begin. The models deployed to Triton are configured to convert from ONNX to TensorRT on first use, and since that conversion can take a considerable amount of time, we don't want it included in the measurements.

`<test-workflow>` is the name of the test to run benchmarks on. This can be one of the following:
- `test_sid_nlp_e2e`
- `test_abp_fil_e2e`
- `test_phishing_nlp_e2e`
- `test_cloudtrail_ae_e2e`

For example, to run E2E benchmarks on the SID NLP workflow:
```
pytest -s --benchmark-enable --benchmark-warmup=on --benchmark-warmup-iterations=1 --benchmark-autosave test_bench_e2e_pipelines.py::test_sid_nlp_e2e
```

To run E2E benchmarks on all workflows:
```
pytest -s --benchmark-enable --benchmark-warmup=on --benchmark-warmup-iterations=1 --benchmark-autosave test_bench_e2e_pipelines.py
```

The console output should look like this:
```
--------------------------------------------------------------------------------- benchmark: 4 tests --------------------------------------------------------------------------------
Name (time in s) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_sid_nlp_e2e 1.8907 (1.0) 1.9817 (1.0) 1.9400 (1.0) 0.0325 (2.12) 1.9438 (1.0) 0.0297 (1.21) 2;0 0.5155 (1.0) 5 1
test_cloudtrail_ae_e2e 3.3403 (1.77) 3.3769 (1.70) 3.3626 (1.73) 0.0153 (1.0) 3.3668 (1.73) 0.0245 (1.0) 1;0 0.2974 (0.58) 5 1
test_abp_fil_e2e 5.1271 (2.71) 5.3044 (2.68) 5.2083 (2.68) 0.0856 (5.59) 5.1862 (2.67) 0.1653 (6.75) 1;0 0.1920 (0.37) 5 1
test_phishing_nlp_e2e 5.6629 (3.00) 6.0987 (3.08) 5.8835 (3.03) 0.1697 (11.08) 5.8988 (3.03) 0.2584 (10.55) 2;0 0.1700 (0.33) 5 1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
```

### Benchmarks Report

Each time you run the benchmarks as above, a comprehensive report is generated and saved to a JSON file in `./tests/benchmarks/.benchmarks`. The file name begins with `000N`, where `N` is incremented for every run. For example, the report file name for the first benchmark run will look like:
```
0001_dacccac5198c7eeddc477794bc278028e739c2cd_20220929_182030.json
```
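Because the report is plain JSON, it can also be inspected programmatically. A small sketch that loads the most recent saved report, assuming the standard `pytest-benchmark` JSON layout where per-test results live under a top-level `benchmarks` list:
```
import glob
import json

# pytest-benchmark nests reports under a machine-specific subdirectory of
# .benchmarks; the 000N prefix makes a lexicographic sort pick the latest run.
reports = sorted(glob.glob("./tests/benchmarks/.benchmarks/**/*.json", recursive=True))
with open(reports[-1]) as f:
    report = json.load(f)

for bench in report["benchmarks"]:
    print(bench["name"], bench["stats"]["mean"])
```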

A custom `pytest-benchmark` hook was developed to add the following information to the JSON report (a sketch of such a hook appears at the end of this section):

GPU(s) used by Morpheus. For example:
```
"gpu_0": {
"id": 0,
"name": "Quadro RTX 8000",
"load": "0.0%",
"free_memory": "42444.0MB",
"used_memory": "6156.0MB",
"temperature": "61.0 C",
"uuid": "GPU-dc32de82-bdaa-2d05-2abe-260a847e1989"
}
```

Morpheus config for each workflow:
- num_threads
- pipeline_batch_size
- model_max_batch_size
- feature_length
- edge_buffer_size

Additional benchmark stats for each workflow:
- input_lines
- min_throughput_lines
- max_throughput_lines
- mean_throughput_lines
- median_throughput_lines
- input_bytes
- min_throughput_bytes
- max_throughput_bytes
- mean_throughput_bytes
- median_throughput_bytes
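
The sketch below illustrates how a hook of this kind can be wired up. `pytest_benchmark_update_json` is the `pytest-benchmark` extension point for editing the JSON report before it is written, and `GPUtil` is used here purely as an example way to collect GPU details; the actual hook shipped with Morpheus may gather these fields differently, and the throughput arithmetic shown is one plausible derivation from the measured round times, not the repository's exact code:
```
import GPUtil

# Hypothetical conftest.py hook. The field names mirror the report excerpts
# above, but this is a sketch, not the hook used by the Morpheus repo.
def pytest_benchmark_update_json(config, benchmarks, output_json):
    # GPU(s) used by Morpheus.
    for gpu in GPUtil.getGPUs():
        output_json[f"gpu_{gpu.id}"] = {
            "id": gpu.id,
            "name": gpu.name,
            "load": f"{gpu.load * 100}%",
            "free_memory": f"{gpu.memoryFree}MB",
            "used_memory": f"{gpu.memoryUsed}MB",
            "temperature": f"{gpu.temperature} C",
            "uuid": gpu.uuid,
        }

    # Throughput stats: throughput is inversely related to round time, so the
    # slowest (max) round gives the minimum throughput. The line count would
    # come from each workflow's input dataset (hypothetical value here).
    input_lines = 10000
    for bench in output_json["benchmarks"]:
        stats = bench["stats"]
        bench["input_lines"] = input_lines
        bench["min_throughput_lines"] = input_lines / stats["max"]
        bench["max_throughput_lines"] = input_lines / stats["min"]
        bench["mean_throughput_lines"] = input_lines / stats["mean"]
        bench["median_throughput_lines"] = input_lines / stats["median"]
```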
