# nvFuser Python Benchmarks
The Python benchmarks use `pytest-benchmark` and `torch.profiler`. To benchmark any target function, use `run_benchmark` (defined in `python_benchmarks/core.py`):

```python
run_benchmark(benchmark, target_function, function_inputs, rounds=10, warmup_rounds=1)
```
Arguments:
- `benchmark`: pytest-benchmark fixture passed to every function intended to be run as a benchmark by pytest.
- `target_function`: Function to benchmark.
- `function_inputs`: List of inputs to the `target_function`.
- `rounds`: Number of rounds the `target_function` is run (default: 10).
- `warmup_rounds`: Number of warmup rounds the `target_function` is run before benchmarking (default: 1).
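The `rounds`/`warmup_rounds` semantics can be modeled with a minimal timing loop. This is an illustrative sketch only; the real `run_benchmark` integrates with the pytest-benchmark fixture and `torch.profiler` rather than timing by hand:

```python
import time

def timed_rounds(target_function, function_inputs, rounds=10, warmup_rounds=1):
    # Warmup rounds: run the target but discard the results and timings.
    for _ in range(warmup_rounds):
        target_function(*function_inputs)
    # Measured rounds: collect one wall-clock timing per round.
    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        target_function(*function_inputs)
        timings.append(time.perf_counter() - start)
    return timings
```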
Example:

```python
# Parametrize over any number of arguments (e.g., input sizes, dtypes)
@pytest.mark.parametrize("param1", ...)
@pytest.mark.parametrize("param2", ...)
def test_example_benchmark(benchmark, param1, param2, ...):
    # Set up function inputs
    run_benchmark(benchmark, target_function, function_inputs)
```
The benchmark name must start with `test_` to be automatically discovered by pytest.
- Running a benchmark file: `pytest [options] <benchmark-file>`
- Running the complete benchmark suite: `pytest [options] python_benchmarks/`
- Sharding: Pytest is memory-intensive, which can result in CPU OOMs when running a large number of tests. Sharding is recommended when running the complete benchmark suite. We use `pytest-shard` in our CI. To execute a specific shard with `n` total shards, run `pytest --shard-id=i --num-shards=n [options]`, where `i = {0..n-1}`.
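Conceptually, sharding deterministically partitions the collected tests across `n` independent pytest invocations. A hash-based sketch of that idea (pytest-shard's actual assignment strategy may differ in detail):

```python
import hashlib

def shard_tests(test_names, shard_id, num_shards):
    # Assign each test to a shard via a stable hash of its name, so every
    # invocation agrees on the same partition without any coordination.
    def bucket(name):
        return int(hashlib.sha256(name.encode()).hexdigest(), 16) % num_shards
    return [t for t in test_names if bucket(t) == shard_id]
```

Each shard runs a disjoint subset, and the union over shard ids `0..n-1` covers the full suite.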
Pytest/pytest-benchmark options:
- Filtering benchmarks: `-k <filter>`
- Saving benchmarks: `--benchmark-save=NAME`, `--benchmark-autosave`, `--benchmark-json=PATH`
- Debugging: `--benchmark-verbose`
Custom command-line options:
- `--disable-validation`: Skips the output validation in the nvFuser benchmarks.
- `--disable-benchmarking`: Skips the nvFuser benchmarking; useful for testing only the correctness of fusion definitions without benchmarking the fusions.
- `--benchmark-eager`: Runs eager mode benchmarks.
- `--benchmark-torchcompile`: Runs `torch.compile` mode benchmarks.
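Custom options like these are typically registered in a `conftest.py` via pytest's `pytest_addoption` hook. A hedged sketch of that pattern (the option names come from this README, but the repo's actual registration code may differ):

```python
def pytest_addoption(parser):
    # Register the nvFuser-specific flags as boolean switches.
    parser.addoption("--disable-validation", action="store_true",
                     help="Skip output validation in the nvFuser benchmarks")
    parser.addoption("--disable-benchmarking", action="store_true",
                     help="Skip benchmarking; only check fusion definition correctness")
    parser.addoption("--benchmark-eager", action="store_true",
                     help="Run eager mode benchmarks")
    parser.addoption("--benchmark-torchcompile", action="store_true",
                     help="Run torch.compile mode benchmarks")
```

A fixture or test can then read a flag with, e.g., `request.config.getoption("--disable-validation")`.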
- Pytest: https://docs.pytest.org/en/latest/
- Pytest-benchmark: https://pytest-benchmark.readthedocs.io/en/latest/
- Pytest-shard: https://pypi.org/project/pytest-shard/