Improve performance for NormalizeIntensity #6887

Conversation

john-zielke-snkeos (Contributor)

Description

In order to implement the "nonzero" functionality of the NormalizeIntensity transform, a mask is used. When nonzero is False, the mask is still created, but is initialized to all True/1, and this unnecessary masking causes a considerable performance hit. The changed implementation skips the mask entirely when nonzero is False (a simplified sketch of the two code paths is included after the benchmark code below). I ran a quick benchmark on my system comparing the old implementation, the new implementation, and the wrapper around the torchvision Normalize transform. The results were the following, showing a more than 10x performance improvement (note that the times for the old normalize are in milliseconds, while the other times are in microseconds):

[-------------- torchvision ---------------]
                      |    cpu    |   cuda
1 threads: ----------------------------------
      (250, 250, 250) |  18847.2  |  1440.5
      (100, 100, 100) |    484.6  |   395.5

Times are in microseconds (us).

[----------------- monai ------------------]
                      |   cpu   |  cuda
1 threads: ----------------------------------
      (250, 250, 250) |  603.7  |  11.5
      (100, 100, 100) |   39.9  |   1.5

Times are in milliseconds (ms).

[------------- monai_improved -------------]
                      |    cpu    |  cuda
1 threads: ----------------------------------
      (250, 250, 250) |  17763.2  |  720.0
      (100, 100, 100) |    938.0  |  185.2

Times are in microseconds (us).

The benchmarks were created with the following code (the ImprovedNormalizeIntensity class does not exist in the PR; it was my quick way of having both the old and the new implementation available side by side):

import torch
import torch.utils.benchmark as benchmark

from monai.transforms import TorchVision
from monai.transforms.intensity.array import ImprovedNormalizeIntensity, NormalizeIntensity

shapes = [
    (250, 250, 250),
    (100, 100, 100),
]

normalizers = {
    'torchvision': TorchVision(name="Normalize", mean=1000, std=333),
    'monai': NormalizeIntensity(subtrahend=1000, divisor=333),
    'monai_improved': ImprovedNormalizeIntensity(subtrahend=1000, divisor=333),
}

results = []
for shape in shapes:
    for device in ['cpu', 'cuda']:
        # add batch and channel dimensions in front of the spatial shape
        torch_tensor = torch.rand((1, 1) + shape).to(device)

        for name, normalizer in normalizers.items():
            t = benchmark.Timer(
                stmt='normalizer(x)',
                globals={'normalizer': normalizer, 'x': torch_tensor},
                label=name,
                sub_label=str(shape),
                description=device,
                num_threads=1,
            )
            # blocked_autorange picks the number of measurement runs automatically
            results.append(t.blocked_autorange(min_run_time=10))

compare = benchmark.Compare(results)
compare.print()
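
For context, the following is a minimal sketch of the idea behind the change. It is an illustration only, not the actual MONAI implementation or the PR diff; the function names normalize_old and normalize_new are made up for this example.

import torch

def normalize_old(img: torch.Tensor, subtrahend: float, divisor: float, nonzero: bool = False) -> torch.Tensor:
    # old behaviour: a boolean mask is always built, even when nonzero is False
    # and the mask therefore selects every element
    mask = img != 0 if nonzero else torch.ones_like(img, dtype=torch.bool)
    out = img.clone()
    out[mask] = (img[mask] - subtrahend) / divisor  # boolean indexing is the expensive part
    return out

def normalize_new(img: torch.Tensor, subtrahend: float, divisor: float, nonzero: bool = False) -> torch.Tensor:
    if nonzero:
        # the masked path is only taken when it is actually needed
        mask = img != 0
        out = img.clone()
        out[mask] = (img[mask] - subtrahend) / divisor
        return out
    # nonzero is False: plain elementwise arithmetic, no mask, no fancy indexing
    return (img - subtrahend) / divisor

Avoiding the boolean indexing over the full tensor is where the speedup shown in the monai_improved rows above comes from.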

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

Signed-off-by: John Zielke <john.zielke@snkeos.com>

wyli (Contributor) left a comment

Thanks, it looks good to me.

wyli (Contributor) commented Aug 18, 2023

/build

wyli enabled auto-merge (squash) August 18, 2023 11:17
wyli merged commit 8aabdc9 into Project-MONAI:dev Aug 18, 2023