Optimize chained hyperslab selection. #1031

1uc · 2024-07-25T13:13:00Z

A common pattern for creating semi-unstructured selections is to use many
(small) RegularHyperSlab objects and chain them:

HyperSlab hyperslab;
for(auto slab : regular_hyper_slabs) {
  hyperslab |= slab;
}

This eventually triggers calling:

for(auto slab : regular_hyper_slabs) {
  auto [offset, stride, counts, blocks] = slab;
  H5Sselect_hyperslab(space_id, H5S_SELECT_OR, offset, stride, counts, blocks);
}

Measurements show that the runtime of this loop is quadratic in the
number of regular hyperslabs. This becomes prohibitive at 10k - 40k
slabs.

We noticed that H5Scombine_select does not suffer from the same
performance issue. This allows us to optimize (long) chains of Op::Or
using divide and conquer.

The current implementation only optimizes streaks of Op::Or.
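
A minimal sketch of the divide-and-conquer idea, not the code from this PR. It assumes a selection can be modelled as a std::set<int> of selected indices, and that combining two selections returns a new one, which is how H5Scombine_select is used here. The names Selection, combine_or, union_range and union_all are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical stand-in for an HDF5 dataspace selection: the set of selected indices.
using Selection = std::set<int>;

// Hypothetical stand-in for combining two selections with an OR operation.
// Returns a new selection and leaves both inputs untouched.
Selection combine_or(const Selection& a, const Selection& b) {
    Selection out;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(out, out.end()));
    return out;
}

// Union of slabs[begin, end): split the range in half, build the union of
// each half recursively, then combine the two partial results once.
Selection union_range(const std::vector<Selection>& slabs,
                      std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= slabs.size());
    if (begin == end) {
        return Selection{};   // empty range selects nothing
    }
    if (end - begin == 1) {
        return slabs[begin];  // a single slab needs no combining
    }
    const std::size_t mid = begin + (end - begin) / 2;
    return combine_or(union_range(slabs, begin, mid),
                      union_range(slabs, mid, end));
}

// Union of a whole streak of OR-ed slabs.
Selection union_all(const std::vector<Selection>& slabs) {
    return union_range(slabs, 0, slabs.size());
}
```

Because the range is halved at every level, the recursion depth grows with the logarithm of the number of slabs, and no slab is ever added one by one to an already large selection.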

codecov · 2024-07-25T15:18:54Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.88%. Comparing base (8145c27) to head (94727b8).

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1031      +/-   ##
==========================================
+ Coverage   86.78%   86.88%   +0.09%     
==========================================
  Files         101      101              
  Lines        5964     6008      +44     
==========================================
+ Hits         5176     5220      +44     
  Misses        788      788

jorblancoa

LGTM!
Did you test it with libsonata to confirm that the regression is fixed?

1uc · 2024-07-26T11:10:54Z

Yes, they provided the following reproducer:

import libsonata
import numpy as np
import time

np.random.seed(42)

sto = libsonata.NodeStorage('sscx-nodes-sonata.h5')
pop = sto.open_population('All')

#ids = np.arange(0, 100000, 2)
count = int(0.01*pop.size)
# count = 100
print(f'selecting {count} from {pop.size}')
ids = np.random.randint(0, pop.size, count)

t1 = time.perf_counter()
sel = libsonata.Selection(ids)
t2 = time.perf_counter()
print(f"elapsed = {t2 - t1}")

print(np.mean(pop.get_attribute('x', sel)))

The selection results in about 41k slabs, which takes 40s with the performance bug and 0.06 - 0.1s with libsonata@master and highfive@1uc/backport-optimize-hyperslab-selection. It takes 0.1 - 0.16s for libsonata@0.1.24.

We also ran their integration tests against the backport of this branch:
https://github.com/BlueBrain/HighFive-testing/actions/runs/10108668331

1uc force-pushed the 1uc/optimize-hyperslab-selection branch 2 times, most recently from e7cf94a to bb47941 Compare July 25, 2024 14:45

1uc marked this pull request as ready for review July 25, 2024 15:30

1uc force-pushed the 1uc/optimize-hyperslab-selection branch from 6b7511b to fc616da Compare July 25, 2024 15:37

mgeplf mentioned this pull request Jul 26, 2024

Performance regression materializing Selection BlueBrain/libsonata#364

Closed

jorblancoa approved these changes Jul 26, 2024

1uc added 2 commits July 26, 2024 13:11

Missing inline.

94727b8

1uc force-pushed the 1uc/optimize-hyperslab-selection branch from 82e9b9a to 94727b8 Compare July 26, 2024 11:11

1uc merged commit e9492c1 into master Jul 26, 2024
37 checks passed

1uc deleted the 1uc/optimize-hyperslab-selection branch July 26, 2024 12:02
