
Refactoring towards GPU support #794

Merged
merged 27 commits into master from pre_gpu_refactor on Oct 11, 2019
Conversation


@jonmmease jonmmease commented Oct 3, 2019

Overview

This PR lays the foundation for adding GPU support to Datashader. It contains the refactoring portion of #793, without the CUDA-specific code. After this PR is reviewed and merged, I'll open another PR that adds the CUDA support on top of it.

Fix

While refactoring, I found the cause of #792, which is fixed in 30a2ae2. I think we could cherry pick this commit into a separate PR if folks would rather it be separated from this PR.

Changes

  • Move the data library specific modules (pandas, dask, xarray) under a new data_libraries package. When cudf support is added, a new module will be added here. (258ceca )
  • Replace the former Glyph._compute_x_bounds and Glyph._compute_y_bounds methods with a single Glyph._compute_bounds method. The old methods assumed the input was a numpy array. The new method checks if it is a numpy array or a pandas series. When cudf is added, this method will also check if the input is a cudf.Series object. (40c3b61 )
  • Refactor reduction methods to have a separate "append" method for each of the following cases: no reduction column is used (e.g. ds.count()), an integer reduction column is used (e.g. ds.count('int_column')), and a float reduction column is used (e.g. ds.count('float_column')). This is needed because numba's cuda mode does not seem to support default arguments, and it doesn't allow the use of isnan on integer columns. (7b31140)
  • Extract the use of np.interp in preparation for swapping in a different implementation for cupy.ndarray objects. (63ec2fd)
  • Explicitly cast the result of np.nonzero to an integer, as the cupy equivalent returns a scalar array instead. (cc3ff21)
  • Various cleanup of the _interpolate function, which is called by shade (7a10a49)
    • Unify the int/float/bool logic paths.
    • Add a masked_clip_2d function that clips an array of either float or int values. There will be a cupy-specific implementation of this later.
    • Copy input data once and perform the offset in place.
    • Construct the uint32 view once at the end of the function rather than in branches.
    • Don't interpolate r/g/b unnecessarily when a single color is passed as cmap.
  • Isolate pandas specific code in dshape_from_pandas_helper in preparation for supporting cudf: cca1b87
  • Add internal _sum_zero reduction to compute sum from an array of zeros. The standard sum reduction starts with an array of nans and the append operation has a different logic branch depending on whether a pixel starts as nan. This approach is not compatible with the cuda atomic operations. For cuda, sum will be implemented by combining _sum_zero with any. (cf7f5c8)
  • Refactor points, line, and area rendering to extract the inner loop as a cuda compatible function. (b57e0a9)
  • Expand the bounds and vt tuples in the signatures of numba functions as numba.cuda.jit doesn't seem to support tuples. (b57e0a9, 7fe52db).
  • Convert tuples of columns passed to numba functions into 2D arrays, as numba.cuda.jit doesn't seem to support tuples. (b57e0a9)
  • Convert from numpy.log and numpy.isnan to math.log and math.isnan for use in numba functions because numba.cuda.jit doesn't support the numpy versions. (e073961)
  • Parameterize relevant test functions over data frame and array types. This will make it easier to add tests for new data frame (cudf) and array (cupy) types. (5e1d180)
  • Add custom test equality assertion functions for xarray and pandas. These will provide a central place to handle numpy-like (e.g. cupy) and pandas-like (e.g. cudf) objects. (45a979d)
  • Convert the _bases and _temps properties of Reduction objects into methods (_build_bases, _build_temps). This way we can pass configuration options to them (like whether or not to operate in cuda mode). (517fc97)
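To illustrate the kind of dispatch described for the unified Glyph._compute_bounds method, here is a minimal hypothetical sketch (the function name and structure are illustrative, not Datashader's actual implementation; a cudf.Series branch would be added alongside these checks):

```python
import numpy as np
import pandas as pd

def compute_bounds(values):
    """Return (min, max) of a numpy array or pandas Series, ignoring NaNs."""
    if isinstance(values, pd.Series):
        # Pandas path: extract the underlying numpy array and fall through.
        values = values.to_numpy()
    if isinstance(values, np.ndarray):
        return float(np.nanmin(values)), float(np.nanmax(values))
    raise TypeError("unsupported input type: %r" % type(values))
```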
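The per-dtype "append" split for count can be sketched roughly as follows (hypothetical function names; the point is that integer columns never hold NaN, so the isnan check numba's cuda mode rejects for integers is simply omitted on that path):

```python
import math
import numpy as np

def append_no_field(agg, y, x):
    # ds.count(): no reduction column, just increment the pixel.
    agg[y, x] += 1

def append_int(agg, y, x, value):
    # Integer columns can never be NaN, so no check is needed.
    agg[y, x] += 1

def append_float(agg, y, x, value):
    # Float columns must skip NaN values.
    if not math.isnan(value):
        agg[y, x] += 1
```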
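A pure-Python sketch of what a masked_clip_2d function might look like (assumed signature; the numba/cupy implementations would compile this same loop structure):

```python
import numpy as np

def masked_clip_2d(data, mask, lower, upper):
    # Clip data in place to [lower, upper], but only where mask is True;
    # works for either int or float 2D arrays.
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            if mask[i, j]:
                if data[i, j] < lower:
                    data[i, j] = lower
                elif data[i, j] > upper:
                    data[i, j] = upper
```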
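The tuple-expansion change can be illustrated with a toy before/after pair (hypothetical functions, not from the PR): since numba.cuda.jit doesn't seem to support tuple arguments, a packed bounds tuple is flattened into scalar parameters.

```python
def inside_tuple(x, y, bounds):
    # Original form: bounds packed in a tuple (problematic under cuda).
    xmin, xmax, ymin, ymax = bounds
    return xmin <= x <= xmax and ymin <= y <= ymax

def inside_expanded(x, y, xmin, xmax, ymin, ymax):
    # cuda-friendly form: the tuple expanded into scalar arguments.
    return xmin <= x <= xmax and ymin <= y <= ymax
```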
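The _sum_zero-plus-any idea can be sketched as a finalize step (hypothetical helper, assuming a zero-initialized running sum and a boolean "any pixel touched" mask): combining the two reproduces the NaN-initialized sum without the per-pixel NaN branch that conflicts with cuda atomics.

```python
import numpy as np

def finalize_sum(sum_zero, touched):
    # Pixels never touched by the aggregation become NaN, matching the
    # behavior of the standard NaN-initialized sum reduction.
    out = sum_zero.astype(np.float64)
    out[~touched] = np.nan
    return out
```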

@jonmmease jonmmease added the wip label Oct 10, 2019
@jonmmease

I think I've figured out a way to support line/area with axis=0 with cuda, so I'm going to do a bit more line/area refactoring on this PR.


@jbednar jbednar left a comment


As of 517fc97, I've reviewed all the commits in this PR, and as far as I can tell they are all mergeable. So, once you're done with any further commits, I should be able to review those quickly and merge the whole PR. Thanks for all this!

@jonmmease

Ok, I did a bit more refactoring of lines in d7f3650 and area in b5dbf1d.

The goal was to make it possible to support parallelization with axis=0. This was not as straightforward as axis=1 because the prior logic for calculating breaks due to nans and out-of-bounds segments was sequential in nature.

I also managed to eliminate a fair amount of code duplication by having the draw_segment and draw_trapezoid functions accept values in data, rather than pixel, coordinates. These functions now perform their own skipping/clipping logic, so this logic is no longer duplicated in every line/area rendering function.
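The structure described above can be sketched roughly as follows (a hypothetical, much-simplified draw_segment; Datashader's real clipping and rasterization are considerably more involved). The drawer takes data coordinates and does its own skip checks, so callers no longer repeat them:

```python
import math

def draw_segment(x0, y0, x1, y1, xmin, xmax, ymin, ymax, drawn):
    # Skip segments with missing (NaN) endpoints.
    if math.isnan(x0) or math.isnan(y0) or math.isnan(x1) or math.isnan(y1):
        return
    # Skip segments entirely outside the viewport bounds.
    if max(x0, x1) < xmin or min(x0, x1) > xmax:
        return
    if max(y0, y1) < ymin or min(y0, y1) > ymax:
        return
    # A real implementation would clip and rasterize here; this sketch
    # just records the segments that survive the skip logic.
    drawn.append((x0, y0, x1, y1))
```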

@jonmmease jonmmease removed the wip label Oct 11, 2019

@jbednar jbednar left a comment


The latest commits also look good, thanks. I have only one concern before merging: the changes to line and area handling look extensive enough that they could have performance implications. Have you run a basic performance comparison before and after this PR, to make sure nothing got terribly slower? As long as you've made some basic check like that, I'm happy to merge now.

@jonmmease

Thanks, I've done several benchmarks and I'm seeing equivalent performance (within ±5%). For example, for 10 million points with count aggregation:

  • points: 65.2 ms (before) -> 64.3 ms (after)
  • line: 256 ms (before) -> 268 ms (after)
  • area: 1800 ms (before) -> 1750 ms (after)


jbednar commented Oct 11, 2019

Looks great, thanks!!

@jbednar jbednar merged commit 13d651e into master Oct 11, 2019
@maximlt maximlt deleted the pre_gpu_refactor branch December 25, 2021 17:21