Optimize to_cupy and values #11648
Labels: improvement, Performance, Python
Currently `series.values` and especially `series.to_cupy()` are substantially slower than `cupy.asarray(series)`.

There are at least two obvious potential culprits in `Frame._to_array` (the underlying method for `to_cupy`):

- `copy=False`.
- `find_common_dtype`, which is slow (and slower for `DataFrame`s with many columns).

The implementation of `values` drops down to `ColumnBase.values` and requires some deeper consideration. However, since we use `.values` frequently internally (and we occasionally use `to_cupy`), we are likely giving up a lot of performance. We should profile these functions to determine the bottlenecks, and if there are valid reasons for them we should establish some policies on how to select the right function to use when performing these conversions to arrays internally. While this exact analogy does not hold for `DataFrame` (because that doesn't support the conversion to an array), any optimization that we make for `Series` will likely also help speed up `DataFrame`
operations.
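
As a starting point for the profiling suggested above, here is a minimal sketch of how the three conversions could be timed and profiled. The helper names (`best_time_us`, `profile_report`, `compare_series_conversions`) are hypothetical and not part of cuDF; the series size and repeat counts are arbitrary assumptions. Only the public calls named in this issue (`series.values`, `series.to_cupy()`, `cupy.asarray(series)`) are exercised.

```python
import cProfile
import io
import pstats
import timeit


def best_time_us(fn, repeat=5, number=100):
    """Return the best observed per-call wall time of fn() in microseconds."""
    timings = timeit.repeat(fn, repeat=repeat, number=number)
    return min(timings) / number * 1e6


def profile_report(fn, calls=1000, top=10):
    """Run fn() `calls` times under cProfile; return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(calls):
        fn()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


def compare_series_conversions(n=10_000):
    """Time the three conversions named in this issue (needs cudf, cupy, and a GPU)."""
    # Imported lazily so the helpers above also work on machines without a GPU.
    import cudf
    import cupy

    series = cudf.Series(cupy.arange(n))
    candidates = {
        "series.values": lambda: series.values,
        "series.to_cupy()": lambda: series.to_cupy(),
        "cupy.asarray(series)": lambda: cupy.asarray(series),
    }
    # timeit measures host-side wall time; GPU work that is launched
    # asynchronously may not be fully included in these numbers.
    return {name: best_time_us(fn) for name, fn in candidates.items()}
```

On a machine with a GPU, `compare_series_conversions()` returns a dict of per-call times in microseconds, and `print(profile_report(lambda: series.to_cupy()))` for some `series` should show whether time is going to copying, to `find_common_dtype`, or elsewhere. Since the overhead in question is mostly Python-side, `cProfile` seems a reasonable first tool; a GPU-aware timer would be needed to attribute device time.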