Provide workaround for cupy.percentile bug (#3315)
Ensure that the 100th-percentile value returned by cupy.percentile is the maximum of the input array rather than (possibly) NaN due to cupy/cupy#4451. This eliminates an intermittent failure observed in tests of KBinsDiscretizer, which uses cupy.percentile. Note that this alters the included sklearn code and should be reverted once the upstream cupy issue is resolved.

Resolve failure due to ValueError described in #2933.
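
The workaround described above can be sketched in isolation. This is a minimal illustration, not the cuML code: it uses NumPy instead of CuPy so it runs without a GPU, and the helper name `quantile_bin_edges` and the sample column are hypothetical. NumPy's percentile does not show the upstream bug; the sketch only shows the shape of the fix, which pins the last bin edge to the column maximum.

```python
import numpy as np


def quantile_bin_edges(column, n_bins):
    """Hypothetical helper: quantile bin edges with the last edge pinned to the max."""
    # n_bins bins need n_bins + 1 edges, spaced evenly in percentile space.
    quantiles = np.linspace(0, 100, n_bins + 1)
    edges = np.asarray(np.percentile(column, quantiles), dtype=float)
    # Overwrite the 100th-percentile edge with the column maximum, so a
    # faulty percentile result (e.g. NaN) cannot end up as the last edge.
    edges[-1] = column.max()
    return edges


# Illustrative data only.
column = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
edges = quantile_bin_edges(column, n_bins=4)
```

Because the 100th percentile of an array is by definition its maximum, the overwrite changes nothing when the percentile call is correct, which is why it can be dropped once the upstream fix lands.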

Authors:
  - William Hicks <whicks@nvidia.com>

Approvers:
  - Dante Gama Dessavre
  - Victor Lafargue

URL: #3315
wphicks authored Dec 17, 2020
1 parent 2316937 commit 550121b
Showing 2 changed files with 4 additions and 1 deletion.
@@ -199,6 +199,10 @@ def fit(self, X, y=None):
             elif self.strategy == 'quantile':
                 quantiles = np.linspace(0, 100, n_bins[jj] + 1)
                 bin_edges[jj] = np.asarray(np.percentile(column, quantiles))
+                # Workaround for https://github.com/cupy/cupy/issues/4451
+                # This should be removed as soon as a fix is available in cupy
+                # in order to limit alterations in the included sklearn code
+                bin_edges[jj][-1] = col_max

             elif self.strategy == 'kmeans':
                 # Deterministic initialization with uniform spacing
1 change: 0 additions & 1 deletion python/cuml/test/test_preprocessing.py
@@ -556,7 +556,6 @@ def test_robust_scale_sparse(sparse_clf_dataset, # noqa: F811
 @pytest.mark.parametrize("n_bins", [5, 20])
 @pytest.mark.parametrize("encode", ['ordinal', 'onehot-dense', 'onehot'])
 @pytest.mark.parametrize("strategy", ['uniform', 'quantile', 'kmeans'])
-@pytest.mark.xfail(strict=False)
 def test_kbinsdiscretizer(blobs_dataset, n_bins,  # noqa: F811
                           encode, strategy):
     X_np, X = blobs_dataset