[BUG] loc-based indexing of DataFrames silently discards missing keys if at least one key is present in indexer #13379

Open
Tracked by #12793
wence- opened this issue May 18, 2023 · 1 comment · May be fixed by #13717
Assignees
Labels
bug Something isn't working Python Affects Python cuDF API.

wence- commented May 18, 2023

Describe the bug

When performing loc indexing of a DataFrame, if one asks for a missing key, pandas raises a KeyError. In contrast, cudf only does so if none of the requested keys are in the index. If at least one requested key is present, the subsetted DataFrame containing the present keys is returned and the missing keys are silently dropped.

Series loc indexing does the right thing here and raises KeyError if any keys are missing.

Steps/Code to reproduce bug

import pandas as pd
import cudf

# same failure with rangeindex too.
df = pd.DataFrame({"A": range(5)}, index=list(range(5)))
cdf = cudf.from_pandas(df)

df.loc[[0, 5]] # 5 is missing, raises KeyError

cdf.loc[[0, 5]]
#    A
# 0  0

Expected behavior

Should match pandas behaviour and raise KeyError.
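
Until the behaviour matches, a caller-side guard can approximate the expected check. The following is only a minimal sketch and not cuDF's implementation or the proposed fix: `find_missing_keys` and `check_all_present` are hypothetical helper names, and the index is assumed to be available as a plain iterable of hashable labels on the host.

```python
def find_missing_keys(index_values, requested):
    """Return the requested keys that are absent from index_values.

    Hypothetical helper: index_values is assumed to be any iterable of
    hashable labels, requested is the list passed to .loc.
    """
    present = set(index_values)
    # Preserve the order in which the keys were requested.
    return [key for key in requested if key not in present]


def check_all_present(index_values, requested):
    """Raise KeyError if any requested key is missing, otherwise return the keys."""
    missing = find_missing_keys(index_values, requested)
    if missing:
        raise KeyError(f"{missing} not in index")
    return list(requested)
```

With the reproducer above, one might call `check_all_present(cdf.index.to_pandas(), [0, 5])` before `cdf.loc[[0, 5]]`; this assumes the index is small enough that copying it to the host is acceptable.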

@wence- added the bug and Python labels on May 18, 2023
@wence- self-assigned this on May 18, 2023
@wence- changed the title from "[BUG] loc-based indexing of DataFrames silently discards missing keys if at least one key is present in indexer" to "[BUG] loc-based indexing of DataFrames silently discards missing keys if at least one key is present in indexer and index is sorted" on May 18, 2023
@wence- changed the title from "[BUG] loc-based indexing of DataFrames silently discards missing keys if at least one key is present in indexer and index is sorted" back to "[BUG] loc-based indexing of DataFrames silently discards missing keys if at least one key is present in indexer" on May 18, 2023

wence- commented May 18, 2023

To be fair, pandas used to behave like this too, but that behaviour was removed some time ago.

Projects
Status: In Progress