Implement iloc-getitem using parse-don't-validate approach #13534
Merged: rapids-bot merged 25 commits into rapidsai:branch-23.08 from wence-:wence/fea/indexing-parse on Jul 14, 2023
Commits on Jun 23, 2023
Commit 93c1d21
Commit 64b093e
Commit 6b649bd
Implement iloc-getitem using parse-don't-validate approach (5094f49)

To simplify the low-level implementation of iloc-based getitem on both Series and DataFrames, change the dispatching approach to parse the user-provided "unstructured" key into structured data (a tagged union using an enum + tuple).

At the libcudf level, there are four styles of indexing we can do:

1. index by slice
2. index by mask
3. index by map
4. index by scalar

iloc keys are parsed into information that tags them by type and normalises the key to an appropriate column or other low-level object. This centralises the business logic for index parsing in a single place, and ensures that downstream consumers of the validated and normalised indexer don't need to inspect it again to determine what to do.

Note that we treat index by scalar as composition of index by map with get_element (since that simplifies the logic when extracting the single row of a dataframe: we want to keep it on device), but the scalar "type tag" allows us to determine this unambiguously without reinspecting the key.

The major benefits will come when updating loc-based getitem (where the parsing rules are more complicated, but eventually turn into one of the above four cases). In this latter case, we will no longer attempt to turn a loc-based key into a "user-facing" key for iloc, but rather will call directly into the pre-parsed interface. That said, we already provide some performance improvements since we only do inspection once.

- Closes rapidsai#13013
- Closes rapidsai#13267
- Closes rapidsai#13515
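The parse step described in this commit message can be sketched in plain Python. This is only an illustration under stated assumptions, not the cudf implementation: `IndexTag`, `parse_key`, and `take` are hypothetical names, and Python lists stand in for device columns.

```python
from enum import Enum, auto
from typing import Any, Tuple


class IndexTag(Enum):
    """Hypothetical type tags for the four indexing styles."""
    SLICE = auto()
    MASK = auto()
    MAP = auto()
    SCALAR = auto()


def parse_key(key: Any, n: int) -> Tuple[IndexTag, Any]:
    """Parse an unstructured positional key for a container of length n.

    Returns a (tag, normalised payload) pair, so callers never need to
    inspect the original key again.
    """
    if isinstance(key, slice):
        # Normalise to a concrete (start, stop, step) triple.
        return IndexTag.SLICE, key.indices(n)
    if isinstance(key, bool):
        # bool is a subclass of int, so reject it before the int check.
        raise TypeError("boolean scalars are not valid positional keys")
    if isinstance(key, int):
        if not -n <= key < n:
            raise IndexError("scalar key out of bounds")
        return IndexTag.SCALAR, key % n  # wrap negative positions
    if isinstance(key, list):
        if key and all(isinstance(k, bool) for k in key):
            if len(key) != n:
                raise IndexError("mask length does not match")
            return IndexTag.MASK, list(key)
        if all(isinstance(k, int) and not isinstance(k, bool) for k in key):
            if any(not -n <= k < n for k in key):
                raise IndexError("map entry out of bounds")
            return IndexTag.MAP, [k % n for k in key]
    raise TypeError(f"unsupported key: {key!r}")


def take(data: list, parsed: Tuple[IndexTag, Any]) -> Any:
    """Dispatch on the tag alone; the payload is trusted as already valid."""
    tag, payload = parsed
    if tag is IndexTag.SLICE:
        # range() handles negative steps correctly for a normalised triple.
        return [data[i] for i in range(*payload)]
    if tag is IndexTag.MASK:
        return [x for x, keep in zip(data, payload) if keep]
    if tag is IndexTag.MAP:
        return [data[i] for i in payload]
    # SCALAR: gather with a one-element map, then extract the element,
    # mirroring the "map composed with get_element" treatment above.
    return [data[i] for i in [payload]][0]
```

With this shape, all validation lives in `parse_key`, and `take` contains no type inspection of the user's key.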
Commit 8ad58a8
Use _gather for scalar indexing (b43d93a)
Can't use libcudf.copying.gather since we need to do some post-processing on categorical and struct columns. Staying in the Series API gets us that for free.
Introduce GatherMap and BooleanMask (a479a34)
Also use dataclasses as poor man's ADTs rather than tuple with tag field. Some renaming.
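As a rough illustration of "dataclasses as poor man's ADTs", each variant of the union can be its own frozen dataclass, with the class itself acting as the tag. Only the names `GatherMap` and `BooleanMask` come from the commit message; the indexer class names, the `select` helper, and the use of plain lists in place of columns are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Any, List, Union


@dataclass(frozen=True)
class GatherMap:
    """Row positions, assumed to be in bounds already (list stand-in)."""
    positions: List[int]


@dataclass(frozen=True)
class BooleanMask:
    """One flag per row, assumed to have the right length (list stand-in)."""
    flags: List[bool]


# One dataclass per variant; names here are hypothetical.
@dataclass(frozen=True)
class SliceIndexer:
    key: slice


@dataclass(frozen=True)
class MaskIndexer:
    key: BooleanMask


@dataclass(frozen=True)
class MapIndexer:
    key: GatherMap


@dataclass(frozen=True)
class ScalarIndexer:
    key: GatherMap  # expected to hold exactly one position


IndexingSpec = Union[SliceIndexer, MaskIndexer, MapIndexer, ScalarIndexer]


def select(data: list, spec: IndexingSpec) -> Any:
    """Dispatch on the variant's class instead of a separate tag field."""
    if isinstance(spec, SliceIndexer):
        return data[spec.key]
    if isinstance(spec, MaskIndexer):
        return [x for x, keep in zip(data, spec.key.flags) if keep]
    if isinstance(spec, MapIndexer):
        return [data[i] for i in spec.key.positions]
    if isinstance(spec, ScalarIndexer):
        # Unpacking fails loudly unless exactly one position is present.
        (row,) = (data[i] for i in spec.key.positions)
        return row
    raise TypeError(f"not an indexing spec: {spec!r}")
```

Compared with a bare tuple carrying a tag, each variant gets named, typed fields, and a type checker can follow the `isinstance` narrowing.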
Commit 5e4af4a
Commit bc44c3f
Commit 503f4ae
Commits on Jun 29, 2023
Commit 19637fa
Commit dbf56b8
Commits on Jun 30, 2023
Commit ad1b21a
Commits on Jul 11, 2023
Refactor GatherMap and BooleanMask construction (b763ebb)
Rather than having free functions to construct the witness types, the default constructor validates correctness, and a classmethod from_column_unchecked allows one to build a witness type asserting correctness by fiat.
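The construction pattern described here (a validating default constructor plus an unchecked classmethod) might look roughly like the following. The method name `from_column_unchecked` is taken from the commit message; the signature, the `nrows` argument, and the list-backed storage are assumptions for illustration only.

```python
from typing import List


class GatherMap:
    """Witness type: holding an instance means the positions are valid."""

    def __init__(self, positions: List[int], nrows: int):
        # Default construction validates, so a failure surfaces here rather
        # than later in a consumer of the map.
        if any(isinstance(p, bool) or not isinstance(p, int) for p in positions):
            raise TypeError("gather map entries must be integers")
        if any(not -nrows <= p < nrows for p in positions):
            raise IndexError("gather map entry out of bounds")
        self.positions = [p % nrows for p in positions]  # wrap negatives
        self.nrows = nrows

    @classmethod
    def from_column_unchecked(cls, positions: List[int], nrows: int) -> "GatherMap":
        """Build a GatherMap without validation.

        The caller asserts correctness by fiat, for example when the
        positions were produced by code that is known to stay in bounds.
        """
        self = cls.__new__(cls)  # bypass __init__ and its checks
        self.positions = positions
        self.nrows = nrows
        return self
```

The unchecked path skips a redundant validation pass when the caller already knows the positions are in bounds, at the cost of trusting that caller.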
Commit 12e66fc
Commit b92638d
Commit 99d3da1
Commit b046539
Commit 1ace86a
Commit e547372
Commit 892ee14
Commits on Jul 12, 2023
Commit 803fbc0
Commit 762eb1c
Commit 943c58e
Commits on Jul 13, 2023
Commit dffdc4e