This repository has been archived by the owner on Jul 3, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 37
+371
−15
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Importing `typing` and using that as a prefix was getting unsightly. So moved to importing the types explicitly.
Related to issue #191, this commit is to help surface index type issues. Specifically: 1. Warn if there are index type mismatches. 2. Require you to set your logger to debug if you want to see more details. 3. Provide a "ResultBuilder" class that uses strict index type matching so if you want to error on index type mismatches, this is the results builder to use. I don't think we should build anything more custom unless there's a clear common use case - user contributed result builders sound like an interesting idea.
skrawcz
force-pushed
the
add_pandas_index_checks
branch
from
September 22, 2022 19:26
41c65d8
to
7305ad0
Compare
Pandas dropped support for python 3.6 in something like 1.2. So pandas 1.1.5 is what we're using in our CI system, and that does not have the `NDArrayBackedExtensionIndex` type. So I'm guessing here, but looking at the 1.1.5 source, we instead want `ExtensionIndex`.
skrawcz
force-pushed
the
add_pandas_index_checks
branch
from
September 22, 2022 19:28
7305ad0
to
23eb685
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good -- curious on some decisions though around time-series
TIL: you can create a dataframe and pass in an index object and it'll happily use it as a column. So this test should exist for dataframe creation since it's a valid case. But for the index type checking, I'm adding it here even though it does not have an explicit index. Therefore, one could make the argument it doesn't qualify here. But, I'd rather push people to be explicit in their code, e.g. if they want to be strict on indexes, then they should make the index a series, rather than passing an Index object.
So that way it's clearer what's going on and why. I decided to use private functions to the static ones because I don't really want them used outside of that function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is all good except the time-series stuff is confusing me.
elijahbenizzy
approved these changes
Oct 14, 2022
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Adds some basic index type checking to df creation
Related to issue #191, this commit is to help surface index type issues.
Specifically:
want to error on index type mismatches, this is the results builder to use.
I don't think we should build anything more custom unless there's a clear
common use case - user contributed result builders sound like an interesting idea here though.
To use the new result builder the code should be the following:
Changes
Testing
Notes
Checklist
Testing checklist
Python - local testing