-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix astype in coordinates matching when sampling data frames #126
Conversation
This looks like a great catch! Two comments:
|
@tom-andersson if I understand well, the proposed unit test should test a Dataframe to sample with different dtypes in the |
@acocac I wouldn't put the unit test in That unit test should fail before your fix, and pass afterwards. |
@tom-andersson thanks for the suggestions. I've added the unit test, however, I've found the example of sampling the toy data frame at [0,0] isn't the most useful to indicate the effect of forcing x1/x2 dtype. I think the problem relates to inaccuracies in the floats with values between 0 and 1 and their impact in np.in1d The proposed test actually fails with/without the fix, so I'd appreciate if you can have a look at the test 🙏 |
… if numpy array dtype is wrong
…re casting errors)
@acocac the unit test was failing because of a bug in the way the DataFrame is being sampled, which triggered an InvalidSamplingStrategy error. I'll fix this in a separate commit. Regarding the type issue, you were not actually passing the TaskLoader a numpy array with a different dtype to the index values. I fixed that, and then found that the unit test was still failing even after casting the dtype. Thinking more about this, I think it is not safe to do casting, as it doesn't guarantee that coords of different dtypes will match. I've decided to instead raise an informative error for the user if the dtypes differ, rather than casting under the hood, and I've updated the unit test to reflect this. Happy to merge this once tests are passing. Thanks again for finding this! |
@all-contributors please add @acocac for code, tests |
I've put up a pull request to add @acocac! 🎉 |
The proposed solution sounds reasonable. I agree it could be safer to do casting from the user-side. The warning message helps! |
The PR fixes the sampling strategy when locations are
np.ndarray
. It also amends the docs of thesample_df
function.Fixes #125