Correctly handle RaggedArray conversions to numpy arrays #1185
Fixes #1158.
This removes all warnings caused by `numpy` conversions of ragged arrays, which will be errors in `numpy` 1.24. In fact there weren't any problems in the library code itself: if you follow the docstrings you will create ragged arrays correctly. However, some of the tests used shortcuts instead of the recommended way, and these have been changed in this PR.

Either of these is a correct way to create a `DataFrame` series that is a ragged array for use in `datashader`:

The `dtype` is optional for `RaggedArray` as it is inferred.

The following worked in the past but are incorrect from `numpy` 1.24 onwards:

The first approach will immediately fail, telling you to use the second `dtype=object` approach. This works for some but not all code paths in `datashader`, as it drops important dtype information. Hence avoid both.

Eventually the `RaggedArray` pandas extension array within `datashader` will be replaced by `awkward-array`, which will simplify our code and make it more robust to future changes.
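
As a rough illustration of the `numpy` behaviour described above, the sketch below feeds rows of unequal length to `np.array`, first without and then with `dtype=object`. It is not one of the snippets from this PR and it deliberately uses only `numpy`, not `datashader`; the data `rows` and the helper `try_implicit_conversion` are hypothetical names introduced for the example.

```python
import warnings

import numpy as np

rows = [[0, 1], [2, 3, 4]]  # rows of unequal length (illustrative data)


def try_implicit_conversion(data):
    """Try np.array(data) without dtype=object.

    Returns the name of the exception raised, or None if numpy accepted it.
    (Hypothetical helper, written only for this illustration.)
    """
    with warnings.catch_warnings():
        # Some older numpy releases only warn here. Promote warnings to
        # exceptions so that a warning and a hard error are reported alike.
        warnings.simplefilter("error")
        try:
            np.array(data)
        except (ValueError, Warning) as exc:
            return type(exc).__name__
    return None


# On numpy 1.24 and later this is expected to be "ValueError".
outcome = try_implicit_conversion(rows)
print("implicit conversion:", outcome)

# Passing dtype=object is accepted, but the result is a 1-D array whose
# elements are plain Python lists. The array dtype is just "object", so the
# element type of the rows is no longer recorded anywhere in the array.
as_object = np.array(rows, dtype=object)
print(as_object.dtype, as_object.shape)
```

The second form shows why `dtype=object` is only a partial workaround: the array carries no numeric dtype for its contents, which fits the PR's remark that it drops important dtype information.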