[BUG]: .reset_index() doesn't work with GeoSeries anymore #849

Closed
thomcom opened this issue Dec 8, 2022 · 0 comments · Fixed by #856
Labels: 3 - Ready for Review, bug, non-breaking

thomcom commented Dec 8, 2022

Version

22.12

On which installation method(s) does this occur?

Rapids-Compose

Describe the issue

While working on #848, I tried resetting the index before using align=False.

mixed_geoseries[0:3].reset_index() results in an error. mixed_geoseries[0:3].reset_index(drop=True) returns None instead of a GeoSeries.

In the first case, the error is that cudf doesn't understand the geometry dtype. Hopefully there is an internals-based workaround, like a cudf.frame method I can inherit from and overload, to avoid this issue. I think .reset_index() worked back when a GeoSeries was backed by a hidden int-typed column; going back to that is probably the easiest, though suboptimal, fix.

The cause of the failure in the second case is unclear.

To reproduce, use the mixed_geoseries object from the minimum reproducer in #848.

Minimum reproducible example

mixed_geoseries[0:3].reset_index()
mixed_geoseries[0:3].reset_index(drop=True)
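
For comparison, here is a rough sketch of the behaviour I would expect, shown with plain pandas rather than cuspatial. It assumes GeoSeries is meant to mirror pandas semantics here; the Series `s` and its values are made-up stand-ins for mixed_geoseries, not the real object from #848.

```python
import pandas as pd

# Hypothetical stand-in for mixed_geoseries: a plain pandas Series with a
# non-default index. The real object is a cuspatial GeoSeries.
s = pd.Series(
    ["point", "linestring", "polygon", "point"],
    index=[10, 11, 12, 13],
)
sliced = s.iloc[0:3]

# Without drop, the old index becomes a regular column, so the result is a
# DataFrame that must be printable.
with_index = sliced.reset_index()

# With drop=True, the old index is discarded and a Series of the same kind
# comes back with a fresh 0..n-1 index. It should never be None.
dropped = sliced.reset_index(drop=True)

print(type(with_index).__name__, type(dropped).__name__)
```

The analogous expectation for cuspatial would be a printable frame in the first case and a GeoSeries in the second.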

Relevant log output

In [28]: mixed_geoseries[0:3].reset_index()
Out[28]: ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/compose/etc/conda/cuda_11.6/envs/rapids/lib/python3.8/site-packages/IPython/core/formatters.py:706, in PlainTextFormatter.__call__(self, obj)
    699 stream = StringIO()
    700 printer = pretty.RepresentationPrinter(stream, self.verbose,
    701     self.max_width, self.newline,
    702     max_seq_length=self.max_seq_length,
    703     singleton_pprinters=self.singleton_printers,
    704     type_pprinters=self.type_printers,
    705     deferred_pprinters=self.deferred_printers)
--> 706 printer.pretty(obj)
    707 printer.flush()
    708 return stream.getvalue()

File ~/compose/etc/conda/cuda_11.6/envs/rapids/lib/python3.8/site-packages/IPython/lib/pretty.py:410, in RepresentationPrinter.pretty(self, obj)
    407                         return meth(obj, self, cycle)
    408                 if cls is not object \
    409                         and callable(cls.__dict__.get('__repr__')):
--> 410                     return _repr_pprint(obj, self, cycle)
    412     return _default_pprint(obj, self, cycle)
    413 finally:

File ~/compose/etc/conda/cuda_11.6/envs/rapids/lib/python3.8/site-packages/IPython/lib/pretty.py:778, in _repr_pprint(obj, p, cycle)
    776 """A pprint that just redirects to the normal repr function."""
    777 # Find newlines and replace them with p.break_()
--> 778 output = repr(obj)
    779 lines = output.splitlines()
    780 with p.group():

File ~/compose/etc/conda/cuda_11.6/envs/rapids/lib/python3.8/contextlib.py:75, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     72 @wraps(func)
     73 def inner(*args, **kwds):
     74     with self._recreate_cm():
---> 75         return func(*args, **kwds)

File ~/cudf/python/cudf/cudf/core/dataframe.py:1862, in DataFrame.__repr__(self)
   1860 @_cudf_nvtx_annotate
   1861 def __repr__(self):
-> 1862     output = self._get_renderable_dataframe()
   1863     return self._clean_renderable_dataframe(output)

File ~/cudf/python/cudf/cudf/core/dataframe.py:1855, in DataFrame._get_renderable_dataframe(self)
   1852     lower = cudf.concat([lower_left, lower_right], axis=1)
   1853     output = cudf.concat([upper, lower])
-> 1855 output = self._clean_nulls_from_dataframe(output)
   1856 output._index = output._index._clean_nulls_from_index()
   1858 return output

File ~/cudf/python/cudf/cudf/core/dataframe.py:1772, in DataFrame._clean_nulls_from_dataframe(self, df)
   1769 if is_list_dtype(df._data[col]) or is_struct_dtype(df._data[col]):
   1770     # TODO we need to handle this
   1771     pass
-> 1772 elif df._data[col].has_nulls():
   1773     df[col] = df._data[col].astype("str").fillna(cudf._NA_REP)
   1774 else:

File column.pyx:123, in cudf._lib.column.Column.has_nulls()

File column.pyx:252, in cudf._lib.column.Column.null_count.__get__()

File column.pyx:317, in cudf._lib.column.Column.compute_null_count()

File column.pyx:318, in cudf._lib.column.Column.compute_null_count()

File column.pyx:387, in cudf._lib.column.Column._view()

File types.pyx:239, in cudf._lib.types.dtype_to_data_type()

TypeError: data type 'geometry' not understood

In [29]: mixed_geoseries[0:3].reset_index(drop=True)
Out[29]: 
In [30]:

Environment details

No response

Other/Misc.

No response

thomcom added the bug and Needs Triage labels on Dec 8, 2022
thomcom self-assigned this on Dec 12, 2022
harrism moved this to In Progress in cuSpatial on Dec 12, 2022
thomcom moved this from In Progress to Review in cuSpatial on Dec 13, 2022
thomcom added the 3 - Ready for Review and non-breaking labels and removed the Needs Triage label on Dec 14, 2022
rapids-bot closed this as completed in #856 on Dec 15, 2022
Repository owner moved this from Review to Done in cuSpatial on Dec 15, 2022