Add additional str accessor methods for DataArray (#4622)
* add type hints for the str accessor class

* allow str accessors to use regular expression objects for regular expressions

* implement casefold and normalize str accessor functions

* implement one-to-many str accessor functions

* implement cat, join, format, +, *, and %

* support elementwise operations in many str accessor functions

* update whats-new.rst, api.rst, and api-hidden.rst

* test fixes

* implement requested fixes

* more fixes

* typing fixes

* fix docstring

* fix more docstring

* remove encoding header

Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
toddrjen and max-sixty authored Mar 11, 2021
1 parent c195c26 commit 6ff27ca
Showing 5 changed files with 5,261 additions and 673 deletions.
13 changes: 13 additions & 0 deletions doc/api-hidden.rst
@@ -324,14 +324,21 @@
core.accessor_dt.TimedeltaAccessor.seconds

core.accessor_str.StringAccessor.capitalize
core.accessor_str.StringAccessor.casefold
core.accessor_str.StringAccessor.cat
core.accessor_str.StringAccessor.center
core.accessor_str.StringAccessor.contains
core.accessor_str.StringAccessor.count
core.accessor_str.StringAccessor.decode
core.accessor_str.StringAccessor.encode
core.accessor_str.StringAccessor.endswith
core.accessor_str.StringAccessor.extract
core.accessor_str.StringAccessor.extractall
core.accessor_str.StringAccessor.find
core.accessor_str.StringAccessor.findall
core.accessor_str.StringAccessor.format
core.accessor_str.StringAccessor.get
core.accessor_str.StringAccessor.get_dummies
core.accessor_str.StringAccessor.index
core.accessor_str.StringAccessor.isalnum
core.accessor_str.StringAccessor.isalpha
@@ -342,20 +349,26 @@
core.accessor_str.StringAccessor.isspace
core.accessor_str.StringAccessor.istitle
core.accessor_str.StringAccessor.isupper
core.accessor_str.StringAccessor.join
core.accessor_str.StringAccessor.len
core.accessor_str.StringAccessor.ljust
core.accessor_str.StringAccessor.lower
core.accessor_str.StringAccessor.lstrip
core.accessor_str.StringAccessor.match
core.accessor_str.StringAccessor.normalize
core.accessor_str.StringAccessor.pad
core.accessor_str.StringAccessor.partition
core.accessor_str.StringAccessor.repeat
core.accessor_str.StringAccessor.replace
core.accessor_str.StringAccessor.rfind
core.accessor_str.StringAccessor.rindex
core.accessor_str.StringAccessor.rjust
core.accessor_str.StringAccessor.rpartition
core.accessor_str.StringAccessor.rsplit
core.accessor_str.StringAccessor.rstrip
core.accessor_str.StringAccessor.slice
core.accessor_str.StringAccessor.slice_replace
core.accessor_str.StringAccessor.split
core.accessor_str.StringAccessor.startswith
core.accessor_str.StringAccessor.strip
core.accessor_str.StringAccessor.swapcase
20 changes: 20 additions & 0 deletions doc/api.rst
@@ -420,38 +420,58 @@ String manipulation
:toctree: generated/
:template: autosummary/accessor_method.rst

DataArray.str._apply
DataArray.str._padder
DataArray.str._partitioner
DataArray.str._re_compile
DataArray.str._splitter
DataArray.str._stringify
DataArray.str.capitalize
DataArray.str.casefold
DataArray.str.cat
DataArray.str.center
DataArray.str.contains
DataArray.str.count
DataArray.str.decode
DataArray.str.encode
DataArray.str.endswith
DataArray.str.extract
DataArray.str.extractall
DataArray.str.find
DataArray.str.findall
DataArray.str.format
DataArray.str.get
DataArray.str.get_dummies
DataArray.str.index
DataArray.str.isalnum
DataArray.str.isalpha
DataArray.str.isdecimal
DataArray.str.isdigit
DataArray.str.islower
DataArray.str.isnumeric
DataArray.str.isspace
DataArray.str.istitle
DataArray.str.isupper
DataArray.str.join
DataArray.str.len
DataArray.str.ljust
DataArray.str.lower
DataArray.str.lstrip
DataArray.str.match
DataArray.str.normalize
DataArray.str.pad
DataArray.str.partition
DataArray.str.repeat
DataArray.str.replace
DataArray.str.rfind
DataArray.str.rindex
DataArray.str.rjust
DataArray.str.rpartition
DataArray.str.rsplit
DataArray.str.rstrip
DataArray.str.slice
DataArray.str.slice_replace
DataArray.str.split
DataArray.str.startswith
DataArray.str.strip
DataArray.str.swapcase
17 changes: 17 additions & 0 deletions doc/whats-new.rst
@@ -27,6 +27,22 @@ New Features
- Support for `dask.graph_manipulation
<https://docs.dask.org/en/latest/graph_manipulation.html>`_ (requires dask >=2021.3)
By `Guido Imperiale <https://github.com/crusaderky>`_
- Many of the arguments for the :py:attr:`DataArray.str` methods now support
providing an array-like input. In this case, the array provided to the
arguments is broadcast against the original array and applied elementwise.
- :py:attr:`DataArray.str` now supports `+`, `*`, and `%` operators. These
behave the same as they do for :py:class:`str`, except that they follow
array broadcasting rules.
- A large number of new :py:attr:`DataArray.str` methods were implemented,
:py:meth:`DataArray.str.casefold`, :py:meth:`DataArray.str.cat`,
:py:meth:`DataArray.str.extract`, :py:meth:`DataArray.str.extractall`,
:py:meth:`DataArray.str.findall`, :py:meth:`DataArray.str.format`,
:py:meth:`DataArray.str.get_dummies`, :py:meth:`DataArray.str.islower`,
:py:meth:`DataArray.str.join`, :py:meth:`DataArray.str.normalize`,
:py:meth:`DataArray.str.partition`, :py:meth:`DataArray.str.rpartition`,
:py:meth:`DataArray.str.rsplit`, and :py:meth:`DataArray.str.split`.
A number of these methods allow for splitting or joining the strings in an
array. (:issue:`4622`)
- Thanks to the new pluggable backend infrastructure external packages may now
use the ``xarray.backends`` entry point to register additional engines to be used in
:py:func:`open_dataset`, see the documentation in :ref:`add_a_backend`
@@ -36,6 +52,7 @@ New Features
developed by `B-Open <https://www.bopen.eu>`_.
By `Aureliana Barghini <https://github.com/aurghs>`_ and `Alessandro Amici <https://github.com/alexamici>`_.


Breaking changes
~~~~~~~~~~~~~~~~
- :py:func:`open_dataset` and :py:func:`open_dataarray` now accept only the first argument
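
The commit message lists `casefold` and `normalize` among the new `str` accessor methods. As a rough illustration of what those operations do to each string, here is a minimal sketch that uses only the Python standard library on a plain list. It assumes the accessor methods mirror `str.casefold` and `unicodedata.normalize` elementwise, which the diff shown here does not confirm. The helper names `casefold_all` and `normalize_all` and the sample data are hypothetical and are not part of xarray.

```python
import unicodedata

# Hypothetical sample data (not taken from the commit).
# The last two entries render identically but use different code points:
# "e" + combining acute accent versus the precomposed "é".
words = ["Straße", "STRASSE", "cafe\u0301", "caf\u00e9"]


def casefold_all(values):
    """Apply str.casefold to every element (aggressive, caseless matching)."""
    return [v.casefold() for v in values]


def normalize_all(values, form="NFC"):
    """Apply Unicode normalization of the given form to every element."""
    return [unicodedata.normalize(form, v) for v in values]


folded = casefold_all(words)
normalized = normalize_all(words, "NFC")

# casefold maps "ß" to "ss", so the first two entries compare equal,
# which str.lower alone would not achieve.
print(folded[0] == folded[1])
# After NFC normalization the two spellings of "café" compare equal.
print(normalized[2] == normalized[3])
```

Case folding and normalization are typically combined before comparing or grouping text labels, since either step alone can leave visually identical strings unequal.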