Using min() with skipna=True #3290

Closed
zxdawn opened this issue Sep 7, 2019 · 8 comments

Comments

@zxdawn

zxdawn commented Sep 7, 2019

MCVE Code Sample

from datetime import datetime
import xarray as xr
import os

def read_data(f, composition, west, east, north, south):
    # read data
    ds = xr.open_dataset(f, group='PRODUCT')
    # subset to region
    index = ((ds.longitude > west) & (ds.longitude < east))
    ds = ds.where(index)
    # read composition
    data = ds[composition][0,:,:]
    data_units = data.units
    # read time
    t = ds['time_utc']
    st = datetime.strptime(str(t.min(skipna=True).values), '%Y-%m-%dT%H:%M:%S.%fZ')
    et = datetime.strptime(str(t.max(skipna=True).values), '%Y-%m-%dT%H:%M:%S.%fZ')

    # read lon and lat
    lon = data.coords['longitude']
    lat = data.coords['latitude']

    return lon, lat, data, data_units, st, et

datadir = '/xin/data/TROPOMI/GZ/bug'
os.chdir(datadir)
west = 112.5; east = 114.5; north = 24; south = 22.5;

f = 'S5P_NRTI_L2__O3_____20190825T053303_20190825T053803_09659_01_010107_20190825T061441.nc'
lon, lat, data, data_units, st, et = read_data(f, 'ozone_total_vertical_column',
                                                  west, east, north, south)

Problem Description

You can download the data from Google Drive.
I get the errors shown in the details below, even when using skipna=True.

Traceback (most recent call last):
  File "/public/software/anaconda/anaconda3/envs/behr/lib/python3.6/site-packages/xarray-0.11.3-py3.6.egg/xarray/core/duck_array_ops.py", line 236, in f
    return func(values, axis=axis, **kwargs)
  File "/public/software/anaconda/anaconda3/envs/behr/lib/python3.6/site-packages/xarray-0.11.3-py3.6.egg/xarray/core/nanops.py", line 77, in nanmin
    'min', dtypes.get_pos_infinity(a.dtype), a, axis)
  File "/public/software/anaconda/anaconda3/envs/behr/lib/python3.6/site-packages/xarray-0.11.3-py3.6.egg/xarray/core/nanops.py", line 69, in _nan_minmax_object
    data = dtypes.fill_value(value.dtype) if valid_count == 0 else data
AttributeError: module 'xarray.core.dtypes' has no attribute 'fill_value'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "bug.py", line 31, in <module>
    west, east, north, south)
  File "bug.py", line 16, in read_data
    st = datetime.strptime(str(t.min(skipna=True).values), '%Y-%m-%dT%H:%M:%S.%fZ')
  File "/public/software/anaconda/anaconda3/envs/behr/lib/python3.6/site-packages/xarray-0.11.3-py3.6.egg/xarray/core/common.py", line 25, in wrapped_func
    skipna=skipna, allow_lazy=True, **kwargs)
  File "/public/software/anaconda/anaconda3/envs/behr/lib/python3.6/site-packages/xarray-0.11.3-py3.6.egg/xarray/core/dataarray.py", line 1597, in reduce
    var = self.variable.reduce(func, dim, axis, keep_attrs, **kwargs)
  File "/public/software/anaconda/anaconda3/envs/behr/lib/python3.6/site-packages/xarray-0.11.3-py3.6.egg/xarray/core/variable.py", line 1354, in reduce
    axis=axis, **kwargs)
  File "/public/software/anaconda/anaconda3/envs/behr/lib/python3.6/site-packages/xarray-0.11.3-py3.6.egg/xarray/core/duck_array_ops.py", line 249, in f
    raise NotImplementedError(msg)
NotImplementedError: min is not available with skipna=False with the installed version of numpy; upgrade to numpy 1.12 or newer to use skipna=True or skipna=None

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.7 | packaged by conda-forge | (default, Feb 20 2019, 02:51:38) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.0.76-0.11-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.11.3
pandas: 0.20.3
numpy: 1.13.1
scipy: 0.19.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
PseudonetCDF: None
rasterio: 1.0.21
cfgrib: None
iris: None
bottleneck: None
cyordereddict: None
dask: 1.1.2
distributed: None
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: 0.9.0
setuptools: 36.4.0
pip: 9.0.1
conda: None
pytest: None
IPython: None
sphinx: None

@max-sixty
Collaborator

Thanks for the issue, @zxdawn. Did you try doing what the error message suggests?

NotImplementedError: min is not available with skipna=False with the installed version of numpy; upgrade to numpy 1.12

@zxdawn
Author

zxdawn commented Sep 7, 2019

@max-sixty Actually, I'm using numpy 1.13.1 and I need skipna=True. I don't understand the error it shows.

@shoyer
Member

shoyer commented Sep 7, 2019

I think this may have been fixed by #2924 (which removed the line with dtypes.fill_value(value.dtype) if valid_count == 0 else data)

Can you try upgrading to xarray 0.12.3?
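
The misleading numpy message in the report can be reproduced in miniature. The sketch below is not xarray's code; `reduce_min` and `wrapped_min` are hypothetical stand-ins that only mirror the shape of the two tracebacks, where an unrelated AttributeError is caught and re-raised as a NotImplementedError.

```python
def reduce_min(values):
    """Hypothetical inner reduction that hits an unrelated bug."""
    # A failing attribute lookup, comparable to the missing dtypes.fill_value.
    return values.no_such_attribute


def wrapped_min(values):
    """Hypothetical wrapper that treats any AttributeError as 'numpy too old'."""
    try:
        return reduce_min(values)
    except AttributeError:
        # The real cause is hidden behind a message about numpy versions.
        raise NotImplementedError(
            "min is not available with the installed version of numpy"
        )


try:
    wrapped_min([1.0, 2.0])
except NotImplementedError as err:
    masked_error = err

# Python keeps the original exception on __context__; it is the first
# traceback printed in a chained error report.
original_error = masked_error.__context__
print(type(masked_error).__name__, "<-", type(original_error).__name__)
```

When a report shows two chained tracebacks like this, the first one usually points at the actual cause.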

@zxdawn
Author

zxdawn commented Sep 8, 2019

@shoyer Thanks. It works now, but I have another question.
This is the result of t = ds['time_utc']:

<xarray.DataArray 'time_utc' (time: 1, scanline: 357, ground_pixel: 450)>
array([[[nan, nan, ..., nan, nan],
        [nan, nan, ..., nan, nan],
        ...,
        [nan, nan, ..., nan, nan],
        [nan, nan, ..., nan, nan]]], dtype=object)
Coordinates:
  * scanline      (scanline) float64 1.0 2.0 3.0 4.0 ... 354.0 355.0 356.0 357.0
  * ground_pixel  (ground_pixel) float64 1.0 2.0 3.0 4.0 ... 448.0 449.0 450.0
  * time          (time) datetime64[ns] 2019-08-25
Attributes:
    long_name:  Time of observation as ISO 8601 date-time string

If I want to get the minimum value by t.min(skipna=True), I get the strange type:

<xarray.DataArray 'time_utc' ()>
array(<xarray.core.dtypes.AlwaysGreaterThan object at 0x7f96ac188550>,
      dtype=object)

I can't convert it to a string with str(t.min(skipna=True)).
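
One way to avoid passing a non-string result to strptime is to check the type first. This is a minimal sketch using only the standard library; `parse_time_utc` is a hypothetical helper and the timestamp is an illustrative value, not one read from the file.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # same format string as in read_data


def parse_time_utc(value):
    """Return a datetime for an ISO 8601 string, or None for anything else."""
    if not isinstance(value, str):
        # NaN, a sentinel object, or any other non-string result lands here.
        return None
    return datetime.strptime(value, TIME_FORMAT)


# Illustrative inputs: a well-formed timestamp and a missing value.
print(parse_time_utc("2019-08-25T05:33:03.000000Z"))
print(parse_time_utc(float("nan")))
```

With a reduction result, the Python object inside the 0-d array could be extracted with `.values.item()` before calling the helper.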

@keewis
Collaborator

keewis commented Sep 8, 2019

Do you actually have any non-NaN values in your array? From what I understand of how nanops works, AlwaysGreaterThan should only be returned by min() if there are no non-NaN values.
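
The behaviour described here can be sketched in plain Python. This is an illustration of the fill-with-a-sentinel idea, not xarray's implementation; `AlwaysGreater`, `is_missing`, and `nanmin_objects` are hypothetical names, and the timestamps are made up.

```python
class AlwaysGreater:
    """Illustrative sentinel that compares greater than any other value."""

    def __gt__(self, other):
        return True

    def __lt__(self, other):
        return False


def is_missing(x):
    # NaN is the only float that is not equal to itself.
    return isinstance(x, float) and x != x


def nanmin_objects(values):
    """Minimum of a list of objects, with NaN entries replaced by a sentinel."""
    sentinel = AlwaysGreater()
    filled = [sentinel if is_missing(v) else v for v in values]
    return min(filled)


some_valid = nanmin_objects(
    [float("nan"), "2019-08-25T05:33:05Z", "2019-08-25T05:33:03Z"]
)
all_missing = nanmin_objects([float("nan"), float("nan")])

print(some_valid)                  # the smaller of the two strings
print(type(all_missing).__name__)  # the sentinel leaks out when nothing is valid
```

Any real value beats the sentinel, so the sentinel can only be the result when every entry was missing.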

@zxdawn
Author

zxdawn commented Sep 8, 2019

@keewis I tried using np.isnan(t.values).all() to check whether it's all NaN, but I got this error:

    print (np.isnan(t.values).all())
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

This is the type of t.values: <class 'numpy.ndarray'>

@shoyer
Member

shoyer commented Sep 8, 2019 via email

@zxdawn
Author

zxdawn commented Sep 8, 2019

@shoyer Thanks. It's not a datetime64 array; this is the result of np.isnat(t):

  File "/public/software/anaconda/anaconda3/envs/python36/lib/python3.6/site-packages/xarray-0.12.3-py3.6.egg/xarray/core/arithmetic.py", line 69, in __array_ufunc__
    dask='allowed')
  File "/public/software/anaconda/anaconda3/envs/python36/lib/python3.6/site-packages/xarray-0.12.3-py3.6.egg/xarray/core/computation.py", line 969, in apply_ufunc
    keep_attrs=keep_attrs)
  File "/public/software/anaconda/anaconda3/envs/python36/lib/python3.6/site-packages/xarray-0.12.3-py3.6.egg/xarray/core/computation.py", line 217, in apply_dataarray_vfunc
    result_var = func(*data_vars)
  File "/public/software/anaconda/anaconda3/envs/python36/lib/python3.6/site-packages/xarray-0.12.3-py3.6.egg/xarray/core/computation.py", line 564, in apply_variable_ufunc
    result_data = func(*input_data)
TypeError: ufunc 'isnat' is only defined for datetime and timedelta.

I used pd.isnull(t).all() to check, and it works. The array is in fact all NaN.
There's something wrong with the nc file, so I will contact the data center.
Thank you for all your help :)
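
For reference, the check that worked can be reproduced on a small array. This sketch assumes numpy and pandas are installed; `values` is a hypothetical stand-in for t.values, not data from the file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for t.values: an object-dtype array holding only NaN.
values = np.array([[np.nan, np.nan], [np.nan, np.nan]], dtype=object)

try:
    np.isnan(values)
    isnan_supported = True
except TypeError:
    # np.isnan has no implementation for object dtype, hence the TypeError.
    isnan_supported = False

# pd.isnull inspects object arrays element by element.
all_missing = bool(pd.isnull(values).all())

print(isnan_supported, all_missing)
```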

@zxdawn zxdawn closed this as completed Sep 8, 2019