passing unlimited_dims to to_netcdf triggers RuntimeError: NetCDF: Invalid argument #1849
Not sure if the attachment came through. Trying again:
Thanks for the report. This seems like a bug to me and I'm frankly not sure why it isn't working. I'll look into it more.
This happened to me today after introducing some modifications in code that was working fine. I have tried to trace it without success. Finally, I found a workaround which consists of removing the "contiguous" entry from the .encoding attributes. This works with gerritholl's file:
So it seems that this entry in the encoding dictionaries is triggering the error. OK, so I guess that this explains it, from the netCDF4 documentation: "contiguous: if True (default False), the variable data is stored contiguously on disk. Default False. Setting to True for a variable with an unlimited dimension will trigger an error." This is quite an obscure error right now, so maybe we could force contiguous to be False when unlimited_dims is being used, or raise a more informative error.
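The workaround described above can be sketched as follows. This is a minimal illustration, not the exact code from the thread: the dataset contents, the variable name "tas", and the dimension names are placeholders, and the encoding entry is set by hand to stand in for one inherited from an input file.

```python
import numpy as np
import xarray as xr

# Placeholder dataset; in the real case the encoding comes from open_dataset().
ds = xr.Dataset({"tas": (("y", "x"), np.zeros((3, 4)))})
ds["tas"].encoding["contiguous"] = True  # simulate encoding carried over from disk

# The workaround: drop "contiguous" from every variable's encoding
# before writing with unlimited_dims.
for var in ds.variables.values():
    var.encoding.pop("contiguous", None)

# ds.to_netcdf("out.nc", unlimited_dims=["y"])  # would now be accepted by netCDF4
```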
@markelg - thanks for digging into this a bit. Based on what you're saying, I think we need to raise an informative error here.
I also ran into this issue: to_netcdf fails for my dataset. Here is how to reproduce the error (the testfile is attached here: 1.zip)
And the output I get:
Workaround for pydata/xarray#1849
I apparently have this problem too.
@jmccreight Are you up for sending in a PR to raise an informative error message?
I could be persuaded. I just don't understand how 'contiguous' gets set on the encoding of these variables and if that is appropriate. Does that seem obvious/clear to anyone? I still don't understand why this is happening for me. I made some fairly small modifications to some code that never threw this error in the past. The small mods could have done it, but the identical code on my laptop did not throw this error on a small sample dataset. Then I went to cheyenne, where all bets are off!
does ncdump -sh show whether contiguous is true? |
Here's what I understand so far.
The error that is thrown (just the tail end of it):
If I go to line 464 in ...
but ncdump -sh shows it's actually chunked. I'm not sure this is exactly what's raising the error down the line, but these two things seem to be at odds. My current question is "why does ...?" If you have any insights, let me know. I probably won't have time to mess with this until next week.
Because it's set in your input file. Both example files in this thread have ...
When you ask xarray to write out an unlimited dimension, it doesn't delete ... It's probable that the underlying software you're using to write has changed versions and is setting it by default. You can check this by comparing the output of ... If this is right, the solution would be either ... My preference is for (a).
@dcherian Thanks. First, I think you're right that the ... Second, my example shows something slightly more complicated than the original example, which was also not clear to me. In my case the unlimited dimension (...) This makes sense upon a slightly more nuanced reading of the netCDF4 manual (as quoted by markelg):
The last sentence apparently means that for any variable with an unlimited dimension, the use of ... I propose that the solution should be both ... A final question: should encoding['contiguous'] be removed from the xarray variable or should it just be removed for purposes of writing it to netCDF4 on disk? I suppose a user could be writing the xarray dataset to another format that might allow what netCDF does not. This should be an easy detail. I'll make a PR with the above and we can evaluate the concrete changes.
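The informative error proposed above might look something like this. This is a hypothetical sketch, not the actual xarray code or a proposed patch; the function name and signature are illustrative only.

```python
def check_contiguous_encoding(name, encoding, dims, unlimited_dims):
    """Hypothetical validation: reject contiguous=True on unlimited dims.

    netCDF4 cannot store a variable contiguously if any of its dimensions
    is unlimited, so fail early with a message that names the conflict.
    """
    conflicting = set(dims) & set(unlimited_dims)
    if encoding.get("contiguous") and conflicting:
        raise ValueError(
            "variable %r has encoding 'contiguous=True', but netCDF4 cannot "
            "store a variable with unlimited dimension(s) %s contiguously; "
            "remove 'contiguous' from .encoding or drop unlimited_dims"
            % (name, sorted(conflicting))
        )

# No unlimited dimension involved: passes silently.
check_contiguous_encoding("tas", {"contiguous": True}, ("y", "x"), ["time"])
```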
For some datafiles with properties I cannot quite reproduce, .to_netcdf leads to a RuntimeError: NetCDF: Invalid argument if and only if I pass an unlimited_dims corresponding to y. The problem is hard to reproduce: it happens to this particular dataset, but not to seemingly identical ones created from scratch. I attach sample.nc (gzipped so github would let me upload it).

Output of xr.show_versions():
INSTALLED VERSIONS
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-696.6.3.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
xarray: 0.10.0+dev39.ge31cf43
pandas: 0.22.0
numpy: 1.14.0
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: None
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.16.1
distributed: None
matplotlib: 2.1.2
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 38.4.0
pip: 9.0.1
conda: 4.3.16
pytest: 3.1.2
IPython: 6.1.0
sphinx: 1.6.2
sample.nc.gz