I have had this issue too. I've found it very difficult to debug; my sense is that it's a dask memory leak, for which there are many issues. If anyone has other insight or reproducible examples, that would be useful for making progress on these.
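Not a fix for the underlying behaviour, but one workaround that may be worth trying is to split the write into several smaller files with `xr.save_mfdataset`, so that no single `to_netcdf` call has to push the whole dataset through at once. A rough sketch, assuming the dask-backed dataset and dimension name from the question below (`all_ds`, `new_dim`); I have not verified that this avoids the memory growth reported here:

```python
import xarray as xr

# all_ds: the stacked, dask-backed Dataset from the question below.
# Split it into slabs along the concatenation dimension; the slab size
# (10 here) is arbitrary and should be tuned to the available memory.
step = 10
slabs = [
    all_ds.isel(new_dim=slice(i, i + step))
    for i in range(0, all_ds.sizes["new_dim"], step)
]
paths = [f"huge_dataset_part_{k:03d}.nc" for k in range(len(slabs))]

# Write each slab to its own file; each save only has to realise the
# chunks belonging to that slab rather than the full dataset.
xr.save_mfdataset(slabs, paths)
```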
Hi,
This is related to thread #5367.
I have a bunch of `.nc` files inside my local directory. I can read them easily, and this loads the data perfectly: all of the `.nc` files are stacked along `new_dim`. Now I want to write this stacked dataset back to the local directory using dask chunking to avoid memory issues, something like this: `all_ds.to_netcdf("huge_dataset.nc")`.
However, my memory usage increases steadily and I get a MemoryError. Any idea why dask is not doing its magic here?
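For concreteness, a minimal sketch of the workflow described above; the reading code from the original post is not shown here, so the glob pattern and the exact `open_mfdataset` arguments are assumptions:

```python
import xarray as xr

# Assumption: the .nc files are opened lazily (one dask chunk per file)
# and concatenated along a new dimension called "new_dim".
all_ds = xr.open_mfdataset(
    "my_local_dir/*.nc",   # hypothetical directory/glob
    combine="nested",
    concat_dim="new_dim",
)

# Writing the stacked dataset back out; this is where memory grows
# steadily until a MemoryError is raised, instead of the write being
# streamed chunk by chunk.
all_ds.to_netcdf("huge_dataset.nc")
```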
INSTALLED VERSIONS
commit: None
python: 3.7.10 (default, Feb 26 2021, 18:47:35)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-64-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.18.2
pandas: 1.2.4
numpy: 1.20.2
scipy: 1.6.3
netCDF4: 1.5.6
pydap: None
h5netcdf: None
h5py: 3.2.1
Nio: None
zarr: 2.8.1
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.2
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.04.1
distributed: 2021.04.1
matplotlib: 3.4.1
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 52.0.0.post20210125
pip: 21.0.1
conda: 4.10.1
pytest: 6.2.4
IPython: 7.22.0
sphinx: None