Huge memory consumption in batch jobs on 3D variables #16
Comments
Use the following command to run on multiple cores with dask. In the Python script, add the following lines at the start of the script so that dask is aware of the multiple cores.
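A minimal sketch of a typical dask-mpi setup, assuming the script is launched under MPI (e.g. with `mpirun` or `srun`) and that the `dask-mpi` package is used; the exact command and lines may have differed:

```python
# Batch submission (assumed), launching the script over several MPI ranks:
#   mpirun -np 16 python drift_script.py
# or, on a SLURM cluster:
#   srun -n 16 python drift_script.py

# At the start of the Python script, hand the MPI ranks over to dask:
# rank 0 becomes the scheduler, rank 1 runs this client code, and the
# remaining ranks become workers.
from dask_mpi import initialize
from dask.distributed import Client

initialize()        # bootstrap the dask scheduler and workers over MPI
client = Client()   # connect to the scheduler created by initialize()
print(client)       # confirm how many workers dask can actually see
```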
Memory issues still persist. Not sure what is going wrong.
The memory issue is mainly with 3D variables; the code seems to work fine with a single-core run. Also see related issues.
There is no clear solution yet. Nevertheless, the following seems to help a bit: using a context manager for reading the nc files, together with the preprocess kwarg, helps autoclose data files that are no longer required.
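A minimal sketch of that pattern with `xarray.open_mfdataset`; the path, variable name, and chunk sizes are placeholders rather than the actual drift code:

```python
import xarray as xr

def preprocess(ds):
    # keep only the variables that are needed, so dask does not have to
    # track the full contents of every file
    return ds[["thetao"]]              # placeholder variable name

# the context manager closes the underlying netCDF files when the block
# exits, instead of keeping every file handle alive for the whole run
with xr.open_mfdataset(
    "path/to/files_*.nc",              # placeholder path
    preprocess=preprocess,
    chunks={"time": 12},               # placeholder chunk sizes
    parallel=True,
) as ds:
    result = ds.mean("time").compute()
```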
Also see issue 5322 on dask/distributed, which has some information on file locks on workers; it could be related.
The memory blow-up issues could be related to dask itself. Dask released a major update in Nov 2022 (https://www.coiled.io/blog/reducing-dask-memory-usage), and the dask-mpi implementation has improved since then. More testing is required to make sure it works fine for all data.
Observation: rechunking within the code leads to memory blow-up. Specifying chunks while reading the data (as below) works fine, but if rechunking is performed within the code (e.g. as below), then dask-mpi fails with an "out_of_memory" error. The cause is not clear and needs investigation.
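The original snippets were not preserved here; a minimal sketch of the two patterns being contrasted, with placeholder paths, dimension names, and chunk sizes:

```python
import xarray as xr

# Works: specify the chunking directly at read time, so the task graph is
# built with the final chunk layout from the start
ds = xr.open_mfdataset("path/to/files_*.nc", chunks={"time": 12, "lev": 5})

# Blows up under dask-mpi: read first, then rechunk inside the code, which
# adds an extra rechunk layer on top of the existing graph
ds = xr.open_mfdataset("path/to/files_*.nc")
ds = ds.chunk({"time": 12, "lev": 5})
```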
The drift calculation code works fine in JASMIN notebooks. However, it runs into memory issues in LOTUS batch jobs: for some reason, the code starts to consume far more memory than it should. The same code works fine for 2D variables, where there is even a performance improvement with dask-mpi. This needs investigation.