Dependency issue causes significant memory leak on RMG Py3 #1850

Closed
kspieks opened this issue Dec 13, 2019 · 4 comments
@kspieks
Contributor

kspieks commented Dec 13, 2019

Bug Description

Significant memory leaks have been observed when using an rmg_env created around early Oct 2019. Memory leaks are not observed when using a recently created rmg_env, whose various dependencies are at an updated version.

Below is the result of diffing the conda list output from the two rmg_env environments (this output is tab spaced and easier to read than exporting each environment and diffing the yml files). The older rmg_env is listed first, so lines marked < show the older environment's version of each dependency and lines marked > show the newer environment's.

It is unclear which of these dependencies is responsible for the memory leak, but no leak is observed with the newer dependencies. If you created an rmg_env for Python 3.7 around early Oct 2019 and think you are experiencing memory leaks, the recommendation is to remove and recreate the rmg_env. Using the exact dependencies in the file below resolves the issue (change the extension to .yml first, since GitHub only lets me upload .txt files), though running conda env create -f environment.yml and letting conda pull the latest dependencies should also work well.
Functional_rmg_env.txt

diff Original_RMG_Server_rmg_env.txt Erebor_rmg_env.txt 
6c6
< absl-py                   0.8.0                    py37_0  
> absl-py                   0.8.1                    py37_0  
9c9
< attrs                     19.2.0                     py_0  
> attrs                     19.3.0                     py_0  
22c22
< cffi                      1.12.3           py37h2e261b9_0  
> cffi                      1.13.2           py37h2e261b9_0  
24a25
> cloudpickle               1.2.2                      py_0  
30,31c31,32
< dbus                      1.13.6               h746ee38_0  
< decorator                 4.4.0                    py37_1  
> dbus                      1.13.12              h746ee38_0  
> decorator                 4.4.1                      py_0  
33c34
< descriptastorus           2.0.0                      py_0    rmg
> descriptastorus           2.2.0                      py_0    rmg
43c44
< future                    0.17.1                   py37_0  
> future                    0.18.2                   py37_0  
45c46
< glib                      2.56.2               hd408876_0  
> glib                      2.63.1               h5a9c865_0  
47c48
< google-pasta              0.1.7                      py_0  
> google-pasta              0.1.8                      py_0  
58c59
< hyperopt                  0.1.2                      py_0    conda-forge
> hyperopt                  0.2.2                      py_0    conda-forge
60c61
< importlib-metadata        1.2.0                    pypi_0    pypi
> importlib_metadata        0.23                     py37_0  
62,63c63,64
< ipykernel                 5.1.2            py37h39e3cac_0  
< ipython                   7.8.0            py37h39e3cac_0  
> ipykernel                 5.1.3            py37h39e3cac_0  
> ipython                   7.9.0            py37h39e3cac_0  
69c70
< joblib                    0.13.2                   py37_0  
> joblib                    0.14.0                     py_0  
71c72
< jsonschema                3.0.2                    py37_0  
> jsonschema                3.1.1                    py37_0  
75c76
< jupyter_core              4.6.0                    py37_0  
> jupyter_core              4.6.1                    py37_0  
86c87
< libprotobuf               3.9.2                hd408876_0  
> libprotobuf               3.10.1               hd408876_0  
89c90
< libtiff                   4.0.10               h2733197_2  
> libtiff                   4.1.0                h2733197_0  
103c104
< mkl_fft                   1.0.14           py37ha843d7b_0  
> mkl_fft                   1.0.15           py37ha843d7b_0  
107c108
< more-itertools            8.0.2                    pypi_0    pypi
> more-itertools            7.2.0                    py37_0  
109c110
< nbconvert                 5.6.0                    py37_1  
> nbconvert                 5.6.1                    py37_0  
112c113
< networkx                  2.3                        py_0  
> networkx                  2.4                        py_0  
115c116
< notebook                  6.0.1                    py37_0  
> notebook                  6.0.2                    py37_0  
123c124
< pandas                    0.25.1           py37he6710b0_0  
> pandas                    0.25.3           py37he6710b0_0  
132,133c133,134
< pillow                    6.2.0            py37h34e0f95_0  
< pip                       19.2.3                   py37_0  
> pillow                    6.2.1            py37h34e0f95_0  
> pip                       19.3.1                   py37_0  
138,139c139,140
< protobuf                  3.9.2            py37he6710b0_0  
< psutil                    5.6.3            py37h7b6447c_0  
> protobuf                  3.10.1           py37he6710b0_0  
> psutil                    5.6.5            py37h7b6447c_0  
149c150
< pyparsing                 2.4.2                      py_0  
> pyparsing                 2.4.5                      py_0  
152c153
< pyrsistent                0.15.4           py37h7b6447c_0  
> pyrsistent                0.15.5           py37h7b6447c_0  
154,155c155,156
< python                    3.7.4                h265db76_1  
< python-dateutil           2.8.0                    py37_0  
> python                    3.7.5                h0371630_0  
> python-dateutil           2.8.1                      py_0  
164c165
< rdkit                     2019.03.4.0      py37hc20afe1_1    rdkit
> rdkit                     2019.09.1.0      py37hc20afe1_1    rdkit
170c171
< setuptools                41.4.0                   py37_0  
> setuptools                41.6.0                   py37_0  
172,173c173,174
< six                       1.12.0                   py37_0  
< sqlite                    3.30.0               h7b6447c_0  
> six                       1.13.0                   py37_0  
> sqlite                    3.30.1               h7b6447c_0  
183c184
< testpath                  0.4.2                    py37_0  
> testpath                  0.4.4                      py_0  
187c188
< tqdm                      4.36.1                     py_0  
> tqdm                      4.38.0                     py_0  
199c200
< zipp                      0.6.0                    pypi_0    pypi
> zipp                      0.6.0                      py_0  
201c202
< zstd                      1.3.7                h0b5b093_0 
\ No newline at end of file
> zstd                      1.3.7                h0b5b093_0  
\ No newline at end of file
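
For anyone who wants to repeat this comparison without reading raw diff output, here is a minimal sketch that parses two saved conda list outputs and reports the packages whose versions differ. The function names are made up for illustration, and the sketch assumes each package line has at least three whitespace-separated fields (name, version, build). It treats - and _ in package names as equivalent, so importlib-metadata and importlib_metadata are compared as one package, and it ignores build-string-only differences such as zipp.

```python
def parse_conda_list(text):
    """Map package name -> (version, build) from saved `conda list` output."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' comment header that conda prints.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        # Assumption: '-' and '_' name variants refer to the same package.
        name = fields[0].lower().replace("_", "-")
        packages[name] = (fields[1], fields[2])
    return packages


def changed_packages(old_text, new_text):
    """Return {name: (old_version, new_version)} for versions that differ.

    A version of None means the package is absent from that environment.
    """
    old = parse_conda_list(old_text)
    new = parse_conda_list(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        before = old.get(name, (None, None))[0]
        after = new.get(name, (None, None))[0]
        if before != after:
            changes[name] = (before, after)
    return changes
```

To use it, read the two text files (for example Original_RMG_Server_rmg_env.txt and Erebor_rmg_env.txt) and pass their contents as old_text and new_text.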

How To Reproduce

Recreating an rmg_env with the identical dependencies from this yml file should reproduce the issue, but don't do that to yourself! (GitHub doesn't let me upload .yml files, so just change the extension from .txt to .yml.)
Memory_Leak_mg_env.txt

@kspieks kspieks self-assigned this Dec 13, 2019
@kspieks
Contributor Author

kspieks commented Dec 13, 2019

@mliu49 @amarkpayne @mjohnson541
This issue can be closed anytime. It's just here to document the differences and provide a list of working dependencies in case that ever becomes useful.

@mliu49
Contributor

mliu49 commented Dec 13, 2019

I think this is most likely due to rdkit/rdkit#2639. We do a lot of RDKit atom creation during RMG jobs (for aromaticity perception and identifier generation). The timing for their fix to the issue also seems to align well.

If we confirm this, then we should increase the RDKit version requirement in our environment file.
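
One rough way to confirm a suspicion like this is to call the suspect operation many times and watch the process's peak resident memory. The helper below is a sketch under stated assumptions: the name peak_rss_growth is hypothetical, it relies on the standard-library resource module (Unix only), and ru_maxrss is reported in kilobytes on Linux but bytes on macOS, so the result is only meaningful when compared between runs on the same machine.

```python
import resource


def peak_rss_growth(fn, repeats=100_000):
    """Call fn() `repeats` times and return the rise in peak RSS.

    The value is in the platform's ru_maxrss units (KB on Linux,
    bytes on macOS). Peak RSS never decreases, so the result is >= 0.
    """
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(repeats):
        fn()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before
```

With RDKit installed, passing something like lambda: Chem.Atom(6) as fn and running the same script in environments with the older and newer RDKit would give a side-by-side comparison. Whether bare atom creation exercises the same code path as the leak in rdkit/rdkit#2639 is an assumption here, not something this thread confirms.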

@kspieks
Contributor Author

kspieks commented Dec 13, 2019

Indeed, taking the good rmg_env and downgrading to rdkit 2019.03.4.0 reproduces the memory leak exactly. Interestingly, the mystery 5th process seen when requesting 4 processors is not present, so that is due either to another dependency or perhaps to how that dependency interacts with the RMG server vs Erebor. Either way, the conclusion is to update rdkit with conda install -c rdkit rdkit=2019.09.1.0 to avoid significant memory leaks. Please see PR #1851, which updates our environment.yml file.
@everyone: Please check your statistic.xls files occasionally to see if memory usage is much higher than usual. If so, please check the dependencies. Thanks!
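
For checking the dependency itself, a small version comparison can flag an rdkit build older than the one that worked here. This is a sketch, not part of RMG: the function names are hypothetical, it assumes purely numeric dot-separated version strings, and treating 2019.09.1.0 as the minimum is an assumption based only on the two versions tested in this thread.

```python
MIN_RDKIT = "2019.09.1.0"  # version reported above as free of the leak


def version_tuple(version, width=4):
    """Turn '2019.09.1' into (2019, 9, 1, 0), padding with zeros."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def rdkit_is_new_enough(installed, minimum=MIN_RDKIT):
    """True if `installed` is at least `minimum` (assumed threshold)."""
    return version_tuple(installed) >= version_tuple(minimum)
```

The installed version string can be taken from the rdkit line of conda list, or from rdkit.__version__ inside the environment.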

@JacksonBurns
Contributor

Closing since this change was implemented. I will, however, tag this issue in the RDKit upgrade that we are currently working on.
