output_dft should use parallel I/O under MPI #1707
The same problem happens with …
cc @oskooi I often see similar behavior (although never this extreme). Usually each "field dump" operation (e.g. when pulling fields after the forward and adjoint runs) induces a spike in memory usage that is 25-35% of the current consumption (e.g. from 150 GB to 200 GB and then back down). This might be due to the gathering that happens when the user makes a call like this: every process receives a copy of all the DFT fields (this is certainly the case when using the Python interface), so the distributed-memory paradigm breaks down at these I/O junctions. We've talked about getting around this for adjoint optimization, where the forward and adjoint fields are always stored locally on each process that owns them and the final recombination step is also performed locally. It might be nice to generalize that approach (e.g. for when you just want to dump the fields and aren't necessarily doing adjoint optimization). An even better solution (IMHO) is to use the hybrid multithreading/multiprocessing approach in #1628, which only requires one process per node.
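A toy mpi4py sketch of that distinction (this is not Meep code; the array size and variable names are made up for illustration): an `Allgatherv` materializes the full array on every rank, whereas a `Gatherv` onto a single root rank, or fully local per-rank I/O, avoids the N-fold duplication.

```python
# Toy illustration (not Meep internals) of why memory spikes on every rank
# when distributed data is gathered everywhere instead of onto one rank only.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
local = np.full(1_000_000, comm.rank, dtype=np.float64)  # this rank's chunk
counts = comm.allgather(local.size)                      # chunk sizes of all ranks

# Allgatherv: the full array is allocated and filled on EVERY rank, so the
# aggregate memory cost scales with (global size) x (number of ranks).
full_everywhere = np.empty(sum(counts), dtype=np.float64)
comm.Allgatherv(local, (full_everywhere, counts))

# Gatherv: only the root rank ever holds the full array.
full_on_root = np.empty(sum(counts), dtype=np.float64) if comm.rank == 0 else None
comm.Gatherv(local, (full_on_root, counts) if comm.rank == 0 else None, root=0)
```

The memory spikes described above are consistent with the first pattern, where every process ends up holding a full copy of the DFT fields.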
It looks like the relevant code is at Line 1129 in ef2c9ca. This should really be fixed so that each process only computes and writes its own portion of the DFT data.
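As a point of reference, here is a minimal sketch of what a per-process write could look like at the HDF5 level, using mpi4py plus an MPI-enabled h5py build; `write_local_slab`, `local_block`, and `offset` are hypothetical names for illustration and do not correspond to Meep's actual internals or data layout.

```python
# Sketch of collective ("parallel I/O") HDF5 output: each rank writes only the
# slab of the global array that it owns, so no rank ever needs a full copy.
# Requires h5py built against parallel HDF5 (the "mpio" driver).
from mpi4py import MPI
import h5py

def write_local_slab(filename, dataset, local_block, offset, global_shape,
                     comm=MPI.COMM_WORLD):
    """local_block: this rank's portion of the array; offset: its starting
    indices in the global array; global_shape: shape of the full array."""
    with h5py.File(filename, "w", driver="mpio", comm=comm) as f:
        # Dataset creation is collective: every rank must call it identically.
        dset = f.create_dataset(dataset, shape=global_shape,
                                dtype=local_block.dtype)
        sel = tuple(slice(o, o + n)
                    for o, n in zip(offset, local_block.shape))
        dset[sel] = local_block  # independent write of this rank's hyperslab
```

Meep's time-domain `output_hdf5` already writes each chunk's portion in parallel, so presumably the DFT output could be routed through the same machinery.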
@stevengj @HomerReid Is it crucial to set this at Line 1129 in ef2c9ca?
No, simply changing …
I use Meep with the Python interface under MPI on a cluster with 50-100 processes. With certain test parameters, memory usage during the simulation is ~25 GB and `run()` finishes successfully. After the simulation I try to save the frequency-domain fields to HDF5 using `output_dft()`. On `output_dft()` a rapid increase in memory usage to >100 GB occurs, which exceeds the memory allocated for the simulation, so I lose the simulation results. I also tried `get_dft_array()` and saving to HDF5 manually with h5py, and experienced the same problem. Importantly, when using `get_dft_array()` the memory increase occurs on the `get_dft_array()` call itself, i.e., before saving to HDF5, so the issue does not seem to be related to HDF5. As I understand it, `output_dft()` just calls the corresponding C++ method and creates no Python objects, so the issue does not seem to be related to the Python interface either. Is such behavior normal when using many MPI processes, or is this a problem with Meep?

I built Meep from source from the master branch.
Thanks in advance.
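For context, a minimal sketch of the workflow described in this report (the 2d geometry, frequencies, and file names are placeholders rather than the actual setup, and the exact `add_dft_fields` argument order may differ between Meep versions):

```python
import meep as mp
import h5py

# Placeholder 2d setup; the real simulation is much larger.
cell = mp.Vector3(16, 8, 0)
src = [mp.Source(mp.GaussianSource(frequency=0.15, fwidth=0.1),
                 component=mp.Ez, center=mp.Vector3(-6, 0))]
sim = mp.Simulation(cell_size=cell, resolution=10, sources=src,
                    boundary_layers=[mp.PML(1.0)])

# Accumulate frequency-domain (DFT) fields over the whole cell.
dft = sim.add_dft_fields([mp.Ez], 0.15, 0.1, 5,
                         where=mp.Volume(center=mp.Vector3(), size=cell))
sim.run(until=200)

# Path 1: built-in HDF5 output -- this is where the memory spike is observed.
sim.output_dft(dft, "dft_fields")

# Path 2: pull the array into Python and save it manually with h5py;
# the spike occurs on get_dft_array() itself, before any HDF5 call.
ez0 = sim.get_dft_array(dft, mp.Ez, 0)
if mp.am_master():
    with h5py.File("dft_manual.h5", "w") as f:
        f["ez_0"] = ez0
```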