bigint-syrk-blas: account for MPI shared memory limits #207

Closed
4 tasks done
vasdommes opened this issue Mar 3, 2024 · 0 comments

@vasdommes
Collaborator

Subtasks:

See details in https://github.com/davidsd/sdpb/blob/bigint-syrk-blas/src/sdp_solve/SDP_Solver/run/bigint_syrk/Readme.md

For large problems, the shared memory windows for P and/or Q do not fit into memory. Note also that HPC administrators sometimes set a limit on the shared window size, e.g. 50% of available RAM; exceeding this limit may result in a warning and, most probably, a crash:

It appears as if there is not enough space for
/tmp/openmpi-sessions-***@n131702_0/23041/1/0/shared_window_4.n131702
(the shared-memory backing file).
It is likely that your MPI job will now either abort or experience performance degradation.

Local host: n131702
Space Requested: 6710890760 B
Space Available: 6433673216 B

We can introduce a new command-line argument for SDPB, e.g. --maxSharedMemory=128G #203
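As a rough illustration of what handling a value like `128G` could involve, here is a minimal Python sketch that converts such a string to bytes. The function name `parse_memory_size` is hypothetical, and treating `K`/`M`/`G`/`T` as binary multiples (so `G` means 2^30 bytes) is an assumption of this sketch, not something the issue specifies.

```python
import re

# Assumption: suffixes are binary multiples (K = 2**10, M = 2**20, ...).
_UNITS = {"": 1, "K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_memory_size(text: str) -> int:
    """Convert a string such as '128G', '512MB' or '4096' to a byte count."""
    # Number, optional unit letter, optional trailing 'B'; case-insensitive.
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?)B?\s*", text.upper())
    if match is None:
        raise ValueError(f"cannot parse memory size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])
```

With this convention, `parse_memory_size("128G")` gives `128 * 2**30` bytes, and a plain number such as `"6710890760"` is read as bytes.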

If the total size of the P and Q windows exceeds maxSharedMemory, we should split them and process them part by part, #204 #205
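The general idea of processing a window part by part can be sketched as follows. This is not the splitting scheme used in SDPB (see the linked Readme for that); it only shows one generic way to cut a sequence of equally sized items into contiguous ranges that each fit under a byte budget. The function name `split_into_parts` and the notion of an "item" (for example, a group of rows or columns of a window) are assumptions made for illustration.

```python
from typing import List


def split_into_parts(item_count: int, bytes_per_item: int, max_bytes: int) -> List[range]:
    """Split items 0..item_count-1 into contiguous ranges that each fit in max_bytes."""
    if bytes_per_item <= 0:
        raise ValueError("bytes_per_item must be positive")
    # Largest number of items whose combined size stays within the budget.
    items_per_part = max_bytes // bytes_per_item
    if items_per_part == 0:
        raise ValueError("max_bytes is too small to hold a single item")
    return [
        range(start, min(start + items_per_part, item_count))
        for start in range(0, item_count, items_per_part)
    ]
```

For example, 10 items of 4 bytes each with a 12-byte budget yield four parts covering items 0-2, 3-5, 6-8 and 9. When everything fits within the budget, the result is a single range and no splitting is needed.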

We can also limit the shared window size automatically, using our memory estimates for blocks together with MemAvailable from /proc/meminfo #206
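A minimal sketch of the automatic limit, in Python: it reads the `MemAvailable` field (reported by Linux in kB, i.e. units of 1024 bytes) from the text of /proc/meminfo and subtracts an estimate of the memory needed elsewhere. Both function names are hypothetical, and the "available minus estimated usage" policy is an assumption for illustration, not the rule implemented in SDPB.

```python
def parse_mem_available(meminfo_text: str) -> int:
    """Return MemAvailable in bytes, given the contents of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Line format: "MemAvailable:    8000000 kB"
            fields = line.split()
            return int(fields[1]) * 1024
    raise ValueError("MemAvailable not found in meminfo text")


def automatic_window_limit(mem_available: int, estimated_block_bytes: int) -> int:
    """Hypothetical policy: whatever remains after the estimated block memory."""
    return max(0, mem_available - estimated_block_bytes)
```

On a Linux node, the text would come from reading the file `/proc/meminfo`; taking the text as an argument keeps the parsing testable on other systems.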
