Make interpolation after crash optional #113

Merged: 11 commits, Jul 11, 2024
2 changes: 1 addition & 1 deletion .github/workflows/run-unit-tests.yml
@@ -25,7 +25,7 @@ jobs:
       - name: Install Micro Manager and uninstall pyprecice
         working-directory: micro-manager
         run: |
-          pip3 install --user .
+          pip3 install --user .[sklearn]
           pip3 uninstall -y pyprecice

       - name: Run micro_manager unit test
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -2,6 +2,7 @@

 ## latest

+- Make `mpi4py` and `sklearn` optional dependencies
 - Set time step of micro simulation in the configuration, and use it in the coupling https://github.com/precice/micro-manager/pull/112
 - Add a base class called `MicroManager` with minimal API and member function definitions, rename the existing `MicroManager` class to `MicroManagerCoupling` https://github.com/precice/micro-manager/pull/111
 - Handle calling `initialize()` function of micro simulations written in languages other than Python https://github.com/precice/micro-manager/pull/110
7 changes: 7 additions & 0 deletions docs/configuration.md
@@ -164,6 +164,13 @@ The Micro Manager uses the output functionality of preCICE, hence these data set
 </participant>
 ```

+## Interpolate a crashed micro simulation
+
+If the optional dependency `sklearn` is installed, the Micro Manager can interpolate a crashed micro simulation. To interpolate a crashed micro simulation, set
+`"interpolate_crash": "True"` in the `simulation_params` section of the configuration file.
+
+For more details on the interpolation see the [interpolation documentation](tooling-micro-manager-running.html/#what-happens-when-a-micro-simulation-crashes).
+
 ## Next step

 After creating a configuration file you are ready to [run the Micro Manager](tooling-micro-manager-running.html).
12 changes: 11 additions & 1 deletion docs/installation.md
@@ -17,7 +17,13 @@ The Micro Manager package has the name [micro-manager-precice](https://pypi.org/
 pip install --user micro-manager-precice
 ```

-Unless already installed, the dependencies will be installed by `pip` during the installation procedure. preCICE itself needs to be installed separately. If you encounter problems in the direct installation, see the [dependencies section](#required-dependencies) below.
+Unless already installed, the dependencies will be installed by `pip` during the installation procedure. To [interpolate](tooling-micro-manager-running.html/#what-happens-when-a-micro-simulation-crashes) micro simulation results after a micro crash, the optional dependency `sklearn` is required. To install `micro-manager-precice` with `sklearn`, run
+
+```bash
+pip install --user micro-manager-precice[sklearn]
+```
+
+preCICE itself needs to be installed separately. If you encounter problems in the direct installation, see the [dependencies section](#required-dependencies) and [optional dependency section](#optional-dependencies) below.

 ### Option 2: Install manually

@@ -31,6 +37,10 @@ Ensure that the following dependencies are installed:
 * [numpy](https://numpy.org/install/)
 * [mpi4py](https://mpi4py.readthedocs.io/en/stable/install.html)

+#### Optional dependencies
+
+* [sklearn](https://scikit-learn.org/stable/index.html)
+
 #### Clone the Micro Manager

 ```bash
2 changes: 1 addition & 1 deletion docs/running.md
@@ -19,4 +19,4 @@ mpiexec -n micro-manager-precice micro-manager-config.json

 ## What Happens When a Micro Simulation Crashes?

-If a micro simulation crashes, the Micro Manager attempts to continue running. The error message from the micro simulation, along with the macro location, is logged in the Micro Manager log file. The Micro Manager continues the simulation run even if a micro simulation crashes. Results of the crashed micro simulation are generated by interpolating results of a certain number of similar running simulations. The [inverse distance weighted](https://en.wikipedia.org/wiki/Inverse_distance_weighting) method is used. If more than 20% of global micro simulations crash or if locally no neighbors are available for interpolation, the Micro Manager terminates.
+If a micro simulation crashes and the Micro Manager is [configured to interpolate](tooling-micro-manager-configuration.html/#Interpolate-a-crashed-micro-simulation) a crashed micro simulation, the Micro Manager attempts to continue running. The error message from the micro simulation, along with the macro location, is logged in the Micro Manager log file. The Micro Manager continues the simulation run even if a micro simulation crashes. Results of the crashed micro simulation are generated by interpolating results of a certain number of similar running simulations. The [inverse distance weighted](https://en.wikipedia.org/wiki/Inverse_distance_weighting) method is used. If more than 20% of global micro simulations crash or if locally no neighbors are available for interpolation, the Micro Manager terminates.
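Review note: for readers unfamiliar with inverse distance weighting, the interpolation described in the `docs/running.md` paragraph can be sketched roughly as follows. This is a minimal illustration and not the Micro Manager's `Interpolation` class; the function name `inverse_distance_weighting`, its signature, and the exponent `power=2.0` are assumptions made for this example.

```python
import numpy as np


def inverse_distance_weighting(neighbor_coords, neighbor_values, point, power=2.0):
    """Estimate a value at `point`, weighting each neighbor by 1 / distance**power.

    Hypothetical helper for illustration; `power` is an assumed default.
    """
    coords = np.asarray(neighbor_coords, dtype=float)
    values = np.asarray(neighbor_values, dtype=float)
    distances = np.linalg.norm(coords - np.asarray(point, dtype=float), axis=1)

    # A neighbor that coincides with the point would get an infinite weight,
    # so its value is returned directly.
    coincident = np.isclose(distances, 0.0)
    if coincident.any():
        return values[coincident][0]

    weights = 1.0 / distances**power
    return weights @ values / weights.sum()


estimate = inverse_distance_weighting(
    neighbor_coords=[[1, 0, 0], [0, 2, 0]],
    neighbor_values=[1.0, 4.0],
    point=[0, 0, 0],
)
print(estimate)
```

With neighbors at distances 1 and 2 and values 1.0 and 4.0, the weights are 1 and 0.25, so the estimate is (1 * 1.0 + 0.25 * 4.0) / 1.25 = 1.6. Closer neighbors dominate the result, which is why the method needs at least one surviving neighbor to work.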
17 changes: 17 additions & 0 deletions micro_manager/config.py
@@ -39,6 +39,8 @@ def __init__(self, logger, config_filename):

         self._output_micro_sim_time = False

+        self._interpolate_crash = False
+
         self._adaptivity = False
         self._adaptivity_type = "local"
         self._data_for_adaptivity = dict()
@@ -200,6 +202,10 @@ def read_json(self, config_filename):
             self._write_data_names["active_state"] = False
             self._write_data_names["active_steps"] = False

+        if "interpolate_crash" in data["simulation_params"]:
+            if data["simulation_params"]["interpolate_crash"] == "True":
+                self._interpolate_crash = True
+
         try:
             diagnostics_data_names = data["diagnostics"]["data_from_micro_sims"]
             assert isinstance(
@@ -445,3 +451,14 @@ def get_micro_dt(self):
             Size of the micro time window.
         """
         return self._micro_dt
+
+    def interpolate_crashed_micro_sim(self):
+        """
+        Check if user wants crashed micro simulations to be interpolated.
+
+        Returns
+        -------
+        interpolate_crash : bool
+            True if crashed micro simulations need to be interpolated, False otherwise.
+        """
+        return self._interpolate_crash
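
Review note: the check added to `read_json` compares the flag against the string `"True"`, so a JSON boolean `true` or a missing key both leave interpolation off. The sketch below reproduces only that comparison; `read_interpolate_crash` is a hypothetical helper written for this note, not a function in `config.py`.

```python
import json


def read_interpolate_crash(config_text):
    """Return True only if simulation_params.interpolate_crash is the string "True".

    Hypothetical stand-in for the comparison made in read_json.
    """
    data = json.loads(config_text)
    return data["simulation_params"].get("interpolate_crash") == "True"


# The string "True" switches interpolation on.
enabled = read_interpolate_crash('{"simulation_params": {"interpolate_crash": "True"}}')

# A JSON boolean is parsed to Python True, which is not equal to the string "True".
json_bool = read_interpolate_crash('{"simulation_params": {"interpolate_crash": true}}')

# A missing key keeps the default, which is off.
missing = read_interpolate_crash('{"simulation_params": {}}')
```

This matches the convention used by the other flags in the test configuration (for example `"adaptivity": "True"`), but it may be worth stating in the docs that the value must be quoted.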
109 changes: 73 additions & 36 deletions micro_manager/micro_manager.py
@@ -29,7 +29,11 @@
 from .adaptivity.local_adaptivity import LocalAdaptivityCalculator
 from .domain_decomposition import DomainDecomposer
 from .micro_simulation import create_simulation_class
-from .interpolation import Interpolation
+
+try:
+    from .interpolation import Interpolation
+except ImportError:
+    Interpolation = None

 sys.path.append(os.getcwd())

@@ -67,8 +71,16 @@ def __init__(self, config_file: str) -> None:
         self._is_micro_solve_time_required = self._config.write_micro_solve_time()

         # Parameter for interpolation in case of a simulation crash
-        self._crash_threshold = 0.2
-        self._number_of_nearest_neighbors = 4
+        self._interpolate_crashed_sims = self._config.interpolate_crashed_micro_sim()
+        if self._interpolate_crashed_sims:
+            if Interpolation is None:
+                self._logger.info(
+                    "Interpolation is turned off as the required package is not installed."
+                )
+                self._interpolate_crashed_sims = False
+            else:
+                self._crash_threshold = 0.2
+                self._number_of_nearest_neighbors = 4

         self._mesh_vertex_ids = None  # IDs of macro vertices as set by preCICE
         self._micro_n_out = self._config.get_micro_output_n()
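
Review note: the two hunks above combine an optional import with a runtime fallback, so interpolation stays on only when it is both requested and importable. A self-contained sketch of that pattern is given here; `load_optional` and `resolve_interpolation` are hypothetical names for this note, and the standard-library module `json` stands in for an installed backend.

```python
import importlib


def load_optional(module_name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def resolve_interpolation(requested, backend):
    """Keep interpolation on only if the user asked for it and the backend exists."""
    return bool(requested and backend is not None)


# A module name that is assumed not to be installed anywhere.
missing_backend = load_optional("package_that_is_not_installed_xyz")

# A standard-library module that is always importable.
available_backend = load_optional("json")

print(resolve_interpolation(True, missing_backend))
print(resolve_interpolation(True, available_backend))
```

Because the fallback silently changes behaviour from "interpolate" to "terminate on first crash", logging it at warning level rather than info level might make it easier to spot; that is a suggestion, not something this PR does.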
@@ -235,24 +247,27 @@ def solve(self) -> None:
             micro_sims_output = self._solve_micro_simulations(micro_sims_input, dt)

             # Check if more than a certain percentage of the micro simulations have crashed and terminate if threshold is exceeded
-            crashed_sims_on_all_ranks = np.zeros(self._size, dtype=np.int64)
-            self._comm.Allgather(
-                np.sum(self._has_sim_crashed), crashed_sims_on_all_ranks
-            )
-
-            if self._is_parallel:
-                crash_ratio = (
-                    np.sum(crashed_sims_on_all_ranks) / self._global_number_of_sims
+            if self._interpolate_crashed_sims:
+                crashed_sims_on_all_ranks = np.zeros(self._size, dtype=np.int64)
+                self._comm.Allgather(
+                    np.sum(self._has_sim_crashed), crashed_sims_on_all_ranks
                 )
-            else:
-                crash_ratio = np.sum(self._has_sim_crashed) / len(self._has_sim_crashed)

-            if crash_ratio > self._crash_threshold:
-                self._logger.info(
-                    "{:.1%} of the micro simulations have crashed exceeding the threshold of {:.1%}. "
-                    "Exiting simulation.".format(crash_ratio, self._crash_threshold)
-                )
-                sys.exit()
+                if self._is_parallel:
+                    crash_ratio = (
+                        np.sum(crashed_sims_on_all_ranks) / self._global_number_of_sims
+                    )
+                else:
+                    crash_ratio = np.sum(self._has_sim_crashed) / len(
+                        self._has_sim_crashed
+                    )
+
+                if crash_ratio > self._crash_threshold:
+                    self._logger.info(
+                        "{:.1%} of the micro simulations have crashed exceeding the threshold of {:.1%}. "
+                        "Exiting simulation.".format(crash_ratio, self._crash_threshold)
+                    )
+                    sys.exit()

             self._write_data_to_precice(micro_sims_output)
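
Review note: the threshold check in this hunk uses a strict comparison, so a crash ratio of exactly 20% does not terminate the run. The sketch below covers only the serial branch (no MPI `Allgather`); `crash_ratio_exceeded` is a hypothetical helper for this note, and the default of 0.2 mirrors `self._crash_threshold`.

```python
import numpy as np


def crash_ratio_exceeded(has_sim_crashed, crash_threshold=0.2):
    """Return (exceeded, ratio) for a list of per-simulation crash flags."""
    crash_ratio = np.sum(has_sim_crashed) / len(has_sim_crashed)
    return bool(crash_ratio > crash_threshold), float(crash_ratio)


# One crash out of five is exactly at the threshold, so the run would continue.
at_threshold, ratio_at = crash_ratio_exceeded([True, False, False, False, False])

# Two crashes out of five exceed the threshold, so the run would terminate.
above_threshold, ratio_above = crash_ratio_exceeded([True, True, False, False, False])
message = "{:.1%} of the micro simulations have crashed".format(ratio_above)
print(at_threshold, above_threshold, message)
```

In the parallel branch the same ratio is computed from the crash counts gathered across all ranks divided by the global number of simulations.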

@@ -383,7 +398,8 @@ def initialize(self) -> None:

         # Setup for simulation crashes
         self._has_sim_crashed = [False] * self._local_number_of_sims
-        self._interpolant = Interpolation(self._logger)
+        if self._interpolate_crashed_sims:
+            self._interpolant = Interpolation(self._logger)

         micro_problem = getattr(
             importlib.import_module(
@@ -668,21 +684,32 @@ def _solve_micro_simulations(self, micro_sims_input: list, dt: float) -> list:
                     self._logger.error(error_message)
                     self._has_sim_crashed[count] = True

+        # If interpolate is off, terminate after crash
+        if not self._interpolate_crashed_sims:
+            crashed_sims_on_all_ranks = np.zeros(self._size, dtype=np.int64)
+            self._comm.Allgather(
+                np.sum(self._has_sim_crashed), crashed_sims_on_all_ranks
+            )
+            if sum(crashed_sims_on_all_ranks) > 0:
+                self._logger.info("Exiting simulation after micro simulation crash.")
+                sys.exit()
+
         # Interpolate result for crashed simulation
         unset_sims = [
             count for count, value in enumerate(micro_sims_output) if value is None
         ]

         # Iterate over all crashed simulations to interpolate output
-        for unset_sim in unset_sims:
-            self._logger.info(
-                "Interpolating output for crashed simulation at macro vertex {}.".format(
-                    self._mesh_vertex_coords[unset_sim]
+        if self._interpolate_crashed_sims:
+            for unset_sim in unset_sims:
+                self._logger.info(
+                    "Interpolating output for crashed simulation at macro vertex {}.".format(
+                        self._mesh_vertex_coords[unset_sim]
+                    )
+                )
+                micro_sims_output[unset_sim] = self._interpolate_output_for_crashed_sim(
+                    micro_sims_input, micro_sims_output, unset_sim
                 )
-            )
-            micro_sims_output[unset_sim] = self._interpolate_output_for_crashed_sim(
-                micro_sims_input, micro_sims_output, unset_sim
-            )

         return micro_sims_output

@@ -772,23 +799,33 @@ def _solve_micro_simulations_with_adaptivity(
                     self._logger.error(error_message)
                     self._has_sim_crashed[active_id] = True

+        # If interpolate is off, terminate after crash
+        if not self._interpolate_crashed_sims:
+            crashed_sims_on_all_ranks = np.zeros(self._size, dtype=np.int64)
+            self._comm.Allgather(
+                np.sum(self._has_sim_crashed), crashed_sims_on_all_ranks
+            )
+            if sum(crashed_sims_on_all_ranks) > 0:
+                self._logger.info("Exiting simulation after micro simulation crash.")
+                sys.exit()
         # Interpolate result for crashed simulation
         unset_sims = []
         for active_id in active_sim_ids:
             if micro_sims_output[active_id] is None:
                 unset_sims.append(active_id)

         # Iterate over all crashed simulations to interpolate output
-        for unset_sim in unset_sims:
-            self._logger.info(
-                "Interpolating output for crashed simulation at macro vertex {}.".format(
-                    self._mesh_vertex_coords[unset_sim]
+        if self._interpolate_crashed_sims:
+            for unset_sim in unset_sims:
+                self._logger.info(
+                    "Interpolating output for crashed simulation at macro vertex {}.".format(
+                        self._mesh_vertex_coords[unset_sim]
+                    )
                 )
-            )

-            micro_sims_output[unset_sim] = self._interpolate_output_for_crashed_sim(
-                micro_sims_input, micro_sims_output, unset_sim, active_sim_ids
-            )
+                micro_sims_output[unset_sim] = self._interpolate_output_for_crashed_sim(
+                    micro_sims_input, micro_sims_output, unset_sim, active_sim_ids
+                )

         # For each inactive simulation, copy data from most similar active simulation
         if self._adaptivity_type == "global":
5 changes: 4 additions & 1 deletion pyproject.toml
@@ -6,7 +6,7 @@ build-backend = "setuptools.build_meta"
 name="micro-manager-precice"
 dynamic = [ "version" ]
 dependencies = [
-    "pyprecice>=3.1", "numpy", "mpi4py", "scikit-learn"
+    "pyprecice>=3.1", "numpy", "mpi4py"
 ]
 requires-python = ">=3.8"
 authors = [
@@ -27,6 +27,9 @@ classifiers=[
     "Topic :: Scientific/Engineering",
 ]

+[project.optional-dependencies]
+sklearn = ["scikit-learn"]
+
 [project.urls]
 Homepage = "https://precice.org"
 Documentation = "https://precice.org/tooling-micro-manager-overview.html"
1 change: 1 addition & 0 deletions tests/unit/micro-manager-config_crash.json
@@ -9,6 +9,7 @@
     },
     "simulation_params": {
         "macro_domain_bounds": [0.0, 25.0, 0.0, 25.0, 0.0, 25.0],
+        "interpolate_crash": "True",
         "adaptivity": "True",
         "adaptivity_settings": {
             "type": "local",
1 change: 1 addition & 0 deletions tests/unit/test_interpolation.py
@@ -34,6 +34,7 @@ def test_nearest_neighbor(self):
         """
         Test if finding nearest neighbor works as expected if interpolation point
         itself is not part of neighbor coordinates.
+        Note: running this test requires the scikit-learn package to be installed.
         """
         neighbors = [[0, 2, 0], [0, 3, 0], [0, 0, 4], [-5, 0, 0], [0, 0, 0]]
         inter_coord = [0, 0, 0]
2 changes: 2 additions & 0 deletions tests/unit/test_micro_simulation_crash_handling.py
@@ -27,6 +27,7 @@ def test_crash_handling(self):
         """
         Test if the Micro Manager catches a simulation crash and handles it adequately.
         A crash is caught by interpolation within _solve_micro_simulations.
+        Note: running this test requires the scikit-learn package to be installed.
         """

         macro_data = []
@@ -74,6 +75,7 @@ def test_crash_handling_with_adaptivity(self):
         """
         Test if the micro manager catches a simulation crash and handles it adequately with adaptivity.
         A crash is caught by interpolation within _solve_micro_simulations_with_adaptivity.
+        Note: running this test requires the scikit-learn package to be installed.
         """

         macro_data = []