Merge pull request #108 from NREL/feature/calibration-demo
Feature/calibration demo
kylecarow authored Feb 21, 2024
2 parents dde3d02 + ef1bc5f commit f031964
Showing 18 changed files with 8,206 additions and 126 deletions.
1 change: 1 addition & 0 deletions docs/src/SUMMARY.md
@@ -4,4 +4,5 @@
- [Documentation](./fastsim-doc.md)
- [Python](./python-doc.md)
- [Rust](./rust-doc.md)
- [Calibration/Validation](./cal_and_val.md)
- [How to Update This Book](./how-to-update.md)
59 changes: 59 additions & 0 deletions docs/src/cal_and_val.md
@@ -0,0 +1,59 @@
# Calibration and Validation of Vehicle Models
FASTSim powertrain models can have varying levels of calibration and resolution depending on the calibration and validation data available. In the simplest (US) case, the only validation data available for a powertrain model are the EPA "window sticker" energy consumption rates. In other situations, detailed dynamometer or on-road data are available for a particular vehicle, enabling much more detailed model calibration. This documentation summarizes these calibration levels and the tools available to help with more detailed calibration.

## Calibration/Validation Levels

| Level | Calibration | Validation |
| --- | --- | --- |
| 0 | Vehicle is parameterized without any fitting to performance data. This is called __parameterization__, not calibration. | May be none, or the model may be validated against aggregate energy consumption data such as EPA window sticker values. |
| 1 | Vehicle parameters are adjusted so that model results reasonably match aggregate, cycle-level test data (e.g. fuel usage, net SOC change). | Model results reasonably match at least some aggregate, cycle-level test data not used in any calibration process. |
| 2 | Vehicle parameters are adjusted so that model results reasonably match time-resolved test data (e.g. instantaneous fuel usage, instantaneous cumulative fuel usage, instantaneous SOC). | Model results reasonably match at least some time-resolved test data not used in any calibration process. |
| 3 | Some amount of component-level thermal modeling is included, and vehicle parameters are adjusted so that model results reasonably match time-resolved test data (e.g. instantaneous fuel usage, instantaneous cumulative fuel usage, instantaneous SOC). | Model results reasonably match time-resolved test data that was not used in any calibration process and that covers various temperatures and/or vehicle transient thermal states. |

Examples of calibration levels 0, 2, and 3 from the [FASTSim Validation Report](https://www.nrel.gov/docs/fy22osti/81097.pdf):

![image](https://github.com/NREL/fastsim/assets/4818940/1b7dae5d-b328-406e-9e2c-07abadff7a3a)

![image](https://github.com/NREL/fastsim/assets/4818940/530f6a15-8400-4618-a97a-da67609f6ecd)

![image](https://github.com/NREL/fastsim/assets/4818940/8483661f-dee4-4d59-9d69-e6d54dae0100)

## Calibration Level 2 Guidelines
- Copy
[calibration_demo.py](https://github.com/NREL/fastsim/blob/fastsim-2/python/fastsim/demos/calibration_demo.py)
to your project directory and modify as needed.
- By default, this script selects the model that minimizes the Euclidean error across
all objectives, which may not be how you want to select your final design.
Use both the time series and parallel coordinates plots generated in `save_path`
to down-select an appropriate design.
- Because PyMOO is a multi-objective optimizer that finds a multi-dimensional Pareto
surface, it will not necessarily return a single _best_ result -- rather, it will
produce a Pareto-optimal set of results from which you must down-select. Often, the design
with minimal Euclidean error will be the best design, but it's good to pick a handful
of designs from the Pareto set and check how they behave in the time-resolved plots
that can optionally be generated by the optimization script.
- Run `python calibration_demo.py --help` to see details about how to run calibration
and validation. A larger population size typically results in faster convergence at the
expense of increased run time for each generation. There is no benefit to having a
number of processes larger than the population size. `xtol` and `ftol` (see CLI help)
can be used to adjust when the minimization is considered converged. If the
optimization terminates when `n_max_gen` is hit, it has not converged, and you may
want to increase `n_max_gen`.
- Start with an existing vehicle model that is reasonably close to the new vehicle,
and provide as many explicit parameters as possible. Where measured values are
unavailable, reasonable engineering judgment is appropriate.
- Resample test data to 1 Hz, since higher frequency data will cause fastsim to run
more slowly. This can be done with `fastsim.resample.resample`; see the sketch after
this list. Be sure to specify `rate_vars` (e.g. fuel power flow rate [W]), which will
be time-averaged over the previous time step at the new frequency.
- Identify test data signals and corresponding fastsim signals that need to match.
These pairs of signals will be used to construct minimization objectives. See
where `obj_names` is defined in `calibration_demo.py` for an example.
- See where `cycs[key]` gets assigned to see an example of constructing a Cycle from a dataframe.
- Partition out calibration/validation data by specifying a tuple of regex patterns
that correspond to cycle names. See where `cal_cyc_patterns` is defined for an
example. Typically, it's good to reserve about 25-33% of your data for validation.
- To set the parameters and corresponding ranges that the optimizer is allowed to
adjust in getting the model to match test data, see where `params_and_bounds` is
defined in `calibration_demo.py`.
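
Below is a minimal sketch of the resampling and signal-pairing steps described above, assuming the test data is already loaded into a pandas dataframe. The file name, column names, `obj_names` entries, and the `dt_new`/`time_col` keyword arguments are illustrative assumptions rather than values taken from `calibration_demo.py`; check `fastsim/resample.py` and the demo script for the exact signatures and naming conventions.

```python
import pandas as pd

from fastsim.resample import resample

# Load higher-frequency dyno test data (file and column names are hypothetical).
df = pd.read_csv("udds_dyno_test.csv")

# Resample to 1 Hz so fastsim runs faster. Signals listed in `rate_vars`
# (e.g. fuel power flow rate [W]) are time-averaged over the previous time
# step at the new frequency rather than sampled directly. The `dt_new` and
# `time_col` keyword names are assumptions; check `fastsim/resample.py` for
# the exact signature.
df_1hz = resample(
    df,
    dt_new=1.0,
    time_col="Time[s]",
    rate_vars=("Fuel_Power[W]",),
)

# Pair each fastsim signal with the test-data signal it should match; these
# pairs become the minimization objectives, mirroring how `obj_names` is
# defined in `calibration_demo.py` (signal names here are illustrative).
obj_names = [
    ("fs_kw_out_ach", "Fuel_Power[W]"),
    ("soc", "HV_Battery_SOC"),
]
```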

3 changes: 2 additions & 1 deletion pyproject.toml
@@ -27,7 +27,8 @@ dependencies = [
"pyyaml",
"pytest",
"setuptools<=65.6.3", # suppresses pkg_resources deprecation warning
"openpyxl>=3.1.2"
"openpyxl>=3.1.2",
"plotly==5.18",
]

[project.urls]
16 changes: 8 additions & 8 deletions python/fastsim/__init__.py
@@ -6,14 +6,14 @@
import logging
import traceback

from . import fastsimrust
from . import fastsimrust as fsr
from . import parameters as params
from . import utils
from . import simdrive, vehicle, cycle, calibration, tests
from . import calibration as cal
from .resample import resample
from . import auxiliaries
from fastsim import parameters as params
from fastsim import utils
from fastsim import simdrive, vehicle, cycle, calibration, tests
from fastsim import calibration as cal
from fastsim.resample import resample
from fastsim import auxiliaries
from fastsim import fastsimrust
from fastsim import fastsimrust as fsr


def package_root() -> Path:
168 changes: 117 additions & 51 deletions python/fastsim/calibration.py
@@ -6,6 +6,7 @@
import argparse
import pandas as pd
import numpy as np
import numpy.typing as npt
import json
import time
import matplotlib.pyplot as plt
@@ -48,20 +49,22 @@
import fastsim.fastsimrust as fsr


def get_error_val(model, test, time_steps):
"""Returns time-averaged error for model and test signal.
Arguments:
----------
model: array of values for signal from model
test: array of values for signal from test data
time_steps: array (or scalar for constant) of values for model time steps [s]
test: array of values for signal from test
def get_error_val(
model: npt.NDArray[np.float64],
test: npt.NDArray[np.float64],
time_steps: npt.NDArray[np.float64]
) -> float:
"""
Returns time-averaged error for model and test signal.
Output:
-------
err: integral of absolute value of difference between model and
test per time"""
Args:
model (npt.NDArray[np.float64]): array of values for signal from model
test (npt.NDArray[np.float64]): array of values for signal from test data
time_steps (npt.NDArray[np.float64]): array (or scalar for constant) of values for model time steps [s]
Returns:
float: integral of absolute value of difference between model and test per time
"""
assert len(model) == len(test) == len(
time_steps), f"{len(model)}, {len(test)}, {len(time_steps)}"

Expand All @@ -73,7 +76,7 @@ class ModelObjectives(object):
Class for calculating eco-driving objectives
"""

# dictionary of bincode models to be simulated
# dictionary of json models to be simulated
models: Dict[str, str]

# dictionary of test data to calibrate against
@@ -124,20 +127,23 @@ def get_errors(
) -> Union[
Dict[str, Dict[str, float]],
# or if return_mods is True
Dict[str, fsim.simdrive.SimDrive],
Tuple[Dict[str, fsim.simdrive.SimDrive], Dict[str, Dict[str, float]]]
]:
"""
Calculate model errors w.r.t. test data for each element in dfs/models for each objective.
Arguments:
----------
- sim_drives: dictionary with user-defined keys and SimDrive or SimDriveHot instances
- return_mods: if true, also returns dict of solved models
- plot: if true, plots objectives using matplotlib.pyplot
- plot_save_dir: directory in which to save plots. If `None` (default), plots are not saved.
- plot_perc_err: whether to include % error axes in plots
- show: whether to show matplotlib.pyplot plots
- fontsize: plot font size
- plotly: whether to generate plotly plots, which can be opened manually in a browser window
Args:
sim_drives (Dict[str, fsr.RustSimDrive | fsr.SimDriveHot]): dictionary with user-defined keys and SimDrive or SimDriveHot instances
return_mods (bool, optional): if true, also returns dict of solved models. Defaults to False.
plot (bool, optional): if true, plots objectives using matplotlib.pyplot. Defaults to False.
plot_save_dir (Optional[str], optional): directory in which to save plots. If None, plots are not saved. Defaults to None.
plot_perc_err (bool, optional): whether to include % error axes in plots. Defaults to False.
show (bool, optional): whether to show matplotlib.pyplot plots. Defaults to False.
fontsize (float, optional): plot font size. Defaults to 12.
plotly (bool, optional): whether to generate plotly plots, which can be opened manually in a browser window. Defaults to False.
Returns:
Objectives and optionally solved models
"""
# TODO: should return type instead be `Dict[str, Dict[str, float]] | Tuple[Dict[str, Dict[str, float]], Dict[str, fsim.simdrive.SimDrive]]`
# This would make `from typing import Union` unnecessary
@@ -256,13 +262,13 @@ def update_params(self, xs: List[Any]):
assert len(xs) == len(self.params), f"({len(xs)} != {len(self.params)}"
paths = [fullpath.split(".") for fullpath in self.params]
t0 = time.perf_counter()
# Load instances from bincode strings
# Load instances from json strings
if not self.use_simdrivehot:
sim_drives = {key: fsr.RustSimDrive.from_bincode(
model_bincode) for key, model_bincode in self.models.items()}
sim_drives = {key: fsr.RustSimDrive.from_json(
model_json) for key, model_json in self.models.items()}
else:
sim_drives = {key: fsr.SimDriveHot.from_bincode(
model_bincode) for key, model_bincode in self.models.items()}
sim_drives = {key: fsr.SimDriveHot.from_json(
model_json) for key, model_json in self.models.items()}
# Update all model parameters
for key in sim_drives.keys():
sim_drives[key] = fsim.utils.set_attrs_with_path(
@@ -526,40 +532,100 @@ def run_minimize(
columns=[param for param in problem.mod_obj.params],
)

Path(save_path).mkdir(exist_ok=True, parents=True)
# with open(Path(save_path) / "pymoo_res.pickle", 'wb') as file:
# pickle.dump(res, file)
if save_path is not None:
Path(save_path).mkdir(exist_ok=True, parents=True)

res_df = pd.concat([x_df, f_df], axis=1)
res_df['euclidean'] = (
res_df.iloc[:, len(problem.mod_obj.params):] ** 2).sum(1).pow(1/2)
res_df.to_csv(Path(save_path) / "pymoo_res_df.csv", index=False)
if save_path is not None:
res_df.to_csv(Path(save_path) / "pymoo_res_df.csv", index=False)

t1 = time.perf_counter()
print(f"Elapsed time to run minimization: {t1-t0:.5g} s")

return res, res_df


def get_parser() -> argparse.ArgumentParser:
def get_parser(
def_description:str="Program for calibrating fastsim models.",
def_p:int=4,
def_n_max_gen:int=500,
def_pop_size:int=12,
def_save_path:Optional[str]="pymoo_res"

) -> argparse.ArgumentParser:
"""
Generate parser for optimization hyper params and misc. other params
Args:
def_p (int, optional): default number of processes. Defaults to 4.
def_n_max_gen (int, optional): max allowed generations. Defaults to 500.
def_pop_size (int, optional): default population size. Defaults to 12.
def_save_path (str, optional): default save path. Defaults to `pymoo_res`.
Returns:
argparse.ArgumentParser: configured argument parser
"""
parser = argparse.ArgumentParser(description='...')
parser.add_argument('-p', '--processes', type=int,
default=4, help="Number of pool processes.")
parser.add_argument('--n-max-gen', type=int, default=500,
help="PyMOO termination criterion: n_max_gen.")
parser.add_argument('--pop-size', type=int, default=12,
help="PyMOO population size in each generation.")
parser.add_argument('--skip-minimize', action="store_true",
help="If provided, load previous results.")
parser.add_argument('--save-path', type=str, default="pymoo_res",
help="File location to save results.")
parser.add_argument('--show', action="store_true",
help="If provided, shows plots.")
parser.add_argument("--make-plots", action="store_true",
help="Generates plots, if provided.")
parser.add_argument("--use-simdrivehot", action="store_true",
help="Use fsr.SimDriveHot rather than fsr.RustSimDrive.")
parser = argparse.ArgumentParser(description=def_description)
parser.add_argument(
'-p',
'--processes',
type=int,
default=def_p,
help=f"Number of pool processes. Defaults to {def_p}"
)
parser.add_argument(
'--n-max-gen',
type=int,
default=def_n_max_gen,
help=f"PyMOO termination criterion: n_max_gen. Defaults to {def_n_max_gen}"
)
parser.add_argument(
'--xtol',
type=float,
default=DMOT().x.termination.tol,
help=f"PyMOO termination criterion: xtol. Defaluts to {DMOT().x.termination.tol}"
)
parser.add_argument(
'--ftol',
type=float,
default=DMOT().f.termination.tol,
help=f"PyMOO termination criterion: ftol. Defaults to {DMOT().f.termination.tol}"
)
parser.add_argument(
'--pop-size',
type=int,
default=def_pop_size,
help=f"PyMOO population size in each generation. Defaults to {def_pop_size}"
)
parser.add_argument(
'--skip-minimize',
action="store_true",
help="If provided, load previous results."
)
parser.add_argument(
'--save-path',
type=str,
default=def_save_path,
help="File location to save results dataframe with rows of parameter and corresponding"
+ " objective values and any optional plots."
+ (" If not provided, results are not saved" if def_save_path is None else "")
)
parser.add_argument(
'--show',
action="store_true",
help="If provided, shows plots."
)
parser.add_argument(
"--make-plots",
action="store_true",
help="Generates plots, if provided."
)
parser.add_argument(
"--use-simdrivehot",
action="store_true",
help="Use fsr.SimDriveHot rather than fsr.RustSimDrive."
)

return parser
