Adding Jupyter Book content for v0.4 release #112

Merged · 10 commits · Sep 7, 2024

Changes from all commits
33 changes: 27 additions & 6 deletions book/BMM_intro.md
@@ -1,8 +1,29 @@
# Bayesian Model Mixing: a short introduction to the field

Coming soon:
- What is Bayesian model mixing?
- How is it used?
- How does it differ from Bayesian model averaging?
- What methods have so far been developed? Which are included in Taweret?
- What are the frontiers of the field? Where could a user contribute?
Uncertainty quantification using Bayesian methods is still at the frontiers of modern research, and Bayesian model mixing is one branch of this field. Bayesian model mixing (BMM) combines the predictions from multiple models such that the fidelity of each model is preserved in the final, mixed result. As of yet, however, no comprehensive review article or package exists to facilitate easy access to both the concepts and the numerics of this research area. This package helps to satisfy the latter need: it contains state-of-the-art techniques to perform Bayesian model mixing on many types of user-provided scenarios, as well as some toy models already encoded for practice with these methods. Currently, Taweret contains three BMM techniques, each pertaining to a different class of problem, and has been designed as a conveniently modular package so that a user can add their own model mixing method or model to the framework. Taweret is also formulated generally enough to be used well beyond the nuclear physics applications in which it was originally tested.

## What is Bayesian model mixing?

Bayesian model mixing combines various models that describe the same underlying system; it can do so in several different ways, with the appropriate choice decided by the specifics of the problem and the information available. The overarching theme of BMM is that it uses input-dependent weights to combine the models at hand, a technique distinct from the more well-known Bayesian model averaging (BMA). The latter typically uses the model evidences as global weights to average the models over the input space. BMM is arguably an improvement over BMA: because its weights depend on the location in the input space, it allows more precise model weighting and maximizes the predictive power of each individual model over the region in which it dominates.

Bayesian model mixing methods are commonly divided into two groups, "mean mixing" and "density mixing": the former weights the moments of two or more models, while the latter mixes the entire predictive distribution of each model. Mean mixing can be summarized by

$$
E[Y | x] = \sum_{k=1}^{K} w_{k}(x) f_{k}(x),
$$

where $E[Y | x]$ is the mean of the observations $Y$ given the input parameter vector $x$, $f_{k}(x)$ is the mean prediction of the $k$th model $M_{k}$, and $w_{k}(x)$ is the input-dependent weight of each model.
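
As a quick numerical illustration of mean mixing (a minimal sketch, not Taweret's API; the sigmoid weight function and model means below are invented for demonstration):

```python
import numpy as np

# Hypothetical input-dependent weight for model 1; model 2 receives 1 - w1
def sigmoid_weight(x, x0=0.5, width=0.1):
    return 1.0 / (1.0 + np.exp((x - x0) / width))

x = np.linspace(0.0, 1.0, 101)
f1 = 1.0 + 0.5 * x       # toy mean prediction of model 1 (dominates at small x)
f2 = 2.0 - 0.8 * x**2    # toy mean prediction of model 2 (dominates at large x)

# Mean mixing: E[Y | x] = sum_k w_k(x) f_k(x), with the weights summing to one
w1 = sigmoid_weight(x)
mixed_mean = w1 * f1 + (1.0 - w1) * f2
```

Density mixing can instead be written as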

$$
p[Y_{0} | x_{0}, Y] = \sum_{k=1}^{K} w_{k}(x_{0}) p(Y_{0} | x_{0}, Y, M_{k}),
$$

where $p(Y_{0} | x_{0}, Y, M_{k})$ is the predictive density of a future observation $Y_{0}$ given the new location $x_{0}$, the observed data $Y$, and the $k$th model $M_{k}$.
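
A draw from the mixed predictive density can be simulated by first selecting a model with probability $w_{k}(x_{0})$ and then sampling from that model's predictive density. A minimal sketch, assuming invented Gaussian predictives at a single point $x_{0}$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy values at a single new input x0 (all invented for illustration)
w = np.array([0.7, 0.3])             # weights w_k(x0), summing to one
pred_means = np.array([1.25, 1.80])  # predictive means of each model at x0
pred_sds = np.array([0.10, 0.20])    # predictive standard deviations at x0

# Density mixing: pick model k with probability w_k(x0), then draw Y0 from it
k = rng.choice(len(w), size=10_000, p=w)
samples = rng.normal(pred_means[k], pred_sds[k])
```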

The choice of weight function is the most difficult decision in each individual problem encountered. In Taweret, weight functions are defined for each method, as users will see in the package structure and in the following tutorials.

## Outline of this Book

In the subsequent chapters, there are numerous tutorials to help a new user become comfortable with the model mixing methods in this package, and to test each method to determine which is best for their use case. We begin with linear model mixing, which works best when experimental data are available to help train the hyperparameters of the chosen mixing function. We then move to multivariate model mixing, where each model is formulated as a Gaussian distribution and precision weighting is used to combine the models across the input space.

We encourage users to play with the codes and improve them, or to implement their own model mixing methods and toy models in the future. As the field of Bayesian model mixing expands, Taweret should expand with it to contain as many techniques as possible, so that users have a single package to turn to when they want to try different model mixing techniques on their problems, whether those lie in nuclear physics, meteorology, statistics, or any other field requiring input from Bayesian inference.
8 changes: 7 additions & 1 deletion book/_config.yml
@@ -4,6 +4,8 @@ copyright: "2024"
logo: taweret_logo.PNG
repository:
url: https://github.com/bandframework/Taweret
path_to_book: book/ # Optional path to your book, relative to the repository root
branch: main # Which branch of the repository should be used when creating links (optional)
execute:
execute_notebooks: force
timeout: -1
@@ -18,5 +20,9 @@
use_issues_button : true # Whether to add an "open an issue" button
home_page_in_navbar : true # Whether to have the home page in the side bar

launch_buttons:
notebook_interface: "jupyterlab" # or "classic"
colab_url: "https://colab.research.google.com" # Adding Colab launch option

# disable building anything not included in the TOC
only_build_toc_files: true
only_build_toc_files: true
4 changes: 4 additions & 0 deletions book/landing.md
@@ -6,6 +6,10 @@ pertaining to each Bayesian model mixing (BMM) method in the package. The first
## Documentation
The full documentation of the package can be found here: https://taweretdocs.readthedocs.io/en/latest/index.html.

## Running the notebooks
This Jupyter Book can launch its notebooks in Google Colab, so that you can run them live. Notebook-specific dependencies
are installed by `pip` cells at the top of each notebook---simply comment those lines out if the packages are already installed.

## Citing Taweret
If you have benefited from Taweret, please cite our software using the following format:

22 changes: 20 additions & 2 deletions book/notebooks/Biv_BMM/Bivariate_MM.md
@@ -1,3 +1,21 @@
# Bivariate Bayesian model mixing: an introduction
# Multivariate Bayesian model mixing: an introduction

Stuff will also go here.
Multivariate model mixing uses the precision of each model at every point in the input space to weight the individual models locally. This method does not require the user to define a mixing function: the weights are determined by the model variances at each point in the space, which removes the bias of choosing a functional form for the mixing. It is purely moment-based mixing, or mean mixing, and can be described as

$$
\mathcal{M}_{\dagger} \sim \mathcal{N}(f_{\dagger}, Z_{P}^{-1}): \quad f_{\dagger} = \frac{1}{Z_{P}} \sum_{k=1}^{K} \frac{1}{\sigma_{k}^{2}} f_{k}, \quad Z_{P} \equiv \sum_{k=1}^{K} \frac{1}{\sigma_{k}^{2}},
$$

with each individual model assumed to be normally distributed:

$$
\mathcal{M}_{k} \sim \mathcal{N}(f_k(x), \sigma_{k}^{2}(x)),
$$

Here $Z_{P}$ is the total precision of the models, i.e., the sum of their inverse variances; $f_{\dagger}$ is the desired mean of the mixed model, $f_{k}(x)$ the mean of the $k$th model, and $\sigma_{k}^{2}(x)$ the variance of the $k$th model at each point $x$.

This method is currently one-dimensional in the input space and requires the full posterior predictive distribution (PPD) of each model being mixed; future development aims to add simultaneous model calibration and mixing.
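
As a numerical illustration of the precision weighting above (a minimal sketch with invented model means and variances, not Taweret's API):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 101)
# Toy Gaussian predictions for K = 2 models: means f_k(x), variances sigma_k^2(x)
means = np.stack([1.0 + 0.5 * x, 2.0 - 0.8 * x**2])        # shape (K, N)
variances = np.stack([0.05 + x**2, 0.05 + (1.0 - x)**2])   # shape (K, N)

# Precision weighting: Z_P = sum_k 1/sigma_k^2 and
# f_dagger = (1/Z_P) sum_k f_k / sigma_k^2
precisions = 1.0 / variances
Z_P = precisions.sum(axis=0)
f_dagger = (precisions * means).sum(axis=0) / Z_P
mixed_variance = 1.0 / Z_P   # variance of the mixed Gaussian, Z_P^{-1}
```

Each model is trusted most wherever its own variance is smallest, so no mixing-function hyperparameters enter the result.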

:::{seealso}
See [this work](https://doi.org/10.1103/PhysRevC.106.044002) for more details of using this package on a toy model for effective field theories (EFTs), and [this package](https://github.com/asemposki/SAMBA) for the original code containing this mixing method.
:::
31 changes: 29 additions & 2 deletions book/notebooks/LMM/LMM.md
@@ -1,3 +1,30 @@
# Bayesian linear model mixing: an introduction
# Bivariate linear model mixing: an introduction

Stuff will go here.
Bivariate linear model mixing combines two independent models using either the density- or mean-mixing strategy discussed in the previous chapter. For density-based mixing, the posterior predictive distribution can be written generally as

$$
p(F(x)|\theta) \sim \alpha(x;\theta) F_1(x) + (1-\alpha(x;\theta)) F_2(x),
$$

where $F(x)$ is the underlying theory we wish to describe with the combination of the individual models $F_1(x)$ and $F_2(x)$. Here, $\alpha(x;\theta)$ is the mixing function, dependent on the input space $x$ and hyperparameters $\theta$, chosen by the user to combine the two models. This choice should be informed by an understanding of the system at hand, and hence considerably influences the result of the model mixing. The current choices for this mixing function are

- Step function: $\Theta(\beta_{0} - x)$;
- Asymmetric 2-step: $\zeta \Theta(\beta_{0} - x) + (1 - \zeta) \Theta(\beta_{1} - x)$;
- Sigmoid: $\left[1 + \exp\left((x - \beta_{0})/\beta_{1}\right)\right]^{-1}$;
- Cosine:

$$
\alpha(x; \theta) =
\begin{cases}
1, & x \leq \theta_{1}; \\
\frac{1}{2}\left[1 + \cos\left(\frac{\pi}{2} \left(\frac{x-\theta_{1}}{\theta_{2} - \theta_{1}}\right)\right)\right], & \theta_{1} < x \leq \theta_{2}; \\
\frac{1}{2}\left[1 + \cos\left(\frac{\pi}{2} \left(1 + \frac{x - \theta_{2}}{\theta_{3} - \theta_{2}} \right) \right) \right], & \theta_{2} < x \leq \theta_{3}; \\
0, & x > \theta_{3}.
\end{cases}
$$

In all of the above functions, $\beta_{0}$ and $\beta_{1}$ are the shape parameters of the mixing function, $\zeta$ is a mixing hyperparameter, and the $\theta_{i}$ are the hyperparameters of the piecewise-cosine mixing function.
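
As an illustration, the piecewise-cosine function above can be implemented directly (a minimal NumPy sketch of the formula as written here, not Taweret's own implementation):

```python
import numpy as np

def switchcos(x, theta1, theta2, theta3):
    """Piecewise-cosine mixing function alpha(x; theta) from the text."""
    x = np.asarray(x, dtype=float)
    alpha = np.empty_like(x)
    mid1 = (x > theta1) & (x <= theta2)
    mid2 = (x > theta2) & (x <= theta3)
    alpha[x <= theta1] = 1.0
    alpha[mid1] = 0.5 * (1.0 + np.cos(np.pi / 2 * (x[mid1] - theta1) / (theta2 - theta1)))
    alpha[mid2] = 0.5 * (1.0 + np.cos(np.pi / 2 * (1.0 + (x[mid2] - theta2) / (theta3 - theta2))))
    alpha[x > theta3] = 0.0
    return alpha

# Weight for model F_1; model F_2 receives 1 - alpha
alpha = switchcos(np.linspace(0.0, 1.0, 201), theta1=0.2, theta2=0.5, theta3=0.8)
```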

:::{important}
In this method, the models are expected to be mixed one-dimensionally, but it is currently possible to simultaneously mix the models and determine the hyperparameters of each model (calibration), a feature not built into any of the other model mixing methods in this package at present. Future work will aim to make this simultaneous calibration and mixing available in all model mixing scenarios.
:::
@@ -48,6 +48,10 @@
"\n",
"# For plotting\n",
"import matplotlib.pyplot as plt\n",
"\n",
"! pip install seaborn # comment if installed\n",
"! pip install ptemcee # comment if installed\n",
"\n",
"import seaborn as sns\n",
"sns.set_context('poster')\n",
"# To define priors. (uncoment if not using default priors)\n",
@@ -243,7 +247,7 @@
" 'burn_in_fixed_discard': 50,\n",
" 'nsamples': 2000,\n",
" 'threads': 6,\n",
" 'printdt': 0}\n",
" 'printdt': 9000}\n",
"\n",
"result = mix_model.train(x_exp=g, y_exp=y_exp, y_err=y_err, outdir = 'outdir/samba_bivariate', \n",
"label='cdf_mix', kwargs_for_sampler=kwargs_for_sampler)"
@@ -44,6 +44,10 @@
"\n",
"# For plotting\n",
"import matplotlib.pyplot as plt\n",
"\n",
"! pip install seaborn # comment if installed\n",
"! pip install ptemcee # comment if installed\n",
"\n",
"import seaborn as sns\n",
"sns.set_context('poster')\n",
"# To define priors. (uncoment if not using default priors)\n",
@@ -847,7 +851,7 @@
" 'burn_in_fixed_discard': 50,\n",
" 'nsamples': 2000,\n",
" 'threads': 6,\n",
" 'printdt': 0}\n",
" 'printdt': 10000}\n",
"\n",
"result = mix_model.train(x_exp=g, y_exp=exp_data[0], y_err=exp_data[1],\n",
" label='cdf_mix', \n",
@@ -48,6 +48,10 @@
"\n",
"# For plotting\n",
"import matplotlib.pyplot as plt\n",
"\n",
"! pip install seaborn # comment if installed\n",
"! pip install ptemcee # comment if installed\n",
"\n",
"import seaborn as sns\n",
"sns.set_context('poster')\n",
"# To define priors. (uncoment if not using default priors)\n",
@@ -748,7 +752,7 @@
" 'burn_in_fixed_discard': 50,\n",
" 'nsamples': 2000,\n",
" 'threads': 6,\n",
" 'printdt': 0}\n",
" 'printdt': 9000}\n",
"\n",
"result = mix_model.train(x_exp=g, y_exp=y_exp, y_err=y_err, outdir = 'outdir/samba_bivaraite', label='step_mix',\n",
" kwargs_for_sampler=kwargs_for_sampler)"
@@ -59,10 +59,14 @@
"\n",
"# For plotting\n",
"import matplotlib.pyplot as plt\n",
"\n",
"! pip install seaborn # comment if installed\n",
"! pip install ptemcee # comment if installed\n",
"\n",
"import seaborn as sns\n",
"sns.set_context('poster')\n",
"# To define priors. (uncoment if not using default priors)\n",
"# ! pip install bilby # uncomment if not already installed\n",
"! pip install bilby # comment if already installed\n",
"import bilby\n",
"\n",
"# For other operations\n",
@@ -806,7 +810,7 @@
" 'nsamples': 2000,\n",
" #'threads': 6,\n",
" 'npool':1,\n",
" 'printdt': 0}\n",
" 'printdt': 9000}\n",
"result = mix_model.train(x_exp=g, y_exp=y_exp, y_err=y_err,outdir = 'outdir/samba_bivariate_1', label='switchcos_mix', \n",
" kwargs_for_sampler=kwargs_for_sampler)"
]
@@ -59,6 +59,10 @@
"\n",
"# For plotting\n",
"import matplotlib.pyplot as plt\n",
"\n",
"! pip install seaborn # comment if installed\n",
"! pip install ptemcee # comment if installed\n",
"\n",
"import seaborn as sns\n",
"sns.set_context('poster')\n",
"# To define priors. (uncomment if not using default priors)\n",
@@ -747,7 +751,7 @@
" 'burn_in_fixed_discard': 50,\n",
" 'nsamples': 2000,\n",
" 'threads': 6,\n",
" 'printdt': 0}\n",
" 'printdt': 9000}\n",
"\n",
"result = mix_model.train(x_exp=g, y_exp=exp_data[0], y_err=exp_data[1],outdir = 'outdir/samba_bivariate', \n",
" label='switchcos_mix_constrained',\n",
@@ -46,6 +46,10 @@
"\n",
"# For plotting\n",
"import matplotlib.pyplot as plt\n",
"\n",
"! pip install seaborn # comment if installed\n",
"! pip install ptemcee # comment if installed\n",
"\n",
"import seaborn as sns\n",
"sns.set_context('poster')\n",
"# To define priors. (uncoment if not using default priors)\n",
@@ -338,7 +342,7 @@
" 'burn_in_fixed_discard':500,\n",
" 'nsamples':3000,\n",
" 'threads':6,\n",
" 'printdt':0}\n",
" 'printdt':10000}\n",
" #'safety':2,\n",
" #'autocorr_tol':5}"
]
14 changes: 13 additions & 1 deletion book/notebooks/Trees_BMM/Trees.md
@@ -1,3 +1,15 @@
# Bayesian model mixing with regression trees: an introduction

Again, stuff will go here.
The third mixing method in Taweret combines models using regression trees to determine the weight of each model. This allows adaptive learning of the weights and again eliminates the bias of choosing a mixing-function form. The weights can be written as

$$
w_{k}(x) = \sum_{j=1}^{m} g_{k}(x; T_{j}, M_{j}), \quad \textrm{for } k = 1, \dots, K.
$$

Here, $g_{k}(x; T_{j}, M_{j})$ is the $k$th output of the $j$th tree $T_{j}$, and $M_{j}$ is the set of parameters associated with that tree. The prior favors weights in the interval $[0,1]$, but this condition is not strictly enforced, so the weights may take values outside this interval.

This mixing method interfaces with the Bayesian Additive Regression Trees (BART) C++ package [`openBT`](https://bitbucket.org/mpratola/openbt/wiki/Home) and can handle multi-dimensional model mixing.
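
To convey the structure of tree-summed weights (a purely conceptual sketch: the real method learns the trees and their parameters from data via BART, whereas here the toy "trees" are single hand-picked splits):

```python
import numpy as np

def tree_output(x, split, left_vals, right_vals):
    """K outputs of one toy 'tree': left_vals below the split, right_vals above."""
    below = (np.asarray(x, dtype=float) < split)[..., None]   # shape (N, 1)
    return np.where(below, left_vals, right_vals)             # shape (N, K)

x = np.linspace(0.0, 1.0, 101)
# Two toy trees for K = 2 models; values are hand-picked, not learned
trees = [
    dict(split=0.4, left_vals=np.array([0.6, 0.1]), right_vals=np.array([0.1, 0.5])),
    dict(split=0.7, left_vals=np.array([0.3, 0.2]), right_vals=np.array([0.0, 0.6])),
]
# w_k(x) = sum_j g_k(x; T_j, M_j); nothing forces the sums into [0, 1]
weights = sum(tree_output(x, **t) for t in trees)             # shape (N, K)
```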

:::{seealso}
See [this paper](https://doi.org/10.1080/00401706.2023.2257765) for more details.
:::