Merge: Add Active Learning Userguide (emdgroup#273)
Adds a small active learning user guide.
- This is important for visibility that BayBE can be used for active
  learning.
- For completeness, I also added the `PosteriorStandardDeviation`
  acquisition function, as it's relevant in this case.
- Link to the doc on the fork:
  https://scienfitz.github.io/baybe-dev/userguide/active_learning.html
Scienfitz authored Jul 19, 2024
2 parents 6ff4fe0 + d162514 commit 9a09f1e
Showing 6 changed files with 103 additions and 5 deletions.
6 changes: 4 additions & 2 deletions CHANGELOG.md
@@ -21,8 +21,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `SubspaceDiscrete.to_searchspace` and `SubspaceContinuous.to_searchspace`
convenience constructor
- Validators for `Campaign` attributes
_ `_optional` subpackage for managing optional dependencies
- Acquisition function for active learning: `qNIPV`
- `_optional` subpackage for managing optional dependencies
- New acquisition functions for active learning: `qNIPV` (negative integrated posterior
variance) and `PSTD` (posterior standard deviation)
- Acquisition function: `qKG` (knowledge gradient)
- Abstract `ContinuousNonlinearConstraint` class
- Abstract `CardinalityConstraint` class and
@@ -42,6 +43,7 @@ _ `_optional` subpackage for managing optional dependencies
- Validation and translation tests for kernels
- `BasicKernel` and `CompositeKernel` base classes
- Activated `pre-commit.ci` with auto-update
- User guide for active learning

### Changed
- Passing an `Objective` to `Campaign` is now optional
8 changes: 6 additions & 2 deletions baybe/acquisition/__init__.py
@@ -4,6 +4,7 @@
ExpectedImprovement,
LogExpectedImprovement,
PosteriorMean,
PosteriorStandardDeviation,
ProbabilityOfImprovement,
UpperConfidenceBound,
qExpectedImprovement,
@@ -18,6 +19,7 @@
)

PM = PosteriorMean
PSTD = PosteriorStandardDeviation
qSR = qSimpleRegret
EI = ExpectedImprovement
qEI = qExpectedImprovement
@@ -36,8 +38,9 @@
######################### Acquisition functions
# Knowledge Gradient
"qKnowledgeGradient",
# Posterior Mean
# Posterior Statistics
"PosteriorMean",
"PosteriorStandardDeviation",
# Simple Regret
"qSimpleRegret",
# Expected Improvement
@@ -57,8 +60,9 @@
######################### Abbreviations
# Knowledge Gradient
"qKG",
# Posterior Mean
# Posterior Statistics
"PM",
"PSTD",
# Simple Regret
"qSR",
# Expected Improvement
14 changes: 13 additions & 1 deletion baybe/acquisition/acqfs.py
@@ -149,14 +149,26 @@ class qKnowledgeGradient(AcquisitionFunction):


########################################################################################
### Posterior Mean
### Posterior Statistics
@define(frozen=True)
class PosteriorMean(AcquisitionFunction):
"""Posterior mean."""

abbreviation: ClassVar[str] = "PM"


@define(frozen=True)
class PosteriorStandardDeviation(AcquisitionFunction):
"""Posterior standard deviation."""

abbreviation: ClassVar[str] = "PSTD"

maximize: bool = field(default=True, validator=instance_of(bool))
"""If ``True``, points with maximum posterior standard deviation are selected.
If ``False``, the acquisition function value is negated, yielding a selection
with minimal posterior standard deviation."""


########################################################################################
### Simple Regret
@define(frozen=True)
77 changes: 77 additions & 0 deletions docs/userguide/active_learning.md
@@ -0,0 +1,77 @@
# Active Learning
When deciding which experiments to perform next, e.g. for a data acquisition campaign
to gather data for a machine learning model, it can be beneficial to follow a guided
approach rather than selecting experiments randomly. If this is done via iteratively
measuring points according to a criterion reflecting the current model's uncertainty,
the method is called **active learning**.

Active learning can be seen as a special case of Bayesian optimization: If we have the
above-mentioned criterion and set up a Bayesian optimization campaign to recommend
points with the highest uncertainty, we achieve active learning via Bayesian
optimization. In practice, this procedure is implemented by setting up a
probabilistic model of our measurement process that allows us to quantify uncertainty
in the form of a posterior distribution, from which we can then construct an
uncertainty-based acquisition function to guide the exploration process.
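Conceptually, this loop can be sketched with a toy Gaussian process in plain numpy. This is illustrative only and independent of BayBE's actual models; the kernel, lengthscale, candidate grid, and loop length are arbitrary choices:

```python
import numpy as np


def rbf(a, b, lengthscale=0.3):
    """Squared-exponential kernel matrix between two 1-D point sets."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale**2)


def posterior_std(x_train, x_cand, noise=1e-4):
    """GP posterior standard deviation at candidate points (unit prior variance)."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_cand, x_train)
    var = 1.0 - np.einsum("ij,ij->i", Ks @ np.linalg.inv(K), Ks)
    return np.sqrt(np.clip(var, 0.0, None))


candidates = np.linspace(0, 1, 101)
x_train = np.array([0.5])  # a single initial measurement

# Active learning loop: always measure where the model is least certain
for _ in range(5):
    std = posterior_std(x_train, candidates)
    x_next = candidates[np.argmax(std)]
    x_train = np.append(x_train, x_next)  # "perform" the measurement

print(np.sort(np.round(x_train, 2)))  # the chosen points spread over the domain
```

Each iteration picks the point farthest (in kernel distance) from all previous measurements, so the design progressively covers the domain without any notion of an optimization target.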

Below, you find the acquisition functions in BayBE that are suitable for this
endeavor, along with a few usage guidelines.

## Local Uncertainty Reduction
In BayBE, there are two types of acquisition function that can be chosen to search for
the points with the highest predicted model uncertainty:
- [`PosteriorStandardDeviation`](baybe.acquisition.acqfs.PosteriorStandardDeviation) (`PSTD`)
- [`UpperConfidenceBound`](baybe.acquisition.acqfs.UpperConfidenceBound) (`UCB`) /
[`qUpperConfidenceBound`](baybe.acquisition.acqfs.qUpperConfidenceBound) (`qUCB`)
with high `beta`:
Increasing values of `beta` effectively eliminate the effect of the posterior mean on
the acquisition value, yielding a selection of points driven primarily by the
posterior variance. However, we generally recommend using this acquisition function
only if a small exploratory component is desired – otherwise, the
[`PosteriorStandardDeviation`](baybe.acquisition.acqfs.PosteriorStandardDeviation)
acquisition function is what you are looking for.
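The claim about `beta` can be checked numerically. The sketch below is a plain-numpy illustration, assuming the common BoTorch-style parametrization UCB = μ + √β·σ; the posterior means and standard deviations are made-up values, not BayBE output. As `beta` grows, the UCB ranking is driven by the posterior standard deviation alone:

```python
import numpy as np

rng = np.random.default_rng(1)
mean = rng.normal(size=20)                        # hypothetical posterior means
std = rng.permutation(np.linspace(0.1, 1.0, 20))  # hypothetical posterior stds


def ucb(mean, std, beta):
    # BoTorch-style UCB: mean + sqrt(beta) * std
    return mean + np.sqrt(beta) * std


for beta in (1.0, 100.0, 1e6):
    print(
        f"beta={beta:>9}: UCB pick = {np.argmax(ucb(mean, std, beta))}, "
        f"pure-std pick = {np.argmax(std)}"
    )
```

For small `beta`, the pick is a compromise between high mean and high uncertainty; for large `beta`, it coincides with the maximum-standard-deviation candidate, which is exactly what `PSTD` selects directly.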

## Global Uncertainty Reduction
BayBE also offers the
[`qNegIntegratedPosteriorVariance`](baybe.acquisition.acqfs.qNegIntegratedPosteriorVariance)
(`qNIPV`), which integrates
the posterior variance over the entire search space.
Choosing candidates based on this acquisition function is tantamount to selecting the
set of points resulting in the largest reduction of global uncertainty when added to
the already existing experimental design.

Because of its ability to quantify uncertainty on a global scale, this approach is often
superior to using a point-based uncertainty criterion as acquisition function.
However, due to its computational complexity, it can be prohibitive to integrate over
the entire search space. For this reason, we offer the option to sub-sample parts of it,
configurable via the constructor:

```python
from baybe.acquisition import qNIPV
from baybe.utils.sampling_algorithms import DiscreteSamplingMethod

# Will integrate over the entire search space
qNIPV()

# Will integrate over 50% of the search space, randomly sampled
qNIPV(sampling_fraction=0.5)

# Will integrate over 250 points, chosen by farthest point sampling
# Both lines are equivalent
qNIPV(sampling_n_points=250, sampling_method="FPS")
qNIPV(sampling_n_points=250, sampling_method=DiscreteSamplingMethod.FPS)
```

```{admonition} Sub-Sampling Method
:class: note
Sampling of the continuous part of the search space will always be random, while
sampling of the discrete part can be controlled by providing a corresponding
[`DiscreteSamplingMethod`](baybe.utils.sampling_algorithms.DiscreteSamplingMethod) for
`sampling_method`.
```

```{admonition} Purely Continuous Search Spaces
:class: important
Please be aware that in case of a purely continuous search space, the number of points
to sample for integration must be specified via `sampling_n_points` (since providing
a fraction becomes meaningless).
```
1 change: 1 addition & 0 deletions docs/userguide/userguide.md
@@ -2,6 +2,7 @@

```{toctree}
Campaigns <campaigns>
Active Learning <active_learning>
Constraints <constraints>
Environment Vars <envvars>
Objectives <objectives>
2 changes: 2 additions & 0 deletions tests/hypothesis_strategies/acquisition.py
@@ -6,6 +6,7 @@
ExpectedImprovement,
LogExpectedImprovement,
PosteriorMean,
PosteriorStandardDeviation,
ProbabilityOfImprovement,
UpperConfidenceBound,
qExpectedImprovement,
@@ -49,6 +50,7 @@ def _qNIPV_strategy(draw: st.DrawFn):
st.builds(ProbabilityOfImprovement),
st.builds(UpperConfidenceBound, beta=finite_floats(min_value=0.0)),
st.builds(PosteriorMean),
st.builds(PosteriorStandardDeviation, maximize=st.sampled_from([True, False])),
st.builds(LogExpectedImprovement),
st.builds(qExpectedImprovement),
st.builds(qProbabilityOfImprovement),
