-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathgetting_started.Rmd
75 lines (60 loc) · 3.31 KB
/
getting_started.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
---
title: Getting Started with gptools in R
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Getting Started with gptools in R}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
params:
EVAL: !r identical(Sys.getenv("NOT_CRAN"), "true")
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
eval = params$EVAL
)
```
```{r, include = FALSE}
# Make sure cmdstanr is all set up. But we don't need to show the reader this.
cmdstanr::check_cmdstan_toolchain(fix = TRUE)
cmdstanr::install_cmdstan(version = "2.36.0")
```
`gptoolsStan` is a minimal package to publish Stan code for efficient Gaussian process inference. The package can be used with the [`cmdstanr`](https://mc-stan.org/cmdstanr/) interface for Stan in R. Unfortunately, [`Rstan`](https://mc-stan.org/rstan/) is not supported because it [does not provide an option to specify include paths](https://discourse.mc-stan.org/t/specifying-include-paths-in-rstan/32182/2). If you're already familiar with `cmdstanr`, dive in below. If not, have a look at the [getting started guide](https://mc-stan.org/cmdstanr/articles/cmdstanr.html) for the `cmdstanr` interface.
This vignette demonstrates the package by sampling from a simple Gaussian process using Fourier methods (see the accompanying publication ["Scalable Gaussian Process Inference with Stan"](https://arxiv.org/abs/2301.08836) for background on the approach). Here is the model definition in Stan.
```{r, results='markup', comment='', echo=FALSE}
cat(readLines("getting_started.stan"), sep = "\n")
```
Here, we assume that `cmdstanr` is installed and that the `cmdstan` compiler is available. See [here](https://mc-stan.org/cmdstanr/#installation) if you need help getting set up with `cmdstanr`. Let's compile the model.
```{r}
library(cmdstanr)
library(gptoolsStan)
model <- cmdstan_model(
stan_file="getting_started.stan",
include_paths=gptools_include_path(),
)
```
As indicated in the `data` section of the Stan program, we need to define the number of elements `n` of the Gaussian process and the [real fast Fourier transform](https://en.wikipedia.org/wiki/Fast_Fourier_transform#FFT_algorithms_specialized_for_real_or_symmetric_data) (RFFT) of the covariance kernel `cov_rfft`. We'll use 100 elements and a [squared exponential covariance kernel](https://en.wikipedia.org/wiki/Gaussian_process#Usual_covariance_functions) which allows us to evaluate the RFFT directly.
```{r}
n <- 100
length_scale <- n / 10
freq <- 1:(n %/% 2 + 1) - 1
# See appendix B of https://arxiv.org/abs/2301.08836 for details on the expression.
cov_rfft <- exp(- 2 * (pi * freq * length_scale / n) ^ 2) + 1e-9
```
Let's obtain prior realization by sampling from the model.
```{r}
fit <- model$sample(
data=list(n=n, cov_rfft=cov_rfft),
seed=123,
chains=1,
iter_warmup=1000,
iter_sampling=5,
)
```
We extract the draws from the `fit` object and plot a realization of the process.
```{r, fig.width=6, fig.height=5}
f <- fit$draws("f", format="draws_matrix")
plot(1:n, f[1,], type="l", xlab="covariate x", ylab="Gaussian process f(x)")
```
This vignette illustrates the use of gptools with an elementary example. Further details can be found in the [more comprehensive documentation](http://gptools-stan.readthedocs.io/) although using the [`cmdstanpy`](https://mc-stan.org/cmdstanpy/) interface.