more docs fixes
MartinuzziFrancesco committed Dec 12, 2023
1 parent 09af471 commit e9e2cc5
Showing 4 changed files with 39 additions and 39 deletions.
13 changes: 10 additions & 3 deletions docs/src/api/esn.md
@@ -1,16 +1,23 @@
# Echo State Networks
The core component of the library's ESN implementation is the `ESN` type. It represents the entire Echo State Network and includes parameters for configuring the reservoir, input scaling, and output weights. Here's the documentation for the `ESN` type:

```@docs
ESN
```

In addition to all the components that can be explored in the documentation, a couple components need a separate introduction. The ```variation``` arguments can be
## Variations
In addition to the standard `ESN` model, there are variations that allow for deeper customization of the underlying model. Currently, there are two available variations: `Default` and `Hybrid`. These variations provide different ways to configure the ESN. Here's the documentation for the variations:

```@docs
Default
Hybrid
```
The `Hybrid` variation is the most complex option and offers additional customization. Note that more variations may be added in the future to provide even greater flexibility.

## Training

These arguments detail a deeper variation of the underlying model, and they need a separate call. For the moment, the most complex is the ```Hybrid``` call, but this can and will change in the future.
All ESN models can be trained using the following call:
To train an ESN model, you can use the `train` function. It takes the ESN model, training data, and other optional parameters as input and returns a trained model. Here's the documentation for the `train` function:
```@docs
train
```
With these components and variations, you can configure and train ESN models for various time series and sequential data prediction tasks.
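
As a quick orientation, here is a minimal sketch of how these pieces fit together. The variable names (`input_data`, `target_data`) and all keyword values are illustrative placeholders rather than recommended settings, and the reservoir constructor is assumed to be the one available in the library at this point:
```julia
using ReservoirComputing

# input_data and target_data are hypothetical matrices holding the training
# inputs and the corresponding targets for a time series task
esn = ESN(input_data;
    variation = Default(),
    reservoir = RandSparseReservoir(100))  # illustrative reservoir size

# obtain the readout layer; StandardRidge is the default linear model
output_layer = train(esn, target_data, StandardRidge(0.0))
```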
30 changes: 5 additions & 25 deletions docs/src/general/different_training.md
@@ -8,40 +8,20 @@ readout_layer = train(my_model, train_data, training_algo)
In this section, you can explore how to properly build the `training_algo` argument and all the available choices. The examples section of the documentation provides copy-pasteable code to better explore the training algorithms and their impact on the model.

## Linear Models
The library includes a standard implementation of ridge regression, callable using `StandardRidge(regularization_coeff)` where the default value for the regularization coefficient is set to zero. This is also the default model called when no model is specified in `train()`. This makes the default call for training `train(my_model, train_data)` use Ordinary Least Squares (OLS) for regression.
The library includes a standard implementation of ridge regression, callable using `StandardRidge(regularization_coeff)`. The default regularization coefficient is set to zero. This is also the default model called when no model is specified in `train()`. This makes the default call for training `train(my_model, train_data)` use Ordinary Least Squares (OLS) for regression.

Leveraging [MLJLinearModels](https://juliaai.github.io/MLJLinearModels.jl/stable/) it is possible to expand the choices of linear models used for the training. The wrappers provided are structured in the following way:
Leveraging [MLJLinearModels](https://juliaai.github.io/MLJLinearModels.jl/stable/) you can expand your choices of linear models for training. The wrappers provided follow this structure:
```julia
struct LinearModel
    regression
    solver
    regression_kwargs
end
```
to call the ridge regression using the MLJLinearModels APIs, one can use `LinearModel(;regression=LinearRegression)`. It is also possible to use a specific solver, by calling `LinearModel(regression=LinearRegression, solver=Analytical())`. For all the available solvers, please refer to the [MLJLinearModels documentation](https://juliaai.github.io/MLJLinearModels.jl/stable/models/). To change the regularization coefficient in the ridge example, using for example `lambda = 0.1`, it is needed to pass it in the `regression_kwargs` like so `LinearModel(;regression=LinearRegression, solver=Analytical(), regression_kwargs=(lambda=lambda))`. The nomenclature of the coefficients must follow the MLJLinearModels APIs, using `lambda, gamma` for `LassoRegression` and `delta, lambda, gamma` for `HuberRegression`. Again, please check the [relevant documentation](https://juliaai.github.io/MLJLinearModels.jl/stable/api/) if in doubt. When using MLJLinearModels based regressors, do remember to specify `using MLJLinearModels`.
To call the ridge regression using the MLJLinearModels APIs, you can use `LinearModel(;regression=LinearRegression)`. You can also choose a specific solver by calling, for example, `LinearModel(regression=LinearRegression, solver=Analytical())`. For all the available solvers, please refer to the [MLJLinearModels documentation](https://juliaai.github.io/MLJLinearModels.jl/stable/models/).

## Gaussian Processes
Another way to obtain the readout layer is possible using Gaussian regression. This is provided through a wrapper of [GaussianProcesses](http://stor-i.github.io/GaussianProcesses.jl/latest/) structured in the following way:
```julia
struct GaussianProcess
mean
kernel
lognoise
optimize
optimizer
end
```
While it is necessary to specify a `mean` and a `kernel`, the other defaults are `lognoise=-2, optimize=false, optimizer=Optim.LBFGS()`. For the choice of means and kernels, please refer to the proper documentation, [here](http://stor-i.github.io/GaussianProcesses.jl/latest/mean/) and [here](http://stor-i.github.io/GaussianProcesses.jl/latest/kernels/), respectively.
To change the regularization coefficient in the ridge example, for instance to `lambda = 0.1`, you need to pass it in the `regression_kwargs`, like so: `LinearModel(;regression=LinearRegression, solver=Analytical(), regression_kwargs=(lambda=lambda,))` (note the trailing comma, which makes the argument a named tuple). The nomenclature of the coefficients must follow the MLJLinearModels APIs, using `lambda, gamma` for `LassoRegression` and `delta, lambda, gamma` for `HuberRegression`. Again, please check the [relevant documentation](https://juliaai.github.io/MLJLinearModels.jl/stable/api/) if in doubt. When using MLJLinearModels-based regressors, do remember to specify `using MLJLinearModels`.
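
As a hedged sketch of the call just described (it mirrors the wrapper usage above; `my_model` and `train_data` are the placeholders used at the top of this page, and the coefficient name must match the chosen regression type):
```julia
using MLJLinearModels

# build the training algorithm as described above; the trailing comma makes
# (lambda = 0.1,) a one-element named tuple that can be splatted as kwargs
training_algo = LinearModel(; regression = LinearRegression,
    solver = Analytical(),
    regression_kwargs = (lambda = 0.1,))

readout_layer = train(my_model, train_data, training_algo)
```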

Building on the simple example given in the GaussianProcesses documentation, it is possible to build an intuition of how to use these algorithms for training ReservoirComputing.jl models.
```julia
mZero = MeanZero() #Zero mean function
kern = SE(0.0,0.0) #Squared exponential kernel (note that hyperparameters are on the log scale)
logObsNoise = -1.0

gp = GaussianProcess(mZero, kern, lognoise=logObsNoise)
```
Like in the previous case, if one uses GaussianProcesses based regressors, it is necessary to specify `using GaussianProcesses`. Additionally, if the optimizer chosen is from an external package, i.e. Optim, that package needs to be used in the script as well by adding `using Optim`.

## Support Vector Regression
Contrary to the `LinearModel`s and `GaussianProcess`es, no wrappers are needed for support vector regression. By using [LIBSVM.jl](https://github.com/JuliaML/LIBSVM.jl), LIBSVM wrappers in Julia, it is possible to call both `epsilonSVR()` or `nuSVR()` directly in `train()`. For the full range of kernels provided and the parameters to call, we refer the user to the official [documentation](https://www.csie.ntu.edu.tw/~cjlin/libsvm/). Like before, if one intends to use LIBSVM regressors, it is necessary to specify `using LIBSVM`.
Contrary to the `LinearModel`s, no wrappers are needed for support vector regression. By using [LIBSVM.jl](https://github.com/JuliaML/LIBSVM.jl), the LIBSVM wrappers in Julia, it is possible to call either `epsilonSVR()` or `nuSVR()` directly in `train()`. For the full range of kernels provided and the parameters to call, we refer the user to the official [documentation](https://www.csie.ntu.edu.tw/~cjlin/libsvm/). As before, if you intend to use LIBSVM regressors, you must specify `using LIBSVM`.
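
A minimal sketch of such a call follows. It assumes the `EpsilonSVR` type exported by LIBSVM.jl (the exact capitalization of the constructor may differ between versions) and reuses the `my_model`/`train_data` placeholders from above:
```julia
using LIBSVM

# the SVR model is passed directly as the training algorithm, no wrapper needed
readout_layer = train(my_model, train_data, LIBSVM.EpsilonSVR())
```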
12 changes: 8 additions & 4 deletions docs/src/general/predictive_generative.md
@@ -1,10 +1,14 @@
# Generative vs Predictive
The library provides two different methods for prediction, denoted as `Predictive()` and `Generative()`, following the two major applications of Reservoir Computing models found in the literature. Both of these methods are given as arguments for the trained model. While copy-pasteable example swill be provided further on in the documentation, it is better to clarify the difference early on to focus more on the library implementation going forward.
The library provides two different methods for prediction, denoted as `Predictive()` and `Generative()`. These methods correspond to the two major applications of Reservoir Computing models found in the literature. This section aims to clarify the differences between these two methods before providing further details on their usage in the library.

## Predictive
In the first method, the user can use Reservoir Computing models similarly as standard Machine Learning models. This means using a set of features as input and a set of labels as outputs. In this case, the features and labels can be vectors of different dimensions, as ``X=\{x_1,...,x_n\} \ x_i \in \mathbb{R}^{N}`` and ``Y=\{y_1,...,y_n\} \ y_i \in \mathbb{R}^{M}`` where ``X`` is the feature set and ``Y`` the label set. Given the difference in dimensionality for the prediction call, it will be needed to feed to the function the feature set to be labeled, for example by calling `Predictive(X)` using the set given in this example.
In the first method, users can utilize Reservoir Computing models in a manner similar to standard Machine Learning models. This involves using a set of features as input and a set of labels as outputs. In this case, both the feature and label sets can consist of vectors of different dimensions. Specifically, let's denote the feature set as ``X=\{x_1,...,x_n\}`` where ``x_i \in \mathbb{R}^{N}``, and the label set as ``Y=\{y_1,...,y_n\}`` where ``y_i \in \mathbb{R}^{M}``.

!this allows for one step ahead or h steps ahead prediction.
To make predictions using this method, you need to provide the feature set that you want to predict the labels for. For example, you can call `Predictive(X)` using the feature set ``X`` as input. This method allows for both one-step-ahead and multi-step-ahead predictions.
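
As an illustrative sketch, assuming `esn` is a trained model, `output_layer` is the readout returned by `train`, and `X_test` is a hypothetical feature set to be labeled:
```julia
# predictive mode: one output per provided input feature
prediction = esn(Predictive(X_test), output_layer)
```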

## Generative
The generative method allows the user to extend the forecasting capabilities of the model, letting the predicted results to be fed back into the model to generate the next prediction. By doing so, the model can run autonomously, without any feature dataset as input. The call for this model needs only the number of steps that the user intends to forecast, for example calling `Generative(100)` to generate one hundred time steps.
The generative method provides a different approach to forecasting with Reservoir Computing models. It enables you to extend the forecasting capabilities of the model by allowing predicted results to be fed back into the model to generate the next prediction. This autonomy allows the model to make predictions without the need for a feature dataset as input.

To use the generative method, you only need to specify the number of time steps that you intend to forecast. For instance, you can call `Generative(100)` to generate predictions for the next one hundred time steps.
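
Again as a sketch, with `esn` and `output_layer` standing for a trained model and its readout:
```julia
# generative mode: run autonomously for 100 steps, feeding each
# prediction back into the model as the next input
prediction = esn(Generative(100), output_layer)
```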

The key distinction between these methods lies in how predictions are made. The predictive method relies on input feature sets to make predictions, while the generative method allows for autonomous forecasting by feeding predicted results back into the model.
23 changes: 16 additions & 7 deletions docs/src/general/states_variation.md
@@ -1,19 +1,28 @@
# Altering States
In every ReservoirComputing model is possible to perform some alteration on the states in the training stage. Depending on the chosen modification, this can improve the results of the prediction. Or more simply, they can be used to reproduce results in the literature. The alterations are divided into two possibilities: the first concerns padding or extending the states, and the second concerns non-linear algorithms performed over the states.
In ReservoirComputing models, it's possible to perform alterations on the reservoir states during the training stage. These alterations can improve prediction results or replicate results found in the literature. Alterations are categorized into two possibilities: padding or extending the states, and applying non-linear algorithms to the states.

## Padding and Extending States
Extending the states means appending to them the corresponding input values. If ``\textbf{x}(t)`` is the reservoir state at time t corresponding to the input ``\textbf{u}(t)`` the extended state will be represented as `` [\textbf{x}(t); \textbf{u}(t)]`` where ``[;]`` is intended as vertical concatenation. This procedure is, for example, used in [Jaeger's Scholarpedia](http://www.scholarpedia.org/article/Echo_state_network) description of Echo State Networks. The extension of the states can be obtained in every ReservoirComputing.jl model by using the keyword argument `states_type` and calling the method `ExtendedStates()`. No argument is needed.
### Extending States

Padding the states means appending a constant value, 1.0 for example, to each state. Using the notation introduced before, we can define the padded states as ``[\textbf{x}(t); 1.0]``. This approach is detailed in the [seminal guide](https://mantas.info/get-publication/?f=Practical_ESN.pdf) to Echo State Networks by Mantas Lukoševičius. By using the keyword argument `states_type` the user can call the method `PaddedStates(padding)` where `padding` represents the value that will be concatenated to the states. As default, the value is set to unity, so the majority of the time, calling `PaddedStates()` will suffice.
Extending the states involves appending the corresponding input values to the reservoir states. If ``\textbf{x}(t)`` represents the reservoir state at time ``t`` corresponding to the input ``\textbf{u}(t)``, the extended state is represented as ``[\textbf{x}(t); \textbf{u}(t)]``, where ``[;]`` denotes vertical concatenation. This procedure is commonly used in Echo State Networks and is described in [Jaeger's Scholarpedia](http://www.scholarpedia.org/article/Echo_state_network). You can extend the states in every ReservoirComputing.jl model by using the `states_type` keyword argument and calling the `ExtendedStates()` method. No additional arguments are needed.

Though not easily found in the literature, it is also possible to pad the extended states by using the method `PaddedExtendedStates(padding)` that has unity as `padding` default as well.
### Padding States
Padding the states involves appending a constant value, such as 1.0, to each state. In the notation introduced earlier, padded states can be represented as ``[\textbf{x}(t); 1.0]``. This approach is detailed in the [seminal guide](https://mantas.info/get-publication/?f=Practical_ESN.pdf) to Echo State Networks by Mantas Lukoševičius. To pad the states, you can use the `states_type` keyword argument and call the `PaddedStates(padding)` method, where `padding` represents the value to be concatenated to the states. By default, the padding value is set to 1.0, so most of the time, calling `PaddedStates()` will suffice.

Of course, it is also possible to not apport any of these changes to the states by calling `StandardStates()`. This is also the default choice for the states.
Additionally, you can pad the extended states by using the `PaddedExtendedStates(padding)` method, which also has a default padding value of 1.0.

You can choose not to apply any of these changes to the states by calling `StandardStates()`, which is the default choice for the states.
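
A short sketch of the `states_type` keyword in use (other keyword arguments omitted; `input_data` is a placeholder for your training series):
```julia
esn_extended = ESN(input_data; states_type = ExtendedStates())
esn_padded   = ESN(input_data; states_type = PaddedStates(1.0))
esn_default  = ESN(input_data)  # StandardStates() is the default
```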

## Non-Linear Algorithms
First introduced in [^1] and expanded in [^2] these are nonlinear combinations of the columns of the matrix states. There are three such algorithms implemented. Using the keyword argument `nla_type` it is possible to choose in every model in ReservoirComputing.jl the specific non-linear algorithm to use. The default value is set to `NLADefault()`, where no non-linear algorithm takes place.
First introduced in [^1] and expanded in [^2], non-linear algorithms are nonlinear combinations of the columns of the matrix states. There are three such algorithms implemented in ReservoirComputing.jl, and you can choose which one to use with the `nla_type` keyword argument. The default value is set to `NLADefault()`, which means no non-linear algorithm is applied.

The available non-linear algorithms are:

- `NLAT1()`
- `NLAT2()`
- `NLAT3()`

Following the nomenclature used in [^2], the algorithms can be called as `NLAT1()`, `NLAT2()` and `NLAT3()`. To better explain what they do, let ``\textbf{x}_{i, j}`` be elements of the state matrix, with ``i=1,...,T \ j=1,...,N`` where ``T`` is the length of the training and ``N`` is the reservoir size.
These algorithms perform specific operations on the reservoir states. To provide a better understanding of what they do, let ``\textbf{x}_{i, j}`` be elements of the state matrix, with ``i=1,...,T \ j=1,...,N`` where ``T`` is the length of the training and ``N`` is the reservoir size.

**NLAT1**
