An R interface to the libeemd C library for ensemble empirical mode decomposition (EEMD) and its complete variant (CEEMDAN). These methods decompose possibly nonlinear and/or nonstationary time series data into a finite number of components, called intrinsic mode functions (IMFs), separated by instantaneous frequencies. This decomposition provides a powerful way to look into the different processes behind a given time series and to separate short time-scale events from a general trend.
If you use Rlibeemd/libeemd for scientific work please cite Luukko, P.J.J., Helske, J., Räsänen, E., Comput. Stat. 31, 545 (2016) (also on arXiv). This article also describes in detail what libeemd actually computes. You should definitely read it if you are unsure about what EMD, EEMD and CEEMDAN are.
Current CRAN policies do not allow the use of SHLIB_OPENMP_CFLAGS combined with linking with C++. Therefore the CRAN version no longer uses OpenMP at all (the OpenMP flags have been removed from Makevars), but the version on GitHub does. So if you want to use the parallel version of Rlibeemd, please install the package via
devtools::install_github("helske/Rlibeemd")
Note that this installs the package from source, so you need to have GSL installed. On Linux, use something like sudo apt-get install libgsl2 libgsl-dev, whereas on Windows you can download the GSL files from https://www.stats.ox.ac.uk/pub/Rtools/goodies/multilib/ (file local323.zip or equivalent). You also need to set the environment variable LIB_GSL=<path/to/gsl>. For Windows, binaries are also available (see the latest release), which is probably the easier option:
install.packages("https://github.com/helske/Rlibeemd/releases/download/v1.4.2/Rlibeemd_1.4.2.zip", repos = NULL)
Please file an issue if you encounter portability issues (so far none found), or if you figure out a way to enable OpenMP in the CRAN version without CRAN checks complaining.
Here a CEEMDAN decomposition is performed for the UK gas consumption series (length n = 108). By default, ceemdan extracts [log_2(n)] components (the integer part of log_2(n), here 6), so we get five IMFs and the residual.
library("Rlibeemd")
data(UKgas, package = "datasets")
imfs <- ceemdan(UKgas, ensemble_size = 1000)
plot(imfs, main = "Five IMFs and residual extracted by CEEMDAN algorithm")
The residual component shows a smooth trend, whereas the first IMF contains a clear multiplicative trend. The remaining IMFs are a bit more complex, and one could argue that they are partly seasonal, partly trend, or just some irregularity, i.e. noise.
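The default component count mentioned above can be checked with a short sketch. The arithmetic uses only base R. The second part assumes Rlibeemd is installed and is skipped otherwise, and the smaller ensemble_size is chosen here only to keep the run short. Since the residual is included among the returned columns, the columns are expected to sum back to the input series, which the last line checks rather than asserts.

```r
# Length of the quarterly UK gas series shipped with base R
data(UKgas, package = "datasets")
n <- length(UKgas)

# Default number of components: integer part of log2(n)
n_components <- floor(log2(n))
cat("n =", n, "-> default number of components:", n_components, "\n")

# Optional check against the package itself (assumes Rlibeemd is installed)
if (requireNamespace("Rlibeemd", quietly = TRUE)) {
  imfs_small <- Rlibeemd::ceemdan(UKgas, ensemble_size = 50)
  cat("Columns returned by ceemdan:", ncol(imfs_small), "\n")
  # IMFs plus residual should add up to the original series
  print(all.equal(as.numeric(rowSums(imfs_small)), as.numeric(UKgas)))
}
```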
Let us compare the decomposition with a basic structural time series model fitted with StructTS (for smoothing of more complex state space models, one could use KFAS):
bsm <- tsSmooth(StructTS(UKgas))
plot(bsm[, c(1, 3)], main = "Local linear trend and seasonal components by StructTS")
StructTS decomposes the data into three components, one of which is a (possibly time-varying) slope that has no direct effect on the overall signal (it is the slope of the level component).
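To make the three components concrete, the following base R sketch lists what tsSmooth returns for this model. The column names are looked up at run time rather than assumed, and the reconstruction from the first and third columns simply follows the column choice c(1, 3) already used in this document's plotting code.

```r
data(UKgas, package = "datasets")

# Fit the default structural model for seasonal data and smooth its states
fit <- StructTS(UKgas)
bsm <- tsSmooth(fit)

# One column per state component
print(colnames(bsm))
print(dim(bsm))

# The slope enters the signal only through the level, so the smoothed
# signal is rebuilt here from the first (level) and third (seasonal) columns
signal <- bsm[, 1] + bsm[, 3]
print(head(cbind(observed = UKgas, smoothed_signal = signal)))
```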
ts.plot(cbind(UKgas, imfs[, ncol(imfs)], rowSums(imfs[, 5:6]), bsm[,"level"]), col = 1:4,
main = "Quarterly UK gas consumption", ylab = "Million therms")
legend("topleft", c("Observations", "Residual", "Last IMF + residual", "Trend from BSM"),
col = 1:4, lty = 1)
IMF_5 + residual is quite close to the trend obtained from the structural time series model of StructTS.
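One hedged way to put a number on "quite close" is sketched below. The rmse helper is a hypothetical name introduced for illustration, and the choice of root mean squared difference and correlation as closeness measures is an assumption rather than something the package prescribes. The comparison itself runs only if Rlibeemd is installed.

```r
# Hypothetical helper: root mean squared difference between two series
rmse <- function(a, b) sqrt(mean((as.numeric(a) - as.numeric(b))^2))

data(UKgas, package = "datasets")
bsm_level <- tsSmooth(StructTS(UKgas))[, "level"]

if (requireNamespace("Rlibeemd", quietly = TRUE)) {
  imfs <- Rlibeemd::ceemdan(UKgas, ensemble_size = 1000)
  k <- ncol(imfs)
  # Last IMF plus residual, as in the comparison plot
  emd_trend <- rowSums(imfs[, (k - 1):k])
  cat("RMSE vs. BSM level:", rmse(emd_trend, bsm_level), "\n")
  cat("Correlation with BSM level:",
      cor(emd_trend, as.numeric(bsm_level)), "\n")
}
```

Both measures are in the units of the series (million therms) or unitless, so they are only meaningful relative to the scale of UKgas itself.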