-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathREADME.Rmd
157 lines (106 loc) · 6.5 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
---
output:
github_document
bibliography: inst\\REFERENCES.bib
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
eval = TRUE
)
```
# ir <img src='man/figures/logo-hex.png' align="right" height="139" alt="logo" style="float:right; height:200px;" />
<!-- badges: start -->
[![DOI](https://zenodo.org/badge/234117897.svg)](https://zenodo.org/badge/latestdoi/234117897)
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![CRAN status](https://www.r-pkg.org/badges/version/ir)](https://CRAN.R-project.org/package=ir)
<!-- badges: end -->
## Overview
'ir' is an R package that contains simple functions to import, handle and preprocess infrared spectra. Infrared spectra are stored as list columns in data frames to enable efficient storage of metadata along with the spectra and support further analyses containing other data for the same samples.
**Supported file formats for import are:**
1. .csv files with individual spectra.
2. Thermo Galactic's .spc files with individual spectra.
**Provided functions for preprocessing and general handling are:**
1. baseline correction with:
* a polynomial baseline
* a convex hull baseline
* a Savitzky-Golay baseline [@Lasch.2012].
2. binning.
3. clipping.
4. interpolating (resampling, linearly).
5. replacing selected parts of a spectrum by a straight line.
6. averaging spectra within specified groups.
7. normalizing spectra:
* to the maximum intensity
* to the intensity at a specific x value
* so that all intensity values sum to 1.
8. smoothing:
* Savitzky-Golay smoothing
* Fourier smoothing.
9. computing derivatives of spectra using Savitzky-Golay smoothing.
10. mathematical transformations (addition, subtraction, multiplication, division).
11. computing the variance of intensity values (optionally after subtracting reference spectra).
12. computing maxima, minima, median, and ranges of intensity values of spectra.
13. plotting.
14. [tidyverse](https://www.tidyverse.org/) methods.
### How to install
You can install 'ir' from CRAN using R via:
```{r installation-cran, eval = FALSE}
install.packages("ir")
```
You can install 'ir' from GitHub using R via:
```{r installation-github, eval = FALSE}
remotes::install_github(repo = "henningte/ir")
```
### How to use
You can load 'ir' in R with:
```{r load_ir}
# load ir package
library(ir)
# load additional packages needed for this tutorial
library(ggplot2)
```
For brief introductions, see below and the two vignettes:
1. [`r rmarkdown::yaml_front_matter("vignettes/ir-introduction.Rmd")$title`](https://henningte.github.io/ir/articles/ir-introduction.html)
2. [`r rmarkdown::yaml_front_matter("vignettes/ir-class.Rmd")$title`](https://henningte.github.io/ir/articles/ir-class.html)
#### Sample workflow
A simple workflow would be, for example, to baseline correct the spectra, then bin them to bins with a width of 10 wavenumber units, then normalize them so that the maximum intensity value is 1 and the minimum intensity value is 0 and then plot the baseline corrected spectra for each sample and sample type. Here's the 'ir' code using the built-in sample data `ir_sample_data`.
```{r sample_data_workflow}
ir_sample_data %>% # data
ir::ir_bc(method = "rubberband") %>% # baseline correction
ir::ir_bin(width = 10) %>% # binning
ir::ir_normalize(method = "zeroone") %>% # normalization
plot() + ggplot2::facet_wrap(~ sample_type) # plot
```
#### Data structure
You can load the sample data with:
```{r ir_sample_data_load}
ir::ir_sample_data
```
`ir_sample_data` is an object of class `ir`. An Object of class `ir` is basically a data frame where each row represents one infrared measurement and column `spectra` contains the infrared spectra (one per row).
This allows effectively storing repeated measurements for the same sample in the same table, as well as any metadata and accessory data (e.g. nitrogen content of the sample).
The column `spectra` is a list column of data frames, meaning that each cell in `sample_data` contains for column `spectra` a data frame. For example, the first element of `ir_sample_data$spectra` represents the first spectrum as a data frame:
```{r ir_sample_data_inspect_spectra}
# View the first ten rows of the first spectrum in ir_sample_data
ir::ir_get_spectrum(ir_sample_data, what = 1)[[1]] %>%
head(10)
```
Column `x` represents the x values (in this case wavenumbers [cm<sup>-1</sup>]) and column `y` the corresponding intensity values.
### How to cite
Please cite this R package as:
> Henning Teickner (`r format(Sys.Date(), "%Y")`). _ir: Functions to Handle and Preprocess Infrared Spectra_. DOI: 10.5281/zenodo.5747170. Accessed `r format(Sys.Date(), "%d %b %Y")`. Online at <https://zenodo.org/record/5747170>.
### Companion packages
[irpeat](https://github.com/henningte/irpeat/) builds on 'ir'. irpeat provides functions to analyze infrared spectra of peat (humification indices, prediction models).
### Licenses
**Text and figures :** [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
**Code :** See the [DESCRIPTION](DESCRIPTION) file
**Data :** [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) attribution requested in reuse. See the sources section for data sources and how to give credit to the original author(s) and the source.
### Contributions
We welcome contributions from everyone. Before you get started, please see our [contributor guidelines](CONTRIBUTING.md). Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.
### Sources
The complete data in this package is derived from @Hodgkins.2018 and was restructured to match the requirements of 'ir'. The original article containing the data can be downloaded from https://www.nature.com/articles/s41467-018-06050-2 and is distributed under the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/). The data on Klason lignin and holocellulose content was originally derived from @LaCruz.2016.
This packages was developed in R (`r R.Version()$version.string`) [@RCoreTeam.2019] using functions from devtools [@Wickham.2019], usethis [@Wickham.2019b], rrtools [@Marwick.2019] and roxygen2 [@Wickham.2019c].
### References