---
title: "Overview"
---
### Welcome!
This workshop provides an overview of many of the packages in the Tidyverse suite for the R programming language. The Tidyverse is a veritable universe of tools that no single workshop could hope to cover, so **we take an introductory approach that focuses on the fundamentals of tidying data in R**. We are always happy to improve workshop content, so please don't hesitate to [post an Issue](https://github.com/lter/workshop-tidyverse/issues) on our GitHub repository if you see clear areas for improvement!
<img src="images/logos/hex_tidyverse.png" alt="Hex logo for the Tidyverse R package" align="right" width="17%" />
To get the most out of this workshop, we recommend that you take the following steps **before the day of the workshop**. If anything is unclear, feel free to reach out to us; our contact information can be found [here](https://lter.github.io/scicomp/staff.html).
## Workshop Preparation
**If a dedicated IT team has sole power to install software on your computer**, you may need to contact them before the workshop to complete the installation steps outlined below.
### 1. Install R
Install [R](https://www.r-project.org/). If you already have R, check that you have at least version 4.0.0 by running the following code:
```{r check-r-version}
#| eval: false
version$version.string
```
If your version starts with a 3 (e.g., the above code returns "R version 3..."), please update R to make sure all packages behave as expected.
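If you would rather not read the version string by eye, the same check can be sketched programmatically. This is a minimal sketch that uses base R's `getRversion()`; the variable name `r_ok` is an illustrative choice.

```{r check-r-version-programmatic}
#| eval: false
# getRversion() returns the running R version as an object that can be
# compared directly against a version string
r_ok <- getRversion() >= "4.0.0"

if (r_ok) {
  message("R ", as.character(getRversion()), " meets the workshop requirement.")
} else {
  message("R ", as.character(getRversion()), " is older than 4.0.0; please update R.")
}
```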
### 2. Install RStudio
Once you have R (ver. ≥4.0), install [RStudio](https://www.rstudio.com/products/rstudio/download/). If you already have RStudio installed, consider updating to a recent version to take advantage of its quality-of-life improvements.
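To see which RStudio version you are running, one option is the `rstudioapi` package. The sketch below assumes that `rstudioapi` may or may not be installed and that the code may be run outside RStudio, so it guards against both cases; the variable name `rstudio_version` is an illustrative choice.

```{r check-rstudio-version}
#| eval: false
# Only query RStudio if the helper package is installed and the code is
# actually running inside RStudio; otherwise record a missing value
rstudio_version <- if (requireNamespace("rstudioapi", quietly = TRUE) &&
                       rstudioapi::isAvailable()) {
  as.character(rstudioapi::versionInfo()$version)
} else {
  NA_character_
}

rstudio_version
```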
### 3. Install Some R Packages
**Install the `tidyverse` and `palmerpenguins` R packages** using the following code:
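```{r install-packages}
#| eval: false
# Install the `librarian` package, which installs and attaches other packages
install.packages("librarian")

# Install (or re-install) and attach the workshop packages;
# `update_all = TRUE` re-installs packages that are already in your library
librarian::shelf(tidyverse, palmerpenguins, update_all = TRUE)
```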
**Please run the above code even if you already have these packages** so that they are up to date and your code matches the examples and challenges introduced during the workshop.
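If you want to confirm that the installation worked, a quick check along the following lines may help. It is a minimal sketch using base R only; the names `pkgs`, `installed`, and `versions` are illustrative.

```{r check-packages}
#| eval: false
# Packages needed for the workshop
pkgs <- c("tidyverse", "palmerpenguins")

# TRUE for each package whose namespace can be loaded, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
installed

# Version numbers of the packages that were found
versions <- sapply(pkgs[installed], function(p) as.character(packageVersion(p)))
versions
```

Any `FALSE` in `installed` means that package still needs to be installed.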
### 4. Celebrate!
After following all the previous preparation steps, your setup should now be complete.
## Example Data - Penguins
The data we'll be using for this workshop comes from the `palmerpenguins` package, maintained by [Allison Horst](https://allisonhorst.com/). The "penguins" dataset from this package contains size measurements for adult foraging penguins near Palmer Station, Antarctica. Data were collected and made available by Dr. Kristen Gorman and the Palmer Station Long Term Ecological Research (LTER) Program. Let's take a look at it!
```{r libraries-index, echo = FALSE, message = FALSE}
# Load the packages used in the examples on this page
library(tidyverse)
library(palmerpenguins)
```
```{r penguin-data}
penguins
```
The "penguins" dataset has 344 rows and 8 columns.
The columns are as follows:
- `species`: a factor denoting penguin species (Adélie, Chinstrap and Gentoo)
- `island`: a factor denoting island in Palmer Archipelago, Antarctica (Biscoe, Dream or Torgersen)
- `bill_length_mm`: a number denoting bill length (millimeters)
- `bill_depth_mm`: a number denoting bill depth (millimeters)
- `flipper_length_mm`: an integer denoting flipper length (millimeters)
- `body_mass_g`: an integer denoting body mass (grams)
- `sex`: a factor denoting penguin sex (female, male)
- `year`: an integer denoting the study year (2007, 2008, or 2009)
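The dimensions and column types listed above can be checked directly. This is a minimal sketch that uses `glimpse()` from `dplyr` alongside base R functions. Note that the species level is stored in the data as "Adelie", without the accent.

```{r penguin-structure}
#| message: false
library(dplyr)
library(palmerpenguins)

# Number of rows and columns
dim(penguins)

# One line per column: name, type, and the first few values
glimpse(penguins)

# Levels of the three factor columns
levels(penguins$species)
levels(penguins$island)
levels(penguins$sex)
```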
This dataset is an example of **tidy data**, which means that each **variable** is in its own **column** and each **observation** is in its own **row**. Generally speaking, functions from Tidyverse packages expect tidy data, although some of them exist specifically to help get data into tidy format! The penguins dataset is what we'll use for all examples in this workshop, so be sure that you install the `palmerpenguins` R package. The examples on this page were adapted from [Allison Horst's `dplyr` tutorial](https://allisonhorst.shinyapps.io/dplyr-learnr/#section-welcome)!
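To make the definition of tidy data concrete, here is a minimal sketch that builds a small summary of the penguins data, reshapes it into a non-tidy "wide" layout, and then tidies it again with `tidyr`. The object names `counts`, `wide`, and `long` are illustrative choices and are not part of the workshop material.

```{r tidy-vs-wide}
#| message: false
library(dplyr)
library(tidyr)
library(palmerpenguins)

# Tidy: one row per species-year combination, one column per variable
counts <- penguins %>%
  count(species, year)

# Not tidy: the values of the `year` variable are spread across column names
wide <- counts %>%
  pivot_wider(names_from = year, values_from = n)
wide

# Tidy again: gather the year columns back into a single `year` column
long <- wide %>%
  pivot_longer(cols = -species, names_to = "year", values_to = "n",
               names_transform = list(year = as.integer))
long
```

In `wide`, a single row holds three observations (one per year), whereas in `long` every row is one observation.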
## Websites to Visit
### Supplemental Material
Although it is not necessary for attending the workshop, you can view the source files behind this workshop website in our [GitHub repository](https://github.com/lter/workshop-tidyverse).
Also, check out **NCEAS' [Learning Hub](https://www.nceas.ucsb.edu/learning-hub)** for a complete list of workshops and trainings offered by NCEAS.