-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path26-iteration.Rmd
252 lines (167 loc) · 8.87 KB
/
26-iteration.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
# Iteration
**Learning objectives:**
- **Reduce duplication** in code by **iterating** over a pattern using a **`for` loop.**
- **Modify an object** using a **`for` loop.**
- Recognize the **three basic ways** to **loop over a vector** using a **`for` loop.**
- Handle **unknown output lengths** by using a **more complex result** and then **combining** the result after the **`for` loop.**
- Handle **unknown sequence lengths** with a **`while` loop.**
- **Reduce duplication** in code by passing a function as an argument to a function **(functional programming).**
- Make iteration code **easier to read** with the **`map` family** of functions from **`{purrr}`.**
- Make `map` code **faster to type** with **shortcuts.**
- **Compare** the **`map` family** of functions to the base R **`apply` family.**
- **Deal with errors** and other output using the **`{purrr}` adverbs: `safely`, `possibly`, and `quietly`.**
- Map over **multiple arguments** with the **`map2` family** and the **`pmap` family** from `{purrr}`.
- Map over **multiple functions** with the **`invoke_map` family** from `{purrr}` and its replacements (see `?purrr::invoke_map`, "Life cycle").
- Call a function for its **side effects** using the **`walk`** family of functions from `{purrr}`.
- Recognize the **other iteration functions** from `{purrr}`.
## Intro
**Good iteration - Pre-allocate the shape/structure and then fill in the data**.
**Imperative programming** - `for` loops and `while` loops make iteration very explicit.
**Functional programming** - common `for` loop patterns get their own functions, so you can streamline your code even more.
Packages: `library(tidyverse)`, `library(purrr)`
## For loops
Each loop has three components:
- **output** - Allocate enough space for your `for` loop. A loop that grows at each iteration will be very "slow".
- **sequence** - What to loop over.
- `seq_along()` is a safe version of the familiar `1:length(x)`: if you have a zero-length vector, it will tell you.
- **body** - This is the code that does the work.
## For loop variations
There are four variations on the basic theme of the `for` loop:
1. Modifying an existing object, instead of creating a new object.
2. Looping over names or values, instead of indices.
- There are three basic ways to loop over a vector.
- `for (i in seq_along(df))`
- `for (x in xs)` - good for **side-effects**
- `for (nm in names(xs))` - creates name to access value with `x[[nm]]`
3. Handling outputs of unknown length.
- `unlist()` flattens a list of vectors into a single vector,
- `purrr::flatten_dbl()` is stricter --- it will throw an error if the input isn't a list of doubles.
4. Handling sequences of unknown length.
- Use a `while` loop.
## For loops vs. functionals
`For` loops are not as important in R as they are in other languages because R is a functional programming language.
This means that it's possible to put `for` loops in a function, and call that function instead of using the `for` loop directly.
Base R and the `purrr` package have functions for many common loops.
## The map functions
- `map(.x, .f, ...)` makes a list.
- `map_lgl(.x, .f, ...)` makes a logical vector.
- `map_int(.x, .f, ...)` makes an integer vector.
- `map_dbl(.x, .f, ...)` makes a double vector.
- `map_chr(.x, .f, ...)` makes a character vector.
**Shortcuts with `purrr`:**
- a one-sided formula creates an anonymous function
- use a string to extract named components
- Use an integer to select elements by position
**Base R:**
- `lapply(X, FUN, ...)` makes a list (`map` is more consistent)
- `sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)` wrapper around `lapply`
- `vapply(X, FUN, FUN.VALUE, ..., USE.NAMES = TRUE)` safe alternative to `sapply`, can make a matix
## Dealing with failure
`safely()` - designed to work with `map`
- `result` is the original result. If there was an error, this will be `NULL`.
- `error` is an error object. If the operation was successful, this will be `NULL`.
`possibly()` - you give it a default value to return when there is an error.
`quietly()` - instead of capturing errors, it captures printed output, messages, and warnings.
## Mapping over multiple arguments
`map2()` and `pmap()` functions are for multiple related inputs that you need iterate along in parallel.
**Invoking different functions**
*Read the help in purrr on these functons. If you need to do this read Advanced R. `invoke_map` is retired.*
## Walk
`walk()` is an alternative to `map()` that you use when you want to call a function for its side effects, rather than for its return value.
`walk2` and `pwalk` - more useful than `walk`.
## Other patterns of for loops
**Predicate functions -** return `TRUE` or `FALSE`
- `keep` and `discard` - keep elements are `TRUE` and discard are `FALSE`.
- `some` and `every` - determine if `TRUE` for any (some) or all (every)
- `detect` and `detect_index` - detect finds first element where `TRUE`, and detect_index returns the position
- `head_while()` and `tail_while()` - while `TRUE` take elements from start (head_while) or end (tail_while)
**Reduce and accumulate**
- `reduce()` takes a "binary" function, applies it repeatedly to a list until there is only a single element left.
- `accumulate()` keeps all the interim results.
## Meeting Videos
### Cohort 5
`r knitr::include_url("https://www.youtube.com/embed/0rsV1jlxhws")`
<details>
<summary> Meeting chat log </summary>
```
00:03:23 Becki R. (she/her): I'm having trouble with a buzz again
00:03:37 Njoki Njuki Lucy: I so look forward to the discussion. I have been struggling with understanding this particular chapter! :)
00:26:58 Jon Harmon (jonthegeek): > x <- purrr::set_names(month.name, month.abb)
> x
00:27:21 Jon Harmon (jonthegeek): x[["Jan"]]
00:30:12 Jon Harmon (jonthegeek): results <- purrr::set_names(vector("list", length(x)), names(x))
00:35:10 lucus w: A data frame is simply a list in disguise
00:45:37 Njoki Njuki Lucy: It makes sense now, thanks!
00:48:05 Ryan Metcalf: Sorry team, I have to drop. Sister-in-law is stranded and needs a jump-start. I’ll finish watching the recording and catch any questions.
00:51:19 Jon Harmon (jonthegeek): > paste(month.name, collapse = "")
[1] "JanuaryFebruaryMarchAprilMayJuneJulyAugustSeptemberOctoberNovemberDecember"
> paste(month.name, collapse = " ")
[1] "January February March April May June July August September October November December"
01:09:10 Njoki Njuki Lucy: that's so cool! I wondered! Ha!!
01:10:30 Federica Gazzelloni: thanks Jon!!
```
</details>
`r knitr::include_url("https://www.youtube.com/embed/CEKPAUWTA3c")`
<details>
<summary> Meeting chat log </summary>
```
00:10:17 Becki R. (she/her): brb!
00:26:00 Njoki Njuki Lucy: does this mean, in this case, we will have 3 regression models?
00:26:21 Federica Gazzelloni: can you specify: mtcars%>%map(.f=
00:27:12 Federica Gazzelloni: mtcars%>%split(.$cyl)%>%map(.f=lm(….))
00:27:41 Ryan Metcalf: I’m reading it as `mpg` being dependent (y) and `wt` being independent.
00:29:08 Jon Harmon (jonthegeek): # A more realistic example: split a data frame into pieces, fit a
# model to each piece, summarise and extract R^2
mtcars %>%
split(.$cyl) %>%
map(~ lm(mpg ~ wt, data = .x)) %>%
map(summary) %>%
map_dbl("r.squared")
00:29:55 Jon Harmon (jonthegeek): mtcars %>%
split(.$cyl) %>%
map(.f = ~ lm(mpg ~ wt, data = .x))
00:30:22 Jon Harmon (jonthegeek): mtcars %>%
split(.$cyl) %>%
map(.f = lm(mpg ~ wt, data = .x))
00:45:11 Federica Gazzelloni: coalesce()
01:17:01 Ryan Metcalf: Great Job Becki!!!
01:17:25 Becki R. (she/her): Thanks :)
```
</details>
### Cohort 6
`r knitr::include_url("https://www.youtube.com/embed/NVUHFpYUmA4")`
<details>
<summary> Meeting chat log </summary>
```
00:04:39 Marielena Soilemezidi: I'll be back in 3'! No haste with the setup:)
00:04:52 Adeyemi Olusola: Ok
00:07:22 Adeyemi Olusola: Let me know when you return
```
</details>
`r knitr::include_url("https://www.youtube.com/embed/YnZSfzMGhTE")`
<details>
<summary> Meeting chat log </summary>
```
00:11:10 Marielena Soilemezidi: hello! :)
00:20:31 Marielena Soilemezidi: yep, it looks good!
00:46:18 Daniel Adereti: How does it get the list of 3?
00:47:55 Marielena Soilemezidi: sorry, got disconnected for some time, but I'm back!
00:48:05 Daniel Adereti: No worries!
00:58:28 Adeyemi Olusola: olusolaadeyemi.ao@gmail.com
```
</details>
### Cohort 7
`r knitr::include_url("https://www.youtube.com/embed/vPEgWgs0q7s")`
<details>
<summary> Meeting chat log </summary>
```
00:06:31 Oluwafemi Oyedele: Hi Tim!!!
00:06:47 Tim Newby: Hi Oluwafemi, can you here me?
00:06:48 Oluwafemi Oyedele: We will start in 6 minutes time!!!
00:13:26 Oluwafemi Oyedele: start
00:56:02 Oluwafemi Oyedele: https://adv-r.hadley.nz/functionals.html
00:56:25 Oluwafemi Oyedele: stop
```
</details>
### Cohort 8
`r knitr::include_url("https://www.youtube.com/embed/TQabUIBbJKs")`