-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathfoundations_ls06_rmarkdown.qmd
496 lines (310 loc) · 22.6 KB
/
foundations_ls06_rmarkdown.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
---
title: 'R Markdown'
output:
html_document:
number_sections: true
toc: true
toc_float: true
css: !expr here::here("global/style/style.css")
highlight: kate
editor_options:
markdown:
wrap: 100
canonical: true
chunk_output_type: console
---
```{r, include = FALSE, warning = FALSE, message = FALSE}
## Load packages
if(!require(pacman)) install.packages("pacman")
pacman::p_load(tidyverse, highcharter, here, htmltools)
## Source functions
source(here("global/functions/misc_functions.R"))
## knitr settings
knitr::opts_chunk$set(warning = F, message = F,
class.source = "tgc-code-block", error = T)
```
## Introduction
The {rmarkdown} package enables you to generate dynamic documents by combining formatted text and results produced by R code. With R Markdown, you can create documents in various formats such as HTML, PDF, Word, and many others, making it a versatile tool for exporting, communicating, and sharing your analysis results.
This document itself was created using R Markdown. While there is an entire book dedicated to R Markdown, we will cover some of the essential concepts here.
Note that working with R Markdown requires using a lot of the graphical user interface (GUI) tools in RStudio. Because of this, the written notes in this lesson will not be as detailed as in other lessons. For deeper understanding, we recommend that you follow along with the accompanying video tutorial.
## Learning objectives {.unlisted .unnumbered}
- Create and knit an R Markdown document that includes code and free text
- Output documents in multiple formats, including HTML, PDF, Word, PowerPoint, and flexdashboards
- Understand basic Markdown syntax
- Use R chunk options, such as *eval*, *echo*, and *message*
- Know the syntax for inline R code
- Recognize useful packages for table formatting in R Markdown
- Understand how to use the {here} package to set the project folder as the working directory in R Markdown files
## Project setup
To begin, open RStudio and click on the *File* menu. Select *New Project...* and then click on *New Directory*. Choose a name for your project and specify the directory where you want to store it. Remember the location for future reference. Once you have filled out these fields, click *Create Project*.
Next, let's set up some folders within the project. In the *Files* pane, click on *New Folder* and name it "data". Click *OK*. This folder will store the project's data. Create another folder called "rmd" to store your R Markdown documents.
## Create a new document
An R Markdown document is a simple text file with the `.Rmd` extension.
To create a new R Markdown document in RStudio, go to the *File* menu, choose *New file*, and then select *R Markdown...*. If prompted, install the necessary packages. Once RStudio has the required packages, the following dialog box will appear:
![](images/new_rmd_document.jpg){width="355"}
For now, keep the default values and click OK. A file with sample content will be displayed.
Experiment with editing some of the text in the file. Notice that it consists of free text and code sections.
Save your file using `Cmd/Ctrl` + `S`, and make sure to give it the ".Rmd" extension. For example, "ebola_analysis.Rmd". Save it in the "rmd" folder you created earlier.
To render the document, click on the "knit" button at the top right:
![](images/rmarkdown_knit_button.jpg){width="271"}
This will generate an HTML output that looks like this:
![](images/rmarkdown_default_html_output.jpg){width="544"}
The rendered file will be stored in the same directory as your Rmd file, with the same name but ending in ".html" instead of ".rmd".
::: {.callout-note title='Vocab'}
HTML stands for Hypertext Markup Language and is the standard format used for most documents on the web.
:::
## R Markdown Header (YAML)
Let's return to the rest of the Rmd file and examine it part by part.
The first part of the document is its *header*, also known as "YAML" (Yet Another Markup Language). The name is intended to be humorous.
``` yaml
---
title: "Untitled"
output: html_document
date: "2022-10-09"
---
```
The YAML header must be located at the very beginning of the document, delimited by three dashes (`---`) before and after.
This header contains the document's metadata, such as its title, author, date, and various options that allow you to configure and customize the entire document and its rendering. For example, the line `output: html_document` specifies that the generated document should be in HTML format.
You can change the `html_document` text to experiment with other formats.
### Word Document
If you set the output to "word_document", and click to tknit the file, the rendered document will look like this:
![Image of the R Markdown document open in the Microsoft Word program](images/rmarkdown_word_screen.jpg)
A ".docx" version of your document will be created in the "rmd" folder.
### PowerPoint Document
When the output is set to "powerpoint_document", the result will be:
![Image of the R Markdown document open in the Microsoft PowerPoint program](images/rmarkdown_powerpoint_screen.jpg)
### PDF Document
If you change the output setting to "pdf_document", you can obtain the same document in PDF format (you may be prompted to install tinytex on your computer, see below):
![](images/rmd_pdf_format.jpg){width="418"}
::: {.callout-note title='Key Point'}
For PDF generation, you must have a working `LaTeX` installation on your system. If not, Yihui Xie's `tinytex` extension aims to simplify the installation of a minimal `LaTeX` distribution regardless of your machine's operating system.
To use it, first install the extension with `install.packages('tinytex')`, then run the following command in the console (expect a download of about 200MB): `tinytex::install_tinytex()`. More information is available on [the tinytex website](https://yihui.name/tinytex/).
:::
### Prettydoc
To try the "prettydoc" format, type `install.packages('prettydoc')` into the console and press *Enter*. The output format for prettydoc is slightly different from the previous three. You need to use `prettydoc::html_pretty` in the `output` section. When you knit a prettydoc, you should see something like this:
![Image of the R Markdown document as a prettydoc](images/rmarkdown_prettydoc_screen.jpg)
### Flexdashboard
You can even create a simple dashboard format. First, run `install.packages('flexdashboard')`. Then, set the `output` to `flexdashboard::flex_dashboard` and knit. The result will be similar to the following:
![Image of the R Markdown document as a flexdashboard](images/rmarkdown_flexdashboard_screen.jpg)
Note that it does not yet have tabs. To create tabs in a flexdashboard, change some of your double hashtags `##` to single hashtags `#`. This will modify the header style for those sections, and flexdashboard will render those headers as tabs.
Many other formats are available, and we encourage you to explore them on your own!
## Visual vs Source mode
Rmarkdown documents can be edited in either a "Source" mode or a "Visual" mode.
You can switch into visual mode for a given document using the toolbars. There is a pair of buttons to toggle between the modes:
![](images/visual_mode_toggle.jpg){width="427"}
What's the difference between these two modes?
In source mode, you see the raw markdown syntax.
::: {.callout-note title='Vocab'}
**Markdown** is a simple set of conventions for adding formatting to plain text. For example, to italicize text, you wrap it in asterisks `*text here*`, and to start a new header, you use the pound sign `#`. We will learn these in detail below.
:::
In visual mode, you see a Microsoft Word-like view with a toolbar for easy formatting.
This means you don't have to remember the syntax for markdown elements. For example, if you want to make a section of text bold, you can simply highlight that piece of text and click on the bold button in the toolbar.
While visual mode is much easier to use, we will teach you markdown syntax here for three reasons:
1. Visual mode can sometimes be buggy, and to debug this, you'll need to switch to source mode.
2. Understanding markdown syntax is useful outside of Rmarkdown.
3. Visual mode is not available in RStudio's collaborative mode, which you may want to use.
## Markdown syntax
In the "Help" tab of the top RStudio menu, if you look up "Markdown Quick Reference", you will find a wide variety of RMD options available.
You can define titles of different levels by starting a line with one or more `#`:
```
## Level 1 title
### Level 2 Title
#### Level 3 Title
```
The body of the document consists of text that follows the *Markdown* syntax. A Markdown file is a text file that contains lightweight markup to help set heading levels or format text. For example, the following text:
```
This is text with *italics* and **bold**.
You can define bulleted lists:
- first element
- second element
```
Will generate the following formatted text:
> This is text with *italics* and **bold**.
>
> You can define bulleted lists:
>
> - first element
> - second element
Note that you need spaces before and after lists, as well as keeping the listed items on separate lines. Otherwise, they will all crunch together rather than making a list.
We see that words placed between asterisks are italicized, and lines that begin with a dash are transformed into a bulleted list.
The Markdown syntax allows for other formatting, such as the ability to insert links or images. For example, the following code:
```
[Example Link](https://example.com)
```
... will give the following link:
> [Example Link](https://example.com)
We can also embed images. If you're in *Source* mode, type:
`![what you want the subtitle to say](images/picture_name.jpg)`, replacing "what you want the subtitle to say" (it can also be blank), "images" with the name of the image folder in your project, and "picture_name.jpg" with the name of the image you want to use. In *Visual* mode, you can open the folder that holds your image on your computer and drag-and-drop the image from the folder onto the page you're building. Alternatively, place the cursor where you want the image, click the button above marked with a "picture" icon, follow the prompts, and insert your image where the cursor is. This will also create an "images" folder in your project (if it doesn't already exist) and put the image file into the "images" folder.
When titles have been defined, clicking on the *Show document outline* icon on the far right of the toolbar associated with the R Markdown file will display a table of contents automatically generated from the titles, allowing for easy navigation within the document:
![Dynamic TOC](images/rstudio_rmd_toc.png)
### Customizing the generated document
The generated document can be customized by modifying options in the document's preamble. RStudio offers a graphical interface to change these options more easily. To access it, click on the gear icon to the right of the *Knit* button and choose *Output Options...*
![R Markdown Output Options](images/rstudio_rmd_output_options.png)
A dialog box will appear, allowing you to select the desired output format and various options depending on the format:
![R Markdown Output Options Dialog](images/rstudio_rmd_output_dialog.png)
For example, with the HTML format, the *General* tab allows you to specify if you want a table of contents, its depth, the themes to apply for the document and the syntax highlighting of the R blocks, etc. The *Figures* tab allows you to change the default dimensions of the generated graphics.
When you change options, RStudio will modify the preamble of your document. For instance, if you choose to show a table of contents and change the syntax highlighting theme, your header will become something like:
```
---
title: "R Markdown Review"
output:
html_document:
highlight: kate
toc: yes
---
```
You can also modify the options directly by editing the preamble.
Note that it is possible to specify different options depending on the format, for example:
```
---
title: "R Markdown Review"
output:
html_document:
highlight: kate
toc: yes
pdf_document:
fig_caption: yes
highlight: kate
---
```
The complete list of possible options is available on [the official documentation site](http://rmarkdown.rstudio.com/formats.html) (which is very comprehensive and well-made) and on the cheat sheet and reference guide, accessible from RStudio via the *Help* menu, then *Cheatsheets*.
## R code chunks
In addition to free text in Markdown format, an R Markdown document contains, as its name suggests, R code. This is included in blocks (*chunks*) written the following way in *Source* mode:
\`\`\`{r}\
r_code \<- 2+2\
\`\`\`
Which will produce the following in *Visual* mode:
```{r}
r_code <- 2+2
```
As this sequence of characters is not very easy to enter, you can use the *Insert* menu of RStudio and choose *R*[\^3], or use the keyboard shortcut `Command+Option+i` on Mac or `Ctrl+Alt+i` on Windows.
Note that it is possible to use other languages in code chunks.
![Code block insertion menu](images/rstudio_insert_chunk.png)
In RStudio blocks of R code are usually displayed with a slightly different background color to distinguish them from the rest of the document.
When your cursor is in a block, you can enter the R code you want and execute it with Command + Enter. You can also execute all the code contained in a block by clicking on the green "play" button at the top right of the code chunk.
### Chunk output inline vs in condole
In RStudio, by default, the results of a block of code (text, table or graphic) are displayed directly *in* the document editing window, allowing them to be easily viewed and kept for the duration of the session.
This behavior can be changed by clicking the gear icon on the toolbar and choosing *Chunk Output in Console.*
### R code chunk options
It is also possible to pass options to each block of R code to modify its behavior.
Remember that a block of code looks like this:
```{r echo = FALSE, comment = ""}
cat(
"```{r}
x <- 1:5")
```
The options of a code block are to be placed inside the braces `{r}`, with a comma separating each option.
### Block name
The first possibility is to give a *name* to the block. This is indicated directly after the `r`:
`{r block_name}`
It is not mandatory to name a block, but it can be useful in the event of a compilation error, to identify the block that caused the problem. Be careful, you cannot have two blocks with the same name.
### Options
In addition to a name, a block can be passed a series of options in the form `option=value`. Here is an example of a block with a name and options:
```{r echo = FALSE, comment = ""}
cat(
"```{r blockName, echo = FALSE, warning = TRUE}
x <- 1:5")
```
And an example of an unnamed block with options:
```{r echo = FALSE, comment = ""}
cat(
"```{r echo = FALSE, warning = FALSE}
x <- 1:5")
```
One of the useful options is the `echo` option. By default `echo` is `TRUE`, and the block of R code is inserted into the generated document, like this:
```{r}
x <- 1:5
print(x)
```
But if we set the `echo=FALSE` option, then the R code is no longer inserted into the document, and only the result is visible:
```{r echo=FALSE}
x <- 1:5
print(x)
```
Here is a list of some of the available options:
| Option | Values | Description |
|-----------|-----------|---------------------------------------------------|
| echo | TRUE/FALSE | Show (or hide) this R code chunk in the resulting knitted document |
| eval | TRUE/FALSE | Run (or not) the code in this code chunk in the resulting knitted document |
| include | TRUE/FALSE | Combines the options "echo and eval"; either show and run, or hide and don't run |
| message | TRUE/FALSE | Show (or hide) any system messages generated by running this code chunk in the resulting knitted document |
| warning | TRUE/FALSE | Show (or hide) any warnings generated by running this code chunk in the resulting knitted document |
There are many other options described in particular in [R Markdown reference guide](https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf){target = "\_blank"} (PDF in English).
### Change options
It is possible to modify the options manually by editing the header of the code block, but you can also use a small graphical interface offered by RStudio. To do this, simply click on the gear icon located to the right of the header line of each block:
![Code Block Options Menu](images/rstudio_chunk_options.png)
You can then modify the most common options, and click on *Apply* to apply them.
### Global Options
You may want to apply an option to all the blocks in a document. For example, one may wish by default not to display the R code of each block in the final document.
You can set an option globally using the `knitr::opts_chunk$set()` function. For example, inserting `knitr::opts_chunk$set(echo = FALSE)` into a code block will set the `echo = FALSE` option to default for all subsequent blocks.
In general, we place all these global modifications in a special block called `setup` and which is the first block of the document:
```{r echo=FALSE, comment=""}
cat(
"```{r, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
")
```
## Inline Code
It is also possible to write code chunks embedded in the text. If you go to *Source* mode and type
"The sum of a pair of 2s is \` r 2+2 \`"
and then knit the RMD, the resulting document will evaluate the r code between the backticks. Note that you have to include the "r" at the beginning of your inline code chunk to get it to recognize it as R code.
You could also pass variables around your document just like in a regular R program. For example, on one line you could run,
\`\`\` {r} max_height \<- max(women\$height) \`\`\`
"The maximum height in the women data set is \` r max_height \` ."
The advantages of such a system are numerous:
- a single document can show your entire analysis workflow, since the code, results and text explanations are included
- the document can be very easily regenerated and updated, for example if the source data has been modified.
- the variety of output formats (HTML, PDF, Word, slides, dashboards, etc.) makes it easy to present your work to others.
## Display tables
There are a number of ways for R Markdown Documents to show data tables. To start, you can see how our RMD displays a table with no formatting:
```{r}
women
```
It looks pretty basic. Next, to follow along you'll want to load the following packages:
```{r}
pacman::p_load(flextable, gt, reactable)
```
Flextable is better for showing simple tables supported by many formats. GT is better for showing complex tables in HTML documents. Reactable is better for showing very large tables in HTML by giving your audience the option to scroll through the tables.
```{r}
"This is a flextable"
flextable::flextable(women)
```
```{r}
"This is a GT table"
gt::gt(women)
```
```{r}
"This is a reactable"
reactable::reactable(women)
```
You can see many other types of table formats people have created at <https://www.rstudio.com/blog/rstudio-table-contest-2022/>
## Document Templates
We have seen here the production of "classic" documents, but R Markdown allows you to create many other things.
The extension's documentation site offers [a gallery](http://rmarkdown.rstudio.com/gallery.html) of the different possible outputs. You can create slides, websites or even entire books, like this document.
### Slides
An interesting use is the creation of slideshows for presentations in the form of slides. The principle remains the same: we mix text in Markdown format and R code, and R Markdown transforms everything into presentations in HTML or PDF format. In general, the different slides are separated at certain heading levels.
Some slide templates are included with R Markdown, including:
- `ioslides` and `Slidy` for HTML presentations
- `beamer` for PDF presentations via `LaTeX`
When you create a new document in RStudio, these templates are accessible via the *Presentation* entry:
![Create an R Markdown presentation](images/rstudio_rmd_presentation.png)
Other extensions, which must be installed separately, also allow slideshows in various formats. These include in particular:
- [xaringan](https://slides.yihui.name/xaringan/%20for) for HTML presentations based on [remark.js](https://remarkjs.com/)
- [revealjs](https://github.com/rstudio/revealjs) for HTML presentations based on [reveal.js](http://lab.hakim.se/reveal-js/#/)
- [rmdshower](https://github.com/mangothecat/rmdshower) for HTML slideshows based on [shower](https://github.com/shower/shower)
Once the extension is installed, it generally offers a starting *template* when creating a new document in RStudio. These are accessible from the *From Template* entry.
![Create a presentation from a template](images/rstudio_rmd_templates.png)
### Templates
There are also different *templates* allowing you to change the format and presentation of the generated documents. A list of these formats and their associated documentation can be accessed from the [formats](http://rmarkdown.rstudio.com/formats.html) documentation page.
Note in particular:
- the [Distill](https://rstudio.github.io/distill/) format, suitable for scientific or technical publications on the Web
- the [Tufte Handouts](http://rmarkdown.rstudio.com/tufte_handout_format.html) format which allows you to produce PDF or HTML documents in a format similar to that used by Edward Tufte for some of his publications
- [rticles](https://github.com/rstudio/rticles), package that offers LaTeX templates for several scientific journals
Finally, the [rmdformats](https://github.com/juba/rmdformats) extension offers several HTML templates particularly suitable for long documents.
Again, most of the time, these document templates offer a starting *template* when creating a new document in RStudio (entry *From Template*):
![Create a document from a template](images/rstudio_rmd_templates.png)
## Resources
Below are some resources to help you learn more about R Markdown:
The book *R for data science*, available online, contains [a chapter dedicated to R Markdown](http://r4ds.had.co.nz/r-markdown.html).
The [extension's official site](http://rmarkdown.rstudio.com/) contains very complete documentation, both for beginners and for advanced users.
Finally, the RStudio help (*Help* menu then *Cheatsheets*) provides access to two summary documents: a synthetic "cheat sheet" (*R Markdown Cheat Sheet*) and a more complete "reference guide" ( *R Markdown Reference Guide*).