generated from allisonhorst/meds-distill-template
---
title: "Sharing Things"
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## GitHub Pages & the R Markdown & Quarto family
You have been using it all week, and before! All the course material has been developed as an R Markdown based website, a distill website to be precise (see [here](https://rstudio.github.io/distill/website.html) for more). It is also a great way to publish and share your notebook with your team and the broader community. Also check out [Quarto](https://quarto.org/), a cross-language tool for rendering documents from code!
## Interactive web applications
When you have complex data that you want users not only to see but also to interact with, or when you want them to customize the parameters used in an analysis, interactive web applications are a great way to increase engagement. The good news is that you no longer need to be a web developer to spin up such applications or dashboards.
### R Shiny <img style="float: right;width: 30px;" src=img/shiny_logo.png />
R Shiny is a framework that lets you build web applications entirely in R: Shiny generates the HTML and JavaScript front end for you, so you can develop a web application without having to learn a new programming language. Note that you will need a server running R to host the application.
Check out this gallery of shiny apps: <https://shiny.rstudio.com/gallery/>
Getting started with Shiny: <https://shiny.rstudio.com/tutorial/>
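To give a feel for what a Shiny app looks like, here is a minimal sketch, assuming the `shiny` package is installed. The slider, the histogram, and the use of the built-in `faithful` data set are illustrative choices, not part of the course material.

```r
library(shiny)

# UI: one slider input and one plot output
ui <- fluidPage(
  titlePanel("Histogram demo"),
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server: the histogram is redrawn every time the slider value changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(faithful$waiting, breaks = input$bins,
         main = "Old Faithful waiting times", xlab = "Minutes")
  })
}

# Bundle UI and server into an app object
app <- shinyApp(ui = ui, server = server)
# In an interactive R session, runApp(app) launches it in your browser
```

The app only reacts to the slider while an R process is running behind it, which is why a hosting server is needed to share it.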
### HTML widgets
HTML widgets offer great interactive data visualizations. They are more limited than Shiny because the user cannot change parameters that modify the underlying data, but they have the advantage of not needing a server to run, and they can be inserted directly into an R Markdown document.
To get started: <https://www.htmlwidgets.org/>
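As a small sketch of the idea, the example below builds an interactive table and saves it as a plain web page. It assumes the `DT` and `htmlwidgets` packages are installed. `DT` is just one of many widget packages and is picked here for illustration, and the output location in a temporary folder is arbitrary.

```r
library(DT)  # assumption: DT is used here only as an example widget package

# Build an interactive, sortable and searchable table from a built-in data set
widget <- datatable(head(iris, 20))

# Save it as a web page; the supporting JavaScript goes into a folder next to it
out_file <- file.path(tempdir(), "iris_table.html")
htmlwidgets::saveWidget(widget, out_file, selfcontained = FALSE)
```

The resulting HTML file opens in any browser without R running. Inside an R Markdown document, printing `widget` in a chunk is enough to embed it.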
## plotly <img style="float: right;width: 60px;" src=img/logo-plotly.svg />
[`plotly`](https://plotly.com/) enables you to develop great interactive data visualizations. It has the advantage of being available in [R](https://plotly.com/r/), [Python](https://plotly.com/python/) and JavaScript. One great thing in R is that if you are a ggplot master, you can write your plot code using ggplot and transform it into a plotly plot with one line of code (see [here](https://plotly.com/ggplot2/getting-started/) for an example).
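The one-line conversion is done by `ggplotly()`. Here is a minimal sketch, assuming the `ggplot2` and `plotly` packages are installed. The plot itself (built from the `mtcars` data set that ships with R) is only an illustration.

```r
library(ggplot2)
library(plotly)

# A regular static ggplot
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

# The one-line conversion into an interactive plotly figure
p_interactive <- ggplotly(p)
```

Printing `p_interactive` in the console or in an R Markdown chunk shows the plot with hover tooltips, zooming and a clickable legend.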
Note that there are also other Python libraries for creating interactive plots. Here are a few: <https://mode.com/blog/python-interactive-plot-libraries/>
# Code Repositories
- GitHub
- GitLab (open source!!): <https://about.gitlab.com/>
- Bitbucket: <https://bitbucket.org/product>
- ...
**Don't forget to provide information about the versions of software and libraries that were used when running this specific analysis**
It is also a great idea to add a license to your project so people know how they can use your code: <https://choosealicense.com/>
## Binder & Jupyter <img style="float: right;width: 60px;" src=img/binder_logo.svg />
Turn your `git`-based repository into a collection of interactive Jupyter notebooks with <https://mybinder.org/>, so other researchers can run your code without having to install anything!
![](https://media.giphy.com/media/9SIAXIrMh4AWBx4GBB/giphy.gif)
**Try it:** <https://mybinder.org/>
## Citing your code
Note that it is also possible to assign a [DOI](https://en.wikipedia.org/wiki/Digital_object_identifier) to cite a specific version of your repository. For example, see [here](https://guides.github.com/activities/citable-code/) for more information on how to link [Zenodo](https://zenodo.org/) and GitHub.
# Data Repositories
As we discussed earlier, code repositories are not necessarily the best home for your data sets, especially if their format is not text based or if they are large (>100MB). In addition, data repositories offer better support for metadata standards, which help you describe your data and thus make them more discoverable. Like code repositories, data repositories version your data, creating a history of your data sets that you can navigate. Most data repositories will also offer to mint a DOI so that your data (to be precise, a specific version of your data) can be cited in a convenient and unambiguous way.
We will talk more in depth about data repositories in the Fall, but for now we will mention two entryways to environmental data federations:
- DataONE: a federation of data repositories, https://www.dataone.org/
- re3data: a registry of research data repositories, https://www.re3data.org/
Searching for data in your field and seeing where other researchers archive their data is often a great way to determine which data repository could be a good home for your own data.
# You as an Author <img style="float: right;width: 35px;" src=img/orcid_logo.svg />
It is important to be able to refer to yourself as a researcher and as an author of your work in an unambiguous manner. From their website: _[ORCID](https://info.orcid.org/what-is-orcid/) is a great way to create a persistent digital identifier (an ORCID iD) that you own and control, and that distinguishes you from every other researcher._ ORCID is also increasingly used as an authentication system for many services (e.g. data repositories).
# What about your computing environment?
## Session info
Your analysis was done with specific versions of the program used and of all the packages involved, on a specific Operating System (OS). The good news is that there are tools that let you capture this information in a systematic manner.
```{r}
sessionInfo()
```
or even better:
```{r}
devtools::session_info()
```
You can save all this content to a `session_info.txt` file and upload it to your repository.
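One way to do this, sketched with base R only: `capture.output()` collects what `sessionInfo()` prints, and `writeLines()` writes it to a file. The working directory is assumed to be the root of your repository.

```r
# Capture the printed session information as a character vector, one line each
info <- capture.output(sessionInfo())

# Write it to a text file that can be committed alongside the code
writeLines(info, "session_info.txt")
```

The same pattern works with `devtools::session_info()` in place of `sessionInfo()`, provided the `devtools` package is installed.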
In Python, using `pip freeze > requirements.txt` or `conda list --export > requirements.txt` will create a text file listing all the libraries (and their versions) used in a specific Python environment. You can use this file to (re)install the same packages and versions into a new Python environment, with `pip install -r requirements.txt` for a file created by pip or `conda create --name <env> --file requirements.txt` for a file created by conda. Note that the two formats are not interchangeable. It is also great practice to add this file to your repository.
## Containers
A helpful abstraction for capturing the computing environment is a container, whereby a container is created from a set of instructions in a recipe. For the most common containerisation software, Docker, this recipe is called a Dockerfile. Docker is an open platform for developing, shipping, and running applications. Docker enables you to separate your applications from your infrastructure and ship the containers to others. A Docker container can be seen as a computer inside your computer.
```{r, fig.cap="http://jsta.github.io/r-docker-tutorial/", echo=FALSE}
knitr::include_graphics("img/docker_drawing.jpeg")
```
A few good readings:
- Docker for scientific reproducibility: <https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008316>
- The Whole Tale project: combining containers with data repositories <https://wholetale.org/>
- Docker tutorial: <http://jsta.github.io/r-docker-tutorial/>
- Environment Management with Docker: <https://environments.rstudio.com/docker>
- Sharing and Running R code using Docker: <https://aboland.ie/Docker.html>
---
# Code friendly Presentations
## Xaringan
Xaringan is an R package for creating slide decks with R Markdown: <https://github.com/yihui/xaringan>
```{r, eval=FALSE}
remotes::install_github('yihui/xaringan')
```
Here is a good introduction to it: <https://www.favstats.eu/post/xaringan_tut/>
## Quarto Presentations
<https://quarto.org/docs/presentations/>
<https://meghan.rbind.io/blog/quarto-slides/>