generated from allisonhorst/meds-distill-template
---
title: "Coding together"
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Learning Objectives
In this part of the lesson, you will learn:
- Why __git__ is useful for reproducible analysis
- How to use __git__ to track changes to your work over time
- How to use __GitHub__ to collaborate with others
- How to write effective commit messages
- How to structure your commits so your changes are clear to others
- How to fork a repository to contribute to its content
- How to create a pull request
- How to review a pull request
## Why collaborative coding
### [Slide deck](https://EDS-214.github.io/eds214-slides/collaborative_coding.html)
Environmental Data Science (EDS), like many other data-driven research fields, requires a transdisciplinary approach to tackle challenges that often span several domains of expertise. Working as a team leverages the know-how of diverse collaborators and is often the most efficient way to tackle complex problems in EDS. Consequently, collaborative skills are required to work effectively as a member of a team. No matter their focus, highly effective teams share certain characteristics:
* Right size
* Diverse group of people with the right mix of skills, knowledge, and competencies
* Aligned purpose and incentives
* Effective organizational structure
* Strong individual contributions
* Supportive team processes and culture
**Analytical workflows are rarely linear!** Because they are developed iteratively, the most efficient way to iterate quickly on your analysis is to use scripts and leave copy-pasting behind. _Programming as part of a team is different from writing a script for your (present) self_. However, learning to program as part of a team is not only critical to the efficacy of your team, it will also help you grow as a programmer by:
* Motivating you to document your work well
* Helping you think about how to make your work reusable (by you, your future self, and others)
* Teaching you to read code from collaborators so you can build upon each other's work
* Deepening your knowledge of software development tools, such as version control
**Developing those skills will accelerate your research and open the door for you to contribute to open source projects.**
## How to code together
It is important to acknowledge that there are many solutions to the complex research questions you will face in EDS. Each of those solutions will have several possible implementations, so you will likely code a given implementation differently than your collaborators would. Integrated software engineering teams generally mitigate this by developing coding standards and conventions that guide how code is written and how specific implementations are developed. In scientific teams, where collaboration is looser and often more ephemeral, developing detailed coding standards would be too much overhead. However, we think it is important to acknowledge that coding style may vary among the data scientists on a project, and it is a good discussion to have as a team at the beginning of the project. For example, in R the team could agree to use the tidyverse approach as much as possible. We also think there are two activities that will make the team more efficient: Code Review and Pair Programming.
### Tools
The good news is that there are several tools designed to make developing code as a team more efficient. In this course, we will focus on getting familiar with the following:
- Version control system: say goodbye to `save as`
- Code repository: where we share code and communicate ideas and feedback
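
To make the "say goodbye to `save as`" idea concrete, here is a minimal sketch of how a version control system such as git keeps every version of one file in its history, rather than in copies like `analysis_v2.R`. The file name `analysis.R`, the demo identity, and the temporary directory are all hypothetical choices for illustration and are not part of the course material.

```bash
# Minimal sketch: keep two versions of one script in git history
# instead of saving a second copy under a new name.
set -euo pipefail

# Work in a throwaway directory so no real project is touched
demo_dir="$(mktemp -d)"
cd "$demo_dir"

git init --quiet

# Placeholder identity, set only for this demo repository
git config user.name "Demo User"
git config user.email "demo@example.com"

# First version of a hypothetical analysis script
echo 'print("first draft")' > analysis.R
git add analysis.R
git commit --quiet -m "Add first draft of analysis script"

# Second version: edit the same file rather than creating analysis_v2.R
echo 'print("second draft")' > analysis.R
git commit --quiet -am "Revise analysis script output"

# Each version is now one entry in the history
git log --oneline

# Count the commits reachable from the current one
commit_count="$(git rev-list --count HEAD)"
echo "commits: $commit_count"
```

After running this, the earlier version is still recoverable from the history (for example with `git show HEAD~1:analysis.R`), even though only one file exists in the working directory.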