---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/",
out.width = "100%"
)
```
# Résilience Côtière <a href=''><img src='man/figures/logo.png' align="right" height="175" /></a>
<!-- badges: start -->
[License: GPL-2.0](https://choosealicense.com/licenses/gpl-2.0/)
[Lifecycle: experimental](https://lifecycle.r-lib.org/articles/stages.html#experimental)
<!-- badges: end -->
This repository contains the *research compendium* for the project *Résilience Côtière*. It contains all the code required to import, format, and integrate the data needed for this project, as well as the code used to run the analyses and to produce the figures and the project report.
### How to cite
Please cite this research compendium as follows:
> **{{ PLEASE ADD A CITATION }}**
## Structure and Content
This research compendium is designed to facilitate reproducible research by organizing data, scripts, and outputs within a structured framework. By emulating the structure of an R package, the compendium combines the rigour of package development with the flexibility required for complex analytical workflows. This structure ensures transparency, reproducibility, and ease of navigation for researchers and collaborators. The compendium not only supports the organization and documentation of workflows but also allows the seamless integration of R tools and functions. Adopting an R package-like structure provides several advantages:
- **Reproducibility**: The standardized structure ensures that all components (data, code, outputs) are easily accessible and linked, reducing the likelihood of errors in replication.
- **Portability**: The compendium can be shared and installed like an R package, enabling collaborators to reproduce the work on their systems.
- **Documentation**: Built-in support for documentation (e.g., `man/`, `README.md`) enhances understanding and usability for current and future users.
Although the data themselves are not included, the *research compendium* contains everything needed to access and transform the raw data and to prepare the threat layers for this project. It also contains the code that creates the figures, the tables, and the project report. Only sensitive data covered by signed confidentiality agreements are not openly accessible; they are stored in a secure Google Cloud Storage bucket that can be accessed programmatically with an access key. The whole project therefore remains reproducible for anyone holding the key, even though access to some data is restricted.
The research compendium is organized into the following components:
```
myResearchCompendium/
│
├── _targets/
│
├── data/
│
├── docs/
│
├── figures/
│
├── man/
│
├── pubs/
│
├── R/
│
├── workspace/
│   ├── bibliographies/
│   ├── config/
│   ├── credentials/
│   ├── data/
│   │   ├── harvested/
│   │   └── analyzed/
│   ├── pipelines/
│   │   ├── harvesting/
│   │   └── analytical/
│   └── script/
│
├── _targets.R
├── DESCRIPTION
├── LICENSE.md
├── NAMESPACE
├── README.md
└── README.Rmd
```
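As an illustration, the directory skeleton above could be scaffolded with base R. This is a minimal sketch, not the project's own setup code; the function name `create_compendium()` and its `root` argument are hypothetical. The `_targets/` folder is left out on purpose, because `targets` creates it when the pipeline runs.

```r
# Hypothetical helper: create the compendium's directory skeleton under `root`.
create_compendium <- function(root = "myResearchCompendium") {
  dirs <- c(
    "data", "docs", "figures", "man", "pubs", "R",
    file.path("workspace", "bibliographies"),
    file.path("workspace", "config"),
    file.path("workspace", "credentials"),
    file.path("workspace", "data", "harvested"),
    file.path("workspace", "data", "analyzed"),
    file.path("workspace", "pipelines", "harvesting"),
    file.path("workspace", "pipelines", "analytical"),
    file.path("workspace", "script")
  )
  paths <- file.path(root, dirs)
  # recursive = TRUE also creates missing parent folders;
  # folders that already exist are left untouched
  for (p in paths) dir.create(p, recursive = TRUE, showWarnings = FALSE)
  invisible(paths)
}

# Example: scaffold the structure in a temporary location
root <- file.path(tempdir(), "myResearchCompendium")
created <- create_compendium(root)
all(dir.exists(created))
```

Root-level files such as `DESCRIPTION` and `NAMESPACE` are not created by this sketch.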
Below is a description of each component of the research compendium.
### Root-Level Files
- **`_targets.R`**: The central configuration file for the [`targets`](https://CRAN.R-project.org/package=targets) R package, which manages and tracks the execution of analytical workflows. This file defines the targets (steps) in the analysis and their dependencies.
- **`DESCRIPTION`**: Provides metadata about the compendium, including its title, version, author information, and dependencies. This file mirrors the `DESCRIPTION` file in R packages, enabling compatibility with R's package ecosystem.
- **`LICENSE.md`**: Contains the licensing terms under which the compendium is distributed, ensuring clarity regarding usage and redistribution rights.
- **`NAMESPACE`**: Specifies the exported functions and imports from other packages, similar to an R package, to manage the scope and dependencies of functions within the compendium.
- **`README.md`** and **`README.Rmd`**: Provide an overview of the project, its goals, and instructions for setup and use. The R Markdown file (`README.Rmd`) can be rendered to create the Markdown file (`README.md`).
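To make the role of `_targets.R` more concrete, here is a minimal sketch of what such a file can look like. It is not the project's actual pipeline: the target names, the configuration file name `config.yml`, and the function `harvest_data()` are hypothetical placeholders, and the sketch assumes the `targets` and `yaml` packages are installed.

```r
# _targets.R (illustrative sketch; target and function names are hypothetical)
library(targets)

# Load the functions defined in R/
tar_source("R")

# Packages made available to every target (placeholder list)
tar_option_set(packages = c("yaml"))

list(
  # Track a configuration file so that editing it invalidates downstream targets
  tar_target(
    config_file,
    "workspace/config/config.yml", # hypothetical file name
    format = "file"
  ),
  # Read the configuration once the file target is up to date
  tar_target(config, yaml::read_yaml(config_file)),
  # Downstream step that depends on `config`;
  # `harvest_data()` stands in for a function defined in R/
  tar_target(harvested, harvest_data(config))
)
```

Each `tar_target()` declares one step, and `targets` infers the dependencies between steps from the names used in each command.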
### Directories
#### **`_targets/`**
This directory contains internal files used by the `targets` package to manage workflow execution. It tracks dependencies, outputs, and progress, ensuring reproducibility and enabling efficient re-execution of only the steps affected by changes. This folder is created the first time the pipeline defined in `_targets.R` is run.
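The contents of `_targets/` are normally inspected through `targets` functions rather than opened by hand. The following sketch assumes the `targets` package is installed and that the code is run from the compendium root; the helper `has_store()` is hypothetical.

```r
# Hypothetical helper: TRUE when a targets store exists under `root`
has_store <- function(root = ".") {
  dir.exists(file.path(root, "_targets"))
}

if (requireNamespace("targets", quietly = TRUE) && file.exists("_targets.R")) {
  # Targets whose code or upstream dependencies changed since the last run
  print(targets::tar_outdated())

  # Metadata recorded in the store (only available after a first run)
  if (has_store()) {
    print(head(targets::tar_meta()))
  }
} else {
  message("Run this from the compendium root with the targets package installed.")
}
```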
#### **`data/`**
This directory stores raw and cleaned data files that are essential to the analyses but not directly produced by the workflows. This allows the compendium to maintain a clear separation between input data and processed outputs.
#### **`docs/`**
Documentation files for the project, such as user guides, vignettes, and any additional explanatory materials that provide context for the workflows and outputs.
#### **`figures/`**
A repository for plots, charts, and visualizations generated by the analytical workflows. This directory helps centralize all visual outputs for reporting and publication.
#### **`man/`**
Documentation for functions included in the compendium. This directory mirrors the `man/` folder in R packages and contains `.Rd` files that describe each function's purpose, usage, and arguments.
#### **`pubs/`**
A location for storing draft manuscripts, reports, and other publications derived from the project. This ensures that research outputs are connected to their analytical source.
#### **`R/`**
Contains R scripts defining functions and utilities used across the workflows. This is the primary location for reusable, well-documented R functions that are central to the analyses.
#### **`workspace/`**
A comprehensive directory for project-specific resources and configurations. It is further divided into:
- **`bibliographies/`**: Bibliographic files, such as `.bib` files, used for citations in reports and publications.
- **`config/`**: Configuration files (e.g., YAML or JSON) that specify pipeline parameters and global settings for the analyses.
- **`credentials/`**: Secure storage for authentication keys and other sensitive information required for accessing data sources.
- **`data/`**: Organized into two subdirectories:
  - **`harvested/`**: Raw data files downloaded or collected through the harvesting pipelines.
  - **`analyzed/`**: Processed data files generated by the analytical pipelines.
- **`pipelines/`**: Divided into:
  - **`harvesting/`**: YAML configurations and scripts for harvesting data from external sources.
  - **`analytical/`**: YAML configurations and scripts for performing analyses on harvested data.
- **`script/`**: Scripts and functions used by the targets master workflow.
## Navigating and Using the Compendium
1. **Setting Up the Workspace**:
   - The `README.md` provides instructions for setting up the compendium, including installing dependencies, configuring paths, and loading required libraries.
2. **Credentials**:
   - This project requires credentials to access data from two different sources:
     - Secure Google Cloud Storage managed by inSileco (`pof-stac-insileco.json`)
   - These credentials must be stored in `workspace/credentials/`.
3. **Running the Pipelines**:
   - Run the master pipeline defined in `_targets.R` by calling `targets::tar_make()` in `R` from the root of the research compendium. Depending on your computer and internet connection, the first run will likely take a full day to complete.
4. **Exploring Outputs**:
   - Processed data are stored in `workspace/data/analyzed/`.
   - Visualizations and figures are available in `figures/`, while reports and publications can be found in `pubs/`.
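The steps above can be sketched as a short R session run from the root of the compendium. This is a hedged example, not an official entry point: the helper `ready_to_run()` is hypothetical, and the check only confirms that the key file named above is present, not that it is valid.

```r
# Path to the key file named above, relative to the compendium root
cred_file <- file.path("workspace", "credentials", "pof-stac-insileco.json")

# Hypothetical helper: TRUE when the pipeline file and the key file exist under `root`
ready_to_run <- function(root = ".") {
  file.exists(file.path(root, "_targets.R")) &&
    file.exists(file.path(root, cred_file))
}

if (ready_to_run() && requireNamespace("targets", quietly = TRUE)) {
  # Run the master pipeline; up-to-date targets are skipped
  targets::tar_make()

  # List the outputs described in step 4
  print(list.files(file.path("workspace", "data", "analyzed")))
  print(list.files("figures"))
} else {
  message("Missing _targets.R, the credentials file, or the targets package.")
}
```

Because `targets` skips steps that are already up to date, calling `tar_make()` again after an interrupted run does not rebuild targets that already completed.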