The {traits.build} R package

R-CMD-check Codecov test coverage Lifecycle: deprecated

Imagine you wanted to build a database of traits. You might start by compiling data from existing datasets, but you'd quickly find that there are many ways to name and measure the same trait, that different studies use different units, and that some use an outdated name for a species or taxon.

The traits.build package provides a workflow for harmonising data from disconnected primary sources. It arises from the AusTraits project (austraits.org); in 2023 it was spun out from the austraits.build repository as a separate package.

Goals

The goals of this package are to:

  1. Enable users to create open-source, harmonised, reproducible databases from disparate datasets.
  2. Provide a fully transparent workflow, where all decisions on how the data are handled are exposed.
  3. Offer a relational database structure that fully documents the contextual data essential to interpreting ecological data.
  4. Offer a straightforward, robust template for building a trait dictionary.
  5. Offer a database structure that is flexible enough to accommodate the complexities inherent to ecological data.
  6. Offer a database structure that is underlain by a documented ontology, ensuring each database field is interpretable and interoperable with other databases and data structures.
  7. Have no dependencies on proprietary software, and no costs to set up and maintain (beyond person time).

To harmonise diverse data sources, we use a reproducible workflow that applies the changes each source needs before it can be incorporated into a harmonised compilation. Such changes include restructuring datasets, renaming variables, changing variable units, and changing taxon names.
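As an illustration of the kinds of changes listed above, here is a minimal sketch in base R. It does not use the traits.build API; the function `harmonise_source()`, the column names, the placeholder taxon names, and the values are all hypothetical and exist only to show the four steps on a toy dataset.

```r
# A hypothetical helper, NOT part of traits.build: it applies the four kinds
# of change (rename, convert units, update taxon names, restructure) to one
# toy source dataset.
harmonise_source <- function(raw, taxon_updates) {
  # 1. Rename variables to a common naming scheme.
  names(raw)[names(raw) == "Species"] <- "taxon_name"
  names(raw)[names(raw) == "LeafArea_cm2"] <- "leaf_area"

  # 2. Change units: cm^2 to mm^2 (1 cm^2 = 100 mm^2).
  raw$leaf_area <- raw$leaf_area * 100

  # 3. Replace outdated taxon names using a lookup (old name -> new name).
  outdated <- raw$taxon_name %in% names(taxon_updates)
  raw$taxon_name[outdated] <- unname(taxon_updates[raw$taxon_name[outdated]])

  # 4. Restructure to long format: one row per measurement.
  data.frame(
    taxon_name = raw$taxon_name,
    trait_name = "leaf_area",
    value = raw$leaf_area,
    unit = "mm2",
    stringsAsFactors = FALSE
  )
}

# Toy input with placeholder names and made-up values.
raw <- data.frame(
  Species = c("Genus oldname", "Genus other"),
  LeafArea_cm2 = c(1.5, 32),
  stringsAsFactors = FALSE
)
taxon_updates <- c("Genus oldname" = "Genus newname")

harmonised <- harmonise_source(raw, taxon_updates)
print(harmonised)
```

Keeping every change in a scripted function like this, rather than editing the raw file by hand, is what makes each decision visible and repeatable.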

Prerequisites

  1. Familiarity with the R programming language, covered in R for Data Science.
  2. Familiarity with data science workflow management techniques.
  3. The ability to write functions to prepare data, analyse data, and summarise results in a data analysis project.
  4. An appreciation of the traits.build workflow, including the required file structure.

Installation

The development version of traits.build can be installed from GitHub. A CRAN release is planned but not yet available.

| Type | Source | Command |
|---|---|---|
| Release | CRAN | coming |
| Development | GitHub | remotes::install_github("traitecoevo/traits.build") |

Documentation

Tutorials

Help

Please read the help guide to learn how best to ask for help using traits.build.

Code of conduct

  • Please note that the package follows the Contributor Code of Conduct for the AusTraits projects. By contributing to this project you agree to abide by its terms.

Citation

A publication describing the traits.build workflow:

Wenk E, Bal P, Coleman D, Gallagher R, Yang S, Falster D (2024) Traits.build: A data model, workflow and R package for building harmonised ecological trait databases. Ecological Informatics 83: 102773. DOI: 10.1016/j.ecoinf.2024.102773

A publication describing the biggest database using the traits.build workflow:

Falster D, Gallagher R, Wenk E, et al. (2021) AusTraits, a curated plant trait database for the Australian flora. Scientific Data 8: 254. DOI: 10.1038/s41597-021-01006-6

Acknowledgements

Funding: The AusTraits project received investment (https://doi.org/10.47486/TD044, https://doi.org/10.47486/DP720) from the Australian Research Data Commons (ARDC). The ARDC is funded by the National Collaborative Research Infrastructure Strategy (NCRIS).