Developer Guideline
Welcome to the Developer Guideline of the AMR R package. This guideline explains the repository workflows and how to update package elements.
To start, it is important to know that this R package and all of its components are free, open-source software and licensed under the GNU General Public License (GPL) v2.0. Open-source software does not mean that there are no legal constraints. There are actually some profound ones, since GPL-2.0 in a nutshell means that this package:
- May be used for commercial and private purposes, but may not be used for patent purposes
- May be modified, although (1) modifications must also be released under the GPL-2.0 license when distributing the package, and (2) changes made to the code must be documented (using `NEWS.md`)
- May be distributed, although (1) source code must be made available when the package is distributed, and (2) a copy of the license and copyright notice must be included with the package
- Comes with a LIMITATION of liability, and with NO warranty
The full legal text is included in this repository.
This repository uses Git hooks to support automated generation of R documentation, automated semantic software versioning, and automated export of our data sets for other software (MS Excel, SPSS, SAS, Stata, Apache Parquet, Apache Feather).
All updates to the repository should be done locally using `git commit` (or RStudio) and not the GitHub website, since local commands allow the use of our git prehooks, which provide automated semantic versioning and R documentation updates.
When using `git commit`, a script will be run to increase the version number and to update the date and the R documentation. Note: This only works on Unix-like systems, such as macOS and Linux.
To set this up, run this command once when working locally in the repository:
```bash
git config --local core.hooksPath ".github/prehooks"
```
Now, when using `git commit`:
```bash
git commit -am "test commit"
# Running prehook...
# >> Updating R documentation...
# >> done.
# >>
# >> Updating semantic versioning and date...
# >> - latest tag is 'v1.8.1', with 26 previous commits
# >> - AMR pkg version set to 1.8.1.9027
# >> - updated DESCRIPTION
# >> - updated NEWS.md
# >>
# [main 300b93e] (v1.8.1.9027) test commit
# 3 files changed, 3 insertions(+), 4 deletions(-)
```
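Judging from the sample output above (tag `v1.8.1`, 26 previous commits, resulting version `1.8.1.9027`), the development version appears to be the release version followed by 9000 plus the commit count plus one. The R sketch below only reproduces that arithmetic. The function name `dev_version()` is hypothetical, and the formula is inferred from the log output, not taken from the hook script itself.

```r
# Hypothetical helper; the formula is an assumption inferred from the sample
# log above, not copied from the prehook script.
dev_version <- function(latest_tag, previous_commits) {
  # strip the leading "v" from the tag to get the release version
  release <- sub("^v", "", latest_tag)
  # development suffix: 9000 + number of previous commits + this commit
  paste0(release, ".", 9000 + previous_commits + 1)
}

dev_version("v1.8.1", 26)
#> [1] "1.8.1.9027"
```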
To circumvent the checks, you can use the argument `--no-verify` (or `-n` for short) with `git commit`, or add the text "no-check" or "no-verify" to the commit message. This is useful for releasing new versions, since otherwise the version number in `DESCRIPTION` and `NEWS.md` would be overwritten.
```bash
# run checks:
git commit -am "small website fix"
# skip checks:
git commit -am "small website fix (no-checks)"
git commit -am "small website fix (no-verify)"
# note: -m must come last in a combined flag, since it takes the message as its argument
git commit -anm "small website fix"
git commit --no-verify -am "small website fix"
```
In RStudio, where the `git commit` command runs in the background, the most convenient way is to add "no-checks" to the commit message.
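To illustrate the commit-message convention, here is a minimal R sketch of how such a message could be tested for the skip keywords. The function `skip_checks()` is hypothetical and the substring matching is an assumption; the actual prehook may detect the keywords differently.

```r
# Hypothetical helper, not part of the repository: TRUE when a commit message
# contains one of the keywords that skip the prehook checks.
skip_checks <- function(commit_message) {
  # "no-check" also matches "no-checks", as used in the examples above
  grepl("no-check|no-verify", commit_message)
}

skip_checks(c(
  "small website fix",
  "small website fix (no-checks)",
  "small website fix (no-verify)"
))
#> [1] FALSE  TRUE  TRUE
```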
The website (https://msberends.github.io/AMR) will be generated automatically if changes are pushed to the `main` branch. This is done using GitHub Actions, and the workflow file can be found here: `.github/workflows/website.yaml`. The website generation will be done in the latest Ubuntu LTS version and the current release version of R.
The website will be stored in the `gh-pages` branch.
Since a GitHub Action uses `git pull` to retrieve the repo contents, timestamps of files will not be preserved. This is a problem, since the ‘Data Set for Download’ vignette (https://msberends.github.io/AMR/articles/datasets.html) relies on timestamps to let users know when a data set was last updated. For this reason, additional code was added to the GitHub Actions workflow file.
This repository contains five GitHub Actions workflow files, each for a different purpose:
File | Runs when | Runs on | Purpose |
---|---|---|---|
`check.yaml` | Every day at 1 AM; after every push to any branch | Ubuntu 22.04 (R 3.0 to R-devel); latest Windows (R 3.6 to R-devel); latest macOS (R 3.6 to R-devel) | Run `R CMD check`, including all unit tests |
`check-pr.yaml` | In every pull request, including updates (not if author is repo member/owner) | Ubuntu 22.04 (R-release and R-devel); latest Windows (R-release and R-devel); latest macOS (R-release and R-devel) | Run `R CMD check`, including all unit tests |
`codecovr.yaml` | After every push to any branch; in every pull request, including updates | Latest Ubuntu (R-release) | Check code coverage and upload to http://codecov.io/gh/msberends/AMR |
`lintr.yaml` | After every push to any branch; in every pull request, including updates | Latest Ubuntu (R-release) | Check coding style according to Tidyverse convention |
`website.yaml` | After every push to the `main` branch | Latest Ubuntu (R-release) | Create website from scratch, with all examples |
Please read the separate Wiki page Add or Update a Language for Translation.
This process is also covered when committing a change, since `data-raw/_pre_commit_hook.R` contains the full workflow to update language files.
After updating these guidelines, be sure to add the new version numbers to `R/aa_globals.R`.
The clinical breakpoints from EUCAST and CLSI are stored in the data set `clinical_breakpoints`. To update this data set to include the latest guidelines, follow the instructions in `data-raw/reproduction_of_clinical_breakpoints.R`. There is no need to update the documentation manually, since all values in the documentation that refer to the `clinical_breakpoints` data set are parametrised (such as the names of included guidelines). Running `devtools::document()` is sufficient, though this is also part of the pre-commit hook.
This script will incorporate the last 10 years of the CLSI and the EUCAST guidelines.
Be sure to compare some results with the original files (e.g., those from EUCAST) to verify that everything works as expected! For example, run scripts like this:
```r
test_mics <- as.mic(c(0.256, 0.5, 1, 2, 4, 8, 16, 32, 64))
as.sir(test_mics, mo = "Escherichia coli", ab = "ciprofloxacin", guideline = "EUCAST")
as.sir(test_mics, mo = "Escherichia coli", ab = "ciprofloxacin", guideline = "CLSI")
as.sir(test_mics, mo = "Pseudomonas aeruginosa", ab = "ciprofloxacin", guideline = "EUCAST")
as.sir(test_mics, mo = "Pseudomonas aeruginosa", ab = "ciprofloxacin", guideline = "CLSI")
as.sir(test_mics, mo = "Streptococcus pneumoniae", ab = "amoxicillin", guideline = "EUCAST")
as.sir(test_mics, mo = "Streptococcus pneumoniae", ab = "amoxicillin", guideline = "CLSI")
```
These rules are inside the Clinical Breakpoints tables from EUCAST and are only available via their MS Excel and PDF files; we have found no other source in a machine-readable format such as TXT or CSV. The rules (in the Notes sections of each page/sheet) must be added manually to `data-raw/eucast_rules.tsv`, although most rules can be copied from an earlier version of that file.
Expert rules from EUCAST are only available via their MS Excel and PDF files; we have found no other source in a machine-readable format such as TXT or CSV. The rules must be added manually to `data-raw/eucast_rules.tsv`, although most rules can be copied from an earlier version of that file.
Be sure to update the version numbers in `R/data.R` and `R/aa_globals.R` afterwards.
EUCAST dosage guidelines are stored in the data set `dosage`. As of 2022, EUCAST only distributes its dosing guidelines as PDF files. Adobe Acrobat is required to transform them to an Excel file. Follow the instructions in `data-raw/reproduction_of_dosage.R` to automatically update the data set.
Be sure to update the version numbers in `R/data.R` and `R/aa_globals.R` afterwards.
The microbial taxonomy is stored in the data set `microorganisms`. Updating this data set is almost 100% automated and can be done following the instructions in `data-raw/reproduction_of_microorganisms.R`. Note that it is required to download the full GBIF data set, which requires at least 10 GB of RAM to read into R.
Downloading data from LPSN requires an account. Registration is free and easy: visit https://lpsn.dsmz.de/downloads and click on Register at the bottom of the form.
There are a lot of unit tests in place to check its integrity after updating, but running a few manual checks never hurts:
```r
as.mo("E. coli")
as.mo("eco")
as.mo("KLEPNE")
```
The package contains two data sets for antimicrobial agents: `antibiotics` and `antivirals`.
The antibiotic agents are stored in the data set `antibiotics`. To update this data set, follow the instructions in `data-raw/reproduction_of_antibiotics.R`. This script is not fully automated and requires some manual work. The parts to update DDDs and ATC codes are fully automated, though.
The antiviral agents are stored in the data set `antivirals`. To update this data set, follow the instructions in `data-raw/reproduction_of_antivirals.R`. This script is fully automated.
The `data-raw` folder contains all scripts and git history required for any other maintenance task, such as updating data or finding out about the package development history.
The `AMR` package makes extensive use of S3, with self-defined data types that also support other packages. Read about the S3 object system of R in the free Advanced R book by Hadley Wickham.
In short, S3 allows adding new data types (called classes) to a package, to extend e.g. `character` and `Date`. To add a new class `labnumber` as an extension of `double`, the basis works like this:
```r
x <- c(20220001, 20220002, 20220003)
class(x) <- c("labnumber", "double")
# now print the object:
print(x)
#> [1] 20220001 20220002 20220003
#> attr(,"class")
#> [1] "labnumber" "double"
```
Now we add an S3 extension for `print()` to the package:
```r
#' @export
print.labnumber <- function(x, ...) {
  x <- as.character(x)
  print(paste0("LAB-", substr(x, 1, 4), "-", substr(x, 5, 8)),
        quote = FALSE)
}
```
Which results in:
```r
print(x)
#> [1] LAB-2022-0001 LAB-2022-0002 LAB-2022-0003
```
The `AMR` package contains 6 new classes (data types) using S3 extensions that users can create themselves with an `as.xxx()` function:
Class | Created with | Extension of | Full object class | Purpose | Defined in file |
---|---|---|---|---|---|
`ab` | `as.ab()` | `character` | `c("ab", "character")` | Printing of antibiotic and antimycotic codes, ensuring integrity of antimicrobial codes | `R/ab.R` |
`av` | `as.av()` | `character` | `c("av", "character")` | Printing of antivirals, ensuring integrity of antiviral codes | `R/av.R` |
`disk` | `as.disk()` | `integer` | `c("disk", "integer")` | Cleaning of disk diffusion values, and printing, assigning, extracting them | `R/disk.R` |
`mic` | `as.mic()` | `factor` | `c("mic", "ordered", "factor")` | Cleaning of MIC values, using mathematical operators with them (over 80 extensions, such as `>`, `mean`, `log2`), and printing, assigning, extracting them | `R/mic.R` |
`mo` | `as.mo()` | `character` | `c("mo", "character")` | Cleaning of microbial codes and names, and printing, assigning, extracting them | `R/mo.R` |
`sir` | `as.sir()` | `factor` | `c("sir", "ordered", "factor")` | Interpreting and cleaning to SIR values, and printing, assigning, extracting them | `R/sir.R` |
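The "printing, assigning, extracting" entries in the table above typically translate into a handful of S3 methods per class. As a minimal sketch, the `labnumber` example from earlier is extended below with an `as.xxx()`-style constructor and a `[` method. All names are hypothetical and this is not how the AMR classes themselves are implemented; it only illustrates the pattern. Note that `"numeric"` is used as the base class name here.

```r
# Hypothetical constructor in the style of the as.xxx() functions
as.labnumber <- function(x) {
  x <- as.double(x) # also strips any existing attributes
  invalid <- !is.na(x) & (x < 10000000 | x > 99999999)
  if (any(invalid)) {
    stop("lab numbers must have exactly 8 digits", call. = FALSE)
  }
  structure(x, class = c("labnumber", "numeric"))
}

# Extracting: the default `[` drops the class attribute, so restore it
`[.labnumber` <- function(x, ...) {
  as.labnumber(NextMethod())
}

# Printing: format() builds the text, print() only displays it
format.labnumber <- function(x, ...) {
  chr <- as.character(unclass(x))
  paste0("LAB-", substr(chr, 1, 4), "-", substr(chr, 5, 8))
}
print.labnumber <- function(x, ...) {
  print(format(x), quote = FALSE)
  invisible(x)
}

x <- as.labnumber(c(20220001, 20220002, 20220003))
x[2:3]
#> [1] LAB-2022-0002 LAB-2022-0003
```

Without the `[` method, `x[2:3]` would return a plain numeric vector and print as ordinary numbers.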
Additionally, the `AMR` package contains 5 classes that are used internally and do not have an `as.xxx()` function:
Class | Created with | Extension of | Full object class | Purpose | Defined in file |
---|---|---|---|---|---|
`ab_selector` | antibiotic selectors, such as `carbapenems()` | `character` | `c("ab_selector", "character")` | Selecting/filtering of antibiotic columns in data | `R/ab_selectors.R` |
`ab_selector_any_all` | N/A | `logical` | `c("ab_selector_any_all", "logical")` | Using `==`, `!=`, `any()` and `all()` on antibiotic selectors | `R/ab_selectors.R` |
`bug_drug_combinations` | `bug_drug_combinations()` | `data.frame` | At least `c("bug_drug_combinations", "data.frame")` but might inherit other classes, such as `tbl_df` of tibbles | Printing and formatting the result of `bug_drug_combinations()` | `R/bug_drug_combinations.R` |
`custom_eucast_rules` | `custom_eucast_rules()` | `list` | `c("custom_eucast_rules", "list")` | Concatenating and printing custom EUCAST rules | `R/custom_eucast_rules.R` |
`custom_mdro_guideline` | `custom_mdro_guideline()` | `list` | `c("custom_mdro_guideline", "list")` | Concatenating and printing custom MDRO rules | `R/mdro.R` |
The `AMR` package also extends foreign packages, by providing S3 methods for functions of those packages. Usually, these functions would have to be imported, but since the `AMR` package is designed to be independent of any other package, the S3 extensions are loaded after the `AMR` package is loaded, as defined in `R/zzz.R`. The most important benefit is that even if those foreign packages no longer exist, the `AMR` package will work in exactly the same way, without CRAN complaining about incompatible support. This greatly improves the durability of our package.
Currently extended packages are `cleaner`, `ggplot2`, `pillar`, `skimr`, and `vctrs`. For that reason, these are also listed in the `Enhances` field of the `DESCRIPTION` file.
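The idea of attaching S3 methods to foreign generics without importing them can be sketched with base R's `registerS3method()`. The helper `register_if_available()` below is hypothetical and is not the code used in `R/zzz.R`. To keep the example runnable without extra packages, the `base` package and its `toString()` generic stand in for a foreign package and its generic.

```r
# Hypothetical helper (not the AMR implementation): register an S3 method for
# a generic owned by another package, but only if that package is installed.
register_if_available <- function(pkg, generic, class, method) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE)) # foreign package missing: silently skip
  }
  registerS3method(generic, class, method, envir = asNamespace(pkg))
  invisible(TRUE)
}

# Method for the hypothetical 'labnumber' class; the underscore in its name
# means it is only found through registration, not through its name
toString_labnumber <- function(x, ...) {
  chr <- as.character(unclass(x))
  paste0("LAB-", substr(chr, 1, 4), "-", substr(chr, 5, 8), collapse = ", ")
}

# In a package, calls like these would be placed in .onLoad().
# 'base' and toString() are stand-ins for a foreign package and generic.
ok <- register_if_available("base", "toString", "labnumber", toString_labnumber)
skipped <- register_if_available("aPackageThatIsNotInstalled", "some_generic",
                                 "labnumber", toString_labnumber)

x <- structure(c(20220001, 20220002), class = c("labnumber", "numeric"))
toString(x)
```

Because the registration is skipped when the foreign package is absent, loading the package never fails on a missing optional dependency.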
Foreign package | Foreign package function | Additional (input) class | Defined for class | Defined in |
---|---|---|---|---|
`pillar` | `pillar_shaft()` | | `ab` | `R/ab.R` |
`pillar` | `pillar_shaft()` | | `av` | `R/av.R` |
`pillar` | `pillar_shaft()` | | `mo` | `R/mo.R` |
`pillar` | `pillar_shaft()` | | `sir` | `R/sir.R` |
`pillar` | `pillar_shaft()` | | `mic` | `R/mic.R` |
`pillar` | `pillar_shaft()` | | `disk` | `R/disk.R` |
`pillar` | `type_sum()` | | `ab` | `R/ab.R` |
`pillar` | `type_sum()` | | `av` | `R/av.R` |
`pillar` | `type_sum()` | | `mo` | `R/mo.R` |
`pillar` | `type_sum()` | | `sir` | `R/sir.R` |
`pillar` | `type_sum()` | | `mic` | `R/mic.R` |
`pillar` | `type_sum()` | | `disk` | `R/disk.R` |
`cleaner` | `freq()` | | `mo` | `R/mo.R` |
`cleaner` | `freq()` | | `sir` | `R/sir.R` |
`skimr` | `get_skimmers()` | | `mo` | `R/mo.R` |
`skimr` | `get_skimmers()` | | `sir` | `R/sir.R` |
`skimr` | `get_skimmers()` | | `mic` | `R/mic.R` |
`skimr` | `get_skimmers()` | | `disk` | `R/disk.R` |
`ggplot2` | `autoplot()` | | `sir` | `R/sir.R` |
`ggplot2` | `autoplot()` | | `mic` | `R/mic.R` |
`ggplot2` | `autoplot()` | | `disk` | `R/disk.R` |
`ggplot2` | `autoplot()` | | `resistance_predict` | `R/resistance_predict.R` |
`ggplot2` | `fortify()` | | `sir` | `R/sir.R` |
`ggplot2` | `fortify()` | | `mic` | `R/mic.R` |
`ggplot2` | `fortify()` | | `disk` | `R/disk.R` |
`vctrs` | `vec_ptype2()` | `character` | `ab_selector` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `ab_selector` | `character` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `character` | `ab_selector` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `logical` | `ab_selector_any_all` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `ab_selector_any_all` | `logical` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `logical` | `ab_selector_any_all` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `character` | `ab` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `ab` | `character` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `character` | `ab` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `ab` | `character` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `character` | `av` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `av` | `character` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `character` | `av` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `av` | `character` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `character` | `mo` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `mo` | `character` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `character` | `mo` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `mo` | `character` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `integer` | `disk` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `disk` | `integer` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `integer` | `disk` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `disk` | `integer` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `double` | `disk` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `disk` | `double` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `character` | `disk` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `disk` | `character` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `character` | `mic` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `double` | `mic` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `mic` | `character` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `mic` | `double` | `R/vctrs.R` |
`vctrs` | `vec_math()` | | `mic` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `character` | `sir` | `R/vctrs.R` |
`vctrs` | `vec_ptype2()` | `sir` | `character` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `character` | `sir` | `R/vctrs.R` |
`vctrs` | `vec_cast()` | `sir` | `character` | `R/vctrs.R` |
AMR (for R). Developed at the University of Groningen in collaboration with non-profit organisations Certe Medical Diagnostics and Advice Foundation and University Medical Center Groningen.