diff --git a/CHANGES.md b/CHANGES.md index 08ce7f5..651d5fe 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -1,5 +1,59 @@ ## Change log +### Version 1.0.1 + +Updates (mostly) required to run the OSPAR 2024 CEMP assessment. + +#### Data import + +For OSPAR and HELCOM style assessments, data from Germany are now matched to stations by name for 2023 onwards. This applies to biota, sediment and water. Note that for HELCOM, biota data from Germany are already matched by name for all years. + +#### Uncertainty processing + +harsat 1.0.0 replaced implausibly large relative uncertainties ($>=$ 100%) with imputed values. However, implausibly small relative uncertainties were not changed. The code now also replaces relative uncertainties $<=$ 1% with imputed values. + +The defaults can be changed using `control$relative_uncertainty` in `read_data`. To replicate the defaults in harsat 1.0.0, set `control$relative_uncertainty = c(0, 100)`. To keep all uncertainties, regardless of how ridiculous they are, set `control$relative_uncertainty = c(0, Inf)`. + +Two minor bug fixes: + +* relative uncertainties were being filtered for all distributional types, but this is only a reliable procedure for determinands with `distribution == "lognormal"`; the checks are now only applied to lognormal data +* some biological effect data with distributions other than normal or lognormal were being incorrectly deleted; this has now been corrected + +The oddity files have been updated to show: + +* implausible_uncertainties_reported.csv - all reported uncertainties that are replaced by imputed values +* missing_uncertainties.csv - all uncertainties (normal or lognormal data) that are not reported and can't be imputed +* implausible_uncertainties_calculated.csv - all uncertainties calculated during the data processing (e.g. during normalisation) that are implausible and are set to missing + +#### Uncertainty coefficients + +The function `ctsm_uncrt_workup` and related supporting functions are used in OSPAR assessments to update the fixed and proportional standard deviations which are subsequently used to impute missing uncertainties. These functions were ignored during the initial development of harsat and are now harsat compatible. + +#### Biological effect assessments + +Imposex assessments: these are now fully reproducible, with seeds for random number generation provided in the calls to `ctsm.VDS.cl` and `assess_imposex`. + +Assessment functions for negative binomial data have been added. Negative binomial data include MNC - the number of micronucleated cells. + +#### Reporting + +`report_assessment` generates default file names. These are based on the series identifier with additional station information. It is now possible to override this behaviour for a single report by providing a different file name using the `output_file` argument. + +#### Reference tables + +* new values added to the method_extraction table + +#### Minor bug fixes + +* correct behaviour of argument `return_early` in `create_timeseries` +* pass `info` component of the harsat object to `determinand.link.sum`, `determinand.link.replace`, and `determinand.link.imposex` +* ensure early return from `ctsm_convert_basis` when there is nothing to convert (avoids issues e.g.
when all the data are biological effects) +* ensure SURVT (in pargroup B-BIO) is recognised as a biological effect in `ctsm_get_datatype` (SURVT is the only determinand in this pargroup that isn't an auxiliary variable) +* pass `good_status` to assessment functions for data with distributions other than normal and lognormal +* trap pathological case in estimation of `prtrend`; see #436 +* ensure `ctsm_OHAT_legends` uses the symbology as specified in `write_summary_table` + + ### Version 1.0.0 - Initial public release diff --git a/DESCRIPTION b/DESCRIPTION index 427ade9..2ff5613 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: harsat Title: Harmonized Regional Seas Assessment Tool -Version: 1.0.0 +Version: 1.0.1 Authors@R: c( person(given = "Arctic Monitoring and Assessment Programme (AMAP)", email = "amap@amap.no", role = c("cph", "fnd", "aut")), person(given = "Helsinki Commission (HELCOM)", email = "secretariat@helcom.fi", role = c("cph", "fnd", "aut")), @@ -22,26 +22,28 @@ RoxygenNote: 7.2.3 VignetteBuilder: knitr, rmarkdown Depends: R (>= 4.2.1.0) Imports: + digest, + dplyr, flexsurv, + lattice, lme4, + lubridate, + magrittr, + MASS, mgcv, + mvtnorm, numDeriv, optimx, parallel, pbapply, - MASS, - mvtnorm, - dplyr, - magrittr, - lattice, - lubridate, readr, + readxl, + sf, stringr, + survival, + TeachingDemos, tibble, - tidyr, - sf, - digest, - readxl + tidyr Suggests: knitr, rmarkdown, diff --git a/R/assessment_functions.R b/R/assessment_functions.R index ffa10ce..e9597c8 100644 --- a/R/assessment_functions.R +++ b/R/assessment_functions.R @@ -836,6 +836,7 @@ assess_lmm <- function( AC = AC, recent.years = recent.years, determinand = determinand, + good_status = good.status, max.year = max.year, recent.trend = recent.trend, nYearFull = nYearFull, @@ -978,9 +979,9 @@ assess_lmm <- function( contrast.whole <- ctsm.lmm.contrast(fit, start = min(data$year), end = max(data$year)) row.names(contrast.whole) <- "whole" - start.year <- max(max.year - recent.trend + 1, min(data$year)) - if (sum(unique(data$year) >= start.year - 0.5) >= 5) { - contrast.recent <- ctsm.lmm.contrast(fit, start = start.year, end = max(data$year)) + start_recent <- max(max.year - recent.trend + 1, min(data$year)) + if (sum(unique(data$year) >= start_recent - 0.5) >= 5) { + contrast.recent <- ctsm.lmm.contrast(fit, start = start_recent, end = max(data$year)) row.names(contrast.recent) <- "recent" contrast.whole <- rbind(contrast.whole, contrast.recent) } @@ -1065,15 +1066,44 @@ assess_lmm <- function( if (output$method %in% c("linear", "smooth")) { - # for linear trend and recent trend, use pltrend (from likelihood ratio test) if - # method = "linear", because a better test - # really need to go into profile likelihood territory here! 
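A minimal standalone sketch of the contrast test reworked below (illustrative only, not part of the patch; `wald_p` is a hypothetical helper mirroring the behaviour of `ctsm.lmm.contrast`, where the p value comes from comparing t^2 to an F(1, dfResid) distribution):

```r
library(dplyr)

# Wald-type p value for a fitted contrast (hypothetical helper)
wald_p <- function(contrast, se_contrast, df_resid) {
  # guard the pathological 0/0 case: a 'flat' model fitted to wholly
  # censored data gives contrast = 0 and se = 0, so return p = 1
  # (no evidence of change) rather than NaN
  if (near(contrast, 0) && near(se_contrast, 0)) {
    return(1)
  }
  t_stat <- contrast / se_contrast
  1 - pf(t_stat^2, 1, df_resid)
}

wald_p(0.3, 0.1, 12)  # ordinary case: p is about 0.011
wald_p(0, 0, 12)      # pathological case: returns 1 rather than NaN
```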
+ # pltrend + # method = "linear" use p_linear (from likelihood ratio test) + # method = "smooth" use p from the Wald test in contrasts + # for linear model, likelihood ratio test is a better test (fewer + # approximations) than the Wald test + # for smooth model, would be better to go into profile likelihood + # territory (future enhancement) + + # prtrend + # same approach; however p_linear could be misleading when the years at + # the end of the time series are all censored values and a flat model is + # fitted; the estimate of rtrend is shrunk to reflect this, but p_linear + # might be misleadingly significant; something to think about in the + # future + # however, there is a pathological case when all the fitted values in the + # recent period have the same value; rtrend is zero, and yet can still be + # significant based on p_linear even though there are no data to support + # this; in this case use p from the Wald test (which is unity) + + if (output$method == "linear") { + pltrend <- p_linear + } else { + pltrend <- output$contrasts["whole", "p"] + } - pltrend <- if (output$method == "linear") p_linear else with(output$contrasts["whole", ], p) ltrend <- with(output$contrasts["whole", ], estimate / (end - start)) if ("recent" %in% row.names(output$contrasts)) { - prtrend <- if (output$method == "linear") p_linear else with(output$contrasts["recent", ], p) + + if ( + output$method == "linear" & + max(data$year[data$censoring %in% ""]) > start_recent + ) { + prtrend <- p_linear + } else { + prtrend <- output$contrasts["recent", "p"] + } + rtrend <- with(output$contrasts["recent", ], estimate / (end - start)) } } @@ -1292,8 +1322,17 @@ ctsm.lmm.contrast <- function(ctsm.ob, start, end) { wk <- t(wk) %*% ctsm.ob$Xpred[pos, ] se.contrast <- sqrt(wk %*% ctsm.ob$vcov %*% t(wk)) - t.stat <- contrast / se.contrast - p.contrast <- 1 - pf(t.stat^2, 1, ctsm.ob$dfResid) + # catch pathological case where contrast = 0 and se.contrast = 0 + # this can happen if all the data between start and end are censored, so + # a 'flat' model is fitted + + if (dplyr::near(contrast, 0L) & dplyr::near(se.contrast, 0L)) { + p.contrast <- 1 + } else { + t.stat <- contrast / se.contrast + p.contrast <- 1 - pf(t.stat^2, 1, ctsm.ob$dfResid) + } + data.frame(start, end, estimate = contrast, se = se.contrast, p = p.contrast) } @@ -1642,8 +1681,8 @@ ctsm_dyear <- function( # Other distributions ---- assess_survival <- function( - data, annualIndex, AC, recent.years, determinand, max.year, recent.trend, - nYearFull, firstYearFull) { + data, annualIndex, AC, recent.years, determinand, good_status, max.year, + recent.trend, nYearFull, firstYearFull) { # silence non-standard evaluation warnings .data <- est <- lcl <- ucl <- p <- se <- NULL @@ -1741,11 +1780,6 @@ assess_survival <- function( data$year_adj <- data$year - min(recent.years) - # establish other info - - good_status <- ctsm_get_info(info$determinand, determinand, "good_status") - - # type of fit depends on number of years: # nYear <= 2 none # nYear <= 4 mean @@ -1755,7 +1789,7 @@ assess_survival <- function( # have only currently coded for mean and linear - look at ctsm.anyyear.lmm for # extensions to smoothers - if (determinand %in% c("NRR", "SURVT") & nYear >= 7) { + if (determinand %in% c("NRR", "SURVT") & nYear >= 8) { stop("time series too long: need to include code for smoothers") } @@ -1786,7 +1820,7 @@ assess_survival <- function( # mean model fits$mean <- flexsurv::flexsurvreg( - Surv(time, time2, type = "interval2") ~ 1, + survival::Surv(time, time2, 
type = "interval2") ~ 1, dist = surv_dist, data = data ) @@ -2210,8 +2244,8 @@ assess_survival_refvalue <- function( assess_beta <- function( - data, annualIndex, AC, recent.years, determinand, max.year, recent.trend, - nYearFull, firstYearFull) { + data, annualIndex, AC, recent.years, determinand, good_status, max.year, + recent.trend, nYearFull, firstYearFull) { # silence non-standard evaluation warnings info <- weight <- NULL @@ -2258,11 +2292,7 @@ assess_beta <- function( data$year_fac <- factor(data$year) - # establish other info - - good_status <- ctsm_get_info(info$determinand, determinand, "good_status") - - + # type of fit depends on number of years: # nYear <= 2 none # nYear <= 4 mean @@ -2610,12 +2640,12 @@ assess_beta <- function( assess_negativebinomial <- function( - data, annualIndex, AC, recent.years, determinand, max.year, recent.trend, - nYearFull, firstYearFull) { - + data, annualIndex, AC, recent.years, determinand, good_status, max.year, + recent.trend, nYearFull, firstYearFull) { + # silence non-standard evaluation warnings info <- weight <- NULL - + # over-dispersed count data (perhaps very low over-dispersed values from a # binomial distribution, such an MNC) @@ -2631,11 +2661,26 @@ assess_negativebinomial <- function( output <- list(data = data) - # set up offset - e.g. for MNC these are the number of individuals - # specified in MNc-QC-NR + # check all values are valid counts + # response currently expressed as numbers per 1000 cells + + data$response <- data$response * data[["MNC-QC-NR"]] / 1000 - if (!("offset" %in% names(data))) { - data$offset <- 1 + if (!(all(data$response >= 0) & + isTRUE(all.equal(data$response, as.integer(data$response))))) { + stop("invalid values for negative binomial distribution data") + } + + + # set up offset + # for MNC these are the number of cells specified in MNC-QC-NR (but note that + # the offset is then log transformed in the call to gam - this should be + # rationalised) + + if ("offset" %in% names(data)) { + data$offset <- log(data$offset / 1000) + } else { + data$offset <- 0 } @@ -2648,10 +2693,6 @@ assess_negativebinomial <- function( data$year_fac <- factor(data$year) - # establish other info - - good_status <- ctsm_get_info(info$determinand, determinand, "good_status") - # type of fit depends on number of years: # nYear <= 2 none @@ -2662,7 +2703,7 @@ assess_negativebinomial <- function( # have only currently coded for mean and linear - look at ctsm.anyyear.lmm for # extensions to smoothers - if (nYear >= 3) { + if (nYear >= 7) { stop("time series too long: need to include code for smoothers") } @@ -2681,9 +2722,9 @@ assess_negativebinomial <- function( fits$mean <- mgcv::gam( response ~ 1 + s(year_fac, bs = "re"), - weights = weight, - data = data, - family = "betar", + data = data, + offset = data$offset, + family = "nb", method = "ML" ) @@ -2788,7 +2829,7 @@ assess_negativebinomial <- function( } - # get estimated change in logit value over whole time series and in the + # get estimated change in log value over whole time series and in the # most recent # e.g. 
twenty years of monitoring (truncate when data missing # and only compute if at least five years in that period) # NB p value from contrast is NOT the same as from likelihood ratio test even @@ -2825,8 +2866,8 @@ assess_negativebinomial <- function( output$reference.values <- lapply(AC, function(i) { ctsm.lmm.refvalue( output, - yearID = max(data$year), - refvalue = qlogis(i / 100), + year = max(data$year), + refvalue = log(i), lower.tail = switch(good_status, low = TRUE, high = FALSE) ) }) @@ -2916,14 +2957,20 @@ assess_negativebinomial <- function( }) else { meanLY <- tail(output$pred$fit, 1) - meanLY <- 100 * plogis(meanLY) + meanLY <- exp(meanLY) clLY <- switch( good_status, low = tail(output$pred$ci.upper, 1), high = tail(output$pred$ci.lower, 1) ) - clLY <- 100 * plogis(clLY) + clLY <- exp(clLY) } + + # turn trends into 'percentage trends' + + ltrend <- ltrend * 100 + rtrend <- rtrend * 100 + }) if (!is.null(AC)) { @@ -2949,7 +2996,7 @@ assess_negativebinomial <- function( else if (rtrend >= 0) bigYear else { - wk <- (qlogis(value / 100) - qlogis(meanLY / 100)) / rtrend + wk <- (exp(value) - exp(meanLY)) / rtrend wk <- round(wk + maxYear) min(wk, bigYear) } @@ -2963,7 +3010,7 @@ assess_negativebinomial <- function( else if (rtrend <= 0) bigYear else { - wk <- (qlogis(value / 100) - qlogis(meanLY / 100)) / rtrend + wk <- (exp(value) - exp(meanLY)) / rtrend wk <- round(wk + maxYear) min(wk, bigYear) } @@ -2996,5 +3043,3 @@ assess_negativebinomial <- function( rownames(output$summary) <- NULL output } - - diff --git a/R/import_check_functions.R b/R/import_check_functions.R index 3f2187b..8b2214a 100644 --- a/R/import_check_functions.R +++ b/R/import_check_functions.R @@ -8,7 +8,6 @@ ctsm_check_variable <- function(data, var_id, info) { return(data) } - # augment data with four variables: # ok says whether original value is ok and should be retained # ok.delete says whether original value is valid but is not to be used in the @@ -59,7 +58,7 @@ ctsm_check_variable <- function(data, var_id, info) { stop( "Not all cases considered when checking '", var_id, "': see '", outfile_name, "'\n", - "You might need to contact the HARSAT development team to fix this.", + " You might need to contact the HARSAT development team to fix this." ) } diff --git a/R/import_functions.R b/R/import_functions.R index 31608b3..12faa89 100644 --- a/R/import_functions.R +++ b/R/import_functions.R @@ -19,9 +19,8 @@ library(readxl) #' @param data_dir The directory where the data files can be found (sometimes #' supplied using 'file.path'). Defaults to "."; i.e. the working directory. #' @param data_format A string specifying whether the data were extracted from -#' the ICES webservice ("ICES" - the default) or are in the simplified format -#' designed for other data sources ("external"). The value "ICES_old" is -#' deprecated. +#' the ICES webservice (`"ICES"` - the default) or are in the simplified +#' format designed for other data sources (`"external"`). #' @param info_files A list of files specifying reference tables which override #' the defaults. See examples. #' @param info_dir The directory where the reference tables can be found @@ -65,13 +64,20 @@ library(readxl) #' `retain == FALSE` are deleted later in `tidy_data` #' * `stations` #' -#' ## Control parameters +#' @details #' -#' Many aspects of the assessment process can be controlled through the -#' parameters stored in `info$control`. 
This is a list populated with default -#' values which can then be overwritten, if required, using the `control` -#' argument. #' +#' ## Control parameters #' +#' Many aspects of the assessment process can be controlled through the +#' parameters stored in `info$control`. This is a list populated with default +#' values which can then be overwritten, if required, using the `control` +#' argument. +#' +#' ## External data +#' +#' If `data_format = "external"`, a simplified data and station file can +#' be supplied. See `vignette("external-file-format")` for details. +#' #' @export read_data <- function( compartment = c("biota", "sediment", "water"), @@ -285,6 +291,11 @@ control_default <- function(purpose, compartment) { # use_stage is a logical which determines whether, for biota, stage is used # to populate subseries + # relative_uncertainty is a 2-vector giving the range of acceptable + # relative uncertainties for log-normally distributed data; the default is + # to accept relative uncertainties greater than (but not equal to) 1% and + # less than (but not equal to) 100% + region <- list() region$id <- switch( @@ -315,6 +326,8 @@ control_default <- function(purpose, compartment) { ) use_stage <- FALSE + + relative_uncertainty <- c(1, 100) add_stations <- switch( purpose, @@ -366,7 +379,8 @@ control_default <- function(purpose, compartment) { region = region, add_stations = add_stations, bivalve_spawning_season = bivalve_spawning_season, - use_stage = use_stage + use_stage = use_stage, + relative_uncertainty = relative_uncertainty ) } @@ -410,6 +424,17 @@ control_modify <- function(control_default, control) { ) } } + + + if (length(control$relative_uncertainty) != 2L || + control$relative_uncertainty[1] < 0 || + control$relative_uncertainty[1] > control$relative_uncertainty[2]) { + stop( + "error in control argument: invalid range of acceptable relative ", + "uncertainties", + call.
= FALSE + ) + } control } @@ -1290,25 +1315,28 @@ add_stations <- function(data, stations, info){ .id = .id | (.data$country == "France") | (.data$country == "Spain" & .data$.year > 2004) | - (.data$country == "The Netherlands" & .data$.year > 2006) + (.data$country == "The Netherlands" & .data$.year > 2006) | + (.data$country == "Germany" & .data$.year > 2022) ), sediment = dplyr::mutate( x, .id = .id | (.data$country == "France" & .data$.year > 2008) | (.data$country == "Spain" & .data$.year > 2004) | - (.data$country == "The Netherlands" & .data$.year > 2006) + (.data$country == "The Netherlands" & .data$.year > 2006) | + (.data$country == "Germany" & .data$.year > 2022) ), water = dplyr::mutate( x, .id = .id | (.data$country == "France") | (.data$country == "Spain" & .data$.year > 2004) | - (.data$country == "The Netherlands" & .data$.year > 2006) + (.data$country == "The Netherlands" & .data$.year > 2006) | + (.data$country == "Germany" & .data$.year > 2022) ), ) - # and Germany currently only matches by name for HELCOM biota + # and Germany always matches by name for HELCOM biota if (info$compartment == "biota" && info$purpose == "HELCOM") { x <- dplyr::mutate(x, .id = .id | (.data$country %in% "Germany")) @@ -2244,7 +2272,7 @@ create_timeseries <- function( ) } - + # normalisation can either be a logical (TRUE uses default normalisation function) # or a function @@ -2415,7 +2443,13 @@ create_timeseries <- function( data$pargroup <- ctsm_get_info(info$determinand, data$determinand, "pargroup") } + # NB distribution will be missing for auxiliary data + data$distribution <- ctsm_get_info( + info$determinand, data$determinand, "distribution", na_action = "output_ok" + ) + + # drop samples which only have auxiliary data ok <- with(data, sample %in% sample[group != "Auxiliary"]) @@ -2470,62 +2504,27 @@ create_timeseries <- function( # ensure censoring, limit of detection and limit of quantification are consistent - data <- ctsm_check_censoring(data, info, print_code_warnings) + data <- check_censoring(data, info, print_code_warnings) - # convert uncertainty into standard deviations, and remove any associated variables - - data <- ctsm_check( - data, - !is.na(uncertainty) & uncertainty <= 0, - action = "make.NA", - message = "Non-positive uncertainties", - file_name = "non_positive_uncertainties", - missing_id = "uncertainty", - info = info - ) - - data <- dplyr::mutate( - data, - uncertainty_sd = dplyr::case_when( - unit_uncertainty %in% "U2" ~ uncertainty / 2, - unit_uncertainty %in% "%" ~ value * uncertainty / 100, - TRUE ~ uncertainty - ), - uncertainty_rel = 100 * (uncertainty_sd / value) - ) + # ensure uncertainties are plausible + + data <- check_uncertainty(data, info, type = "reported") - wk_id <- match("unit_uncertainty", names(data)) - wk_n <- ncol(data) - data <- data[c( - names(data)[1:wk_id], - "uncertainty_sd", "uncertainty_rel", - names(data)[(wk_id+1):(wk_n-2)])] - - ctsm_check( - data, - !is.na(uncertainty) & uncertainty_rel >= 100, - action = "warning", - message = "Large uncertainties", - file_name = "large uncertainties", - info = info - ) - - # delete data with large relative uncertainties + # convert all uncertainties to unit SD + data <- dplyr::mutate( data, - uncertainty_sd = dplyr::if_else( - .data$uncertainty_rel < 100, - .data$uncertainty_sd, - NA_real_ - ), - uncertainty = .data$uncertainty_sd, - unit_uncertainty = NULL, - uncertainty_sd = NULL, - uncertainty_rel = NULL + uncertainty = dplyr::case_when( + .data$unit_uncertainty %in% "U2" ~ .data$uncertainty / 
2, + .data$unit_uncertainty %in% "%" ~ .data$value * .data$uncertainty / 100, + .default = .data$uncertainty + ), + unit_uncertainty = "SD" ) - + + # sort out determinands where several determinands represent the same variable of interest # three types of behaviour: replace, sum and bespoke @@ -2540,7 +2539,7 @@ create_timeseries <- function( bespoke = get(paste("determinand.link", i, sep = "."), mode = "function") ) - args = list(data = data, keep = i, drop = wk$det) + args = list(data = data, info = info, keep = i, drop = wk$det) if ("weights" %in% names(wk)) { args = c(args, list(weights = wk$weights)) } @@ -2568,8 +2567,6 @@ create_timeseries <- function( cat("\nCreating time series data\n") - data <- data[setdiff(names(data), c("qalink", "alabo"))] - # create new.unit and concentration columns comprising the details from the # determinand file in the information folder, required to get correct unit details @@ -2600,7 +2597,7 @@ create_timeseries <- function( # missing values for correction if (info$compartment == "biota") { - data <- ctsm.imposex.check.femalepop(data) + data <- ctsm.imposex.check.femalepop(data, info) } @@ -2610,14 +2607,9 @@ create_timeseries <- function( if (return_early) { - out = c( + out <- c( out, - ctsm.import.value( - data, - station_dictionary, - info$compartment, - info$purpose, - print_code_warnings) + output_timeseries(data, station_dictionary, info, extra = "alabo") ) return(out) @@ -2640,7 +2632,19 @@ create_timeseries <- function( } + # check that all normal and lognormal data have uncertainties + + data <- ctsm_check( + data, + distribution %in% c("normal", "lognormal") & !is.na(concentration) & + is.na(uncertainty), + action = "delete", + message = "Missing uncertainties which cannot be imputed", + file_name = "missing_uncertainties", + info = info + ) + # filter contaminant data to remove bivalve and gastropod records in the # spawning season when they are elevated / more variable @@ -2697,20 +2701,25 @@ create_timeseries <- function( data <- normalise(data, station_dictionary, info, normalise.control) } - - # remove concentrations where: - # uncertainty is missing - # uncertainty cv is > 100% - # ensure uncertainty and censoring are missing when concentration is missing + # check whether implausible uncertainties have been calculated during the + # data processing (e.g. 
during normalisation) + # if so - make concentration, uncertainty and censoring missing - ok <- !is.na(data$concentration) & !is.na(data$uncertainty) - ok <- ok & (data$uncertainty <= data$concentration) + data <- check_uncertainty(data, info, type = "calculated") + + + # final check to ensure all normal and lognormal data have an uncertainty + + notok <- data$distribution %in% c("normal", "lognormal") & + !is.na(data$concentration) & is.na(data$uncertainty) + + if (any(notok)) { + stop( + "uncertainties missing where they should be present: \n", + "contact HARSAT development team") + } - data$concentration[!ok] <- NA_real_ - data$uncertainty[!ok] <- NA_real_ - data$censoring[!ok] <- NA_character_ - # drop groups of data at stations with no data in recent years cat(" Dropping groups of compounds / stations with no data between", @@ -2725,7 +2734,7 @@ create_timeseries <- function( out <- c( out, - ctsm_import_value(data, station_dictionary, info) + output_timeseries(data, station_dictionary, info) ) out @@ -2841,7 +2850,7 @@ ctsm_check <- function( } -ctsm_import_value <- function(data, station_dictionary, info) { +output_timeseries <- function(data, station_dictionary, info, extra = NULL) { # silence non-standard evaluation warnings .data <- .group <- seriesID <- NULL @@ -2875,6 +2884,10 @@ ctsm_import_value <- function(data, station_dictionary, info) { "limit_detection", "limit_quantification", "uncertainty" ) + if (!is.null(extra)) { + id <- c(id, extra) + } + auxiliary <- ctsm_get_auxiliary(data$determinand, info) auxiliary_id <- paste0( rep(auxiliary, each = 5), @@ -3244,7 +3257,7 @@ ctsm_check_determinands <- function(info, data, determinands, control = NULL) { -determinand.link.check <- function(data, keep, drop, printDuplicates = TRUE, ...) { +determinand.link.check <- function(data, info, keep, drop, printDuplicates = TRUE, ...) { # check whether any drop and keep are both submitted for the same sample and # matrix and, if so, delete drop - note that ctsm_check doesn't do the @@ -3271,7 +3284,8 @@ determinand.link.check <- function(data, keep, drop, printDuplicates = TRUE, ... keep, "and", dropTxt, "submitted in same sample - deleting", dropTxt, "data" ), - file_name = paste("determinand_link", keep, sep = "_"), + file_name = paste("determinand_link", keep, sep = "_"), + info = info, ... ) } @@ -3280,7 +3294,7 @@ determinand.link.check <- function(data, keep, drop, printDuplicates = TRUE, ... } -determinand.link.replace <- function(data, keep, drop, ...) { +determinand.link.replace <- function(data, info, keep, drop, ...) { # core function for relabelling determinand 'drop' as determinand 'keep' # most of the work is checking that there aren't data submitted as both for the same @@ -3295,7 +3309,7 @@ determinand.link.replace <- function(data, keep, drop, ...) { # check for samples with both drop and keep and, if they exist, delete drop - data <- determinand.link.check(data, keep, drop, ...) + data <- determinand.link.check(data, info, keep, drop, ...) # relabel the levels so that drop becomes keep @@ -3307,7 +3321,7 @@ determinand.link.replace <- function(data, keep, drop, ...) { } -determinand.link.imposex <- function(data, keep, drop, ...) { +determinand.link.imposex <- function(data, info, keep, drop, ...) { stopifnot(length(keep) == 1, length(drop) == 1) @@ -3338,6 +3352,7 @@ determinand.link.imposex <- function(data, keep, drop, ...) 
{ action = "warning", message = paste("inconsistent", keep, "and", drop, "submitted in same year"), file_name = paste("determinand_link", keep, sep = "_"), + info = info, ... ) @@ -3363,7 +3378,7 @@ determinand.link.imposex <- function(data, keep, drop, ...) { determinand.link.VDS <- determinand.link.IMPS <- determinand.link.INTS <- determinand.link.imposex -determinand.link.BBKF <- function(data, keep, drop, ...) { +determinand.link.BBKF <- function(data, info, keep, drop, ...) { stopifnot( identical(keep, "BBKF"), @@ -3372,30 +3387,36 @@ determinand.link.BBKF <- function(data, keep, drop, ...) { # first sum samples with both BBF and BKF - data <- determinand.link.sum(data, "BBKF", c("BBF", "BKF")) + data <- determinand.link.sum(data, info, "BBKF", c("BBF", "BKF")) # now sum samples with both BBJF and BKF to give BBJKF - data <- determinand.link.sum(data, "BBJKF", c("BBJF", "BKF")) + data <- determinand.link.sum(data, info, "BBJKF", c("BBJF", "BKF")) # now replace BBJKF with BBKF - data <- determinand.link.replace(data, "BBKF", "BBJKF") + data <- determinand.link.replace(data, info, "BBKF", "BBJKF") data } -assign("determinand.link.LIPIDWT%", function(data, keep, drop, ...) { +assign("determinand.link.LIPIDWT%", function(data, info, keep, drop, ...) { stopifnot(identical(keep, "LIPIDWT%"), identical(sort(drop), c("EXLIP%", "FATWT%"))) # if multiple values present, choose FATWT%, then LIPIDWT%, then EXLIP% (from Foppe) - data <- determinand.link.check(data, keep = "LIPIDWT%", drop = "EXLIP%", printDuplicates = FALSE, ...) - data <- determinand.link.check(data, keep = "FATWT%", drop = "EXLIP%", printDuplicates = FALSE, ...) - data <- determinand.link.check(data, keep = "FATWT%", drop = "LIPIDWT%", printDuplicates = FALSE, ...) + data <- determinand.link.check( + data, info, keep = "LIPIDWT%", drop = "EXLIP%", printDuplicates = FALSE, ... + ) + data <- determinand.link.check( + data, info, keep = "FATWT%", drop = "EXLIP%", printDuplicates = FALSE, ... + ) + data <- determinand.link.check( + data, info, keep = "FATWT%", drop = "LIPIDWT%", printDuplicates = FALSE, ... + ) if (!any(data$determinand %in% drop)) return(data) @@ -3410,7 +3431,7 @@ assign("determinand.link.LIPIDWT%", function(data, keep, drop, ...) { }) -determinand.link.sum <- function(data, keep, drop, ...) { +determinand.link.sum <- function(data, info, keep, drop, ...) { stopifnot(length(keep) == 1, length(drop) > 1) @@ -3531,7 +3552,7 @@ determinand.link.sum <- function(data, keep, drop, ...) 
{ -determinand.link.TEQDFP <- function(data, keep, drop, weights) { +determinand.link.TEQDFP <- function(data, info, keep, drop, weights) { stopifnot(length(keep) == 1, length(drop) > 1) @@ -3648,7 +3669,7 @@ determinand.link.TEQDFP <- function(data, keep, drop, weights) { } -ctsm_check_censoring <- function(data, info, print_code_warnings) { +check_censoring <- function(data, info, print_code_warnings) { # silence non-standard evaluation warnings value <- limit_detection <- limit_quantification <- NULL @@ -3792,6 +3813,93 @@ ctsm_check_censoring <- function(data, info, print_code_warnings) { } +check_uncertainty <- function(data, info, type = c("reported", "calculated")) { + + # import_functions.R + + # uncertainties must be non-negative for all data + # uncertainties must be strictly positive for normal or lognormal data + # relative uncertainties must be within the specified range ((1, 100) by default) for + # lognormal data + + # type = reported is used for submitted data + # type = calculated is used to check whether implausible uncertainties have + # been created in e.g. the normalisation process + + type <- match.arg(type) + + + # calculate relative uncertainties for lognormal data + # use value for reported data and concentration for calculated data + + id <- switch(type, reported = "value", calculated = "concentration") + + data <- dplyr::mutate( + data, + .ok = .data$distribution %in% "lognormal", + relative_uncertainty = dplyr::case_when( + .ok & .data$unit_uncertainty %in% "SD" ~ + 100 * .data$uncertainty / .data[[id]], + .ok & .data$unit_uncertainty %in% "U2" ~ + 100 * .data$uncertainty / (2 * .data[[id]]), + .ok & .data$unit_uncertainty %in% "%" ~ .data$uncertainty, + .default = NA_real_ + ), + .ok = NULL + ) + + data <- dplyr::mutate( + data, + reason = dplyr::case_when( + .data$uncertainty < 0 ~ "negative", + .data$distribution %in% c("normal", "lognormal") & + .data$uncertainty == 0 ~ "zero", + .data$distribution %in% "lognormal" & + .data$relative_uncertainty <= info$relative_uncertainty[1] ~ "small", + .data$distribution %in% "lognormal" & + .data$relative_uncertainty >= info$relative_uncertainty[2] ~ "large", + .default = "ok" + ) + ) + + data <- dplyr::relocate( + data, + "relative_uncertainty", + .after = "unit_uncertainty" + ) + + data <- dplyr::relocate(data, "reason") + + if (type == "reported") { + message <- "Implausible uncertainties reported with data" + file_name <- "implausible_uncertainties_reported" + missing_id <- "uncertainty" + } + + if (type == "calculated") { + message <- "Implausible uncertainties calculated in data processing" + file_name <- "implausible_uncertainties_calculated" + missing_id <- c("concentration", "uncertainty", "censoring") + } + + data <- ctsm_check( + data, + reason != "ok", + action = "make.NA", + message = message, + file_name = file_name, + missing_id = missing_id, + info = info + ) + + data$reason <- NULL + data$relative_uncertainty <- NULL + + data +} + + + check_subseries <- function(data, info) { # import_functions.R @@ -4460,7 +4568,7 @@ normalise_sediment_OSPAR <- function(data, station_dictionary, info, control) { data } -#' Normalises sediment concentrations, HELCOM vwersion +#' Normalises sediment concentrations, HELCOM version #' #' @param data the data object #' @param station_dictionary the station dictionary diff --git a/R/imposex_clm.R b/R/imposex_clm.R index 2fda0b4..f9f4c47 100644 --- a/R/imposex_clm.R +++ b/R/imposex_clm.R @@ -203,7 +203,8 @@ imposex.clm.predict <- function(clmFit, theta, data) { imposex_assess_clm <-
function( - data, theta, annualIndex, species, recent.trend = 20, max.year) { + data, theta, annualIndex, species, recent.trend = 20, max.year, + seed = NULL) { # silence non-standard evaluation warnings dfResid <- twiceLogLik <- pFixed <- NULL @@ -211,6 +212,13 @@ imposex_assess_clm <- function( output <- list() summary <- list() + + # set seed for random number generation (used to obtain confidence limits on + # fitted trend) + + set.seed(seed) + + # decide whether there are sufficient years to model data # appropriate type of fit depends on total number of years and # number of years with intermediate values (i.e between 0 and max(VDS)) diff --git a/R/imposex_functions.R b/R/imposex_functions.R index 6b1dbd8..8f0e0ea 100644 --- a/R/imposex_functions.R +++ b/R/imposex_functions.R @@ -240,9 +240,14 @@ assess_imposex <- function( VDS = pmin(.data$concentration, theta$K), VDS = factor(.data$VDS, levels = 0:theta$K) ) - + + # create seed for random number generation based on combination of + # station_code and species + + seed <- TeachingDemos::char2seed(paste0(station_code, species)) + assessment <- imposex_assess_clm( - data, theta, annualIndex, species, recent.trend, max.year + data, theta, annualIndex, species, recent.trend, max.year, seed ) } else { diff --git a/R/information_functions.R b/R/information_functions.R index 2c184a5..f135ed3 100644 --- a/R/information_functions.R +++ b/R/information_functions.R @@ -710,6 +710,7 @@ ctsm_get_datatype <- function(determinand, info, abbr = FALSE){ startsWith(pargroup, "OC-") ~ "contaminant", pargroup %in% c("I-MET", "I-RNC") ~ "contaminant", pargroup %in% c("B-MBA", "B-TOX", "B-END") ~ "effect", + determinand %in% "SURVT" ~ "effect", pargroup %in% c("B-GRS", "B-HST") ~ "disease", TRUE ~ "auxiliary" ) @@ -2081,6 +2082,10 @@ ctsm_convert_basis <- function( # set up working data frame + if (all(exclude)) { + return(conc) + } + data <- data.frame( conc, from, to, drywt, drywt_censoring, lipidwt, lipidwt_censoring, exclude ) diff --git a/R/proportional_odds_functions.R b/R/proportional_odds_functions.R index 76534b4..7e8822e 100644 --- a/R/proportional_odds_functions.R +++ b/R/proportional_odds_functions.R @@ -175,6 +175,8 @@ ctsm.VDS.cl <- function(fit, nsim = 1000) { indexID <- setdiff(names(fit$par), cutsID) + set.seed(fit$seed) + data <- MASS::mvrnorm(nsim, fit$par, fit$vcov) data.cuts <- data[, cutsID, drop = FALSE] diff --git a/R/reporting_functions.R b/R/reporting_functions.R index ca35827..f181e30 100644 --- a/R/reporting_functions.R +++ b/R/reporting_functions.R @@ -316,14 +316,11 @@ ctsm.web.AC <- function(assessment_ob, classification) { drop = TRUE ) - # identity all AC that are relevant to the overall assessment - # more AC might be included in the assessment for each timeseries - this - # is a legacy issue that needs to be resolved - has arisen in looking - # at both environmental and health criteria + # identify all AC that are relevant + + AC_id <- names(classification[["below"]]) + stopifnot(AC_id %in% assessment_ob$info$AC) - AC_id <- assessment_ob$info$AC - stopifnot(AC_id %in% names(classification[["below"]])) - # loop over determinands out <- sapply(assessment_id, USE.NAMES = TRUE, simplify = FALSE, FUN = function(id) { @@ -840,6 +837,9 @@ ctsm_collapse_AC <- function(x, type = c("real", "character")) { #' @param output_dir The output directory for the assessment plots (possibly #' supplied using 'file.path'). The default is the working directory. The #' output directory must already exist.
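A minimal usage sketch of the new `output_file` argument documented below (illustrative only, not part of the patch; the assessment object `biota_assessment` and the series identifier are hypothetical placeholders):

```r
# render a single report under a custom file name; the .html extension
# is appended automatically if not supplied
report_assessment(
  biota_assessment,
  subset = series == "2299 CD Mytilus edulis SB",
  output_dir = file.path("output", "reports"),
  output_file = "cadmium_mussels_2299"
)
```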
+#' @param output_file An alternative file name to override the default. This is +#' currently only implemented for a single report. If the .html extension is not +#' supplied, it will be added. #' @param max_report The maximum number of reports that will be generated. #' Defaults to 100. Each report is about 1MB in size and takes a few seconds #' to run, so this prevents a ridiculous number of reports being created. @@ -854,6 +854,7 @@ report_assessment <- function( assessment_obj, subset = NULL, output_dir = ".", + output_file = NULL, max_report = 100L) { # reporting_functions.R @@ -867,6 +868,14 @@ report_assessment <- function( ) } + if (!is.null(output_file) & length(output_file) > 1) { + stop( + "\n`output_file` can currently only be a single character string for", + " renaming a single\nreport.", + call. = FALSE + ) + } + info <- assessment_obj$info timeSeries <- assessment_obj$timeSeries @@ -924,31 +933,50 @@ report_assessment <- function( } + # if output_file supplied, ensure there is only one series + + if (!is.null(output_file) & n_series > 1) { + stop( + "\n`output_file` can currently only be used to rename a single report", + " and ", n_series, " reports have\nbeen requested", + call. = FALSE + ) + } + + # report on each time series lapply(series_id, function(id) { - # get file name from id, and add country and station name - # for easier identification - - series <- timeSeries[id, ] - - output_id <- sub( - series$station_code, - paste(series$station_code, series$country, series$station_name), - id, - fixed=TRUE - ) + # get file name + # if not supplied, use id and add country and station name for easier + # identification - # get rid of any slashes that might have crept in - - output_id <- gsub(" / ", " ", output_id, fixed = TRUE) - output_id <- gsub("/", " ", output_id, fixed = TRUE) - - output_id <- gsub(" \ ", " ", output_id, fixed = TRUE) - output_id <- gsub("\\", " ", output_id, fixed = TRUE) - + if (!is.null(output_file)) { + + output_id <- output_file + + } else { + series <- timeSeries[id, ] + + output_id <- sub( + series$station_code, + paste(series$station_code, series$country, series$station_name), + id, + fixed=TRUE + ) + + # get rid of any slashes that might have crept in + + output_id <- gsub(" / ", " ", output_id, fixed = TRUE) + output_id <- gsub("/", " ", output_id, fixed = TRUE) + + output_id <- gsub(" \ ", " ", output_id, fixed = TRUE) + output_id <- gsub("\\", " ", output_id, fixed = TRUE) + + } + package_dir = system.file(package = "harsat") template_dir = file.path(package_dir, "markdown") report_file <- file.path(template_dir, "report_assessment.Rmd") @@ -973,17 +1001,16 @@ report_assessment <- function( #' @export ctsm_OHAT_legends <- function( - assessments, determinandGroups, regionalGroups = NULL, distanceGroups = NULL, path) { + assessments, determinandGroups, determinands, symbology, + regionalGroups = NULL, distanceGroups = NULL, path) { # silence non-standard evaluation warnings info <- NULL out <- sapply(names(assessments), simplify = FALSE, USE.NAMES = TRUE, FUN = function(media) { - assessment.ob <- assessments[[media]] - assessment <- assessment.ob$assessment - classColour <- assessment.ob$classColour - determinands <- assessment.ob$determinands + assessment <- assessments[[media]] + classColour <- symbology[[media]] regionalGroups <- regionalGroups[[media]] distanceGroups <- distanceGroups[[media]] @@ -994,16 +1021,16 @@ ctsm_OHAT_legends <- function( compartment <- assessment$info$compartment group <- ctsm_get_info( - info$determinand,
determinands, "group", compartment, sep = "_" + assessment$info$determinand, determinands, "group", compartment, sep = "_" ) web_group <- factor( group, levels = determinandGroups$levels, labels = determinandGroups$labels ) - web_group <- wk_group[, drop = TRUE] + web_group <- web_group[, drop = TRUE] - goodStatus <- ctsm_get_info(info$determinand, determinands, "good_status") + goodStatus <- ctsm_get_info(assessment$info$determinand, determinands, "good_status") goodStatus <- as.character(goodStatus) legendName <- apply(legends, 1, function(i) paste(colnames(legends)[i], collapse = " ")) diff --git a/R/uncertainty_functions.R b/R/uncertainty_functions.R index 34be926..6b947e4 100644 --- a/R/uncertainty_functions.R +++ b/R/uncertainty_functions.R @@ -1,32 +1,30 @@ #' @export -ctsm_uncrt_workup <- function(clean_data) { +ctsm_uncrt_workup <- function(harsat_obj) { # silence non-standard evaluation warnings - determinands <- qaID <- uncertainty <- concentration <- NULL - + .data <- NULL + # turn 'clean' data into uncertainty data # read in data - data <- clean_data$data - stations <- clean_data$stations - compartment <- clean_data$info$compartment + data <- harsat_obj$data + stations <- harsat_obj$stations + info <- harsat_obj$info - rm(clean_data) + rm(harsat_obj) # link to country - data$country <- stations[as.character(data$station), "country"] + data <- dplyr::left_join( + data, + stations[c("station_code", "country")], + by = "station_code" + ) - # get alabo and remove missing alabo - - data <- within(data, { - alabo <- sapply(strsplit(as.character(qaID), "_"), "[", 3) - alabo[alabo %in% "NA"] <- NA - alabo <- factor(alabo) - }) + # remove data with no analytical laboratory information data <- data[!is.na(data$alabo), ] @@ -38,23 +36,25 @@ ctsm_uncrt_workup <- function(clean_data) { id_aux <- c( "", ".uncertainty", ".censoring", ".limit_detection", ".limit_quantification" ) + - id <- intersect( - c("country", "alabo", "year", "sample", "group", "determinand", - "concentration", "uncertainty", - "censoring", "limit_detection", "limit_quantification", - paste0("AL", id_aux), - paste0("LI", id_aux), - paste0("CORG", id_aux), - paste0("LOIGN", id_aux)), - names(data) + id <- c( + "country", "alabo", "year", "sample", "group", "determinand", + "concentration", "uncertainty", + "censoring", "limit_detection", "limit_quantification", + paste0("AL", id_aux), + paste0("LI", id_aux), + paste0("CORG", id_aux), + paste0("LOIGN", id_aux) ) - data <- data[id] + + data <- dplyr::select(data, any_of(id)) + # sort out AL and CORG etc for sediment - if (compartment == "sediment") { + if (info$compartment == "sediment") { id <- c("country", "alabo", "year", "group", "sample", "determinand") @@ -115,52 +115,61 @@ ctsm_uncrt_workup <- function(clean_data) { # restrict to 'log-normally' distributed responses - - ok <- with(data, { - dist <- ctsm_get_info( - "determinand", determinand, "distribution", na_action = "output_ok" + # keep explicit mention of CORG and LOIGN just in case + + data <- dplyr::mutate( + data, + distribution = ctsm_get_info( + info$determinand, + .data$determinand, + "distribution", + na_action = "output_ok" ) - dist %in% "lognormal" | determinand %in% c("CORG", "LOIGN") - }) - - data <- data[ok, ] + ) + data <- dplyr::filter( + data, + .data$distribution %in% "lognormal" | .data$determinand %in% c("CORG", "LOIGN") + ) + # order groups and determinands within group - det_list <- determinands[[stringr::str_to_title(compartment)]] - - data <- within(data, { - group <- 
factor(as.character(group), levels = c(names(det_list), "auxiliary")) - determinand <- factor( - as.character(determinand), - levels = c(unlist(det_list), "AL", "LI", "CORG", "LOIGN")) - }) + # det_list <- determinands[[stringr::str_to_title(compartment)]] + # + # data <- within(data, { + # group <- factor(as.character(group), levels = c(names(det_list), "auxiliary")) + # determinand <- factor( + # as.character(determinand), + # levels = c(unlist(det_list), "AL", "LI", "CORG", "LOIGN")) + # }) # calculate relative uncertainty - data <- within(data, relative_u <- 100 * uncertainty / concentration) - - data <- droplevels(data) + data <- dplyr::mutate( + data, + relative_u = 100 * .data$uncertainty / .data$concentration + ) - list(compartment = compartment, data = data) + list(compartment = info$compartment, data = data) } + #' @export ctsm_uncrt_estimate <- function(data) { # silence non-standard evaluation warnings - .data <- n <- relative_u <- sd_variable <- sd_constant <- NULL + .data <- NULL # initialise output with total number of values by determinand options(dplyr.summarise.inform = FALSE) on.exit(options(dplyr.summarise.inform = NULL)) - out <- data %>% - dplyr::group_by(.data$determinand) %>% - dplyr::summarise(n_values = n()) + out <- data |> + dplyr::group_by(.data$determinand) |> + dplyr::summarise(n_values = dplyr::n()) # remove duplicate combinations of concentration and uncertainty (and associated censoring variables) @@ -177,9 +186,12 @@ ctsm_uncrt_estimate <- function(data) { # get number of 'unique values - out_unique <- data %>% - dplyr::group_by(.data$determinand) %>% - dplyr::summarise(n_unique = n(), n_alabo = dplyr::n_distinct(.data$alabo)) + out_unique <- data |> + dplyr::group_by(.data$determinand) |> + dplyr::summarise( + n_unique = dplyr::n(), + n_alabo = dplyr::n_distinct(.data$alabo) + ) out <- dplyr::left_join(out, out_unique, by = "determinand") @@ -193,16 +205,16 @@ ctsm_uncrt_estimate <- function(data) { # relative error # median relative_u for values above the detection level by alabo - out_relative <- data %>% - dplyr::filter(.data$censoring == "") %>% - dplyr::group_by(.data$determinand, .data$alabo) %>% + out_relative <- data |> + dplyr::filter(.data$censoring == "") |> + dplyr::group_by(.data$determinand, .data$alabo) |> dplyr::summarise(sd_variable = median(.data$relative_u) / 100) # now the median value across alabos - out_relative <- out_relative %>% - dplyr::group_by(.data$determinand) %>% - dplyr::summarise(sd_variable = median(sd_variable)) + out_relative <- out_relative |> + dplyr::group_by(.data$determinand) |> + dplyr::summarise(sd_variable = median(.data$sd_variable)) out <- dplyr::left_join(out, out_relative, by = "determinand") @@ -211,52 +223,53 @@ ctsm_uncrt_estimate <- function(data) { # median limit_detection for values with censoring == D, Q or "" by alabo # don't use "<" because we can't trust any of the limit values - out_constant <- data %>% - dplyr::filter(.data$censoring %in% c("D", "Q", "")) %>% - tidyr::drop_na(.data$limit_detection) %>% - dplyr::group_by(.data$determinand, .data$alabo) %>% + out_constant <- data |> + dplyr::filter(.data$censoring %in% c("D", "Q", "")) |> + tidyr::drop_na(.data$limit_detection) |> + dplyr::group_by(.data$determinand, .data$alabo) |> dplyr::summarise(sd_constant = median(.data$limit_detection) / 3) # now the median value across alabos - out_constant <- out_constant %>% - dplyr::group_by(.data$determinand) %>% - dplyr::summarise(sd_constant = median(sd_constant)) + out_constant <- out_constant |> + 
dplyr::group_by(.data$determinand) |> + dplyr::summarise(sd_constant = median(.data$sd_constant)) out <- dplyr::left_join(out, out_constant, by = "determinand") # tidy up - out <- out %>% - as.data.frame() %>% - column_to_rownames("determinand") %>% + out <- out |> + as.data.frame() |> + tibble::column_to_rownames("determinand") |> round(6) out } #' @export -ctsm_uncrt_plot_estimates <- function(uncrt_obj, old_estimates, group_id) { +ctsm_uncrt_plot_estimates <- function(uncrt_obj, group_id) { - id <- with(uncrt_obj$data, group %in% group_id) - data <- uncrt_obj$data[id, ] + data <- dplyr::filter(uncrt_obj$data, .data$group %in% group_id) - data <- data[with(data, order(determinand, concentration)), ] + data <- dplyr::arrange(data, .data$determinand, .data$concentration) + + data <- dplyr::filter(data, .data$relative_u >= 1 & .data$relative_u <= 100) - ok <- with(data, relative_u >= 1 & relative_u <= 100) - data <- data[ok, ] - new <- uncrt_obj$estimates[c("sd_constant", "sd_variable")] names(new) <- c("sdC", "sdV") - var_id <- paste(uncrt_obj$compartment, c("sd_constant", "sd_variable"), sep= ".") - old <- old_estimates[var_id] + var_id <- paste0(uncrt_obj$compartment, c("_sd_constant", "_sd_variable")) + old <- uncrt_obj$old_estimates[var_id] names(old) <- c("sdC", "sdV") xyplot( relative_u ~ concentration | determinand, data = data, aspect = 1, - scales = list(alternating = FALSE, x = list(log = TRUE, relation = "free", equispaced = FALSE)), + scales = list( + alternating = FALSE, + x = list(log = TRUE, relation = "free", equispaced = FALSE) + ), as.table = TRUE, panel = function(x, y, subscripts) { data <- data[subscripts, ] diff --git a/example_HELCOM.r b/example_HELCOM.r index 1fa69eb..0239f1a 100644 --- a/example_HELCOM.r +++ b/example_HELCOM.r @@ -154,10 +154,12 @@ write_summary_table( levels = c("Metals", "Organotins", "Organofluorines"), labels = c("Metals", "Organotins", "Organofluorines") ), - classColour = list( - below = c("EQS" = "green"), - above = c("EQS" = "red"), - none = "black" + symbology = list( + colour = list( + below = c("EQS" = "green"), + above = c("EQS" = "red"), + none = "black" + ) ), collapse_AC = list(EAC = "EQS"), output_dir = file.path("output", "example_HELCOM") @@ -239,10 +241,12 @@ write_summary_table( "Organobromines", "Organobromines" ) ), - classColour = list( - below = c("EQS" = "green"), - above = c("EQS" = "red"), - none = "black" + symbology = list( + colour = list( + below = c("EQS" = "green"), + above = c("EQS" = "red"), + none = "black" + ) ), collapse_AC = list(EAC = "EQS"), output_dir = file.path("output", "example_HELCOM") @@ -356,10 +360,12 @@ write_summary_table( "PCBs and dioxins", "PCBs and dioxins" ) ), - classColour = list( - below = c("BAC" = "green", "EAC" = "green", "EQS" = "green", "MPC" = "green"), - above = c("BAC" = "red", "EAC" = "red", "EQS" = "red", "MPC" = "red"), - none = "black" + symbology = list( + colour = list( + below = c("BAC" = "green", "EAC" = "green", "EQS" = "green", "MPC" = "green"), + above = c("BAC" = "red", "EAC" = "red", "EQS" = "red", "MPC" = "red"), + none = "black" + ) ), collapse_AC = list(EAC = c("EAC", "EQS", "MPC")), output_dir = file.path("output", "example_HELCOM") diff --git a/example_OSPAR.r b/example_OSPAR.r index 4ddadea..0709554 100644 --- a/example_OSPAR.r +++ b/example_OSPAR.r @@ -75,7 +75,6 @@ report_assessment( - # Sediment ---- sediment_data <- read_data( @@ -142,22 +141,24 @@ write_summary_table( "Polychlorinated biphenyls", "Dioxins", "Organochlorines (other)" ) ), - classColour = 
list( - below = c( - "BAC" = "blue", - "ERL" = "green", - "EAC" = "green", - "EQS" = "green", - "FEQG" = "green" - ), - above = c( - "BAC" = "orange", - "ERL" = "red", - "EAC" = "red", - "EQS" = "red", - "FEQG" = "red" - ), - none = "black" + symbology = list( + colour = list( + below = c( + "BAC" = "blue", + "ERL" = "green", + "EAC" = "green", + "EQS" = "green", + "FEQG" = "green" + ), + above = c( + "BAC" = "orange", + "ERL" = "red", + "EAC" = "red", + "EQS" = "red", + "FEQG" = "red" + ), + none = "black" + ) ), collapse_AC = list(BAC = "BAC", EAC = c("EAC", "ERL", "EQS", "FEQG")), output_dir = file.path("output", "example_OSPAR") @@ -187,6 +188,7 @@ biota_data <- read_data( biota_data <- tidy_data(biota_data) + biota_timeseries <- create_timeseries( biota_data, determinands.control = list( @@ -244,7 +246,7 @@ biota_assessment <- run_assessment( biota_assessment <- update_assessment( biota_assessment, - subset = !determinand %in% wk_metals, + subset = !determinand %in% wk_metals, parallel = TRUE ) @@ -260,9 +262,6 @@ biota_assessment <- update_assessment( check_assessment(biota_assessment) - - - # environmental summary wk_groups <- list( diff --git a/example_external_data.r b/example_external_data.r index 7d97c1f..3d741ac 100644 --- a/example_external_data.r +++ b/example_external_data.r @@ -129,3 +129,4 @@ plot_assessment( output_dir = file.path("output", "graphics"), file_format = "pdf" ) + diff --git a/inst/information/method_extraction.csv b/inst/information/method_extraction.csv index 421b28c..f82032f 100644 --- a/inst/information/method_extraction.csv +++ b/inst/information/method_extraction.csv @@ -10,6 +10,7 @@ AM-AQR,"APDC-complexation, MIBK-extraction, Aqua regia digestion",nn AM-HF-C,"APDC-complexation, MIBK-extraction, HF/HNO3 digestion", ANT,Acetonenitrile (legacy data text),nn AQR,Aqua regia extraction HNO3:HCL = 1:3,Pw +ASE-DCM-HX,Accelerated Solvent Extraction with dichloromethane and hexane,nn BRB,Bromate/Bromide solvent,nn CD,Cadmium reduction,nn CDS,Cadmium reduction and Sulfanilamid and N-1-Naphtylethylendiamindihydrochlorid,nn @@ -19,6 +20,7 @@ CTC,Extraction with carbon tetrachloride,nn DBC,"DBCDTC-Complexation, Chloroform-Methanol-Extraction",nn DCM,Dichloromethane,nn DET,Diethyl ether, +dSPE-Q,QuEChERS buffered acetonitrile (MeCN) extraction with salting out with MgSO4 and cleanup on dispersive solid-phase column,nn dSPE-QEN,QuEChERS following EN 15662,nn ETA,Ethylacetate,nn ETH,Ethanol,nn @@ -52,7 +54,7 @@ KPX-BA,Potassium-peroxodisulphate and boric acid (K2S2O8-H3BO3), LMF-A,Lithium metaborate fusion LiBO2 followed by dissolution in acid,Tot LMF-A-L,Lithium metaborate fusion LiBO2 followed by dissolution in acid (Lanthanoides),Tot MDCM,Methanol and dichloromethane,nn -METH,Methanol, +METH,Methanol,nn MGN-MGO,Ashing in presence of magnesiumnitrate and magnesiumoxide, MHCL,Methanol and HCL,nn MHX,Methanol/hexane mixture in acetic acid environment,nn @@ -72,6 +74,7 @@ SFE,Supercritical Fluid Extraction,nn SMD,Smedes extraction (cyclohexane/isopropanol),nn SOX,Soxhlet method,nn SPE-DCM,Solid phase extraction with dichloromethane, +SPE-MNA-SPE,Solid phase extraction with methanol and NaOH and Solid phase extraction,nn TCF,"Extraction with 1,1,2-Trichlortrifluorethan",nn TOL,Toluene,nn TOT,Total extraction method - report in METOA,nn diff --git a/man/normalise_sediment_HELCOM.Rd b/man/normalise_sediment_HELCOM.Rd index 84c9f8e..cb0921b 100644 --- a/man/normalise_sediment_HELCOM.Rd +++ b/man/normalise_sediment_HELCOM.Rd @@ -2,7 +2,7 @@ % Please edit documentation in 
R/import_functions.R \name{normalise_sediment_HELCOM} \alias{normalise_sediment_HELCOM} -\title{Normalises sediment concentrations, HELCOM vwersion} +\title{Normalises sediment concentrations, HELCOM version} \usage{ normalise_sediment_HELCOM(data, station_dictionary, info, control) } @@ -16,5 +16,5 @@ normalise_sediment_HELCOM(data, station_dictionary, info, control) \item{control}{control values} } \description{ -Normalises sediment concentrations, HELCOM vwersion +Normalises sediment concentrations, HELCOM version } diff --git a/man/read_data.Rd b/man/read_data.Rd index fd57158..e406751 100644 --- a/man/read_data.Rd +++ b/man/read_data.Rd @@ -33,9 +33,8 @@ read_data( supplied using 'file.path'). Defaults to "."; i.e. the working directory.} \item{data_format}{A string specifying whether the data were extracted from -the ICES webservice ("ICES" - the default) or are in the simplified format -designed for other data sources ("external"). The value "ICES_old" is -deprecated.} +the ICES webservice (\code{"ICES"} - the default) or are in the simplified +format designed for other data sources (\code{"external"}).} \item{info_files}{A list of files specifying reference tables which override the defaults. See examples.} @@ -89,6 +88,14 @@ will be \code{FALSE} if the vflag entry is \code{"S"} or suspect. Records for wh } \item \code{stations} } +} +\description{ +Reads in contaminant and effects data, the station dictionary and various +reference tables. For data from the ICES webservice, it matches data to +stations in the station dictionary. It also allows the user to set control +parameters that dictate the assessment process. +} +\details{ \subsection{Control parameters}{ Many aspects of the assessment process can be controlled through the parameters stored in \code{info$control}. This is a list populated with default values which can then be overwritten, if required, using the \code{control} argument. } + +\subsection{External data}{ + +If \code{data_format = "external"}, a simplified data and station file can +be supplied. See \code{vignette("external-file-format")} for details. } -\description{ -Reads in contaminant and effects data, the station dictionary and various -reference tables. For data from the ICES webservice, it matches data to -stations in the station dictionary. It also allows the user to set control -parameters that dictate the assessment process. -} } diff --git a/man/report_assessment.Rd b/man/report_assessment.Rd index 41a2311..508cab6 100644 --- a/man/report_assessment.Rd +++ b/man/report_assessment.Rd @@ -8,6 +8,7 @@ report_assessment( assessment_obj, subset = NULL, output_dir = ".", + output_file = NULL, max_report = 100L ) } @@ -23,6 +24,10 @@ assessment_obj; use 'series' to identify individual timeseries.} supplied using 'file.path'). The default is the working directory. The output directory must already exist.} +\item{output_file}{An alternative file name to override the default. This is +currently only implemented for a single report. If the .html extension is not +supplied, it will be added.} + \item{max_report}{The maximum number of reports that will be generated. Defaults to 100.
Each report is about 1MB in size and takes a few seconds to run, so this prevents a ridiculous number of reports being created.} diff --git a/vignettes/example_HELCOM.Rmd b/vignettes/example_HELCOM.Rmd index 4aa2d95..ad21f80 100644 --- a/vignettes/example_HELCOM.Rmd +++ b/vignettes/example_HELCOM.Rmd @@ -47,7 +47,7 @@ in a directory `data`, and information files in a directory `information`, but you can use any directory for these. ```r -working.directory <- '/Users/stuart/git/HARSAT' +working.directory <- 'C:/Users/robfr/Documents/HARSAT/HARSAT' ``` # Water assessment @@ -92,23 +92,23 @@ water_data <- read_data( info_dir = file.path(working.directory, "information", "HELCOM_2023"), extraction = "2023/08/23" ) -#> Found in path determinand.csv /Users/stuart/git/HARSAT/information/HELCOM_2023/determinand.csv -#> Found in path species.csv /Users/stuart/git/HARSAT/information/HELCOM_2023/species.csv -#> Found in path thresholds_water.csv /Users/stuart/git/HARSAT/information/HELCOM_2023/thresholds_water.csv -#> Found in package method_extraction.csv /Users/stuart/git/HARSAT/inst/information/method_extraction.csv -#> Found in package pivot_values.csv /Users/stuart/git/HARSAT/inst/information/pivot_values.csv -#> Found in package matrix.csv /Users/stuart/git/HARSAT/inst/information/matrix.csv -#> Found in package imposex.csv /Users/stuart/git/HARSAT/inst/information/imposex.csv -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/HELCOM_2023/determinand.csv': 'ed6c2f076d852976e23eab797bd16164' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/matrix.csv': '9ba2731a7d90accddac659025835a6e4' -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/HELCOM_2023/thresholds_water.csv': '599000609710f7a53450f16fef814d1c' +#> Found in path determinand.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\determinand.csv +#> Found in path species.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\species.csv +#> Found in path thresholds_water.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\thresholds_water.csv +#> Found in package method_extraction.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/method_extraction.csv +#> Found in package pivot_values.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/pivot_values.csv +#> Found in package matrix.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv +#> Found in package imposex.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/imposex.csv +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\determinand.csv': '4b48cbec9c71380f4b464779e643cab2' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv': '4b4fb3814bb84cfbf9b37f7b59d45eb9' +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\thresholds_water.csv': '7e9487630022c11b0c3dd6d553a9955b' #> Reading station dictionary from: -#> '/Users/stuart/git/HARSAT/data/example_HELCOM/stations.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_HELCOM/stations.txt': 'ada1ffa58215843e8e4d5f4d74f5e21e' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/stations.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/stations.txt': 'd229a1c984d507537840e73080f3773c' #> #> Reading contaminant and effects data from: -#> '/Users/stuart/git/HARSAT/data/example_HELCOM/water.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_HELCOM/water.txt': 
'd229a93b6e1c8b37008d375365488db4' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/water.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/water.txt': 'b18b0556f6f78378c6f0a77682f51988' #> #> Matching data with station dictionary #> - restricting to stations in these convention areas: HELCOM @@ -335,25 +335,25 @@ sediment_data <- read_data( info_dir = file.path(working.directory, "information", "HELCOM_2023"), extraction = "2023/08/23" ) -#> Found in path determinand.csv /Users/stuart/git/HARSAT/information/HELCOM_2023/determinand.csv -#> Found in path species.csv /Users/stuart/git/HARSAT/information/HELCOM_2023/species.csv -#> Found in path thresholds_sediment.csv /Users/stuart/git/HARSAT/information/HELCOM_2023/thresholds_sediment.csv -#> Found in package method_extraction.csv /Users/stuart/git/HARSAT/inst/information/method_extraction.csv -#> Found in package pivot_values.csv /Users/stuart/git/HARSAT/inst/information/pivot_values.csv -#> Found in package matrix.csv /Users/stuart/git/HARSAT/inst/information/matrix.csv -#> Found in package imposex.csv /Users/stuart/git/HARSAT/inst/information/imposex.csv -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/HELCOM_2023/determinand.csv': 'ed6c2f076d852976e23eab797bd16164' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/matrix.csv': '9ba2731a7d90accddac659025835a6e4' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/method_extraction.csv': 'b3c891f17b9b35774114edaa2f58b6cc' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/pivot_values.csv': '372ad2d2ef807cec64ce1a7bd1967158' -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/HELCOM_2023/thresholds_sediment.csv': '52456c255f587a539177d5fa0fbb7cf1' +#> Found in path determinand.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\determinand.csv +#> Found in path species.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\species.csv +#> Found in path thresholds_sediment.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\thresholds_sediment.csv +#> Found in package method_extraction.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/method_extraction.csv +#> Found in package pivot_values.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/pivot_values.csv +#> Found in package matrix.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv +#> Found in package imposex.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/imposex.csv +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\determinand.csv': '4b48cbec9c71380f4b464779e643cab2' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv': '4b4fb3814bb84cfbf9b37f7b59d45eb9' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/method_extraction.csv': '28e38bdd0b9e735643c60026dcda8a78' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/pivot_values.csv': '23ca1799017bfea360d586b1a70bffd4' +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\thresholds_sediment.csv': '41c686160bc8e0877477239eec0f0b1b' #> Reading station dictionary from: -#> '/Users/stuart/git/HARSAT/data/example_HELCOM/stations.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_HELCOM/stations.txt': 'ada1ffa58215843e8e4d5f4d74f5e21e' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/stations.txt' +#> MD5 digest for: 
'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/stations.txt': 'd229a1c984d507537840e73080f3773c' #> #> Reading contaminant and effects data from: -#> '/Users/stuart/git/HARSAT/data/example_HELCOM/sediment.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_HELCOM/sediment.txt': '0cd5fa2f4a07a2750a56a24b2fe887bf' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/sediment.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/sediment.txt': 'a5635836e9a69f3dd8be42d5078cad6b' #> #> Matching data with station dictionary #> - restricting to stations in these convention areas: HELCOM @@ -442,8 +442,7 @@ sediment_timeseries <- create_timeseries( #> Limit of quantification less than limit of detection: see limits_inconsistent.csv #> Censoring codes D and Q inconsistent with respective limits: see censoring_codes_inconsistent.csv #> Detection limit higher than data: see detection_limit_high.csv -#> Non-positive uncertainties: see non_positive_uncertainties.csv -#> Large uncertainties: see large_uncertainties.csv +#> Implausible uncertainties reported with data: see implausible_uncertainties_reported.csv #> Data submitted as BDE28, BDE47, BDE99, BD100, BD153, BD154 summed to give #> SBDE6 #> 61 of 124 samples lost due to incomplete submissions @@ -458,6 +457,7 @@ sediment_timeseries <- create_timeseries( #> Normalising metals to AL using pivot values #> Normalising organics to 5% CORG #> Removing sediment data where normaliser is a less than +#> Implausible uncertainties calculated in data processing: see implausible_uncertainties_calculated.csv #> Dropping groups of compounds / stations with no data between 2015 and 2020 ``` @@ -530,25 +530,25 @@ biota_data <- read_data( info_dir = file.path(working.directory, "information", "HELCOM_2023"), extraction = "2023/08/23" ) -#> Found in path determinand.csv /Users/stuart/git/HARSAT/information/HELCOM_2023/determinand.csv -#> Found in path species.csv /Users/stuart/git/HARSAT/information/HELCOM_2023/species.csv -#> Found in path thresholds_biota.csv /Users/stuart/git/HARSAT/information/HELCOM_2023/thresholds_biota.csv -#> Found in package method_extraction.csv /Users/stuart/git/HARSAT/inst/information/method_extraction.csv -#> Found in package pivot_values.csv /Users/stuart/git/HARSAT/inst/information/pivot_values.csv -#> Found in package matrix.csv /Users/stuart/git/HARSAT/inst/information/matrix.csv -#> Found in package imposex.csv /Users/stuart/git/HARSAT/inst/information/imposex.csv -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/HELCOM_2023/determinand.csv': 'ed6c2f076d852976e23eab797bd16164' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/matrix.csv': '9ba2731a7d90accddac659025835a6e4' -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/HELCOM_2023/species.csv': '895d4f259f2a1ee8d9c7ec52210f584c' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/imposex.csv': '7e42cb57944b9d79216ad25c12ccada5' -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/HELCOM_2023/thresholds_biota.csv': '1798bfbfb15104c2cbf8ff00ccf13abd' +#> Found in path determinand.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\determinand.csv +#> Found in path species.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\species.csv +#> Found in path thresholds_biota.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\thresholds_biota.csv +#> Found in package method_extraction.csv 
C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/method_extraction.csv +#> Found in package pivot_values.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/pivot_values.csv +#> Found in package matrix.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv +#> Found in package imposex.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/imposex.csv +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\determinand.csv': '4b48cbec9c71380f4b464779e643cab2' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv': '4b4fb3814bb84cfbf9b37f7b59d45eb9' +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\species.csv': '769328e51065226809c91944b6d8fe79' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/imposex.csv': 'b602a882d4783085c896bcf130c8f848' +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\HELCOM_2023\thresholds_biota.csv': '9af82cd9730c0b135edd4a003724e8a6' #> Reading station dictionary from: -#> '/Users/stuart/git/HARSAT/data/example_HELCOM/stations.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_HELCOM/stations.txt': 'ada1ffa58215843e8e4d5f4d74f5e21e' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/stations.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/stations.txt': 'd229a1c984d507537840e73080f3773c' #> #> Reading contaminant and effects data from: -#> '/Users/stuart/git/HARSAT/data/example_HELCOM/biota.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_HELCOM/biota.txt': 'a986e6899ecd6c9fbc0bd7854b452c9a' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/biota.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_HELCOM/biota.txt': '0a1a33c4e668e63c97a6d50cdc644d22' #> #> Matching data with station dictionary #> - restricting to stations in these convention areas: HELCOM @@ -661,8 +661,7 @@ biota_timeseries <- create_timeseries( #> Non-positive quantification limits: see non_positive_quant_limits.csv #> Censoring codes D and Q inconsistent with respective limits: see censoring_codes_inconsistent.csv #> Detection limit higher than data: see detection_limit_high.csv -#> Non-positive uncertainties: see non_positive_uncertainties.csv -#> Large uncertainties: see large_uncertainties.csv +#> Implausible uncertainties reported with data: see implausible_uncertainties_reported.csv #> Data submitted as BDE28, BDE47, BDE99, BD100, BD153, BD154 summed to give #> SBDE6 #> 257 of 497 samples lost due to incomplete submissions @@ -710,7 +709,6 @@ errors. Dealing with non-converged timeseries is a topic for a future vignette. 
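In the meantime, the code below shows the pattern in full: list the series that failed, refit just those series with `update_assessment`, and check again. As a minimal sketch of that loop (assuming a `biota_assessment` object returned by `run_assessment`; as elsewhere, `subset` is an unquoted expression over the timeseries metadata, so variables such as `determinand` and `species` can be used directly):

```r
# list the series whose models have not converged
check_assessment(biota_assessment)

# refit only the offending series - here, a single determinand
biota_assessment <- update_assessment(
  biota_assessment,
  subset = determinand %in% "PYR1OH"
)

# confirm that all models have now converged
check_assessment(biota_assessment)
```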
check_assessment(biota_assessment) #> The following assessment models have not converged: #> 2299 PYR1OH Limanda limanda BI HPLC-FD -#> 5844 CD Clupea harengus LI biota_assessment <- update_assessment( biota_assessment, @@ -721,8 +719,7 @@ biota_assessment <- update_assessment( #> assessing series: station_code 2299; determinand PYR1OH; species Limanda limanda; matrix BI; method_analysis HPLC-FD; unit ng/ml check_assessment(biota_assessment) -#> The following assessment models have not converged: -#> 5844 CD Clupea harengus LI +#> All assessment models have converged ``` diff --git a/vignettes/example_OSPAR.Rmd b/vignettes/example_OSPAR.Rmd index aa8dc62..7e4f8bd 100644 --- a/vignettes/example_OSPAR.Rmd +++ b/vignettes/example_OSPAR.Rmd @@ -40,7 +40,7 @@ in a directory `data`, and information files in a directory `information`, but you can use any directory for these. ```r -working.directory <- '/Users/stuart/git/HARSAT' +working.directory <- 'C:/Users/robfr/Documents/HARSAT/HARSAT' ``` # Water assessment @@ -60,23 +60,23 @@ water_data <- read_data( info_dir = file.path(working.directory, "information", "OSPAR_2022"), extraction = "2023/08/23" ) -#> Found in path determinand.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/determinand.csv -#> Found in path species.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/species.csv -#> Found in path thresholds_water.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/thresholds_water.csv -#> Found in package method_extraction.csv /Users/stuart/git/HARSAT/inst/information/method_extraction.csv -#> Found in package pivot_values.csv /Users/stuart/git/HARSAT/inst/information/pivot_values.csv -#> Found in package matrix.csv /Users/stuart/git/HARSAT/inst/information/matrix.csv -#> Found in package imposex.csv /Users/stuart/git/HARSAT/inst/information/imposex.csv -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/OSPAR_2022/determinand.csv': '912a86ca3efdc719e405a7632e2b89ce' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/matrix.csv': '9ba2731a7d90accddac659025835a6e4' -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/OSPAR_2022/thresholds_water.csv': '2b165f406bb440297435ea3f46eb3612' +#> Found in path determinand.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\determinand.csv +#> Found in path species.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\species.csv +#> Found in path thresholds_water.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\thresholds_water.csv +#> Found in package method_extraction.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/method_extraction.csv +#> Found in package pivot_values.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/pivot_values.csv +#> Found in package matrix.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv +#> Found in package imposex.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/imposex.csv +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\determinand.csv': '6b36346446c0ac04a52b3f1347829f6b' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv': '4b4fb3814bb84cfbf9b37f7b59d45eb9' +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\thresholds_water.csv': '615ef96f716ef1d43c01ab67f383c881' #> Reading station dictionary from: -#> '/Users/stuart/git/HARSAT/data/example_OSPAR/stations.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_OSPAR/stations.txt': 
'057984ad2a1885bc5d15a41ee3b34471' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/stations.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/stations.txt': '58b9e90f314e89f637c60558c06755f4' #> #> Reading contaminant and effects data from: -#> '/Users/stuart/git/HARSAT/data/example_OSPAR/water.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_OSPAR/water.txt': '0ccaec75c5fd7e875c730467d58fdb26' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/water.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/water.txt': '13d63b6161b671165b215b58f5e22469' #> #> Matching data with station dictionary #> - restricting to stations in these convention areas: OSPAR @@ -93,8 +93,7 @@ water_data <- read_data( water_data <- tidy_data(water_data) #> -#> Oddities will be written to 'oddities/water' with previous oddities backed up to -#> 'oddities/water_backup' +#> Oddities will be written to 'oddities/water' #> #> Dropping 411 records from data flagged for deletion. Possible reasons are: #> - vflag = suspect @@ -136,6 +135,7 @@ water_timeseries <- create_timeseries( #> Limit of quantification less than limit of detection: see limits_inconsistent.csv #> Censoring codes D and Q inconsistent with respective limits: see censoring_codes_inconsistent.csv #> Detection limit higher than data: see detection_limit_high.csv +#> Implausible uncertainties reported with data: see implausible_uncertainties_reported.csv #> Data submitted as CHRTR relabelled as CHR #> Data submitted as BBF, BKF summed to give BBKF #> 1 of 71 samples lost due to incomplete submissions @@ -322,7 +322,7 @@ Now let's look at the `water_summary.csv` file: NA 37.6 0.2642877 - 0.8079560 + 0.8079549 NA NA @@ -364,7 +364,7 @@ Now let's look at the `water_summary.csv` file: -9.9 23.1 1.9403931 - 4.4243067 + 4.4243256 NA NA @@ -406,7 +406,7 @@ Now let's look at the `water_summary.csv` file: 4.5 17.9 0.0342139 - 0.0586748 + 0.0586751 NA NA @@ -448,7 +448,7 @@ Now let's look at the `water_summary.csv` file: 3.6 10.8 0.0347176 - 0.0446955 + 0.0446956 NA NA @@ -616,7 +616,7 @@ Now let's look at the `water_summary.csv` file: -0.1 19.0 0.0010141 - 0.0017480 + 0.0017375 NA NA @@ -910,7 +910,7 @@ Now let's look at the `water_summary.csv` file: -5.7 12.6 0.1593088 - 0.2251622 + 0.2251621 NA NA @@ -952,7 +952,7 @@ Now let's look at the `water_summary.csv` file: -2.5 14.8 0.4915249 - 0.6665299 + 0.6665300 NA NA @@ -994,7 +994,7 @@ Now let's look at the `water_summary.csv` file: -3.1 17.0 0.0122965 - 0.0201135 + 0.0201136 NA NA @@ -1036,7 +1036,7 @@ Now let's look at the `water_summary.csv` file: -3.5 45.3 0.0608592 - 0.1669677 + 0.1669681 NA NA @@ -1120,7 +1120,7 @@ Now let's look at the `water_summary.csv` file: -5.7 25.1 0.0375571 - 0.0747836 + 0.0747835 NA NA @@ -1246,7 +1246,7 @@ Now let's look at the `water_summary.csv` file: -3.6 19.2 0.0301767 - 0.0499804 + 0.0499803 NA NA @@ -1288,7 +1288,7 @@ Now let's look at the `water_summary.csv` file: -0.9 22.2 1.3776942 - 2.4084546 + 2.4085277 NA NA @@ -1330,7 +1330,7 @@ Now let's look at the `water_summary.csv` file: 0.5 12.0 0.0218170 - 0.0324892 + 0.0324885 NA NA @@ -1372,7 +1372,7 @@ Now let's look at the `water_summary.csv` file: -4.3 15.2 0.6971756 - 1.0166068 + 1.0166054 NA NA @@ -1414,7 +1414,7 @@ Now let's look at the `water_summary.csv` file: -11.7 35.8 0.1854117 - 0.4934672 + 0.4934678 NA NA @@ -1540,7 +1540,7 @@ Now let's look at the `water_summary.csv` file: -19.5 54.6 0.0070453 - 0.0235891 + 
0.0235892 NA NA @@ -1708,7 +1708,7 @@ Now let's look at the `water_summary.csv` file: NA 10.7 1.5024002 - 2.3889251 + 2.3889254 NA NA @@ -1918,10 +1918,10 @@ Now let's look at the `water_summary.csv` file: NA 31.7 1.5345685 - 4.9992653 + 4.9992398 EQS 2.0e+03 - -1995.0007347 + -1995.0007602 2021 @@ -2170,7 +2170,7 @@ Now let's look at the `water_summary.csv` file: 4.4 9.8 4.2013867 - 6.3756677 + 6.3756630 NA NA @@ -2380,7 +2380,7 @@ Now let's look at the `water_summary.csv` file: NA 22.9 0.7230029 - 1.9410264 + 1.9410317 NA NA @@ -2800,10 +2800,10 @@ Now let's look at the `water_summary.csv` file: -11.6 11.1 0.0901546 - 0.1212221 + 0.1212222 EQS 2.0e+00 - -1.8787779 + -1.8787778 2017 below @@ -3010,10 +3010,10 @@ Now let's look at the `water_summary.csv` file: -11.1 36.6 0.0029571 - 0.0091140 + 0.0091139 EQS 1.0e+02 - -99.9908860 + -99.9908861 2019 below @@ -3052,7 +3052,7 @@ Now let's look at the `water_summary.csv` file: -2.1 16.9 0.0019921 - 0.0033308 + 0.0033309 NA NA @@ -3346,7 +3346,7 @@ Now let's look at the `water_summary.csv` file: NA 14.0 2.1057059 - 3.7494366 + 3.7494368 NA NA @@ -3424,13 +3424,13 @@ Now let's look at the `water_summary.csv` file: 0.0005 0.5607 0.0004 - 0.4532 + 0.4531 0.9 - 0.4532 + 0.4531 0.9 8.8 0.6833377 - 0.8793254 + 0.8793163 NA NA @@ -3514,10 +3514,10 @@ Now let's look at the `water_summary.csv` file: NA 46.7 0.0117176 - 1.9080359 + 1.9080383 EQS 1.3e+00 - 0.6080359 + 0.6080383 2020 below @@ -3724,7 +3724,7 @@ Now let's look at the `water_summary.csv` file: NA 39.8 0.9199809 - 2.5982052 + 2.5981971 NA NA @@ -3850,7 +3850,7 @@ Now let's look at the `water_summary.csv` file: 17.0 22.2 0.7934854 - 2.2375378 + 2.2375377 NA NA @@ -4354,7 +4354,7 @@ Now let's look at the `water_summary.csv` file: NA 25.1 1.0907904 - 2.8631028 + 2.8630501 NA NA @@ -4480,10 +4480,10 @@ Now let's look at the `water_summary.csv` file: 2.6 7.7 0.3373065 - 0.4675318 + 0.4675314 EQS 8.6e+00 - -8.1324682 + -8.1324686 2020 below @@ -5392,25 +5392,25 @@ sediment_data <- read_data( info_dir = file.path(working.directory, "information", "OSPAR_2022"), extraction = "2023/08/23" ) -#> Found in path determinand.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/determinand.csv -#> Found in path species.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/species.csv -#> Found in path thresholds_sediment.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/thresholds_sediment.csv -#> Found in package method_extraction.csv /Users/stuart/git/HARSAT/inst/information/method_extraction.csv -#> Found in package pivot_values.csv /Users/stuart/git/HARSAT/inst/information/pivot_values.csv -#> Found in package matrix.csv /Users/stuart/git/HARSAT/inst/information/matrix.csv -#> Found in package imposex.csv /Users/stuart/git/HARSAT/inst/information/imposex.csv -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/OSPAR_2022/determinand.csv': '912a86ca3efdc719e405a7632e2b89ce' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/matrix.csv': '9ba2731a7d90accddac659025835a6e4' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/method_extraction.csv': 'b3c891f17b9b35774114edaa2f58b6cc' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/pivot_values.csv': '372ad2d2ef807cec64ce1a7bd1967158' -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/OSPAR_2022/thresholds_sediment.csv': 'dcf8d4a452f93ead01729866ca6b139c' +#> Found in path determinand.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\determinand.csv +#> Found in path species.csv 
C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\species.csv +#> Found in path thresholds_sediment.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\thresholds_sediment.csv +#> Found in package method_extraction.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/method_extraction.csv +#> Found in package pivot_values.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/pivot_values.csv +#> Found in package matrix.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv +#> Found in package imposex.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/imposex.csv +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\determinand.csv': '6b36346446c0ac04a52b3f1347829f6b' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv': '4b4fb3814bb84cfbf9b37f7b59d45eb9' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/method_extraction.csv': '28e38bdd0b9e735643c60026dcda8a78' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/pivot_values.csv': '23ca1799017bfea360d586b1a70bffd4' +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\thresholds_sediment.csv': 'ab2fddb32a8b1d126004febbf6375b5d' #> Reading station dictionary from: -#> '/Users/stuart/git/HARSAT/data/example_OSPAR/stations.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_OSPAR/stations.txt': '057984ad2a1885bc5d15a41ee3b34471' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/stations.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/stations.txt': '58b9e90f314e89f637c60558c06755f4' #> #> Reading contaminant and effects data from: -#> '/Users/stuart/git/HARSAT/data/example_OSPAR/sediment.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_OSPAR/sediment.txt': '26c2ab1c30d6b5cead9250df216efbc2' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/sediment.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/sediment.txt': '0cb722db8f8ea4ed263e1cb8c8334665' #> #> Matching data with station dictionary #> - restricting to stations in these convention areas: OSPAR @@ -5426,8 +5426,7 @@ sediment_data <- read_data( #> Argument max_year taken to be the maximum year in the data: 2021 sediment_data <- tidy_data(sediment_data) #> -#> Oddities will be written to 'oddities/sediment' with previous oddities backed up to -#> 'oddities/sediment_backup' +#> Oddities will be written to 'oddities/sediment' #> #> Dropping 75 records from data flagged for deletion. 
Possible reasons are: #> - vflag = suspect @@ -5496,7 +5495,7 @@ sediment_timeseries <- create_timeseries( #> Limit of quantification less than limit of detection: see limits_inconsistent.csv #> Censoring codes D and Q inconsistent with respective limits: see censoring_codes_inconsistent.csv #> Detection limit higher than data: see detection_limit_high.csv -#> Large uncertainties: see large_uncertainties.csv +#> Implausible uncertainties reported with data: see implausible_uncertainties_reported.csv #> Data submitted as BBF, BKF summed to give BBKF #> Data submitted as BBJF, BKF summed to give BBJKF #> 104 of 104 samples lost due to incomplete submissions @@ -5513,6 +5512,7 @@ sediment_timeseries <- create_timeseries( #> Warning: Nss values for CR hard-wired to 5.0 (AL) or 52 (LI) for all digestions #> Normalising organics to 2.5% CORG #> Removing sediment data where normaliser is a less than +#> Implausible uncertainties calculated in data processing: see implausible_uncertainties_calculated.csv #> Dropping groups of compounds / stations with no data between 2016 and 2021 ``` @@ -5710,10 +5710,10 @@ Now let's look at the `sediment_summary.csv` file: NA 11.4 7.6492369 - 12.4844431 + 12.4844430 BAC 0.190 - 12.2944431 + 12.2944430 NA EAC @@ -5945,7 +5945,7 @@ Now let's look at the `sediment_summary.csv` file: NA 41.3 9.7752928 - 333.2085461 + 333.2085424 NA NA @@ -6086,15 +6086,15 @@ Now let's look at the `sediment_summary.csv` file: NA 15.7 1.5241197 - 2.7882848 + 2.7882850 BAC 0.120 - 2.6682848 + 2.6682850 NA EAC 2.70 - 8.828480e-02 + 8.828500e-02 2019 @@ -6133,10 +6133,10 @@ Now let's look at the `sediment_summary.csv` file: NA 12.9 18.8753953 - 77.4193597 + 77.4193598 BAC 8.000 - 69.4193597 + 69.4193598 NA ERL @@ -6180,7 +6180,7 @@ Now let's look at the `sediment_summary.csv` file: -0.2 14.7 17.3299558 - 26.5929954 + 26.5931662 NA NA @@ -6319,13 +6319,13 @@ Now let's look at the `sediment_summary.csv` file: -0.8 0.7914 -0.8 - 17.9 - 210.3469203 - 388.7486333 + 17.8 + 210.4998016 + 385.5535038 BAC 103.000 - 285.7486333 - 2109 + 282.5535038 + 2110 above NA @@ -6415,7 +6415,7 @@ Now let's look at the `sediment_summary.csv` file: -3.5 18.2 1.5278464 - 2.9801733 + 2.9801382 NA NA @@ -6556,7 +6556,7 @@ Now let's look at the `sediment_summary.csv` file: 7.8 39.0 13.9782469 - 44.9733385 + 44.9733223 NA NA @@ -6603,7 +6603,7 @@ Now let's look at the `sediment_summary.csv` file: NA 13.9 10.4852634 - 18.6243267 + 18.6243266 NA NA @@ -6650,15 +6650,15 @@ Now let's look at the `sediment_summary.csv` file: 6.6 44.6 0.3415152 - 0.9961428 + 0.9961422 BAC 0.120 - 0.8761428 + 0.8761422 3000 above EAC 2.70 - -1.703857e+00 + -1.703858e+00 2020 below @@ -6697,10 +6697,10 @@ Now let's look at the `sediment_summary.csv` file: 6.4 20.4 80.0934477 - 165.0513734 + 165.0511546 BAC 103.000 - 62.0513734 + 62.0511546 2018 below @@ -6744,7 +6744,7 @@ Now let's look at the `sediment_summary.csv` file: 3.5 18.1 11.6236043 - 22.1468058 + 22.1467252 NA NA @@ -6791,7 +6791,7 @@ Now let's look at the `sediment_summary.csv` file: 3.7 5.0 97.9305843 - 117.2890199 + 117.2889974 NA NA @@ -6799,7 +6799,7 @@ Now let's look at the `sediment_summary.csv` file: ERL 81.00 - 3.628902e+01 + 3.628900e+01 3000 above @@ -6838,15 +6838,15 @@ Now let's look at the `sediment_summary.csv` file: 0.0 8.3 8.3584236 - 10.5355845 + 10.5535455 BAC 32.000 - -21.4644155 + -21.4464545 2020 below ERL 240.00 - -2.294644e+02 + -2.294465e+02 2020 below @@ -6885,7 +6885,7 @@ Now let's look at the `sediment_summary.csv` file: 0.7 7.8 9.1370801 - 11.5815694 
+ 11.5814966 NA NA @@ -6932,15 +6932,15 @@ Now let's look at the `sediment_summary.csv` file: -0.3 13.9 8.1270914 - 12.0526609 + 12.0546912 BAC 16.000 - -3.9473391 + -3.9453088 2020 below ERL 261.00 - -2.489473e+02 + -2.489453e+02 2020 below @@ -6979,10 +6979,10 @@ Now let's look at the `sediment_summary.csv` file: 2.0 10.7 20.0087767 - 27.0948686 + 27.0948915 BAC 24.000 - 3.0948686 + 3.0948915 2020 above ERL @@ -7026,7 +7026,7 @@ Now let's look at the `sediment_summary.csv` file: -2.4 6.4 4.8714973 - 5.8651171 + 5.8651188 NA NA @@ -7073,7 +7073,7 @@ Now let's look at the `sediment_summary.csv` file: 1.8 31.9 21.3008515 - 45.5518827 + 45.5512218 NA NA @@ -7120,7 +7120,7 @@ Now let's look at the `sediment_summary.csv` file: 3.1 36.2 6.5587942 - 22.5315109 + 22.5322761 NA NA @@ -7167,15 +7167,15 @@ Now let's look at the `sediment_summary.csv` file: NA 170.9 0.1428399 - 50.5311334 + 50.5314827 BAC 0.050 - 50.4811334 + 50.4814827 NA above FEQG 1.00 - 4.953113e+01 + 4.953148e+01 2021 above @@ -7261,10 +7261,10 @@ Now let's look at the `sediment_summary.csv` file: -3.9 21.9 33.2459810 - 57.5383470 + 57.5383454 BAC 5.000 - 52.5383470 + 52.5383454 2069 above ERL @@ -7402,7 +7402,7 @@ Now let's look at the `sediment_summary.csv` file: NA 31.7 3.9267126 - 9.7602011 + 9.7602006 NA NA @@ -7449,15 +7449,15 @@ Now let's look at the `sediment_summary.csv` file: 0.7 8.8 35.5236210 - 47.1405502 + 47.1406906 BAC 16.000 - 31.1405502 + 31.1406906 3000 above ERL 261.00 - -2.138594e+02 + -2.138593e+02 2021 below @@ -7590,10 +7590,10 @@ Now let's look at the `sediment_summary.csv` file: -5.8 5.1 176.9489655 - 205.1498197 + 205.1498238 BAC 122.000 - 83.1498197 + 83.1498238 2027 above ERL @@ -7637,15 +7637,15 @@ Now let's look at the `sediment_summary.csv` file: -6.2 5.0 177.5006764 - 216.1974328 + 216.1974414 BAC 122.000 - 94.1974328 + 94.1974414 2027 above ERL 150.00 - 6.619743e+01 + 6.619744e+01 2024 above @@ -7684,15 +7684,15 @@ Now let's look at the `sediment_summary.csv` file: -0.6 10.4 256.8557305 - 403.3279483 + 403.3336740 BAC 122.000 - 281.3279483 + 281.3336740 2141 above ERL 150.00 - 2.533279e+02 + 2.533337e+02 2108 above @@ -7731,15 +7731,15 @@ Now let's look at the `sediment_summary.csv` file: -1.3 3.7 93.7391593 - 103.3514654 + 103.3517060 BAC 24.000 - 79.3514654 + 79.3517060 2126 above ERL 665.00 - -5.616485e+02 + -5.616483e+02 2021 below @@ -7778,7 +7778,7 @@ Now let's look at the `sediment_summary.csv` file: -13.3 37.6 0.0124304 - 0.0368966 + 0.0368964 NA NA @@ -7872,15 +7872,15 @@ Now let's look at the `sediment_summary.csv` file: -2.8 5.1 0.3907767 - 0.4814161 + 0.4814163 BAC 0.310 - 0.1714161 + 0.1714163 2029 above ERL 1.20 - -7.185839e-01 + -7.185837e-01 2021 below @@ -7919,15 +7919,15 @@ Now let's look at the `sediment_summary.csv` file: -5.0 10.8 0.5399518 - 0.8057462 + 0.8057464 BAC 0.310 - 0.4957462 + 0.4957464 2032 above ERL 1.20 - -3.942538e-01 + -3.942536e-01 2021 below @@ -7966,15 +7966,15 @@ Now let's look at the `sediment_summary.csv` file: 0.1 3.8 84.2085193 - 93.6304389 + 93.6467575 BAC 24.000 - 69.6304389 + 69.6467575 3000 above ERL 665.00 - -5.713696e+02 + -5.713532e+02 2021 below @@ -8013,7 +8013,7 @@ Now let's look at the `sediment_summary.csv` file: -3.3 11.5 0.1223601 - 0.1686230 + 0.1686226 NA NA @@ -8060,10 +8060,10 @@ Now let's look at the `sediment_summary.csv` file: 1.8 6.9 17.5725588 - 21.1210080 + 21.1210085 BAC 27.000 - -5.8789920 + -5.8789915 2017 below ERL @@ -8154,10 +8154,10 @@ Now let's look at the `sediment_summary.csv` file: -3.7 4.0 32.1279297 - 35.7272265 + 
35.7272272 BAC 36.000 - -0.2727735 + -0.2727728 2021 above @@ -8201,10 +8201,10 @@ Now let's look at the `sediment_summary.csv` file: -1.4 3.8 35.0417076 - 40.4264084 + 40.4264462 BAC 36.000 - 4.4264084 + 4.4264462 2021 above @@ -8295,7 +8295,7 @@ Now let's look at the `sediment_summary.csv` file: -5.6 11.1 4.5794713 - 6.5818075 + 6.5818084 NA NA @@ -8483,15 +8483,15 @@ Now let's look at the `sediment_summary.csv` file: -0.7 6.6 49.4394452 - 54.2409018 + 54.2408759 BAC 38.000 - 16.2409018 + 16.2408759 2056 above ERL 47.00 - 7.240902e+00 + 7.240876e+00 2024 above @@ -8624,7 +8624,7 @@ Now let's look at the `sediment_summary.csv` file: -1.4 11.0 81.0797223 - 115.1596858 + 115.1592557 NA NA @@ -8632,7 +8632,7 @@ Now let's look at the `sediment_summary.csv` file: ERL 81.00 - 3.415969e+01 + 3.415926e+01 2021 above @@ -8671,7 +8671,7 @@ Now let's look at the `sediment_summary.csv` file: 0.9 10.6 85.0696678 - 131.1793385 + 131.1811961 NA NA @@ -8679,7 +8679,7 @@ Now let's look at the `sediment_summary.csv` file: ERL 81.00 - 5.017934e+01 + 5.018120e+01 3000 above @@ -8763,17 +8763,17 @@ Now let's look at the `sediment_summary.csv` file: NA NA NA - 26.3 - 38.5458249 - 114.3244897 + 25.8 + 38.4399651 + 111.9533979 BAC 30.000 - 84.3244897 + 81.9533979 NA ERL 430.00 - -3.156755e+02 + -3.180466e+02 2021 @@ -8859,15 +8859,15 @@ Now let's look at the `sediment_summary.csv` file: -0.8 5.0 31.0052370 - 34.7650582 + 34.7650601 BAC 27.000 - 7.7650582 + 7.7650601 2039 above ERL 34.00 - 7.650582e-01 + 7.650601e-01 2021 above @@ -8906,10 +8906,10 @@ Now let's look at the `sediment_summary.csv` file: -1.5 4.8 180.9338724 - 208.4848190 + 208.4848198 BAC 122.000 - 86.4848190 + 86.4848198 2047 above ERL @@ -8953,10 +8953,10 @@ Now let's look at the `sediment_summary.csv` file: 0.7 14.4 0.3268515 - 0.5708141 + 0.5708069 BAC 0.160 - 0.4108141 + 0.4108069 3000 above @@ -9000,10 +9000,10 @@ Now let's look at the `sediment_summary.csv` file: -0.8 8.0 23.3176722 - 28.3145048 + 28.3145261 BAC 25.000 - 3.3145048 + 3.3145261 2021 below @@ -9047,15 +9047,15 @@ Now let's look at the `sediment_summary.csv` file: -0.1 3.9 162.3162514 - 175.9811123 + 175.9796251 BAC 122.000 - 53.9811123 + 53.9796251 2337 above ERL 150.00 - 2.598111e+01 + 2.597963e+01 2108 above @@ -9094,15 +9094,15 @@ Now let's look at the `sediment_summary.csv` file: -1.1 4.0 48.3551881 - 52.4647959 + 52.4647968 BAC 38.000 - 14.4647959 + 14.4647968 2044 above ERL 47.00 - 5.464796e+00 + 5.464797e+00 2024 above @@ -9235,10 +9235,10 @@ Now let's look at the `sediment_summary.csv` file: -2.2 8.4 51.9959559 - 64.9009443 + 64.9009187 BAC 39.000 - 25.9009443 + 25.9009187 2034 above ERL @@ -9282,10 +9282,10 @@ Now let's look at the `sediment_summary.csv` file: 1.1 8.5 37.1110282 - 45.6793612 + 45.6793479 BAC 25.000 - 20.6793612 + 20.6793479 3000 above @@ -9329,10 +9329,10 @@ Now let's look at the `sediment_summary.csv` file: -3.5 14.8 50.6045734 - 72.1917839 + 72.1917775 BAC 20.000 - 52.1917839 + 52.1917775 2048 above ERL @@ -9376,10 +9376,10 @@ Now let's look at the `sediment_summary.csv` file: -0.8 6.5 25.4065239 - 29.5861739 + 29.5861728 BAC 25.000 - 4.5861739 + 4.5861728 2023 above @@ -9423,15 +9423,15 @@ Now let's look at the `sediment_summary.csv` file: -1.2 4.7 185.8620053 - 204.2140298 + 204.2140233 BAC 122.000 - 82.2140298 + 82.2140233 2055 above ERL 150.00 - 5.421403e+01 + 5.421402e+01 2038 above @@ -9469,8 +9469,8 @@ Now let's look at the `sediment_summary.csv` file: 0.7289 -2.3 53.5 - 348.2567082 - 2087.1723418 + 348.2567083 + 2087.1695517 NA NA @@ -9799,10 
+9799,10 @@ Now let's look at the `sediment_summary.csv` file: 4.6 9.9 87.7924194 - 126.3425480 + 126.3425590 BAC 36.000 - 90.3425480 + 90.3425590 3000 above @@ -10410,10 +10410,10 @@ Now let's look at the `sediment_summary.csv` file: 13.7 33.3 67.3610094 - 160.4225459 + 160.4217651 BAC 80.000 - 80.4225459 + 80.4217651 2019 above @@ -10457,7 +10457,7 @@ Now let's look at the `sediment_summary.csv` file: 0.7 8.2 27.9195093 - 35.8018636 + 35.8018530 NA NA @@ -10504,7 +10504,7 @@ Now let's look at the `sediment_summary.csv` file: 1.5 8.8 2.8958131 - 3.7894398 + 3.7894254 NA NA @@ -10550,16 +10550,16 @@ Now let's look at the `sediment_summary.csv` file: 0.0396 4.5 12.6 - 110.5869001 - 160.0920936 + 110.5869002 + 160.0921414 BAC 122.000 - 38.0920936 + 38.0921414 2020 above ERL 150.00 - 1.009209e+01 + 1.009214e+01 2020 above @@ -10739,7 +10739,7 @@ Now let's look at the `sediment_summary.csv` file: -1.1 25.9 10.8927655 - 17.0609808 + 17.0614630 NA NA @@ -10833,15 +10833,15 @@ Now let's look at the `sediment_summary.csv` file: -3.2 23.6 0.5271708 - 1.1023415 + 1.1023416 BAC 0.120 - 0.9823415 + 0.9823416 2067 above EAC 2.70 - -1.597659e+00 + -1.597658e+00 2021 above @@ -10880,10 +10880,10 @@ Now let's look at the `sediment_summary.csv` file: -4.7 40.7 0.1426633 - 0.3413434 + 0.3413416 BAC 0.050 - 0.2913434 + 0.2913416 2043 above FEQG @@ -10927,7 +10927,7 @@ Now let's look at the `sediment_summary.csv` file: -2.8 17.3 17.2422238 - 26.5431217 + 26.5430186 NA NA @@ -11021,15 +11021,15 @@ Now let's look at the `sediment_summary.csv` file: -5.3 4.6 94.4031461 - 110.6922948 + 110.6922812 BAC 122.000 - -11.3077052 + -11.3077188 2016 below ERL 150.00 - -3.930771e+01 + -3.930772e+01 2016 below @@ -11068,10 +11068,10 @@ Now let's look at the `sediment_summary.csv` file: 6.9 23.8 0.0996671 - 0.2456764 + 0.2456768 BAC 0.050 - 0.1956764 + 0.1956768 3000 above FEQG @@ -11115,7 +11115,7 @@ Now let's look at the `sediment_summary.csv` file: -0.9 22.9 101.9663912 - 240.6797252 + 240.6598269 NA NA @@ -13741,25 +13741,25 @@ biota_data <- read_data( info_dir = file.path(working.directory, "information", "OSPAR_2022"), extraction = "2023/08/23" ) -#> Found in path determinand.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/determinand.csv -#> Found in path species.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/species.csv -#> Found in path thresholds_biota.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/thresholds_biota.csv -#> Found in package method_extraction.csv /Users/stuart/git/HARSAT/inst/information/method_extraction.csv -#> Found in package pivot_values.csv /Users/stuart/git/HARSAT/inst/information/pivot_values.csv -#> Found in package matrix.csv /Users/stuart/git/HARSAT/inst/information/matrix.csv -#> Found in package imposex.csv /Users/stuart/git/HARSAT/inst/information/imposex.csv -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/OSPAR_2022/determinand.csv': '912a86ca3efdc719e405a7632e2b89ce' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/matrix.csv': '9ba2731a7d90accddac659025835a6e4' -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/OSPAR_2022/species.csv': 'e0678c65e9c433f04cfff5ec95659bb4' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/imposex.csv': '7e42cb57944b9d79216ad25c12ccada5' -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/OSPAR_2022/thresholds_biota.csv': 'f728e56db82dc936b4b384cc0471477a' +#> Found in path determinand.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\determinand.csv +#> Found in path 
species.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\species.csv +#> Found in path thresholds_biota.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\thresholds_biota.csv +#> Found in package method_extraction.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/method_extraction.csv +#> Found in package pivot_values.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/pivot_values.csv +#> Found in package matrix.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv +#> Found in package imposex.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/imposex.csv +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\determinand.csv': '6b36346446c0ac04a52b3f1347829f6b' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv': '4b4fb3814bb84cfbf9b37f7b59d45eb9' +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\species.csv': '952a6e718e07b8bc501eafe42a74a760' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/imposex.csv': 'b602a882d4783085c896bcf130c8f848' +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\thresholds_biota.csv': 'a487aa3bb6738f95ab9462e4420e124a' #> Reading station dictionary from: -#> '/Users/stuart/git/HARSAT/data/example_OSPAR/stations.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_OSPAR/stations.txt': '057984ad2a1885bc5d15a41ee3b34471' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/stations.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/stations.txt': '58b9e90f314e89f637c60558c06755f4' #> #> Reading contaminant and effects data from: -#> '/Users/stuart/git/HARSAT/data/example_OSPAR/biota.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_OSPAR/biota.txt': 'b00f0c9665ec57492b8f14f1323ff635' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/biota.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/biota.txt': 'abfd256bda44cde5ed63693c75ec9024' #> #> Matching data with station dictionary #> - restricting to stations in these convention areas: OSPAR @@ -13775,8 +13775,7 @@ biota_data <- read_data( #> Argument max_year taken to be the maximum year in the data: 2022 biota_data <- tidy_data(biota_data) #> -#> Oddities will be written to 'oddities/biota' with previous oddities backed up to -#> 'oddities/biota_backup' +#> Oddities will be written to 'oddities/biota' #> #> Dropping 11 records from data flagged for deletion. Possible reasons are: #> - vflag = suspect @@ -13801,6 +13800,7 @@ rules for birds and mammals. 
This is dealt with using the customised function ```r + biota_timeseries <- create_timeseries( biota_data, determinands.control = list( @@ -13854,8 +13854,7 @@ biota_timeseries <- create_timeseries( #> Unrecognised censoring values: deleted data in 'censoring_codes_unrecognised.csv #> Censoring codes D and Q inconsistent with respective limits: see censoring_codes_inconsistent.csv #> Detection limit higher than data: see detection_limit_high.csv -#> Non-positive uncertainties: see non_positive_uncertainties.csv -#> Large uncertainties: see large_uncertainties.csv +#> Implausible uncertainties reported with data: see implausible_uncertainties_reported.csv #> Data submitted as CHRTR relabelled as CHR #> Data submitted as BBF, BKF summed to give BBKF #> 125 of 149 samples lost due to incomplete submissions @@ -14121,10 +14120,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -5.1 13.7 1.316844e+02 - 2.086121e+02 + 2.086102e+02 BAC 2.600000e+01 - 1.826121e+02 + 1.826102e+02 2053 above @@ -14139,7 +14138,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 1.000000e+03 - -7.913879e+02 + -7.913898e+02 2021 below @@ -14181,7 +14180,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 7.4 6.7 2.034604e+04 - 2.566752e+04 + 2.566750e+04 NA NA @@ -14241,10 +14240,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.1 3.2 9.308773e+02 - 1.052552e+03 + 1.052550e+03 BAC 9.600000e+02 - 9.255238e+01 + 9.255016e+01 2021 below @@ -14259,7 +14258,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 6.097561e+03 - -5.045009e+03 + -5.045011e+03 2021 below @@ -14421,7 +14420,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 8.8 26.1 3.876517e+01 - 8.468271e+01 + 8.468272e+01 NA NA @@ -14833,15 +14832,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 2005 2018 NA - 0.1938 - 0.1938 - 0.1938 + 0.1939 + 0.1939 + 0.1939 -9.4 - 0.1938 + 0.1939 -9.4 40.5 - 3.915469e-01 - 9.897512e-01 + 3.907754e-01 + 8.558981e-01 NA NA @@ -15261,7 +15260,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -1.0 9.3 8.920403e-01 - 1.149973e+00 + 1.149974e+00 NA NA @@ -15501,15 +15500,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 6.7 9.0 2.626357e+02 - 3.306569e+02 + 3.306568e+02 BAC 9.000000e+01 - 2.406569e+02 + 2.406568e+02 3000 above QSsp 121.951219 - 208.7056527 + 208.7056200 3000 above @@ -16101,10 +16100,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.3 7.2 6.591527e+03 - 7.454657e+03 + 7.454648e+03 BAC 6.000000e+03 - 1.454657e+03 + 1.454648e+03 3000 above @@ -16161,15 +16160,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 8.3 51.8 1.365729e+01 - 4.074705e+01 + 4.074714e+01 BAC 1.100000e+01 - 2.974705e+01 + 2.974714e+01 3000 above EAC 1700.000000 - -1659.2529538 + -1659.2528610 2021 below @@ -16221,7 +16220,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -1.0 11.8 1.355463e+03 - 1.789453e+03 + 1.789465e+03 NA NA @@ -16281,15 +16280,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 8.5 48.1 1.387296e+01 - 4.212455e+01 + 4.212456e+01 BAC 9.000000e+00 - 3.312455e+01 + 3.312456e+01 3000 above EAC 100.000000 - -57.8754484 + -57.8754427 2021 above @@ -16341,7 +16340,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 4.2 26.9 2.702145e+03 - 4.916236e+03 + 4.916237e+03 NA NA @@ -16404,7 +16403,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 
1.101690e+01 BAC 8.100000e+00 - 2.916903e+00 + 2.916904e+00 3000 above @@ -16461,15 +16460,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 7.6 37.0 3.369492e+01 - 8.614906e+01 + 8.614927e+01 BAC 1.220000e+01 - 7.394906e+01 + 7.394927e+01 3000 above EAC 110.000000 - -23.8509445 + -23.8507337 2021 below @@ -16479,7 +16478,7 @@ And finally, let's take a look at the `biota_summary.csv` file. QShh 1.829268e+02 - -9.677777e+01 + -9.677756e+01 2021 below @@ -16581,7 +16580,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 19.3 39.1 4.330671e+00 - 1.054091e+01 + 1.054093e+01 NA NA @@ -16589,7 +16588,7 @@ And finally, let's take a look at the `biota_summary.csv` file. EAC 290.000000 - -279.4590853 + -279.4590727 2021 below @@ -16640,16 +16639,16 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.0000 -6.9 13.9 - 1.552505e+01 - 1.967680e+01 + 1.552506e+01 + 1.967682e+01 BAC 7.000000e-01 - 1.897680e+01 + 1.897682e+01 2066 above EAC 10.107927 - 9.5688777 + 9.5688945 2027 above @@ -16700,16 +16699,16 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.0004 -6.7 17.7 - 1.300401e+01 - 1.732732e+01 + 1.300410e+01 + 1.732749e+01 BAC 6.000000e-01 - 1.672732e+01 + 1.672749e+01 2067 above EAC 2.088415 - 15.2389089 + 15.2390736 2048 above @@ -16769,7 +16768,7 @@ And finally, let's take a look at the `biota_summary.csv` file. above EAC 26.481098 - 0.5806135 + 0.5806136 2021 above @@ -16821,15 +16820,15 @@ And finally, let's take a look at the `biota_summary.csv` file. -1.5 14.8 6.237495e+01 - 9.016173e+01 + 9.016172e+01 BAC 6.000000e-01 - 8.956173e+01 + 8.956172e+01 2322 above EAC 132.405488 - -42.2437608 + -42.2437721 2021 below @@ -16881,15 +16880,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.3 23.4 2.086848e+00 - 3.234972e+00 + 3.234784e+00 BAC 6.000000e-01 - 2.634972e+00 + 2.634784e+00 3000 above EAC 39.178658 - -35.9436865 + -35.9438748 2021 below @@ -16949,7 +16948,7 @@ And finally, let's take a look at the `biota_summary.csv` file. above EAC 5.596951 - -5.1214788 + -5.1214789 2021 below @@ -17009,7 +17008,7 @@ And finally, let's take a look at the `biota_summary.csv` file. above EAC 9.021951 - -4.2395474 + -4.2395472 2021 below @@ -17079,7 +17078,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 4.573171e+02 - -3.481419e+02 + -3.481418e+02 2021 below @@ -17301,10 +17300,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.4 41.9 1.148711e+00 - 2.847476e+00 + 2.847727e+00 BAC 7.500000e-01 - 2.097476e+00 + 2.097727e+00 3000 above @@ -17421,10 +17420,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.7 23.2 5.373398e-01 - 8.544452e-01 + 8.544514e-01 BAC 6.000000e-01 - 2.544452e-01 + 2.544514e-01 2021 above @@ -17721,7 +17720,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -2.1 16.2 9.409999e+00 - 1.266532e+01 + 1.266533e+01 NA NA @@ -17904,7 +17903,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.145459e+01 BAC 2.400000e+00 - 9.054590e+00 + 9.054593e+00 3000 above @@ -17953,19 +17952,19 @@ And finally, let's take a look at the `biota_summary.csv` file. 2003 2019 NA - 0.0090 - 0.0090 - 0.0090 + 0.0111 + 0.0111 + 0.0111 -4.3 - 0.0090 + 0.0111 -4.3 - 9.1 - 2.727439e+01 - 3.745632e+01 + 9.2 + 2.727930e+01 + 3.752830e+01 BAC 5.102041e-01 - 3.694612e+01 - 2112 + 3.701810e+01 + 2113 above NA @@ -18021,15 +18020,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 
-0.6 12.1 1.413949e-01 - 2.115986e-01 + 2.115979e-01 BAC 1.500000e-01 - 6.159860e-02 + 6.159790e-02 2021 above EAC 22.000000 - -21.7884014 + -21.7884021 2021 below @@ -18141,7 +18140,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.7 7.5 6.610692e+03 - 8.450020e+03 + 8.449752e+03 NA NA @@ -18201,7 +18200,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 3.9 10.1 6.198193e+03 - 8.513470e+03 + 8.513507e+03 NA NA @@ -18261,15 +18260,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 13.1 28.5 5.865732e-01 - 1.238822e+00 + 1.238823e+00 BAC 1.300000e+00 - -6.117750e-02 + -6.117710e-02 2020 below EAC 29.000000 - -27.7611775 + -27.7611771 2020 below @@ -18321,15 +18320,15 @@ And finally, let's take a look at the `biota_summary.csv` file. -6.7 6.2 1.914322e+01 - 2.572905e+01 + 2.572904e+01 BAC 1.220000e+01 - 1.352905e+01 + 1.352904e+01 2026 above EAC 110.000000 - -84.2709545 + -84.2709606 2019 below @@ -18381,10 +18380,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -3.6 16.0 4.457526e+03 - 7.019330e+03 + 7.019332e+03 BAC 6.000000e+03 - 1.019330e+03 + 1.019332e+03 2019 above @@ -18441,7 +18440,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.9 29.8 2.254849e+03 - 5.096016e+03 + 5.095479e+03 NA NA @@ -18583,6 +18582,66 @@ And finally, let's take a look at the `biota_summary.csv` file. NA + + 7182 SURVT Mytilus edulis WO + 2 + Northern North Sea + United Kingdom + 7182 + EScotland_YthanEstuary_sh01 + E Scotland (Ythan Estuary) + 57.32123 + -1.994370 + RH + TT + SURVT + Biological effects (other) + Mytilus edulis + WO + + d + NA + + NA + small_filled_circle + green + 3 + 3 + 3 + 2012 + 2012 + 2020 + NA + NA + NA + NA + NA + NA + NA + NA + 7.822224e+00 + 7.413895e+00 + BAC + 1.000000e+01 + -2.586105e+00 + NA + + EAC + 5.000000 + 2.4138954 + 2020 + + + NA + NA + NA + + + NA + NA + NA + + 7208 BDE66 Pleuronectes platessa LI 2 @@ -18673,23 +18732,23 @@ And finally, let's take a look at the `biota_summary.csv` file. 2006 2019 NA - 0.0350 - 0.0350 - 0.0350 - 4.8 - 0.0350 - 4.8 - 11.3 - 4.413530e+02 - 6.083309e+02 + 0.0376 + 0.0376 + 0.0376 + 4.7 + 0.0376 + 4.7 + 11.6 + 4.412843e+02 + 6.103200e+02 BAC 9.000000e+01 - 5.183309e+02 + 5.203200e+02 3000 above QSsp 121.951219 - 486.3796676 + 488.3688062 3000 above @@ -18699,7 +18758,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 3.048780e+03 - -2.440450e+03 + -2.438460e+03 2019 below @@ -18733,18 +18792,18 @@ And finally, let's take a look at the `biota_summary.csv` file. 2005 2019 NA - 0.0329 - 0.0329 - 0.0329 + 0.0346 + 0.0346 + 0.0346 -3.7 - 0.0329 + 0.0346 -3.7 - 9.1 - 3.053810e+03 - 4.000937e+03 + 9.2 + 3.054208e+03 + 4.008297e+03 BAC 1.300000e+03 - 2.700937e+03 + 2.708297e+03 2042 above @@ -18759,7 +18818,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 9.146341e+03 - -5.145404e+03 + -5.138044e+03 2019 below @@ -18853,15 +18912,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 2007 2019 NA - 0.3617 - 0.3617 - 0.3617 + 0.3644 + 0.3644 + 0.3644 -1.3 - 0.3617 + 0.3644 -1.3 - 6.6 - 1.186483e+04 - 1.471283e+04 + 6.8 + 1.185828e+04 + 1.473781e+04 NA NA @@ -18921,7 +18980,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -2.4 8.2 3.404049e+03 - 4.713604e+03 + 4.713631e+03 NA NA @@ -18981,7 +19040,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 
2.0 8.0 1.624333e+02 - 2.253627e+02 + 2.253620e+02 NA NA @@ -19041,10 +19100,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -0.4 2.9 5.782819e+03 - 6.317354e+03 + 6.317364e+03 BAC 6.000000e+03 - 3.173537e+02 + 3.173641e+02 2019 above @@ -19109,7 +19168,7 @@ And finally, let's take a look at the `biota_summary.csv` file. above EAC 12.000000 - 6.4198941 + 6.4198934 2021 above @@ -19169,7 +19228,7 @@ And finally, let's take a look at the `biota_summary.csv` file. above QSsp 121.951219 - 470.2344314 + 470.2344817 2035 above @@ -19221,7 +19280,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 2.1 23.8 1.802870e+01 - 2.973969e+01 + 2.973973e+01 NA NA @@ -19281,10 +19340,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -1.1 11.2 4.743194e+03 - 5.661246e+03 + 5.661239e+03 BAC 6.000000e+03 - -3.387544e+02 + -3.387612e+02 2021 above @@ -19341,7 +19400,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 2.7 21.4 1.646761e+03 - 2.958782e+03 + 2.958734e+03 NA NA @@ -20181,7 +20240,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.0 8.9 9.527265e+00 - 1.416868e+01 + 1.416865e+01 NA NA @@ -20549,7 +20608,7 @@ And finally, let's take a look at the `biota_summary.csv` file. EAC 80.000000 - -70.5353266 + -70.5353269 2020 @@ -20601,10 +20660,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -6.9 6.4 1.061902e+01 - 1.629257e+01 + 1.629258e+01 BAC 8.100000e+00 - 8.192574e+00 + 8.192575e+00 2025 above @@ -20660,8 +20719,8 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.0063 -53.7 59.5 - 1.054408e+01 - 5.085901e+01 + 1.054405e+01 + 5.085813e+01 NA NA @@ -20841,7 +20900,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -17.3 24.7 2.143187e+00 - 3.478543e+00 + 3.478541e+00 NA NA @@ -20849,7 +20908,7 @@ And finally, let's take a look at the `biota_summary.csv` file. QSsp 165.000000 - -161.5214572 + -161.5214594 2021 below @@ -20900,16 +20959,16 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.0245 -3.0 9.9 - 1.209565e+01 - 1.685229e+01 + 1.209626e+01 + 1.685318e+01 BAC 2.093023e-01 - 1.664299e+01 + 1.664387e+01 2157 above QSsp 334.000000 - -317.1477084 + -317.1468237 2020 below @@ -20919,7 +20978,7 @@ And finally, let's take a look at the `biota_summary.csv` file. QShh 2.941176e+03 - -2.924324e+03 + -2.924323e+03 2020 below @@ -20961,7 +21020,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.9 14.0 1.845246e+01 - 1.456063e+01 + 1.456048e+01 NA NA @@ -21081,7 +21140,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 4.5 28.9 6.846897e+03 - 1.359641e+04 + 1.359640e+04 NA NA @@ -21141,10 +21200,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -0.1 8.7 1.901907e+03 - 2.224243e+03 + 2.224020e+03 BAC 9.600000e+02 - 1.264243e+03 + 1.264020e+03 3000 above @@ -21159,7 +21218,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 6.097561e+03 - -3.873318e+03 + -3.873541e+03 2021 below @@ -21201,15 +21260,15 @@ And finally, let's take a look at the `biota_summary.csv` file. -1.2 10.4 1.536832e+02 - 2.016510e+02 + 2.016500e+02 BAC 9.000000e+01 - 1.116510e+02 + 1.116500e+02 2064 above QSsp 121.951219 - 79.6997825 + 79.6987737 2039 above @@ -21219,7 +21278,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 3.048780e+03 - -2.847129e+03 + -2.847130e+03 2020 below @@ -21255,13 +21314,13 @@ And finally, let's take a look at the `biota_summary.csv` file. 
0.0030 0.3076 0.0018 - 0.3964 + 0.3995 -2.0 - 0.0971 + 0.0973 4.9 23.1 6.003179e+01 - 1.246740e+02 + 1.249964e+02 NA NA @@ -21321,10 +21380,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -1.6 5.0 5.875961e+03 - 6.704890e+03 + 6.704888e+03 BAC 6.000000e+03 - 7.048897e+02 + 7.048879e+02 2021 above @@ -21381,7 +21440,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -0.8 40.0 5.162003e+00 - 1.107543e+01 + 1.108761e+01 NA NA @@ -21681,7 +21740,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.3 9.6 1.695841e+03 - 2.030936e+03 + 2.030933e+03 NA NA @@ -21741,15 +21800,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 14.2 33.7 3.053404e+01 - 6.817610e+01 + 6.817609e+01 BAC 2.500000e+00 - 6.567610e+01 + 6.567609e+01 3000 above EAC 80.000000 - -11.8239033 + -11.8239067 2020 below @@ -21801,15 +21860,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 15.5 41.2 1.112510e+01 - 2.960848e+01 + 2.960838e+01 BAC 1.400000e+00 - 2.820848e+01 + 2.820838e+01 3000 above EAC 600.000000 - -570.3915200 + -570.3916234 2020 below @@ -21819,7 +21878,7 @@ And finally, let's take a look at the `biota_summary.csv` file. QShh 3.048780e+01 - -8.793249e-01 + -8.794283e-01 2020 above @@ -21861,10 +21920,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 8.3 23.6 2.408702e-01 - 4.151733e-01 + 4.151734e-01 BAC 5.429900e-03 - 4.097434e-01 + 4.097435e-01 3000 above @@ -22101,10 +22160,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.7 24.5 2.973331e-01 - 4.742029e-01 + 4.742097e-01 BAC 5.429900e-03 - 4.687731e-01 + 4.687798e-01 3000 above @@ -22114,7 +22173,7 @@ And finally, let's take a look at the `biota_summary.csv` file. FEQG 73.512195 - -73.037992 + -73.037985 2020 below @@ -22221,7 +22280,7 @@ And finally, let's take a look at the `biota_summary.csv` file. NA 11.9 1.016794e+00 - 1.376183e+00 + 1.376186e+00 NA NA @@ -22239,7 +22298,7 @@ And finally, let's take a look at the `biota_summary.csv` file. QShh 5.182930e-02 - 1.324354e+00 + 1.324357e+00 NA @@ -22341,10 +22400,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -2.4 29.6 2.506238e+00 - 4.265789e+00 + 4.265803e+00 BAC 8.100000e+00 - -3.834211e+00 + -3.834197e+00 2020 below @@ -22641,7 +22700,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 8.3 5.9 4.786980e+00 - 6.193307e+00 + 6.193305e+00 NA NA @@ -22701,7 +22760,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.0 9.1 9.432131e+03 - 1.200035e+04 + 1.200034e+04 NA NA @@ -22761,7 +22820,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 3.7 31.6 1.584089e+00 - 3.232474e+00 + 3.232467e+00 NA NA @@ -22813,18 +22872,18 @@ And finally, let's take a look at the `biota_summary.csv` file. 2001 2021 NA - 0.7351 - 0.7351 - 0.7351 + 0.7333 + 0.7333 + 0.7333 0.4 - 0.7351 + 0.7333 0.4 9.5 - 8.945326e+03 - 1.088514e+04 + 8.949618e+03 + 1.089833e+04 BAC 6.000000e+03 - 4.885137e+03 + 4.898335e+03 3000 above @@ -23541,7 +23600,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 2.2 5.6 1.155751e+01 - 1.766374e+01 + 1.766375e+01 NA NA @@ -23661,15 +23720,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 7.9 31.6 1.891081e+01 - 7.498441e+01 + 7.498486e+01 BAC 1.220000e+01 - 6.278441e+01 + 6.278486e+01 3000 above EAC 110.000000 - -35.0155891 + -35.0151414 2021 below @@ -23679,7 +23738,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 
QShh 1.940492e+02 - -1.190647e+02 + -1.190643e+02 2021 below @@ -23832,19 +23891,19 @@ And finally, let's take a look at the `biota_summary.csv` file. 1990 2009 2020 - 0.0071 + 0.0070 0.0001 0.0000 0.0000 - 13.5 + 13.6 0.0000 - 13.5 - 8.2 - 1.209735e+03 - 1.625164e+03 + 13.6 + 8.4 + 1.206902e+03 + 1.631504e+03 BAC 1.300000e+03 - 3.251638e+02 + 3.315040e+02 2020 above @@ -23859,7 +23918,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 9.146341e+03 - -7.521178e+03 + -7.514837e+03 2020 below @@ -23961,7 +24020,7 @@ And finally, let's take a look at the `biota_summary.csv` file. NA 64.8 5.471304e+00 - 5.247959e+01 + 5.247962e+01 NA NA @@ -24029,7 +24088,7 @@ And finally, let's take a look at the `biota_summary.csv` file. above EAC 12.000000 - 19.2015028 + 19.2015015 2030 above @@ -24039,7 +24098,7 @@ And finally, let's take a look at the `biota_summary.csv` file. QShh 9.268293e+01 - -6.148142e+01 + -6.148143e+01 2022 below @@ -24141,10 +24200,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -7.9 9.3 7.109217e+00 - 9.556026e+00 + 9.556025e+00 BAC 2.400000e+00 - 7.156026e+00 + 7.156025e+00 2036 above @@ -24201,15 +24260,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 12.6 62.2 5.564609e+00 - 4.128231e+01 + 4.128065e+01 BAC 8.181818e-01 - 4.046413e+01 + 4.046247e+01 3000 above QSsp 334.000000 - -292.7176879 + -292.7193527 2022 below @@ -24219,7 +24278,7 @@ And finally, let's take a look at the `biota_summary.csv` file. QShh 2.083333e+03 - -2.042051e+03 + -2.042053e+03 2022 below @@ -24261,7 +24320,7 @@ And finally, let's take a look at the `biota_summary.csv` file. NA 17.7 2.132071e+00 - 4.292670e+00 + 4.292669e+00 NA NA @@ -24569,7 +24628,7 @@ And finally, let's take a look at the `biota_summary.csv` file. above EAC 132.405488 - -128.1546405 + -128.1546404 2017 below @@ -25041,10 +25100,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.4 6.8 7.006825e+03 - 7.839426e+03 + 7.839430e+03 BAC 6.000000e+03 - 1.839426e+03 + 1.839430e+03 3000 above @@ -25161,15 +25220,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 4.5 29.0 2.618194e+00 - 5.149778e+00 + 5.149781e+00 BAC 2.500000e+00 - 2.649778e+00 + 2.649782e+00 3000 above EAC 80.000000 - -74.8502224 + -74.8502185 2021 below @@ -25341,10 +25400,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -3.1 6.8 3.854160e+03 - 4.696115e+03 + 4.696116e+03 BAC 6.000000e+03 - -1.303885e+03 + -1.303884e+03 2018 above @@ -25649,7 +25708,7 @@ And finally, let's take a look at the `biota_summary.csv` file. above EAC 26.481098 - -25.3078107 + -25.3078108 2018 below @@ -25821,15 +25880,15 @@ And finally, let's take a look at the `biota_summary.csv` file. -4.7 46.9 8.382260e-02 - 2.913039e-01 + 2.913036e-01 BAC 7.500000e-01 - -4.586961e-01 + -4.586964e-01 2018 below EAC 5.596951 - -5.3056473 + -5.3056476 2018 below @@ -25881,15 +25940,15 @@ And finally, let's take a look at the `biota_summary.csv` file. -8.5 36.8 1.171265e-01 - 3.133840e-01 + 3.133843e-01 BAC 7.500000e-01 - -4.366160e-01 + -4.366157e-01 2018 above EAC 9.021951 - -8.7085672 + -8.7085670 2018 below @@ -25941,7 +26000,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -4.9 12.4 3.368199e+00 - 4.590677e+00 + 4.590676e+00 NA NA @@ -26121,7 +26180,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -4.1 31.8 5.737986e+00 - 1.736160e+01 + 1.736162e+01 NA NA @@ -26189,7 +26248,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 
above EAC 100.000000 - -40.1976817 + -40.1976820 2021 below @@ -26361,10 +26420,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -26.3 33.2 3.280735e+00 - 1.443685e+01 + 1.443684e+01 BAC 8.100000e+00 - 6.336848e+00 + 6.336841e+00 2020 above @@ -27381,10 +27440,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.1 4.0 7.002625e+03 - 7.977351e+03 + 7.977361e+03 BAC 6.000000e+03 - 1.977351e+03 + 1.977361e+03 3000 above @@ -27441,7 +27500,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.3 18.1 1.284506e+03 - 2.428518e+03 + 2.428544e+03 NA NA @@ -27493,19 +27552,19 @@ And finally, let's take a look at the `biota_summary.csv` file. 2004 2021 NA - 0.7592 - 0.7592 - 0.7592 + 0.7965 + 0.7965 + 0.7965 -0.3 - 0.7592 + 0.7965 -0.3 - 8.1 - 8.962650e+04 - 1.121350e+05 + 7.9 + 8.969351e+04 + 1.108173e+05 BAC 6.300000e+04 - 4.913499e+04 - 2129 + 4.781733e+04 + 2157 above NA @@ -27561,10 +27620,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 3.4 4.9 5.895511e+04 - 6.829194e+04 + 6.829196e+04 BAC 6.300000e+04 - 5.291944e+03 + 5.291963e+03 2020 above @@ -27621,10 +27680,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -11.0 26.7 1.618750e+00 - 3.567369e+00 + 3.567368e+00 BAC 2.400000e+00 - 1.167369e+00 + 1.167368e+00 2020 above @@ -27733,15 +27792,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 2011 2021 NA - 0.5937 - 0.5937 - 0.5937 - 18.8 - 0.5937 - 18.8 - 131.3 - 1.788205e+01 - 1.287941e+03 + 0.5908 + 0.5908 + 0.5908 + 18.7 + 0.5908 + 18.7 + 130.2 + 1.811408e+01 + 1.246110e+03 NA NA @@ -27801,10 +27860,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 3.2 19.4 4.513217e+02 - 1.114588e+03 + 1.114590e+03 BAC 9.600000e+02 - 1.545879e+02 + 1.545895e+02 2019 below @@ -27819,7 +27878,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 6.097561e+03 - -4.982973e+03 + -4.982971e+03 2019 below @@ -27861,7 +27920,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 3.8 47.5 1.218477e+00 - 9.630652e+00 + 9.663559e+00 NA NA @@ -28101,7 +28160,7 @@ And finally, let's take a look at the `biota_summary.csv` file. NA 74.3 5.972304e-01 - 7.002018e+02 + 7.002288e+02 NA NA @@ -28153,15 +28212,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 2006 2020 NA - 0.0824 - 0.0824 - 0.0824 + 0.0811 + 0.0811 + 0.0811 2.8 - 0.0824 + 0.0811 2.8 8.1 - 3.057967e+04 - 3.884929e+04 + 3.060942e+04 + 3.887465e+04 NA NA @@ -28213,15 +28272,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 2006 2020 NA - 0.6907 - 0.6907 - 0.6907 + 0.6826 + 0.6826 + 0.6826 0.5 - 0.6907 + 0.6826 0.5 6.4 - 3.964118e+03 - 4.781710e+03 + 3.966763e+03 + 4.787691e+03 NA NA @@ -28281,10 +28340,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.7 16.6 1.845006e+02 - 3.008291e+02 + 3.008280e+02 BAC 2.600000e+01 - 2.748291e+02 + 2.748280e+02 3000 above @@ -28299,7 +28358,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 1.000000e+03 - -6.991709e+02 + -6.991720e+02 2020 below @@ -28469,7 +28528,7 @@ And finally, let's take a look at the `biota_summary.csv` file. above EAC 110.000000 - -84.6833526 + -84.6833533 2018 below @@ -28581,7 +28640,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -1.8 7.6 1.440957e+03 - 1.815972e+03 + 1.815969e+03 NA NA @@ -28641,7 +28700,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 
-0.8 7.7 1.928844e+03 - 2.410726e+03 + 2.410708e+03 NA NA @@ -28693,18 +28752,18 @@ And finally, let's take a look at the `biota_summary.csv` file. 1999 2019 NA - 0.0296 - 0.0296 - 0.0296 + 0.0320 + 0.0320 + 0.0320 -3.8 - 0.0296 + 0.0320 -3.8 - 15.5 - 6.789019e+04 - 9.561041e+04 + 15.7 + 6.765536e+04 + 9.564331e+04 BAC 6.300000e+04 - 3.261041e+04 + 3.264331e+04 2021 above @@ -28753,23 +28812,23 @@ And finally, let's take a look at the `biota_summary.csv` file. 1999 2019 NA - 0.0494 - 0.0494 - 0.0494 - 1.9 - 0.0494 - 1.9 - 8.5 - 1.590096e+02 - 1.953489e+02 + 0.0438 + 0.0438 + 0.0438 + 2.0 + 0.0438 + 2.0 + 8.2 + 1.576562e+02 + 1.926865e+02 BAC 9.000000e+01 - 1.053489e+02 + 1.026865e+02 3000 above QSsp 121.951219 - 73.3977062 + 70.7353242 3000 above @@ -28779,7 +28838,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 3.048780e+03 - -2.853432e+03 + -2.856094e+03 2019 below @@ -28821,7 +28880,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -0.3 8.0 1.269818e+04 - 1.767981e+04 + 1.767594e+04 NA NA @@ -28877,11 +28936,11 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.0000 0.0006 6.5 - 0.0044 + 0.0045 5.3 - 7.5 - 2.188535e+04 - 2.991320e+04 + 7.6 + 2.187976e+04 + 2.990755e+04 NA NA @@ -29184,7 +29243,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.031644e+03 BAC 9.600000e+02 - 7.164403e+01 + 7.164424e+01 2021 below @@ -29244,7 +29303,7 @@ And finally, let's take a look at the `biota_summary.csv` file. 5.184633e+00 BAC 6.500000e-02 - 5.119632e+00 + 5.119633e+00 2046 above @@ -29254,7 +29313,7 @@ And finally, let's take a look at the `biota_summary.csv` file. FEQG 20.000000 - -14.815368 + -14.815367 2020 below @@ -29301,10 +29360,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -11.5 12.9 5.201285e-01 - 7.782011e-01 + 7.782012e-01 BAC 6.500000e-02 - 7.132011e-01 + 7.132012e-01 2038 above @@ -29540,11 +29599,11 @@ And finally, let's take a look at the `biota_summary.csv` file. 0.0000 -19.7 13.5 - 1.445111e-01 - 2.319564e-01 + 1.445166e-01 + 2.319647e-01 BAC 6.500000e-02 - 1.669564e-01 + 1.669647e-01 2024 above @@ -29554,7 +29613,7 @@ And finally, let's take a look at the `biota_summary.csv` file. FEQG 20.000000 - -19.768044 + -19.768035 2020 below @@ -29661,7 +29720,7 @@ And finally, let's take a look at the `biota_summary.csv` file. NA 33.7 1.622263e+00 - 4.302480e+00 + 4.302497e+00 NA NA @@ -29669,7 +29728,7 @@ And finally, let's take a look at the `biota_summary.csv` file. QSsp 3340.000000 - -3335.6975196 + -3335.6975027 2020 @@ -29721,7 +29780,7 @@ And finally, let's take a look at the `biota_summary.csv` file. NA 37.3 1.039588e+00 - 3.049004e+00 + 3.049102e+00 NA NA @@ -29781,7 +29840,7 @@ And finally, let's take a look at the `biota_summary.csv` file. NA 113.7 1.124000e-03 - 6.248620e+01 + 6.248614e+01 NA NA @@ -29901,10 +29960,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -12.2 24.6 2.873469e+00 - 5.550273e+00 + 5.550272e+00 BAC 6.500000e-02 - 5.485273e+00 + 5.485271e+00 2051 above @@ -29914,7 +29973,7 @@ And finally, let's take a look at the `biota_summary.csv` file. FEQG 20.000000 - -14.449727 + -14.449729 2020 below @@ -29961,10 +30020,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -8.9 12.0 4.549631e-01 - 6.690603e-01 + 6.690600e-01 BAC 6.500000e-02 - 6.040603e-01 + 6.040600e-01 2042 above @@ -30021,10 +30080,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 
-8.6 11.9 4.548695e+00 - 6.268090e+00 + 6.268089e+00 BAC 6.500000e-02 - 6.203090e+00 + 6.203089e+00 2069 above @@ -30034,7 +30093,7 @@ And finally, let's take a look at the `biota_summary.csv` file. FEQG 80.000000 - -73.731910 + -73.731911 2020 below @@ -30201,10 +30260,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -15.6 30.7 1.553649e-01 - 3.584500e-01 + 3.584499e-01 BAC 6.500000e-02 - 2.934500e-01 + 2.934499e-01 2026 above @@ -30449,7 +30508,7 @@ And finally, let's take a look at the `biota_summary.csv` file. QSsp 3340.000000 - -3336.5011712 + -3336.5011714 2020 @@ -30501,7 +30560,7 @@ And finally, let's take a look at the `biota_summary.csv` file. NA 29.1 8.913997e-01 - 2.066000e+00 + 2.065998e+00 NA NA @@ -30561,7 +30620,7 @@ And finally, let's take a look at the `biota_summary.csv` file. NA 150.9 2.494000e-03 - 1.275083e+03 + 1.275085e+03 NA NA @@ -30621,7 +30680,7 @@ And finally, let's take a look at the `biota_summary.csv` file. NA 161.3 1.317860e-02 - 3.389176e+00 + 3.389185e+00 NA NA @@ -30861,10 +30920,10 @@ And finally, let's take a look at the `biota_summary.csv` file. -3.3 16.2 5.715751e+01 - 8.490202e+01 + 8.490190e+01 BAC 2.600000e+01 - 5.890202e+01 + 5.890190e+01 2044 above @@ -30879,7 +30938,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 1.000000e+03 - -9.150980e+02 + -9.150981e+02 2020 below @@ -31041,15 +31100,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 26.5 14.6 2.359492e+01 - 4.155589e+01 + 4.155590e+01 BAC 1.400000e+00 - 4.015589e+01 + 4.015590e+01 3000 above EAC 600.000000 - -558.4441054 + -558.4440997 2021 below @@ -31059,7 +31118,7 @@ And finally, let's take a look at the `biota_summary.csv` file. QShh 3.048780e+01 - 1.106809e+01 + 1.106810e+01 2021 below @@ -31281,10 +31340,10 @@ And finally, let's take a look at the `biota_summary.csv` file. 3.9 25.6 1.447601e+02 - 3.574847e+02 + 3.574835e+02 BAC 8.100000e+00 - 3.493847e+02 + 3.493835e+02 3000 above @@ -31461,15 +31520,15 @@ And finally, let's take a look at the `biota_summary.csv` file. 1.2 14.4 2.534392e+01 - 4.253511e+01 + 4.253478e+01 BAC 1.220000e+01 - 3.033511e+01 + 3.033478e+01 3000 above EAC 110.000000 - -67.4648945 + -67.4652232 2021 below @@ -31479,7 +31538,7 @@ And finally, let's take a look at the `biota_summary.csv` file. QShh 1.829268e+02 - -1.403917e+02 + -1.403921e+02 2021 below @@ -31521,15 +31580,15 @@ And finally, let's take a look at the `biota_summary.csv` file. -2.7 5.2 6.022694e+01 - 7.344098e+01 + 7.344101e+01 BAC 9.000000e+01 - -1.655902e+01 + -1.655899e+01 2021 below QSsp 121.951219 - -48.5102432 + -48.5102067 2021 below @@ -31539,7 +31598,7 @@ And finally, let's take a look at the `biota_summary.csv` file. MPC 3.048780e+03 - -2.975340e+03 + -2.975339e+03 2021 below @@ -31641,7 +31700,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -37.8 77.9 4.144080e-02 - 6.552550e-02 + 6.552540e-02 NA NA @@ -31701,7 +31760,7 @@ And finally, let's take a look at the `biota_summary.csv` file. -36.4 45.8 4.782990e-02 - 8.925290e-02 + 8.925300e-02 NA NA @@ -31709,7 +31768,7 @@ And finally, let's take a look at the `biota_summary.csv` file. EAC 22.000000 - -21.9107471 + -21.9107470 2018 below diff --git a/vignettes/example_OSPAR.Rmd.orig b/vignettes/example_OSPAR.Rmd.orig index a8bd7b2..76a0e15 100644 --- a/vignettes/example_OSPAR.Rmd.orig +++ b/vignettes/example_OSPAR.Rmd.orig @@ -290,6 +290,7 @@ rules for birds and mammals. 
This is dealt with using the customised function ```{r ospar-biota-timeseries} + biota_timeseries <- create_timeseries( biota_data, determinands.control = list( diff --git a/vignettes/example_external_data.Rmd b/vignettes/example_external_data.Rmd index c311060..30833f6 100644 --- a/vignettes/example_external_data.Rmd +++ b/vignettes/example_external_data.Rmd @@ -26,7 +26,7 @@ in a directory `data`, and information files in a directory `information`, but you can use any directory for these. ```r -working.directory <- '/Users/stuart/git/HARSAT' +working.directory <- 'C:/Users/robfr/Documents/HARSAT/HARSAT' ``` # Read data @@ -44,25 +44,25 @@ biota_data <- read_data( data_format = "external", info_dir = file.path(working.directory, "information", "AMAP"), ) -#> Found in path determinand.csv /Users/stuart/git/HARSAT/information/AMAP/determinand.csv -#> Found in path species.csv /Users/stuart/git/HARSAT/information/AMAP/species.csv -#> Found in path thresholds_biota.csv /Users/stuart/git/HARSAT/information/AMAP/thresholds_biota.csv -#> Found in package method_extraction.csv /Users/stuart/git/HARSAT/inst/information/method_extraction.csv -#> Found in package pivot_values.csv /Users/stuart/git/HARSAT/inst/information/pivot_values.csv -#> Found in package matrix.csv /Users/stuart/git/HARSAT/inst/information/matrix.csv -#> Found in package imposex.csv /Users/stuart/git/HARSAT/inst/information/imposex.csv -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/AMAP/determinand.csv': '7e310c487109c531aa62cb9a217f879b' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/matrix.csv': '9ba2731a7d90accddac659025835a6e4' -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/AMAP/species.csv': 'a3d78e2147a3fc2867b82d1d16a3ae88' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/imposex.csv': '7e42cb57944b9d79216ad25c12ccada5' -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/AMAP/thresholds_biota.csv': '7208225201dcbb851427a4df70760106' +#> Found in path determinand.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\AMAP\determinand.csv +#> Found in path species.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\AMAP\species.csv +#> Found in path thresholds_biota.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\AMAP\thresholds_biota.csv +#> Found in package method_extraction.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/method_extraction.csv +#> Found in package pivot_values.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/pivot_values.csv +#> Found in package matrix.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv +#> Found in package imposex.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/imposex.csv +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\AMAP\determinand.csv': '80bca84d428856c93e89c52aebf8b144' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv': '4b4fb3814bb84cfbf9b37f7b59d45eb9' +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\AMAP\species.csv': '1aba5ace8155923ed18bd9d0b414e48e' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/imposex.csv': 'b602a882d4783085c896bcf130c8f848' +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\AMAP\thresholds_biota.csv': 'a6d82623b8968910b59c1308a646e8a8' #> Reading station dictionary from: -#> '/Users/stuart/git/HARSAT/data/example_external_data/EXTERNAL_AMAP_STATIONS.csv' -#> MD5 digest for: 
'/Users/stuart/git/HARSAT/data/example_external_data/EXTERNAL_AMAP_STATIONS.csv': '4abf9ba9b2f033e296d4ddc3b99c16d7'
+#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_external_data/EXTERNAL_AMAP_STATIONS.csv'
+#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_external_data/EXTERNAL_AMAP_STATIONS.csv': '91e7eb7661ce43b02c68cc81153ac3d7'
#>
#> Reading contaminant and effects data from:
-#> '/Users/stuart/git/HARSAT/data/example_external_data/EXTERNAL_FO_PW_DATA.csv'
-#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_external_data/EXTERNAL_FO_PW_DATA.csv': 'dce2913be61985213ac903f69a783c9f'
+#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_external_data/EXTERNAL_FO_PW_DATA.csv'
+#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_external_data/EXTERNAL_FO_PW_DATA.csv': '9ade644c5bdb561f0c3bf4a3560a2c15'
#>
#> Argument max_year taken to be the maximum year in the data: 2020
```
@@ -118,6 +118,7 @@ biota_timeseries <- create_timeseries(
#>
#> Creating time series data
#> Converting data to appropriate basis for statistical analysis
+#> Missing uncertainties which cannot be imputed: deleted data in 'missing_uncertainties.csv'
#> Dropping groups of compounds / stations with no data between 2015 and 2020
```
diff --git a/vignettes/external-file-format.Rmd b/vignettes/external-file-format.Rmd
index a57fa68..55eea22 100644
--- a/vignettes/external-file-format.Rmd
+++ b/vignettes/external-file-format.Rmd
@@ -7,22 +7,29 @@ vignette: >
  %\VignetteEncoding{UTF-8}
---

-These are the column headers for CSV-formatted external data files.
+These are the column headers for CSV-formatted external data files. The files should be UTF-8 encoded.
+
+Missing values should be supplied as blank cells, not as `NA` or some other code.
+
+Other columns can also be supplied, but will typically be ignored.
+

## Contaminant data

+The data file has one row for each measurement.
+
| column name | type | mandatory | comments |
| ------------- | --------- | :-------: | --------------------- |
-| `country` | character | yes | must match country in station file
no missing values | -| `station_code` | alphanumeric | yes | must match station_code in station file
no missing values | -| `station_name` | character | yes | must match station_name in station file
no missing values | -| `sample_latitude` | numeric (decimal degrees) | | doesn’t need to match station latitude in station file | -| `sample_longitude` | numeric (decimal degrees) | | doesn’t need to match station longitude in station file | -| `year` | integer | yes | monitoring year
doesn’t necessarily match date since a sampling season running from e.g. November 2021 to May 2022 might all be considered the 2022 monitoring year
no missing values | +| `country` | character | yes | identifies the source of the data; for international assessments this is typically the country of origin, but for national assessments it could be a local monitoring authority
must match `country` in station file
no missing values | +| `station_code` | alphanumeric | yes | the station (code) where the sample was collected
must match `station_code` in station file
no missing values | +| `station_name` | alphanumeric | yes | the station (name) where the sample was collected; this is often more intuitive to a user than `station_code`
must match `station_name` in station file
no missing values | +| `sample_latitude` | numeric (decimal degrees) | | need not match `station_latitude` in station file | +| `sample_longitude` | numeric (decimal degrees) | | need not match `station_longitude` in station file | +| `year` | integer | yes | monitoring year
doesn’t necessarily match `date` since a sampling season running from e.g. November 2021 to May 2022 might all be considered the 2022 monitoring year
no missing values | | `date` | date: use ISO 8601 standard e.g. 2023-06-28 | | sampling date | -| `depth` | numeric (m) | | sediment: assumed to be a surface sediment sample with depth being the lower depth of the grab; water: assumed to be a surface water sample with depth being the upper depth of the sample; biota: not used, so can supply whatever is useful (or omit) | +| `depth` | numeric (m) | | sediment: assumed to be a surface sediment sample with depth being the lower depth of the grab
water: assumed to be a surface water sample with depth being the upper depth of the sample
biota: not used, so can supply whatever is useful (or omit) |
| `species` | character | yes (biota) | Latin name which must match a `submitted_species` in the species reference table
no missing values (biota) |
-| `sex` | character | | see ICES reference codes for SEXCO
required for EROD assessments
desirable if sex is used to subdivide timeseries (see subseries) | +| `sex` | character | | see ICES reference codes for SEXCO
required for EROD assessments
desirable if sex is used to subdivide timeseries (see `subseries`) | | `n_individual` | integer | | number of pooled individuals in the sample
required for imposex assessments | | `subseries` | character | | used to split up timeseries by e.g. sex or age
for example: `juvenile`, `adult_male`, `adult_female`
missing values indicate that all records in a timeseries will be considered together (no subdivision) | | `sample` | alphanumeric | yes | links measurements made on the same individuals (biota), in the same sediment grab or in the same water sample
no missing values | @@ -36,15 +43,17 @@ These are the column headers for CSV-formatted external data files. | `limit_quantification` | numeric | | same unit as value | | `uncertainty` | numeric | | analytical uncertainty in the measurement
same unit as value | | `unit_uncertainty` | character | | `SD`, `U2` or `%`
if `uncertainty` is present, `unit_uncertainty` must also be present | -| `method_pretreatment` | character (ICES metpt list) | | see ICES reference codes for METPT | -| `method_analysis` | character (ICES metpt list) | | see ICES reference codes for METOA
required for bile metabolite measurements | -| `method_extraction` | character (ICES metpt list) | | see ICES reference codes for METCX
required for sediment normalisation (typically for metals) | +| `method_pretreatment` | character | | use ICES reference codes for METPT | +| `method_analysis` | character | | use ICES reference codes for METOA
required for bile metabolite measurements | +| `method_extraction` | character | | use ICES reference codes for METCX
required for sediment normalisation (typically for metals) |

## Station data

+The station file has one row for each station.
+
| column name | type | mandatory | comments |
| ------------ | -------- | :--------: | --------------------- |
-| `OSPAR_region` | character | | the regional columns can be called anything (and are optional)
OSPAR: use `OSPAR_region` and `OSPAR_subregion`
HELCOM: use `HELCOM_subbasin`, `HELCOM_L3` and `HELCOM_L4` | +| `OSPAR_region` | character | | the regional columns can be called anything (and are optional)
for OSPAR assessments, use `OSPAR_region` and `OSPAR_subregion`
for HELCOM assessments, use `HELCOM_subbasin`, `HELCOM_L3` and `HELCOM_L4`
for other assessments any regional columns must be explicitly identified when calling `read_data` using the `control` argument | | `OSPAR_subregion` | character | | see above | | `country` | character | yes | no missing values | | `station_code` | alphanumeric | yes | no missing values | diff --git a/vignettes/harsat.Rmd b/vignettes/harsat.Rmd index fb4b24c..fd587d3 100644 --- a/vignettes/harsat.Rmd +++ b/vignettes/harsat.Rmd @@ -81,7 +81,7 @@ in a directory `data`, and information files in a directory `information`, but you can use any directory for these. ```r -working.directory <- '/Users/stuart/git/HARSAT' +working.directory <- 'C:/Users/robfr/Documents/HARSAT/HARSAT' ``` ## Reading in the data @@ -100,23 +100,23 @@ water_data <- read_data( info_dir = file.path(working.directory, "information", "OSPAR_2022"), extraction = "2023/08/23" ) -#> Found in path determinand.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/determinand.csv -#> Found in path species.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/species.csv -#> Found in path thresholds_water.csv /Users/stuart/git/HARSAT/information/OSPAR_2022/thresholds_water.csv -#> Found in package method_extraction.csv /Users/stuart/git/HARSAT/inst/information/method_extraction.csv -#> Found in package pivot_values.csv /Users/stuart/git/HARSAT/inst/information/pivot_values.csv -#> Found in package matrix.csv /Users/stuart/git/HARSAT/inst/information/matrix.csv -#> Found in package imposex.csv /Users/stuart/git/HARSAT/inst/information/imposex.csv -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/OSPAR_2022/determinand.csv': '912a86ca3efdc719e405a7632e2b89ce' -#> MD5 digest for: '/Users/stuart/git/HARSAT/inst/information/matrix.csv': '9ba2731a7d90accddac659025835a6e4' -#> MD5 digest for: '/Users/stuart/git/HARSAT/information/OSPAR_2022/thresholds_water.csv': '2b165f406bb440297435ea3f46eb3612' +#> Found in path determinand.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\determinand.csv +#> Found in path species.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\species.csv +#> Found in path thresholds_water.csv C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\thresholds_water.csv +#> Found in package method_extraction.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/method_extraction.csv +#> Found in package pivot_values.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/pivot_values.csv +#> Found in package matrix.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv +#> Found in package imposex.csv C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/imposex.csv +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\determinand.csv': '6b36346446c0ac04a52b3f1347829f6b' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/inst/information/matrix.csv': '4b4fb3814bb84cfbf9b37f7b59d45eb9' +#> MD5 digest for: 'C:\Users\robfr\Documents\HARSAT\HARSAT\information\OSPAR_2022\thresholds_water.csv': '615ef96f716ef1d43c01ab67f383c881' #> Reading station dictionary from: -#> '/Users/stuart/git/HARSAT/data/example_OSPAR/stations.txt' -#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_OSPAR/stations.txt': '057984ad2a1885bc5d15a41ee3b34471' +#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/stations.txt' +#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/stations.txt': '58b9e90f314e89f637c60558c06755f4' #> #> Reading contaminant and effects data from: -#> 
'/Users/stuart/git/HARSAT/data/example_OSPAR/water.txt'
-#> MD5 digest for: '/Users/stuart/git/HARSAT/data/example_OSPAR/water.txt': '0ccaec75c5fd7e875c730467d58fdb26'
+#> 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/water.txt'
+#> MD5 digest for: 'C:/Users/robfr/Documents/HARSAT/HARSAT/data/example_OSPAR/water.txt': '13d63b6161b671165b215b58f5e22469'
#>
#> Matching data with station dictionary
#> - restricting to stations in these convention areas: OSPAR
@@ -228,6 +228,7 @@ water_timeseries <- create_timeseries(
#> Limit of quantification less than limit of detection: see limits_inconsistent.csv
#> Censoring codes D and Q inconsistent with respective limits: see censoring_codes_inconsistent.csv
#> Detection limit higher than data: see detection_limit_high.csv
+#> Implausible uncertainties reported with data: see implausible_uncertainties_reported.csv
#> Data submitted as CHRTR relabelled as CHR
#> Data submitted as BBF, BKF summed to give BBKF
#> 1 of 71 samples lost due to incomplete submissions
diff --git a/vignettes/reference-file-formats.Rmd b/vignettes/reference-file-formats.Rmd
index 74bf7b0..1bc83ef 100644
--- a/vignettes/reference-file-formats.Rmd
+++ b/vignettes/reference-file-formats.Rmd
@@ -7,7 +7,8 @@ vignette: >
  %\VignetteEncoding{UTF-8}
---

-These are the column headers for CSV-formatted reference table files.
+These are the column headers for CSV-formatted reference table files. Ideally, the files should be UTF-8 encoded.
+
`harsat` uses three different reference tables:

1. A species file
@@ -18,6 +19,8 @@ All these files should be in your information files directory.

Do not use any forward or backward slashes in the reference files.

+Missing values should be supplied as blank cells, not as `NA`.
+
The order of the columns does not matter, as long as they are named consistently with the specification below.

## Species file format
@@ -67,7 +70,7 @@ The determinand file will be `determinand.csv` in your information files directo
| `sediment_auxiliary` | character | yes* | yes | Identifies all the auxiliary measurements that should be associated with the determinand. These should be separated by a `~`. For example, for metal contaminants, this might be 'AL~LI~CORG' |
| `water_auxiliary` | character | yes* | yes | Identifies all the auxiliary measurements that should be associated with the determinand. These should be separated by a `~`. Not currently used much for water assessments |
| `biota_sd_constant` | character | no | yes | If supplied, this allows the imputation of measurement uncertainty for determinands in a biota assessment when they are missing from the data file. `sd_constant` is the constant error with units given by `biota_unit` |
-| `biota_sd_variable` | character | no | yes | If supplied, this allows the imputation of measurement uncertainty for determinands in a biota assessment when they are missing from the data file. `sd_constant` is the proportional error expressed as a percentage (%) |
+| `biota_sd_variable` | character | no | yes | If supplied, this allows the imputation of measurement uncertainty for determinands in a biota assessment when they are missing from the data file. `sd_variable` is the proportional error expressed as a percentage (%) |
| `sediment_sd_constant` | character | no | yes | See `biota_sd_constant` |
| `sediment_sd_variable` | character | no | yes | See `biota_sd_variable` |
| `water_sd_constant` | character | no | yes | See `biota_sd_constant` |
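
To make the interaction between `sd_constant` and `sd_variable` concrete, here is a minimal sketch of how a missing uncertainty could be imputed from the two error components described in the determinand table above. It assumes the usual quadrature combination of a fixed and a proportional error; the function name `impute_uncertainty` and the exact formula are illustrative, not harsat's implementation:

```r
# Illustrative sketch only (not harsat code): impute a missing measurement
# uncertainty, expressed as a standard deviation, from the determinand table.
# Assumes sd_constant is in the units of `value` (e.g. biota_unit) and
# sd_variable is a percentage; the two components combine in quadrature.
impute_uncertainty <- function(value, sd_constant, sd_variable) {
  sqrt(sd_constant^2 + (sd_variable / 100 * value)^2)
}

impute_uncertainty(value = 50, sd_constant = 0.5, sd_variable = 10)
#> [1] 5.024938
```

Under this assumption the proportional term dominates at high concentrations, while the constant term keeps the imputed uncertainty realistic for measurements near the detection limit.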