Refactor Analyze Poisson using broom #114

Merged
merged 1 commit into from
Feb 10, 2022
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -10,7 +10,8 @@ Encoding: UTF-8
Imports:
dplyr,
lubridate,
magrittr
magrittr,
broom
Suggests:
testthat (>= 3.0.0),
safetyData,
2 changes: 1 addition & 1 deletion NAMESPACE
@@ -21,6 +21,7 @@ export(Transform_EventCount)
export(TreatmentExposure)
import(dplyr)
import(lubridate)
importFrom(broom,augment)
importFrom(lubridate,is.Date)
importFrom(lubridate,time_length)
importFrom(magrittr,"%>%")
@@ -30,6 +31,5 @@ importFrom(stats,offset)
importFrom(stats,pnorm)
importFrom(stats,poisson)
importFrom(stats,quantile)
importFrom(stats,residuals)
importFrom(stats,sd)
importFrom(stats,wilcox.test)
25 changes: 15 additions & 10 deletions R/Analyze_Poisson.R
@@ -15,12 +15,13 @@
#' The input data (` dfTransformed`) for the Analyze_Poisson is typically created using \code{\link{Transform_EventCount}} and should be one record per Site with columns for:
#' - `SubjectID` - Unique subject ID
#' - `SiteID` - Site ID
#' - `Count` - Number of Adverse Events
#' - `Exposure` - Number of days of exposure
#' - `TotalCount` - Number of Events
#' - `TotalExposure` - Number of days of exposure
#'
#' @param dfTransformed data.frame in format produced by \code{\link{Transform_EventCount}}. Must include
#'
#' @importFrom stats glm offset poisson residuals pnorm
#' @importFrom stats glm offset poisson pnorm
#' @importFrom broom augment
#'
#' @return input data frame with columns added for "Residuals", "PredictedCount" and "PValue"
#'
@@ -38,18 +39,22 @@ Analyze_Poisson <- function( dfTransformed ){
all(c("SiteID", "N", "TotalExposure", "TotalCount", "Rate") %in% names(dfTransformed))
)

dfTransformed$LogExposure <- log( dfTransformed$TotalExposure )
dfModel <- dfTransformed %>% mutate(LogExposure = log( .data$TotalExposure) )

cModel <- stats::glm(
TotalCount ~ stats::offset(LogExposure), family=stats::poisson(link="log"),
data=dfTransformed
data=dfModel
)

dfAnalyzed <- dfTransformed
dfAnalyzed$Residuals <- stats::residuals( cModel )
dfAnalyzed$PredictedCount <- exp(dfAnalyzed$LogExposure*cModel$coefficients[2]+cModel$coefficients[1])
dfAnalyzed$PValue = stats::pnorm( abs(dfAnalyzed$Residuals) , lower.tail=F ) * 2
dfAnalyzed <- dfAnalyzed[order(abs(dfAnalyzed$Residuals) , decreasing=T), ]
dfAnalyzed <- broom::augment(cModel, dfModel, type.predict = "response") %>%
Contributor

This was probably looked into a while back, but I forget the difference between residuals and standardized residuals for the formula in the Poisson model. I think we only need to keep the relevant columns (the two residuals columns). The additional hat (leverage), sigma (sd), and cooksd (influence) columns may confuse users and can probably be hidden for now.

Contributor Author

Yeah, I agree. I'll think about it a bit more once Wilcoxon and maybe disposition are done, and try to come up with some standardized naming.

rename(
Residuals=.data$.resid,
PredictedCount=.data$.fitted,
) %>%
mutate(PValue = stats::pnorm( abs(.data$Residuals) , lower.tail=F ) * 2) %>%
arrange(.data$Residuals)

# Note that the PValue calculation is a non-standard approximation and might be more accurately labeled a "standardized estimate" rather than a formal p-value.

return(dfAnalyzed)
}
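
On the question in the review thread about residuals versus standardized residuals, the following is a minimal base R sketch rather than the package's implementation. It assumes deviance residuals (the default for `residuals()` on a glm) and uses invented data in a hypothetical `dfToy` data frame. It also writes `offset()` unqualified, which is an assumption of the sketch and differs from the `stats::offset()` spelling in the diff. These quantities probably correspond to the `.resid`, `.std.resid`, and `.hat` columns returned by `broom::augment`, but that mapping should be checked against the installed broom version.

```r
# Hypothetical site-level data; the values are invented for illustration.
dfToy <- data.frame(
  SiteID        = c("A", "B", "C", "D", "E"),
  TotalCount    = c(5, 12, 3, 20, 9),
  TotalExposure = c(100, 150, 90, 160, 120)
)
dfToy$LogExposure <- log(dfToy$TotalExposure)

# Assumption of this sketch: offset() is written unqualified.
cModel <- glm(
  TotalCount ~ offset(LogExposure),
  family = poisson(link = "log"),
  data = dfToy
)

# Deviance residuals: the default type for residuals() on a glm.
vResid <- residuals(cModel, type = "deviance")

# Leverage (hat values) for each site.
vHat <- hatvalues(cModel)

# Standardized deviance residuals.
vStdResid <- rstandard(cModel, type = "deviance")

# For a Poisson model the dispersion is fixed at 1, so the standardized
# residual is the deviance residual divided by sqrt(1 - leverage).
all.equal(unname(vStdResid), unname(vResid / sqrt(1 - vHat)))

# Two-sided normal tail area, computed the same way as PValue in the diff.
vPValue <- pnorm(abs(vResid), lower.tail = FALSE) * 2
```

Because leverage lies between 0 and 1, a standardized residual is never smaller in magnitude than the raw deviance residual. A tail area computed from the raw residual is therefore never smaller than one computed from the standardized residual.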
4 changes: 2 additions & 2 deletions man/Analyze_Poisson.Rd

Some generated files are not rendered by default.