-
-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature Request]: a_summary.logical formatting not dealing as a_summary.factor when denom = 0 #649
Comments
I think this is a fast fix. Should we use just 0 or NA? @edelarua @ayogasekaram @shajoezhu |
NA |
This changes a lot of things. Keeping it in todo. If we want to unify this behavior, we need to be sure that NA is an universal standard for / 0. Can we assume so? @anajens @barnett11 ? |
The actual standards guidance (which was written pre-R onboarding I should caveat) is below, 3.1. Presentation of counts and percentages 3.2. Presentation of Summary Statistics |
@Melkiades sorry for the very brief reply, was meant to get back to this thread, and I was hoping for dig up the rationale we did the way we have now. but got side tracked, and many thanks to @barnett11 , this is exactly the context i was looking for. When I was thinking of NA, I was thinking to specify use the NA_str to modify the "reporting" here. pernally, i hope we can seperate the actual value we compute here and the values that we reporting here. back in the days when we first wrote these summary statistics function, rtables did not have the option for na_str display. I think that is why we mixed these computation together. I think another good examples can be found at, #543 (comment) I was thinking, maybe we can do something more explicit to the reporting option, the NA is caused by "dividing a treatment group of 0", the computation should be mathematical, letting 0/0 be computed as 0 is less ideal I think. but if we are more explicit with what "NA" is displayed as, to users, this option would be more flexible? what do you think? @Melkiades @barnett11 @anajens @khatril |
If I understood correctly @barnett11 comment, it seems that any summary statistic derived from a 0 population should be NE, while counts should just be 0 (rightfully so). I would put NA for tern to stay general, while having NE in tlg-catalog through |
checked with @barnett11 , specify na_str = "NE" for this could work |
Feature description
Hi @shajoezhu ,
When using the
summarize_vars
function, I realized that when the denominator is 0,summarize_vars()
returnsNA
for logical variables while it returns0
for factor variables for.stats = count_fraction
. I feel both should act in the same way. I am proposing the solution.See example:
Look at the red and green circles.
The following is the reason why this is happening:
s_summary.logical
y$count_fraction <- c(count, ifelse(dn > 0, count / dn, NA))
s_summary.factor
In the logical part we are assigning the NA value if denom is 0, In the factor case we are assigning the 0 value for the same case. Then, if we go to format_count_fraction() which is the function used for formatting, if
x[1]
orx[2]
isNA
(which is the case for the logical scenario above), then the output will beNA
.So the solution would be:
To act in the same in both cases. I would change s_summary.logical to be aligned with s_summary.factor:
y$count_fraction <- c(count, ifelse(dn > 0, count / dn, NA))
Don't you think that when showing counts and percentages (xx (xx%)), would be even better the output to be i.e 0/0%, instead of just 0 when denom is 0?
See the new proposal on format_count_fraction() code:
Code of Conduct
Contribution Guidelines
Security Policy
The text was updated successfully, but these errors were encountered: