Increase precision of xls datetimes when coerced to character #431
@jimhester This "fixes" the problem but I am not entirely satisfied because
General context, in case it's not clear: until the day when we can translate xls(x) date time formats to R date time formats, datetimes will be coerced to character as if they were just regular doubles. Do you have any advice re: setting precision and/or testing?
devtools::load_all(here::here())
#> Loading readxl
xlsx <- read_excel(test_sheet("texty-dates-xlsx.xlsx"), col_types = "text")
xls <- read_excel(test_sheet("texty-dates-xls.xls"), col_types = "text")
xlsx
#> # A tibble: 2 x 1
#> a
#> <chr>
#> 1 31117.541666666672
#> 2 31117.558009259261
xls
#> # A tibble: 2 x 1
#> a
#> <chr>
#> 1 31117.5416667
#> 2 31117.5580093
Created on 2018-03-15 by the reprex package (v0.2.0).
Re: "translate xls(x) date time formats to R date time formats"
Why can't we do this now?
src/XlsCell.h
Outdated
@@ -247,7 +247,7 @@ class XlsCell {
       if (std::modf(cell_->d, &intpart) == 0.0) {
         strs << std::fixed << (int64_t)cell_->d;
       } else {
-        strs << cell_->d;
+        strs << std::setprecision(12) << cell_->d;
This needs to be something like std::numeric_limits<double>::digits10 + 2 (which ends up being 17 for 64-bit doubles) to ensure full precision. See the proposal linked at https://stackoverflow.com/questions/554063/how-do-i-print-a-double-value-with-full-precision-using-cout#comment29144568_554134 for details on why.
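To make the precision point concrete, here is a minimal sketch, not readxl's actual code. The helpers format_double and round_trips are hypothetical names introduced only for illustration. It formats one of the serial dates from the reprex above at the stream default (6 significant digits), at 12, and at digits10 + 2, then checks whether parsing the text back yields the same double. It assumes IEEE-754 64-bit doubles, for which digits10 + 2 is 17.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper (not part of readxl): format a double with the given
// number of significant digits, using the stream's default float notation.
std::string format_double(double x, int precision) {
  std::ostringstream strs;
  strs << std::setprecision(precision) << x;
  return strs.str();
}

// Hypothetical helper: true when formatting and then re-parsing the text
// reproduces exactly the same double.
bool round_trips(double x, int precision) {
  return std::stod(format_double(x, precision)) == x;
}

// Precision suggested in the review; 17 for IEEE-754 64-bit doubles.
const int kFullPrecision = std::numeric_limits<double>::digits10 + 2;
```

With the serial date 31117.558009259261, format_double(x, 6) gives "31117.6" and format_double(x, 12) gives "31117.5580093", neither of which parses back to the original value, while round_trips(x, kFullPrecision) holds.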
OK thanks. Do you think I should pre-emptively do the same on the xlsx side, so they are both doing the "right" thing and doing the same thing? Disregard: just remembered that there is no coercion on the xlsx side -- the serial date is already a string.
Re: converting time format strings, our impression is that this is doable but it's also not a tiny piece of work. The plan is to use or build on this: https://github.com/WizardMac/TimeFormatStrings, which would have application across multiple packages.
Alright, I apparently misunderstood how the dates are represented in xls format, LGTM!
Fixes #430: read_xls rounding date times when col_types="text"