Increase precision of xls datetimes when coerced to character #431

Merged
merged 5 commits into from
Mar 16, 2018

Conversation

jennybc
Member

@jennybc jennybc commented Mar 16, 2018

Fixes #430 read_xls rounding date times when col_types="text"

@jennybc jennybc requested a review from jimhester March 16, 2018 06:07
@jennybc
Member Author

jennybc commented Mar 16, 2018

@jimhester This "fixes" the problem but I am not entirely satisfied because

  • I don't have a good handle on why the precision was previously so low (?) on the xls side, nor how it's currently set on the xlsx side.
  • I feel uncertain about how many digits of agreement I should expect between xlsx and xls in the test.

General context, in case it's not clear: until the day when we can translate xls(x) date time formats to R date time formats, datetimes will be coerced to character as if they were just regular doubles.

Do you have any advice re: setting precision and/or testing?


devtools::load_all(here::here())
#> Loading readxl
xlsx <- read_excel(test_sheet("texty-dates-xlsx.xlsx"), col_types = "text")
xls <- read_excel(test_sheet("texty-dates-xls.xls"), col_types = "text")

xlsx
#> # A tibble: 2 x 1
#>   a                 
#>   <chr>             
#> 1 31117.541666666672
#> 2 31117.558009259261
xls
#> # A tibble: 2 x 1
#>   a            
#>   <chr>        
#> 1 31117.5416667
#> 2 31117.5580093

Created on 2018-03-15 by the reprex package (v0.2.0).
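
For a sense of scale, here is a minimal sketch of what the digits after the decimal point encode. It assumes the usual spreadsheet serial-date convention, in which the fractional part is a fraction of a 24-hour day. `seconds_into_day` is a hypothetical helper written for this illustration; it is not part of readxl.

```cpp
#include <cmath>

// Hypothetical helper (not readxl code): seconds since midnight encoded by
// the fractional part of a serial date, assuming 1 day == 86400 seconds.
double seconds_into_day(double serial) {
  double intpart;
  // modf splits the value into integer days (intpart) and the day fraction.
  double frac = std::modf(serial, &intpart);
  return frac * 86400.0;
}
```

Under that assumption, row 2 of the xlsx output (31117.558009259261) is about 48212 seconds past midnight, i.e. 13:23:32. The 12-digit xls string 31117.5580093 lands within a few milliseconds of that, whereas a value cut to 6 significant digits (31117.6) would be off by roughly an hour.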

Contributor

@jimhester jimhester left a comment

Why can't we do this now?

> translate xls(x) date time formats to R date time formats

src/XlsCell.h Outdated
@@ -247,7 +247,7 @@ class XlsCell {
       if (std::modf(cell_->d, &intpart) == 0.0) {
         strs << std::fixed << (int64_t)cell_->d;
       } else {
-        strs << cell_->d;
+        strs << std::setprecision(12) << cell_->d;
Contributor

This needs to be something like std::numeric_limits<double>::digits10 + 2 (which ends up being 17 for 64-bit doubles) to ensure full precision. See the proposal linked at https://stackoverflow.com/questions/554063/how-do-i-print-a-double-value-with-full-precision-using-cout#comment29144568_554134 for details on why.
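
A minimal sketch of the effect of the precision setting, mirroring the branch structure of the diff above. `format_double` and `round_trips` are hypothetical helpers for illustration, not readxl functions, and the input is the serial date from the reprex. A C++ stream defaults to 6 significant digits when no precision is set, which would explain the coarse output from the original `strs << cell_->d`.

```cpp
#include <cmath>
#include <cstdint>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper (not readxl code): format a double as in the diff,
// with a caller-chosen number of significant digits for non-integers.
std::string format_double(double x, int precision) {
  std::ostringstream strs;
  double intpart;
  if (std::modf(x, &intpart) == 0.0) {
    // Whole numbers are written as integers, without a decimal point.
    strs << std::fixed << static_cast<int64_t>(x);
  } else {
    strs << std::setprecision(precision) << x;
  }
  return strs.str();
}

// True if parsing the formatted string recovers exactly the same double.
bool round_trips(double x, int precision) {
  return std::stod(format_double(x, precision)) == x;
}
```

With x = 31117.541666666672, `format_double(x, 6)` gives "31117.5" and `format_double(x, 12)` gives "31117.5416667"; neither string parses back to the original double. Using `std::numeric_limits<double>::digits10 + 2` (17) does round-trip. For IEEE 754 doubles this is the same number as `std::numeric_limits<double>::max_digits10`, available since C++11.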

Member Author

@jennybc jennybc Mar 16, 2018

OK, thanks. Do you think I should pre-emptively do the same on the xlsx side, so that both are doing the "right" thing and doing the same thing? Disregard: I just remembered that there is no coercion on the xlsx side -- the serial date is already a string.

Re: converting time format strings, our impression is that this is doable but it's also not a tiny piece of work. The plan is to use or build on this: https://github.com/WizardMac/TimeFormatStrings, which would have application across multiple packages.

Contributor

Alright, I apparently misunderstood how the dates are represented in xls format, LGTM!

@jennybc jennybc merged commit 4af8b31 into tidyverse:master Mar 16, 2018
@jennybc jennybc deleted the bugfix-430-xls-precision branch March 16, 2018 20:25