Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add vctrs support to lubridate #871

Merged
merged 26 commits into from
May 18, 2020
Merged

Add vctrs support to lubridate #871

merged 26 commits into from
May 18, 2020

Conversation

DavisVaughan
Copy link
Member

@DavisVaughan DavisVaughan commented Mar 20, 2020

Closes #721
Fixes @batpigandme's comment in r-lib/vctrs#919 (comment)

This PR adds vctrs support for Period / Duration / Interval S4 classes. Specifically, each one now has methods for vec_ptype2(), vec_cast(), vec_proxy(), vec_restore(), vec_proxy_compare(), and vec_proxy_equal().

I have added an extensive test suite, testing the cast/ptype behavior, and also the behavior of various vctrs functions (ordering, binding, c, etc). I think these vctrs functionality tests are good because this is one way to ensure that your vec_restore() and vec_proxy() methods are right.

  • There is no common type between Period / Duration / Interval. This could be added later if required, but for now this seems safe and should solve 95% of the problems.

  • Duration and difftime have a common type.

  • I have not prevented inheritance here (like we did in glue), as we expect inheritance in the cast/ptype2 methods to be prevented natively in vctrs in the near future.

  • Period and Interval types use a data frame as their proxy, with a special optional rcrd_names column to hold any vector names. This is better than adding row names, because it allows duplicates. The comparison/equality proxies do not have this column.

  • Interval types have a time zone attached. I followed the same rules as POSIXct when determining the common type.

    • Local time zones are overruled by the other time zone
    • Otherwise the time zone used in the common type comes from the LHS
  • You'll see a TODO in the constructor section. Lubridate constructors are rather slow, and I'd really like to store a global empty Period and Duration object, but I can't seem to create one in the .onLoad(). If I call period() in .onLoad(), for some reason the S4 initialize() method gets bypassed, but the validity method still gets run and we get an error from there. I'm curious if anyone has ever seen this.

  • This currently requires dev vctrs, but it is in Suggests.

  • There was also a small bug where interval(tzone = "my-tz") wasn't respecting the tzone argument. This has been fixed.

@DavisVaughan
Copy link
Member Author

Travis and Windows failures seem unrelated to me (Travis looks like the Rcpp problem)

Copy link
Member

@vspinu vspinu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so this is a light change, in the sense that it doesn't require changes to the main code.

I plan to release soon. Is vectrs mature enough, or shall we keep this in dev for a bit?

R/intervals.r Outdated Show resolved Hide resolved
@hadley
Copy link
Member

hadley commented Mar 21, 2020

We'd really like to get this into the next release as it makes lubridate classes work reliably in tidyr and dplyr 1.0.0.

@vspinu
Copy link
Member

vspinu commented Mar 21, 2020

Ok, let's push it with the next release.

@DavisVaughan
Copy link
Member Author

DavisVaughan commented Mar 21, 2020

It does currently require dev vctrs because of a bug that I fixed with vec_chop(), which affects vec_rbind() (at the very least). It caused a crash of R.

r-lib/vctrs#934

Maybe we could do one more small vctrs release with that fix before the big one with the breaking changes?

Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Even if it's not perfect, it's still a substantial improvement over the current behaviour, and we can iterate in a future release.

R/vctrs.R Outdated Show resolved Hide resolved
@vspinu
Copy link
Member

vspinu commented Apr 1, 2020

vctrs tests are failing with CRAN vctrs, so I am afraid this will have to go in next release in a month or two. Will pull into dev as soon as new version of vctrs is on CRAN as well.

@DavisVaughan
Copy link
Member Author

Noting that it would be nice to add vec_arith() methods as well. I was just bitten by this, and was very confused.

library(vctrs)
library(lubridate)

i <- structure(c(0, 1, 2, 3), class = "Date")
i
#> [1] "1970-01-01" "1970-01-02" "1970-01-03" "1970-01-04"

i - lubridate::days(1)
#> [1] "1969-12-31" "1970-01-01" "1970-01-02" "1970-01-03"

# This "works" because `days()` is a Period, which is an S4 declared to be a
# "double".
# The `.Data` slot of Period is the number of seconds, which is 0 here.
# So `i` is returned unchanged.
vec_arith("-", i, lubridate::days(1))
#> [1] "1970-01-01" "1970-01-02" "1970-01-03" "1970-01-04"

@vspinu
Copy link
Member

vspinu commented Apr 30, 2020

@DavisVaughan What is the status of vctrs? I would like to push a bug fix release to CRAN towards the end of the week. Is this one ready to be included?

@hadley
Copy link
Member

hadley commented Apr 30, 2020

Would it be possible to hold off release until the end of next week? vctrs is scheduled for release on 2020-05-06, so @DavisVaughan could update this PR with changes needed for the latest version of vctrs, and we could get this PR merged before dplyr 1.0.0 is out.

@vspinu
Copy link
Member

vspinu commented May 1, 2020

Sure, next week then!

Merge branch 'master' into vctrs

# Conflicts:
#	R/intervals.r
@vspinu
Copy link
Member

vspinu commented May 16, 2020

@DavisVaughan could you please update on the status? Thanks!

@DavisVaughan
Copy link
Member Author

@vspinu just bumped to use CRAN vctrs. It should be good to merge.

vec_arith() isn't exactly right in vctrs yet so we are waiting to add support for it in lubridate. It shouldn't have any effects on how lubridate is used with dplyr though.

@vspinu vspinu merged commit 2e5a446 into tidyverse:master May 18, 2020
@DavisVaughan DavisVaughan deleted the vctrs branch May 18, 2020 12:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

vctrs integration
3 participants