Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return file version #181

Merged
merged 4 commits into from
Oct 14, 2020
Merged

Conversation

alexbrodersen
Copy link
Contributor

The file version_id can be useful for tracking versions but is not available in box_previous_versions for the currently "active" version of the file. This pull request updates box_ls() and the s3 as.data.frame method to include the version_id, as well as alters the name of the value of the current "version" to "version_no".

@nathancday
Copy link
Member

nathancday commented Sep 15, 2020

Thanks for the PR. I like your idea of mirroring the behavior of box_previous_versions and the proposed name changes sound good.

Do you have a specific instance were you want to use version_id instead of version_no?

I've been working on adding a function, box_file_version() that uses file IDs to do something like this with version_no results. As part of that I've have been re-considering what we return our of box_previous_versions() and my thinking was that version_no alone was enough, but maybe not :)

@alexbrodersen
Copy link
Contributor Author

I believe the version_id is unique but the version_no is more inferred from modification dates? For my purposes, I need to have a direct identifier to the version and I think file_id & version_id provide that but version_no may not.

@nathancday
Copy link
Member

Got it. Paired with a file_id, either a version_id or version_no is enough. Numbers are inferred, but the Box web-app does that too. I've personally found it humanly easier to work with version_no, but I like your idea of having both easily available to users.

@alexbrodersen
Copy link
Contributor Author

I agree, especially if using both boxr and Web App, the version_no is more straightforward to work with.

I do think both should be provided.

One case where the version_id matters is (at least in the enterprise version) is the URL for file version.

This has the form

https://{company_name}.app.box.com/file/{file_id}?sb=/activity/versions/{version_id}

and as far as I know version_no can't be used in this way.

@ijlyttle
Copy link
Member

ijlyttle commented Sep 21, 2020

Let me put my own spin on version_no: this is a construct only of the {boxr} package, and is completely independent of the Box API, which knows only file_version_id. In other words, I agree that both are important.

We are currently working on a successor to box_previous_versions(), called box_version_history(), which will return a data-frame (or have a data-frame method).

One of the things I might like to implement in box_version_history() is that the data-frame would return a numerical column, version_no, rather than a character column, version, with values "V1", "V2", etc. - so this sounds like a good fit. One of the neat things about introducing a new name is that we can get away with changing the API :)

For box_ls(), I think we should keep the version variable in the as.data.frame() method because this could be a breaking change for someone otherwise. I don't know how to deprecate a variable in a return object, maybe the least-bad thing would be to provide both version and version_no, awkward as that may be. Thoughts?

In short, I think that it would be good to standardize on the use of version_no within {boxr}, and this is a step in that direction.

There's another PR that I want to get wrapped-up first, then we may have some specific suggestions for this one.

Thanks!

@ijlyttle
Copy link
Member

ijlyttle commented Sep 21, 2020

Another question - perhaps to the universe than to anyone specifically: the use of x$etag predates my involvement. I have read around a bit (google search), and I have not yet come to the conclusion that this is a rock-solid proxy for version number.

Of course, it was done this way for a reason - I would like to feel good about the reason while we have the question on the floor.

@ijlyttle
Copy link
Member

ijlyttle commented Sep 21, 2020

Finally (for now), welcome @alexbrodersen!

So that you can feel fully a part of the fun: I had a quick look through our code, and see that we refer to both file_version_id and version_id, although Box uses file_version_id.

As a first thought, it might be easiest for {boxr} to standardize on version_id, as it is used much more than file_version_id.

(As a side note, this discussion has really helped me clarify my thinking on box_version_history())

@alexbrodersen
Copy link
Contributor Author

@ijlyttle Regarding depreciating a column in a return variable, I suppose this may represent a breaking API change, and boxr should increment version number if it's completely removed?

I do think the easiest solution, for now, is to leave that as is and extend the return with the new variables.

Regarding file_version_id vs version_id, I'm in favor of keeping things relatively consistent with the box API, but in my opinion it's more important to be internally consistent across functions in the boxr package.

@alexbrodersen
Copy link
Contributor Author

Here's what I get from grep for the list of where version_id vs file_version_id is used. I can update my PR to change this based on whatever is decided?

boxr__internal_get.R:63:      version_id <- versions$file_version_id[version_no]
boxr_file_versions.R:9:#' The returned `data.frame` contains a variable, `file_version_id`, 
boxr_file_versions.R:63:  colnames(d)[colnames(d) == "id"] <- "file_version_id"

boxr__internal_get.R:30:boxGet <- function(file_id, local_file, version_id = NULL, version_no = NULL,
boxr__internal_get.R:36:  if (!is.null(version_id) & !is.null(version_no)) {
boxr__internal_get.R:38:    stop("Only one of version_id and version_no may be supplied")
boxr__internal_get.R:41:  if (is.null(version_id) & !is.null(version_no)) {
boxr__internal_get.R:63:      version_id <- versions$file_version_id[version_no]
boxr__internal_get.R:67:  if (!is.null(version_id)) {
boxr__internal_get.R:68:    # If you have a version_id, check that it looks reasonable (11 digits)
boxr__internal_get.R:69:    version_id <- as.numeric(version_id)
boxr__internal_get.R:71:    if (!grepl("[[:digit:]]{11}", version_id))
boxr__internal_get.R:72:      stop("version_id must be an integer, 11 characters in length")
boxr__internal_get.R:79:        file_id, "/content", "?version=", version_id
boxr_file_versions.R:9:#' The returned `data.frame` contains a variable, `file_version_id`, 
boxr_file_versions.R:63:  colnames(d)[colnames(d) == "id"] <- "file_version_id"
boxr_read.R:56:box_read <- function(file_id, type = NULL, version_id = NULL, 
boxr_read.R:64:  req <- boxGet(file_id, local_file = temp_file, version_id = version_id, 
boxr_s3_classes.R:143:      version_id          = x$file_version$id %|0|% NA_character_,
boxr_upload_download.R:13:#' **`version_id`** or **`version_no`**.
boxr_upload_download.R:17:#' using the `version_id` parameter.
boxr_upload_download.R:24:#' [box_previous_versions()] to retrieve the `version_id`,
boxr_upload_download.R:32:#' @param version_id `character` or `numeric`, the `version_id` of the file.
boxr_upload_download.R:57:                   file_name = NULL, version_id = NULL, version_no = NULL,
boxr_upload_download.R:88:  req <- boxGet(file_id = file_id, version_id = version_id,

@ijlyttle
Copy link
Member

Hi @alexbrodersen,

I think we are on the same page here.

If I were starting the package from scratch, I might have chosen file_version_id to be better aligned with the Box API, but I think the right move is to stick with version_id here.

We do plan to change the version number, but I think it would be "rude" for the version variable to just disappear - I think we can just have both for now, and think about how to remove it down the road.

The PR is in good shape now - I plan to get #159 taken care of immediately, then come back with specific suggestions for this PR - none of which will be very big or surprising.

@nathancday
Copy link
Member

IMO if we keep version named as is and add version_id this is good to go. Maybe add a test for names exceptions.

The box web app also must be doing a similar version inference because it displays a v1, v2, v3, ... badge next to file names and in the version menu. I think we should continue to support both (human readable number and full ID) like box_dl() whenever possible. No reason box_version can't report both.

Screen Shot 2020-09-23 at 5 32 06 PM

Screen Shot 2020-09-23 at 5 32 27 PM

@ijlyttle
Copy link
Member

Sounds good to me! I can provide some specific feedback here shortly.

R/boxr_s3_classes.R Show resolved Hide resolved
DESCRIPTION Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
@ijlyttle
Copy link
Member

@alexbrodersen, I hit the wrong button before I could summarize the review.

I don't think there is too much to do, once you are done, I'll be happy to bump the version.

As @nathancday points out, it will be good to have a test, but I had a look at the test and it's not too user friendly, so I will be happy to add a test before merging.

Merge remote-tracking branch 'refs/remotes/origin/master'

Conflicts:
	DESCRIPTION
@ijlyttle ijlyttle requested a review from nathancday October 12, 2020 22:26
@codecov-io
Copy link

Codecov Report

Merging #181 into master will decrease coverage by 52.97%.
The diff coverage is 10.85%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #181       +/-   ##
==========================================
- Coverage   55.88%   2.91%   -52.98%     
==========================================
  Files          24      25        +1     
  Lines        1555    1785      +230     
==========================================
- Hits          869      52      -817     
- Misses        686    1733     +1047     
Impacted Files Coverage Δ
R/boxr__internal_dir_comparison.R 0.00% <0.00%> (-97.42%) ⬇️
R/boxr__internal_get.R 0.00% <0.00%> (-83.02%) ⬇️
R/boxr__internal_verb_exit.R 0.00% <0.00%> (-98.31%) ⬇️
R/boxr_add_description.R 0.00% <ø> (ø)
R/boxr_collab.R 0.00% <0.00%> (ø)
R/boxr_comment.R 0.00% <0.00%> (ø)
R/boxr_file_versions.R 0.00% <0.00%> (-100.00%) ⬇️
R/boxr_misc.R 0.00% <0.00%> (-44.65%) ⬇️
R/boxr_read.R 0.00% <0.00%> (-65.72%) ⬇️
R/boxr_s3_classes.R 0.00% <0.00%> (-14.12%) ⬇️
... and 25 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f488cf3...c4cdbd4. Read the comment docs.

@ijlyttle
Copy link
Member

ijlyttle commented Oct 12, 2020

@alexbrodersen I'm doing some other work that will need to incorporate yours, so I hope it's OK that I made the changes.

@nathancday can you have a look to make sure what I've done is OK?

Copy link
Member

@nathancday nathancday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning version_no feels weird, the two identical columns. What about just version and version_id? Keep the legacy and add Alex's improvement for id

@ijlyttle
Copy link
Member

The legacy was already split; box_read() and box_dl() both currently use a version_no argument.

I prefer version_no to version as it is more distinct from version_id. I agree it's awkward to return both, but I think we would have to do that for backwards compatibility (small as that risk might seem).

@nathancday
Copy link
Member

Yea totally forgot about the existing split across args. Sounds good to have the weirdness of dupes for consistancy.

@nathancday nathancday merged commit b81a6f7 into r-box:master Oct 14, 2020
@ijlyttle
Copy link
Member

Thanks @alexbrodersen!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants