Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix download consistency #4923

Merged
merged 18 commits into from
Nov 12, 2024
Merged

Conversation

butonic
Copy link
Contributor

@butonic butonic commented Nov 8, 2024

fixes owncloud/ocis#10526

  • refactor code to reuse sections
  • changelog

@butonic butonic requested review from a team and labkode as code owners November 8, 2024 17:00
@butonic butonic self-assigned this Nov 8, 2024
Copy link

update-docs bot commented Nov 8, 2024

Thanks for opening this pull request! The maintainers of this repository would appreciate it if you would create a changelog item based on your changes.

@aduffeck
Copy link
Contributor

The change itself looks sane to me and I think it'll work (I haven't actually tried it yet), but I have a few thoughts.

Posixfs won't be able to fulfill the ConsistentDownloader, at least not for now. I doesn't do versions and the blobs are just the files themselves, so the reader function return from ConsistentDownload would still read the changed blob when getContent is actually called.
The code in this PR is still consistent, because Posixfs uses the fs middleware which doesn't include the ConsistentDownloader interface, but it's a deviation to keep in mind.

But taking a step back and looking at the problem I noticed that GetOrHeadFile doesn't actually care about the tag itself, it just wants consistency between the size it returns in the ContentLength header and the bytes it passes on using fs.Download and uses the etag as a means to do so.

So, what if - instead of introducing the ConsistentDownloader interface - we change the Download function to return the metadata for the blob being downloaded alongside the according reader? That way we would get consistency for free, we would spare one now-superfluous stat request per download, and posixfs would be able to provide the same functionality out of the box.
It looks easy enough to do, if you want I can open a PR to try it out. Or am I missing something?

@butonic
Copy link
Contributor Author

butonic commented Nov 11, 2024

How would changing the signature of Download() differ from using the ConsistentDownloader interface? ConsistentDownload() already returns both: the metadata (so we can omit the stat, which is used to return the content length and LastModified header ... and we should return the checksum IIRC) as well as a callback to get a ReadCloser. Why a callback? Because for s3 getting the reader might be expensive and unnecessary for a HEAD operation.

🤔

Posixfs could already open the file but then how do we close it if it is never read?

...

ugh .... we always call getContent ... which we could omit in HEAD requests without a Range header. Probably just an optimization since it would only affect range requests...

So yeah, we could simplify the interface for now and always return the metadata and a ReadCloser.

@aduffeck
Copy link
Contributor

Hm. No, we return early when the etag didn't change, without calling getContent, no? Download already handles the if-match header case. If we also teach it the if-none-match case we wouldn't have to open the reader for the cases where it won't be needed.

@butonic
Copy link
Contributor Author

butonic commented Nov 11, 2024

Yeah, I'd really like to be efficient in that case and not even open the file then. And I don't want to push down the IfNoneMatch check into the filesystem implementations.

@aduffeck aduffeck requested a review from glpatcern as a code owner November 11, 2024 15:00
@butonic butonic force-pushed the fix-download-consistency branch from d73987c to 3b4e40a Compare November 11, 2024 16:43
}
return reader, nil
return ri, reader, nil
Copy link
Contributor Author

@butonic butonic Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, we are returning the ResourceInfo here ... even though the user may not have stat permissions ... I'd say that is ok for now, because Download() is only called by the dataprovider. still meh.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are also returning a reader to the file. So I see no additional security implications.

butonic and others added 14 commits November 11, 2024 18:01
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
@butonic butonic force-pushed the fix-download-consistency branch from abbcab2 to 4edeb44 Compare November 11, 2024 17:01
@butonic
Copy link
Contributor Author

butonic commented Nov 11, 2024

urgh the ceph fs build fails ... we should have used an additional interface ...

Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
@butonic butonic merged commit cbef82f into cs3org:edge Nov 12, 2024
10 checks passed
@butonic butonic deleted the fix-download-consistency branch November 12, 2024 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants