Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try harder to make believable directories #3775

Merged
merged 10 commits into from
Oct 10, 2018

Conversation

jean-philippe-martin
Copy link

@jean-philippe-martin jean-philippe-martin commented Oct 5, 2018

Previous implementation of fake folders just looked for a trailing /. Now we also list the files with this prefix and use this to determine whether the input is a folder. This will detect dir in addition to just dir/ before.

This version will still say 'yes' to anything that ends in a slash, even if they don't exist, maintaining the previous behavior.

This PR also adds support for MAX_VALUES to our fake storage provider, and adds an integration test for finding directories (passing).

Fixes #3774

Previous implementation of fake folders just looked for a trailing '/'.
Now we also list the files with this prefix and use this to determine
whether the input is a folder. This will detect 'dir' in addition to
just 'dir/' before.

This version will still say 'yes' to anything that ends in a slash,
even if they don't exist, maintaining the previous behavior.
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Oct 5, 2018
@jean-philippe-martin
Copy link
Author

This PR should fix issue #3774

Copy link
Contributor

@pongad pongad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chingor13 @JesseLovelace please also take a look

@chingor13 chingor13 added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Oct 8, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Oct 8, 2018
Copy link
Contributor

@chingor13 chingor13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should note somewhere that this check now makes an additional list operation. usePseudoDirectories() appears to default to true.

It also seems like a psuedo directory is the failure case for many operations. A user might be better served by turning off psuedo directories and handling any exception if they don't want an additional request per file specified.

Improved performance by moving the pseudodir code to only run on file
not found, leaving the common-case path to run faster.

Added tests for pseudodirs, and one in LocalStorage too.

Improved comments following reviewer suggestions.
@chingor13
Copy link
Contributor

@jean-philippe-martin The thing I'm most concerned about is that we're silently adding additional list requests and it's opaque and possibly unexpected. A user upgrading may see many more list requests and at scale this could cost them $$$.

@jean-philippe-martin
Copy link
Author

@chingor13 you're right, there would be more list requests: but only when the user tries to access files that do not exist (and do not end in slash). This is not a very common operation at all. And there is an easy way to make sure no list operation is sent: disable pseudo-directories (of course that has some drawbacks too), or whenever using directories, make sure to append a final "/" at the end so it's clear it's not a file name.

We could conceivably add a third intermediate mode ("pseudo-dirs that require slash", effectively what pseudo-dir mode was before this PR), but having more modes multiplies the maintenance and testing costs. I don't think it would be worth it.

What we can do is make sure the pseudodir code only triggers when the file isn't there, and I've done that already. We could search for other optimization opportunities. The other thing we can do is update the documentation. I'll be happy to do this if you think it'll be helpful.

@chingor13
Copy link
Contributor

chingor13 commented Oct 9, 2018

@jean-philippe-martin I think extra calls are happening on more than just missing file failures. If you run a simple snippet from the google-cloud-nio README:

try (CloudStorageFileSystem fs = CloudStorageFileSystem.forBucket("bucket")) {
  Path path = fs.getPath("lolcat.csv");
  List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
}

The above will now make an additional API call to list if the file exists.

@jean-philippe-martin
Copy link
Author

Oh you're right, it does! Let's see if I can avoid it.

Remove check for path without '/' in places where it's not required.
@jean-philippe-martin
Copy link
Author

@chingor13 I've pushed an update that removes the call from that case and a few others.

@jean-philippe-martin jean-philippe-martin requested a review from a team as a code owner October 10, 2018 20:42
@chingor13 chingor13 added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Oct 10, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Oct 10, 2018
@chingor13 chingor13 merged commit 636588c into googleapis:master Oct 10, 2018
This was referenced Oct 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Detect directories even when not ending with "/"
5 participants