gvfs-helper: add gvfs.fallback and unit tests #665

jeffhostetler · 2024-06-28T17:33:07Z

By default, GVFS Protocol-enabled Scalar clones will fall back to the origin server if there is a network issue with the cache servers. However (and especially for the prefetch endpoint) this may be a very expensive operation for the origin server, leading to the user being throttled. This shows up later in cases such as 'git push' or other web operations.

To avoid this, create a new config option, 'gvfs.fallback', which defaults to true. When set to 'false', pass '--no-fallback' from the gvfs-helper client to the child gvfs-helper server process.

This will allow users who have hit this problem to avoid it in the future. In case this becomes a more widespread problem, engineering systems can set 'gvfs.fallback' to 'false' more broadly.

Disabling fallback will of course lead to immediate failures for users when the cache servers are unreachable, but at least that helps diagnose the problem when it occurs, rather than later, when the throttling shows up after the server load has already passed and the damage is done.

This change only applies to interactions with Azure DevOps and the
GVFS Protocol.
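The mapping from `gvfs.fallback` to the helper argument described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `fallback_arg` is a hypothetical helper, it only recognizes the literal value `false` (Git itself accepts other boolean spellings), and passing `--fallback` in the default case is an assumption based on the `--[no-]fallback` arguments mentioned in the commit messages.

```shell
# Hypothetical helper (not part of Git): choose the gvfs-helper argument
# for a given gvfs.fallback value. An unset or empty value is treated as
# the default, "true".
fallback_arg () {
	case "${1:-true}" in
	false)
		echo "--no-fallback"
		;;
	*)
		echo "--fallback"
		;;
	esac
}

fallback_arg        # prints --fallback (default)
fallback_arg false  # prints --no-fallback
```

A user who has been throttled could opt out of the fallback with `git config gvfs.fallback false`.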

jeffhostetler · 2024-06-28T17:36:21Z

@derrickstolee I took your #664 and added new unit tests. Hopefully, it all makes sense.

I debated putting the config lookup in gvfs-helper.exe rather than in the client code, but decided that was more risk than I wanted. It would not have been hard, but it would change the historical default behavior: gvfs-helper.exe does not have fallback turned on and only falls back when requested, so it felt kinda backwards to move the decision at this point.

dscho

Very nice!

dscho · 2024-07-01T11:17:11Z

t/t5799-gvfs-helper.sh

+	stop_gvfs_protocol_server &&
+
+	grep -q "error: get: (http:503)" OUT.stderr &&
+	verify_connection_count 3


Clever: the connection count is 6 with the fall-back and 3 without it. I had to puzzle over the difference for a bit because the test cases are so nearly identical. Might it make for a fine addition to the commit message to explain this fine point?
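The arithmetic behind those two counts can be sketched as below. This is an illustration only: `expected_connections` is a hypothetical helper, and the figure of 3 attempts per server is inferred from the counts quoted in this comment, not read from the test script or its `--max-retries` setting.

```shell
# Hypothetical helper: expected number of connections when every request
# is answered with a 503.
#   $1 = attempts made against each server (assumed value)
#   $2 = "true" if fallback to the origin server is enabled
expected_connections () {
	attempts=$1
	if test "$2" = true
	then
		# All attempts against the cache server fail, then the same
		# number of attempts is made against the origin server.
		echo $((attempts * 2))
	else
		# Only the cache server is tried.
		echo "$attempts"
	fi
}

expected_connections 3 true   # prints 6
expected_connections 3 false  # prints 3
```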

dscho · 2024-07-01T11:35:08Z

@jeffhostetler thank you so much for taking the time to add the tests. This will be an invaluable source of documentation for future changes to the gvfs-helper. ❤️

Construct 2 new unit tests to explicitly verify the use of `--fallback` and `--no-fallback` arguments to `gvfs-helper`.

When a cache-server is enabled, `gvfs-helper` will try to fetch objects from it rather than the origin server. If the cache-server fails (and all cache-server retry attempts have been exhausted), `gvfs-helper` can optionally "fallback" and try to fetch the objects from the origin server. (The retry logic is also applied to the origin server, if the origin server fails on the first request.)

Add new unit tests to verify that `gvfs-helper` respects both the `--max-retries` and `--[no-]fallback` arguments.

We use the "http_503" mayhem feature of the `test_gvfs_protocol` server to force a 503 response on all requests to the cache-server and the origin server end-points. We can then count the number of connection requests that `gvfs-helper` makes to the server and confirm both the per-server retries and whether fallback was attempted.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>

By default, GVFS Protocol-enabled Scalar clones will fall back to the origin server if there is a network issue with the cache servers. However (and especially for the prefetch endpoint) this may be a very expensive operation for the origin server, leading to the user being throttled. This shows up later in cases such as 'git push' or other web operations.

To avoid this, create a new config option, 'gvfs.fallback', which defaults to true. When set to 'false', pass '--no-fallback' from the gvfs-helper client to the child gvfs-helper server process.

This will allow users who have hit this problem to avoid it in the future. In case this becomes a more widespread problem, engineering systems can enable the config option more broadly.

Enabling the config will of course lead to immediate failures for users, but at least that will help diagnose the problem when it occurs instead of later when the throttling shows up and the server load has already passed, damage done.

Signed-off-by: Derrick Stolee <stolee@gmail.com>

Create new `cache_http_503` mayhem method where only the cache server sends a 503. The normal `http_503` directs both cache and origin server to send 503s.

This will be used to help test fallback.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
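The difference between the two mayhem modes can be modeled as below. This is a rough sketch, not the `test_gvfs_protocol` implementation: `mayhem_status` is a hypothetical helper, and returning 200 for an endpoint that is not forced to fail is an assumption made for illustration.

```shell
# Hypothetical model of the two mayhem modes described above.
#   $1 = mayhem mode ("http_503" or "cache_http_503")
#   $2 = endpoint ("cache" or "origin")
# Prints 503 when the endpoint is forced to fail, otherwise 200.
mayhem_status () {
	case "$1:$2" in
	http_503:cache|http_503:origin|cache_http_503:cache)
		echo 503
		;;
	*)
		echo 200
		;;
	esac
}

mayhem_status http_503 origin        # prints 503
mayhem_status cache_http_503 origin  # prints 200
```

Under this model, `cache_http_503` is the mode in which a fallback to the origin server has something to succeed against.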

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>

jeffhostetler · 2024-07-01T14:08:19Z

I just added enhanced comments in-line in the test script and in the commit messages.

dscho · 2024-07-01T14:08:46Z

> I just added enhanced comments in-line in the test script and in the commit messages.

Thank you!

Please ignore the test failure, this is my mistake.

dscho · 2024-07-01T14:11:41Z

> Please ignore the test failure, this is my mistake.

@jeffhostetler actually, if I could get your review of #666, that would be nice...

derrickstolee

Thanks for writing these tests, @jeffhostetler! I have a lot more confidence in the change, now.

derrickstolee

Approving once more!

jeffhostetler self-assigned this Jun 28, 2024

jeffhostetler mentioned this pull request Jun 28, 2024

gvfs-helper: add gvfs.fallback config option #664

Closed

1 task

jeffhostetler requested review from dscho and derrickstolee June 28, 2024 18:59

dscho approved these changes Jul 1, 2024

jeffhostetler and others added 4 commits July 1, 2024 09:41

t5799: add unit tests for new gvfs.fallback config setting

8668b8e

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>

jeffhostetler force-pushed the jh/gvfs-helper-fallback-config branch from 2466cad to 8668b8e Compare July 1, 2024 14:04

derrickstolee approved these changes Jul 1, 2024

Merge remote-tracking branch 'microsoft/vfs-2.45.2' into jh/gvfs-help…

4bfdab6

…er-fallback-config Let's include #666 to let the PR builds pass. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

derrickstolee approved these changes Jul 1, 2024

dscho merged commit 648c5a2 into vfs-2.45.2 Jul 1, 2024
91 checks passed

dscho deleted the jh/gvfs-helper-fallback-config branch July 1, 2024 20:01
