
[JUJU-1868] Prevent replacement of nil request body on retry #98

Merged: 1 commit into juju:v2 on Jan 19, 2023

Conversation

manadart (Member) commented:

When operating a MAAS controller via TLS, Juju can observe the following error during (sufficiently concurrent) provisioning:

`net/http: cannot rewind body after connection loss`

Based on poring over the `net/http` standard library source, this appears to happen for GET requests without a body, where the retry logic incorrectly replaces the nil body with a (non-nil) empty buffer.

Here we avoid resetting the body when it is nil.

I was not able to recreate the exact failure scenario, but there is a new test that verifies that this change behaves correctly.
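As an illustration of the guard, here is a minimal sketch of a retry helper; `doWithRetry` and everything else in it are hypothetical names, not the actual gomaasapi code:

```go
package retry

import (
	"bytes"
	"io"
	"net/http"
)

// doWithRetry issues req up to maxAttempts times, capturing the body
// bytes once so that every attempt can be given a fresh reader.
func doWithRetry(client *http.Client, req *http.Request, maxAttempts int) (*http.Response, error) {
	var bodyBytes []byte
	if req.Body != nil {
		b, err := io.ReadAll(req.Body)
		req.Body.Close()
		if err != nil {
			return nil, err
		}
		bodyBytes = b
	}

	var resp *http.Response
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		// The guard: reset the body only when the request had one.
		// Unconditionally assigning a non-nil empty buffer here would
		// make a bodyless GET look like it carries a body that net/http
		// cannot rewind, yielding "net/http: cannot rewind body after
		// connection loss" on retry.
		if bodyBytes != nil {
			req.Body = io.NopCloser(bytes.NewReader(bodyBytes))
		}
		resp, err = client.Do(req)
		if err == nil {
			return resp, nil
		}
	}
	return nil, err
}
```

The distinction matters because the standard library's transport needs no rewinding for a nil body on retry, whereas a non-nil body without a `GetBody` function cannot be rewound after a dropped connection, which is exactly the error reported above.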

The commit message:

> …non-nil empty buffer when the body is nil. This should prevent observed "net/http: cannot rewind body after connection loss" errors.
@manadart (Member, Author) commented:

/merge

@jujubot jujubot merged commit fee8032 into juju:v2 Jan 19, 2023
@manadart manadart deleted the v2-client-retry-fix branch January 19, 2023 13:47
jujubot added a commit to juju/juju that referenced this pull request on Jan 26, 2023
#15096

The linked issue is encountered when we saturate the provisioner worker pool and make many concurrent calls to the `ProvisioningInfo` API method, one for each machine pending provisioning.

Within this method we create an `environ`, which for MAAS makes calls to the provider API. Our MAAS API client had issues with its own retry logic, which should be fixed under juju/gomaasapi#98. This patch brings in the new version of that library.

In addition, we no longer make a separate `ProvisioningInfo` call for each machine. Instead, we get the provisioning info for all pending machines in a single bulk call and pass the appropriate entry to each pooled job, as sketched below.
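To make the bulk pattern concrete, here is a minimal sketch, assuming a hypothetical `BulkClient` interface with a `ProvisioningInfoBulk` call; the real juju provisioner facade and its types differ:

```go
package provisioner

import "fmt"

// MachineID identifies a machine pending provisioning.
type MachineID string

// ProvisioningInfo stands in for the data each provisioning job needs.
type ProvisioningInfo struct {
	MachineID MachineID
	// Placement, constraints, etc. would live here.
}

// BulkClient is an assumed interface exposing a bulk variant of the
// ProvisioningInfo call.
type BulkClient interface {
	ProvisioningInfoBulk(ids []MachineID) (map[MachineID]ProvisioningInfo, error)
}

// provisionAll replaces N per-machine API round trips with a single
// bulk call, then hands each pooled job only the entry it needs.
func provisionAll(client BulkClient, pending []MachineID, startJob func(ProvisioningInfo)) error {
	infos, err := client.ProvisioningInfoBulk(pending) // one call, not len(pending)
	if err != nil {
		return fmt.Errorf("getting bulk provisioning info: %w", err)
	}
	for _, id := range pending {
		info, ok := infos[id]
		if !ok {
			return fmt.Errorf("no provisioning info for machine %q", id)
		}
		startJob(info)
	}
	return nil
}
```

A single bulk request amortizes round-trip overhead across the whole pending set and avoids exercising the provider-facing retry path once per machine.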

The testing suite is improved by removal of some custom mock/stub logic in favour of a new generated mock.

## QA steps

We don't have a MAAS with sufficient resources to provision many machines in parallel, but behaviour can be verified by repeatedly:
- Adding a LXD model.
- Running `juju add-machine -n 10` and waiting for quiescence.
- Destroying the model.

As a regression check, I also set up my MAAS for access via TLS and verified that provisioning still works.

## Documentation changes

None.

## Bug reference

https://bugs.launchpad.net/juju/+bug/1987009