Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAOS-7198 control: Use join request context instead of timeout #5380

Merged
merged 1 commit into from
Apr 9, 2021

Conversation

mjmac
Copy link
Contributor

@mjmac mjmac commented Apr 8, 2021

In the MS joinLoop, a short timeout was used to avoid blocking
the loop if a join request handler exited before receiving the
batched join response. Rather than relying on an arbitrary
timeout value, we should instead pass in the join request's
context so that we can correctly wait for the handler to receive
the response or for the context to be canceled. Either way, it
won't block the join loop indefinitely.

Copy link
Collaborator

@daosbuild1 daosbuild1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. No errors found by checkpatch.

@mjmac mjmac self-assigned this Apr 8, 2021
Copy link
Collaborator

@daosbuild1 daosbuild1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. No errors found by checkpatch.

In the MS joinLoop, a short timeout was used to avoid blocking
the loop if a join request handler exited before receiving the
batched join response. Rather than relying on an arbitrary
timeout value, we should instead pass in the join request's
context so that we can correctly wait for the handler to receive
the request or for the context to be canceled. Either way, it
won't block the join loop indefinitely.

Signed-off-by: Michael MacDonald <mjmac.macdonald@intel.com>
@daosbuild1
Copy link
Collaborator

Test stage Functional Hardware Small completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-5380/4/display/redirect

Copy link
Collaborator

@daosbuild1 daosbuild1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. No errors found by checkpatch.

@mjmac mjmac requested review from tanabarr and kjacque April 9, 2021 15:52
@@ -388,6 +387,7 @@ func (svc *mgmtSvc) Join(ctx context.Context, req *mgmtpb.JoinReq) (*mgmtpb.Join
bjr := &batchJoinRequest{
JoinReq: *req,
peerAddr: replyAddr,
joinCtx: ctx,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here we are ignoring the language's recommendation not to include contexts inside structs: https://blog.golang.org/context-and-structs . It should be noted that there is discussion to relax that recommendation golang/go#22602 and given this is a temporary storage in a request it seems okay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the guidance is not really about having a context in a struct -- i.e. it's not that there is something inherently unsafe about this, it's that storing a context in a long-lived struct rather than accepting a context in that struct's methods is an anti-pattern for various reasons explained elsewhere.

From the second link:

While we've told people not to add contexts to structs, I think that guidance is over-aggressive. The real advice is not to store contexts. They should be passed along like parameters. But if the struct is essentially just a parameter, it's okay. I think this concern can be addressed with package-level documentation and examples.

The batchJoinRequest struct is indeed a parameter and it's used to convey information from the Join RPC handler into the joinLoop.

@jolivier23 jolivier23 merged commit 52539bd into master Apr 9, 2021
@jolivier23 jolivier23 deleted the mjmac/DAOS-7198 branch April 9, 2021 21:35
mjmac added a commit that referenced this pull request Apr 9, 2021
In the MS joinLoop, a short timeout was used to avoid blocking
the loop if a join request handler exited before receiving the
batched join response. Rather than relying on an arbitrary
timeout value, we should instead pass in the join request's
context so that we can correctly wait for the handler to receive
the request or for the context to be canceled. Either way, it
won't block the join loop indefinitely.

Master-PR: #5380

Signed-off-by: Michael MacDonald <mjmac.macdonald@intel.com>
johannlombardi pushed a commit that referenced this pull request Apr 12, 2021
In the MS joinLoop, a short timeout was used to avoid blocking
the loop if a join request handler exited before receiving the
batched join response. Rather than relying on an arbitrary
timeout value, we should instead pass in the join request's
context so that we can correctly wait for the handler to receive
the request or for the context to be canceled. Either way, it
won't block the join loop indefinitely.

Master-PR: #5380

Signed-off-by: Michael MacDonald <mjmac.macdonald@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants