Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nsqd: connection closed on receiving a requeue timeout higher than the MaxReqTimeout #865

Closed
tsholmes opened this issue Mar 23, 2017 · 5 comments
Labels

Comments

@tsholmes
Copy link
Contributor

Running on 0.3.8, however the same issue is on 1.0.0.

I believe the correct behavior here should be returning an error to the client, not killing the connection (protocol.NewClientErr vs protocol.NewFatalClientErr).

nsq/nsqd/protocol_v2.go

Lines 715 to 718 in fc261f9

if timeoutDuration < 0 || timeoutDuration > p.ctx.nsqd.getOpts().MaxReqTimeout {
return nil, protocol.NewFatalClientErr(nil, "E_INVALID",
fmt.Sprintf("REQ timeout %d out of range 0-%d", timeoutDuration, p.ctx.nsqd.getOpts().MaxReqTimeout))
}

A snippet from logs scoped to one failing client:

[nsqd] 2017/03/23 22:56:42.363204 [10.50.102.108:57424] IDENTIFY: {ShortID:e24143c26459 LongID:e24143c26459 ClientID:e24143c26459.1 Hostname: HeartbeatInterval:0 OutputBufferSize:0 OutputBufferTimeout:0 FeatureNegotiation:true TLSv1:false Deflate:false DeflateLevel:0 Snappy:false SampleRate:0 UserAgent:nsq.js/0.15.1 MsgTimeout:900000}
[nsqd] 2017/03/23 22:56:42.591672 ERROR: [10.50.102.108:57424] - E_INVALID REQ timeout 3897854000000 out of range 0-3600000000000
[nsqd] 2017/03/23 22:56:42.591710 PROTOCOL(V2): [10.50.102.108:57424] exiting ioloop
[nsqd] 2017/03/23 22:56:42.591744 ERROR: client(10.50.102.108:57424) - E_INVALID REQ timeout 3897854000000 out of range 0-3600000000000
[nsqd] 2017/03/23 22:56:42.591807 PROTOCOL(V2): [10.50.102.108:57424] exiting messagePump
@mreiferson
Copy link
Member

Hi @tsholmes — this is working as intended. Errors related to a client IDENTIFYing are fatal, because otherwise a consumer would proceed in an unrecoverable and unexpected state.

@mreiferson mreiferson changed the title nsqd drops connection on receiving a requeue timeout higher than the MaxReqTimeout nsqd: connection closed on receiving a requeue timeout higher than the MaxReqTimeout Mar 24, 2017
@tsholmes
Copy link
Contributor Author

@mreiferson I'm going to have to disagree, for two reasons.

  1. If a client identifies with a MsgTimeout that is higher than MaxMsgTimeout, it just keeps the default value [1]
  2. This error happens not on identify, but on requeue. We are running with a maxInFlight of >1, so there are other messages in flight on this connection that would normally succeed that are dropped when the connection is dropped, resulting in duplicate processing by another consumer.

[1] https://github.com/nsqio/nsq/blob/master/nsqd/client_v2.go#L457-L459

@mreiferson
Copy link
Member

Ahh, you're right! I misread the logs/description, sorry.

You interested in opening a PR to fix?

@tsholmes
Copy link
Contributor Author

@mreiferson got one ready in #868

@mreiferson
Copy link
Member

😍 thanks! Closing in favor of #868.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants