-
Notifications
You must be signed in to change notification settings - Fork 30.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
http: only destroy socket once error is sent #26356
Conversation
This patch fixes a race condition: sometimes the socket is destroyed before the data is sent. If node is proxied by, for example, an AWS ELB, a 504 error would be returned instead of the expected 400.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you be so kind and provide a regression test for this as well?
This popped up in review comments here #24757. |
It would be nice to have a regression test but I think it's not easy because the test added in #24757 works and it's not flaky. I mean, testing with no delay does not reproduce the issue. |
This does not completely fix the 504 errors I am seeing after all 😞 Not sure if much better than without the fix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think we should land this without a unit test.
Moreover, this is
this.write(response); | ||
this.write(response, () => { | ||
this.destroy(e); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I don't think there is a guarantee that this callback is actually called.
Let me know if this comment is too much of a support issue and not a bug issue and I can move it somewhere else. I guess I don't know if this is a bug in node or not and I don't know how to find out if it is. I also can't find anyone else with a similar issue. Hey @yannh, I think I have a similar or same problem as you and I'm a bit lost on how I could fix it. I feel like I have exhausted all of my options and maybe the bug might be in AWS ELB or node itself. I have an AWS ELB pointing to a node express application running in kubernetes and I get intermittent 504s. I have simplified my code to a basic hello world server like so:
I get my app deployed and I do run this I still get 504s. I've also tried
Then I thought that maybe the problem exists in
I tried using I also tried a python server with Keep in mind, I can't reproduce the 504s consistently, so I can confirm that it doesn't work, but it's hard to say that it works 100%. I just poll it for ~ 1 hour with no 504s. Another thing I tried is to I did a I don't know what else I can try to trace the packet from being received on the conatiner -> being sent to the node application. I don't know if node or the ELB is closing the tcp connection and causing a 504. I'm not sure what else to try. I think the AWS ALB might have worked. I have to do more testing with it before I switch to it. EDIT: I tried the ALB and polled it for more than an hour without any 504s. I'm still not sure if this is an ELB, node, or my own code issue. I think the ALB uses HTTP/2, at least that's what I see in the HTTP response. Not sure if that is relevant. |
@dosentmatter you should handle the |
@dosentmatter > Yes, this is the same exact issue I've been chasing too (also ELBs, express, Kubernetes). I have been doing the same tests that you are doing. There is one thing you should ensure - is that the Express connection timeout it set to a higher number than the ELB timeout (search for ELB / timeout). Still that hasn't solved it for me. I will close this ticket as I have seen this happen on regular requests too. |
@yannh, I haven't tried changing the timeout for express. I think that would be I read the troubleshooting ELB 504 page
#13391 isn't exactly the same issue, but I tried I've also tried the last 3 LTS - 6, 8, and 10. Hmm, thanks for the article. I'll check it out. If you don't have anymore time to debug, maybe you can switch to an Application Load Balancer (ALB) and come back to the issue later. It seems to work for me. Thanks, @mcollina, I'll try out the I read that the HTTP header size for node was 80KB and is currently 8KB. My large header is smaller than that, ~5KB. The rest of the headers are pretty small - less than 100 bytes. @yannh, do you have anything out of the ordinary going on with your headers? |
This patch fixes the race condition: sometimes the socket is
destroyed before the data is sent. If node is proxied by, for
example, an AWS ELB, a 504 error would be returned instead
of the expected 400.
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes