-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backport TCP window, client closing fixes to 6.0.x #50
Backport TCP window, client closing fixes to 6.0.x #50
Conversation
a4c726e
to
879be71
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a minimal set of specs and found multiple additional bugs around shutdown sequencing.
They are all fixed here, and will need to be forward-ported to main
/6.1.
if server? | ||
# server-mode clean-up | ||
@closed.make_true | ||
@server_socket.shutdown rescue nil if @server_socket |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will need to be forward-ported to main
/6.1 series, which sends TCPServer#close
instead of this TCPServer#shutdown
, and causes the plugin to crash during shutdown because a blocking call on TCPServer#accept
throws an IOError
when the server is closed out of under it.
end | ||
else | ||
# client-mode clean-up | ||
@client_socket&.close |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All related @client_socket
bits will need to be forwarded to main
/6.1-series. It cannot be a local variable in register
because we need to be able to close it in our close
method
lib/logstash/outputs/tcp.rb
Outdated
|
||
# Now send the payload | ||
client_socket.syswrite(payload) if w.any? | ||
@logger.trace("transmitting #{payload.bytesize} bytes", host: @host, port: @port, socket: @client_socket&.peer) if @logger.trace? && payload && !payload.empty? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logging should be forward-ported to main
/6.1 series
lib/logstash/outputs/tcp.rb
Outdated
:exception => e, :backtrace => e.backtrace) | ||
client_socket.close rescue nil | ||
client_socket = nil | ||
log_warn "client socket failed:", e, host: @host, port: @port, socket: @client_socket&.peer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TCPSocket#to_s
is useless object inspection, so we should send TCPSocket#peer
, which gives us useful information about the socket we are connected to. This will need to be forward-ported to main
/6.1 too.
@@ -73,4 +73,109 @@ | |||
end | |||
end | |||
end | |||
|
|||
context 'client mode' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all new specs will need to be forward-ported to main
/6.1 series as well.
f0b8b20
to
ea5c5c9
Compare
8f6b61a
to
dd06530
Compare
dd06530
to
3a6113d
Compare
CI is failing intermittently for the cross-ported fixes for server mode, and I am having a very difficult time figuring out why as everything is very consistently green in local docker tests (on my M1 Mac) with identical parameters. Every single Since 3a6113d all of the failures have been in the Logstash 7 series, so I am beginning to think that I have found a bug in Jruby. |
🤦🏼 when I reduced the amount of memory available to my local docker, I was able to reproduce.
EDIT: that wasn't it either. Because server-mode's |
because server-mode's TCP#receive does not block, we need to wait until the client thread has popped the payload from its queue and sent it over the socket before telling the TCP output to close, or else our close interrupts an in-flight event and our specs get a partial payload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I've left a comment on a doubt.
This is a follow up thought I had reviewing the code, but it's more related to the plugin implementation than this PR. I would really appreciate to listen to your thoughts. I think we have a problem in the shutdown of client threads in server-mode. loop do
begin
payload = @queue.pop
@socket.write(payload)
rescue
break
end
end The close method of the def close
@socket.close
end The client threads could be blocked in 2 places:
So my point is that we need to improve the shutdown sequence, interrupting the thread and joining it. |
I agree. There are a lot of places in this plugin that need improvement. In server mode the queue for each client is also unbounded. Lots of room for improvement. In the end, though, this PR makes the plugin significantly better than it was before and I would rather deliver these improvements immediately and chase down the further fixes separately.
You are correct; both I'll then merge this PR, forward-port the relevant bits to the 6.1 series, and then file separate work to chase down these other issues. (including hopefully making 6.1 compatible with Logstash 7.x again so that we don't have to maintain two somewhat-significantly-diverged branches). |
I'm ok to merge this PR as it's. A lot of work has been done! My last comment was intended to open the discussion on what I've seen reviewing it, but that's not related to this specific PR. |
Because 6.1.0 changed this plugin's dependency on Logstash to require Logstash 8 and we still commercially support Logstash 7.13+, I have branched from 6.0.2 and am backporting relevant fixes to the 6.0.x series.
This is not typical SEMVER, and I would like to find a way to separately get 6.1.x of this plugin running on Logstash 7, even if it is without one or more of the features that were introduced for the 6.1 series of this plugin (e.g., works unless you try to use the new feature on LS7).
Here, we manually backport two main fixes, each in its own commit for clarity:
IO#syswrite
return size. #49)We also:
IO#syswrite
return size. #49 that handled overlarge payloads in client mode and applied them to server-mode.Unfortunately, in addition to breaking compatibility with LS 7.x, #47 also introduced two bugs that were fixed in #49 (repeating thread names, logger access), one more that is first-addressed here (crash while closing plugin), and will need to be forward-ported) so the lines are muddled.
I have elected to group whole fixes by commit as much as possible, instead of doing true backports.