Netty Connector's JerseyClientHandler should defer notification for arriving content #4285

Closed
olotenko opened this issue Oct 10, 2019 · 6 comments · Fixed by #4312


olotenko commented Oct 10, 2019

JerseyClientHandler overrides only channelRead0. Each socket read may produce more than one channelRead0 invocation, which produces unnecessary wake-ups for the thread processing the response:

- first when the Response headers become available, then again when a Response body chunk becomes available (or a similar invocation for LastHttpContent a few lines further down). The timings observed in the tests suggest that these two manage to wake up the thread twice even for trivial responses, wasting CPU on wake-up calls and context switches.

JerseyClientHandler should override channelReadComplete, and defer the wake up until either LastHttpContent becomes available, or such channelReadComplete is invoked, whichever comes first.
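The proposed deferral can be illustrated without Netty. Below is a minimal sketch of the buffering logic only; the names `onContent` and `onReadComplete` are hypothetical stand-ins for `channelRead0` and `channelReadComplete`, and this is not the actual patch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Model of the proposed deferral: buffer chunks as they arrive (channelRead0)
// and wake the reader at most once per socket read (channelReadComplete),
// or as soon as the last chunk arrives, whichever comes first.
class DeferredNotifier {
    private final List<byte[]> pending = new ArrayList<>();
    final AtomicInteger wakeUps = new AtomicInteger();  // observable for the sketch
    private boolean done;

    // Corresponds to channelRead0: may run several times per socket read.
    void onContent(byte[] chunk, boolean last) {
        pending.add(chunk);
        if (last && !done) {     // LastHttpContent: notify immediately
            done = true;
            notifyReader();
        }
    }

    // Corresponds to channelReadComplete: runs once per socket read.
    void onReadComplete() {
        if (!pending.isEmpty() && !done) {
            notifyReader();
        }
    }

    private void notifyReader() {
        // In the real handler this would complete a future / unpark the consumer.
        wakeUps.incrementAndGet();
        pending.clear();
    }
}
```

With this shape, a trivial response whose headers and body arrive in one socket read costs a single wake-up instead of two.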

A proof of concept is available to demonstrate the reduction in CPU utilisation.

olotenko commented Oct 10, 2019

netty_defer_notify.zip

The changes to NettyInputStream may be a bit over the top. I was trying to see what it takes to get rid of the LinkedBlockingQueue, because this isn't really a highly concurrent case: exactly one thread produces ByteBuffers and exactly one consumes them, and contention is very rare, so it should be fine with simpler, low-overhead synchronization mechanisms.
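As a rough sketch of what a "simpler, low-overhead mechanism" could look like under the single-producer/single-consumer assumption, using only intrinsic locks (class and method names are illustrative, not from the patch):

```java
import java.util.ArrayDeque;

// Minimal single-producer/single-consumer handoff: one Netty event-loop
// thread calls put(), one reader thread calls take(). With at most one
// waiter, plain notify() suffices and a LinkedBlockingQueue is overkill.
class SpscHandoff<T> {
    private final ArrayDeque<T> queue = new ArrayDeque<>();

    synchronized void put(T item) {   // producer side (event loop)
        queue.addLast(item);
        notify();                     // at most one thread ever waits
    }

    synchronized T take() {           // consumer side (reader thread)
        while (queue.isEmpty()) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return queue.removeFirst();
    }
}
```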

The only concurrent case where this ping-pong even makes sense is when the Client consumes the Response body as an InputStream. In all other cases the Client cannot proceed until the Response is available in its entirety (e.g. JSON deserialization cannot allow the Client to proceed partway through receiving the Response body). So there may be some benefit in not notifying until LastHttpContent at all, if we can tell the body is not consumed as an InputStream. This POC does not attempt to make such guesses.

I am not sure the changes to JerseyClientHandler are complete, either. There seems to be some muddle about having two CompletableFutures to complete. The key change here is notifyResponse, and the decision whether to run it in the asynchronous thread pool (when we cannot tell that it won't get stuck in a blocking call to read the rest of the Response) or on the same Netty thread (when we know it already has all the information from the Response, so it cannot get blocked waiting for more data to be read from the same socket by the thread it would be blocking). The latter decision may need to be configurable, because it is safe only for Clients consciously written to avoid any blocking operations whatsoever. But the idea is to remove the context switches altogether when the response arrives in a single socket read. The response times can be seen to be lower even with just the deferred notification, i.e. always handing off to the async thread pool.

Another important difference is retaining Netty's ByteBufs until the data has been copied into the byte[] in NettyInputStream.read(...), thus eliminating some copying and memory allocation.
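The retained-buffer idea can be sketched as follows; `java.nio.ByteBuffer` stands in for Netty's `ByteBuf` so the sketch is self-contained, and the class name is hypothetical:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Keeps the arrived buffers as-is and copies straight into the caller's
// byte[] in read(byte[],int,int), instead of copying each chunk into an
// intermediate array first. On a real Netty ByteBuf, release() would be
// called once a buffer is fully drained; a real stream would also block
// instead of returning -1 when more data is still expected.
class RetainedBufferStream {
    private final ArrayDeque<ByteBuffer> buffers = new ArrayDeque<>();

    void offer(ByteBuffer buf) {          // event-loop side
        buffers.addLast(buf);
    }

    int read(byte[] dst, int off, int len) {
        ByteBuffer head = buffers.peekFirst();
        if (head == null) {
            return -1;                    // end of stream, for this sketch
        }
        int n = Math.min(len, head.remaining());
        head.get(dst, off, n);            // the single copy, into the caller's array
        if (!head.hasRemaining()) {
            buffers.removeFirst();        // real code: also head.release()
        }
        return n;
    }
}
```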

olotenko (Author) commented
netty_defer_notify.zip
This is a reworked Netty connector POC that I am happier with. netty_defer_notify.diff is against 66f5cfa. It contains a crude fix for issue #4286. netty_defer_notify_less_async.diff is on top of netty_defer_notify.diff.

The key idea of the refactoring is to separate concerns and make it plain who is responsible for what. For example, responseAvailable is a CompletableFuture that is guaranteed to be signalled when (part of) the Response becomes available, and responseDone is guaranteed to be signalled when the end of the Response has been received. Managing the channel pipeline is then concentrated in one place, instead of being spread across different files. Error propagation is also tied to one place, instead of propagating CloseListeners or notifying multiple futures everywhere. Interaction with AsyncConnectorCallback is separated from the generic routine with CompletableFutures.
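The two-future contract described above can be sketched like this; the field names mirror the comment, but the wiring is illustrative rather than taken from the patch:

```java
import java.util.concurrent.CompletableFuture;

// One future signalled when any part of the Response arrives, another when
// the Response is complete; errors are propagated to both from one place.
class ResponseSignals {
    final CompletableFuture<Void> responseAvailable = new CompletableFuture<>();
    final CompletableFuture<Void> responseDone = new CompletableFuture<>();

    // headers or a body chunk arrived; complete() is idempotent
    void onHeadersOrContent() {
        responseAvailable.complete(null);
    }

    // LastHttpContent arrived; the whole response may have come in one read
    void onLastContent() {
        responseAvailable.complete(null);
        responseDone.complete(null);
    }

    // single place for error propagation, instead of notifying futures everywhere
    void onError(Throwable t) {
        responseAvailable.completeExceptionally(t);
        responseDone.completeExceptionally(t);
    }
}
```

A caller that only needs headers waits on `responseAvailable`; one that must see the whole entity waits on `responseDone`.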

Note that Netty's ByteBufs are retained, which eliminates one level of ByteBuffer allocation and copying. This will be helpful for well-behaved apps that remember to read the body and close the body InputStream, including Jersey reading the Entity. Ill-behaved apps can cause Netty to detect that the ByteBufs are not returned properly to the pool, but I propose to bank on doing the well-behaved apps a good service.

Testware update: a couple of tests are not well-behaved apps :) They receive HTTP 200, but do not dispose of the Response properly.

netty_defer_notify_less_async.diff makes an additional speculative assumption: when the Response body is available in its entirety, a well-behaved app will utilise the CPU more efficiently if it is allowed to process the Response on the same Netty thread. It will not block waiting for data from the same socket while reading the response Entity, and it will use asynchronous APIs to communicate with any additional services or interact with any other blocking services. This behaviour may need guarding with a configuration option.

010gvr added a commit to 010gvr/jersey that referenced this issue Nov 6, 2019
This commit contains a rewritten implementation of Netty Jersey server and client with the following major changes:
1) Consume `ByteBuf`s from `HttpContent` as-is for processing, reduces a lot of overhead.
2) Fixes a bug on reading 0-byte or 1-byte JSON content
3) Add three new configurable `ChunkedInput` buffer properties through `jersey.ci.capacity`, `jersey.ci.read.timeout` and `jersey.ci.write.timeout`
4) Add configurable payload size through `jersey.max.http.request.entitySizeMb` , defaults to 50 Mb

This change should fix some of the long pending bug reports in eclipse-ee4j#3500 , eclipse-ee4j#3568 and eclipse-ee4j#4286 . This may also resolve eclipse-ee4j#4285

Signed-off-by: Venkat Ganesh 010gvr@gmail.com
olotenko commented Nov 29, 2019

@senivam I am not sure why this is closed. This issue is more than just the use of buffers. Please, justify.

senivam reopened this Nov 29, 2019
senivam commented May 15, 2020

This solution (netty_defer_notify.zip) was integrated into Jersey within #4286. Afterwards there were some attempts to redo the thread handling using the native Netty API, but the current solution appears to be more efficient. So, I would close this as resolved.

olotenko (Author) commented
@senivam makes sense.

senivam commented May 18, 2020

thank you, closing

senivam closed this as completed May 18, 2020