Optimize performance for h2c protocol #1400
Merged
Optimization results
macOS 13.3.1 (a) (22E772610a)
6-core Intel Core i7, 16 GB RAM
macOS 10.15.7
JDK 1.8.0_291
VM version: JDK 1.8.0_291, Java HotSpot(TM) 64-Bit Server VM, 25.291-b10
VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
VM options: -Xmx1g -Xms1g -XX:MaxDirectMemorySize=4g -XX:+UseG1GC -Djmh.ignoreLock=true -Dserver.host=localhost -Dserver.port=12200 -Dbenchmark.output=
Blackhole mode: full + dont-inline hint
Warmup: 1 iterations, 10 s each
Measurement: 1 iterations, 300 s each
Timeout: 10 min per iteration
Threads: 1000 threads, will synchronize iterations
Benchmark mode: Throughput, ops/time
Benchmark: com.alipay.sofa.benchmark.Client.existUser
Approach
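The current code of
com.alipay.sofa.rpc.transport.netty.NettyChannel#writeAndFlush
is as follows. In Netty 4 and later, the source shows that when a channel's writeAndFlush method is called, Netty checks whether the calling thread is the EventLoop thread bound to that channel. If it is not the EventLoop thread, Netty constructs a write task (WriteTask) and submits it to the EventLoop to be executed later.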
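The thread check described above can be modeled without Netty. The following is a minimal sketch, not Netty's source: `EventLoopDispatchSketch`, `handOffs`, and `written` are hypothetical names, and a single-thread executor stands in for the EventLoop. It only illustrates that every write issued from a non-EventLoop thread costs one task hand-off.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified model of the "am I on the EventLoop thread?" check (hypothetical, not Netty code). */
class EventLoopDispatchSketch {
    private volatile Thread loopThread;
    // A single-thread executor stands in for the channel's EventLoop.
    private final ExecutorService loop = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "event-loop-sketch");
        t.setDaemon(true);
        loopThread = t;
        return t;
    });
    final AtomicInteger handOffs = new AtomicInteger();           // tasks submitted to the loop by writes
    final List<String> written = new CopyOnWriteArrayList<>();    // stands in for the socket

    boolean inEventLoop() {
        return Thread.currentThread() == loopThread;
    }

    void writeAndFlush(String msg) {
        if (inEventLoop()) {
            written.add(msg);                       // already on the loop thread: write directly
        } else {
            handOffs.incrementAndGet();             // one task per message
            loop.execute(() -> written.add(msg));
        }
    }

    void executeOnLoop(Runnable task) {
        loop.execute(task);
    }

    void shutdownAndWait() {
        loop.shutdown();
        try {
            loop.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        EventLoopDispatchSketch ch = new EventLoopDispatchSketch();
        for (int i = 0; i < 3; i++) {
            ch.writeAndFlush("msg-" + i);                        // user thread: 3 hand-offs
        }
        ch.executeOnLoop(() -> ch.writeAndFlush("from-loop"));   // loop thread: no extra hand-off
        ch.shutdownAndWait();
        System.out.println(ch.handOffs.get() + " hand-offs for " + ch.written.size() + " messages");
    }
}
```

In this model, the three writes from the main thread each submit their own task, while the write issued from inside the loop thread is performed in place.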
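From the code above, we can see that when Netty 4 writes a message it always ensures the work is handed to the EventLoop thread, and each time the EventLoop thread is scheduled to run a WriteTask it writes only one message, so tasks and messages map one-to-one. We can therefore submit all messages to a write queue (WriteQueue) first. Internally, the queue acquires the EventLoop once and submits a single task, which then keeps taking messages off the queue and calling Netty 4's write.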
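When the flush logic in
com.alipay.sofa.rpc.common.BatchExecutorQueue#run
executes, it is already running on the EventLoop thread. From the Netty source discussed earlier, we know that a write issued from the EventLoop thread is performed immediately, with no thread switch. Compared with calling writeAndFlush directly from the user thread every time, this greatly reduces the number of switches between user threads and the EventLoop thread, and it greatly increases the number of messages written per task, which achieves the effect of sending in batches.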
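The queue-and-drain idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `BatchExecutorQueue` from SOFARPC: the class name `BatchWriteQueueSketch`, the `chunkSize` parameter, and the counters are hypothetical, and incrementing `written` and `flushes` stands in for Netty's write and flush calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical batching write queue: many enqueues, one drain task on the event loop. */
class BatchWriteQueueSketch {
    private final Queue<String> queue = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final int chunkSize;                            // assumed: max writes per flush
    final AtomicInteger drainTasks = new AtomicInteger();   // tasks submitted to the event loop
    final AtomicInteger written = new AtomicInteger();      // stands in for channel.write(msg)
    final AtomicInteger flushes = new AtomicInteger();      // stands in for channel.flush()

    BatchWriteQueueSketch(int chunkSize) {
        this.chunkSize = chunkSize;
    }

    /** Called from user threads: queue the message and schedule a drain only if none is pending. */
    void enqueue(String msg, Executor eventLoop) {
        queue.add(msg);
        scheduleDrain(eventLoop);
    }

    private void scheduleDrain(Executor eventLoop) {
        if (scheduled.compareAndSet(false, true)) {
            drainTasks.incrementAndGet();
            eventLoop.execute(() -> drain(eventLoop));
        }
    }

    /** Runs on the event loop thread, so each write here needs no further thread hand-off. */
    private void drain(Executor eventLoop) {
        try {
            int pending = 0;
            while (queue.poll() != null) {
                written.incrementAndGet();
                if (++pending == chunkSize) {               // flush a full chunk
                    flushes.incrementAndGet();
                    pending = 0;
                }
            }
            if (pending > 0) {
                flushes.incrementAndGet();                  // flush the remainder
            }
        } finally {
            scheduled.set(false);
            if (!queue.isEmpty()) {
                scheduleDrain(eventLoop);                   // messages arrived while finishing
            }
        }
    }

    public static void main(String[] args) {
        List<Runnable> loopTasks = new ArrayList<>();       // stand-in event loop: tasks run later
        BatchWriteQueueSketch q = new BatchWriteQueueSketch(128);
        for (int i = 0; i < 1000; i++) {
            q.enqueue("msg-" + i, loopTasks::add);
        }
        for (int i = 0; i < loopTasks.size(); i++) {
            loopTasks.get(i).run();
        }
        System.out.println(q.drainTasks.get() + " drain task(s), " + q.written.get()
                + " writes, " + q.flushes.get() + " flushes");
    }
}
```

In the demo, the deferred task list makes the behavior deterministic: 1000 enqueued messages result in a single drain task rather than 1000 separate write tasks. With a real event loop the number of drain tasks depends on timing, but it cannot exceed the number of messages.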
The diagram is shown below.