Request: make the disruptor publish mode customizable #754

Closed
tsgmq opened this issue Jan 13, 2022 · 5 comments · Fixed by #764

@tsgmq

tsgmq commented Jan 13, 2022

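jraft currently uses tryPublishEvent everywhere, and when the system is under heavy load the whole cluster goes down. Our business requires strict ordering: if an earlier logEntry has not been applied successfully, later ones must not be applied either. We would therefore prefer to use void publishEvent(EventTranslator translator);
which blocks and so applies entries strictly in order. It would help to have a configuration option that lets each business choose the publish mode that fits its own workload.
The current error is: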
com.alipay.sofa.jraft.error.RaftException: FSMCaller is overload.
at com.alipay.sofa.jraft.core.FSMCallerImpl.enqueueTask(FSMCallerImpl.java:236) ~[jraft-core-1.3.9.jar!/:?]
at com.alipay.sofa.jraft.core.FSMCallerImpl.onCommitted(FSMCallerImpl.java:245) ~[jraft-core-1.3.9.jar!/:?]
at com.alipay.sofa.jraft.core.BallotBox.setLastCommittedIndex(BallotBox.java:241) ~[jraft-core-1.3.9.jar!/:?]
at com.alipay.sofa.jraft.core.NodeImpl$FollowerStableClosure.run(NodeImpl.java:1881) ~[jraft-core-1.3.9.jar!/:?]
at com.alipay.sofa.jraft.storage.impl.LogManagerImpl$AppendBatcher.flush(LogManagerImpl.java:469) ~[jraft-core-1.3.9.jar!/:?]
at com.alipay.sofa.jraft.storage.impl.LogManagerImpl$StableClosureEventHandler.onEvent(LogManagerImpl.java:569) ~[jraft-core-1.3.9.jar!/:?]
at com.alipay.sofa.jraft.storage.impl.LogManagerImpl$StableClosureEventHandler.onEvent(LogManagerImpl.java:496) ~[jraft-core-1.3.9.jar!/:?]
at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:168) [disruptor-3.4.2.jar!/:?]
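
To make the two publish modes concrete, here is a minimal sketch of the difference between a fail-fast publish and a blocking publish. It deliberately uses a `java.util.concurrent.ArrayBlockingQueue` as a stand-in for the Disruptor ring buffer: `offer` plays the role of `tryPublishEvent` and `put` plays the role of `publishEvent`. The `PublishMode` enum and the `publish` helper are hypothetical names for illustration only; they are not jraft or Disruptor APIs.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PublishModeSketch {

    /** Hypothetical switch a business could configure; not an existing jraft option. */
    enum PublishMode { TRY, BLOCKING }

    /**
     * Publishes one event into a bounded buffer.
     * TRY fails fast when the buffer is full (analogous to tryPublishEvent);
     * BLOCKING waits for free space (analogous to publishEvent).
     */
    static boolean publish(BlockingQueue<String> buffer, String event, PublishMode mode) {
        if (mode == PublishMode.TRY) {
            return buffer.offer(event); // returns false when full, never waits
        }
        try {
            buffer.put(event); // waits until a consumer frees a slot
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(2);
        publish(buffer, "entry-1", PublishMode.TRY);
        publish(buffer, "entry-2", PublishMode.TRY);

        // The buffer is now full, so the fail-fast path rejects the event.
        System.out.println("try publish accepted: "
                + publish(buffer, "entry-3", PublishMode.TRY));

        // A slow consumer frees one slot after 100 ms, so the blocking path
        // waits and then succeeds instead of reporting an overload.
        Thread consumer = new Thread(() -> {
            try {
                Thread.sleep(100);
                buffer.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        System.out.println("blocking publish accepted: "
                + publish(buffer, "entry-3", PublishMode.BLOCKING));
        consumer.join();
    }
}
```

The trade-off is that the blocking mode never drops an entry but ties up the publishing thread for as long as the consumer is slow, while the fail-fast mode keeps the publisher responsive and surfaces the overload as an error.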

@fengjiachun
Contributor

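Judging from the stack trace, the business state machine's onApply is slow, and the resulting back pressure overloads the fsm. Does increasing the disruptor buf size solve the problem for now?
Older versions did use publishEvent; the reasons for the change are fairly complicated.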

@killme2008
Contributor

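The current implementation still has room for optimization. It was originally written this way to fix a deadlock. We will take another look.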

@killme2008
Contributor

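That said, as long as you submit apply(task) calls sequentially, they are guaranteed to execute in order. The internal try publish calls in the fsm and logmanager are all made while holding a lock, which preserves ordering.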
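
As a rough illustration of why publishing under a lock preserves order, the sketch below assigns an index and performs a non-blocking publish inside one critical section, then checks that the buffer holds the indexes in sequence even with several producer threads. This is not jraft's code: the class, the `submit` method, and the `ArrayBlockingQueue` stand-in for the ring buffer are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.locks.ReentrantLock;

public class OrderedTryPublishSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final BlockingQueue<Long> buffer; // stand-in for the ring buffer
    private long nextIndex = 1;               // guarded by lock

    OrderedTryPublishSketch(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    /**
     * Assigns the next index and try-publishes it in one critical section.
     * Returns the index, or -1 if the buffer was full (the index is not consumed).
     */
    long submit() {
        lock.lock();
        try {
            long index = nextIndex;
            if (!buffer.offer(index)) {
                return -1;
            }
            nextIndex++;
            return index;
        } finally {
            lock.unlock();
        }
    }

    /** Removes and returns everything currently in the buffer, in queue order. */
    List<Long> drain() {
        List<Long> out = new ArrayList<>();
        buffer.drainTo(out);
        return out;
    }

    /** Runs several producers and reports whether the buffer holds 1..n in order. */
    static boolean runDemo(int threads, int perThread) {
        OrderedTryPublishSketch sketch = new OrderedTryPublishSketch(threads * perThread);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    sketch.submit();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        List<Long> seen = sketch.drain();
        if (seen.size() != threads * perThread) {
            return false;
        }
        for (int i = 0; i < seen.size(); i++) {
            if (seen.get(i).longValue() != i + 1L) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("published in index order: " + runDemo(4, 1000));
    }
}
```

Because the index assignment and the publish happen under the same lock, no other thread can slip an event in between them, so ordering does not depend on the publish call being a blocking one.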

@tsgmq
Author

tsgmq commented Jan 13, 2022

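Increasing the disruptor buf size does solve the problem, but I worry that if we keep raising it, one day it will be so large that the process cannot start. For now I am reducing the size of the batches flowing through, so that each onApply finishes as quickly as possible.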
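
The concern about raising the buffer size indefinitely can be checked with a small back-of-the-envelope model. The sketch below is a deterministic tick simulation, not a measurement of jraft: `commitPerTick`, `applyPerTick`, and the buffer sizes are hypothetical numbers, and the model simply drains the backlog and then adds new entries once per tick.

```java
public class BufferSizingSketch {

    /**
     * Returns the first tick at which a fail-fast publish would be rejected,
     * or -1 if the buffer never overflows within maxTicks.
     * Per tick: the consumer applies up to applyPerTick entries, then the
     * producer commits commitPerTick new entries.
     */
    static long firstRejectedTick(int bufferSize, int commitPerTick,
                                  int applyPerTick, long maxTicks) {
        long backlog = 0;
        for (long tick = 1; tick <= maxTicks; tick++) {
            backlog = Math.max(0, backlog - applyPerTick);
            backlog += commitPerTick;
            if (backlog > bufferSize) {
                return tick;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Apply is slower than commit: a bigger buffer only postpones the overflow.
        System.out.println("size 1024: " + firstRejectedTick(1024, 10, 8, 1_000_000));
        System.out.println("size 2048: " + firstRejectedTick(2048, 10, 8, 1_000_000));
        // Apply keeps up with commit: the backlog stays bounded at any size.
        System.out.println("apply keeps up: " + firstRejectedTick(1024, 10, 10, 1_000_000));
    }
}
```

Under these assumptions, doubling the buffer roughly doubles the time until the first rejection but does not remove it, whereas bringing the apply rate up to the commit rate (for example by making each onApply faster) keeps the backlog bounded.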

googlespot pushed a commit to googlespot/sofa-jraft that referenced this issue Jan 24, 2022
googlespot pushed a commit to googlespot/sofa-jraft that referenced this issue Jan 25, 2022
@killme2008
Contributor

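Let me think through a back-pressure design for this systematically. The core conflict here is that two asynchronous processes call each other, which can lead to a deadlock under full load.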
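
The mutual-call problem can be sketched with two bounded buffers. In the example below, two hypothetical stages, A and B, each have a full buffer, and each stage's handler thread tries to publish into the other stage. Since a handler that is busy publishing cannot drain its own buffer, neither publish can complete. A timed `offer` is used in place of a truly blocking call so the demo terminates; with an untimed blocking publish, both threads would wait forever. The stage names and helpers are illustrative and do not correspond to specific jraft classes.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class MutualPublishSketch {

    /** Fills a bounded buffer to capacity, standing in for a stage under full load. */
    static BlockingQueue<String> fullBuffer(int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        while (queue.offer("pending")) {
            // keep adding until the buffer rejects the next event
        }
        return queue;
    }

    /** Timed publish: false means a blocking publish would still be waiting. */
    static boolean publishWithin(BlockingQueue<String> target, String event, long timeoutMillis) {
        try {
            return target.offer(event, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /**
     * Each stage's handler publishes into the other stage while both buffers
     * are full. Returns true if neither publish completes within the timeout.
     */
    static boolean bothStall(int capacity, long timeoutMillis) {
        BlockingQueue<String> stageA = fullBuffer(capacity);
        BlockingQueue<String> stageB = fullBuffer(capacity);
        boolean[] published = new boolean[2];

        // While a handler waits here it is not draining its own buffer.
        Thread handlerA = new Thread(() ->
                published[0] = publishWithin(stageB, "from-A", timeoutMillis));
        Thread handlerB = new Thread(() ->
                published[1] = publishWithin(stageA, "from-B", timeoutMillis));
        handlerA.start();
        handlerB.start();
        try {
            handlerA.join();
            handlerB.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !published[0] && !published[1];
    }

    public static void main(String[] args) {
        System.out.println("both handlers stalled: " + bothStall(4, 100));
    }
}
```

A fail-fast publish avoids the cycle because each handler returns immediately and can go back to draining its own buffer, at the cost of having to handle the rejected event some other way.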

googlespot pushed a commit to googlespot/sofa-jraft that referenced this issue Feb 16, 2022
@killme2008 killme2008 added this to the 1.4.0 milestone Mar 8, 2022
3 participants