Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix remaining data race in BurningManAccountingStore #7047

Conversation

stejbac
Copy link
Contributor

@stejbac stejbac commented Feb 24, 2024

Add missing synchronisation to the BurningManAccountingStore.toProtoMessage method, by first copying the internal list of accounting blocks inside a read-lock, prior to serialisation (still outside the lock, to maximise concurrency). Since we only make a shallow copy, this should be fast and take no more than a MB or so of extra memory.

This prevents a data race seen to cause a ConcurrentModificationException during store persistence, that sometimes occurred when the application resumed from a long suspension, as seen in the following log snippet (from a few months back, but probably still relevant):

Oct-11 07:28:15.987 [ForkJoinPool.commonPool-worker-2] INFO  b.c.d.b.a.s.BurningManAccountingStore: Add new accountingBlock at height 800762 at Sat Jul 29 19:20:05 PST 2023 with 1 txs 
Oct-11 07:28:16.023 [ForkJoinPool.commonPool-worker-2] INFO  b.c.d.b.a.s.BurningManAccountingStore: Add new accountingBlock at height 800763 at Sat Jul 29 19:31:01 PST 2023 with 2 txs 
Oct-11 07:28:16.050 [JavaFX Application Thread] ERROR b.c.p.PersistenceManager: Error in saveToFile toProtoMessage: BurningManAccountingStore, BurningManAccountingStore_v3 
Oct-11 07:28:16.052 [JavaFX Application Thread] ERROR b.c.s.CommonSetup: Uncaught Exception from thread JavaFX Application Thread 
Oct-11 07:28:16.052 [JavaFX Application Thread] ERROR b.c.s.CommonSetup: throwableMessage= java.util.ConcurrentModificationException 
Oct-11 07:28:16.052 [JavaFX Application Thread] ERROR b.c.s.CommonSetup: throwableClass= class java.lang.RuntimeException 
Oct-11 07:28:16.056 [ForkJoinPool.commonPool-worker-2] INFO  b.c.d.b.a.s.BurningManAccountingStore: Add new accountingBlock at height 800764 at Sat Jul 29 19:35:05 PST 2023 with 0 txs 
Oct-11 07:28:16.057 [JavaFX Application Thread] ERROR b.c.s.CommonSetup: Stack trace:
java.lang.RuntimeException: java.util.ConcurrentModificationException
        at bisq.common.persistence.PersistenceManager.persistNow(PersistenceManager.java:441)
        at bisq.common.persistence.PersistenceManager.persistNow(PersistenceManager.java:419)
        at bisq.common.persistence.PersistenceManager.lambda$maybeStartTimerForPersistence$8(PersistenceManager.java:407)
        at bisq.common.reactfx.FxTimer.lambda$restart$0(FxTimer.java:93)
        at com.sun.scenario.animation.shared.TimelineClipCore.visitKeyFrame(TimelineClipCore.java:239)
        at com.sun.scenario.animation.shared.TimelineClipCore.playTo(TimelineClipCore.java:180)
        at javafx.animation.Timeline.doPlayTo(Timeline.java:172)
        at javafx.animation.AnimationAccessorImpl.playTo(AnimationAccessorImpl.java:39)
        at com.sun.scenario.animation.shared.SingleLoopClipEnvelope.timePulse(SingleLoopClipEnvelope.java:103)
        at javafx.animation.Animation.doTimePulse(Animation.java:1186)
        at javafx.animation.Animation$1.lambda$timePulse$0(Animation.java:204)
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:399)
        at javafx.animation.Animation$1.timePulse(Animation.java:203)
        at com.sun.scenario.animation.AbstractPrimaryTimer.timePulseImpl(AbstractPrimaryTimer.java:343)
        at com.sun.scenario.animation.AbstractPrimaryTimer$MainLoop.run(AbstractPrimaryTimer.java:266)
        at com.sun.javafx.tk.quantum.QuantumToolkit.pulse(QuantumToolkit.java:559)
        at com.sun.javafx.tk.quantum.QuantumToolkit.pulse(QuantumToolkit.java:543)
        at com.sun.javafx.tk.quantum.QuantumToolkit.pulseFromQueue(QuantumToolkit.java:536)
        at com.sun.javafx.tk.quantum.QuantumToolkit.lambda$runToolkit$11(QuantumToolkit.java:342)
        at com.sun.glass.ui.InvokeLaterDispatcher$Future.run(InvokeLaterDispatcher.java:96)
        at com.sun.glass.ui.gtk.GtkApplication._runLoop(Native Method)
        at com.sun.glass.ui.gtk.GtkApplication.lambda$runLoop$11(GtkApplication.java:277)
        at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.util.ConcurrentModificationException
        at java.base/java.util.LinkedList$LLSpliterator.forEachRemaining(LinkedList.java:1246)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
        at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
        at bisq.core.dao.burningman.accounting.storage.BurningManAccountingStore.toProtoMessage(BurningManAccountingStore.java:157)
        at bisq.common.proto.persistable.PersistableEnvelope.toPersistableMessage(PersistableEnvelope.java:30)
        at bisq.common.persistence.PersistenceManager.persistNow(PersistenceManager.java:427)
        ... 22 more
 
Oct-11 07:28:16.081 [ForkJoinPool.commonPool-worker-2] INFO  b.c.d.b.a.s.BurningManAccountingStore: Add new accountingBlock at height 800765 at Sat Jul 29 19:38:42 PST 2023 with 0 txs 
Oct-11 07:28:16.109 [ForkJoinPool.commonPool-worker-2] INFO  b.c.d.b.a.s.BurningManAccountingStore: Add new accountingBlock at height 800766 at Sat Jul 29 19:45:33 PST 2023 with 2 txs 

Most of the accounting store data races were fixed a year ago in #6551, which ensured all access to the internal linked list of accounting blocks is synchronised, except that via toProtoMessage.

Add missing synchronisation to the 'toProtoMessage' method, by first
copying the internal list of blocks inside a read-lock, prior to
serialisation (still outside the lock, to maximise concurrency). Since
we only make a shallow copy, this should be fast and take no more than a
MB or so of extra memory.

This prevents a race seen to cause a ConcurrentModificationException
during store persistence, that sometimes occurred when the application
resumed from a long suspension.
Remove the last 10 blocks one-by-one from the end of the internal linked
list of blocks, instead of rebuilding a truncated list from scratch.
(This all takes place within a write-lock anyway, so it's atomic.)
Copy link
Contributor

@alvasw alvasw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

Thank you!

Copy link
Contributor

@alejandrogarcia83 alejandrogarcia83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@alejandrogarcia83 alejandrogarcia83 merged commit f941ef3 into bisq-network:master Feb 25, 2024
3 checks passed
@alejandrogarcia83 alejandrogarcia83 added this to the v1.9.16 milestone Apr 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants