-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client/core: OrderBook update notifications races #1
base: master
Are you sure you want to change the base?
Conversation
client/core/bookie.go
Outdated
// Acquiring dc.booksMtx here in dc.bookie serves as a barrier for notifications that | ||
// come after we sent new subscription request (e.i. notes that need to be applied | ||
// on top of snapshot we get from new subscription request will wait on that mutex | ||
// here). Note, we don't acquire dc.booksMtx for the whole handleBookOrderMsg | ||
// duration; that leaves some room for notifications from previous subscription to | ||
// come and apply concurrently with us applying snapshot we just got from new | ||
// subscription request. | ||
// TODO: I'm not sure at the moment whether is OK to execute this notification from | ||
// old subscription while Resetting order book concurrently. Could this result into | ||
// mixed OrderBook state (for orders, candles cache, or something else) ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like to know answer to that TODO question (that's in handleEpochReportMsg
func).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really see anything in epoch_report
notification that would need to be atomic with respect to current orders
in the book or seq
number, so probably don't even need this "barrier" here (and are locking dc.booksMtx
in dc.bookie
just to get the bookie). Acted on TODO from above accordingly.
client/core/bookie.go
Outdated
// Acquiring dc.booksMtx here in dc.bookie serves as a barrier for notifications that | ||
// come after we sent new subscription request (e.i. notes that need to be applied | ||
// on top of snapshot we get from new subscription request will wait on that mutex | ||
// here). Note, we don't acquire dc.booksMtx for the whole handleBookOrderMsg | ||
// duration; that leaves some room for notifications from previous subscription to | ||
// come and apply concurrently with us applying snapshot we just got from new | ||
// subscription request. | ||
// TODO: I'm not sure at the moment whether is OK to execute this notification from | ||
// old subscription while Resetting order book concurrently. Could this result into | ||
// mixed OrderBook state (for orders, candles cache, or something else) ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like to know answer to that TODO question (that's in handleEpochOrderMsg
func).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Epoch orders don't seem to be interfering with outstanding orders, so looks like epoch_order
notification is meant to be executing concurrently with the rest order book updates (which it currently is both on master and in this PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On closer inspection, it seems epoch orders (epoch_orders
notification and other related stuff such as epochQueues
, currentEpoch
and proofedEpoch
) are kinda half-baked now:
- it doesn't have anything to do with
orders
orseq
, but for some reasonseq
is incremented with eachepoch_orders
notification; we should probably removeseq
usage byepoch_orders
notification - but similar to
orders
<->seq
atomic relationshipepochQueues
<->currentEpoch
<->proofedEpoch
could benefit from atomicity because currently these fall out of sync quite often, which results in spamming something like this in logs (and what's more important long term, does not allow client to differentiate between server sending invalid match proof note and client's view on epoch orders being out of sync = non-atomic):
2023-03-09 05:28:53.450 [ERR] CORE: Route 'match_proof' notification handler error (DEX dex.decred.org:7232): match proof validation failed: epoch 167833983 match proof note references 1 orders, but local epoch queue is empty
...
2023-03-09 14:41:50.809 [ERR] CORE: Route 'match_proof' notification handler error (DEX dex.decred.org:7232): match proof validation failed: epoch 167837301 match proof note references 3 orders, but local epoch queue has 2
that atomicity isn't trivial to implement though ^ (and whether it should be done with current subscribe/resubscribe
mechanisms or have a completely separate parallel machinery working is something to investigate) and it's not high priority any time soon; so I'm downgrading log level to Trace
in this PR and supply relevant comments there too since it isn't useful to anyone to see as error the way it is currently implemented. Acted on TODO from above accordingly too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, saw relevant discussions somewhere around decred#278 (review), but nothing conclusive.
// Return any non-nil error, but still revoke purged orders (done already at | ||
// this point) and clear the book. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it OK to revoke all active orders before we Reset order book ? We might want to do it to not hold dc.booksMtx
longer than necessary.
I didn't test this change extensively just yet. |
I think this is also likely fixing a reason (if not The reason) for decred#1543 if we get order book snapshot right after order qty got updated to 0 here and drop When testing I saw ghost order of non-zero quantity, meaning when resubscribe finished I saw 3 orders in the book on 1st |
6bfa596
to
8bef7bc
Compare
So far, this PR solves missing update notifications caused by insufficient This seems like a bug to me. I think state change shouldn't apply in that case, not just be logged as error, shall I change that (probably better to split it off #4) ? The comments/changes I've added in this PR so far assume just drop too. |
edf407d
to
6bbb728
Compare
This is caused by insufficient dc.booksMtx locking, this change takes care of: - clarifying a lot of non-trivial stuff around dc.booksMtx - handling client->sever subscribe/resubscribe/suspend flow such that server order book is observed in consistent state all the time by dexc client (and also feed receivers that subscribe to dexc order book feeds, which is just as important); that means it should fix order book notifications being dropped occasionally
6bbb728
to
87e1108
Compare
Just moved some unrelated comments out of this PR somewhere else in that last force-push ^. |
c1bcd0d
to
cee0a95
Compare
ad9d264
to
e298e04
Compare
f9959eb
to
a66bbae
Compare
e3c487c
to
333e9f6
Compare
This PR supersedes decred#2203.
Targeting 2 separate things:
chan
when possible)dc.booksMtx
, clarifying a lot of non-trivial stuff around it and handling sever subscribe/resubscribe/suspend flow such that server order book is observed in consistent state all the time bydexc
client (and alsofeed
receivers that subscribe todexc
order book feeds, which is just as important); that means it should fix order book notifications being dropped occasionally