-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client/core: async init/redeem requests #911
client/core: async init/redeem requests #911
Conversation
client/core/core_test.go
Outdated
func (tdb *TDB) setUpdateMatchHook(ch chan order.MatchStatus) { | ||
tdb.updateMatchMtx.Lock() | ||
defer tdb.updateMatchMtx.Unlock() | ||
tdb.updateMatchChan = ch | ||
} | ||
|
||
func (tdb *TDB) waitForMatchUpdate() (order.MatchStatus, error) { | ||
tdb.updateMatchMtx.RLock() | ||
defer tdb.updateMatchMtx.RUnlock() | ||
var latestStatus order.MatchStatus | ||
if tdb.updateMatchChan == nil { | ||
return latestStatus, errors.New("not configured to track match db updates") | ||
} | ||
select { | ||
case latestStatus = <-tdb.updateMatchChan: | ||
return latestStatus, nil | ||
case <-time.After(1 * time.Second): | ||
return latestStatus, errors.New("no match db update in 1 second") | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous concurrent reading and writing of tdb.updateMatchChan
caused some panics for me. Introduced the mutex lock and these helper functions to prevent littering the file with mutex locks in each site where tdb.updateMatchChan
is set or accessed.
client/core/core_test.go
Outdated
select { | ||
case latestStatus = <-tdb.updateMatchChan: | ||
return latestStatus, nil | ||
case <-time.After(1 * time.Second): | ||
return latestStatus, errors.New("no match db update in 1 second") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also ran into a situation where the test previously expected one db update for setting both {M,T}aker{SwapCast,Redeemed}
statuses AND auth.{Init,Redeem}Sig
. Because only one db update was required, the tdb.updateMatchChan
was only drained once but the changes in this PR now triggers 2 db updates causing the channel to block when a third db update call was triggered by some other operation.
The test timed out before I could determine that the fix was to call tdb.waitForMatchUpdate()
twice rather than once for init and redeem scenarios. This change makes it easier and quicker to catch this kind of issue subsequently.
client/core/core_test.go
Outdated
// TakerSwapCast -> MatchComplete (MakerRedeem skipped when redeem ack is received with valid sig) | ||
status = <-rig.db.updateMatchChan | ||
if status != order.MatchComplete { | ||
t.Fatalf("wrong match status wanted %v, got %v", order.MatchComplete, status) | ||
// TakerSwapCast -> MakerRedeem update (after bcasting redeem) | ||
newMatchStatus, err = rig.db.waitForMatchUpdate() | ||
if err != nil { | ||
t.Fatalf("unexpected db error: %v", err) | ||
} | ||
if newMatchStatus != order.MakerRedeemed { | ||
t.Fatalf("wrong match status. wanted %v, got %v", order.MakerRedeemed, newMatchStatus) | ||
} | ||
// MakerRedeem -> MatchComplete update (after maker's redeem ack is received with valid sig) | ||
newMatchStatus, err = rig.db.waitForMatchUpdate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the change I alluded to in my previous comment. Broadcasting a redeem now triggers 2 db updates. The first update, sets the status to TakerSwapCast -> MakerRedeemed
for Maker instead of the previous single db update setting the status to TakerSwapCast -> MatchComplete
. The second db update, happens after sendRedeemAsync
completes successfully and sets the status to MakerRedeemed -> MatchComplete
.
54c5d4c
to
a596580
Compare
Running tests locally:
full
|
A panic when trying again: test logs
Will try tests a few times on master to be sure it isn't something there. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After rebasing this onto current master, I don't see any test failures. So, maybe something was fixed?
Spoke too soon, this is with rebase onto master:
log
2021-01-15 16:31:25.116 [INF] TCORE: Audited contract (btc: 2575bee13ffd0bf9f91eab6aa34725d931c7bd63c34ab42388ba351f6290bfbe0714751f) paying to iaLeii2qF4g5nNCAMisbjJqHJh9DiZi3kG for order 28036b34d8ef269884dd53c34b8a631fedf824d7319ab39e7d73cc75da50bc6a, match 53809e4ed60a0e2af07f1b2a6bb5a6017a578a27cbdc20a1759f76b0889a83ce
--- FAIL: TestRefunds (0.05s)
core_test.go:3559: audit sig not set for taker
==================
WARNING: DATA RACE
Write at 0x00c0005deba0 by goroutine 149:
decred.org/dcrdex/client/core.(*trackedTrade).processAuditMsg.func1()
/home/joe/git/dcrdex/client/core/trade.go:1922 +0x63b
Previous read at 0x00c0005deba0 by goroutine 144:
decred.org/dcrdex/client/core.TestRefunds()
/home/joe/git/dcrdex/client/core/core_test.go:3558 +0x2609
testing.tRunner()
/home/joe/.local/go/src/testing/testing.go:1108 +0x202
Goroutine 149 (running) created at:
decred.org/dcrdex/client/core.(*trackedTrade).processAuditMsg()
/home/joe/git/dcrdex/client/core/trade.go:1907 +0x304
decred.org/dcrdex/client/core.handleAuditRoute()
/home/joe/git/dcrdex/client/core/core.go:4493 +0x2d4
decred.org/dcrdex/client/core.TestRefunds()
/home/joe/git/dcrdex/client/core/core_test.go:3545 +0x23d8
testing.tRunner()
/home/joe/.local/go/src/testing/testing.go:1108 +0x202
Goroutine 144 (running) created at:
testing.(*T).Run()
/home/joe/.local/go/src/testing/testing.go:1159 +0x796
testing.runTests.func1()
/home/joe/.local/go/src/testing/testing.go:1430 +0xa6
testing.tRunner()
/home/joe/.local/go/src/testing/testing.go:1108 +0x202
testing.runTests()
/home/joe/.local/go/src/testing/testing.go:1428 +0x5aa
testing.(*M).Run()
/home/joe/.local/go/src/testing/testing.go:1338 +0x4eb
decred.org/dcrdex/client/core.TestMain.func1()
/home/joe/git/dcrdex/client/core/core_test.go:918 +0x7e
decred.org/dcrdex/client/core.TestMain()
/home/joe/git/dcrdex/client/core/core_test.go:920 +0x176
main.main()
_testmain.go:127 +0x271
==================
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good on a first pass.
proof := &match.MetaData.Proof | ||
coinID := []byte(coinIDs[i]) | ||
if match.Match.Side == order.Taker { | ||
match.SetStatus(order.MatchComplete) // could this cause the match to be retired before the `redeem` request succeeds? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this cause the match to be retired before the
redeem
request succeeds?
I think so. While it's critical to set the coin here, I think the status update could wait until after the redeem
message.
Or maybe you just need to check the RedeemSig
before retiring an order, in which case you would need to carefully revoke the match for most errors from the redeem
request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we're already checking the RedeemSig
in matchIsActive
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DEXOrdersWithActiveMatches
does not check the redeemSig
if the match status is MatchComplete
and a dexc restart will cause this match to not be loaded. I'm re-considering updating the match status only after the redeem
request succeeds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How bout fixing DEXOrdersWithActiveMatches
to work the same way as matchIsActive
? EDIT: Although that could make DEXOrdersWithActiveMatches
more expensive if it has to decode the match proof for all statuses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've always thought both methods should behave similarly, but will wait to see what @buck54321 thinks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reflecting matrix discussion, I'd prefer that we not mark anything as MatchComplete
until all step are done. We can infer the sub-step from the coin IDs and sigs in the match metadata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on board with handling MatchComplete
as a special status for now, whereby it is only updated after all steps, with the longer-term goal of creating a TakerRedeemed
that is consistent with the other statues that reflect the transaction broadcast state. Both changes have potential to break things in lots of ways, but just changing MatchComplete
shouldn't be too bad. It will of course require attention to the match status resolvers however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolving this in #957.
9e3ea84
to
aab9d4d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems good to go. The match status thing is a pre-existing bug, but if you choose to do the quicker fix in this PR, I'll swing back and re-approve.
aab9d4d
to
b24dd39
Compare
bcf4c22
to
67e2c6c
Compare
@JoeGruffins 67e2c6c should fix the issues uncovered by your go tests. |
@@ -24,7 +24,7 @@ do | |||
|
|||
# build and run tests | |||
if [ "$m" != '.' ]; then go build; fi | |||
env GORACE="halt_on_error=1" go test -race -short ./... | |||
env GORACE="halt_on_error=1" go test -race -short -count 1 ./... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was intended for local testing, to re-run all tests even if results were previously cached. I'm not sure if CI caches test results, might it be necessary to ask CI to ignore cached results and do a more thorough testing? Will leave this here in the meantime to get opinions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never noticed a "(cached)" on CI, but this is fine I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All working well. Just a couple log lines need a small typo fixed.
For follow-up, we should look at using Core.wg
for the goroutines launched by sendInitAsync
and sendRedeemAsync
, plus wiring signAndRequest
>sendRequest
with Core
's context so that it can break out of the reply wait when Core
starts shutting down.
Resolves #863.