-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix GetIsValid
bugs (once and for all?)
#279
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very good!
I opened #284 to find out what the cumulative fix of all three approaches to fixing this getIsValid
bug is in the end (spoiler: I think it is a four-line diff it is a bit more than that). Please double-check and see if you agree.
Also a good idea to test with the seeds for the previous iterations of this bug; of course, it might be that these do no longer result in the same commands being generated due to unrelated changes.
@@ -436,109 +440,10 @@ addBlock cfg blk m = Model { | |||
(currentChain m) | |||
$ consideredCandidates | |||
|
|||
-- See note [Getting the valid blocks] just below | |||
valid' = foldl | |||
(Chain.foldChain (\s b -> Set.insert (blockHash b) s)) | |||
(valid m) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this to extend the previous valid m
? The tests seem to pass even with
(valid m) | |
Set.empty |
Motivation is that this would be a difference to #284.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm hesitant towards this, because the model could "forget" valid blocks if the blocks are on a chain that isn't connected to the Genesis block anymore. That was the motivation for IntersectMBO/ouroboros-network#3651 (comment).
Assume we use Set.empty
instead. Test.O.C.ChainDB.StateMachine
mentions the following about garbage collection:
Garbage collection on the model is instantaneous, while garbage collection on the SUT is not. The real VolatileDB knows about the validity of all blocks in it, even if those blocks are disconnected.
If we ask the SUT if a disconnected block is valid, and it returns Just True
, then the model will return Nothing
(the model doesn't know), because the model only collected valid, connected blocks, and this results in a test failure. By extending valid m
, we can be reasonably sure that valid'
is a non-strict superset of the set of valid blocks that the SUT knows about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm now also unsure whether we should have the Map.member hash (volatileDbBlocks m)
check in Model.getIsValid
, because it would suffer from the same non-determinism
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good point, I went back to 90b3bdf, where I can reproduce IntersectMBO/ouroboros-network#3389. Now it turns out that c8a6186 (what I originally thought would be enough as the entire fix, ie that was the fifth commit in #284 before I force-pushed it) does not fix the bug.
Instead, maintaining the set of valid blocks cumulatively does fix the bug, I pushed that know to #284. Note that it does not restrict the model getIsValid
to just blocks in the VolDB.
I'm now also unsure whether we should have the
Map.member hash (volatileDbBlocks m)
check inModel.getIsValid
, because it would suffer from the same non-determinism
So I think we could indeed remove this.
-- Considered candidates are candidates that extend the immutable part of | ||
-- the chain. Furthermore, the current selected chain is not considered as a | ||
-- candidate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this could say that sortChains
takes the current chain separately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add your suggestion in #279 (comment), then the comment can be reverted to what it previously was.
-- Considered candidates are candidates that extend the immutable part of | ||
-- the chain. Furthermore, the current selected chain is not considered as a | ||
-- candidate. | ||
consideredCandidates = filter ((/= CPS.chainState (cps m)) . fst) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the other changes, is this now actually necessary to make any tests pass? Again, motivation is that this would be a difference to #284.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I don't think it's necessary. If the current chain is the best candidate, then chain selection is idempotent (not sure if it's the right word). Including the current chain if it isn't the best candidate shouldn't have any effect on the outcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use consideredCandidates
for valid'
as in https://github.com/input-output-hk/ouroboros-consensus/pull/279/files#r1291126148, that would also motivate not filtering out the current chain.
Good idea! It makes sense to compile everything (once and for all?)
I wasn't able to reproduce ouroboros-network#3389 and ouroboros-network#3440 with the seeds listed there, but the two more recent issues I could reproduce by reverting some changes to the model. Maybe we should test the seeds on #284 too? |
-- by real algorithm, but the set of known invalid blocks is a superset of | ||
-- the ones known by the real algorithm. See the note | ||
-- Ouroboros.Storage.ChainDB.StateMachine.[Invalid blocks]. | ||
(map fst candidates) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As valid m
is cumulative, shouldn't it always be enough to use consideredCandidates
here? Both the model and the SUT will only ever validate blocks that are on a chain that extends the immutable chain.
I am asking this both as this is a difference to #284, and as consideredChains
is currently only used as an input to selectChain
, but not to build valid'
.
@@ -873,12 +778,8 @@ validChains cfg m bs = | |||
-- first, which results in the valid chain A->B', which is then chosen | |||
-- over the equally preferable A->B as it will be the first in the list | |||
-- after a stable sort. | |||
sortChains $ pruneKnownInvalid <$> chains bs | |||
sortChains $ chains bs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment above still accurate? Wouldn't the reader be puzzled about why we don't filter the invalid chains first as in the SUT?
-- Considered candidates are candidates that extend the immutable part of | ||
-- the chain. Furthermore, the current selected chain is not considered as a | ||
-- candidate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-- Considered candidates are candidates that extend the immutable part of | |
-- the chain. Furthermore, the current selected chain is not considered as a | |
-- candidate. | |
-- Considered candidates are candidates that extend the immutable part of | |
-- the chain, __except for__ the current selected chain. |
Closed in favour of #284, which describes the cumulative fix of this and previous PRs related to |
Description
Closes #123.
This PR changes two parts of the ChainDB model:
valid
andinvalid
fields of the model are now updated more rigorously.This has two effects:
GetIsValid
command. The model might know in advance which blocks are valid/invalid even before the SUT knows about them, but the only condition we require for test success in the case of theGetIsValid
command is: if the SUT says that a block is valid or invalid, then the model should have the same response. Conversely, if the SUT says it does not know if a block is valid or invalid, then the model's response is ignored (or in other words, it can be anything).Future test failures
Assuming that above arguments are correct, if we find a
GetIsValid
failure again in the future, the first thing to check would be if we are somehow not recording some valid/invalid blocks. Model chain selection should not have to change.Logs
Known failing test seeds were used to rerun tests, and the tests pass.
ouroboros-consensus#123
Recreate test failure
... $ cabal run storage-test -- -p 'ChainDB q-s-m' --quickcheck-replay=616511 ouroboros-storage Storage ChainDB ChainDB q-s-m sequential: OK (78.88s) +++ OK, passed 100000 tests. ... All 1 tests passed (78.89s)
ouroboros-network#3689
ouroboros-network-3689.patch
Recreate test failure
... $ cabal run storage-test -- -p 'ChainDB q-s-m' --quickcheck-replay=455411 ouroboros-storage Storage ChainDB ChainDB q-s-m sequential: OK (80.63s) +++ OK, passed 100000 tests. ... All 1 tests passed (80.63s)
ouroboros-network#3389
... $ cabal run storage-test -- -p 'ChainDB q-s-m' --quickcheck-replay=999301 ouroboros-storage Storage ChainDB ChainDB q-s-m sequential: OK (83.41s) +++ OK, passed 100000 tests. ... All 1 tests passed (83.41s)
ouroboros-network#3440
... $ cabal run storage-test -- -p 'ChainDB q-s-m' --quickcheck-replay=758015 ouroboros-storage Storage ChainDB ChainDB q-s-m sequential: OK (78.40s) +++ OK, passed 100000 tests. ... All 1 tests passed (78.40s)