Randomized Fuzzer: tried to delete a non-existent validator #2326

Closed
4 tasks
ValarDragon opened this issue Sep 13, 2018 · 9 comments
@ValarDragon
Contributor

Steps to Reproduce

I got the following from test_sim_modules on develop, in the staking simulation. Re-running it a few times will probably turn up another failing seed.

Starting SimulateFromSeed with randomness created with seed 1536821388207640227
Starting the simulation from time Tue Jul 10 10:36:43 UTC 30723, unixtime 907373443003
Simulating... block 9/10, operation 50/70.  --- FAIL: TestStakeWithRandomMessages (0.67s)
	random_simulate_blocks.go:367: tried to delete a nonexistent validator
FAIL

For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@rigelrozanski
Contributor

rigelrozanski commented Sep 13, 2018

Oh, that one might have been introduced by the staking transient store. I'm pretty confused as to why, though: I can’t reproduce it locally with make test_sim_modules (I’ve tried a couple dozen times), although I did notice it fail non-deterministically once on the transient store PR (#2310).

@ValarDragon
Contributor Author

It's failing on most PRs: https://circleci.com/gh/cosmos/cosmos-sdk/30294?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link. I guess we should make a command so it's easy to re-run these failing simulations with the same seed.
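
A rough sketch of what such a command could look like, under stated assumptions: the `seed` flag and the `resolveSeed` helper are hypothetical names made up for illustration, not existing SDK options. The idea is only that an explicit seed, when given, overrides the per-run default and is always printed so a failing run can be replayed.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// resolveSeed returns the explicit seed when one was supplied (non-zero);
// otherwise it returns the fallback seed. Hypothetical helper.
func resolveSeed(explicit, fallback int64) int64 {
	if explicit != 0 {
		return explicit
	}
	return fallback
}

func main() {
	// "seed" is an assumed flag name, not an existing option.
	explicit := flag.Int64("seed", 0, "seed to replay; 0 picks a time-based seed")
	flag.Parse()

	seed := resolveSeed(*explicit, time.Now().UnixNano())
	// Always log the seed so that a failure can be reproduced later.
	fmt.Printf("Starting simulation with seed %d\n", seed)
}
```

With something like this, the failing run from the report could be replayed by passing `-seed=1536821388207640227`, assuming the simulation itself takes all of its randomness from that seed.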

@alexanderbez
Contributor

@ValarDragon do most of those PRs have develop with transient store support merged in/rebased?

@ValarDragon
Contributor Author

I think so. It seems likely to me that the transient store broke this, as the failures began after it was merged and are happening really early on. (Over the weekend I ran a couple of simulations that got to 1500 blocks, and many that failed around blocks 100-200, but never below 10 blocks.)

@rigelrozanski
Contributor

I think we should hold off on heavily investigating this specific error until we have done the refactors, as the update store will be removed at that point (however, the transient store should still be in use).

Unless, of course, this is due to some new form of non-determinism in the transient store (which it seems as though it may be?), so maybe it is actually still valuable to investigate.

@ValarDragon
Contributor Author

ValarDragon commented Sep 14, 2018

There isn't non-determinism. It only fails sometimes because gaia_test_sim_modules is set to use a different seed each run. Maybe the transient store isn't getting set again properly?
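
To illustrate the point about seeds, here is a minimal sketch, not the SDK's simulation code: the operation names and `simulateOps` are hypothetical. It shows that a simulation drawing every choice from a PRNG built from a given seed replays the same operation sequence for that seed, so a run that picks a fresh seed each time can look flaky while still being deterministic per seed.

```go
package main

import (
	"fmt"
	"math/rand"
)

// opNames is a made-up set of staking operations, for illustration only.
var opNames = []string{"create_validator", "delegate", "begin_unbonding", "remove_validator"}

// simulateOps returns the n operation names chosen by a PRNG seeded with seed.
func simulateOps(seed int64, n int) []string {
	r := rand.New(rand.NewSource(seed))
	ops := make([]string, n)
	for i := range ops {
		ops[i] = opNames[r.Intn(len(opNames))]
	}
	return ops
}

// equalOps reports whether two operation sequences are identical.
func equalOps(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	const seed = 1536821388207640227 // seed taken from the failing run above
	first := simulateOps(seed, 70)
	second := simulateOps(seed, 70)
	fmt.Println("same seed gives same ops:", equalOps(first, second))
}
```

This only holds if the seed is the sole source of randomness; anything else that varies between runs (map iteration order, wall-clock time) would break the replay.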

@alexanderbez
Contributor

So how should we proceed with this issue?

@rigelrozanski
Contributor

Still think this is low priority short term, as the solution may be thrown away. But if you're looking for something to do, it may reveal some valuable deeper-level bug to do with the transient store. However, there is also a store refactor that joon is working on (which may affect this?), so yeah, probs not the best way to spend time.

@cwgoes
Contributor

cwgoes commented Oct 2, 2018

I expect whatever this was has been fixed by #2394. At minimum, the old code path is no longer relevant.

Reopen if it can be replicated.

@cwgoes cwgoes closed this as completed Oct 2, 2018