Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Ensure election offchain workers don't overlap #8828

Merged
15 commits merged into from
May 18, 2021
Merged

Conversation

kianenigma
Copy link
Contributor

  • make sure we only recompute if
    1. no solution was cached
    2. the cached solution was not feasible.

The most common scenario is that we have a cached solution, and it has a worse score. In this case, we do nothing from now on.

@kianenigma kianenigma marked this pull request as ready for review May 17, 2021 10:46
@github-actions github-actions bot added the A0-please_review Pull request needs code review. label May 17, 2021
Copy link
Contributor

@gui1117 gui1117 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code seems good to me, but I want to take a second fresh look

frame/election-provider-multi-phase/src/unsigned.rs Outdated Show resolved Hide resolved
frame/election-provider-multi-phase/src/unsigned.rs Outdated Show resolved Hide resolved
frame/election-provider-multi-phase/src/unsigned.rs Outdated Show resolved Hide resolved
frame/election-provider-multi-phase/src/unsigned.rs Outdated Show resolved Hide resolved
kianenigma and others added 3 commits May 17, 2021 13:00
Co-authored-by: Gavin Wood <gavin@parity.io>
Co-authored-by: Guillaume Thiolliere <gui.thiolliere@gmail.com>
Co-authored-by: Guillaume Thiolliere <gui.thiolliere@gmail.com>
@kianenigma kianenigma added B7-runtimenoteworthy D5-nicetohaveaudit ⚠️ PR contains trivial changes to logic that should be properly reviewed. D9-needsaudit 👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited and removed D5-nicetohaveaudit ⚠️ PR contains trivial changes to logic that should be properly reviewed. labels May 17, 2021
@s3krit s3krit added this to the polkadot v0.9.2 milestone May 17, 2021
@kianenigma kianenigma requested review from gavofyork and gui1117 May 17, 2021 12:44
coriolinus
coriolinus previously approved these changes May 17, 2021
Copy link
Contributor

@coriolinus coriolinus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of nits, but I think the fundamental logic is sound.

As pointed out elsewhere, a Drop implementation on a lock guard might be a good enhancement, but IMO it's not necessary within the scope of this PR.

frame/election-provider-multi-phase/src/unsigned.rs Outdated Show resolved Hide resolved
frame/election-provider-multi-phase/src/unsigned.rs Outdated Show resolved Hide resolved
frame/election-provider-multi-phase/src/unsigned.rs Outdated Show resolved Hide resolved
@kianenigma kianenigma dismissed coriolinus’s stale review May 17, 2021 13:53

significant code changed.

@kianenigma
Copy link
Contributor Author

rewrote the whole thing with the fancier storage lock. Tested well locally, and deploying it now to dry-run on kusama's next election.

A new pair of reviews would be good @tomusdrw @coriolinus @thiolliere

What I miss is some unit tests, will add again.

@kianenigma kianenigma requested review from tomusdrw and coriolinus May 17, 2021 13:58
Copy link
Contributor

@coriolinus coriolinus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fundamental logic hasn't changed, just the implementation. I'd just like to see at least one test demonstrating that the OCW won't ever attempt to mine two solutions simultaneously, even if Self::ensure_offchain_repeat_frequency is satisfied.

primitives/runtime/src/offchain/storage_lock.rs Outdated Show resolved Hide resolved
Comment on lines 948 to 961
// after election finalization, clear OCW solution storage.
if <frame_system::Pallet<T>>::events()
.into_iter()
.filter_map(|event_record| {
let local_event = <T as Config>::Event::from(event_record.event);
local_event.try_into().ok()
})
.find(|event| {
matches!(event, Event::ElectionFinalized(_))
})
.is_some()
{
unsigned::kill_ocw_solution::<T>();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if this function is not run on this specific block due to the newly introduced lock ?

It shouldn't harm too much, as it will be removed due to failing feasibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you suggest anywhere else where we would kill the solution?

Thinking about it again, if we simply don't kill this, it will be replaced with a fresh one on the next round. In the past it would have been tricky since we could race around it. But now, the initial OCW will attempt to mine a new solution and replace this, and no other OCW will work in parallel, so it should be safe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have very convincing suggestion, maybe we could match the phase and remove when the phase is close, but removing on every block when the phase is close is a bit useless too.

anyway I agree with what you say, current implementation is fine, even if not removed, it is not an issue

Copy link
Contributor

@gui1117 gui1117 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me

kianenigma and others added 2 commits May 17, 2021 18:37
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
frame/election-provider-multi-phase/src/lib.rs Outdated Show resolved Hide resolved
frame/election-provider-multi-phase/src/lib.rs Outdated Show resolved Hide resolved
frame/election-provider-multi-phase/src/lib.rs Outdated Show resolved Hide resolved
@@ -284,7 +284,7 @@ pub fn testnet_genesis(
}).collect::<Vec<_>>(),
},
pallet_staking: StakingConfig {
validator_count: initial_authorities.len() as u32 * 2,
validator_count: initial_authorities.len() as u32,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

offchain elections are only acceptable if the provide exactly enough validators. So since we have double the initial_authorities, offchain elections always fail on a dev chain and it produces some warn/erorr logs.

can revert if it offends anything else.

kianenigma and others added 2 commits May 18, 2021 08:30
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Copy link
Contributor

@coriolinus coriolinus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Only a couple of nits in the tests.

frame/election-provider-multi-phase/src/unsigned.rs Outdated Show resolved Hide resolved
frame/election-provider-multi-phase/src/unsigned.rs Outdated Show resolved Hide resolved
frame/election-provider-multi-phase/src/unsigned.rs Outdated Show resolved Hide resolved
@kianenigma
Copy link
Contributor Author

bot merge

@ghost
Copy link

ghost commented May 18, 2021

Waiting for commit status.

@ghost ghost merged commit 3d753ec into master May 18, 2021
@ghost ghost deleted the kiz-guard-election-ecw branch May 18, 2021 12:22
@jakoblell jakoblell added D1-audited 👍 PR contains changes to fund-managing logic that has been properly reviewed and externally audited and removed D9-needsaudit 👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited labels May 20, 2021
nazar-pc pushed a commit to autonomys/substrate that referenced this pull request Aug 8, 2021
* Initial version, well tested, should work fine.

* Add one last log line

* Update frame/election-provider-multi-phase/src/unsigned.rs

Co-authored-by: Gavin Wood <gavin@parity.io>

* Update frame/election-provider-multi-phase/src/unsigned.rs

Co-authored-by: Guillaume Thiolliere <gui.thiolliere@gmail.com>

* Update frame/election-provider-multi-phase/src/unsigned.rs

Co-authored-by: Guillaume Thiolliere <gui.thiolliere@gmail.com>

* Fix a few more things

* fix build

* rewrite the whole thing with a proper lock

* clean

* clean some nits

* Add unit tests.

* Update primitives/runtime/src/offchain/storage_lock.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Fix test

* Fix tests

Co-authored-by: Gavin Wood <gavin@parity.io>
Co-authored-by: Guillaume Thiolliere <gui.thiolliere@gmail.com>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. D1-audited 👍 PR contains changes to fund-managing logic that has been properly reviewed and externally audited
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants