Runtime Task Executor + Example for staking slashing spans #8197

kianenigma · 2021-02-24T14:19:02Z

Could fix https://github.com/paritytech/srlabs_findings/issues/57
Fixes #6835
Fixes #5188
Will also lay the foundation of #7911 and paritytech/polkadot-sdk#465 via the scheduler.

The gist of it is that: From now on each vote has an enabled flag, and we flip that upon slash. This means for each slash event we need to iterate all nominators, which is why I made the scheduler.

@shawntabrizi pointed out an interesting idea that we could outsource this operation: If we track the total voters of each validator, we can have someone submit the list of nominators that need to be iterated, potentially saving a lot of time.

I am okay with that as well, in which case we need to decouple this PR. But note that even then, iterating a few hundred keys might be something that we want to split among blocks. Overall, the scheduler is a personal interest of mine, and I want to experiemnt with it anyhow.

Also, I belive perhaps the work of @athei for contracts can be rewritten using a similar approach as what we do here.

Needs a lot of tests and documentation more.
📢📢📢🚨 Introduction video about the PR and its goals (parity users can access): https://drive.google.com/file/d/1608ZrYLyQA9U9dpqHW2PEqDQQsX3mSAH/view?usp=sharing

frame/staking/src/scheduler.rs

athei · 2021-02-25T07:43:19Z

Also, I belive perhaps the work of @athei for contracts can be rewritten using a similar approach as what we do here.

You mean the lazy storage removal? From a glance over this code I think I am doing the same thing as you are doing here. Just in a less generalized way (queuing tasks in storage and working on them in on_initialize). If we could extract this RuntimeTaskScheduler to its own crate I could make use of it to implement lazy storage removal. Is that what you mean?

This would allow us to have a global queue of tasks. It would solve the problem of coming up with proper weight limits for each pallet's on_initialize workqueue.

kianenigma · 2021-02-25T07:53:35Z

Also, I belive perhaps the work of @athei for contracts can be rewritten using a similar approach as what we do here.

You mean the lazy storage removal? From a glance over this code I think I am doing the same thing as you are doing here. Just in a less generalized way (queuing tasks in storage and working on them in on_initialize). If we could extract this RuntimeTaskScheduler to its own crate I could make use of it to implement lazy storage removal. Is that what you mean?

This would allow us to have a global queue of tasks. It would solve the problem of coming up with proper weight limits for each pallet's on_initialize workqueue.

yeah exactly, I thought maybe we can try this in the contracts pallet as well to see if it is any good or not.

As you said, I think they should fit nicely as they are doing the same thing, this one is just trying to be reusable.

athei

Not sure if my comments are useful because it is still WIP.

athei · 2021-02-25T08:07:30Z

frame/staking/src/scheduler.rs

+
+	/// Add a new task to the internal queue.
+	pub fn add_task(&mut self, task: Task) {
+		self.tasks.push(task)


We would need to cap the size of the queue because the weight to decode the Vec is linear to the size of the queue. Also we should make use of append here to have O(1) costs when pusing items to storage.

Yeah so in the new API, ideally we need both append and decode_len in StoredExecutor.

Although, this situation exists because I want to abstract the task management from the pallet. As you see now, as a pallet you just accept a type E: StoredExecutor, put it in storage and mutate it whenever you want to .execute. This is a okay-ish abstraction (compared to you manually being in charge of putting a Vec<Task> in storage), but we kinda miss the benefit of decode_len and append.

but I am sure we can build these abstractions, if the whole idea of having StoredExecutor seems like a good way to go.

frame/staking/src/lib.rs

frame/staking/src/benchmarking.rs

apopiak · 2021-02-26T16:40:22Z

frame/staking/src/lib.rs

@@ -1721,16 +1824,16 @@ decl_module! {
 			ensure!(!targets.is_empty(), Error::<T>::EmptyTargets);
 			ensure!(targets.len() <= MAX_NOMINATIONS, Error::<T>::TooManyTargets);

-			let old = Nominators::<T>::get(stash).map_or_else(Vec::new, |x| x.targets);
+			let old = Nominators::<T>::get(stash).map_or_else(Vec::new, |x| x.targets.into_iter().map(|(x, _)| x).collect::<Vec<_>>());

 			let targets = targets.into_iter()
 				.map(|t| T::Lookup::lookup(t).map_err(DispatchError::from))
 				.map(|n| n.and_then(|n| if old.contains(&n) || !Validators::<T>::get(&n).blocked {


same note about allocations here

Not sure how it can be avoided here?

kianenigma · 2021-03-12T10:19:30Z

This is now ready for review. I'd be happy to either try myself, or if anyone's interested, explore both #7911 paritytech/polkadot-sdk#465 using this abstraction. You'll see that it is a bit over-engineered just for the purpose of slashing tasks, and the reason is that I envision building other multi-block stuff with it.

kianenigma · 2021-03-12T10:29:36Z

Actually, I can pretty much see the possibility of creating a new pallet that has just a storage task executor, and creating trait interfaces so that for example staking would only delegate its own tasks to the pallet-executor.

athei

I am worried by the fact that the StoredExecutor declares its own storage related weights out of scope. This means that accounting for the weight that results from encoding and decoding the potentially unbounded task queue is solely upon the user of this feature. I am not even sure if it would be possible for the user to do so.

The dependent introduced in this PR (staking) already disregards those weights (see my other comments). I can't tell if this is a problem but I wager that it would profit if the executor would offer an abstraction to deal with it.

I am not sure if this feature should be changed to take coding weight into account or a higher level abstraction can be build onto it that handles it. This would also need to include storage append support.

I won't oppose the merge as you surely have a purpose for it. However, I cannot use it in contracts for the reasons stated.

athei · 2021-03-12T11:33:10Z

frame/support/src/executor.rs

+use crate::{weights::Weight, traits::Get};
+use codec::{Encode, Decode};
+use sp_runtime::traits::Zero;
+use crate::{RuntimeDebugNoBound, PartialEqNoBound, EqNoBound, CloneNoBound};


I would prefer to merge those crate:: imports.

frame/support/src/executor.rs

frame/staking/src/lib.rs

athei · 2021-03-12T12:04:13Z

frame/staking/src/lib.rs

+			// if this is a non-zero slash, schedule tasks to chill their nominations.
+			if !slash_fraction.is_zero() {
+				let task = SlashTask::new(stash.clone());
+				<SlashTaskExecutor<T>>::mutate(|e| e.add_task(task));


The add_task complexity is linear to the amount of already existing tasks in storage because of decoding and encoding of all tasks by mutate. Are the number of queued slashes capped and benchmarked for the worst case?

athei · 2021-03-12T12:05:11Z

frame/staking/src/lib.rs

+
+			let task_weight = <SlashTaskExecutor<T>>::mutate(|e| e.execute());
+			// The additional weight of reading the tasks, and writing back the result.
+			add_weight(1, 1, task_weight);


This does not only add the storage accesses but also decoding and encoding weight proportional to the amount of queued tasks.

Indeed, this is one of the things that need to be fixed. Not sure where, but there's a TODO for it.

athei · 2021-03-12T12:06:03Z

frame/support/src/executor.rs

+pub trait RuntimeTask:
+	Sized + Clone + Default + Encode + Decode + PartialEq + Eq + sp_std::fmt::Debug


Why do we need Default?

I believe everything going to storage needs Default.

But the RuntimeTask does not go into storage. The StoredExecutor does which does not need its task to implement Default in order to implement it.

athei · 2021-03-12T12:06:59Z

frame/support/src/executor.rs

+	/// Return a vector of all tasks.
+	#[cfg(any(test, feature = "std"))]
+	fn tasks(&self) -> Vec<Self::Task>;
+	// TODO: providing an iter() might also be good.


I would really like to have those as a github issue rather than a TODO comment. But this is a personal preference. Feel free to ignore.

A PR shall never be merged with TODOs, these are notes that I want my reviewers to also see and be aware that I maybe intend to do them.

frame/support/src/executor.rs

Co-authored-by: Alexander Theißen <alex.theissen@me.com>

kianenigma · 2021-03-12T12:38:47Z

I am worried by the fact that the StoredExecutor declares its own storage related weights out of scope. This means that accounting for the weight that results from encoding and decoding the potentially unbounded task queue is solely upon the user of this feature. I am not even sure if it would be possible for the user to do so.

The dependent introduced in this PR (staking) already disregards those weights (see my other comments). I can't tell if this is a problem but I wager that it would profit if the executor would offer an abstraction to deal with it.

I am not sure if this feature should be changed to take coding weight into account or a higher level abstraction can be build onto it that handles it. This would also need to include storage append support.

I totally agree with this, as you already mentioned it in your previous round. Honestly, I haven't addressed it yet because I am sure it is solvable and it is not the core of the work. An executor that has a bounded queue, IMO, can initially be deployed even with this code as-is, since the amount of work that the executor is doing under the hood + linear decoding is simply negligible.

So we have two issues to fix here:

Each implementation of StoredExecutor needs to be benchmarked only for its internal operations. This is not the decoding stuff that you said, only the in-memory operations (such as iterating over the queue etc). I honestly think this is so minuscule that in the initial versions we can ignore it.
Then, there's the weight of the executor itself being stored in storage. Of course, StoredExecutor itself cannot know this because it doesn't know where and how it is stored. This is the responsibility of the user of StoredExecutor, namelt staking or contracts. I currently do take it account by adding 1 read and 1 write, and you are totally right that the decode/encode weight is missing.

I see the second one being more important to fix.

I won't oppose the merge as you surely have a purpose for it. However, I cannot use it in contracts for the reasons stated.

I think what contracts is doing is a good use case and rather bend to your requirements, as long as they are reasonable. If your only grumbles are the two points above, I think they are easily fixable.

kianenigma · 2021-07-07T16:22:30Z

shush bot, this is important work, and needs to stay around :D

stale · 2021-08-06T17:01:19Z

Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions.

kianenigma added 2 commits February 23, 2021 06:21

Some boilerplate for scheduler

02cd063

A working PoC

cd3f72d

xlc reviewed Feb 24, 2021

View reviewed changes

frame/staking/src/scheduler.rs Outdated Show resolved Hide resolved

athei reviewed Feb 25, 2021

View reviewed changes

apopiak reviewed Feb 26, 2021

View reviewed changes

frame/staking/src/benchmarking.rs Outdated Show resolved Hide resolved

apopiak reviewed Feb 26, 2021

View reviewed changes

kianenigma added 2 commits March 10, 2021 11:02

New design and location

c896956

Master.into()

40f1f49

Make it all work

135c509

kianenigma requested review from athei and apopiak March 12, 2021 10:15

small nits

d370f50

kianenigma marked this pull request as ready for review March 12, 2021 10:16

github-actions bot added the A0-please_review Pull request needs code review. label Mar 12, 2021

paritytech-ci added the D2-breaksapi label Mar 12, 2021

Fix build and tests.

1f9299d

github-actions bot added the A7-needspolkadotpr label Mar 12, 2021

Fix build and tests again.

726648f

kianenigma requested a review from andresilva as a code owner March 12, 2021 10:42

Fix support tests.

77604d8

athei reviewed Mar 12, 2021

View reviewed changes

kianenigma and others added 2 commits March 12, 2021 13:31

Update frame/support/src/executor.rs

0e1fa9c

Co-authored-by: Alexander Theißen <alex.theissen@me.com>

Update frame/support/src/executor.rs

c7bb378

Co-authored-by: Alexander Theißen <alex.theissen@me.com>

kianenigma mentioned this pull request Mar 26, 2021

helper macro to create storage types on the fly #8456

Merged

Allow for append and decode_len

b0acd21

kianenigma changed the title ~~Runtime Task Scheduler + Example for staking slashing spans~~ Runtime Task Executor + Example for staking slashing spans Mar 26, 2021

kianenigma added 3 commits April 6, 2021 12:22

Remove multi-pass stuff

de3c56e

Master.into()

cbfbdc6

Fix nit

710b137

gnunicorn added the A5-stale Pull request did not receive any updates in a long time. No review needed at this stage. Close it. label May 19, 2021

paritytech deleted a comment from cla-bot-2021 bot Jun 3, 2021

kianenigma mentioned this pull request Jun 22, 2021

Enable an easy way to unbond, if you are not *exposed* #8436

Closed

stale bot closed this Jul 7, 2021

kianenigma reopened this Jul 7, 2021

stale bot removed the A5-stale Pull request did not receive any updates in a long time. No review needed at this stage. Close it. label Jul 7, 2021

kianenigma mentioned this pull request Aug 6, 2021

Tracking Issue for multi-block election + dedicated election system parachain paritytech/polkadot-sdk#461

Open

19 tasks

stale bot added the A5-stale Pull request did not receive any updates in a long time. No review needed at this stage. Close it. label Aug 6, 2021

stale bot closed this Aug 20, 2021

kianenigma deleted the kiz-full-slash-reversal branch November 4, 2021 00:34

kianenigma restored the kiz-full-slash-reversal branch November 4, 2021 00:34

sea212 mentioned this pull request Oct 12, 2022

Investigation: How to safely remove outcome assets ("garbage collect") over multiple blocks? zeitgeistpm/zeitgeist#792

Closed

kianenigma mentioned this pull request Aug 24, 2023

[FRAME Core] General system for recognising and executing service work paritytech/polkadot-sdk#206

Closed

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Runtime Task Executor + Example for staking slashing spans #8197

Runtime Task Executor + Example for staking slashing spans #8197

kianenigma commented Feb 24, 2021 •

edited

Loading

athei commented Feb 25, 2021 •

edited

Loading

kianenigma commented Feb 25, 2021

athei left a comment

athei Feb 25, 2021

kianenigma Mar 12, 2021

apopiak Feb 26, 2021

kianenigma Mar 12, 2021

kianenigma commented Mar 12, 2021

kianenigma commented Mar 12, 2021

athei left a comment

athei Mar 12, 2021

athei Mar 12, 2021

kianenigma Mar 12, 2021

athei Mar 12, 2021

kianenigma Mar 12, 2021

athei Mar 12, 2021

kianenigma Mar 12, 2021

athei Mar 12, 2021 •

edited

Loading

athei Mar 12, 2021

kianenigma Mar 12, 2021

kianenigma commented Mar 12, 2021

kianenigma commented Jul 7, 2021

stale bot commented Aug 6, 2021

		pub trait RuntimeTask:
		Sized + Clone + Default + Encode + Decode + PartialEq + Eq + sp_std::fmt::Debug

Runtime Task Executor + Example for staking slashing spans #8197

Runtime Task Executor + Example for staking slashing spans #8197

Conversation

kianenigma commented Feb 24, 2021 • edited Loading

athei commented Feb 25, 2021 • edited Loading

kianenigma commented Feb 25, 2021

athei left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kianenigma commented Mar 12, 2021

kianenigma commented Mar 12, 2021

athei left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

athei Mar 12, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kianenigma commented Mar 12, 2021

kianenigma commented Jul 7, 2021

stale bot commented Aug 6, 2021

kianenigma commented Feb 24, 2021 •

edited

Loading

athei commented Feb 25, 2021 •

edited

Loading

athei Mar 12, 2021 •

edited

Loading