Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SC-2863] Failing to produce a block should deduct Online Credits / Rep #1048

Closed
morelazers opened this issue Dec 16, 2021 · 25 comments
Closed
Assignees
Labels
approved ✅ bug Something isn't working discussion Discussions about proposed changes or ideas effort-3 Will need to plan this before implementing p1-asap State Chain

Comments

@morelazers
Copy link

morelazers commented Dec 16, 2021

Describe the changes you'd like

Currently there's an Active Validator on Soundcheck not producing blocks.

We should be punishing this, but instead we're giving them Online Credits and Reputation for submitting Heartbeats 😅

Validator ID: 5CX9KrU7aFS5jM67qh7yuXZzhBimGpCrBJuTAQssBGXMvDzD

Does this issue have a Shortcut Story?

Not yet.

Implementation

We can punish inactive block producers by including a reward for block authorship. This way, missing an aura slot has a real (and proportionate) cost to the misbehaving node.

The way to do this would be in an on_initialize hook (probably in the rewards pallet) that pulls the block author from pallet_authorship and simply mints the emissions for the current block to the account of the block author.

A side benefit of this would be that we would likely be able to remove the current (fairly complex) system we have for lazy apportionment of rewards. If each validator pays out their rewards once every num-validators blocks, rather than paying out all rewards every reward_interval blocks, then the performance benefit of the lazy apportionment / redeeming mechanism falls away.

If we apply this as part of a runtime upgrade we'd have to ensure that all rewards are paid out during the upgrade. It might be worth leaving the redeem extrinsic in temporarily as a no-op to avoid breaking assumptions in the CFE / UI / CLI.

Describe alternatives you've considered

Inherents: I initially though inherents would be the right approach for this, however on reflection I think inherents are appropriate when some eternal information needs to be passed in (such as the timestamp). However, in this case, we already have the information required, via pallet_authorship.

Aura: we could handle this by slashing authors who missed their authorship slot. This would be more explicit and arguably the 'correct' solution to the problem. However, it would require a custom fork of the aura pallet to revert the changes from this commit, plus some extra offence handling logic, and I would rather not touch aura if it can be avoided... particularly considering that (a) it's not clear why this was removed from aura in the first place - there might be a good reason and (b) there is an open substrate issue to add this back to core substrate (as per my comment below).

Additional context

session.validators: Vec<ValidatorId>
[
  5F9ofCBWLMFXotypXtbUnbLXkM1Z9As47GRhDDFJDsxKgpBu
  5Ge1xF1U3EUgKiGYjLCWmgcDHXQnfGNEujwXYTjShF6GcmYZ
  5DJVVEYPDFZjj9JtJRE2vGvpeSnzBAUA74VXPSpkGKhJSHbN
  5Decocf1WeTtWdyNMpuC4y6xzERYecat71Xk5GkcT8Y26EBZ
  5GVnJYpiknGcmbXPG97JYENNkzEhvyw9RfGPsTrrCF6tzDzo
  5CX9KrU7aFS5jM67qh7yuXZzhBimGpCrBJuTAQssBGXMvDzD <-- the offline one
]

The Validators currently producing blocks (5CX... is not in the list):

image

The Online and Reputation states for the Account:

image

Tag specific people for comment

@dandanlen @andyjsbell

@morelazers morelazers added bug Something isn't working State Chain labels Dec 16, 2021
@andyjsbell
Copy link

What's the priority on this one?

@morelazers
Copy link
Author

p1

@andyjsbell
Copy link

Is #553 related?

@morelazers
Copy link
Author

Probably, yep

@dandanlen
Copy link
Collaborator

Some interesting context (also relevant to #553) - I found the following comment in the aura pallet source:

// TODO [#3398] Generate offence report for all authorities that skipped their
// slots.

The associated issue has not been touched for > 2 years but links to some extra context. According to this issue and this PR, the current approach to use the im_online pallet to punish non-participating Aura nodes (so similar to our hearbeats).

However, according to that issue, it would be better to use Aura directly rather than hearbeats for heartbeats... so similar to our conclusion? I'll read the PR in detail to get a better understanding.

@dandanlen dandanlen added discussion Discussions about proposed changes or ideas for approval 🙏 Blessing is required labels Dec 28, 2021
@andyjsbell
Copy link

Nice and simple. LGTM.

A side benefit of this would be that we would likely be able to remove the current (fairly complex) system we have for lazy apportionment of rewards. If each validator pays out their rewards once every num-validators blocks, rather than paying out all rewards every reward_interval blocks, then the performance benefit of the lazy apportionment / redeeming mechanism falls away.

Nice side effect 👍

If we apply this as part of a runtime upgrade we'd have to ensure that all rewards are paid out during the upgrade. It might be worth leaving the redeem extrinsic in temporarily as a no-op to avoid breaking assumptions in the CFE / UI / CLI.

We might be able to remove it as we are quite still early in the day.

We have briefly discussed this but we should also consider how long they take to authour a block as it has consequences on other parts of the system which use blocks as a notion of time. Probably best in a new issue but I am just wondering on what the timeout is on failing to produce a block.

@dandanlen
Copy link
Collaborator

We have briefly discussed this but we should also consider how long they take to authour a block as it has consequences on other parts of the system which use blocks as a notion of time. Probably best in a new issue but I am just wondering on what the timeout is on failing to produce a block.

I think this is something to consider separately. It's a tricky one since a block author can take longer to author a block through no fault of their own (ie. the block might just be heavy due to some benchmarking issue). So it would have to be tracked over time and we would have to compare averages between different nodes or something. Also not sure we have access to reliable block production times - we have the timestamp but it's not precise and is relatively easy to fake.

@andyjsbell
Copy link

Also not sure we have access to reliable block production times - we have the timestamp but it's not precise and is relatively easy to fake.

Right, but I presume there is a timeout in not producing one so we are depending on a timestamp in the end anyhow??

@dandanlen
Copy link
Collaborator

I'm not sure how aura decides that a block author has missed their slot. The timestamp is allowed to drift by up to 30 seconds according to the implementation in the timestamp pallet. Maybe if the next block author 'overtakes' the current one and authors another valid block first?

Now that I think about it, I'm not entirely sure what the "last block" time in the polka js interface means. Is it the time required by the local node to process the latest block? Or the time taken to author & propagate it? The time taken by the network as a whole? Probably some combination of all of these but I'm not sure. Either way, I don't think we need to focus on this right now.

@morelazers morelazers added approved ✅ and removed wip 🚧 for approval 🙏 Blessing is required labels Dec 29, 2021
@morelazers
Copy link
Author

LGTM, just one question:

allows the block author to mint the emissions

Is this going to be automatic, or would the Author's CFE be required to submit something? I'm assuming the former since it's neither inherent nor extrinsic, but I thought I'd ask.

@dandanlen
Copy link
Collaborator

Automatic. I've edited the proposal to make this clearer:

The way to do this would be in an on_initialize hook (probably in the rewards pallet) that pulls the block author from pallet_authorship and simply mints the emissions for the current block to the account of the block author.

@nakul-cf nakul-cf changed the title Failing to produce a block should deduct Online Credits / Rep [SC-2863] Failing to produce a block should deduct Online Credits / Rep Jan 5, 2022
@dandanlen dandanlen pinned this issue Jan 26, 2022
@andyjsbell andyjsbell unpinned this issue Feb 1, 2022
@dandanlen
Copy link
Collaborator

@Janislav if you take a look at this - it might be enough to just plug in a different implementation of the RewardDistribution trait instead of the current one that is based on the rewards pallet.

@Janislav
Copy link
Contributor

Janislav commented Feb 3, 2022

Makes sense to me. So the solution would be (probably) changing the implementation of the distribute method implemented on the OnDemandRewardsDistribution struct to mint the rewards not the reserve but to the account of the current block author. I would also consider moving implementation to the runtime because it's easier to access Authorship from here.

@dandanlen
Copy link
Collaborator

Yes that's the basic idea, but you can delete OnDemandRewardsDistribution and give it a better name - like BlockAuthorRewardDistribution.

I would also consider moving implementation to the runtime because it's easier to access Authorship from here.

Yes that sounds good.

@Janislav
Copy link
Contributor

Janislav commented Feb 3, 2022

Alright. After I played a bit around I would probably prefer the following:

  1. Leaving the implementation in the rewards pallet (turned out it's probably not so easy to move this to the runtime)
  2. Introducing a new trait which we implement in the Runtime and returns us the current block author
  3. Add a new config to the rewards pallet which takes the implementation of the trait we implemented in the runtime

With this, we have the AccountId of the current block author available in the rewards pallet and can mint the rewards to his account.

NOTE: Moreover, I think leaving the implementation in the rewards makes more sense because it logically belongs there.

@dandanlen
Copy link
Collaborator

dandanlen commented Feb 3, 2022

The thing is, the entire rewards pallet exists purely to implement the lazy rewards distribution. Once we remove OnDemandRewardsDistribution, I think you'll find you can remove all the storage and methods from the rewards pallet, and there will be nothing left - we will have a pallet where all we really need it a trait implementation. At least that's my impression from reading the code - I haven't tried implementing it myself.

If you want, start along the path you think makes sense, and open a draft PR, I can take a look.

@Janislav
Copy link
Contributor

Janislav commented Feb 3, 2022

ah okay... well that changes the situation

@Janislav
Copy link
Contributor

Janislav commented Feb 3, 2022

will investigate this a bit more tomorrrow👌

@Janislav
Copy link
Contributor

Janislav commented Feb 4, 2022

Okay, so as far as I can tell removing the rewards and pallet and payout the rewards directly should be possible. I just ask myself why we initially decided to pay out the rewards at the end of each epoch and implement this additional logic to make it possible. Pretty sure this was a design decision based on requirements and assumptions, right? Would like to check and fully understand what this would mean.

@dandanlen
Copy link
Collaborator

dandanlen commented Feb 4, 2022

Ideally, we would pay out rewards equally to all validators on every block: this would give us the most accurate accounting and would be the most equitable way of doing things as long as we can rely on all validators behaving themselves.

However it is very inefficient since we have to iterate through a storage map + read/write to ~150 storage locations on every block. Hence we implemented the 'lazy' version where we track what each validator is entitled to without actually crediting it to their account until they ask for it, or until we run an auction. The lazy version is more efficient but significantly more complex.

The latest approach quite neatly solves the efficiency issue, and also adds an added incentive for validators to stay online author blocks, and it's much less complex than lazy rewards.

@dandanlen
Copy link
Collaborator

Btw. the new approach may require a change to the cli claim procedure - I believe it currently needs to do make a call to redeem to redeem the rewards before making a claim. This should no longer be necessary. Should be straightforward to fix though.

@Janislav
Copy link
Contributor

Janislav commented Feb 4, 2022

Cool. Sounds like a plan. Then I would now:

  1. Implementing RewardsDistribution in the runtime and paying out the rewards to the block author
  2. Delete the rewards pallet
  3. Remove the usage of all structs implemented in the rewards pallet (RewardRollover, Rewarder)

@dandanlen
Copy link
Collaborator

Sounds good. Also note that apart from the cli this will be a breaking change for the CFE (the pallet indexes will change when we remove the rewards pallet) and also probably for a bunch of stuff on the UI side of things...

@morelazers morelazers added the effort-3 Will need to plan this before implementing label Feb 4, 2022
@dandanlen
Copy link
Collaborator

@Janislav, pretty sure this can be closed, right?

@Janislav
Copy link
Contributor

Yes, this is already merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved ✅ bug Something isn't working discussion Discussions about proposed changes or ideas effort-3 Will need to plan this before implementing p1-asap State Chain
Projects
None yet
Development

No branches or pull requests

4 participants