
Penalties+activs+exits only at 64-epoch boundary #2192

Closed · wants to merge 23 commits

Conversation

vbuterin (Contributor)

Moves all validator activations and exits to a multiple of 64 epochs, and amortizes offline penalties so that they are also processed at that boundary. This makes chains with "empty epochs" easier to process, because all but 1/64 of epochs would not require validator-set or balance-related processing.

Also adds per-validator quadratic leaks as in #2125.
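As a rough, non-normative sketch of the scheduling described above (only the constant EPOCHS_PER_ACTIVATION_EXIT_PERIOD comes from the PR's diff; the gating shown here is illustrative, not the PR's actual code):

EPOCHS_PER_ACTIVATION_EXIT_PERIOD = 64

def is_period_boundary(epoch: int) -> bool:
    # Validator activations/exits and the amortized offline penalties are only
    # processed at epochs that are a multiple of this period.
    return epoch % EPOCHS_PER_ACTIVATION_EXIT_PERIOD == 0

# Inside process_epoch, registry updates and penalty processing would be gated on
# is_period_boundary(next_epoch), so only 1 epoch in 64 touches the validator set
# or balances:
assert sum(is_period_boundary(e) for e in range(64 * 4)) == 4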

@djrtwo (Contributor) left a comment

I like the first pass.
clean!

Base automatically changed from accounting-reform to dev on February 4, 2021 17:16
@djrtwo (Contributor) commented Feb 23, 2021

Note: I reverted this PR to its pre-Justin-cleanup state and moved @JustinDrake's cleanups into a separate PR targeting this one, for easier review:

#2212

@djrtwo force-pushed the vbuterin-patch-10 branch from cded289 to f4dbb40 on March 2, 2021 14:26
@michaelsproul (Contributor) left a comment

I'm in favour of the per-validator quadratic leak, but skeptical of the amortised epoch transition for these reasons:

  1. Having a regular penalty that wipes out previous gains is a bit strange from the perspective of a validator and I suspect we will have to spend considerable energy explaining why this is OK (or necessary). We already have validators on Discord freaking out about the tiniest penalties for a single epoch, let alone a punishing ~0.0005 ETH penalty every 64 epochs which can't be avoided. I can't see a way to remove the sawtooth penalty mechanism without unmaking the amortisation.
  2. The immediate benefit of the amortisation would be a compute time benefit for forks < 64 epochs in duration, yet I think resilience against DoS by forking is determined more by how clients store states in memory and on disk. In Lighthouse for example, we are still copying around full BeaconState objects in memory (no copy-on-write binary merkle trees yet), and storing full states on disk with repetition (no fancy deduplication or storing of diffs). The way we've made the client good enough for current network conditions is by ruthlessly reducing the frequency with which clones and disk writes of these states occur (particularly for finalized states, which are de-duplicated a little). In this sense I'm concerned the PR might be a premature optimisation. On the other hand, a mostly-immutable validator set could be exploited for memory + storage optimisations in future.
  3. Clients try to avoid spikes in resource consumption, yet the amortisation of epoch processing works contrary to this aim. Pushing extra tree hashing (caused by mutation of the validator set) and state storage to the period boundary could spike load, potentially impacting other duties (block proposal, etc). This also creates a larger DoS target for an attacker who is willing to wait for a period boundary to throw a bunch of forks at the network.

I'm open to changing my mind, but these are my thoughts after ~1 day of chewing over this proposal.

@@ -341,49 +360,72 @@ def get_flag_deltas(state: BeaconState,
for index in get_eligible_validator_indices(state):
Contributor

This loop could be written more succinctly as:

for index in unslashed_participating_indices:
    base_reward = get_base_reward(state, index)
    penalty_canceling_reward = base_reward * numerator // FLAG_DENOMINATOR
    # ... and so on

(as there's no else branch for the penalty anymore)

@JustinDrake (Contributor) commented Mar 3, 2021

For any such cleanups please see #2212 :) I believe this is addressed there.


for index in get_eligible_validator_indices(state):
    # Inactivity-leak-specific penalty
    if state.leak_score[index] >= EPOCHS_PER_ACTIVATION_EXIT_PERIOD * LEAK_SCORE_BIAS:
Contributor

Just checking: is the logic behind this threshold that quadratic leaks on timespans shorter than EPOCHS_PER_ACTIVATION_EXIT_PERIOD are inconsequential anyway (i.e. would evaluate to a 0 penalty)?

@JustinDrake added the "Altair aka HF1" label on Mar 3, 2021
@vbuterin (Contributor, Author) commented Mar 3, 2021

Thanks for the feedback, this is helpful!

> I'm in favour of the per-validator quadratic leak, but skeptical of the amortised epoch transition for these reasons:
>
> 1. Having a regular penalty that wipes out previous gains is a bit strange from the perspective of a validator and I suspect we will have to spend considerable energy explaining why this is OK (or necessary). We already have validators on Discord freaking out about the tiniest penalties for a single epoch, let alone a punishing ~0.0005 ETH penalty every 64 epochs _which can't be avoided_. I can't see a way to remove the sawtooth penalty mechanism without unmaking the amortisation.

Hmm, I see the point and how it could make the validator experience less intuitive. But does there have to be a sawtooth at the UI level? You could just show the balances at the beginning of the period (and we could enshrine this further by having an incrementing counter instead of updating balances), or we could add a non-consensus function `get_predicted_balance` that pre-computes the inactivity penalty and have the APIs point to that.
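One possible shape for that helper, as a hedged sketch only (get_predicted_balance is merely the name suggested above, and compute_pending_inactivity_penalty is a hypothetical stand-in for whatever the amortized penalty calculation ends up being):

def get_predicted_balance(state: BeaconState, index: ValidatorIndex) -> Gwei:
    # Non-consensus view: the current balance minus the inactivity penalty that
    # will be charged at the next 64-epoch boundary, so APIs and UIs never
    # display the sawtooth.
    pending_penalty = compute_pending_inactivity_penalty(state, index)  # hypothetical helper
    return Gwei(max(state.balances[index] - pending_penalty, 0))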

> 2. The immediate benefit of the amortisation would be a compute time benefit for forks < 64 epochs in duration, yet I think resilience against DoS by forking is determined more by how clients store states in memory and on disk. In Lighthouse for example, we are still copying around full `BeaconState` objects in memory (no copy-on-write binary merkle trees yet), and storing full states on disk with repetition (no fancy deduplication or storing of diffs). The way we've made the client good enough for current network conditions is by ruthlessly reducing the frequency with which clones and disk writes of these states occur (particularly for finalized states, which are de-duplicated a little). In this sense I'm concerned the PR might be a premature optimisation. On the other hand, a mostly-immutable validator set could be exploited for memory + storage optimisations in future.

By far the bulk of the state is the validator set, so if this PR ensures that the validator set is completely immutable between 64 epoch boundaries, that would be a large win for dealing with that kind of DoS attack, no? I imagine it would also make fork choice optimizations easier.

> 3. Clients try to avoid spikes in resource consumption, yet the amortisation of epoch processing works contrary to this aim. Pushing extra tree hashing (caused by mutation of the validator set) and state storage to the period boundary could spike load, potentially impacting other duties (block proposal, etc). This also creates a _larger_ DoS target for an attacker who is willing to wait for a period boundary to throw a bunch of forks at the network.

Would there actually be that much extra tree hashing at the period boundary? Validator balance updates would only happen when hysteresis demands it, which would continue to be, on average, something like once every 2-6 months per validator.
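For context, the hysteresis referred to here is the phase0 effective-balance update rule, reproduced roughly from memory (treat the exact form as illustrative; the canonical version lives in the phase0 spec):

def process_effective_balance_updates(state: BeaconState) -> None:
    # Effective balances only move when the actual balance drifts far enough past
    # a threshold, which for a typical validator happens rarely.
    for index, validator in enumerate(state.validators):
        balance = state.balances[index]
        HYSTERESIS_INCREMENT = EFFECTIVE_BALANCE_INCREMENT // HYSTERESIS_QUOTIENT
        DOWNWARD_THRESHOLD = HYSTERESIS_INCREMENT * HYSTERESIS_DOWNWARD_MULTIPLIER
        UPWARD_THRESHOLD = HYSTERESIS_INCREMENT * HYSTERESIS_UPWARD_MULTIPLIER
        if (
            balance + DOWNWARD_THRESHOLD < validator.effective_balance
            or validator.effective_balance + UPWARD_THRESHOLD < balance
        ):
            validator.effective_balance = min(
                balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE
            )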

@djrtwo (Contributor) commented Mar 4, 2021

  1. I agree that this could damage the psychology of staking for some, but this could easily be fixed in the UI/UX layer, at least for block explorers. I think if we get ahead of it, this won't be too much of an issue.
  2. Interesting. I don't want to get into a place where we optimize the spec without fully understanding how it will affect practical engineering optimizations. I can see how this primarily targets compute time without first considering whether compute time is the real bottleneck. Before we commit to this, I'd want to better understand (1) whether compute time is a serious bottleneck and (2) whether an immutable active validator set over a longer period opens up valuable optimizations.
  3. Same question as Vitalik. The idea is that the period boundary (every 64 epochs) would be similar in cost to a normal epoch boundary today.

@michaelsproul (Contributor) commented Mar 6, 2021

Thank you both for the thoughtful defence. Regarding (1), I'm happy with the `get_predicted_balance` solution!

Regarding (2) and (3):

> The idea is that the period boundary (every 64 epochs) would be similar in cost to a normal epoch boundary today.

Having thought about it more, I think you're probably right. For the sake of argument, let's assume that the period boundary is approximately the same cost as an epoch boundary today. Even under this assumption, I think the optimisation is of dubious usefulness for security. In the case where an attacker is trying to DoS the network by producing forks, even with this change they can DoS just as effectively by simply waiting for a period boundary before they produce their fork blocks. By making blocks with parents equal to `boundary_slot - 1`, `boundary_slot - 2`, etc., they force clients to compute multiple distinct period boundary transitions. Loading historic states like this also incurs substantial I/O costs, amortisation or not (e.g. Lighthouse loads a full state and replays up to 32 blocks on a cache miss). In the case where the period boundary is more expensive, it's worse than the situation today, which was my original point. But I'm now willing to weaken that to say that if the costs are equal, we're in the same situation as we are today.

That's looking at things from a security PoV. In terms of optimising the chain during normal operation it might be more viable. I suspect that writing the validator set to disk involves a large constant factor, something like `cost = C + k * nodes_modified`, where C could represent the latency of doing any disk write. If we obviate the need to write every epoch, we might save `63 * C`. That's somewhat theoretical though, and I'm not sure whether a state storage scheme designed to exploit this would be optimal in other ways, i.e. more research is required. I know there's already a lot more we could do to optimise disk access patterns (something like Geth's snapshots) without a spec update, and it would be my preference not to introduce spec updates until we push it further and hit a limit. I would be interested to know if other clients feel similarly (cc @arnetheduck @ajsutton @terencechain).

@arnetheduck (Contributor) commented Mar 9, 2021

As in Lighthouse, the dominant bottleneck in Nimbus is by far disk I/O. To give an idea, processing an epoch transition takes ~15 ms, while interacting with a state on disk can take up to a second when combined with other epoch-related database processing. So, like Lighthouse, we do everything we can to avoid that; in particular, we will only write states that result from blocks that are viable for fork choice.

The second, perhaps more dubious, thing we do is limit the horizon of empty-slot processing for attestation validation. For example, if someone votes for a target epoch rooted in a very old block, we will not process the attestation, as this would require hitting the disk and doing many epochs of empty-slot processing. In a healthy network this condition is never triggered, because finalization ensures that the target root must be recent; in the case of non-finality, we deem such votes unlikely to bring significant value and simply discard them, as loading the state in this scenario is too costly at current disk I/O rates.
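A rough sketch of that horizon check (not Nimbus code; the names and the cutoff are made up, this is just the shape of the heuristic being described):

EMPTY_SLOT_HORIZON_EPOCHS = 4  # hypothetical cutoff

def worth_processing(attestation_target_epoch: Epoch, target_block_epoch: Epoch) -> bool:
    # If the attestation's target root points at a block many epochs in the past,
    # validating it would mean loading an old state from disk and running that many
    # empty epoch transitions, so the attestation is simply discarded.
    return attestation_target_epoch - target_block_epoch <= EMPTY_SLOT_HORIZON_EPOCHS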

Finally, if someone produces a block that requires a lot of empty-slot processing, the bottleneck will still be disk I/O, but it seems that the rate at which one can produce such forks is pretty low, and a sustained attack is difficult.

To mitigate fork scenarios (and to prepare for 200k validators), we're in the process of splitting out the truly immutable part of the validator set (keys, withdrawal credentials) from the mutable parts and storing these separately. For each snapshot we consider useful, this allows us to avoid storing 12 MB of the 16 MB that a state takes up (mainnet, 100k validators). It turns out that the mutable part also compresses well with snappy, while the immutable part doesn't, further adding to the benefits. There are some benchmark numbers in the linked PR that might be interesting to look at.
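A minimal sketch of that split, in terms of the fields of the spec's Validator container (the class names here are made up; see the linked PR for the actual Nimbus storage layout):

class ImmutableValidatorData(Container):
    # Fixed at deposit time; dominated by the 48-byte pubkey, which does not compress.
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32

class MutableValidatorData(Container):
    # Everything else in Validator; small, and compresses well with snappy.
    effective_balance: Gwei
    slashed: boolean
    activation_eligibility_epoch: Epoch
    activation_epoch: Epoch
    exit_epoch: Epoch
    withdrawable_epoch: Epoch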

Regarding (3):

> Same question as Vitalik. The idea is that the period boundary (every 64 epochs) would be similar in cost to a normal epoch boundary today.

Excluding BLS, we currently spend ~25% of CPU time updating the merkle tree cache and ~20% shuffling. If accumulating 64 epochs of work ends up dirtying a larger part of the tree, this could make the tree cache update a tiny bit more expensive, but probably not significantly so. In the same vein, we don't have an optimal tree cache right now: we don't cache hashes of individual fields but rather treat the entire validator as one cache entry. This is close enough that we didn't want to complicate the code further at this time. Maybe we could offset any 64-epoch buildup by addressing these kinds of issues, but it doesn't seem likely that we will any time soon.

> By far the bulk of the state is the validator set

By far the bulk of the validator set is the keys and withdrawal credentials; the rest compresses away.

All in all, this change looks both low-risk and low-benefit, and we'd be happy to leave it for another fork. The risks it mitigates seem remote, while at the same time there are much more expensive things going on in clients, especially around managing the truly immutable part of the validator set. "Completely immutable for 64 epochs" isn't as attractive an optimization target as "completely immutable, period", because of the complexity involved in managing those mutations, coupled with the limited benefits (because of compression, etc.). We'll be in a pretty good spot already when this is the lowest fruit on the tree :)

@djrtwo (Contributor) commented Mar 11, 2021

This optimization will not go into Altair. More R&D is desired before making such spec optimizations.

See the call notes here: ethereum/eth2.0-pm#208

@djrtwo closed this on Mar 11, 2021
@jtraglia deleted the vbuterin-patch-10 branch on January 22, 2025 19:59