Add getter to ExecutionEngine to get TTD #2643
Comments
Is TTD not already part of the synchronization API? cc @mkalinin
What if we hardcode TTD on the EL (execution layer) side and have a This approach should reduce the number of potential failure points, particularly around keeping this value in sync between the layers when each of them defines it with an option to override.
You mean the Engine API? It's not yet a part of it, at least not a part of its Interop Edition. And this is a good time to decide how to handle the hardcoding and overriding of this parameter.
+1 on only putting this in the EL and requiring CL to get it. It definitely reduces synchronization failures between the two layers.
The downside, however, is that the CL now can't start up until the EL is already running and accepting requests. Previously it would have been able to start up, begin finding peers and begin optimistic sync. Given the huge number of things that already have to be in sync between the CL and EL, I'm not sure the complexity of having to retrieve this and continuously poll for changes is going to reduce bugs. It will definitely increase complexity significantly, and that usually leads to an increase in bugs.
CL can't validate terminal block conditions without a functioning EL, yet it is still possible for CL to do an optimistic sync even when EL is not functioning at all. In the case of optimistic sync, CL will have to get back to terminal block validation if the transition hasn't been finalized yet, but in that case it also has to rely on a functioning EL to verify that the execution of the block built on top of it is valid. This is an attempt to reduce the surface of misconfiguration issues induced by users rather than bugs induced by developers. Do you think that polling this data from EL is more bug-prone than polling the head and verifying TTD?
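For readers following along, the terminal block condition referred to here can be sketched roughly as below. This is a minimal illustration, assuming a simplified `PowBlock` that carries only a `total_difficulty` field and a TTD passed in as a parameter; the class and function names are hypothetical and not taken from any client.

```python
from dataclasses import dataclass


@dataclass
class PowBlock:
    # Hypothetical, simplified PoW block: only the field this check needs.
    total_difficulty: int


def is_terminal_pow_block(block: PowBlock, parent: PowBlock, ttd: int) -> bool:
    """Return True if `block` is the first block to reach `ttd`.

    The block itself must have reached TTD while its parent is still below it.
    """
    reached = block.total_difficulty >= ttd
    parent_below = parent.total_difficulty < ttd
    return reached and parent_below
```

Both blocks' total difficulty would have to come from the EL, which is why this check cannot be done without a functioning EL.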
So I think this complexity is the crux of my concern, more than how much the beacon node can do before the EL starts up. It may be that I'm not understanding the proposal properly, but my understanding is that instead of the beacon node having the TTD hard coded (with a CLI option to override), it would poll the EL to get it. Having a hard-coded TTD would definitely be less error prone in that case: polling for potentially changing configuration data is pretty complex to get right. If, however, we can get the CL to not care about TTD at all, then yes, that probably is simpler. I'm not finding
Bottom line for me is that having a user specify the same CLI argument in two places is pretty straightforward (we already have that for a number of args between beacon node and validator clients), whereas loading potentially changing configuration data from a web API is significantly more complex and error prone. If dealing with TTD fits reasonably well into existing semantics (or lets us not do some parts of it), then it's probably worth it; if it's just to get the user to only set it in one place, then it probably isn't worth it.
Okay, I've been thinking about this and I want to minimize the complexity of this component while ensuring it likely won't be a source of failures. To that end, I suggest the following, which keeps CL as the transition leader and the only point of overrides without requiring a new engine endpoint.
Mechanism
We can use the following mechanism
Release and override procedure
3675 spec changes
Requisite change to EIP 3675:
Discussion
Overrides to expedite the Merge
The above simplifying mechanism assumes that TTD and TBH will only be overridden to be earlier than the originally set values. Thus, by not eagerly communicating an updated terminal value to EL, the worst that happens is that EL imports PoW blocks until the previously hardcoded TTD (due to the definition of "Terminal PoW Block" in EIP 3675). Expected usage of these overrides:
Overrides to slow the Merge (not supported)
There is one other type of usage that we cannot account for with the above proposed mechanism -- setting TTD to be later than the hard-coded TTD. In the above mechanism, if we don't communicate a TTD override to EL in this case, EL will never reach the new TTD (due to not importing past the hard-coded TTD) [this is @dapplion's first case --
@dapplion's cases
Restricting overrides to only expedite the Merge and doing overrides only from CL reduces the problem of what can go wrong to only dapplion's second case (i.e.
Now in the case of
^ The above analysis also holds for a TBH override (assuming it is only used to expedite the Merge)
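The restriction discussed in this comment, that an override may only move TTD earlier, could be enforced on the CL side with something like the sketch below. It is only an illustration under stated assumptions: `resolve_ttd` and its parameters are hypothetical names, and treating an override equal to the hard-coded value as acceptable is a choice made here, not something stated in the thread.

```python
from typing import Optional


def resolve_ttd(hardcoded_ttd: int, override_ttd: Optional[int] = None) -> int:
    """Pick the effective TTD, rejecting overrides that would slow the Merge.

    Hypothetical helper: an override is accepted only if it does not exceed
    the hard-coded value, since the EL would never import PoW blocks past the
    hard-coded TTD and so could never reach a later one.
    """
    if override_ttd is None:
        return hardcoded_ttd
    if override_ttd > hardcoded_ttd:
        raise ValueError(
            f"TTD override {override_ttd} is later than the hard-coded TTD "
            f"{hardcoded_ttd}; overrides may only expedite the Merge"
        )
    return override_ttd
```

Rejecting the later value at startup surfaces the unsupported case immediately instead of leaving the node waiting on a TTD the EL will never reach.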
What if we have the requisite change to the EIP but still keep override options on the EL side? If a user forgets to configure these settings on the EL side then we get to the case when
Yeah, I almost suggested putting TTD override on EL as well but instead left it as a suggestion to just do a new release (if time permits). Two reasons:
Because of the above, I'd suggest not having the override in EL, and instead always encouraging updated releases of both CL and EL in the event that we want to ship an override.
Sounds like a good approach to me. It keeps the CL side nice and simple, and while there's possibly a little more complexity on the EL side, it would need to do something to handle the corner case of the CL hitting TTD before it thought it should anyway, so hopefully that is simple. And yes, I think any time we are telling people to use an override we'd also wind up doing an emergency release with it in, but the override is still useful for bigger setups that have their own pipeline for verifying updates before they reach MainNet production.
Addressed in ethereum/EIPs#4397
If the execution and consensus layers have different terminal total difficulty (TTD) values, it results in bad failures:
To reduce the chance of these failure cases, the consensus client could check that its TTD matches the execution client's TTD. If the TTDs don't match, panic or very visibly alert the user. To allow updating nodes in any order, absence of this method should not result in an error. Also, a different TTD could only result in a panic if TTD has been manually overridden.
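A minimal sketch of the proposed check might look like the following. Everything here is an assumption made for illustration: `TTDConfig`, `check_ttd`, and the `fetch_el_ttd` callback are hypothetical names, not an existing Engine API method. The sketch also reads the last sentence above as "panic only when the CL value was manually overridden, otherwise alert".

```python
from dataclasses import dataclass
from typing import Callable, Optional


class TTDMismatchError(Exception):
    """Raised when CL and EL disagree on TTD and the CL value was overridden."""


@dataclass
class TTDConfig:
    value: int        # TTD the consensus client is using
    overridden: bool  # True if the user set it manually (e.g. via CLI)


def check_ttd(cl: TTDConfig, fetch_el_ttd: Callable[[], Optional[int]]) -> str:
    """Compare the CL's TTD with the EL's and report the outcome.

    `fetch_el_ttd` is a hypothetical callback that returns the EL's TTD, or
    None when the EL does not expose such a getter (older release).
    """
    el_ttd = fetch_el_ttd()
    if el_ttd is None:
        # Missing getter must not be an error, so nodes can update in any order.
        return "skipped"
    if el_ttd == cl.value:
        return "ok"
    if cl.overridden:
        # Manual override plus mismatch: fail loudly.
        raise TTDMismatchError(f"CL TTD {cl.value} != EL TTD {el_ttd}")
    # Mismatch without a manual override: alert the user but keep running.
    return "warn"
```

A caller would run this at startup and log the `"warn"` result prominently rather than swallowing it.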
CC @mkalinin