core/state: move state log mechanism to a separate layer #30569
After having considered it some more, I am even more convinced that the approach of #30441, adding read-hooks inside the It does not discriminate between event sources.
The "solution" to these sorts of problems would be to, in certain situations, disable the With the layered solution, there are no such complexities, as long as we can switch between the logging-statedb and the non-logging-statedb.
Switching between one and the other can probably be done in many ways, I'm open to suggestions. One way would be to have two interfaces:

    type LoggingEnabled interface {
        WithLoggingDisabled() vm.StateDB
    }
    type LoggingDisabled interface {
        WithLoggingEnabled() vm.StateDB
    }

Example of how that would look, going from a dual-layered logging statedb to a single-layered raw statedb:

    if evm.Config.Tracer != nil && evm.Config.Tracer.OnTxStart != nil {
        ctx := evm.GetVMContext()
        newctx := *ctx // shallow copy; &(*ctx) would alias ctx instead of copying it
        if sdb, ok := ctx.StateDB.(vm.LoggingEnabled); ok {
            newctx.StateDB = sdb.WithLoggingDisabled()
        }
        evm.Config.Tracer.OnTxStart(&newctx, tx, msg.From)
        if evm.Config.Tracer.OnTxEnd != nil {
            defer func() {
                evm.Config.Tracer.OnTxEnd(receipt, err)
            }()
        }
    }
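To make the switching idea concrete, here is a minimal, self-contained sketch. It is not the code from this PR: `StateDB` is a hypothetical two-method stand-in for `vm.StateDB`, addresses are plain strings, and `rawState`, `loggingState`, `newLoggingState` and `unlogged` are names invented for illustration. It only shows how a caller could hand a non-logging view to a tracer so that the tracer's own reads do not generate events.

```go
package main

import "fmt"

// StateDB is a hypothetical, heavily reduced stand-in for vm.StateDB.
type StateDB interface {
	GetNonce(addr string) uint64
	SetNonce(addr string, nonce uint64)
}

// LoggingEnabled is implemented by a state view that currently emits events.
type LoggingEnabled interface{ WithLoggingDisabled() StateDB }

// LoggingDisabled is implemented by a state view that emits nothing.
type LoggingDisabled interface{ WithLoggingEnabled() StateDB }

// rawState is the inner layer: it stores data and never logs.
// It keeps the event sink only so it can hand out a logging view again.
type rawState struct {
	nonces map[string]uint64
	events *[]string
}

func (r *rawState) GetNonce(addr string) uint64        { return r.nonces[addr] }
func (r *rawState) SetNonce(addr string, nonce uint64) { r.nonces[addr] = nonce }
func (r *rawState) WithLoggingEnabled() StateDB        { return &loggingState{inner: r} }

// loggingState is the outer layer: it records an event, then delegates.
type loggingState struct{ inner *rawState }

func (l *loggingState) GetNonce(addr string) uint64 {
	*l.inner.events = append(*l.inner.events, "read nonce "+addr)
	return l.inner.GetNonce(addr)
}

func (l *loggingState) SetNonce(addr string, nonce uint64) {
	*l.inner.events = append(*l.inner.events, fmt.Sprintf("set nonce %s=%d", addr, nonce))
	l.inner.SetNonce(addr, nonce)
}

func (l *loggingState) WithLoggingDisabled() StateDB { return l.inner }

// newLoggingState builds a dual-layered state and returns its event sink.
func newLoggingState() (StateDB, *[]string) {
	events := new([]string)
	raw := &rawState{nonces: map[string]uint64{}, events: events}
	return &loggingState{inner: raw}, events
}

// unlogged returns a non-logging view of db when one is available,
// and db itself otherwise.
func unlogged(db StateDB) StateDB {
	if l, ok := db.(LoggingEnabled); ok {
		return l.WithLoggingDisabled()
	}
	return db
}

func main() {
	db, events := newLoggingState()
	db.SetNonce("alice", 1)            // goes through the logging layer
	_ = unlogged(db).GetNonce("alice") // e.g. a tracer querying state: not logged
	_ = db.GetNonce("alice")           // an EVM-side read: logged
	fmt.Println(*events)
}
```

Both views share the same underlying data, so the only thing the switch changes is whether events are emitted.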
Minus the ugliness regarding swapping between shimmed and non-shimmed state, this PR is mostly done. Ideas for how to make the switching nicer are appreciated.
My understanding is that the main use case for the read hooks is collecting the prestate for the transaction/call, so the ordering of `OnReadBalance` events, and how often we emit them, doesn't matter to the tracers. If it did matter, you are right: we would need to add a reason specifying what each read is about.
This was an interesting realization. I think the friction point here is statedb emitting logs for the same methods that are exposed to the tracers via a statedb instance. Honestly I don't like so much that we are exposing statedb to the tracers. We had to do it exactly to fetch prestate values. So IMO if we add read hooks we can drop the statedb. But I'd ask for opinion from users before committing to that. Generally I am ok with your approach if it allows us to keep the read hooks :)
> Honestly I don't like so much that we are exposing statedb to the tracers. We had to do it exactly to fetch prestate values. So IMO if we add read hooks we can drop the statedb.

Well, the prestate tracer is perhaps the first driving use case, but querying state from a tracer is *very* useful and powerful. I would prefer it to remain, definitely! Trying to collect state by catching per-scope read hooks sounds like a nightmare in comparison :)

> Generally I am ok with your approach if it allows us to keep the read hooks :)

With my approach as a basis, I won't object to you read-hooking the entire statedb interface.
I think if it's possible to remove state read access from tracers, we should do it. Full state access will become impossible later with stateless clients, so it will have to be removed at that time anyway.
Well, I would assume users of somewhat advanced tracing to be using stateful clients. Anyway, it is a decision unrelated to this PR. It is related to the other PR, since that one doesn't have the ability to switch between logging/nonlogging statedb. If we want to remove state access, let someone make a PR and we can discuss it then, IMO. @fjl, you are a type artist. Any ideas for making the hacks in this PR neater?
@rjl493456442 implemented an alternative state overriding here: #29950. In this implementation, we switch out the backend from underneath the

    // Reader defines the interface for accessing accounts and storage slots
    // associated with a specific state.
    type Reader interface {
        // Account retrieves the account associated with a particular address.
        //
        // - Returns a nil account if it does not exist
        // - Returns an error only if an unexpected issue occurs
        // - The returned account is safe to modify after the call
        Account(addr common.Address) (*types.StateAccount, error)

        // Storage retrieves the storage slot associated with a particular account
        // address and slot key.
        //
        // - Returns an empty slot if it does not exist
        // - Returns an error only if an unexpected issue occurs
        // - The returned storage slot is safe to modify after the call
        Storage(addr common.Address, slot common.Hash) (common.Hash, error)

        // Copy returns a deep-copied state reader.
        Copy() Reader
    }

But if the core parts of So we could have e.g. I am not really sure what are the pros and cons with either approach. @rjl493456442 any thoughts?
LGTM
…te layer
core/state: wip move state log mechanism in to a separate layer
core/state: fix tests
core/state: fix miscalc in OnBalanceChange
eth/tracers: fix tests
core/state: re-enable verkle witness generation
internal/ethapi: fix simulation + new logging schema

…g burn
core/state, core/vm: refactor statedb hooking
core/state: trace consensus finalize and system calls
eth/tracers/internal/tracetest: fix tests after refactor
core/state: some renaming and cleanup of statedb-hooking system
core/state: remove unecessary methods, implement hooked subbalance, more testing
        return prev
    }

    func (s *hookedStateDB) SetNonce(address common.Address, nonce uint64) {
Should we rename this `SetNonce` to `IncreaseNonce`? The semantics always expect the nonce to be incremented by 1.
Yes, good idea!
About this: I'd prefer to leave that change out of this PR.
@rjl493456442 your commit 8c7526c undoes an intentional change:
@holiman I undid my cleanup commit and pushed various fixes on top.
This seems like a really nice refactor. Left a couple thoughts. Looking forward to trying another statedb wrapper for witness gathering.
    // hookedStateDB represents a statedb which emits calls to tracing-hooks
    // on state operations.
    type hookedStateDB struct {
tracingStateDB?
hookedStateDB means very little to me. I would have kinda expected it to have generic hooks support instead of just tracing hooks.
I initially called it that. But then I thought, it's using the `*tracing.Hooks`. I can go with either.
For witness gathering, this PR will not make any noticeable difference. Witnesses are not collected at the level of the statedb API; they're more at the MPT node level.
This PR moves the logging/tracing-facilities out of `*state.StateDB`, in to a wrapping struct which implements `vm.StateDB` instead.

In most places, it is a pretty straight-forward change:
- First, hoisting the invocations from state objects up to the statedb.
- Then making the mutation-methods simply return the previous value, so that the external logging layer could log everything.

Some internal code uses the direct object-accessors to mutate the state, particularly in testing and in setting up state overrides, which means that these changes are unobservable for the hooked layer. Thus, configuring the overrides are not necessarily part of the API we want to publish.

The trickiest part about the layering is that when the selfdestructs are finally deleted during `Finalise`, there's the possibility that someone sent some ether to it, which is burnt at that point, and thus needs to be logged. The hooked layer reaches into the inner layer to figure out these events.

In package `vm`, the conversion from `state.StateDB + hooks` into a hooked `vm.StateDB` is performed where needed.

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
In this PR, I have moved the logging-facilities out of `*state.StateDB`, into a wrapping struct which implements `vm.StateDB` instead.

In most places, it was pretty straight-forward.

Some internal code uses the direct object-accessors to mutate the state, particularly in testing and in setting up state overrides, which means that these changes are unobservable for the hooked layer. This is fine: how we configure the overrides is not necessarily part of the API we want to publish.

The trickiest part about the layering is that when the selfdestructs are finally deleted during `Finalise`, there's the possibility that someone sent some ether to it, which is burnt at that point, and thus needs to be logged. The hooked layer reaches into the inner layer to figure out these events.

In package `vm`, the conversion from `state.StateDB + hooks` into a hooked `vm.StateDB` is performed where needed.
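The layering described above can be illustrated with a small, self-contained sketch. It is a simplification under stated assumptions, not the PR's code: `Hooks` is a one-callback stand-in for `*tracing.Hooks`, addresses are strings, balances are `uint64`, and `innerState`, `hookedState`, `newHookedState` and the reason strings are hypothetical names. It shows the two ideas from the description: mutation methods on the inner layer return the previous value so the wrapper can log the change, and the wrapper peeks into the inner layer before `Finalise` to report ether burnt on self-destructed accounts.

```go
package main

import "fmt"

// Hooks is a hypothetical, reduced stand-in for *tracing.Hooks.
type Hooks struct {
	OnBalanceChange func(addr string, prev, next uint64, reason string)
}

// innerState plays the role of *state.StateDB: it knows nothing about tracing.
type innerState struct {
	balances       map[string]uint64
	selfDestructed map[string]bool
}

// AddBalance mutates the state and returns the previous balance.
func (s *innerState) AddBalance(addr string, amount uint64) uint64 {
	prev := s.balances[addr]
	s.balances[addr] = prev + amount
	return prev
}

// SelfDestruct zeroes the balance, marks the account, and returns the previous balance.
func (s *innerState) SelfDestruct(addr string) uint64 {
	prev := s.balances[addr]
	s.balances[addr] = 0
	s.selfDestructed[addr] = true
	return prev
}

// Finalise deletes self-destructed accounts, silently dropping any balance left on them.
func (s *innerState) Finalise() {
	for addr := range s.selfDestructed {
		delete(s.balances, addr)
	}
	s.selfDestructed = map[string]bool{}
}

// hookedState wraps innerState and emits hook calls on state operations.
type hookedState struct {
	inner *innerState
	hooks *Hooks
}

func newHookedState(hooks *Hooks) *hookedState {
	inner := &innerState{balances: map[string]uint64{}, selfDestructed: map[string]bool{}}
	return &hookedState{inner: inner, hooks: hooks}
}

// emit calls the balance hook if one is set and the balance actually changed.
func (h *hookedState) emit(addr string, prev, next uint64, reason string) {
	if h.hooks != nil && h.hooks.OnBalanceChange != nil && prev != next {
		h.hooks.OnBalanceChange(addr, prev, next, reason)
	}
}

func (h *hookedState) AddBalance(addr string, amount uint64, reason string) uint64 {
	prev := h.inner.AddBalance(addr, amount)
	h.emit(addr, prev, prev+amount, reason)
	return prev
}

func (h *hookedState) SelfDestruct(addr string) uint64 {
	prev := h.inner.SelfDestruct(addr)
	h.emit(addr, prev, 0, "selfdestruct")
	return prev
}

// Finalise reaches into the inner layer first: ether sent to an account after
// it self-destructed is burnt on deletion, and that burn must be reported.
func (h *hookedState) Finalise() {
	for addr := range h.inner.selfDestructed {
		h.emit(addr, h.inner.balances[addr], 0, "selfdestruct-burn")
	}
	h.inner.Finalise()
}

func main() {
	db := newHookedState(&Hooks{OnBalanceChange: func(addr string, prev, next uint64, reason string) {
		fmt.Printf("%s: %d -> %d (%s)\n", addr, prev, next, reason)
	}})
	db.AddBalance("contract", 5, "transfer")
	db.SelfDestruct("contract")
	db.AddBalance("contract", 3, "transfer") // ether sent after the selfdestruct
	db.inner.AddBalance("override", 1)       // direct inner mutation: unobservable by hooks
	db.Finalise()
}
```

Mutating `db.inner` directly, as the last call before `Finalise` does, bypasses the hooks entirely, which mirrors the point about state overrides and tests being unobservable for the hooked layer.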