Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core/tracing: state journal wrapper #30441

Open
wants to merge 48 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
48 commits
Select commit Hold shift + click to select a range
8659e68
core/tracing: add vm context to system call hook
s1na Aug 26, 2024
b4e0174
core/tracing: add GetCodeHash to statedb interface
s1na Aug 26, 2024
f670a7f
core/tracing: emit state change events for journal reverts
s1na Aug 26, 2024
cf873c3
core/tracing: add hook for reverted out blocks
s1na Aug 26, 2024
365b715
log selfdestructs balance revert
s1na Aug 27, 2024
aac4024
Add state read hooks
s1na Sep 2, 2024
dbe5f83
add tracing journal
s1na Sep 15, 2024
b87c4fe
update changelog
s1na Sep 16, 2024
702a42f
fix indent
s1na Sep 16, 2024
c915bed
add block hash read hook
s1na Oct 3, 2024
838fc25
resolve merge conflict
s1na Oct 4, 2024
1cc58cf
fix code and nonce param order
s1na Oct 4, 2024
3c58155
update test
s1na Oct 4, 2024
501f302
pass-through non-journaled hooks
s1na Oct 5, 2024
1a64297
missed two hooks
s1na Oct 5, 2024
1862333
fix journal cur rev Id
s1na Oct 8, 2024
6650000
add note on balanceChangeRevert reason
s1na Oct 9, 2024
d9de74e
refactor WrapWithJournal to use reflection
s1na Oct 9, 2024
d2ba76f
add license to journal_test
s1na Oct 10, 2024
a2ca5f8
add desc for revert change reason
s1na Oct 10, 2024
85a85d0
add OnSystemCallStartV2
s1na Oct 14, 2024
2754b41
drop OnReorg
s1na Oct 17, 2024
92337d8
rm newline
s1na Oct 17, 2024
36b4194
Merge branch 'master' into tracing/v1.1
s1na Oct 17, 2024
efed5a6
fix OnTxEnd
s1na Oct 24, 2024
ea92ef4
Merge branch 'master' into tracing/v1.1
s1na Oct 24, 2024
fbd1d19
fix pre-post block process fns
s1na Oct 24, 2024
0f005af
mv read hooks to statedb_hooked
s1na Oct 24, 2024
5e4d6b8
add whitespace
s1na Oct 24, 2024
4d2fb0e
update changelog
s1na Oct 24, 2024
b37f2ac
Merge branch 'master' into tracing/v1.1
s1na Oct 25, 2024
a0f7cd6
Add test for all underlying hooks being called
s1na Oct 25, 2024
87582a4
Merge branch 'master' into tracing/v1.1
s1na Nov 11, 2024
6e4d14c
resolve merge conflict
s1na Nov 26, 2024
553f023
handle creation nonce in journal
s1na Nov 26, 2024
be93d72
Merge branch 'master' into tracing/v1.1
s1na Dec 2, 2024
1dda30d
Merge branch 'master' into tracing/v1.1
s1na Dec 3, 2024
4acea3b
rm OnCodeSizeRead
s1na Dec 3, 2024
018df6b
rm onreorg type
s1na Dec 3, 2024
60b2222
wrapper func for OnSystemCallStart
s1na Dec 3, 2024
6c56ea5
update changelog
s1na Dec 3, 2024
f4cf2a5
Merge branch 'master' into tracing/v1.1
s1na Dec 4, 2024
7fb2688
run go generate
s1na Dec 4, 2024
3228063
rm read hooks
s1na Dec 9, 2024
95b82cf
lint issue
s1na Dec 9, 2024
de48d55
fix changelog
s1na Dec 10, 2024
9cae376
un-expose hooks copy
s1na Dec 10, 2024
bf51dde
Merge branch 'master' into tracing/v1.1
s1na Dec 16, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
41 changes: 41 additions & 0 deletions core/tracing/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,47 @@ All notable changes to the tracing interface will be documented in this file.

## [Unreleased]

The tracing interface has been extended with backwards-compatible changes to support more use-cases and simplify tracer code. The most notable change is a state journaling library which emits reverse events when a call is reverted.

### Deprecated methods

- `OnSystemCallStart()`: This hook is deprecated in favor of `OnSystemCallStartV2(vm *VMContext)`.

### New methods

- `OnBlockHashRead(blockNum uint64, hash common.Hash)`: This hook is called when a block hash is read by EVM.
- `OnSystemCallStartV2(vm *VMContext)`. This allows access to EVM context during system calls. It is a successor to `OnSystemCallStart`.

### Modified types

- `VMContext.StateDB` has been extended with `GetCodeHash(addr common.Address) common.Hash` method used to retrieve the code hash an account.
- `BalanceChangeReason` has been extended with the `BalanceChangeRevert` reason. More on that below.
s1na marked this conversation as resolved.
Show resolved Hide resolved

### State journaling

Tracers receive state changes events from the node. The tracer was so far expected to keep track of modified accounts and slots and revert those changes when a call frame failed. Now a utility tracer wrapper is provided which will emit "reverse change" events when a call frame fails. To use this feature the hooks have to be wrapped prior to registering the tracer. The following example demonstrates how to use the state journaling library:

```go
func init() {
tracers.LiveDirectory.Register("test", func (cfg json.RawMessage) (*tracing.Hooks, error) {
hooks, err := newTestTracer(cfg)
if err != nil {
return nil, err
}
return tracing.WrapWithJournal(hooks)
})
}
```

The state changes that are covered by the journaling library are:

- `OnBalanceChange`. Note that `OnBalanceChange` will carry the `BalanceChangeRevert` reason.
- `OnNonceChange`
- `OnCodeChange`
- `OnStorageChange`

## [v1.14.9](https://github.com/ethereum/go-ethereum/releases/tag/v1.14.9)

### Modified types

- `GasChangeReason` has been extended with the following reasons which will be enabled only post-Verkle. There shouldn't be any gas changes with those reasons prior to the fork.
Expand Down
5 changes: 3 additions & 2 deletions core/tracing/gen_balance_change_reason_stringer.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

27 changes: 27 additions & 0 deletions core/tracing/hooks.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ package tracing

import (
"math/big"
"reflect"

"github.com/ethereum/go-ethereum/common"
"github.com/ethereum/go-ethereum/core/types"
Expand Down Expand Up @@ -172,6 +173,9 @@ type (

// LogHook is called when a log is emitted.
LogHook = func(log *types.Log)

// BlockHashReadHook is called when EVM reads the blockhash of a block.
BlockHashReadHook = func(blockNumber uint64, hash common.Hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use-case is to have access to the headers of hashes that are accessed by the EVM. Alternative would be if we added a GetHeaderByHash method somewhere. But getting the hash from OnOpcode is also tricky since the hash will be put on the stack after OnOpcode is invoked.

)

type Hooks struct {
Expand Down Expand Up @@ -199,6 +203,25 @@ type Hooks struct {
OnCodeChange CodeChangeHook
OnStorageChange StorageChangeHook
OnLog LogHook
// Block hash read
OnBlockHashRead BlockHashReadHook
}

// copy creates a new Hooks instance with all implemented hooks copied from the original.
// Note: it is not a deep copy. If a hook has been implemented as a closure and acts on
// a mutable state, the copied hook will still act on the same state.
func (h *Hooks) copy() *Hooks {
copied := &Hooks{}
srcValue := reflect.ValueOf(h).Elem()
dstValue := reflect.ValueOf(copied).Elem()

for i := 0; i < srcValue.NumField(); i++ {
field := srcValue.Field(i)
if !field.IsNil() {
dstValue.Field(i).Set(field)
}
}
return copied
}

// BalanceChangeReason is used to indicate the reason for a balance change, useful
Expand Down Expand Up @@ -250,6 +273,10 @@ const (
// account within the same tx (captured at end of tx).
// Note it doesn't account for a self-destruct which appoints itself as recipient.
BalanceDecreaseSelfdestructBurn BalanceChangeReason = 14

// BalanceChangeRevert is emitted when the balance is reverted back to a previous value due to call failure.
s1na marked this conversation as resolved.
Show resolved Hide resolved
s1na marked this conversation as resolved.
Show resolved Hide resolved
// It is only emitted when the tracer has opted in to use the journaling wrapper.
BalanceChangeRevert BalanceChangeReason = 15
)

// GasChangeReason is used to indicate the reason for a gas change, useful
Expand Down
259 changes: 259 additions & 0 deletions core/tracing/journal.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,259 @@
// Copyright 2024 The go-ethereum Authors
// This file is part of the go-ethereum library.
//
// The go-ethereum library is free software: you can redistribute it and/or modify
// it under the terms of the GNU Lesser General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// The go-ethereum library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with the go-ethereum library. If not, see <http://www.gnu.org/licenses/>.

package tracing

import (
"fmt"
"math/big"
"sort"

"github.com/ethereum/go-ethereum/common"
"github.com/ethereum/go-ethereum/core/types"
)

const (
CREATE = 0xf0
CREATE2 = 0xf5
)

type revision struct {
id int
journalIndex int
}

// journal is a state change journal to be wrapped around a tracer.
// It will emit the state change hooks with reverse values when a call reverts.
type journal struct {
entries []entry
hooks *Hooks
lastCreator *common.Address // Account that initiated the last contract creation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a hack. I don't see how this can be accurately updated going forward and backward along the entries. I mean, an inner scope will overwrite the outer lastCreator, and when the inner scope is reverted, the lastCreator will not be set back correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or if we're inside a creation, and inside the constructor we call ripemd to calculate a signature: we lost lastCreator.


validRevisions []revision
nextRevisionId int
revIds []int
Comment on lines +45 to +47
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why you need to track this. Why not just maintain a list of entries, and you hand out the id which is the current length of the entries?

}
Comment on lines +40 to +48
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the linearJournal in my PR #30660 is IMO a better base to start from. It does away with validrevisions and revIds, instead it just as a list of indexes, revisions, which point to an entry.

type linearJournal struct {
	entries []journalEntry         // Current changes tracked by the linearJournal
	dirties map[common.Address]int // Dirty accounts and the number of changes

	revisions []int // sequence of indexes to points in time designating snapshots
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked at #30660 and agree it is a better way to do journaling for tracers. The key point for me there is that there will be only 1 revert hook emitted as opposed to one for each change to a state element.

Given that #30660 seems to be still in flux I like to wait on it to be merged and implement it for tracers in a future PR as an improvement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These points were discussed at standup:

  • It looks like we will need to change the model a bit with the set-based journal as it operates on accounts, requiring also a new hook like OnAccountReverted and the inconsistencies there around emitting state changes on the field level and the reverts being on the account level.
  • The point was raised by @holiman that we are exposing a behaviour to tracers that will be hard to revert. The behaviour in question is the reverse of every state change (i.e. if Balance: A->B->C, we emit Balance: C->B->A on revert instead of just Balance: C->A).

On the second point I'd like to add my perspective: This journal is simply a wrapper around the tracers. We are not changing tracing interface semantics at all. Users can copy this file and run it themselves right now. And we are exposing every state modification, and I believe the reverse of it is the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we will need to change the model a bit with the set-based journal

I don't get it. The two PRs, the two journals are unrelated. You have opted to copy-paste the legacy linear journal-implementation. I think the new linear journal-implementation is better/simpler.

You could also have chosen to copy-paste the set-based journal-implementation. I don't care which you choose, really, but I don't see any point in picking one now and switching later. If you want another, pick that one from the get-go ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users can copy this file and run it themselves right now.

They can't perform WrapWithJournal from "user-space," can they? Isn't that what makes this a big "blessed" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They can't perform WrapWithJournal from "user-space," can they? Isn't that what makes this a big "blessed" ?

Right exact copy wouldn't work. They'd have to implement WrapWithJournal locally, but it's totally possible.

I don't care which you choose, really, but I don't see any point in picking one now and switching later.

I think the current journal is good, it is consistent with the existing interface, and it has been running in geth for quite a while.


type entry interface {
revert(tracer *Hooks)
}

// WrapWithJournal wraps the given tracer with a journaling layer.
func WrapWithJournal(hooks *Hooks) (*Hooks, error) {
if hooks == nil {
return nil, fmt.Errorf("wrapping nil tracer")
}
// No state change to journal, return the wrapped hooks as is
if hooks.OnBalanceChange == nil && hooks.OnNonceChange == nil && hooks.OnCodeChange == nil && hooks.OnStorageChange == nil {
return hooks, nil
}

// Create a new Hooks instance and copy all hooks
wrapped := hooks.copy()
// Create journal
j := &journal{entries: make([]entry, 0), hooks: hooks}
// Scope hooks need to be re-implemented.
wrapped.OnTxEnd = j.OnTxEnd
wrapped.OnEnter = j.OnEnter
wrapped.OnExit = j.OnExit
// Wrap state change hooks.
if hooks.OnBalanceChange != nil {
wrapped.OnBalanceChange = j.OnBalanceChange
}
if hooks.OnNonceChange != nil {
wrapped.OnNonceChange = j.OnNonceChange
}
if hooks.OnCodeChange != nil {
wrapped.OnCodeChange = j.OnCodeChange
}
if hooks.OnStorageChange != nil {
wrapped.OnStorageChange = j.OnStorageChange
}

return wrapped, nil
}

// reset clears the journal, after this operation the journal can be used anew.
// It is semantically similar to calling 'NewJournal', but the underlying slices
// can be reused.
func (j *journal) reset() {
j.entries = j.entries[:0]
j.validRevisions = j.validRevisions[:0]
j.nextRevisionId = 0
}

// snapshot returns an identifier for the current revision of the state.
func (j *journal) snapshot() int {
id := j.nextRevisionId
j.nextRevisionId++
j.validRevisions = append(j.validRevisions, revision{id, j.length()})
return id
}

// revertToSnapshot reverts all state changes made since the given revision.
func (j *journal) revertToSnapshot(revid int, hooks *Hooks) {
// Find the snapshot in the stack of valid snapshots.
idx := sort.Search(len(j.validRevisions), func(i int) bool {
return j.validRevisions[i].id >= revid
})
if idx == len(j.validRevisions) || j.validRevisions[idx].id != revid {
panic(fmt.Errorf("revision id %v cannot be reverted", revid))
}
snapshot := j.validRevisions[idx].journalIndex

// Replay the journal to undo changes and remove invalidated snapshots
j.revert(hooks, snapshot)
j.validRevisions = j.validRevisions[:idx]
}

// revert undoes a batch of journaled modifications.
func (j *journal) revert(hooks *Hooks, snapshot int) {
for i := len(j.entries) - 1; i >= snapshot; i-- {
// Undo the changes made by the operation
j.entries[i].revert(hooks)
}
j.entries = j.entries[:snapshot]
}

// length returns the current number of entries in the journal.
func (j *journal) length() int {
return len(j.entries)
}

func (j *journal) OnTxEnd(receipt *types.Receipt, err error) {
j.reset()
if j.hooks.OnTxEnd != nil {
j.hooks.OnTxEnd(receipt, err)
}
}

func (j *journal) OnEnter(depth int, typ byte, from common.Address, to common.Address, input []byte, gas uint64, value *big.Int) {
j.revIds = append(j.revIds, j.snapshot())
if typ == CREATE || typ == CREATE2 {
j.lastCreator = &from
}
if j.hooks.OnEnter != nil {
j.hooks.OnEnter(depth, typ, from, to, input, gas, value)
}
}

func (j *journal) OnExit(depth int, output []byte, gasUsed uint64, err error, reverted bool) {
if j.lastCreator != nil {
j.lastCreator = nil
}
revId := j.revIds[len(j.revIds)-1]
j.revIds = j.revIds[:len(j.revIds)-1]
if reverted {
j.revertToSnapshot(revId, j.hooks)
}
if j.hooks.OnExit != nil {
j.hooks.OnExit(depth, output, gasUsed, err, reverted)
}
}

func (j *journal) OnBalanceChange(addr common.Address, prev, new *big.Int, reason BalanceChangeReason) {
j.entries = append(j.entries, balanceChange{addr: addr, prev: prev, new: new})
if j.hooks.OnBalanceChange != nil {
j.hooks.OnBalanceChange(addr, prev, new, reason)
}
}

func (j *journal) OnNonceChange(addr common.Address, prev, new uint64) {
// When a contract is created, the nonce of the creator is incremented.
// This change is not reverted when the creation fails.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, what? Doesn't that depend on ... things.. ? Or is it always the case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For evm-creates (as opposed to create-tx where the flow is different), the new scope begins here: https://github.com/ethereum/go-ethereum/blob/master/core/vm/evm.go#L425 , and the nonce-bump happens a few lines later: https://github.com/ethereum/go-ethereum/blob/master/core/vm/evm.go#L442

However, the actual snapshot is taken even further below: https://github.com/ethereum/go-ethereum/blob/master/core/vm/evm.go#L479 .

So that's the mismatch you see. What the tracing percieves is

SCOPE_START{
    NONCE ++
    OTHER_STUFF
}

But the reality is

NONCE++
SCOPE_START{
    OTHER_STUFF
}

So when the scope reverts, you see this weird inconsistency that somehow nonce is a special snowflake.

The proper fix would be to move the captureBegin-invocation down, so it happens close to where we take the snapshot. The nonce-bump belongs to the parent scope.

if j.lastCreator != nil && *j.lastCreator == addr {
// Skip only the first nonce change.
j.lastCreator = nil
} else {
j.entries = append(j.entries, nonceChange{addr: addr, prev: prev, new: new})
}
if j.hooks.OnNonceChange != nil {
j.hooks.OnNonceChange(addr, prev, new)
}
}

func (j *journal) OnCodeChange(addr common.Address, prevCodeHash common.Hash, prevCode []byte, codeHash common.Hash, code []byte) {
j.entries = append(j.entries, codeChange{
addr: addr,
prevCodeHash: prevCodeHash,
prevCode: prevCode,
newCodeHash: codeHash,
newCode: code,
})
if j.hooks.OnCodeChange != nil {
j.hooks.OnCodeChange(addr, prevCodeHash, prevCode, codeHash, code)
}
}

func (j *journal) OnStorageChange(addr common.Address, slot common.Hash, prev, new common.Hash) {
j.entries = append(j.entries, storageChange{addr: addr, slot: slot, prev: prev, new: new})
if j.hooks.OnStorageChange != nil {
j.hooks.OnStorageChange(addr, slot, prev, new)
}
}

type (
balanceChange struct {
addr common.Address
prev *big.Int
new *big.Int
}

nonceChange struct {
addr common.Address
prev uint64
new uint64
}

codeChange struct {
addr common.Address
prevCodeHash common.Hash
prevCode []byte
newCodeHash common.Hash
newCode []byte
}

storageChange struct {
addr common.Address
slot common.Hash
prev common.Hash
new common.Hash
}
)

func (b balanceChange) revert(hooks *Hooks) {
if hooks.OnBalanceChange != nil {
hooks.OnBalanceChange(b.addr, b.new, b.prev, BalanceChangeRevert)
}
}

func (n nonceChange) revert(hooks *Hooks) {
if hooks.OnNonceChange != nil {
hooks.OnNonceChange(n.addr, n.new, n.prev)
}
}

func (c codeChange) revert(hooks *Hooks) {
if hooks.OnCodeChange != nil {
hooks.OnCodeChange(c.addr, c.newCodeHash, c.newCode, c.prevCodeHash, c.prevCode)
}
}

func (s storageChange) revert(hooks *Hooks) {
if hooks.OnStorageChange != nil {
hooks.OnStorageChange(s.addr, s.slot, s.new, s.prev)
}
}
Loading