Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core/tracing: state journal wrapper #30441

Open
wants to merge 48 commits into
base: master
Choose a base branch
from
Open

Conversation

s1na
Copy link
Contributor

@s1na s1na commented Sep 16, 2024

Implements #30356

NonceReadHook = func(addr common.Address, nonce uint64)

// CodeReadHook is called when EVM reads the code of an account.
CodeReadHook = func(addr common.Address, code []byte)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open question: should we add codeHash here to be consistent with OnCodeChange?

@s1na
Copy link
Contributor Author

s1na commented Oct 8, 2024

Ah seems like the journal has a crasher:

revisions: [{0 2} {1 4} {2 4} {3 4} {4 6} {5 6} {6 9} {7 11} {8 12} {9 18} {10 18} {11 20} {12 22} {13 24} {14 24} {18 27}]
panic: revision id 17 cannot be reverted

goroutine 10470 [running]:
github.com/ethereum/go-ethereum/core/tracing.(*journal).revertToSnapshot(0xc050c41c70, 0x11, 0xc0570360e0)
        github.com/ethereum/go-ethereum/core/tracing/journal.go:170 +0x185
github.com/ethereum/go-ethereum/core/tracing.(*journal).OnExit(0xc050c41c70, 0x0, {0xc13a87fe30, 0x64, 0x64}, 0x48dc9, {0x203f680, 0xc018bcc978}, 0x1)
        github.com/ethereum/go-ethereum/core/tracing/journal.go:206 +0x6f
github.com/ethereum/go-ethereum/core/vm.(*EVM).captureEnd(0xc13a9e0780?, 0x0, 0x12e208, 0xe543f, {0xc13a87fe30, 0x64, 0x64}, {0x203da40, 0x2e05070})


### New methods

- `OnReorg(reverted []*types.Block)`: This hook is called when a reorg is detected. The `reverted` slice contains the blocks that are no longer part of the canonical chain.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here types block is very very heavy. You should at most pass headers and allow chain access to pull the blocks on demand (chain access in someconstructor, ha)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought what is the issue? it is a slice so passed by reference and the memory can be freed as soon as OnReorg processing is done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, this is annoying. So reorg in the blockchain at some point in the past used to collect blocks. Turned out that sometimes it became insanely heavy and we've switched so it operates on headers. I guess later someone refactored it back to operate on blocks again. This is an issue when you do setHead or any similar operation; of even if finality fails for a while and you have blocks reorging back and forth. It's very very bad to pull all the block in from disk IMO.

CC @holiman @rjl493456442 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I don't particularly recall switching from headers to blocks....

@@ -194,6 +221,30 @@ type Hooks struct {
OnCodeChange CodeChangeHook
OnStorageChange StorageChangeHook
OnLog LogHook
// State reads
OnBalanceRead BalanceReadHook
OnNonceRead NonceReadHook
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question from triage: how exactly is OnNonceRead used?

@s1na
Copy link
Contributor Author

s1na commented Oct 24, 2024

I have pulled in the latest master changes and moved the state read hooks over to the hooked statedb.

One thing to know about the state read hooks: They will not give you the full prestate by themselves. You will need also the previous values emitted as part of state change hooks. This is because e.g. statedb.AddBalance does not do statedb.GetBalance internally.

@s1na
Copy link
Contributor Author

s1na commented Nov 26, 2024

Copying from the chat with @nebojsa94:

Also, regarding jorunaling logic on your branch tracing V1.1, there’s an edge case with failed contract creation where the nonce is reverted, but it shouldn’t be. This happens because CaptureEnter is triggered before the nonce is incremented for contract creation prior to state snapshotting.

@fjl fjl added this to the 1.14.13 milestone Nov 28, 2024
@s1na s1na changed the title core/tracing: v1.1 core/tracing: state journal wrapper Dec 10, 2024
}

// Copy creates a new Hooks instance with all implemented hooks copied from the original.
func (h *Hooks) Copy() *Hooks {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this method needed? It's not obvious to me what the side-effects are. Typically, the hooks might be closures, and the closures are still referenced as if they were not copied.

func TestHooks(t *testing.T) {
	counter := 0
	a := &Hooks{
		OnClose: func() {
			counter++
		},
	}
	a.OnClose()
	t.Logf("counter is %d", counter)
	a.Copy().OnClose()
	t.Logf("counter is %d", counter)
}

outputs:

    hooks_test.go:13: counter is 1
    hooks_test.go:15: counter is 2

I'm curious why you'd ever need to use this Copy method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it because you want to copy all, but not have to specify all manually?

Copy link
Contributor Author

@s1na s1na Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically, the hooks might be closures, and the closures are still referenced as if they were not copied.

Right you are correct I had not foreseen that. After thinking a bit, it feels like for my use-case that is fine. Essentially the clone will replace some of the methods to add some pre-processing logic. The rest are supposed to execute as in the original tracer.

I have un-exported the Copy method to avoid people to shoot themselves in the foot and added a comment to clarify this point.

Edit: right what I want is to add the journal in front of the tracer and process some of the hooks first before proxying back to tracer. And this without having to iterate the list of all hooks which I find very error-prone. I have already had to fix bugs because of missing some hook in there.

@@ -172,6 +173,9 @@ type (

// LogHook is called when a log is emitted.
LogHook = func(log *types.Log)

// BlockHashReadHook is called when EVM reads the blockhash of a block.
BlockHashReadHook = func(blockNumber uint64, hash common.Hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use-case is to have access to the headers of hashes that are accessed by the EVM. Alternative would be if we added a GetHeaderByHash method somewhere. But getting the hash from OnOpcode is also tricky since the hash will be put on the stack after OnOpcode is invoked.

Comment on lines +40 to +48
type journal struct {
entries []entry
hooks *Hooks
lastCreator *common.Address // Account that initiated the last contract creation

validRevisions []revision
nextRevisionId int
revIds []int
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the linearJournal in my PR #30660 is IMO a better base to start from. It does away with validrevisions and revIds, instead it just as a list of indexes, revisions, which point to an entry.

type linearJournal struct {
	entries []journalEntry         // Current changes tracked by the linearJournal
	dirties map[common.Address]int // Dirty accounts and the number of changes

	revisions []int // sequence of indexes to points in time designating snapshots
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked at #30660 and agree it is a better way to do journaling for tracers. The key point for me there is that there will be only 1 revert hook emitted as opposed to one for each change to a state element.

Given that #30660 seems to be still in flux I like to wait on it to be merged and implement it for tracers in a future PR as an improvement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These points were discussed at standup:

  • It looks like we will need to change the model a bit with the set-based journal as it operates on accounts, requiring also a new hook like OnAccountReverted and the inconsistencies there around emitting state changes on the field level and the reverts being on the account level.
  • The point was raised by @holiman that we are exposing a behaviour to tracers that will be hard to revert. The behaviour in question is the reverse of every state change (i.e. if Balance: A->B->C, we emit Balance: C->B->A on revert instead of just Balance: C->A).

On the second point I'd like to add my perspective: This journal is simply a wrapper around the tracers. We are not changing tracing interface semantics at all. Users can copy this file and run it themselves right now. And we are exposing every state modification, and I believe the reverse of it is the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we will need to change the model a bit with the set-based journal

I don't get it. The two PRs, the two journals are unrelated. You have opted to copy-paste the legacy linear journal-implementation. I think the new linear journal-implementation is better/simpler.

You could also have chosen to copy-paste the set-based journal-implementation. I don't care which you choose, really, but I don't see any point in picking one now and switching later. If you want another, pick that one from the get-go ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users can copy this file and run it themselves right now.

They can't perform WrapWithJournal from "user-space," can they? Isn't that what makes this a big "blessed" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They can't perform WrapWithJournal from "user-space," can they? Isn't that what makes this a big "blessed" ?

Right exact copy wouldn't work. They'd have to implement WrapWithJournal locally, but it's totally possible.

I don't care which you choose, really, but I don't see any point in picking one now and switching later.

I think the current journal is good, it is consistent with the existing interface, and it has been running in geth for quite a while.

Comment on lines +45 to +47
validRevisions []revision
nextRevisionId int
revIds []int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why you need to track this. Why not just maintain a list of entries, and you hand out the id which is the current length of the entries?


func (j *journal) OnNonceChange(addr common.Address, prev, new uint64) {
// When a contract is created, the nonce of the creator is incremented.
// This change is not reverted when the creation fails.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, what? Doesn't that depend on ... things.. ? Or is it always the case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For evm-creates (as opposed to create-tx where the flow is different), the new scope begins here: https://github.com/ethereum/go-ethereum/blob/master/core/vm/evm.go#L425 , and the nonce-bump happens a few lines later: https://github.com/ethereum/go-ethereum/blob/master/core/vm/evm.go#L442

However, the actual snapshot is taken even further below: https://github.com/ethereum/go-ethereum/blob/master/core/vm/evm.go#L479 .

So that's the mismatch you see. What the tracing percieves is

SCOPE_START{
    NONCE ++
    OTHER_STUFF
}

But the reality is

NONCE++
SCOPE_START{
    OTHER_STUFF
}

So when the scope reverts, you see this weird inconsistency that somehow nonce is a special snowflake.

The proper fix would be to move the captureBegin-invocation down, so it happens close to where we take the snapshot. The nonce-bump belongs to the parent scope.

type journal struct {
entries []entry
hooks *Hooks
lastCreator *common.Address // Account that initiated the last contract creation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a hack. I don't see how this can be accurately updated going forward and backward along the entries. I mean, an inner scope will overwrite the outer lastCreator, and when the inner scope is reverted, the lastCreator will not be set back correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or if we're inside a creation, and inside the constructor we call ripemd to calculate a signature: we lost lastCreator.

@s1na s1na added the status:marinating PR hasn't been open long enough to get merged label Dec 19, 2024
@fjl fjl removed the status:marinating PR hasn't been open long enough to get merged label Dec 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants