inconsistent JS tracer fails #23559
Comments
I was able to reproduce this. Funnily enough my error contains the same address:
Edit: on a second run:
Hm, what's odd about this, IMO, is that if we had an underlying trie error, I'd expect the value-lookup to return
The address
Is it possible the account balance wasn't fetched/updated correctly in the context of tracing?
Yes, I had the same idea, but it's not as straightforward as that. On the upside, I am now able to somewhat reliably reproduce it, so I'm throwing various debug code at it to narrow it down further. What makes it a bit hard is that "reliably repro" still means it takes ~4 minutes of tracing for it to hit, and it doesn't happen in the same place each time -- even rerunning the tracing without restarting the node doesn't make it hit in the same place.
FWIW, I also found that in the OP's report the same account had produced a block just 10 blocks before.
I'm fairly certain that the root cause has been identified, and it's fixed by #23632. Please give it a test and report back if it seems to work.
It sure looks like the fix is working. I let the script run overnight and have had no failures so far, whereas before it would almost certainly fail after a couple of hours. Thanks a lot for your work everyone, it is very much appreciated 🙏
Great, thanks for reporting back! So instead of tracing 0-13M blocks, it's better to trace e.g. 0-1M, 1M-2M, etc. We might fix something so it automatically clears out the in-memory data and starts from a new disk trie at certain intervals, but that's how it is right now. |
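For anyone wanting to follow that suggestion, here is a minimal sketch of what chunked tracing could look like from the geth JavaScript console. It is not the workflow used in this thread: the chunk size, the `tracerCode` variable, and the per-block `debug.traceBlockByNumber` calls are illustrative assumptions, shown only to make the "split the range" idea concrete.

```js
// Illustrative sketch only: trace the chain in fixed-size chunks rather than
// one huge 0..head range, so each chunk can start from state still on disk.
// Assumes a `geth attach geth.ipc` console session and that `tracerCode`
// holds the JS tracer source as a string.
var CHUNK = 1000000;          // e.g. 0-1M, 1M-2M, ...
var head = eth.blockNumber;

for (var start = 1; start <= head; start += CHUNK) {   // block 0 (genesis) has no txs to trace
  var end = Math.min(start + CHUNK - 1, head);
  for (var n = start; n <= end; n++) {
    var res = debug.traceBlockByNumber(n, { tracer: tracerCode });
    // ...persist `res` outside the node here...
  }
  console.log("finished chunk", start, "-", end);
}
```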
Thanks for the heads-up. That would actually benefit us even more, because once we're done with 0-13M we'd need to trace 13-14M at some point. However, we also want to do this without having to run a full archive node. Is that possible with your suggestion? Currently, when I try to continue tracing from a point where I had previously stopped, I get an error like this:
You can give this a spin - it will work badly on archive nodes, but should work OK on a normal node. Whenever it finds a disk root, it throws away the in-memory trie and starts fresh. It should handle arbitrarily long ranges of blocks.
...and the link: holiman@fcd22b0
Just wanted to add to what @holiman said: you can use #23646 to find out for which blocks you have the state on disk. It's possible to start the chain tracer from those blocks. But if you did a fast/snap sync, then there will be no state on disk up until the first synced state. That means it is currently not possible to start the chain tracer from blocks [1, firstSyncedBlock).
Thanks for the addition, that's very valuable information for us 🙏
Yes
Either memory constraints or ~1h of EVM processing, which usually means every ~10-15K blocks, or a node restart.
That sounds exactly like what I need, thanks a lot everyone 😌🤩
System information
Geth version: 1.10.6-stable, 1.10.9-unstable
OS & Version: Linux
Commit hash: 576681f29b895dd39e559b7ba17fcd89b42e4833, 90987db7334c1d10eb866ca550efedb66dea8a20
Expected behaviour
I am trying to trace the chain from block 0 up to the current head using a custom JS tracer. We want to extract data that would otherwise be difficult to obtain.
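For context, a geth JS tracer is an object implementing `step`, `fault`, and `result`. The sketch below is not the tracer used here (its contents are not shown in the issue); it is just a minimal example of that shape, with a made-up opcode-counting payload standing in for the real data extraction.

```js
// Minimal custom JS tracer sketch (illustrative only -- not the tracer
// referenced in this issue). geth calls step() for every executed opcode,
// fault() when an opcode errors, and result() once at the end.
{
  counts: {},

  step: function (log, db) {
    // count how often each opcode is executed (placeholder "data to extract")
    var op = log.op.toString();
    this.counts[op] = (this.counts[op] || 0) + 1;
  },

  fault: function (log, db) {},

  result: function (ctx, db) {
    return this.counts;
  }
}
```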
Actual behaviour
However, we are experiencing seemingly random aborts of the tracing. At first I thought it was due to my JS code, but after further experimenting I found that even using the noop_tracer.js can produce these random aborts. Here are some of the error logs from when the abort happens:

What's even more frustrating is that these failures happen inconsistently, and I can't reproduce them at the same transaction between different runs. For what it's worth, I could confirm this behaviour on two different machines running two different versions of geth. The logs also make me think the failures have different causes, which would make this even more annoying.

The furthest I was able to trace was up to block ~2M.
Steps to reproduce the behaviour (it eventually happens...)
Geth is started with:
IPC socket tracer command (using nc -U geth.ipc):
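The exact command isn't shown above, but for readers unfamiliar with the setup, a request along these lines (block number and tracer contents are placeholders, not the reporter's actual values) could be serialized to JSON and written to the socket opened with nc -U geth.ipc:

```js
// Hypothetical example request, NOT the command used in this issue. It asks
// geth to trace one block with a JS tracer supplied as a source string.
var request = {
  jsonrpc: "2.0",
  id: 1,
  method: "debug_traceBlockByNumber",
  params: [
    "0x1e8480",                                  // block 2,000,000, hex-encoded
    { tracer: "<contents of noop_tracer.js>" }   // tracer source as a string
  ]
};
```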