inconsistent JS tracer fails #23559
Comments
I was able to reproduce this. Funnily enough my error contains the same address:
Edit: on a second run:
Hm, what's odd about this, IMO, is that if we had an underlying trie error, I'd expect the value-lookup to return
The address
Is it possible the account balance wasn't fetched/updated correctly in the context of tracing?
Yes, I had the same idea, but it's not as straightforward as that. On the upside, I am now able to somewhat reliably reproduce it, so I'm throwing various debug code at it to narrow it down further. What makes it a bit hard is that "reliably repro" still means it takes ~4 minutes of tracing for it to hit, and it doesn't happen in the same place each time -- even rerunning the tracing without restarting the node doesn't make it hit in the same place.
FWIW, I also found that in the OP's report the same account had produced a block just 10 blocks before.
I'm fairly certain that the root cause has been identified, and it's fixed by #23632. Please give it a test and report back if it seems to work.
It sure looks like the fix is working. I let the script run overnight and have had no failures so far, whereas before it would almost certainly fail after a couple of hours. Thanks a lot for your work everyone, it is very much appreciated 🙏
Great, thanks for reporting back! So instead of tracing 0-13M blocks, it's better to trace e.g. 0-1M, 1M-2M, etc. We might fix something so it automatically clears out the in-memory data and starts from a new disk trie at certain intervals, but that's how it is right now. |
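For anyone wanting to follow that suggestion, here is a minimal sketch of what chunked tracing could look like from the geth JavaScript console. It is not the workflow used in this thread: the chunk size, the `tracerCode` variable, and the per-block `debug.traceBlockByNumber` calls are illustrative assumptions, shown only to make the "split the range" idea concrete.

```js
// Illustrative sketch only: trace the chain in fixed-size chunks rather than
// one huge 0..head range, so each chunk can start from state still on disk.
// Assumes a `geth attach geth.ipc` console session and that `tracerCode`
// holds the JS tracer source as a string.
var CHUNK = 1000000;          // e.g. 0-1M, 1M-2M, ...
var head = eth.blockNumber;

for (var start = 1; start <= head; start += CHUNK) {   // block 0 (genesis) has no txs to trace
  var end = Math.min(start + CHUNK - 1, head);
  for (var n = start; n <= end; n++) {
    var res = debug.traceBlockByNumber(n, { tracer: tracerCode });
    // ...persist `res` outside the node here...
  }
  console.log("finished chunk", start, "-", end);
}
```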
Thanks for the heads-up. That would actually benefit us even more, because once we're done with 0-13M we'd need to trace 13-14M at some point. However, we also want to do this without having to run a full archive node. Is that possible with your suggestion? Currently, when I try to continue tracing from a point where I had previously stopped, I get an error like this:
You can give this a spin - it will work badly on archive nodes, but should work OK on a normal node. Whenever it finds a disk root, it throws away the in-memory trie and starts fresh. It should handle arbitrarily long ranges of blocks.
...and the link: holiman@fcd22b0
Just wanted to add to what @holiman said: you can use #23646 to find out for which blocks you have the state on disk. It's possible to start the chain tracer from those blocks. But if you did a fast/snap sync, then there will be no state on disk up until the first synced state. That means it is currently not possible to start the chain tracer from blocks [1, firstSyncedBlock).
Thanks for the addition, that's very valuable information for us 🙏
Yes
Either memory constraints or ~1h of EVM processing, which usually means every ~10-15K blocks, or a node restart.
That sounds exactly like what I need, thanks a lot everyone 😌🤩
System information
Geth version: 1.10.6-stable, 1.10.9-unstable
OS & Version: Linux
Commit hash: 576681f29b895dd39e559b7ba17fcd89b42e4833, 90987db7334c1d10eb866ca550efedb66dea8a20
Expected behaviour
I am trying to trace the chain from block 0 up to the current head using a custom JS tracer. We want to extract data that would otherwise be difficult to obtain.
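For context, a geth JS tracer is an object implementing `step`, `fault`, and `result`. The sketch below is not the tracer used here (its contents are not shown in the issue); it is just a minimal example of that shape, with a made-up opcode-counting payload standing in for the real data extraction.

```js
// Minimal custom JS tracer sketch (illustrative only -- not the tracer
// referenced in this issue). geth calls step() for every executed opcode,
// fault() when an opcode errors, and result() once at the end.
{
  counts: {},

  step: function (log, db) {
    // count how often each opcode is executed (placeholder "data to extract")
    var op = log.op.toString();
    this.counts[op] = (this.counts[op] || 0) + 1;
  },

  fault: function (log, db) {},

  result: function (ctx, db) {
    return this.counts;
  }
}
```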
Actual behaviour
However, we are experiencing seemingly random aborts of the tracing. At first I thought it was due to my JS code, but after further experimenting I found that even using the noop_tracer.js can produce these random aborts. Here are some of the error logs from when the abort happens:

What's even more frustrating is that these failures happen inconsistently, and I can't reproduce them at the same transaction between different runs. For what it's worth, I could confirm this behaviour on two different machines running two different versions of geth. The logs also make me think the failures have different causes, which would make this even more annoying.

The furthest I was able to trace was up to block ~2M.
Steps to reproduce the behaviour (it eventually happens...)
Geth is started with:
IPC socket tracer command (using nc -U geth.ipc):
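The exact command isn't shown above, but for readers unfamiliar with the setup, a request along these lines (block number and tracer contents are placeholders, not the reporter's actual values) could be serialized to JSON and written to the socket opened with nc -U geth.ipc:

```js
// Hypothetical example request, NOT the command used in this issue. It asks
// geth to trace one block with a JS tracer supplied as a source string.
var request = {
  jsonrpc: "2.0",
  id: 1,
  method: "debug_traceBlockByNumber",
  params: [
    "0x1e8480",                                  // block 2,000,000, hex-encoded
    { tracer: "<contents of noop_tracer.js>" }   // tracer source as a string
  ]
};
```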