core: remove unnecessary fields in log #17106
The interesting question we need to figure out is how to do the upgrade path. The code as is works in theory, but is neither forward nor backward compatible. The way we did seamless upgrades until now was to support both formats (for a few geth releases) and run a background thread that makes the conversion. I think it would be important to at least implement support for the combo-format. The database upgrade is not that important since pruning might require a resync anyway, but it's still important for auto-updating nodes to remain operational. So, what would be essential for this PR is to expand the Downgrade of course is not possible, so this PR needs a major version bump, but seamless upgrade (or at least continuous operation) is essential.
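To make the "support both formats" idea concrete, here is a minimal sketch of a reader that accepts either layout. Everything in it is an assumption for illustration: the `compactLog` and `legacyLog` types, the `decodeStoredLog` helper, and the use of JSON as a stand-in for the real RLP storage encoding. It is not the code from this PR.

```go
package main

import (
	"bytes"
	"encoding/json"
)

// compactLog is a hypothetical new storage format that keeps only the
// fields which cannot be derived from other data.
type compactLog struct {
	Address string
	Data    string
}

// legacyLog is a hypothetical old storage format that also stored
// derivable metadata alongside the log.
type legacyLog struct {
	Address     string
	Data        string
	BlockNumber uint64
	TxHash      string
}

// decodeStoredLog tries the compact format first and falls back to the
// legacy one. The boolean reports whether the entry was still in the old
// format, so a background converter could rewrite it later.
func decodeStoredLog(blob []byte) (compactLog, bool, error) {
	var cur compactLog
	dec := json.NewDecoder(bytes.NewReader(blob))
	dec.DisallowUnknownFields() // reject blobs that carry legacy-only fields
	if err := dec.Decode(&cur); err == nil {
		return cur, false, nil
	}
	var old legacyLog
	if err := json.Unmarshal(blob, &old); err != nil {
		return compactLog{}, false, err
	}
	return compactLog{Address: old.Address, Data: old.Data}, true, nil
}
```

A node using such a reader keeps operating on an old database, and entries flagged as legacy can be re-encoded lazily instead of forcing a resync.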
Is there a corresponding PR for receipts? Because you can remove some fields there too, including bloom filters. Here is what I have done in Turbo-geth: AlexeyAkhunov@017a9e8
@AlexeyAkhunov Thank you for the reference. Will add the optimization to my PR!
This is problematic..
So the db versioning does not work, at all :(
I'd recommend including the db version in the log output, even if skipDbVersionCheck is used; having it in the logs will help us debug potential errors if people switch back and forth between the broken one which overwrites the version number with nil.
I suggest the following modification:
diff --git a/eth/backend.go b/eth/backend.go
index 2a9d56c5c..0b3625c41 100644
--- a/eth/backend.go
+++ b/eth/backend.go
@@ -139,16 +139,20 @@ func New(ctx *node.ServiceContext, config *Config) (*Ethereum, error) {
 		bloomIndexer: NewBloomIndexer(chainDb, params.BloomBitsBlocks, params.BloomConfirms),
 	}
-	log.Info("Initialising Ethereum protocol", "versions", ProtocolVersions, "network", config.NetworkId)
+	bcVersion := rawdb.ReadDatabaseVersion(chainDb)
+	var dbVer = "<nil>"
+	if bcVersion != nil {
+		dbVer = fmt.Sprintf("%d", *bcVersion)
+	}
+	log.Info("Initialising Ethereum protocol", "versions", ProtocolVersions, "network", config.NetworkId, "db version", dbVer)
 	if !config.SkipBcVersionCheck {
-		bcVersion := rawdb.ReadDatabaseVersion(chainDb)
 		if bcVersion != nil && *bcVersion > core.BlockChainVersion {
 			return nil, fmt.Errorf("database version is v%d, Geth %s only supports v%d", *bcVersion, params.VersionWithMeta, core.BlockChainVersion)
-		} else if bcVersion != nil && *bcVersion < core.BlockChainVersion {
-			log.Warn("Upgrade blockchain database version", "from", *bcVersion, "to", core.BlockChainVersion)
+		} else if bcVersion == nil || *bcVersion < core.BlockChainVersion {
+			log.Warn("Upgrading blockchain database version", "from", dbVer, "to", core.BlockChainVersion)
+			rawdb.WriteDatabaseVersion(chainDb, core.BlockChainVersion)
 		}
-		rawdb.WriteDatabaseVersion(chainDb, core.BlockChainVersion)
 	}
 	var (
 		vmConfig = vm.Config{
Db versioning now seems to work fine; I've tested with nil, lower, same and higher version numbers.
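For reference, the four cases (nil, lower, same, higher) can be captured as a small pure function. This is only a sketch of the decision logic in the suggested diff: `checkDBVersion`, `currentDBVersion` and its value are hypothetical names chosen for illustration, and the real code reads and writes the version through rawdb rather than returning a flag.

```go
package main

import "fmt"

// currentDBVersion is a hypothetical stand-in for core.BlockChainVersion.
const currentDBVersion uint64 = 4

// checkDBVersion reports whether the stored version marker should be
// (re)written, or returns an error if the database is too new.
// A nil stored version means no marker was found in the database.
func checkDBVersion(stored *uint64, skipCheck bool) (bool, error) {
	if skipCheck {
		// With the check skipped, the marker is left untouched.
		return false, nil
	}
	if stored != nil && *stored > currentDBVersion {
		return false, fmt.Errorf("database version is v%d, only v%d is supported", *stored, currentDBVersion)
	}
	if stored == nil || *stored < currentDBVersion {
		// Missing or older marker: upgrade and write the current version.
		return true, nil
	}
	// Same version: nothing to do.
	return false, nil
}
```

Keeping the decision separate from the database access makes each of the four cases easy to exercise in a unit test.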
return nil, common.Hash{}, 0, 0
}
return receipts[receiptIndex], blockHash, blockNumber, receiptIndex
receipts := ReadReceipts(db, blockHash, *blockNumber)
The ReadReceipts method iterates over the data and returns a []receipts, and this method iterates over those again to pick out the one we're interested in. Would it be worthwhile to instead have a readReceipt method that only returns the one we're interested in, or is that just an unnecessary optimisation?
Since a block normally contains only 200 transactions, I think iterating over the receipt slice twice won't hurt performance much.
We could write a similar implementation to ReadReceipts (read blob from db, decode, assemble log), but it would be kind of redundant.
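The second pass discussed here is just a linear scan over an already decoded slice. A minimal sketch, assuming a simplified `receipt` type and a hypothetical `findReceipt` helper (neither is the actual rawdb code):

```go
package main

// receipt is a simplified, hypothetical stand-in for types.Receipt.
type receipt struct {
	TxHash string
	Status uint64
}

// findReceipt scans the receipts of a single block for the one that
// belongs to txHash and returns it together with its index in the block.
// The cost is linear in the number of transactions in the block.
func findReceipt(receipts []*receipt, txHash string) (*receipt, uint64, bool) {
	for i, r := range receipts {
		if r.TxHash == txHash {
			return r, uint64(i), true
		}
	}
	return nil, 0, false
}
```

With a few hundred receipts per block the scan is cheap next to the database read and decode that produced the slice, which is the argument made above against a dedicated single-receipt reader.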
LGTM
LGTM
@matthalp My original idea for keeping For the But these are all my own understanding.
@rjl493456442 Could you point me to an example? I don't believe we store receipts unless we assume we will have its corresponding body (even in
From what I can tell, almost all of the use cases were
…ereum#17106)
* core: remove unnecessary fields in log
* core: bump blockchain database version
* core, les: remove unnecessary fields in txlookup
* eth: print db version explicitly
* core/rawdb: drop txlookup entry struct wrapper
The encoding of Log and LogForStorage is exactly the same now. After tracking it down it seems like #17106 changed the storage schema of logs to be the same as the consensus encoding. Support for the legacy format was dropped in #22852 and if I'm not wrong there's no reason anymore to have these two equivalent types. Since the RLP encoding simply contains the first three fields of Log, we can also avoid creating a temporary struct for encoding/decoding, and use the rlp:"-" tag in Log instead. Note: this is an API change in core/types. We decided it's OK to make this change because LogForStorage is an implementation detail of go-ethereum and the type has zero uses outside of package core/types. Co-authored-by: Felix Lange <fjl@twurst.com>
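To illustrate how a single type can serve both purposes once derived fields are tagged rlp:"-", here is a small sketch. It does not perform RLP encoding; it only uses reflection to list which fields of a struct are not excluded by that tag. The `logSketch` type is a simplified, hypothetical mirror of the Log struct, and `encodedFields` is a helper invented for this example.

```go
package main

import "reflect"

// logSketch is a simplified, hypothetical mirror of a log type: the first
// three fields are the consensus data, the rest is derived metadata that
// is excluded from the encoding via the rlp:"-" tag.
type logSketch struct {
	Address [20]byte
	Topics  [][32]byte
	Data    []byte

	BlockNumber uint64   `rlp:"-"`
	TxHash      [32]byte `rlp:"-"`
	TxIndex     uint     `rlp:"-"`
	BlockHash   [32]byte `rlp:"-"`
	Index       uint     `rlp:"-"`
	Removed     bool     `rlp:"-"`
}

// encodedFields returns the names of the struct fields of v that are not
// tagged rlp:"-", i.e. the fields an encoder honouring the tag would
// include. v must be a struct value.
func encodedFields(v interface{}) []string {
	t := reflect.TypeOf(v)
	var names []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.Tag.Get("rlp") == "-" {
			continue // derived field, skipped by the encoder
		}
		names = append(names, f.Name)
	}
	return names
}
```

Calling `encodedFields(logSketch{})` yields Address, Topics and Data, which matches the "first three fields" described in the commit message, so no temporary storage struct is needed.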
This PR drops some unnecessary fields in the Receipt, TxLookup and TransactionLog structs to save database storage.