-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flag to save ABCIResults #8028
Comments
👍🏼 for removing these data. Unless we have a pretty good reason, I don't think we should bother having a flag to switch the behaviour. These data aren't exposed by any existing API of the node. |
@marbar3778 Do you know if anyone needs it at all? If they opted out of indexing, I agree with @creachadair that we should just rip it out. It would make #7983 a bit more straightforward since I could rip out all of the little extra shim logic. |
In regards to the sdk, we aim to provide indexing of events from the application and send empty events to tendermint, in this case its not needed. Oasis mentioned wanting to keep it but they also followed up with they could build something for it. I don't have a preference from the sdk perspective. From the node operators point of view I'm up for removing it to have less bloat on disk |
Yes Oasis have indicated that they still plan to use it. I'd be fine with first making it optional, marking it as deprecated and then in a later release removing it (i.e. a more gradual phase out approach) |
If the concern is indexing, then it's worth noting that indexing does not interact with the state or block stores at all—the events are indexed entirely separately, and do not access the stored data. We could delete the stored data today and not affect indexing at all. I don't know if that's the sense in which Oasis is using these data, but if so it might not matter whether we remove them from the databases. |
the app side indexing would be to enable typed events. This is something tendermint can't really provide unless we do something fancy and possibly complicated. This is a niche use case and I wouldn't want to tie everyone using Tendermint to use protobuf events. |
Right, that makes sense—my point is that even the existing Tendermint indexing doesn't use the stored data this issue proposes to get rid of. The ABCI events are plumbed directly to the indexer, and are not read back from block/state store. |
From what I understand, these shouldn't be saved in the State db. They are only needed for the latest block to facilitate replay. If indexing is on, they would be stored separately in the indexing database. So it should be pretty straight forward to just delete the last saved ABCIResults every time we go to save the next one when committing a block so we only ever have one saved at a time. Does that sound right ? |
Right. Not sure anyone uses Merkle proofs of tx results but in principle that could be useful. I guess we should ensure it's possible to get that out of the indexer? |
We actually might not even need to delete stuff, just overwrite. We can just change the |
Yeah we've done a similar thing with the On a more broader note, we've actually been planning to revisit the crash semantics by moving the For v0.34 and v0.35 branches, the simplest solution would be to prune everything but the most recent. |
This issue is actually wrong. it groups events into a general thing that is indexed by the kv indexer but actually the kv indexer only indexes tx events, therefore we are discarding block events entirely. This is how the current implementation works, but I don't think this was intentional, or if it was, node operator should have the option to save block events and discard tx events due to indexing them in the kv_indexer. |
The way the issue was framed was that it would be valuable to totally discard the ABCI results data (a solution to a problem). This naturally has a number of consequences. I would highly recommend that we find a way to clearly express the problem, from a user's perspective, before continuing to implement solutions along these lines (discarding ABCI results is an internal implementation detail from an integrator's perspective). The core of the problem seems to me to be that operators' storage costs are high and we need to find a way to bring them down. But there are different types of operators. We can't simply stop storing certain kinds of data because some people rely on that data. Therefore, operators need more control over what Tendermint stores - along with a clear understanding of the tradeoffs of using those controls. The problem statement should, however, include information on which Tendermint functionality, from a user's/integrator's perspective, is critical to retain. For example, for each kind of integrator (relayers, block explorers, etc.), which RPC endpoints are critical? This would allow the team to look into ways of saving storage space for each major category of integrator. |
I meant I framed it wrong. not anyone else. This issue introduced a bug in the way I opened it. It doesn't account for not indexing data that was discarded. As a user of tendermint I would like to reindex data through either the db layer or rpc layer. A tool I'm writing, in the current form when downloading a snapshot it is unknown if the snapshot has either tx_index.db or complete state.db. If It has the state.db I can reindex everything, if it only has tx_index.db then I can't reindex block events. The same statement can be said about RPC. If the node operator prunes ABCIResponses, but indexes txs via the kv indexer then there is no way to query for block events. This leaves me with needing to write a rpc crawler in order to find nodes that have ABCIResponses, making the tool more complicated than needed. |
Not a problem. Fortunately it's an opt-in feature which we can deprecate over time. One thing we should do is better document the impact of turning the flag on. What I would like to do is open a new issue that starts to outline the various ways in which we can give different types of users more control over their storage usage. |
Summary
Right now on nodes that are not indexing data, the node saves the events anyways through ABCIResults. This is unnecessary since the node opted out of indexing this data anyways.
This inflates the state.db to be 2x+ the size of the block store.db.
Problem Definition
Unnecessary computation and storage use by default.
Proposal
Provide a config in config.toml that allows the node operator to disable saving of events in ABCIResults. The flag would be on by default to keep the same behaviour as now but not operators can turn it off if they want.
I can do the work if this is accepted
For Admin Use
The text was updated successfully, but these errors were encountered: