
is account sequence validation logic wrong? #13621

Closed
ice-cronus opened this issue Oct 24, 2022 · 11 comments

Comments

@ice-cronus

ice-cronus commented Oct 24, 2022

Summary of Bug

Hello! I'm trying to submit a lot of transactions concurrently from the same account (signer) with broadcast mode block, and I get a lot of "account sequence mismatch" errors, even though I prepared the transactions properly with an explicit sequence (.WithSequence(atomic counter)), so they should be fine.
I still get those errors because the blockchain is slowed down by the volume of transactions (for example, "account sequence mismatch, expected 27, got 28: incorrect account sequence" means I submitted the tx with sequence number 28, but the blockchain had not yet run the ante handlers for the 27th one).
I found the account sequence validation logic in sigverify.go, and I wonder why it validates with not-equals. To prevent replay attacks, isn't it enough to reject only when sig.Sequence < acc.GetSequence(), i.e. to stop execution of txs with already-used sequences?
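For reference, the check in question, roughly as it appears in the SigVerificationDecorator in x/auth/ante/sigverify.go on v0.46 (paraphrased from memory, not an exact copy):

```go
// The tx sequence must exactly equal the on-chain account sequence, so a
// tx that arrives "ahead" of its not-yet-processed predecessors is rejected.
if sig.Sequence != acc.GetSequence() {
	return ctx, sdkerrors.Wrapf(
		sdkerrors.ErrWrongSequence,
		"account sequence mismatch, expected %d, got %d", acc.GetSequence(), sig.Sequence,
	)
}
```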

Version

0.46.1

Steps to Reproduce

@ice-cronus changed the title from "Account sequence validation logic?" to "is account sequence validation logic wrong?" on Oct 24, 2022
@adizere
Contributor

adizere commented Oct 24, 2022

We get a lot of these kinds of errors in the Hermes relayer. The relayer is quite aggressive in submitting transactions, similar to your use case.

broadcast mode block

do you mean /broadcast_tx_commit? https://docs.tendermint.com/v0.34/rpc/#/

In case this helps, in our case we did the following:

  • switched to /broadcast_tx_sync instead of /broadcast_tx_commit
  • ignore the case when expected < got, because we take this to typically mean that the last submitted transaction(s) have not yet been committed. Specifically, we usually encounter this case during tx simulation, so as a workaround we ignore the error and use a default gas amount instead of simulating the tx (see the sketch after this list).

The above is far from ideal, so we're still iterating on a solution pending some changes from the SDK. See #13009
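In Go terms, the second bullet amounts to something like the sketch below. All helper names here are hypothetical stand-ins, not actual SDK or Hermes APIs:

```go
package gasfallback

import "strings"

const defaultGas uint64 = 200_000 // assumed fallback gas limit

// isSequenceAheadErr is a hypothetical detector for the "expected < got"
// case; a real implementation would check the ABCI error code (32) and
// compare the two sequence numbers rather than match on the message text.
func isSequenceAheadErr(err error) bool {
	return err != nil && strings.Contains(err.Error(), "account sequence mismatch")
}

// estimateGas wraps a (hypothetical) simulation call: when simulation fails
// only because earlier txs have not been committed yet, it falls back to a
// fixed default gas limit instead of failing the whole submission.
func estimateGas(simulateTx func() (uint64, error)) (uint64, error) {
	gas, err := simulateTx()
	if err != nil {
		if isSequenceAheadErr(err) {
			return defaultGas, nil
		}
		return 0, err
	}
	return gas, nil
}
```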

@ice-cronus
Author

Yep, I mean broadcast_tx_commit. But we have to wait for the updated state, so broadcast_tx_sync is not an option for us.
Btw, thank you for the information about ignoring errors in the expected < got case, but I'm unsure about it: the tx fails completely because of that error, so the only option is to retry and submit it with another sequence. Is there no way to make it retry-free?

@tac0turtle
Member

commit is deprecated and won't be supported in the next release. If you are not handling your nonce incrementing manually, this problem is a lot worse; otherwise, network distribution is part of the problem. We have had discussions on how to approach this in the future (it needs research), but for now one approach is handling your own nonce and watching blocks on the websocket for tx inclusion; the second is to use multiple accounts.
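A minimal sketch of the websocket approach, using Tendermint v0.34's RPC client (the endpoint, subscriber name, and timeout are placeholders):

```go
package txwatch

import (
	"context"
	"fmt"
	"time"

	tmhttp "github.com/tendermint/tendermint/rpc/client/http"
)

// watchTxInclusion blocks until the tx with the given hash shows up in a
// committed block, using the node's websocket event subscription.
func watchTxInclusion(ctx context.Context, txHash []byte) error {
	c, err := tmhttp.New("tcp://localhost:26657", "/websocket")
	if err != nil {
		return err
	}
	if err := c.Start(); err != nil { // starts the underlying websocket client
		return err
	}
	defer c.Stop()

	// Subscribe to Tx events matching our hash (tx.hash is upper-case hex).
	query := fmt.Sprintf("tm.event='Tx' AND tx.hash='%X'", txHash)
	events, err := c.Subscribe(ctx, "nonce-watcher", query)
	if err != nil {
		return err
	}
	defer c.Unsubscribe(ctx, "nonce-watcher", query)

	select {
	case <-events:
		return nil // tx was included in a block; safe to advance the nonce
	case <-ctx.Done():
		return ctx.Err()
	case <-time.After(30 * time.Second):
		return fmt.Errorf("timed out waiting for tx %X", txHash)
	}
}
```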

We can look into your suggestion for now.

@ice-cronus
Author

@tac0turtle I played a bit with my suggested approach via a fork, and it reduces the error rate for error code 32 ("account sequence mismatch") significantly. With 1000 goroutines I get about 400-500 errors with the regular Cosmos SDK and its not-equals condition (40-50%), and about 5-7 errors with the patched one (0.5%). But I am new to the Cosmos ecosystem and not aware of any side effects it may have.
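Concretely, the patch in my fork amounts to relaxing that comparison, roughly:

```go
// Patched variant: only reject sequences that have already been consumed,
// so a tx arriving "ahead" of the expected sequence is no longer rejected.
if sig.Sequence < acc.GetSequence() {
	return ctx, sdkerrors.Wrapf(
		sdkerrors.ErrWrongSequence,
		"account sequence mismatch, expected %d, got %d", acc.GetSequence(), sig.Sequence,
	)
}
```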

@alexanderbez
Contributor

alexanderbez commented Oct 24, 2022

As @tac0turtle pointed out, the block commit mode is deprecated and will soon be completely phased out -- it's best to use sync. If you need state guarantees, then just wait for one tx to commit prior to submitting the next.

In any case, more generally speaking: since you're on 0.46.x, that includes Tendermint with a priority mempool, which does NOT support multi-sender txs. So please be sure to have the legacy/old mempool enabled in config.toml -- I believe it's v0. Note that you still might encounter sequence mismatches (as you currently do!), most likely because TM's mempool is FIFO-based; depending on which node you broadcast these txs to and how they're committed, you could still see these errors because txs are processed in order.
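For reference, assuming a Tendermint v0.34.2x node, the mempool implementation is selected in config.toml along these lines (double-check against your node's config reference):

```toml
[mempool]
# "v0" selects the legacy FIFO mempool; "v1" selects the priority mempool.
version = "v0"
```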

Finally, with SDK 0.47 soon to be released, the mempool will live entirely app-side, which also gives you full control! In fact, the default app mempool will support guaranteed multi-sender txs, so this should no longer be a concern for you or any relayers (cc @adizere)! 😄

@ice-nestor

Is there an approximate date for the 0.47 release?

@alexanderbez
Contributor

This quarter. Planned RC in a week or two.

@Unique-Divine

@tac0turtle, what's the rationale behind the deprecation of block commit mode in favor of sync?

@alexanderbez
Contributor

@tac0turtle, what's the rationale behind the deprecation of block commit mode in favor of sync?

Tendermint/Comet deprecated it. Also, it was never meant to be used, as RPC timeouts would often trigger prior to the tx's block commitment.

@tac0turtle
Member

@tac0turtle, what's the rationale behind the deprecation of block commit mode in favor of sync?

block was meant for testing, but it snuck into production. Secondly, it actually causes more problems than we may think, since it's unknown whether the timeout is long enough for the tx to get included in a block. It's simpler to submit the tx and then poll the blocks (and a potential mempool API) to see when it's included.
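A minimal polling sketch along those lines, again with Tendermint v0.34's RPC client (endpoint and poll interval are placeholders):

```go
package txpoll

import (
	"context"
	"fmt"
	"time"

	tmhttp "github.com/tendermint/tendermint/rpc/client/http"
)

// waitForTx polls the node until the tx with the given hash has been
// committed to a block, or the context is cancelled.
func waitForTx(ctx context.Context, c *tmhttp.HTTP, txHash []byte) error {
	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			// Tx returns the result once the tx is in a block; until then it
			// returns an error, which we treat here as "not included yet".
			res, err := c.Tx(ctx, txHash, false)
			if err == nil {
				fmt.Printf("tx committed at height %d\n", res.Height)
				return nil
			}
		}
	}
}
```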

@tac0turtle
Member

With the creation of the accounts module, teams are free to implement custom nonce designs. We also have unordered txs coming out in the next release; this should fix the issue here.
