Consensus failure (invalid chunk size) when attempting PayForData #433

Closed
4 tasks
lzrscg opened this issue May 22, 2022 · 11 comments · Fixed by #419
Labels: bug (Something isn't working)

Comments

@lzrscg

lzrscg commented May 22, 2022

Summary of Bug

When attempting a PayForData transaction (screenshot: Screen Shot 2022-05-21 at 4 09 44 PM), I get a consensus failure (screenshot: Screen Shot 2022-05-21 at 4 09 28 PM).

Text version

celestia-appd tx payment payForData 1111111111111111 222222222222222222 --from=validator --keyring-backend test --chain-id test
5:28PM INF Timed out dur=4969.972836 height=6090 module=consensus round=0 step=1
5:28PM ERR failure to erasure the data square while creating a proposal block error="invalid chunk size"
5:28PM ERR CONSENSUS FAILURE!!! err="invalid chunk size" module=consensus stack="goroutine 86 [running]:\nruntime/debug.Stack()\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x88\ngithub.com/tendermint/tendermint/internal/consensus.(*State).receiveRoutine.func2(0x4000051880, 0x228bf20)\n\t/home/parallels/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.2.1-tm-v0.35.4/internal/consensus/state.go:789 +0x48\npanic({0x18cf920, 0x40003d8cd0})\n\t/usr/local/go/src/runtime/panic.go:1038 +0x224\ngithub.com/celestiaorg/celestia-app/app.(*App).PrepareProposal(0x4000424a00, {0x4003fad570, 0x14ffc89})\n\t/home/parallels/celestia-app/app/prepare_proposal.go:34 +0x228\ngithub.com/tendermint/tendermint/abci/client.(*localClient).PrepareProposalSync(0x40002abaa0, {0x2625978, 0x4000130028}, {0x4003fad570, 0x14ffc89})\n\t/home/parallels/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.2.1-tm-v0.35.4/abci/client/local_client.go:383 +0xfc\ngithub.com/tendermint/tendermint/internal/proxy.(*appConnConsensus).PrepareProposalSync(0x4000a75050, {0x2625978, 0x4000130028}, {0x4003fad570, 0x14ffc89})\n\t/home/parallels/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.2.1-tm-v0.35.4/internal/proxy/app_conn.go:94 +0x54\ngithub.com/tendermint/tendermint/internal/state.(*BlockExecutor).CreateProposalBlock(0x4000354d00, 0x17ca, {{{0xb, 0x0}, {0x4000a5c438, 0x11}}, {0x40004b4dfc, 0x4}, 0x1, 0x17c9, ...}, ...)\n\t/home/parallels/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.2.1-tm-v0.35.4/internal/state/execution.go:139 +0x308\ngithub.com/tendermint/tendermint/internal/consensus.(*State).createProposalBlock(0x4000051880)\n\t/home/parallels/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.2.1-tm-v0.35.4/internal/consensus/state.go:1290 +0x268\ngithub.com/tendermint/tendermint/internal/consensus.(*State).defaultDecideProposal(0x4000051880, 0x17ca, 0x0)\n\t/home/parallels/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.2.1-tm-v0.35.4/internal/consensus/state.go:1200 
+0x4c\ngithub.com/tendermint/tendermint/internal/consensus.(*State).enterPropose(0x4000051880, 0x17ca, 0x0)\n\t/home/parallels/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.2.1-tm-v0.35.4/internal/consensus/state.go:1177 +0x670\ngithub.com/tendermint/tendermint/internal/consensus.(*State).enterNewRound(0x4000051880, 0x17ca, 0x0)\n\t/home/parallels/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.2.1-tm-v0.35.4/internal/consensus/state.go:1096 +0xa48\ngithub.com/tendermint/tendermint/internal/consensus.(*State).handleTimeout(0x4000051880, {0x1283bc464, 0x17ca, 0x0, 0x1}, {0x17ca, 0x0, 0x1, {0x3110621d, 0xeda1b7aa5, ...}, ...})\n\t/home/parallels/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.2.1-tm-v0.35.4/internal/consensus/state.go:972 +0x4b0\ngithub.com/tendermint/tendermint/internal/consensus.(*State).receiveRoutine(0x4000051880, 0x0)\n\t/home/parallels/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.2.1-tm-v0.35.4/internal/consensus/state.go:854 +0x574\ncreated by github.com/tendermint/tendermint/internal/consensus.(*State).OnStart\n\t/home/parallels/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.2.1-tm-v0.35.4/internal/consensus/state.go:417 +0x134\n"

Version

latest commit (745bd99973a9183f72777eaf63e4eb57ba58ec8a)

Steps to Reproduce

  1. Install and set up single node using contrib/single-node.sh
  2. Run celestia-appd tx payment payForData 1111111111111111 222222222222222222 --from=validator --keyring-backend test --chain-id test

For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@liamsi
Member

liamsi commented May 22, 2022

The original error originates in rsmt2d; either from here or here.

@evan-forbes
Member

evan-forbes commented May 22, 2022

Yes, I believe this was fixed in one of the release candidates, v0.5.0-rc2, which is based on #419

That release is significantly more stable than master atm

I'll dig into this more tmrw, just to make sure

@lzrscg
Author

lzrscg commented May 22, 2022

v0.5.0-rc2 does not have an error, but the transaction is still not being included in the block

celestia-appd tx payment payForData 1111111111111111 222222222222222222 --from=validator --keyring-backend test --chain-id test
{"body":{"messages":[{"@type":"/payment.MsgWirePayForData","signer":"celestia1skj9vlldzhn2zahysg3h3rqhhdjfwj86kk9kg7","message_name_space_id":"ERERERERERE=","message_size":"9","message":"IiIiIiIiIiIi","message_share_commitment":[{"k":"128","share_commitment":"trTCjgdxBY+RvbQBsX7JNGzM81Pqci3t7DpYAYVNRsY=","signature":"Xwv9fiI2POXF8EMJzh9zo+3DioE+L3ouZrUUFudOVrAu4P05NeXjH6+KcH0iWMEMJn5t5L0efOJu3xLHwVjw2A=="},{"k":"128","share_commitment":"trTCjgdxBY+RvbQBsX7JNGzM81Pqci3t7DpYAYVNRsY=","signature":"Xwv9fiI2POXF8EMJzh9zo+3DioE+L3ouZrUUFudOVrAu4P05NeXjH6+KcH0iWMEMJn5t5L0efOJu3xLHwVjw2A=="},{"k":"64","share_commitment":"trTCjgdxBY+RvbQBsX7JNGzM81Pqci3t7DpYAYVNRsY=","signature":"Xwv9fiI2POXF8EMJzh9zo+3DioE+L3ouZrUUFudOVrAu4P05NeXjH6+KcH0iWMEMJn5t5L0efOJu3xLHwVjw2A=="}]}],"memo":"","timeout_height":"0","extension_options":[],"non_critical_extension_options":[]},"auth_info":{"signer_infos":[],"fee":{"amount":[],"gas_limit":"200000","payer":"","granter":""},"tip":null},"signatures":[]}

confirm transaction before signing and broadcasting [y/N]: y
code: 0
codespace: ""
data: ""
events: []
gas_used: "0"
gas_wanted: "0"
height: "0"
info: ""
logs: []
raw_log: ""
timestamp: ""
tx: null
txhash: DA16AC17016D3B3A7E78746049482A1D944654D03F2A685097F2E0FFC8EF5A4A
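For anyone reading the JSON above: the base64 fields look like the two CLI arguments decoded as hex. The sketch below checks that reading; `hexToBase64` is a hypothetical helper for illustration, and the assumption that the CLI treats both arguments as hex is inferred only from this output.

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// hexToBase64 decodes a hex string and returns its standard base64
// encoding plus the decoded byte length. Hypothetical helper, not part
// of celestia-app.
func hexToBase64(h string) (string, int, error) {
	raw, err := hex.DecodeString(h)
	if err != nil {
		return "", 0, err
	}
	return base64.StdEncoding.EncodeToString(raw), len(raw), nil
}

func main() {
	// The two positional arguments from the command in this issue.
	for _, arg := range []string{"1111111111111111", "222222222222222222"} {
		b64, n, err := hexToBase64(arg)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("%s -> %s (%d bytes)\n", arg, b64, n)
	}
}
```

The first argument comes out as `ERERERERERE=` (8 bytes) and the second as `IiIiIiIiIiIi` (9 bytes), which matches `message_name_space_id`, `message`, and `message_size: "9"` in the output.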

Here I am searching for the tx. It's also not showing up in the block when I query the block it would have been included in

curl http://localhost:26657/tx?hash=0xDA16AC17016D3B3A7E78746049482A1D944654D03F2A685097F2E0FFC8EF5A4A
{
  "jsonrpc": "2.0",
  "id": -1,
  "error": {
    "code": -32603,
    "message": "Internal error",
    "data": "tx (DA16AC17016D3B3A7E78746049482A1D944654D03F2A685097F2E0FFC8EF5A4A) not found, err: %!w(\u003cnil\u003e)"
  }
}
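The same check can be scripted. This is a minimal sketch that parses a JSON-RPC response shaped like the one above and reports whether it is a "not found" error; the struct and function names are made up for illustration, and matching on the substring "not found" is an assumption based only on this one response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// rpcError and rpcResponse mirror only the fields visible in the
// response above; they are not types from tendermint.
type rpcError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
	Data    string `json:"data"`
}

type rpcResponse struct {
	JSONRPC string    `json:"jsonrpc"`
	ID      int       `json:"id"`
	Error   *rpcError `json:"error"`
}

// txNotFound reports whether body is an RPC error whose data field
// says the tx was not found.
func txNotFound(body []byte) (bool, error) {
	var resp rpcResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, err
	}
	return resp.Error != nil && strings.Contains(resp.Error.Data, "not found"), nil
}

func main() {
	// Response body copied from the curl output in this comment.
	body := []byte(`{
  "jsonrpc": "2.0",
  "id": -1,
  "error": {
    "code": -32603,
    "message": "Internal error",
    "data": "tx (DA16AC17016D3B3A7E78746049482A1D944654D03F2A685097F2E0FFC8EF5A4A) not found, err: %!w(\u003cnil\u003e)"
  }
}`)
	missing, err := txNotFound(body)
	fmt.Println(missing, err)
}
```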

@evan-forbes
Member

I think you need to specify a square size, in this case it should be 2, so use the flag --square-sizes 2

celestia-appd tx payment payForData 1111111111111111 222222222222222222 --from=validator --keyring-backend test --chain-id test --square-sizes 2

We can add more square sizes by separating them with commas, e.g. --square-sizes 2,4,8,16

If the transaction is not signed over a small enough square size, then the tx will not get included until a block producer creates a block of a size it was signed over. We also don't yet have a way to keep track of which transactions are signed over which square sizes in the mempool, though we might want to look into doing this.

The easiest solution would be to simply sign over all the viable square sizes by default.

I have code in tests in #419 to generate the square sizes for us, so we should do that by default imo.

but we still need to communicate this better to the user

thanks for finding this @lzrscg !

@evan-forbes
Member

evan-forbes commented May 22, 2022

referencing #239 and #434 #435

@lzrscg
Author

lzrscg commented May 22, 2022

ok here is what I am finding

Unfortunately, adding the square size does not work and produces the same output as before

celestia-appd tx payment payForData 1111111111111111 222222222222222222 --from=validator --keyring-backend test --chain-id test --square-sizes 2
{"body":{"messages":[{"@type":"/payment.MsgWirePayForData","signer":"celestia1skj9vlldzhn2zahysg3h3rqhhdjfwj86kk9kg7","message_name_space_id":"ERERERERERE=","message_size":"9","message":"IiIiIiIiIiIi","message_share_commitment":[{"k":"2","share_commitment":"trTCjgdxBY+RvbQBsX7JNGzM81Pqci3t7DpYAYVNRsY=","signature":"Xwv9fiI2POXF8EMJzh9zo+3DioE+L3ouZrUUFudOVrAu4P05NeXjH6+KcH0iWMEMJn5t5L0efOJu3xLHwVjw2A=="}]}],"memo":"","timeout_height":"0","extension_options":[],"non_critical_extension_options":[]},"auth_info":{"signer_infos":[],"fee":{"amount":[],"gas_limit":"200000","payer":"","granter":""},"tip":null},"signatures":[]}

confirm transaction before signing and broadcasting [y/N]: y
code: 19
codespace: sdk
data: ""
events: []
gas_used: "0"
gas_wanted: "0"
height: "0"
info: ""
logs: []
raw_log: ""
timestamp: ""
tx: null
txhash: 1C280E504D46C15D5839046378FD143C91BB266F923B922505BB404493422C61

setting square size to 4 (as well as other variations other than 2) gives an error

celestia-appd tx payment payForData 1111111111111111 222222222222222222 --from=validator --keyring-backend test --chain-id test --square-sizes 4
{"body":{"messages":[{"@type":"/payment.MsgWirePayForData","signer":"celestia1skj9vlldzhn2zahysg3h3rqhhdjfwj86kk9kg7","message_name_space_id":"ERERERERERE=","message_size":"9","message":"IiIiIiIiIiIi","message_share_commitment":[{"k":"4","share_commitment":"trTCjgdxBY+RvbQBsX7JNGzM81Pqci3t7DpYAYVNRsY=","signature":"Xwv9fiI2POXF8EMJzh9zo+3DioE+L3ouZrUUFudOVrAu4P05NeXjH6+KcH0iWMEMJn5t5L0efOJu3xLHwVjw2A=="}]}],"memo":"","timeout_height":"0","extension_options":[],"non_critical_extension_options":[]},"auth_info":{"signer_infos":[],"fee":{"amount":[],"gas_limit":"200000","payer":"","granter":""},"tip":null},"signatures":[]}

confirm transaction before signing and broadcasting [y/N]: y
code: 32
codespace: sdk
data: ""
events: []
gas_used: "0"
gas_wanted: "0"
height: "0"
info: ""
logs: []
raw_log: 'account sequence mismatch, expected 2, got 1: incorrect account sequence'
timestamp: ""
tx: null
txhash: DB252CE56AE38D9A3785FF21CD9EC3095652DF75AE3C1D36589B14372841BE7E
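The two numbers in that raw_log can be pulled out programmatically, e.g. when scripting retries. This is a sketch only: `parseSequenceMismatch` is a hypothetical helper, and the message format it matches is taken solely from the output above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// seqMismatch matches the wording seen in the raw_log above.
var seqMismatch = regexp.MustCompile(`account sequence mismatch, expected (\d+), got (\d+)`)

// parseSequenceMismatch extracts the expected and received sequence
// numbers; ok is false when rawLog does not contain the pattern.
func parseSequenceMismatch(rawLog string) (expected, got uint64, ok bool) {
	m := seqMismatch.FindStringSubmatch(rawLog)
	if m == nil {
		return 0, 0, false
	}
	e, errE := strconv.ParseUint(m[1], 10, 64)
	g, errG := strconv.ParseUint(m[2], 10, 64)
	if errE != nil || errG != nil {
		return 0, 0, false
	}
	return e, g, true
}

func main() {
	rawLog := "account sequence mismatch, expected 2, got 1: incorrect account sequence"
	expected, got, ok := parseSequenceMismatch(rawLog)
	fmt.Println(expected, got, ok) // 2 1 true
}
```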

@evan-forbes
Member

evan-forbes commented May 22, 2022

I was unable to recreate this error, do you mind double checking that the binary was restarted?

if that doesn't work, do you mind starting from scratch by deleting the existing chain and running the script again?

@evan-forbes
Member

evan-forbes commented May 22, 2022

sorry, didn't mean to close (hit ctrl+enter)

both of those indicate that the original failing transaction is still stuck in the mempool

	// ErrTxInMempoolCache defines an ABCI typed error where a tx already exists
	// in the mempool.
	ErrTxInMempoolCache = Register(RootCodespace, 19, "tx already in mempool")

sequence == nonce

	// ErrWrongSequence defines an error where the account sequence defined in
	// the signer info doesn't match the account's actual sequence number.
	ErrWrongSequence = Register(RootCodespace, 32, "incorrect account sequence")
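To make the mapping explicit, here is a small lookup for the three code/codespace pairs that appear in this thread. It is a standalone sketch: the constants restate the numbers from the registrations quoted above, `explainBroadcastCode` is a hypothetical helper rather than an SDK function, and the wording for code 0 reflects only what this thread shows (the broadcast returned 0 but the tx was not found afterwards).

```go
package main

import "fmt"

// Codes seen in this thread; 19 and 32 are the values in the
// registrations quoted above.
const (
	codeOK               uint32 = 0
	codeTxInMempoolCache uint32 = 19
	codeWrongSequence    uint32 = 32
)

// explainBroadcastCode turns a (codespace, code) pair from the CLI
// output into a short description.
func explainBroadcastCode(codespace string, code uint32) string {
	switch {
	case code == codeOK:
		return "broadcast accepted (does not prove inclusion in a block)"
	case codespace == "sdk" && code == codeTxInMempoolCache:
		return "tx already in mempool"
	case codespace == "sdk" && code == codeWrongSequence:
		return "incorrect account sequence"
	default:
		return fmt.Sprintf("unrecognized code %d in codespace %q", code, codespace)
	}
}

func main() {
	// The three responses posted in this issue, in order.
	fmt.Println(explainBroadcastCode("", 0))
	fmt.Println(explainBroadcastCode("sdk", 19))
	fmt.Println(explainBroadcastCode("sdk", 32))
}
```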

@lzrscg
Author

lzrscg commented May 22, 2022

> I was unable to recreate this error, do you mind double checking that the binary was restarted?
>
> if that doesn't work, do you mind starting from scratch by deleting the existing chain and running the script again?

Great idea. Fwiw, I did check that it was running the latest version.

celestia-appd version
0.5.0-rc-2

I will reset the chain

@evan-forbes
Member

evan-forbes commented May 23, 2022

To update this: the above solution did work, but there was another bug in the square size estimation that caused very small messages to not be counted properly, so they would get stuck in the mempool. See 38158e1

double check w/ @lzrscg but I think we can consider this closed with #419

@evan-forbes evan-forbes linked a pull request May 23, 2022 that will close this issue
@evan-forbes evan-forbes added the bug Something isn't working label May 23, 2022
@evan-forbes evan-forbes self-assigned this May 23, 2022
@lzrscg
Author

lzrscg commented May 23, 2022

Yes, this is resolved. Thanks!

@lzrscg lzrscg closed this as completed May 23, 2022