-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Maintain local copy of the nonce for each used address #538
Conversation
@dshulyak thanks! Two questions:
|
@divan GetTransactionCount doesn't include transaction queue when it counts current nonce for an account. So, if we will request GetTransactionCount when the queue has at least one tx from the same account - we will get multiple tx's with the same nonce. Yes, there is ethereum/go-ethereum#15794 and turns out there is a similar method in parity client https://github.com/paritytech/parity/wiki/JSONRPC-parity-module#parity_nextnonce So, maybe it is not critical and can wait for a next release of geth |
That's not how it works at all. Multiple concurrent transactions from one address gets different (incremental) nonces, if you submit via eth/personal If you are you talking about the case where you use Edit: Sorry, I slipped in here from ethereum/go-ethereum#15794, and didn't even notice it was a PR on a separate repo :) |
@holiman thanks for a comment :) , this is exactly about the case when we are using getTransactionCount concurrently. we do this because we are signing tx locally and then sending it with eth_sendrawtransaction. i think that for such use case the best solution would be to add jsonrpc call similar to - does it make sense to propose similar extension for geth jsonrpc?
|
I don't really see how it could work. Because in the gap between composeTransaction and the actual submission of the transaction (which is non-neglible time, since it needs to be signed), another call to composeTransaction will use the same nonce. And if you want to fix that, you'll have to add a second (third?) nonce-counter which is used to maintain a nonce-counter which somehow reserves and increments nonces used only for compose. It will get messy fast. If you have this usecase, I'd think the best approach would be to maintain nonces purely on the client-side, using |
@holiman i forgot about another part of this problem. rpc call getTransactionCount doesn't include transaction queue when it gets a nonce. You can find more info about it here ethereum/go-ethereum#15794 It will work only if two conditions will be met:
Additionally, compose tx rpc call reduces number of network round trips required to send a transaction, currently we have to: estimate gas, get a gas price, get a nonce and then send tx. This is not critical but would be nice.
|
But what I'm saying is that even if it did, it still wouldn't solve the problem. Because even after you get this value, there wll be a gap while you're signing it and haven't submitted it yet, when another tx would get the same nonce. |
So what I'd do, if I were you, would be to:
|
in general this is exactly what I am trying to achieve with this PR
what i tried to say earlier is that we can avoid doing any compose tx calls before tx from previous compose call is sent to geth |
Yes, that would work then. But still it would be problematic (from a user-expectation perspective) to have to have the bug reports over and over again, and tell users "compose works 95%, unless you do it multithreaded, when you'll get nonce-collisions. Please use sendTransaction, which works 100%". So I don't personally think it's a really good addition, but that's just my 5 c. |
e48e0dd
to
6496978
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay! We got rid of one patch! 🎉
Please, make sure that go-ethereum
is updated with that.
Looks good, a few nitpicks.
if err != nil { | ||
m.txDone(tx, hash, err) | ||
return hash, err | ||
} | ||
// Send the transaction finally. | ||
hash, err = m.completeTransaction(tx, account, password) | ||
hash, err = m.completeTransaction(config, account, tx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: I'd still put tx
on a first place.
geth/transactions/txqueue_manager.go
Outdated
log.Info("complete transaction", "id", queuedTx.ID) | ||
// nonce should be incremented only if tx completed without error | ||
// if upstream node returned nonce higher than ours we will stick to it | ||
localNonce := m.addrLock.LockNonceFor(queuedTx.Args.From) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would happen if it is impossible to lock nonce? a deadlock? an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will block until the previous completeTransaction()
for this address is finished.
// node, after each call local nonce will be incremented | ||
// then, we return higher nonce, as if another node was used to send 2 transactions | ||
// upstream nonce will be equal to 5, we update our local counter to 5+1 | ||
func (s *TxQueueTestSuite) TestLocalNone() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: TestLocalNonce
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! The whole idea looks good but I would work on the AddrLocker
interface. Maybe we can expose only Lock()
and Unlock()
methods? It's a bit strange that it says LockNonce
as we actually lock an address and additionally its local cached nonce.
geth/transactions/addrlock.go
Outdated
l.lock(address).Lock() | ||
return l.localNonce[address] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Race condition? l.localNonce[address]
is not guarded with a mutex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
l.lock(address).Lock() ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, my bad. In this case, it's safe.
geth/transactions/addrlock.go
Outdated
} | ||
|
||
// lock returns the lock of the given address. | ||
func (l *AddrLocker) lock(address common.Address) *sync.Mutex { | ||
l.mu.Lock() | ||
defer l.mu.Unlock() | ||
if l.localNonce == nil { | ||
l.localNonce = map[common.Address]uint64{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why make
is not used as it is three lines below?
geth/transactions/addrlock.go
Outdated
// same nonce until the lock is released. The mutex prevents the (an identical nonce) from | ||
// being read again during the time that the first transaction is being signed. | ||
func (l *AddrLocker) LockAddr(address common.Address) { | ||
func (l *AddrLocker) LockNonceFor(address common.Address) uint64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe LockAddrAndReturnLocalNonce()
? As it's a bit hard to figure out if this method is just a getter or has some side-effects.
Or maybe just go with Lock()
? I have some trouble with an idea of locking nonce as we actually lock an address and additionally keep track of its local nonce.
geth/transactions/addrlock.go
Outdated
func (l *AddrLocker) UnlockAddr(address common.Address) { | ||
// UnlockNonceFor unlocks the mutex of the given account and updates nonce. | ||
func (l *AddrLocker) UnlockNonceFor(address common.Address, nonce uint64) { | ||
l.localNonce[address] = nonce |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Race condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not necessarily, it depends on the usage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So like an ordinary race condition :D AddrLocker
should be immune to bad usage in my opinion. Ideally, it could verify if a given address
is actually locked. If not, an error should be returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verifying locked address in golang is not that easy, it requires one more synchronization mechanism like Atomic.CompareAndSwap
log.Info("finally completed transaction", "id", tx.ID, "hash", hash, "err", err) | ||
m.txDone(tx, hash, err) | ||
return hash, err | ||
} | ||
|
||
func (m *Manager) validateAccount(tx *common.QueuedTx) (*common.SelectedExtKey, error) { | ||
func (m *Manager) validateAccount(config *params.NodeConfig, tx *common.QueuedTx, password string) (*common.SelectedExtKey, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't get config from Manager
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mainly to avoid multiple calls in mocks
geth/transactions/addrlock.go
Outdated
// UnlockAddr unlocks the mutex of the given account. | ||
func (l *AddrLocker) UnlockAddr(address common.Address) { | ||
// UnlockNonceFor unlocks the mutex of the given account and updates nonce. | ||
func (l *AddrLocker) UnlockNonceFor(address common.Address, nonce uint64) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here maybe we can also go with Unlock()
as we again kind of locking an address, not a nonce.
geth/transactions/txqueue_manager.go
Outdated
log.Info("complete transaction", "id", queuedTx.ID) | ||
// nonce should be incremented only if tx completed without error | ||
// if upstream node returned nonce higher than ours we will stick to it | ||
localNonce := m.addrLock.LockNonceFor(queuedTx.Args.From) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will block until the previous completeTransaction()
for this address is finished.
geth/transactions/txqueue_manager.go
Outdated
localNonce := m.addrLock.LockNonceFor(queuedTx.Args.From) | ||
var nonce uint64 | ||
defer func() { | ||
if err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is ok but usually, a special case happens when err != nil
.
rst := manager.WaitForTransaction(tx) | ||
// simple sanity checks | ||
s.NoError(err) | ||
s.Equal(rst.Error, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s.NoError(rst.Error)
maybe?
// node, after each call local nonce will be incremented | ||
// then, we return higher nonce, as if another node was used to send 2 transactions | ||
// upstream nonce will be equal to 5, we update our local counter to 5+1 | ||
func (s *TxQueueTestSuite) TestLocalNone() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks good! I would also add a fail case where it is verified that nonce was not incremented.
@adambabik I don't like this interface either. Would it make sense If I will simply store a map on a transactionManager, guarded by AddrLocker? |
649af41
to
f0e1429
Compare
I removed changes to AddrLocker, now it looks simpler. Will do a rebase tmrw |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much cleaner interface with nonces put in txqueue_manager.go
!
geth/transactions/txqueue_manager.go
Outdated
|
||
func (m *Manager) completeTransaction(config *params.NodeConfig, selectedAccount *common.SelectedExtKey, queuedTx *common.QueuedTx) (hash gethcommon.Hash, err error) { | ||
log.Info("complete transaction", "id", queuedTx.ID) | ||
// nonce should be incremented only if tx completed without error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we maybe put this comment inside defer
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, will do it with rebase
geth/transactions/txqueue_manager.go
Outdated
notify bool | ||
completionTimeout time.Duration | ||
|
||
addrLock *AddrLocker | ||
localNonce map[gethcommon.Address]uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is actually not thread safe now
If multiple concurrent transactions are sent from the same address nonce for some of them will be the same. It will result in the "known transaction error" and such transaction will be dropped from a queue. There is 2 alternatives in solving such problem: - adding an auto-retry to transaction queue, which is very complex and error prone, this way we can actually retry transactions that are already processed by ethereum - maintain nonce locally Second approach is a straightforward. We keep asking for a nonce from upstream, but if our local nonce is higher we steak to it. Our local nonce is updated only if transaction succeeds, so there is no way to send out of order transaction. Signed-off-by: Dmitry Shulyak <yashulyak@gmail.com>
PR in status-go/go-ethereum status-im/go-ethereum#41 |
@dshulyak merged |
@mandrigin thanks, updated |
@divan it would be cool if you can take a look at this PR... |
geth-patches/README.md
Outdated
- [`0008-tx-pool-nonce.patch`](./0008-tx-pool-nonce.patch) - On GetTransactionCount request with PendingBlockNumber get the nonce from transaction pool | ||
- [`0010-geth-17-fix-npe-in-filter-system.patch`](./0010-geth-17-fix-npe-in-filter-system.patch) - Temp patch for 1.7.x to fix a NPE in the filter system | ||
|
||
- [`0010-geth-17-fix-npe-in-filter-system.patch`](./0010-geth-17-fix-npe-in-filter-system.patch) - Temp patch for 1.7.x to fix a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To fix what? 😅 What is npe
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
null pointer exception
} | ||
// 5 transactions in total | ||
for i := 0; i < txCount+2; i++ { | ||
s.setupStatusBackend(account, password, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit strange we call s.setupStatusBackend(account, password, nil)
five times with the same args. Maybe we can clarify this in the comment: setup expectations for 5 transactions in total
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
If multiple concurrent transactions are sent from the same address
nonce for some of them will be the same. It will result in the
"known transaction error" and such transaction will be dropped
from a queue.
There are 2 alternatives to solving such problem:
and error-prone, this way we can actually retry transactions
that are already processed by the Ethereum
The second approach is a straightforward. We keep asking for a nonce
from upstream, but if our local nonce is higher we will stick to it.
Our local nonce is updated only if transaction succeeds, so there is
no way to send out of order transaction.
Blocked by: #537