Geth not broadcasting transfer #22308
Comments
Side note: geth sync stopped ~8 hours after the restart, and there is nothing in the logs about it. |
The tx was mined here: https://etherscan.io/tx/0x4b60ee38c374c753f3e0d682b9f97d169fe4276d95cdd178a495e9dc3d8fbbcf . As far as I can tell, your issue was filed |
@holiman You're wrong, this is not the issue that I reported. Geth returned a txid, but the transfer was stuck inside geth in pending state for 3 hours. I have reported the same issue before and someone replied that it was fixed. We are running multiple servers with geth; the issue happens randomly, and only a geth restart fixes these pending transfers and broadcasts them to the network. Please kindly note that this issue is present on many of our systems, and we're simply sending transfers out using Now we're facing the same issue again: 3 transfers in pending state while geth is fully synced. `> eth.syncing
` |
While it's not really helping solving the root issue, a temporary hack you can use to work around these scenarios, apart from restarting geth, can be to use another node to broadcast the transactions. |
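For anyone who wants to script the "use another node" workaround, here is a minimal sketch in Python (standard library only). It assumes you already have the raw signed transaction as 0x-prefixed hex and a second node with its HTTP RPC enabled; `node_url` and the function names are made up for illustration. `eth_sendRawTransaction` is the standard JSON-RPC method.

```python
import json
import urllib.request


def build_send_raw_request(raw_tx_hex, request_id=1):
    """Build the JSON-RPC body for eth_sendRawTransaction."""
    if not raw_tx_hex.startswith("0x"):
        raise ValueError("raw transaction must be 0x-prefixed hex")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_sendRawTransaction",
        "params": [raw_tx_hex],
    }


def rebroadcast(raw_tx_hex, node_url, timeout=10):
    """POST the already-signed transaction to another node.

    node_url is a hypothetical RPC endpoint of a second node you control.
    Returns the parsed JSON-RPC reply (a result or an error object).
    """
    body = json.dumps(build_send_raw_request(raw_tx_hex)).encode("utf-8")
    req = urllib.request.Request(
        node_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Since the signed bytes are unchanged, the nonce and hash are the same as on the original node, so pushing the transaction to a second node cannot make the transfer execute twice.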
Thank you for this hint but i would avoid doing this, We're running multiple servers and it will take a lot of time to check each of them and rebroadcast the transaction manually. I would rather wait for this geth issue to be solved properly. |
So, I'm kind of agreeing that Martin is correct here. You sent a transaction with 121 gwei, while the network's average price on that day was 175 gwei. How do you expect to be included fast in a block if you underprice the network? |
@karalabe Sorry, but I don't think you get the full picture of the issue here. Even if the issue was the gas price, THE TRANSFER SHOULD BE VISIBLE ON A BLOCK EXPLORER when we look up a txid from the pending transfers. But when we check the transfer, etherscan returns invalid txhash. When a transfer is sent out with low gwei and geth successfully processes it, the transfer is available on the block explorer in pending state. Here the issue is: the transfer is sent with the correct amount of gwei and all other necessary details, but geth won't send it out. Blocks are updating, but if we check eth.pendingTransactions we have plenty of transfers in pending state, and when we look up any of these txids in the explorer, the transfer isn't available. IF WE RESTART GETH, GETH BROADCASTS THE TRANSFERS AND THEY ARE PROCESSED! It is not a gwei issue, it's totally geth's fault, and the transfers are broadcast only after a geth restart. I have reported the issue 2 times ~1 year ago and was told it was fixed (other users also reported the same issue, but in fact it was never fixed). The issue appears randomly after some time, and inside geth we don't have any information about the transfer, just a txid which is not valid. Right now we have many transfers in pending state, but it looks like even after a geth restart the transfers are still not processed, and we can't get any information about them from the explorer.
|
I agree with @Gbogdann93 . It's not related to the gwei price. Most of my txs are stuck pending in the node and are never shown on etherscan. The stuck txs' gas price is 150 gwei, and the average was 120 gwei at the time of creation. |
This is totally a geth issue which has been pending for almost 1 year but was never checked properly, even though multiple topics have been opened for the same issue. I hope this time someone will try to cover and fix it. |
I can confirm this issue. Some of the transactions just get stuck in geth with pending status, while the rest of the world does not receive them at all: no pending txs on etherscan. The gas price is correct and the nonce is correct. It seems that geth just does not broadcast these transactions at all. It may take about 1.5 hours for geth to re-broadcast these transactions to the network, after which they are confirmed by the network. |
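The symptom described in the reports above (pending locally, unknown everywhere else) can be checked automatically instead of by hand on etherscan. Below is a rough Python sketch, assuming you have a list of locally pending tx hashes and some independent reference node to ask. `eth_getTransactionByHash` is a standard JSON-RPC method that returns null for unknown hashes; the function names and `node_url` are hypothetical.

```python
import json
import urllib.request


def make_rpc_lookup(node_url, timeout=10):
    """Return a function that asks node_url (hypothetical) about a tx hash."""
    def lookup(tx_hash):
        body = json.dumps({
            "jsonrpc": "2.0",
            "id": 1,
            "method": "eth_getTransactionByHash",
            "params": [tx_hash],
        }).encode("utf-8")
        req = urllib.request.Request(
            node_url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            # "result" is null (None) when the node does not know the tx.
            return json.loads(resp.read().decode("utf-8")).get("result")
    return lookup


def find_unpropagated(local_pending_hashes, remote_lookup):
    """Return the locally pending hashes the reference node has never seen."""
    missing = []
    for tx_hash in local_pending_hashes:
        if remote_lookup(tx_hash) is None:
            missing.append(tx_hash)
    return missing
```

A single reference node can have dropped the transaction itself, so a hit here only says "not known to that node", not "never broadcast".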
What would help us is to know your sending patterns. How many transactions do you push into the network when this happens, how fast do you push them in, what is the txpool content at that time ( There are a lot of moving variables, and we're mostly confident Geth handles things correctly in general, but once you push transactions over the limits of your local and remote peers' capacities, things might get unreliable. I'm 100% certain Geth published every transaction it gets. The question is why those transactions get dropped by the remote nodes you send them to. It hits some limits, the question is which one. |
Certain entities are also playing pool battles, trying to kick each other out of the pool to get a better placing for exchange transactions. Such adversarial moves can also influence other transactions, though it's hard to say without seeing some traffic and txpool content when the issue happens. |
@karalabe It seems we keep coming back to blaming this on something other than geth, when in fact it totally is geth's fault. Over the past year I have provided info and geth logs, but nothing was checked at all, and some threads were closed even though the resolution was "fixed in version 1.9.16". We have encountered the same issue on multiple servers. We're running geth on at least 15 separate servers for our wallets, and the issue appears both on servers where we send 2-3 transfers/day and on servers where we send more than 50 transfers/day. It is totally random, and after some geth restarts the transfers become visible on the network. I think I have already tried to provide all the necessary information in at least 2 topics, but even though other users reported the same problem, it was never checked properly. At this moment we have many customers complaining about transfers with an invalid txid that are not available on the network, and we can't do anything at all. Do you have any temporary solution for making these transfers available on the network? ~2 years ago, when we faced the same issue and restarted geth, pending transfers from geth were executed 2 times and we lost a huge amount because of that. Please kindly let us know what we can do to have transfers properly sent out from geth. We have also prepared other servers with geth fully synced, but I am not sure if we can move the keystores and migrate to a new one, and what will happen with the pending transfers from the old geth? We don't want to have transfers executed 2 times. |
This can be further investigated, but it's not trivial. If you are running multiple servers, can you: monitor one as being If it never saw the transaction: yes, geth failed to broadcast. However, there are a lot of transactions coming in and getting dropped every second, so you might need to run some custom version of geth on If you would be willing to set this up, I can provide the custom version. |
@Gbogdann93 what we're trying to say is: this might totally be geth's fault, but we cannot debug it without further information. I understand you are frustrated by this experience, since you have already provided logs in the past. |
@fjl Please let me know what is needed again from my side. I will not debug geth on a live environment where we have at least 20 ETH in pending state. I am just a developer like you, I don't have access to all servers, and we will not do any tests on environments with 200+ ETH. We already tried to replicate the problem on a test geth, but the issue can't be replicated at all (same server, specs, geth version and so on). |
@Gbogdann93 It's completely understandable that you don't want to debug/test on live environments, regardless of how much funds are at stake. Obviously don't test in prod. Our suggestion was a bit different. We are mostly convinced that it's not your local node that's not sending out transactions, rather it's the nodes around you (the ones you're connected to) that are rejecting them after they receive them. As such, you don't need to touch your production nodes at all really. What we need is a completely independent new node that is simply connected to your production nodes and doesn't do anything special, just behaves like any other live node (with high verbosity logging). Martin suggested you set it up because you might feel better if no details about your infra are shared with us. We can set up a node ourselves too with all the logging we'd need, the only thing we need is to be connected to your own nodes to see what's happening on the wire. We can whitelist your nodes to allow them to always connect to us if you give us This bug is obviously a rare thing that I still say depends on network conditions. You won't be able to reproduce it easily, because there's no singular cause and you won't be able to reproduce the network conditions. The only option we see is if we can somehow monitor the network traffic and analyze the events after the fact. So Option A: you set up an extra node within your infra, have it connected to your servers and send us logs; Option B: we set up our monitoring nodes and run it the best we can, but we still need to know we have a connection to your systems when weird things happen. |
@karalabe We would like to go with option B. |
@karalabe We have been facing the same issue for a while now. Every few days we have pending txs on geth, sometimes dozens; at the moment we are rebroadcasting through other nodes we have or via etherscan. If it helps troubleshooting, we could also connect our node with yours. It would also be a nice feature if the node that rejects txs let the sender node know that they have been rejected and why. |
I hit the same issue on my node: with sendRawTransaction, txs randomly get stuck locally and are not broadcast to the network. INFO [02-23|03:37:13.849] Loaded local transaction journal transactions=99 dropped=0 The only thing I can do is restart geth again and again; it will not rebroadcast all pending txs in one restart, so you need to keep restarting. The stuck txs always happened when the average gas price swung wildly. |
This has been a problem for years with geth in large production networks. I can give everyone the hack we have in place that keeps it happy most of the time, and while it may be "spammy" or a "waste of resources", it keeps our stuff running pretty smoothly (unless there is a backup, which I'll talk about later) a) set up 3-4 geth servers that are just nodes in multiple places if possible This will stop the random "a tx didn't go out and blocked my entire pipe" issue. It sucks, but this almost always guarantees they are seen. We no longer have those stuck transactions with this setup. The time this fails is during large gas spikes, when you have TXs that are stuck for multiple hours because of gas fees. When this happens, all the txs you have out there are dropped, and all the endpoints you've sent them to already know about them too and have dropped them as well. The only way to fix that is basically to find a new endpoint to rebroadcast the tx to. As a service operator that relies on geth, this entire thing is extremely frustrating. |
@karalabe Do we have any ETA for this issue? It will be fixed in the near future? |
You'll need to give me a couple days, we're finalizing the 1.10 release now and I want to get that out of the way before diving into a new thing. |
@karalabe Please kindly let us know when we can expect an update on this. Thank you. |
I just saw this issue on our node and based on the suggestions above, I restarted Geth and it got fixed. |
@BigMurry thanks a lot for your suggestion to increase maxPeers value. it was working just fine for the last 2 weeks. I will be looking forward to your advice re the new geth upgrade and if any pending are shown... |
Are there any updates? |
Same issue here with version 1.10.1. The node randomly stops broadcasting pending transactions, and the only way to fix it is restarting it |
It happened today on our network. |
@kladkogex Get used to it. We have few servers where this error hasn't appeared. What is really interesting is that we have been reporting this issue for almost 2 years, but it seems nobody cares, even though it is very critical and affects many users' transfers. The only solution is to move to another client, since we don't get any response in a place where we are supposed to help each other. |
Hey @karalabe Let's make a bet - I do not know much about Go, never programmed this language. I have no clue what is inside geth. In addition to that I go for a vacation tomorrow, so you have 5 days lead time. If I fix this bug faster than you, you pay me $1000. If you fix it faster than me, I pay you $1000. Deal? This bug should have been fixed long time ago. |
This ticket is titled "Geth not broadcasting transfer". There was another ticket, about not re-broadcasting transactions. Those are two different things. Geth broadcasts all transactions. There are no known issues about broadcasting transactions. If geth indeed fails at doing so, then it should be possible to figure out a repro testcase to trigger such faulty behaviour. However, I think all these reports are really just an artefact of the next issue: Geth does not always rebroadcast transactions. There's currently no fix for that, at least which won't cause a lot of network spam.
Which bug is that? Number one or number two? |
We have the same issue, especially since we upgraded to version 1.10.1. On the rinkeby network the txs sometimes don't propagate. When I check eth.pendingTransactions, no txs are listed, but they are not mined either. When I inspect the txpool I see my transactions with a valid gasprice (1 on Rinkeby) and correct nonces. The strange thing: transactions from other addresses sent to the same node work fine. When we experience the issue, only the transactions from that one particular address won't propagate. We sent txs with different addresses to our node, but the others continue to work normally. More often than not it happens after we've sent a few txs in a short time span (multiple in less than 1 minute), but yesterday it was with a milder load and it still went wrong. We changed from a self-hosted node to AWS Managed Blockchain, which uses Geth, but it didn't make a difference, still the same issue. The issue resolves itself after a restart of our application or randomly after some time. |
@holiman Again we face this issue. Restarted geth but transfers are "still pending" even if we already have 100 confirmations for them. I think this will never be fixed, am i right? |
Original issue:
Now you say:
How is that the same issue? That sounds like a different issue to me -- it was obviously sent to the network, how could it otherwise be confirmed? Please file a new ticket. |
@holiman I can open 100 tickets, but in the end it's just a waste of time, as we all can see. There are 2 issues which have been reported by me and other users for almost 2 years and no one cares.
Don't get me wrong, we're all tired of these issues coming up all the time, and it's not just me reporting this: there are plenty of other users complaining about this bug, and in the end no action is taken from your side. I can start again from scratch, open a ticket and post all the required information, logs and other things again, but as I see it, it's just a waste of time on my side. |
@Gbogdann93 I'm sorry, we're a handful of people trying to maintain the client. People are pushing for new transaction models (1559), people are pushing for the merge. We don't have capacity to handle everything. We are aware of issues, in our opinion the solution is a full pool rewrite with a completely different data model. That will take a long time. If you have capacity to debug the pool, great, we'd gladly fix any explicit errors. |
Can we avoid the hostility and negativity towards the maintaining team? We are capable of contributing to the project, so let's be civil and appreciate any help we can provide each other. |
We are experiencing the same issue, and I mentioned it here |
Same issue again with the latest geth version, v1.10.6. |
Same issue here. Been following this conversation thread for some time in the hope that someone, somewhere finds something to alleviate this problem. I am running a geth server in docker on a high-powered AWS instance. We have the data vol on a dedicated EBS volume using GP2. We are investigating whether disk congestion might correlate with this issue: we have seen it occur when we tend to see higher io waits on our data disk. We have only just noticed this coincidence, so it may turn out to be nothing, but given this problem happens every few days for us we should be able to confirm or rule out this theory soon. We are now running enhanced metrics and stats collection to see if there is alignment between the two. We should know more within a week or two. Will keep you posted. |
Update - we believe we have got to the bottom of this issue, although still tracking for confidence. We had previously upped our GlobalSlots to 250k, but not touched the other parameters. Having eyed the metrics closer:-
Without these settings above increased, when we had a flurry of transactions to send, the earlier transactions were published fine but the node held back the later ones until it was restarted. With these two settings adjusted/increased we have seen consistent behaviour in transactions being sent to the blockchain. This may of course be purely coincidental given the behaviour and volatility we see on Ethereum, but two weeks without a stuck transaction we have already hailed this as a minor miracle. @karalabe - It would be great if we could confirm the above - would it be possible to investigate these conditions and what happens to locally submitted transactions when these two parameters are exceeded? For the second parameter you will need to batch up a series of transactions from the same account and submit them at once. FYI - it would be great to add some additional logging around these parameters to know when they have been exceeded. |
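On the request for more visibility into when pool limits are exceeded: until geth logs this itself, one option is to poll `txpool_content` and count transactions per sender. The Python sketch below does only the counting part and assumes the usual `txpool_content` result shape (`pending` and `queued`, each mapping address to nonce to transaction). The comment above does not say which two settings were raised, so the thresholds here are plain parameters that you would set to whatever your node runs with (for example the values behind `--txpool.accountslots` and `--txpool.accountqueue`); the function name is made up.

```python
def accounts_over_limit(txpool_content, pending_limit, queued_limit):
    """Report senders whose per-account tx count exceeds a threshold.

    txpool_content: the "result" object of a txpool_content call, assumed
        to look like {"pending": {addr: {nonce: tx}}, "queued": {...}}.
    pending_limit / queued_limit: thresholds chosen by the operator; they
        are assumptions of this sketch, not values read from geth.
    """
    report = {}
    for section, limit in (("pending", pending_limit), ("queued", queued_limit)):
        for address, txs_by_nonce in txpool_content.get(section, {}).items():
            count = len(txs_by_nonce)
            if count > limit:
                report.setdefault(address, {})[section] = count
    return report
```

Logging the returned report on a timer, next to the timestamps of stuck transactions, would show whether the two actually line up or whether the improvement was coincidental.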
@chreechris Are you still running without issue with those parameters? |
I've been seeing this issue regularly for months. Seems to be no way to gracefully resolve it other than killing GETH and praying that the little gremlins inside GETH persisted all the transactions to disk and will rebroadcast them on restart. |
What might help is if we had some separate logs for |
I suffer from the same issue from time to time. As @chreechris mentioned, it only happens when the blockchain is under heavy load. Another workaround is to cancel (nonce overwrite) the first stuck tx; then all the other txs in the queue will be broadcast. I will test the globalqueue and accountslot settings as well and update in a few weeks. |
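A rough sketch of the nonce-overwrite workaround in Python, for legacy (gasPrice) transactions. A replacement for the same nonce is only accepted if its gas price is higher by the pool's price bump, which defaults to 10% in geth (`--txpool.pricebump`). The helper names are made up, and the self-transfer with 21000 gas assumes the sender is a plain account, not a contract. The returned dict is shaped for `eth_sendTransaction`; signing and sending are left out.

```python
def min_replacement_gas_price(old_gas_price_wei, price_bump_percent=10):
    """Smallest gas price (wei) that is at least price_bump_percent higher."""
    numerator = old_gas_price_wei * (100 + price_bump_percent)
    return -(-numerator // 100)  # ceiling division, stays in integers


def build_cancel_tx(address, stuck_nonce, old_gas_price_wei, price_bump_percent=10):
    """Build a 0-value self-transfer that reuses the stuck transaction's nonce."""
    new_price = min_replacement_gas_price(old_gas_price_wei, price_bump_percent)
    return {
        "from": address,
        "to": address,             # sending to yourself moves no funds
        "value": hex(0),
        "nonce": hex(stuck_nonce),  # same nonce as the stuck tx
        "gas": hex(21000),          # plain transfer; assumes a non-contract account
        "gasPrice": hex(new_price),
    }
```

For a tx stuck at 150 gwei this gives a replacement price of 165 gwei. In practice you would probably go higher than the minimum, since the point is to get the replacement mined quickly.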
This is happening on almost a daily basis for me now. GETH logs don't reflect any errors or problems broadcasting the transaction. The GETH node just silently hoards the transactions, and won't re-broadcast them until I restart it and submit another transaction. Is this a problem with the peers my node is broadcasting to? Or is there something that prevents GETH from broadcasting to its peers correctly? If GETH cannot confirm the transaction in a reasonable number of blocks, shouldn't GETH try to rebroadcast the transaction repeatedly until I issue a new transaction to drop+replace with a higher fee? |
There is a fix on BSC for this. Is it possible to port it to go-ethereum? bnb-chain/bsc#570 |
Shouldn't GETH rebroadcast the transactions to newly connected peers too? I'm finding GETH doesn't even retry transactions broadcasted to peers that get dropped after. It seems like the GETH broadcasting logic assumes that all the peers at the moment of broadcast are good peers, which is probably a faulty assumption... some of these peers are bad. |
There should be a better way to force rebroadcast the pending transaction queue other than restarting geth. |
Hi there. |
System information
Geth version: 1.9.23
OS & Version: Debian 9
Expected behaviour
Transfer to be sent out and processed by network
Actual behaviour
Transfer is in pending transaction state and even after geth restart transfer is not processed.
Steps to reproduce the behaviour
Random; we simply sent a transfer out
It seems this issue is happening again. I have reported it 2 times, and other users have reported it as well. Last time this happened, a geth restart fixed the issue and the transfer was processed. Now, even after a geth restart, the transfer is still in pending state and is not broadcast to the network.
One of previous issues:
#21385 (comment)
Info from geth: