Segmentation Fault(SIGSEGV) when starting Geth #16606

Closed
AyushG3112 opened this issue Apr 30, 2018 · 8 comments · Fixed by #16682

Comments

@AyushG3112

Geth version: 1.8.6-stable
OS & Version: Ubuntu 16.04.2

Expected behaviour

Geth should start syncing the blockchain.

Actual behaviour

Geth crashes on startup with a segmentation violation (SIGSEGV).

Steps to reproduce the behaviour

No specific steps. Simply starting Geth.

Backtrace

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xa6faab]

goroutine 83 [running]:
github.com/ethereum/go-ethereum/eth/filters.(*EventSystem).eventLoop(0xc420662940)
	/build/ethereum-LEwR9U/ethereum-1.8.6+build13246+xenial/build/_workspace/src/github.com/ethereum/go-ethereum/eth/filters/filter_system.go:434 +0x2eb
created by github.com/ethereum/go-ethereum/eth/filters.NewEventSystem
	/build/ethereum-LEwR9U/ethereum-1.8.6+build13246+xenial/build/_workspace/src/github.com/ethereum/go-ethereum/eth/filters/filter_system.go:113 +0x104
@joeysino

joeysino commented May 3, 2018

Are you starting geth with RPC enabled?

We get this error if we start geth like this:

./geth --rpc --rpcaddr [redacted1]

But if we start it with a different IP address then it works fine!

./geth --rpc --rpcaddr [redacted2]

Fixed! (See below)

This is happening on both 1.8.3 and 1.8.7, running on CentOS.

@AyushG3112
Author

@joeysino I'm using WebSockets, not RPC. Are you using AWS by any chance?

@joeysino

joeysino commented May 3, 2018

@AyushG3112 Yes we are also using AWS. It seems to happen with either --rpc or with --ws.

@joeysino

joeysino commented May 3, 2018

Oh I see my mistake. --rpcaddr should be given the local machine's IP address, not the remote machine's IP address. (That is for the firewall to handle.)

So that's why it exploded when I changed IP address.

@AyushG3112
Author

@joeysino From what I've been able to diagnose: the private IP of an AWS instance is assigned to the instance itself, so you can bind to it. The public and Elastic IPs are allocated to the AWS NATs, which forward requests to the instance, so if you try binding to one of them, Geth crashes.

The same thing happened with our other nodes: they were not able to start servers on the public or Elastic IP.
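The binding behaviour described here can be reproduced outside Geth. Below is a minimal Go sketch, not Geth's code: `tryBind` is a hypothetical helper, and `192.0.2.1` is a documentation-only address (TEST-NET-1) standing in for a public or Elastic IP that is not assigned to any local interface. On a typical Linux host the second bind is expected to be refused by the operating system with an ordinary error rather than a crash.

```go
package main

import (
	"fmt"
	"net"
)

// tryBind (hypothetical helper) attempts to open a TCP listener on ip,
// letting the OS pick a free port, and returns whatever error the OS reports.
func tryBind(ip string) error {
	ln, err := net.Listen("tcp", net.JoinHostPort(ip, "0"))
	if err != nil {
		return err
	}
	return ln.Close()
}

func main() {
	// 127.0.0.1 is assigned to the loopback interface; 192.0.2.1 is assumed
	// not to be assigned to any interface on this machine.
	for _, ip := range []string{"127.0.0.1", "192.0.2.1"} {
		if err := tryBind(ip); err != nil {
			fmt.Printf("bind %s: %v\n", ip, err)
			continue
		}
		fmt.Printf("bind %s: ok\n", ip)
	}
}
```

The point of the sketch is that the failed bind surfaces as a normal `error` value from `net.Listen`, which the caller can report and exit on cleanly.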

@joeysino

joeysino commented May 3, 2018

Yes, that's right. AWS instances have multiple IPs, internal and external, and for this binding we need to use the internal one, which can be found with ifconfig.

So I think we solved our problem at least. But perhaps geth could provide a better error message when the IP cannot be bound to.

Possible message: geth cannot bind to IP 20.30.40.50. Bind IP must be an interface on the local machine

@holiman
Contributor

holiman commented May 3, 2018

I'm not sure I follow... The original report is about a segfault -- under what circumstances does the segfault occur?
AFAIK, trying to bind to a non-existent address produces an error message saying that the address is unavailable. Are you saying it segfaults instead?

@AyushG3112
Author

@holiman Yes. When binding to an unavailable IP, Geth throws SIGSEGV. Removing the IP or changing it to an available one works perfectly. I can reproduce this consistently.
