
bug: APISIX start failed when one of ETCD nodes down #5115

Closed · chzhuo opened this issue Sep 22, 2021 · 7 comments · Fixed by #5158
Labels: enhancement (New feature or request)

Comments

chzhuo (Contributor) commented on Sep 22, 2021

Issue description

I have three etcd nodes:

172.16.255.1
172.16.255.2
172.16.255.3

The etcd cluster works well when node 172.16.255.3 is down, but APISIX fails to start under these circumstances:

[screenshot of the startup error]
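For context, a minimal config.yaml sketch of how such a cluster is typically listed under `etcd.host` in APISIX 2.x (the client port 2379 and the prefix are assumptions, not taken from the report):

```yaml
# conf/config.yaml (excerpt) -- illustrative only; port 2379 is assumed
etcd:
  host:
    - "http://172.16.255.1:2379"
    - "http://172.16.255.2:2379"
    - "http://172.16.255.3:2379"
  prefix: "/apisix"
```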

Environment

apisix version: 2.8

Steps to reproduce

1

Actual result

2

Error log

3

Expected result

4

tzssangglass (Member) commented

If only 172.16.255.1 is configured in APISIX, will it start properly?

shuaijinchao added the bug (Something isn't working) label on Sep 22, 2021
shuaijinchao (Member) commented

When starting or restarting APISIX through the CLI tool, the process exits directly as soon as any etcd node is abnormal. Are you interested in fixing it? @chzhuo

shuaijinchao (Member) commented

You can check whether there is at least one healthy node after the etcd check loop finishes, to ensure that APISIX can still run normally.
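As an illustration of that idea (not the actual APISIX CLI, which is written in Lua), here is a minimal sketch in Python: probe every configured endpoint, warn about unreachable ones, and abort startup only when none of them responds. The endpoint list and timeout are placeholders.

```python
import sys
import urllib.error
import urllib.request

# Hypothetical endpoint list; in APISIX this comes from etcd.host in config.yaml.
ETCD_ENDPOINTS = [
    "http://172.16.255.1:2379",
    "http://172.16.255.2:2379",
    "http://172.16.255.3:2379",
]


def is_healthy(endpoint, timeout=2.0):
    """Treat an endpoint as healthy if its /version API answers with HTTP 200."""
    try:
        with urllib.request.urlopen(endpoint + "/version", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def check_etcd_cluster(endpoints):
    healthy = [ep for ep in endpoints if is_healthy(ep)]
    for ep in endpoints:
        if ep not in healthy:
            # Still report broken nodes so misconfiguration stays visible.
            print("warning: etcd endpoint %s is unreachable" % ep, file=sys.stderr)
    if not healthy:
        # Abort startup only when *every* endpoint is down.
        sys.exit("error: no healthy etcd endpoint, refusing to start")


if __name__ == "__main__":
    check_etcd_cluster(ETCD_ENDPOINTS)
```

This keeps the original intent of surfacing misconfigured nodes (via warnings) while letting APISIX start as long as at least one etcd member is reachable.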

spacewander removed the bug (Something isn't working) label on Sep 22, 2021
spacewander (Member) commented

The behavior is intentional: it makes sure each configured node is working, so that we can catch misconfigured nodes.

spacewander added the enhancement (New feature or request) label on Sep 23, 2021
spacewander (Member) commented

I changed my mind. We can allow broken nodes, to work around some unstable states in etcd.

okaybase (Member) commented

Production environment case: in our prod environment we upgraded the etcd cluster to v3.5 and also hit this case: etcd-io/etcd#12845 (comment), which resulted in some etcd nodes going down. So I am also inclined toward this enhancement.

shuaijinchao (Member) commented

I'll fix this problem.
