
bug: APISIX start failed when one of ETCD nodes down #5115

Closed · chzhuo opened this issue Sep 22, 2021 · 7 comments · Fixed by #5158
Labels: enhancement (New feature or request)

Comments

chzhuo (Contributor) commented on Sep 22, 2021

Issue description

I have three etcd nodes:

172.16.255.1
172.16.255.2
172.16.255.3

The etcd cluster works well when node 172.16.255.3 is down, but APISIX fails to start under these circumstances:

[screenshot of the startup error]
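For context, a minimal config.yaml sketch of how such a cluster is typically listed under `etcd.host` in APISIX 2.x (the client port 2379 and the prefix are assumptions, not taken from the report):

```yaml
# conf/config.yaml (excerpt) -- illustrative only; port 2379 is assumed
etcd:
  host:
    - "http://172.16.255.1:2379"
    - "http://172.16.255.2:2379"
    - "http://172.16.255.3:2379"
  prefix: "/apisix"
```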

Environment

apisix version: 2.8

Steps to reproduce

1

Actual result

2

Error log

3

Expected result

4

tzssangglass (Member) commented

If only 172.16.255.1 is configured in APISIX, will it start properly?

shuaijinchao added the bug (Something isn't working) label on Sep 22, 2021
shuaijinchao (Member) commented

When starting or restarting APISIX through the CLI tool, the process exits directly as soon as any etcd node is abnormal. Are you interested in fixing it? @chzhuo

shuaijinchao (Member) commented

You can check whether there is at least one healthy node after the etcd check loop finishes, to ensure that APISIX can still run normally.
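As an illustration of that idea (not the actual APISIX CLI, which is written in Lua), here is a minimal sketch in Python: probe every configured endpoint, warn about unreachable ones, and abort startup only when none of them responds. The endpoint list and timeout are placeholders.

```python
import sys
import urllib.error
import urllib.request

# Hypothetical endpoint list; in APISIX this comes from etcd.host in config.yaml.
ETCD_ENDPOINTS = [
    "http://172.16.255.1:2379",
    "http://172.16.255.2:2379",
    "http://172.16.255.3:2379",
]


def is_healthy(endpoint, timeout=2.0):
    """Treat an endpoint as healthy if its /version API answers with HTTP 200."""
    try:
        with urllib.request.urlopen(endpoint + "/version", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def check_etcd_cluster(endpoints):
    healthy = [ep for ep in endpoints if is_healthy(ep)]
    for ep in endpoints:
        if ep not in healthy:
            # Still report broken nodes so misconfiguration stays visible.
            print("warning: etcd endpoint %s is unreachable" % ep, file=sys.stderr)
    if not healthy:
        # Abort startup only when *every* endpoint is down.
        sys.exit("error: no healthy etcd endpoint, refusing to start")


if __name__ == "__main__":
    check_etcd_cluster(ETCD_ENDPOINTS)
```

This keeps the original intent of surfacing misconfigured nodes (via warnings) while letting APISIX start as long as at least one etcd member is reachable.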

spacewander removed the bug (Something isn't working) label on Sep 22, 2021
spacewander (Member) commented

The behavior is intentional: it makes sure each configured node is working, so that we can catch misconfigured nodes.

spacewander added the enhancement (New feature or request) label on Sep 23, 2021
spacewander (Member) commented

I changed my mind. We can allow broken nodes, to work around some unstable states in etcd.

okaybase (Member) commented

Production environment case: in our prod environment we upgraded the etcd cluster to v3.5 and also hit this case: etcd-io/etcd#12845 (comment), which resulted in some etcd nodes going down. So I am also inclined toward this enhancement.

shuaijinchao (Member) commented

I'll fix this problem.
