Backend Use Nacos cause OutOfMemoryError #1613

Closed
AYue-94 opened this issue Sep 4, 2024 · 4 comments · Fixed by #1614
Labels
kind/bug Something isn't working

Comments


AYue-94 commented Sep 4, 2024

What happened?

When the backend loses contact with Nacos, an OutOfMemoryError occurs after a period of time:

Exception in thread "com.alibaba.nacos.client.Worker" java.lang.OutOfMemoryError: unable to create new native thread
	at java.lang.Thread.start0(Native Method)
	at java.lang.Thread.start(Thread.java:719)
	at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
	at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1025)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Exception in thread "com.alibaba.nacos.client.Worker" java.lang.OutOfMemoryError: unable to create new native thread
	at java.lang.Thread.start0(Native Method)
	at java.lang.Thread.start(Thread.java:719)
	at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
	at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1025)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
jstack 6718 | grep "com.alibaba.nacos.client.Worker" | wc -l
    4029
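
The same count can be taken from inside the JVM, which may help when jstack is not available on the host. This is a minimal sketch, not Sermant or Nacos code: the class name `WorkerThreadCount` and the method `countThreadsContaining` are hypothetical, and the only assumption is that leaked threads carry the name fragment seen in the trace above.

```java
import java.util.Set;

public class WorkerThreadCount {
    // Counts live threads whose name contains the given fragment,
    // roughly what `jstack <pid> | grep <fragment> | wc -l` reports.
    static long countThreadsContaining(String fragment) {
        Set<Thread> liveThreads = Thread.getAllStackTraces().keySet();
        return liveThreads.stream()
                .filter(t -> t.getName().contains(fragment))
                .count();
    }

    public static void main(String[] args) {
        // Name fragment taken from the stack trace in this issue.
        String fragment = "com.alibaba.nacos.client.Worker";
        System.out.println(fragment + " threads: " + countThreadsContaining(fragment));
    }
}
```

Logging this number periodically while Nacos is down should show whether the count keeps growing.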

How can we reproduce it (as minimally and precisely as possible)?

  1. Do not start Nacos.
  2. Configure the backend to use Nacos:
dynamic.config.enable=true
dynamic.config.namespace=sermant
dynamic.config.timeout=30000
dynamic.config.serverAddress=127.0.0.1:8848
dynamic.config.dynamicConfigType=NACOS
dynamic.config.connectTimeout=3000
dynamic.config.enableAuth=false
dynamic.config.userName=
dynamic.config.password=
dynamic.config.secretKey=
  3. Start the backend.

Anything else we need to know?

No response

Sermant version

2.0.0

OS version

MacOS

AYue-94 added the kind/bug label Sep 4, 2024
Author

AYue-94 commented Sep 4, 2024

We don't need to manage reconnection ourselves when using the Nacos client; Nacos handles reconnection itself.
The same applies to the agent: Nacos versions after 2.2.1 (https://github.com/alibaba/nacos/pull/9639) add health check logic, which will cause a memory/thread leak.
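
To illustrate how an application-level reconnect loop can leak threads, here is a minimal sketch under stated assumptions. It is not Sermant or Nacos code: `DemoClient`, `reconnect`, and the thread name `demo.client.Worker` are hypothetical stand-ins. It assumes that each reconnect attempt builds a fresh client that owns a worker thread, and that the old client is abandoned without being shut down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ReconnectLeakSketch {
    static final String WORKER_NAME = "demo.client.Worker";

    // Hypothetical client that owns one background worker thread.
    static final class DemoClient {
        private volatile Thread thread;
        private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, WORKER_NAME);
            t.setDaemon(true);
            thread = t;
            return t;
        });

        DemoClient() {
            // Submitting a task forces the executor to start its worker thread.
            worker.submit(() -> { });
        }

        // Stops the worker and waits until its thread has actually exited.
        void close() {
            worker.shutdownNow();
            try {
                if (thread != null) {
                    thread.join(5000);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Counts live worker threads by name.
    static long liveWorkers() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> WORKER_NAME.equals(t.getName()))
                .count();
    }

    // Simulates `attempts` reconnects, each creating a new client.
    // Returns the number of live workers before cleanup.
    static long reconnect(int attempts, boolean closeOld) {
        List<DemoClient> abandoned = new ArrayList<>();
        DemoClient current = new DemoClient();
        for (int i = 0; i < attempts; i++) {
            if (closeOld) {
                current.close();
            } else {
                abandoned.add(current); // old worker thread keeps running
            }
            current = new DemoClient();
        }
        long live = liveWorkers();
        // Clean up so the sketch itself does not leak.
        current.close();
        abandoned.forEach(DemoClient::close);
        return live;
    }

    public static void main(String[] args) {
        System.out.println("abandoning old clients: " + reconnect(5, false));
        System.out.println("closing old clients:    " + reconnect(5, true));
    }
}
```

In this sketch the worker count grows with every attempt when old clients are abandoned and stays flat when they are closed. If the client library already reconnects on its own, dropping the outer loop avoids the problem altogether.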

Author

AYue-94 commented Sep 4, 2024

I will try to fix it by removing the Nacos reconnect logic.

Collaborator

lilai23 commented Sep 5, 2024

How many times does the Nacos client retry reconnection? Is it configurable?

Author

AYue-94 commented Sep 5, 2024

The number of reconnection attempts for the Nacos client cannot be configured; it retries without limit.
See https://github.com/alibaba/nacos/blob/2.2.1/common/src/main/java/com/alibaba/nacos/common/remote/client/RpcClient.java#L280

lilai23 added this to the v2.1.0 milestone Sep 27, 2024