Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix MaxListenersExceededWarning - "disconnected listeners" with disconnectEventAudienceMap #78

Draft
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

HarshaNalluru
Copy link
Collaborator

@HarshaNalluru HarshaNalluru commented Feb 22, 2021

Description

MaxListenersExceededWarning - "disconnected listeners" - Cause

  • Observed the following warning while stress testing service-bus JS SDK - [Service Bus] Session based stress tests : MaxListenersExceededWarning Azure/azure-sdk-for-js#12161
    • (node:6245) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 disconnected listeners added to [Connection]. Use emitter.setMaxListeners() to increase limit
  • Session.close() - adds the disconnected listener, and the listener would be removed upon "session_close" event.
  • When there are many sessions(service-bus sessions) with "SessionLockLost" error triggered by the renewlocks, Session.close() is called for each of the sessions in parallel, this leads to many disconnected listeners on the same connection. The listener count went to as high as 700 when dealt with 1000 concurrent sessions. This caused the MaxListenersExceededWarning.
  • For each of the Session.close(), upon "session_close" event, the added listeners were removed one by one. So, it is not fatal, it gets recovered. But momentarily, there are too many listeners to the same "disconnected" event.

Fix

  • Idea is to not add many listeners for the same event - inspiration from what we did for core-amqp at [Core AMQP][Service Bus] Fix "MaxListenersExceededWarning" with the managementLink Azure/azure-sdk-for-js#11749
  • Instead of adding many listeners on the same connection for each of the sessions, the plan is to add only a single listener for the disconnected event on the connection.
  • The connection maintains an active map of session ids with the provided callbacks. Instead of adding a new listener for the disconnected event on the connection, whoever(session) wants to listen to the "disconnected" event would set their id with a callback on that active map maintained at the connection level.
  • All the callbacks are called in a loop if the disconnected event is listened by that single common listener.

TODO

  • Propose the single listener fix for the Session.close() - MaxListenersExceededWarning
  • Add tests for the fix
  • Repro the MaxListenersExceededWarning with the containerized stress test - by repeating [Service Bus] Session based stress tests : MaxListenersExceededWarning Azure/azure-sdk-for-js#12161
  • Generate the service-bus SDK with the rhea-promise package from this PR and repeat the stress test to confirm this fixes the problem
  • Changelog
  • Expand this solution to other places where the "disconnected" event listeners on the connection are being added by senders/receivers/sessions if this is the decided fix

Reference to any GitHub issues

@richardpark-msft @chradek @ramya-rao-a

@chradek
Copy link
Collaborator

chradek commented Mar 10, 2021

I started reviewing but have a couple questions and comments.

  1. Is the goal simply to prevent seeing the node.js warning when a lot of listeners are attached to a single emitter? Did we decide we didn't want to increase the limit to a very large number? (Could we even remove the limit entirely?)

I ask because it looks like we're essentially doing what event listener does: store all the callback functions on an emitter into an object we can iterate over. I want to make sure this isn't also due to a real performance issue because from that standpoint the only piece that looks like it would run faster is removing event listeners, it's just now we manage calling the callbacks versus the emitter calling them.

  1. Why is this change limited to session.close()? Should it apply anywhere where a disconnected event is added? (For example, session creation.)

@HarshaNalluru
Copy link
Collaborator Author

Thanks for the comments, @chradek!!!

Why is this change limited to session.close()? Should it apply anywhere where a disconnected event is added? (For example, session creation.)

Not just session creation. Potentially, any "child" that is adding the listeners on the connection object. I had added a TODO in the description too if we decided to go with this solution.

Is the goal simply to prevent seeing the node.js warning when a lot of listeners are attached to a single emitter? Did we decide we didn't want to increase the limit to a very large number? (Could we even remove the limit entirely?)

Great question. That's the idea technically, and this seemed to be the only way I could prevent the addition of too many listeners.
We have not decided on increasing/removing the limits.

I ask because it looks like we're essentially doing what event listener does: store all the callback functions on an emitter into an object we can iterate over. I want to make sure this isn't also due to a real performance issue because from that standpoint the only piece that looks like it would run faster is removing event listeners, it's just now we manage calling the callbacks versus the emitter calling them.

That's exactly what I was thinking of too.. while implementing this "supposed fix". I'm also unsure if this fixes things..

  • Is comparing the memory being used with and without this PR be a good enough metric?
  • Comparison between the time taken to clear out the close() calls(removing listeners without this PR vs resolving the promises with this PR)?

Any thoughts, @chradek ?

@richardpark-msft
Copy link
Collaborator

@chradek brings up a good point that I hadn't honestly thought about.

In the previous fix, the potential number of registered listeners was one per message (ie, if we were doing "backup" message settlement). So it'd be difficult for us to really know what the cap is, but theoretically it could be something like 2000+.

This particular one seems like it's centered around links - if that's the case then that number is likely to be very low (ie, I think even in our tests we ended up having 11 total (1 over the limit) so we're probably not in bad shape there to just bump up the max listeners.

We actually already do bump up the limit to a much higher default than I anticipated for Senders:
https://github.com/Azure/azure-sdk-for-js/blob/869a72352f16963c243516106334c5f3cccc2e67/sdk/servicebus/service-bus/src/core/messageSender.ts#L295

I would think that the simplicity of just using the built-in default rather than having something custom would outweigh potential efficiency gains, IMO.

@HarshaNalluru
Copy link
Collaborator Author

TODO: Have a parallel PR to bump the listeners limit to a higher number.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants