Memory leak when running from python #560
Comments
I'm not sure... a memory leak is possible. As a workaround I would suggest having a long-running process that spawns a Python subprocess to run batches of scans/domains (using a single Scanner()), and then shuts down the subprocess once the scans are complete.
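A minimal sketch of that workaround, using only the standard library. The names `CHILD_SCRIPT`, `run_batch`, `chunks` and `scan_all` are hypothetical, and the child program below only echoes the hostnames; a real child would create a single Scanner(), queue the batch, and print its results. Because each batch runs in its own short-lived interpreter, any memory leaked during the batch is returned to the OS when that subprocess exits.

```python
import json
import subprocess
import sys

# Hypothetical child program. A real one would run the batch with a single
# Scanner() instance; this placeholder only echoes the hostnames it received.
CHILD_SCRIPT = """
import json, sys
hosts = json.loads(sys.stdin.read())
# Placeholder for the real scan of `hosts`.
json.dump({host: "scanned" for host in hosts}, sys.stdout)
"""


def run_batch(hosts, child_script=CHILD_SCRIPT, timeout=600):
    """Run one batch in a fresh Python subprocess and return its parsed output."""
    proc = subprocess.run(
        [sys.executable, "-c", child_script],
        input=json.dumps(hosts),  # hostnames are passed to the child on stdin
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the child exits with an error
    )
    return json.loads(proc.stdout)


def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def scan_all(hosts, batch_size=10):
    """Long-running parent: one subprocess per batch, results merged in memory."""
    results = {}
    for batch in chunks(hosts, batch_size):
        results.update(run_batch(batch))
    return results
```

The batch size is a trade-off: smaller batches cap memory growth more tightly, at the cost of paying interpreter start-up and import time more often.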
Hello there, I ran into the same issue and thought it might be caused by our wrapper code around SSLyze, but I set up a test script using the example code and still see large amounts of memory being consumed. I have a list of 30 random websites in 'myhosts.txt'. I tried using scanner.get_results() strictly as a generator, but that does not seem to make a difference. Output: Line # Mem usage Increment Occurrences Line Contents
Ubuntu: 20.04. It's happening somewhere between scanner.queue_scans() and scanner.get_results(). @nabla-c0d3 - Thanks so much for all of your work on this amazing library. It has inspired me to dive deeper into Python. I will try digging into the code, but it is certainly pretty complex. Hopefully a solution can be found.
I think I've isolated the issue to mass_scanner's _generate_result_for_completed_server_scan function, with the help of memory-profiler. Specifically, the lines where the finished scan_command's plugin is called to produce a result. (and the attempts that follow in the below except blocks) I'm going to keep poking at this and hopefully figure out what's occurring. For my testing, I've been repeating the CERTIFICATE_INFO scan command on the same host and comparing the memory footprint with psutil.
edit: I didn't consider imports for the first execution, but that still doesn't explain the following 2-3 executions taking a decent amount of memory.
I'm getting closer to (one of) the issues. Over the course of 50 executions of the CERTIFICATE_INFO scan command, evaluating the certificate with all trust stores enabled (this is currently non-configurable) leads to an average leak per execution (ignoring imports from the first run) of roughly 22 KB. When I manually intervened and disabled all trust stores, the average leak per execution was roughly 4.6 KB. The next step will be further investigating _verify_certificate_chain, called here.
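For anyone who wants to repeat this kind of measurement, here is a rough sketch of the averaging. It is not the script used above: `rss_bytes` and `average_growth` are hypothetical helpers, and `run_once` stands for whatever executes a single scan. It uses psutil when it is installed and otherwise falls back to the Unix-only `resource` module, which reports peak rather than current memory, so the fallback can only show growth.

```python
import os
import sys


def rss_bytes():
    """Best-effort resident set size of the current process, in bytes."""
    try:
        import psutil  # third-party; the tool mentioned in this thread
        return psutil.Process(os.getpid()).memory_info().rss
    except ImportError:
        import resource  # Unix-only fallback; this is the *peak* RSS
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        # ru_maxrss is in bytes on macOS and in kilobytes on Linux.
        return peak if sys.platform == "darwin" else peak * 1024


def average_growth(run_once, executions=50, warmup=1, read_memory=rss_bytes):
    """Average memory growth per execution, skipping the first `warmup` runs.

    The warm-up runs absorb one-time costs such as imports, so they do not
    count towards the per-execution average.
    """
    for _ in range(warmup):
        run_once()
    before = read_memory()
    for _ in range(executions):
        run_once()
    return (read_memory() - before) / executions
```

Passing `read_memory` as a parameter makes it possible to swap in a different probe or a fake one for testing. Process-level numbers are noisy, so averaging over many executions, as done above, matters more than any single reading.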
I used the test script from @MoshePerez and https://github.com/bloomberg/memray, and can confirm that the leak is in _verify_certificate_chain(). Instead, I pushed a workaround in #563. Can you try it and confirm that it fixes the problem? Thanks!
@nabla-c0d3 using a similar test script as @MoshePerez, your PR seems to eliminate the leak. I poked at nassl a bit more, and found that Here's a simple test to demonstrate the leaks. I was having issues getting reliable data from psutil with this. With it running continuously with a sleep statement between tests, you'll be able to see the leak in any process monitor.
I can confirm the memory leak, but it is not in sslyze; it's inside https://github.com/nabla-c0d3/nassl
@sconway-datto Please go ahead and open an issue in nassl for this, which references this issue. Thanks!
[#560] Implement workaround for memory leak in CertificateChainVerifier
Fixed as part of v5.0.4.
It looks like memory is not being cleared after each scan. Here is a short snippet to reproduce:
Output:
I also tried adding gc.collect() after each call. It helped reduce the object count, but memory usage still increased. Am I missing something and using the scanner wrong, or is it a memory leak?