Memory leak with unclosed span in opentelemetry integration #2722

Closed
vanchaxy opened this issue Feb 9, 2024 · 5 comments · Fixed by #2801

Comments


vanchaxy commented Feb 9, 2024

How do you use Sentry?

Sentry Saas (sentry.io)

Version

1.40.3

Steps to Reproduce

SentrySpanProcessor stores all open spans in the self.otel_span_map dict. This leads to a memory leak if an OTel span is deleted without being closed, e.g. due to a bug: open-telemetry/opentelemetry-python-contrib#2149

Expected Result

The Sentry span should be deleted once the corresponding OTel span has been garbage-collected.

Actual Result

Sentry spans are stored in self.otel_span_map forever.
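
A minimal sketch of the pattern described above, for readers who want to reproduce it in isolation. FakeOtelSpan and FakeSpanProcessor are hypothetical stand-ins, not the real OpenTelemetry or Sentry classes; only the otel_span_map name comes from the report. It assumes the map is keyed by span ID and that the entry is removed only when the span is ended.

```python
import gc
import weakref


class FakeOtelSpan:
    """Hypothetical stand-in for an OTel span (not the real opentelemetry class)."""

    def __init__(self, span_id):
        self.span_id = span_id


class FakeSpanProcessor:
    """Hypothetical, simplified processor that mimics the reported pattern."""

    def __init__(self):
        # Maps span_id -> Sentry-side span data (a plain dict in this sketch).
        self.otel_span_map = {}

    def on_start(self, otel_span):
        self.otel_span_map[otel_span.span_id] = {"name": "sentry-span"}

    def on_end(self, otel_span):
        # The only code path that removes an entry from the map.
        self.otel_span_map.pop(otel_span.span_id, None)


processor = FakeSpanProcessor()

# A span that is properly ended leaves nothing behind.
closed = FakeOtelSpan("a")
processor.on_start(closed)
processor.on_end(closed)

# A span that is dropped without on_end leaves its entry in the map.
leaked = FakeOtelSpan("b")
processor.on_start(leaked)
leaked_ref = weakref.ref(leaked)
del leaked
gc.collect()

otel_span_collected = leaked_ref() is None      # the OTel span itself is gone
remaining_keys = list(processor.otel_span_map)  # but its map entry is still there
```

The OTel span object is collected, yet the entry keyed by its ID stays in the map because nothing ever calls on_end for it.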

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 2 Feb 9, 2024
sentrivana (Contributor) commented

Thanks for the report @vanchaxy, putting this in our backlog.


Undefined-User commented Mar 7, 2024

Same problem here: the Docker container running our Flask process crashes about once an hour because it runs out of memory, even when no heavy tasks are being performed.

bruno-garcia (Member) commented

Isn't this a P0? It's definitely a P0 in .NET.

@getsantry getsantry bot removed the status in GitHub Issues with 👀 2 Mar 8, 2024
@antonpirker antonpirker self-assigned this Mar 11, 2024
antonpirker added a commit that referenced this issue Mar 12, 2024
OTel spans that are handled in the Sentry span processor can never be finished/closed. This leads to a memory leak. This change makes sure that open spans will be removed from memory after 10 minutes to prevent memory usage from growing constantly.

Fixes #2722

---------

Co-authored-by: Daniel Szoke <szokeasaurusrex@users.noreply.github.com>
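
The time-based approach described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the referenced commit: PruningSpanMap, its methods, and the injectable clock are hypothetical, and only the SPAN_MAX_TIME_OPEN_MINUTES name appears in this thread. The limit is a constructor parameter here purely to make the sketch testable.

```python
import time

SPAN_MAX_TIME_OPEN_MINUTES = 10  # constant name taken from the thread


class PruningSpanMap:
    """Hypothetical helper that drops spans left open longer than a limit."""

    def __init__(self, max_open_minutes=SPAN_MAX_TIME_OPEN_MINUTES,
                 clock=time.monotonic):
        self._max_open_seconds = max_open_minutes * 60
        self._clock = clock
        self._spans = {}  # span_id -> (start_time, sentry_span)

    def add(self, span_id, sentry_span):
        self._spans[span_id] = (self._clock(), sentry_span)

    def pop(self, span_id):
        entry = self._spans.pop(span_id, None)
        return entry[1] if entry else None

    def prune(self):
        """Remove every span opened before the cutoff; return the removed IDs."""
        cutoff = self._clock() - self._max_open_seconds
        stale = [sid for sid, (started, _) in self._spans.items()
                 if started < cutoff]
        for sid in stale:
            del self._spans[sid]
        return stale

    def __len__(self):
        return len(self._spans)


# Usage with a fake clock so the example does not have to wait ten minutes.
now = [0.0]
span_map = PruningSpanMap(clock=lambda: now[0])
span_map.add("old", "span-1")     # opened at t = 0 min
now[0] = 9 * 60
span_map.add("recent", "span-2")  # opened at t = 9 min
now[0] = 11 * 60
pruned = span_map.prune()         # cutoff is t = 1 min
```

Note that pruning is based purely on age: a span that is legitimately still running after the limit is discarded just like a leaked one.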
vanchaxy (Author) commented

That's a very controversial solution. At the very least, make SPAN_MAX_TIME_OPEN_MINUTES configurable. Our system has a lot of long-running jobs that take longer than 10 minutes. I think the proper solution would be to hold weak references to the OTel spans and remove Sentry spans only once the corresponding OTel spans have been deleted.
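
One way the weak-reference idea could look, as a rough sketch rather than a proposal for the SDK's actual code. FakeOtelSpan and WeakTrackingSpanMap are hypothetical names, and the sketch assumes the real OTel span objects support weak references, which is not confirmed anywhere in this thread. It uses the standard library's weakref.finalize to drop the Sentry span when the OTel span is garbage-collected.

```python
import gc
import weakref


class FakeOtelSpan:
    """Hypothetical stand-in for an OTel span (not the real opentelemetry class)."""

    def __init__(self, span_id):
        self.span_id = span_id


class WeakTrackingSpanMap:
    """Hypothetical map that forgets a Sentry span once its OTel span is collected."""

    def __init__(self):
        self._sentry_spans = {}  # span_id -> sentry_span

    def track(self, otel_span, sentry_span):
        span_id = otel_span.span_id
        self._sentry_spans[span_id] = sentry_span
        # The callback and its arguments must not reference otel_span itself,
        # otherwise the finalizer would keep the span alive forever.
        weakref.finalize(otel_span, self._sentry_spans.pop, span_id, None)

    def finish(self, otel_span):
        # Normal path: the span was ended properly.
        return self._sentry_spans.pop(otel_span.span_id, None)

    def __len__(self):
        return len(self._sentry_spans)


span_map = WeakTrackingSpanMap()
long_running = FakeOtelSpan("job")  # still referenced, e.g. a long-running job
abandoned = FakeOtelSpan("lost")    # will be dropped without being ended
span_map.track(long_running, "sentry-job")
span_map.track(abandoned, "sentry-lost")

del abandoned
gc.collect()

remaining = len(span_map)                       # only the live span is left
still_tracked = span_map.finish(long_running)   # long-running span is intact
```

With this design the lifetime of the Sentry span follows the lifetime of the OTel span, so a job that runs longer than 10 minutes keeps its span, while an abandoned span is cleaned up as soon as it is collected.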

szokeasaurusrex added a commit that referenced this issue Mar 13, 2024
* ref: Improve scrub_dict typing (#2768)

This change improves the typing of the scrub_dict method.

Previously, the scrub_dict method's type hints indicated that only dict[str, Any] was accepted as the parameter. However, the method is actually implemented to accept any object, since it checks the types of the parameters at runtime. Therefore, object is a more appropriate type hint for the parameter.

#2753 depends on this change for mypy to pass

* Propagate sentry-trace and baggage to huey tasks (#2792)

This PR enables passing `sentry-trace` and `baggage` headers to background tasks using the Huey task queue.

This allows easily correlating what happens inside a background task with whatever transaction (e.g. a user request in a Django application) queued the task in the first place.

Periodic tasks do not get these headers, because otherwise each execution of the periodic task would be tied to the same parent trace (the long-running worker process).

--- 

Co-authored-by: Anton Pirker <anton.pirker@sentry.io>

* OpenAI integration (#2791)

* OpenAI integration

* Fix linting errors

* Fix CI

* Fix lint

* Fix more CI issues

* Run tests on version pinned OpenAI too

* Fix pydantic issue in test

* Import type in TYPE_CHECKING gate

* PR feedback fixes

* Fix tiktoken test variant

* PII gate the request and response

* Rename set_data tags

* Move doc location

* Add "exclude prompts" flag as optional

* Change prompts to be excluded by default

* Set flag in tests

* Fix tiktoken tox.ini extra dash

* Change strip PII semantics

* More test coverage for PII

* notiktoken

---------

Co-authored-by: Anton Pirker <anton.pirker@sentry.io>

* Add a method for normalizing data passed to set_data (#2800)

* Discard open spans after 10 minutes (#2801)

OTel spans that are handled in the Sentry span processor can never be finished/closed. This leads to a memory leak. This change makes sure that open spans will be removed from memory after 10 minutes to prevent memory usage from growing constantly.

Fixes #2722

---------

Co-authored-by: Daniel Szoke <szokeasaurusrex@users.noreply.github.com>

* ref: Event Type (#2753)

Implements type hinting for Event via a TypedDict. This commit mainly adjusts type hints; however, there are also some minor code changes to make the code type-safe following the new changes.

Some items in the Event could have their types expanded by being defined as TypedDicts themselves. These items have been indicated with TODO comments.

Fixes GH-2357

* Fix mypy in `client.py`

* Fix functools import

* Fix CI config problem

... by running `python scripts/split-tox-gh-actions/split-tox-gh-actions.py`

---------

Co-authored-by: Christian Schneider <christian@cnschn.com>
Co-authored-by: Anton Pirker <anton.pirker@sentry.io>
Co-authored-by: colin-sentry <161344340+colin-sentry@users.noreply.github.com>
DanielNoord commented

@antonpirker Is there any interest in making this configurable? I have spent countless hours trying to understand why my OpenTelemetry spans were missing their root spans, and it now turns out it is because of a hidden global constant.
