Fork process models for metric batch reader do not work #2767

lzchen · 2022-06-16T16:45:01Z

and when a new process is not created (simply a new fork is created).

When does this happen?

One thing I am still thinking about is how to do the same for PeriodicExportingMetricReader. Unlike tracing/logging, where the events are received by the processor, the PMR collects them with the collect function. There is nothing like emit or on_start/on_end where I can check the pids to see whether the fork hook was invoked, and invoke it if not.

Originally posted by @srikanthccv in #2277 (comment)
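For context, the pid check mentioned in the quote can be sketched roughly as follows. This is only an illustration of the pattern for an event-driven processor, not the SDK's code: `PidCheckingProcessor`, `emit`, `flush`, `restarts`, and the injectable `get_pid` parameter are all hypothetical names. A pull-based reader has no equivalent of `emit` in which to place this check, which is the gap described above.

```python
import os
import threading


class PidCheckingProcessor:
    """Hypothetical event-driven processor; not an OpenTelemetry class."""

    def __init__(self, interval_seconds=5.0, get_pid=os.getpid):
        self._interval = interval_seconds
        self._get_pid = get_pid  # injectable so the check can be exercised without forking
        self._lock = threading.Lock()
        self._queue = []
        self.restarts = 0
        self._start_worker()

    def _start_worker(self):
        # Remember which process owns the background thread.
        self._pid = self._get_pid()
        self._stop = threading.Event()
        self._thread = threading.Thread(
            target=self._run, args=(self._stop,), daemon=True
        )
        self._thread.start()

    def _run(self, stop):
        # Flush once per interval until this worker's stop event is set.
        while not stop.wait(self._interval):
            self.flush()

    def flush(self):
        # Swap the queue out under the lock and return the drained batch.
        with self._lock:
            batch, self._queue = self._queue, []
        return batch  # a real processor would hand this to an exporter

    def emit(self, item):
        # Threads do not survive fork(), so a changed pid means the
        # background thread is missing in this process and must be restarted.
        if self._get_pid() != self._pid:
            self._start_worker()
            self.restarts += 1
        with self._lock:
            self._queue.append(item)

    def shutdown(self):
        self._stop.set()
```

The standard library also offers `os.register_at_fork(after_in_child=...)` as a way to run a callback in the child after `fork()`, which does not depend on an event arriving first.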


PeterJCLaw · 2023-05-11T20:05:57Z

Is this specific to uWSGI? If so could uWSGI's own post-fork mechanisms be used to provide support?

srikanthccv · 2023-05-12T17:35:11Z

What do you mean by could uWSGI's own post-fork mechanisms be used to provide support?

PeterJCLaw · 2023-05-13T13:39:12Z

Oh, sorry, I'd misread the conversation which this was extracted from. Somehow I had thought that this issue (#2767) was essentially reporting the issue that that PR (#2277) was actually fixing. I was then suggesting that using uWSGI's decorators could help solve that issue by providing on-fork hooks compatible with uWSGI.

I suspect that uWSGI's fork hooks might still be useful here (at least when uWSGI is being used), though obviously this is broader than that.

srikanthccv · 2023-05-14T03:29:56Z

Yes, we have documented it here https://github.com/open-telemetry/opentelemetry-python/tree/main/docs/examples/fork-process-model#uwsgi-postfork-decorator. Since the choice of application server is outside the control of OpenTelemetry, we documented how to work around the current limitations with the postfork hooks. That example no longer represents the current state of our SDK with gunicorn/uWSGI. Now, only the metrics SDK doesn't work, so the docs should be updated to reflect that.
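A minimal sketch of what a postfork workaround for metrics might look like, assuming the application runs under uWSGI and the `opentelemetry-sdk` package is installed. The function name `init_metrics`, the choice of `ConsoleMetricExporter`, and the no-op fallback decorator are illustrative assumptions, not taken from the linked example.

```python
try:
    # uwsgidecorators is only importable inside a uWSGI process.
    from uwsgidecorators import postfork
except ImportError:
    def postfork(func):
        # Assumed fallback for this sketch: outside uWSGI nothing is registered.
        return func


@postfork
def init_metrics():
    # Under uWSGI this runs in each worker after fork, so the reader's
    # export thread is created in the process that records the metrics.
    # Imports are kept local (a choice made for this sketch) so the module
    # can be loaded without touching the SDK in the master process.
    from opentelemetry import metrics
    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.sdk.metrics.export import (
        ConsoleMetricExporter,
        PeriodicExportingMetricReader,
    )

    reader = PeriodicExportingMetricReader(ConsoleMetricExporter())
    metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))
```

For this to help, the meter provider should not already have been set in the master process, since the global provider can only be set once.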

…worker processes (#243)

### Issue:
OTel Python [has issues](open-telemetry/opentelemetry-python#2767) where the SDK is unable to report metrics for applications using a fork process model WSGI server. This affects ADOT when it tries to generate the OTel or Application Signals metrics.

A solution to this is to [re-initialize the SDK in the worker processes after the process forking has happened](https://opentelemetry-python.readthedocs.io/en/latest/examples/fork-process-model/README.html). A small caveat is that if the SDK has been initialized in the master process, the worker process SDK won't work because Tracer/Meter providers can be set globally only once. So to circumvent this, we need to skip initializing the SDK in the master process and only do so in the worker processes.

### Description of changes:
- Introducing an opt-in configuration environment variable `OTEL_AWS_PYTHON_DEFER_TO_WORKERS_ENABLED` that users can enable if they are using a WSGI (or a fork process model) server and want the ADOT SDK to defer auto-instrumentation to worker processes.
- Whenever the ADOT SDK auto-instrumentation is loaded (either via the `sitecustomize.py` file or the `opentelemetry-instrument` command), the SDK will check if the above configuration is enabled and if the current process is the master process, and if so will skip the instrumentation.
- The way we determine if the current process is master or worker is by using an internal marker environment variable `IS_WSGI_MASTER_PROCESS_ALREADY_SEEN`. The first time the ADOT SDK sees a Python process, this env var is not set, so it knows this should be a WSGI master process. We then set the env var, and when a new worker process forks, the master environment (including the env var) is copied to it. So when the ADOT SDK checks this env var again (in the worker) it finds that the env var was already set to `true` in the master.

### Testing:
- Unit tests covering the functionality based on different configurations of the `OTEL_AWS_PYTHON_DEFER_TO_WORKERS_ENABLED` and `IS_WSGI_MASTER_PROCESS_ALREADY_SEEN` variables.
- Manual test using a sample application. Since this is an opt-in configuration (a 2-way door), testing manually gives us a fair bit of confidence.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
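The master/worker detection described above could look roughly like the following sketch. The two environment variable names come from the description; the function name `should_skip_instrumentation`, the optional `environ` parameter, and the case-insensitive `"true"` parsing of the opt-in flag are assumptions made for illustration, not the merged implementation.

```python
import os

DEFER_FLAG = "OTEL_AWS_PYTHON_DEFER_TO_WORKERS_ENABLED"
SEEN_MARKER = "IS_WSGI_MASTER_PROCESS_ALREADY_SEEN"


def should_skip_instrumentation(environ=None):
    """Return True when this process looks like the master and deferral is on.

    Hypothetical helper; `environ` defaults to os.environ and is a parameter
    only so the logic can be exercised with a plain dict.
    """
    env = os.environ if environ is None else environ

    # Deferral is opt-in: without the flag, instrument every process.
    if env.get(DEFER_FLAG, "false").strip().lower() != "true":
        return False

    # Marker already present: a parent process set it before this one
    # inherited the environment, so this is treated as a worker.
    if env.get(SEEN_MARKER, "").strip().lower() == "true":
        return False

    # First process to get here is treated as the master. Setting the marker
    # means workers forked later inherit it through the copied environment.
    env[SEEN_MARKER] = "true"
    return True
```

Called once at auto-instrumentation load time, this returns `True` in the master (which then skips instrumentation) and `False` in workers that inherited the marker.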

PeterJCLaw mentioned this issue May 10, 2023

Clarify support for forking process models #3307

Open

ocelotl mentioned this issue Jun 7, 2023

Update postfork example #3339

Open

srikanthccv mentioned this issue Jul 25, 2023

How to use auto-instrumentation for Uvicorn with multiple worker processes open-telemetry/opentelemetry-python-contrib#385

Open

srprash mentioned this issue Aug 29, 2024

Opt-in config and logic for deferral of instrumentation to only WSGI worker processes aws-observability/aws-otel-python-instrumentation#243

Merged
