
Clarify support for forking process models #3307

Open
PeterJCLaw opened this issue May 10, 2023 · 12 comments

Comments

@PeterJCLaw

https://opentelemetry-python.readthedocs.io/en/latest/examples/fork-process-model/README.html indicates that BatchSpanProcessor does not support forking process models, including those used by Gunicorn and uWSGI. However, those docs (#1609) pre-date other changes which appear to have added support for forking process models, including:

which seem specifically targeted at resolving the described issues.

What is the current expected state of support for the various forking models?
Either way, updating the docs to clarify the actual situation would be appreciated.

Possibly related issues:

@aabmass
Member

aabmass commented May 11, 2023

Thanks for opening this issue, I think it is an important topic.

AFAIK, things should be working for tracing at this point. Metrics are a little more complicated, and #2837 is definitely still valid. It would be great if we could clean up old issues/documentation, and we'd definitely appreciate help on this if you have the time @PeterJCLaw.

@srikanthccv last worked on this, do you know if any of the linked bugs can be closed out now?

@srikanthccv
Member

@PeterJCLaw
Author

Thanks for those additional details.

  • I think the 4th linked issue can be closed; however, it's better to verify once before closing.

What does verification look like here? Given that this is a forking issue, I'd assumed the failure mode was race conditions and the like -- timing-based and hard to reproduce or to be confident is fixed just from testing. Is there a known reproduction of the failure mode, or something we can do to check that it's working as intended?
(Or is it a case of code inspection?)

@srikanthccv
Member

The verification can be pretty simple. If you run a simple gunicorn-based application with multiple workers (without following the special instructions mentioned in the docs), it should export the traces/metrics/logs. If they are not exported (to the collector, the console, etc.), then the issue still exists.
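
For anyone wanting to run that check, a minimal sketch of what it could look like (the module name app.py, the use of ConsoleSpanExporter, and the worker count are assumptions for illustration, not a prescribed setup):

```python
# app.py -- minimal WSGI app instrumented with a BatchSpanProcessor.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)


def application(environ, start_response):
    # Emit one span per request so the exporter has something to batch.
    with tracer.start_as_current_span("handle-request"):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok\n"]
```

Running something like `gunicorn --workers 4 app:application` (no --preload, no post_fork hook) and hitting it a few times should then print spans from every worker if forking is handled.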

@PeterJCLaw
Author

PeterJCLaw commented May 11, 2023

Hrm, ok. I'm currently running gunicorn with the default worker type and that does seem to work. My reading of gunicorn's docs indicates that the master process doesn't actually load the application though -- only the workers do (after the fork happens), unless you use the --preload option (which I'm currently not doing).

I'll see if I can find some time to validate the behaviours under gunicorn & uWSGI.

Edit for clarity: my use of gunicorn without --preload works fine -- I get tracing reported even though I'm using BatchSpanProcessor.

@srikanthccv
Member

I am confused now. Is it working or not working with gunicorn?

@PeterJCLaw
Author

My current setup is using gunicorn without calling --preload and that seems to work.
Not using --preload under gunicorn seems to be the default, though the OpenTelemetry docs seem to imply the inverse -- that gunicorn will preload the application before forking.

@PeterJCLaw
Author

I've just tested a toy application under gunicorn and uWSGI, with and without their respective --preload and --lazy-apps modes, using https://gist.github.com/PeterJCLaw/5e6fd585d0c620496c82ca926ce50f67. All four configurations seem to work ok -- I get traces printed out to the console which seem reasonable.

I'm not sure how to validate that the desired batching is happening, though I can see that the traces are often not printed until several requests later, which suggests that it is working.

@srikanthccv
Member

> I've just tested a toy application under gunicorn and uWSGI, with and without their respective --preload and --lazy-apps modes, using https://gist.github.com/PeterJCLaw/5e6fd585d0c620496c82ca926ce50f67. All four configurations seem to work ok -- I get traces printed out to the console which seem reasonable.

Thank you, this confirms the above statement that the trace and log batch processors work all the time.

> I'm not sure how to validate that the desired batching is happening, though I can see that the traces are often not printed until several requests later, which suggests that it is working.

They are emitted if the queue becomes full or the batch timeout elapses. I think this is working fine.
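
For reference, those two triggers map onto constructor arguments of the processor; a sketch with what I believe are the current defaults (worth double-checking against the SDK version in use):

```python
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Spans are flushed when max_export_batch_size spans are queued or when
# schedule_delay_millis elapses, whichever happens first.
processor = BatchSpanProcessor(
    ConsoleSpanExporter(),
    max_queue_size=2048,         # spans beyond this many pending are dropped
    schedule_delay_millis=5000,  # flush at least every 5 seconds
    max_export_batch_size=512,   # at most this many spans per export call
)
```

Lowering schedule_delay_millis in a test app makes the batching easier to observe without waiting several requests.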

@PeterJCLaw
Author

PeterJCLaw commented Jun 6, 2023

Unfortunately, while it looks like the BatchSpanProcessor does actually work fine, the PeriodicExportingMetricReader appears not to. I've not tested it under all the various configurations previously discussed, however I have observed it not working correctly in uWSGI's default mode (i.e. where it imports the WSGI app before forking). In that case the ticker thread only exists in the main process, presumably meaning that metrics are only collected from that main process. I'd guess this is due to PeriodicExportingMetricReader using the approach from #2242 rather than the enhanced approach introduced in #2277.

Edit: using uWSGI's --lazy-apps to delay the import does get PeriodicExportingMetricReader running in all the worker processes.
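
If it helps others, a sketch of the kind of workaround this points at for uWSGI without --lazy-apps: defer the metrics setup into each worker via uwsgidecorators.postfork. This mirrors the post-fork pattern the existing docs use for tracing rather than anything officially documented for metrics, and the console exporter plus export interval are just placeholders:

```python
from uwsgidecorators import postfork

from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)


@postfork
def init_metrics():
    # Runs in each worker after the fork, so the reader's ticker thread is
    # created in the worker itself rather than only in the uWSGI master.
    reader = PeriodicExportingMetricReader(
        ConsoleMetricExporter(), export_interval_millis=10_000
    )
    metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))
```

Note that uWSGI generally needs --enable-threads for the background export thread to run at all.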

@PeterJCLaw
Author

PeterJCLaw commented Jun 6, 2023

Hrm, the uWSGI side of this may be somewhat resolvable by users. Ish. Via https://stackoverflow.com/a/72601192 and unbit/uwsgi#2388 it seems that uWSGI has some extra options which can be used to enable os.register_at_fork to work -- specifically --py-call-osafterfork. There's also --enable-threads, which seems to be needed for workers to have threads at all.

Unfortunately, in local testing I'm seeing that using those two options together results in uWSGI not serving any requests (at all), while using just --py-call-osafterfork seems to lock up one of the two workers I'm running with. I'm not sure yet whether this is a uWSGI issue generally (it definitely feels like it could be) or something to do with threads; I doubt it's OpenTelemetry-specific. Posting here as hopefully useful to others hitting this.
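
For background (and as an assumption about why --py-call-osafterfork matters at all): os.register_at_fork is the stdlib hook that the SDK's fork handling appears to rely on, and callbacks registered with it only fire when the fork is visible to the Python runtime, which is what that uWSGI flag seems to opt into. A minimal illustration of the hook itself:

```python
import os


def _after_fork_in_child():
    # In the SDK this is roughly the point where a batch-export worker thread
    # would be recreated, since threads do not survive fork() in the child.
    print(f"fork hook ran in pid {os.getpid()}")


os.register_at_fork(after_in_child=_after_fork_in_child)

if os.fork() == 0:
    # Child process: the registered hook has already run inside fork().
    os._exit(0)
```

Since uWSGI forks outside Python by default, it seems hooks registered this way never run in its workers unless that flag is set.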

@SharifiFaranak

+1 to updating the docs here, since BatchSpanProcessor is working well with Gunicorn. It cost me more than a few hours trying to get the post_fork hook to work (which didn't end up working) when it wasn't necessary.
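
For completeness, the post_fork pattern from the fork-process-model docs looks roughly like this in gunicorn.conf.py (a sketch; the console exporter stands in for a real one). Per the discussion above it appears unnecessary for traces, though it may still matter for metrics:

```python
# gunicorn.conf.py
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter


def post_fork(server, worker):
    # Gunicorn calls this once in each worker right after forking, so the
    # BatchSpanProcessor's background thread is created in the worker itself.
    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
```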
