Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to use auto-instrumentation for Uvicorn with multiple worker processes #385

Open
adamantike opened this issue Mar 27, 2021 · 17 comments
Labels
backlog bug Something isn't working

Comments

@adamantike
Copy link
Contributor

Describe your environment

Using the following Dockerfile to simplify testing:

FROM tiangolo/uvicorn-gunicorn-fastapi:python3.8-slim

RUN pip install --no-cache-dir \
      opentelemetry-distro==0.19b0 \
      opentelemetry-instrumentation-fastapi==0.19b0

Save the file as Dockerfile, and build the Docker image it using the following command in the same folder:

docker build -t test-uvicorn .

Steps to reproduce

The Uvicorn server only has one registered endpoint at the root path. Running the server, and then accessing the endpoint (in this example, at http://localhost:5000/) will display any collected spans in the terminal.

Running the Uvicorn server with multiple workers disabled, and OpenTelemetry auto-instrumentation:

docker run --rm -p 5000:5000 test-uvicorn opentelemetry-instrument --trace-exporter console_span uvicorn main:app --host 0.0.0.0 --port 5000

Running the Uvicorn server with multiple workers enabled, and OpenTelemetry auto-instrumentation:

docker run --rm -p 5000:5000 test-uvicorn opentelemetry-instrument --trace-exporter console_span uvicorn main:app --host 0.0.0.0 --port 5000 --workers 2

What is the expected behavior?

Both modes (single-worker, and multi-worker) must correctly auto-instrument the application, and display collected spans.

What is the actual behavior?

Only the single-worker command displays collected spans. Multi-worker Uvicorn does not display any spans.

Additional context

Uvicorn uses multiprocessing to spawn workers. (https://github.com/encode/uvicorn/blob/bf1c64e2c1/uvicorn/main.py#L381-L384, and https://github.com/encode/uvicorn/blob/bf1c64e2c1/uvicorn/subprocess.py#L38).
Could be related to the same issue that is blocking Gunicorn from working (#171).

Submitting the issue here, as apparently the auto-instrumentation code will be moved to this repository, based on open-telemetry/opentelemetry-python#1532.

@adamantike adamantike added the bug Something isn't working label Mar 27, 2021
@srikanthccv
Copy link
Member

If you are using Gunicorn as your process manager this should help you https://opentelemetry-python.readthedocs.io/en/latest/examples/fork-process-model/README.html.

@adamantike
Copy link
Contributor Author

This example is not using Gunicorn to run the web server, but if I understand correctly, the pre-fork web server model is the same for Uvicorn. If that's the case, it would be just a matter of finding the post-fork hook in Uvicorn, and applying a similar solution to the ones provided for Gunicorn and uWSGI.

The main question here then becomes: is auto-instrumentation possible in the pre-fork web server model? (at least, when opentelemetry-instrument does not receive any options and is configured completely from environment variables).

Similar to what opentelemetry-instrument already does to automatically configure Celery workers (https://github.com/open-telemetry/opentelemetry-python/blob/b9c0b6f8827895e23eeff1f253f4c502154d1873/opentelemetry-instrumentation/src/opentelemetry/instrumentation/auto_instrumentation/sitecustomize.py#L96-L105).

@adamantike
Copy link
Contributor Author

cc @owais, who implemented open-telemetry/opentelemetry-python#1036

@adamantike
Copy link
Contributor Author

adamantike commented Mar 27, 2021

With Gunicorn, I started receiving spans with the following post_fork hook:

def post_fork(server, worker):
    from opentelemetry.instrumentation.auto_instrumentation import sitecustomize

Previously running export OTEL_TRACES_EXPORTER=console_span, as the command to run Gunicorn shouldn't need to include opentelemetry-instrument, because that would just instrument the parent process.

It's important to note that importing the sitecustomize file already works, because it triggers the initialize() execution: https://github.com/open-telemetry/opentelemetry-python/blob/b9c0b6f8827895e23eeff1f253f4c502154d1873/opentelemetry-instrumentation/src/opentelemetry/instrumentation/auto_instrumentation/sitecustomize.py#L109

@adamantike
Copy link
Contributor Author

adamantike commented Mar 27, 2021

With Uvicorn, it seems to be more complex, as it lacks the required hooks. I started receiving spans after manually editing the subprocess_started function, to run the sitecustomize import before anything else:

def subprocess_started(config, target, sockets, stdin_fileno):
    from opentelemetry.instrumentation.auto_instrumentation import sitecustomize
    # ...

However, if an easy way is found to auto-instrument Gunicorn, IMHO there's no need to work on barebones Uvicorn, but instead to use the Uvicorn worker class for Gunicorn.

@owais
Copy link
Contributor

owais commented Mar 27, 2021

I haven't tried or read the links shared above in detail but isn't the issue about batch span processor and not about instrumentation? Can you try using simple span processor and see if that works to clarify if we are talkig about the same problem?

@github-actions
Copy link

This issue was marked stale due to lack of activity. It will be closed in 30 days.

@github-actions
Copy link

Closed as inactive. Feel free to reopen if this issue needs resolving.

@TBBle
Copy link
Contributor

TBBle commented Jun 7, 2022

Should this be reopened, as uvicorn --workers (or running uvicorn with the WEB_CONCURRENCY env-var set, or --reload (which also uses pre-forking under the hood) was not resolved, and might need upstream support for the appropriate hook?

Presumably it'd be similar to whatever #676 does, if we had the hook (or monkeypatched the appropriate code in uvicorn).

@bilbof
Copy link

bilbof commented Jul 29, 2022

I've experienced this issue, and this should probably be re-opened as it should be possible to autoinstrument uvicorn applications.

I can confirm that the post fork for uvicorn fixes the issue, but does require app code changes. I did this with a middleware

# instrumentation.py
from opentelemetry.instrumentation.auto_instrumentation import sitecustomize
from starlette.middleware.base import BaseHTTPMiddleware
class OpenTelemetryMiddleware(BaseHTTPMiddleware):
    def self.__init__(self, app):
        super().__init__(app)
    async def dispatch(self, req, call_next):
        response = await call_next(req)
        return response

# app.py
from instrumentation import OpenTelemetryMiddleware
app.add_middleware(OpenTelemetryMiddleware)

The fix I actually went with was to move from uvicorn app:app --workers 4 to gunicorn app:app -w 4 -k uvicorn.workers.UvicornWorker, since Gunicorn is supported, and I didn't want to have to write a library / modify an app too much to get instrumentation working.

Demo showing that uvicorn --workers auto-instrumentation isn't currently working:

git clone git@github.com:bilbof/fastapi-opentel-demo.git && cd fastapi-opentel-demo/demos/autoinstrument
docker build -t fastapi-demo .
docker run -it -p 8000:8000 fastapi-demo opentelemetry-instrument --metrics_exporter=none --traces_exporter=console uvicorn app:app --host=0.0.0.0
# go to localhost:8000 and you'll see traces logged to the console
docker run -it -p 8000:8000 fastapi-demo opentelemetry-instrument --metrics_exporter=none --traces_exporter=console uvicorn app:app --host=0.0.0.0 --workers=2
# go to localhost:8000 and you will not see traces logged to the console

@srikanthccv srikanthccv reopened this Jul 30, 2022
@srikanthccv
Copy link
Member

Please try running with latest rc2 version and let us know if it changes anything.

@bilbof
Copy link

bilbof commented Jul 31, 2022

@srikanthccv sure, here's a POC using rc2 which demonstrates the problem: https://github.com/bilbof/fastapi-opentel-demo/tree/main/demos/autoinstrument

git clone git@github.com:bilbof/fastapi-opentel-demo.git && cd fastapi-opentel-demo/demos/autoinstrument
docker build -t demo .

# go to localhost:8000 and you'll see traces logged to the console
docker run -it -e WORKERS=1 -p 8000:8000 demo

# go to localhost:8000 and you will not see traces logged to the console
docker run -it -e WORKERS=2 -p 8000:8000 demo

pip freeze:

opentelemetry-api==1.12.0rc2
opentelemetry-distro==0.32b0
opentelemetry-instrumentation==0.32b0
opentelemetry-instrumentation-fastapi==0.32b0
opentelemetry-sdk==1.12.0rc2

AFAIC 0.32b0 is equivalent to 1.12.0rc2, but let me know if I'm wrong and there's a more recent version I can try.

@bilbof
Copy link

bilbof commented Jul 31, 2022

The problem is caused by Uvicorn using multiprocessing.spawn rather than multiprocessing.fork (which uses os.fork), so the OTel post fork hook isn't running.

It'd be great if we could fix this here rather than at uvicorn, since others are experiencing the same problem with multiprocessing (e.g. #2185). Would be interested in your thoughts on how to resolve this.

Also, is the fork process model doc still accurate? The workaround in this case is to use gunicorn to run uvicorn apps (which is recommended by Uvicorn) which I've confirmed has worked (see my demo). So issue #2038 could be closed.

@abhinavramana
Copy link

@srikanthccv Can this be resolved soon? We need some uvicorn features that gunicorn doesn't support properly like websockets

@srikanthccv
Copy link
Member

The fork process model documentation is not accurate. I left my comments here open-telemetry/opentelemetry-python#2767 (comment) and here open-telemetry/opentelemetry-python#3307. The logs and traces SDK should always work regardless of the application server used (gunicorn, uvicorn, uwsgi etc...), but the metrics SDK won't work unless there is a hook. I don't know of any fix that can be done in SDK to make it work with Uvicron (or any application server). I guess any threaded application that uses Uvicorn has a high chance of deadlock if there is no way to sanitize the locks&state. I believe upstream should support hooks.

@Mustafakhn
Copy link

I am also experiencing the same issue with auto instrumentation and uvicorn with the --workers flag it seems like this issue is still not resolved as I wasn't able to find any solution regarding this on the internet if anyone found any workaround for this problem or any solution regarding this issue please let me know the solution it would be really helpful Thankyou.

@lyonsy
Copy link

lyonsy commented Mar 1, 2024

Any updates here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backlog bug Something isn't working
Projects
None yet
Development

No branches or pull requests

8 participants