Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce job-runner required dependencies, as shipping them is painful. #746

Open
bloodearnest opened this issue Sep 30, 2024 · 0 comments
Open

Comments

@bloodearnest
Copy link
Member

bloodearnest commented Sep 30, 2024

Required production dependencies are expensive for job-runner, for two reasons

  1. not currently deployed in docker, so need deps shipping to backends via awkward alternate methods

  2. we need to ship job-runner in opensafely cli, which is currently shipped via pip install, and we vendor the dependencies in order to avoid problems installing them all via pip on users computers.

We are dockerising 1, so that removes that issue, but 2 remains an issue.

The current prod dependency tree looks like this:

opensafely-pipeline==2024.3.19.153938
├── pydantic [required: <2, installed: 1.10.12]
│   └── typing_extensions [required: >=4.2.0, installed: 4.7.1]
└── ruyaml [required: Any, installed: 0.91.0]
    ├── distro [required: >=1.3.0, installed: 1.8.0]
    └── setuptools [required: >=39.0, installed: 65.3.0]
opentelemetry-exporter-otlp-proto-http==1.12.0
├── backoff [required: >=1.10.0,<3.0.0, installed: 2.1.2]
├── googleapis-common-protos [required: ~=1.52, installed: 1.56.4]
│   └── protobuf [required: >=3.15.0,<5.0.0dev, installed: 3.20.2]
├── opentelemetry-api [required: ~=1.3, installed: 1.12.0]
│   ├── Deprecated [required: >=1.2.6, installed: 1.2.13]
│   │   └── wrapt [required: >=1.10,<2, installed: 1.14.1]
│   └── setuptools [required: >=16.0, installed: 65.3.0]
├── opentelemetry-proto [required: ==1.12.0, installed: 1.12.0]
│   └── protobuf [required: ~=3.13, installed: 3.20.2]
├── opentelemetry-sdk [required: ~=1.11, installed: 1.12.0]
│   ├── opentelemetry-api [required: ==1.12.0, installed: 1.12.0]
│   │   ├── Deprecated [required: >=1.2.6, installed: 1.2.13]
│   │   │   └── wrapt [required: >=1.10,<2, installed: 1.14.1]
│   │   └── setuptools [required: >=16.0, installed: 65.3.0]
│   ├── opentelemetry-semantic-conventions [required: ==0.33b0, installed: 0.33b0]
│   ├── setuptools [required: >=16.0, installed: 65.3.0]
│   └── typing_extensions [required: >=3.7.4, installed: 4.7.1]
└── requests [required: ~=2.7, installed: 2.25.0]
    ├── certifi [required: >=2017.4.17, installed: 2020.11.8]
    ├── chardet [required: >=3.0.2,<4, installed: 3.0.4]
    ├── idna [required: >=2.5,<3, installed: 2.10]
    └── urllib3 [required: >=1.21.1,<1.27, installed: 1.26.5]

Requests is currently required, however our use of it is so simple, and we control both endpoints, that it might be worth considering writing a simpler urllib client to replace it.

Pipeline is dropping pydantic, so that helps some, although we still need ruyaml and friends.

The opentelemetry deps are the big ones, due to the packaging strategy of the otel python libraries (which is bs).

Ideally, we'd make these optional deps with a telemetry pip extra, that we only install in the docker image by default, and not vendor in opensafely-cli. We could enable away to try install non-vendored versions of these deps if we do want to enable client side telemetry, but its not required. But acheiving this might be tricky to actually implement, as by definition, telemetry capturing code is used all over the codebase.

With a bit of work, we could potentially get the list of dependencies down to a fraction of its current size.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant