Add ability to do OS and language package installs when running --network=none #3246

Open
deitch opened this issue Nov 1, 2022 · 2 comments

@deitch
Contributor

deitch commented Nov 1, 2022

Summary: provide a way to install arbitrary packages - both OS and language - without using RUN.

The ability to disable network via --network=none is important when building. This is especially powerful for compliance, source tracking, and reproducible builds.

If I can RUN anything, then the number of ways I can install something is effectively unbounded. This makes it impossible to parse a Dockerfile and discover every source that went into the image, which can lead to license compliance and security scanning issues.

Did I do RUN apk add? RUN apt install? RUN git clone? RUN curl -O https://example.com/foo.tgz? RUN go build (which may or may not have downloaded)? RUN my-command.sh which calls other-cmd.sh which calls some binary which downloads software?

The curl case is already covered by ADD with a URL; the upcoming git support in ADD takes it a step further.

What happens when I need to install OS packages? Language-specific packages?

The key here is not "no network access", but rather "no arbitrary network access", which --network=none provides, by eliminating access for RUN, not for ADD.
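As a minimal sketch of that split using syntax that already exists: the archive is declared with ADD and fetched by the builder, while the RUN step that consumes it has no network. The base image tag and the paths are placeholders, and the URL is the example one from above.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.16

# Fetched by the builder itself, so the source is visible to anyone parsing the Dockerfile.
# Archives added from a remote URL are not unpacked automatically.
ADD https://example.com/foo.tgz /tmp/foo.tgz

# No network in this step: it can only use what was declared above.
RUN --network=none mkdir -p /opt/foo && tar -xzf /tmp/foo.tgz -C /opt/foo
```

Here the per-instruction RUN --network=none flag stands in for a build-wide --network=none, so the intent is visible in the Dockerfile itself.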

At first blush, I would propose that we either extend ADD or use a new command like INSTALL (see this moby issue) that can install arbitrary package types. The same way I can ADD a file from a URL or a git repo, I could ADD an apk package, an apt package, go mod download, npm install, etc.

In terms of syntax, I could see something like:

ADD --type=apk bash=1.2.3

Or just as easily bash@1.2.3 or bash#1.2.3.

For languages, each language usually has a standard format for "install all my dependencies", so:

ADD --type=go /workdir
# OR
WORKDIR /workdir
ADD --type=go .

The above would run go mod download. We could extend it to:

ADD --type=npm /workdir

which would run npm install.

We could just as easily use INSTALL instead of ADD, if that fits better.

Either way, the goal would be to get the benefits of --network=none while working with the various needs - OS and language - to install dependencies.
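For comparison, the closest approximation for OS packages with existing syntax might look like the sketch below. It assumes that apk fetch can download a package together with its dependencies and that apk add can install local .apk files with --no-network; the image tag, stage name, and /pkgs path are placeholders. The fetch step is still an ordinary RUN with network access, which is exactly the gap described above.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.16 AS fetch
# The only step that touches the network: download bash and its dependencies as .apk files.
RUN mkdir -p /pkgs && apk fetch --no-cache --recursive --output /pkgs bash

FROM alpine:3.16
# Install from the pre-fetched files only. The bind mount keeps them out of the final image.
RUN --network=none --mount=type=bind,from=fetch,source=/pkgs,target=/pkgs \
    apk add --no-network /pkgs/*.apk
```

Nothing in this sketch tells a scanner which packages were installed without interpreting the shell command, so it narrows network use but does not make the dependency declarative.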

I am aware that, at least for languages, we could download all of those things and commit them to git (vendoring of various kinds), but that isn't always a realistic option, e.g.

ADD https://github.com/some/project.git#v1.2.3 .
RUN go build

The above will fail if it is a 3rd-party project that does not have everything vendored.

As discussed in community Slack with @jedevc and @AkihiroSuda

@tonistiigi
Member

I don't think we will do custom flags for every possible software ecosystem.

The problem of installing packages via RUN is real, though, and it would be much better to track them precisely, validate immutability, and enable offline installs. There is some discussion of it at https://dockercommunity.slack.com/archives/C7S7A40MP/p1665627057564009?thread_ts=1665623364.830339&cid=C7S7A40MP

So I think the missing pieces for this at the moment are #3240 and a follow-up buildinfo refactor to track the dependencies precisely. Then we will need #2943 (review) for the ability to guarantee the reuse of immutable sources.

These should fix it for the LLB layer and make it possible to achieve what you are describing using a custom buildkit frontend.

From there we can discuss how you could still use Dockerfile but include language-specific workflows (possibly from other frontends). Some options are described in https://docs.google.com/document/d/1IafaByYR_Ao5ENw8bvIh8rbSbyh-mCAieNRrUOD3tAA/edit

@deitch
Contributor Author

deitch commented Nov 1, 2022

> I don't think we will do custom flags for every possible software ecosystem.

That is fine. I was not convinced my way was the best way (or even a good way); I was mostly illustrating that "this is a real problem" and "here is one approach".

> The problem of installing packages via RUN is real, though, and it would be much better to track them precisely

Yeah, I was thinking about it as a "2 step" concept as well. Part of the issue is that just about every language has some form of "get build sources" combined with "do the build", and each is its own arbitrary command: go build, npm i, etc.
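As a rough sketch of the "2 step" idea with syntax that exists today, using Go as the example: it assumes a Dockerfile frontend recent enough to accept git URLs in ADD and a project whose go.mod and go.sum are complete. The image tag and stage name are placeholders, and the repository URL is the example one from the issue description.

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.19 AS build
WORKDIR /src

# Declared source: the third-party repository at a fixed tag.
ADD https://github.com/some/project.git#v1.2.3 .

# Step 1, "get build sources": the only RUN step that needs the network.
RUN go mod download

# Step 2, "do the build": offline. GOPROXY=off makes any attempted download fail
# instead of silently reaching out.
RUN --network=none GOPROXY=off go build ./...
```

Step 1 is still an arbitrary RUN with network access, so this isolates the download but does not let the builder track or verify what was downloaded.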

Having scanned that Google Doc, what do you see the final UX looking like? (I always look at these things from the end user's perspective first.) If RUN is network-disabled, and go build will try to download things, what would a working Dockerfile look like once all is said and done?

2 participants