Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dockerfile: create "dependabot" user #3398

Merged
merged 1 commit into from
Apr 12, 2021
Merged

Dockerfile: create "dependabot" user #3398

merged 1 commit into from
Apr 12, 2021

Conversation

thepwagner
Copy link
Contributor

@thepwagner thepwagner commented Mar 29, 2021

Previously, the root Dockerfile would run all instructions as the default user of the parent container (specifically: root). This leaves several paths where dynamic SDK versions might be installed (e.g. pyenv, rustup) owned by root.
In Dockerfile.development, and hopefully other downstream containers, we create a user grant that user ownership of these paths. Those ownership changes make for a very expensive docker layer.

This PR migrates the creation of the user account to the base Dockerfile, so that dependencies can be owned by the correct user in the layer they're initially added. This will reduce build times, and the size of final images.

The ability for bin/docker-dev-shell --rebuild to synchronize the dependabot UID/GID for consistency across volume mounts is retained, but now developers are expected to rebuild the dependabot/dependabot-core image locally (previously it could be pulled from the registry). The --rebuild flag already does this, so it shouldn't be any additional toil 🤞 .

I introduced a bin/cibuild script that chains the Dockerfile->Dockerfile.ci steps similar to Actions. This was helpful for this iteration - it could be adopted by .github/workflows/ci.yml or abandoned. I don't think we should start maintaining this flow in two places. We rolled that back for now.

Using that harness, I verified the CI script can be simplified by just bulk adding the source code in the Dockerfile.ci build. This avoids the scripting complexity (and for Macs: hefty performance penalty) of volume mounting all the individual directories. That script complexity is retained in bin/docker-dev-shell, because sharing with an external editor is worth it there. This avoids synchronizing the UID/GID with the host in CI, so the UID/GID of dependabot/dependabot-core is a reliable 1000:1000.

@thepwagner thepwagner self-assigned this Mar 29, 2021
Dockerfile.development Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
@feelepxyz
Copy link
Contributor

Thanks for starting on this! I reckon this might end up speeding up updater image builds a fair bit and reduce the size 🙌 also means we can test core with the same permissions as the updater without running out of disk space in actions ✨

@feelepxyz feelepxyz marked this pull request as ready for review April 6, 2021 17:16
@feelepxyz feelepxyz requested a review from a team as a code owner April 6, 2021 17:16
@feelepxyz feelepxyz marked this pull request as draft April 6, 2021 17:16
@feelepxyz feelepxyz marked this pull request as ready for review April 7, 2021 12:03
Copy link
Contributor Author

@thepwagner thepwagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I wondered about implications of OSTYPE

If it's necessary, should the later chowns use dependabot:$USER_GID instead of the dependabot:dependabot (e.g. the group by name)?

Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
Dockerfile.ci Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
Dockerfile Show resolved Hide resolved
Dockerfile Show resolved Hide resolved
Dockerfile.development Outdated Show resolved Hide resolved
Copy link
Contributor

@xlgmokha xlgmokha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any chance you could prepare a dive report or hook up dive in the cibuild script via this.

Dockerfile Show resolved Hide resolved
Dockerfile Show resolved Hide resolved
@thepwagner
Copy link
Contributor Author

a dive report
asdf

@xlgmokha I dig it! I've opened internal tracking issues for both of these improvements, I don't think they should slip into this PR's scope.

Dive reporting on dependabot-core would only half-solve the problem - we care most about the final image in production.

bin/cibuild Outdated Show resolved Hide resolved
@feelepxyz
Copy link
Contributor

@thepwagner I think this is close to done now 🎉 could I get your 👀 on the native helper permission changes? I'm going to do a test on staging with all ecosystems and leave it until Monday before thinking about deploying.

@thepwagner
Copy link
Contributor Author

could I get your 👀 on the native helper permission changes?

I'm concerned about the changes for rust permissions.

Referencing the chown workaround we're trying to replace, I think there are 3 important postconditions to test in the image:

  • If the update requires custom python, can it be installed? pyenv install 3.9.3
  • If the update downloads go modules, can they be installed? go get github.com/sirupsen/logrus
  • If the update requires a custom rust toolchain, can it be installed? rustup install 1.50.0

b35c591 passes the first 2, but can't install a custom rust version:

dependabot@6f91d58b997c:/opt/go/gopath$ rustup install 1.50.0
info: syncing channel updates for '1.50.0-x86_64-unknown-linux-gnu'
error: could not create temp file: /opt/rust/tmp/sqi0qzbcwhefp9b3_file

I didn't work step by step, but the final result also writes to /opt/rust/toolchains and /opt/rust/update-hashes:

dependabot@ce69f0013b03:/opt/rust$ find /opt/rust -maxdepth 3 -name "*1.50.0*"
/opt/rust/toolchains/1.50.0-x86_64-unknown-linux-gnu
/opt/rust/update-hashes/1.50.0-x86_64-unknown-linux-gnu

We could either chown those paths so the delegated user can write them, or use a USER dependabot directive to perform the entire install as the dependabot user.

@feelepxyz
Copy link
Contributor

b35c591 passes the first 2, but can't install a custom rust version:

Oh good spot! I'm trying to find any references of installing custom rust versions from updates but can't find any but looks like we fetch a rust-toolchain file if present. https://rust-lang.github.io/rustup/overrides.html

@feelepxyz
Copy link
Contributor

We might be able to test against this repo https://github.com/rust-lang/rust-clippy

Dockerfile Show resolved Hide resolved
@feelepxyz
Copy link
Contributor

Looks like rust updates with a custom toolchain are indeeed failing, tested on https://github.com/feelepxyz/rust-clippy and seeing: info: syncing channel updates for 'nightly-2021-04-08-x86_64-unknown-linux-gnu' updater | <job_109124542> error: could not create temp file: /opt/rust/tmp/ec6c4k1xq74aa07p_file

@thepwagner
Copy link
Contributor Author

@xlgmokha now that we're ~stable, here's a (manual for now :/ ) dive report:

Pre, fc12b55 :

Image Source: docker://docker.io/dependabot/dependabot-core:fc12b554eaf835858e1cf9252da39bfc7243258f
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 98.7559 %
  wastedBytes: 93762386 bytes (94 MB)
  userWastedPercent: 2.1754 %
Inefficient Files:
Count  Wasted Space  File Path
    2        8.2 MB  /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_bionic-updates_main_binary-amd64_Packages.lz4
    2        7.2 MB  /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_bionic-updates_universe_binary-amd64_Packages.lz4
    2        6.8 MB  /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_bionic-security_main_binary-amd64_Packages.lz4
    6        5.6 MB  /var/cache/debconf/templates.dat
    2        4.6 MB  /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_bionic-security_universe_binary-amd64_Packages.lz4
    3        3.1 MB  /var/cache/debconf/templates.dat-old
    2        3.1 MB  /root/.npm/_cacache/content-v2/sha512/b7/61/15994ac668f42f5e7d0591bcbb5b6967f5b678f0a64d5a5144a4731786c17a0bee115f6fff2a93520c7a395b09347f7812145c1e0336051609a6c3a86716
    6        2.2 MB  /var/lib/dpkg/status
    6        2.2 MB  /var/log/dpkg.log
    5        2.2 MB  /var/lib/dpkg/status-old
    5        1.2 MB  /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_bionic_InRelease

Post, 86c2b45

Image Source: docker://dependabot/dependabot-core:86c2b45012053f4e8f23ee53c38f37474c3916e2
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 98.6085 %
  wastedBytes: 98797996 bytes (99 MB)
  userWastedPercent: 2.2888 %
Inefficient Files:
Count  Wasted Space  File Path
    3         12 MB  /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_bionic-updates_main_binary-amd64_Packages.lz4
    3         11 MB  /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_bionic-updates_universe_binary-amd64_Packages.lz4
    2        6.8 MB  /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_bionic-security_main_binary-amd64_Packages.lz4
    6        5.6 MB  /var/cache/debconf/templates.dat
    2        4.6 MB  /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_bionic-security_universe_binary-amd64_Packages.lz4
    3        3.1 MB  /var/cache/debconf/templates.dat-old
    6        2.2 MB  /var/lib/dpkg/status
    6        2.2 MB  /var/log/dpkg.log
    5        2.2 MB  /var/lib/dpkg/status-old
    5        1.2 MB  /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_bionic_InRelease
    2        590 kB  /usr/local/.pyenv/versions/2.7.18/lib/python2.7/site-packages/pip/_vendor/idna/uts46data.pyc

Pretty boring: no regressions - still some wins to be squeaked out with && rm /var/lib/apt/lists/* but that wasn't the intent.


The expected wins are in images derived from dependabot/dependabot-core, e.g. the Dockerfile.development from inside this repo:

Pre:

Image Source: docker://dependabot/dependabot-core-development:fc12b554eaf835858e1cf9252da39bfc7243258f
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 65.1342 %
  wastedBytes: 4670635698 bytes (4.7 GB)
  userWastedPercent: 67.5946 %
Inefficient Files:
Count  Wasted Space  File Path
    2        251 MB  /opt/rust/toolchains/1.51.0-x86_64-unknown-linux-gnu/lib/librustc_driver-7ea116e55de24565.so
    2        251 MB  /opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/librustc_driver-7ea116e55de24565.so
    2        186 MB  /opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/libLLVM-11-rust-1.51.0-stable.so
    2        186 MB  /opt/rust/toolchains/1.51.0-x86_64-unknown-linux-gnu/lib/libLLVM-11-rust-1.51.0-stable.so
    2        139 MB  /opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-3aaf8f932781f33e.rlib
    2        139 MB  /opt/rust/toolchains/1.51.0-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-3aaf8f932781f33e.rlib
    2         72 MB  /usr/local/.pyenv/versions/3.9.4/lib/python3.9/config-3.9-x86_64-linux-gnu/libpython3.9.a
    2         72 MB  /usr/local/.pyenv/versions/3.9.4/lib/libpython3.9.a
    3         48 MB  /home/dependabot/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions
    2         44 MB  /opt/rust/toolchains/1.51.0-x86_64-unknown-linux-gnu/bin/cargo
    2         44 MB  /opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/bin/cargo

Post:

Image Source: docker://docker.io/dependabot/dependabot-core-development:86c2b45012053f4e8f23ee53c38f37474c3916e2
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 97.5930 %
  wastedBytes: 165345180 bytes (165 MB)
  userWastedPercent: 3.6150 %
Inefficient Files:
Count  Wasted Space  File Path
    4         64 MB  /home/dependabot/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions
    3         12 MB  /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_bionic-updates_main_binary-amd64_Packages.lz4
    3         11 MB  /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_bionic-updates_universe_binary-amd64_Packages.lz4
    2        6.8 MB  /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_bionic-security_main_binary-amd64_Packages.lz4
    7        6.6 MB  /var/cache/debconf/templates.dat
    2        4.6 MB  /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_bionic-security_universe_binary-amd64_Packages.lz4
    3        3.1 MB  /var/cache/debconf/templates.dat-old
    7        2.8 MB  /var/lib/dpkg/status
    7        2.7 MB  /var/log/dpkg.log
    6        2.7 MB  /var/lib/dpkg/status-old
    5        1.2 MB  /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_bionic_InRelease

That's exactly the expected result: we've saved this super expensive layer 🏎️ 🎉 .

@feelepxyz
Copy link
Contributor

feelepxyz commented Apr 9, 2021 via email

@feelepxyz
Copy link
Contributor

Rust updates with custom toolchains are working now 🎉 feelepxyz/rust-clippy#1

@feelepxyz feelepxyz requested a review from hmarr April 12, 2021 10:10
Copy link
Contributor

@feelepxyz feelepxyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested all ecosystems on staging are working, including installing custom rust toolchains and ruby git vendoring 🎉

Screenshot 2021-04-12 at 12 29 58

Previously the `Dockerfile` would install dependencies as the root user,
then a subsequent `Dockerfile` would create and delegate to a
`dependabot` user.

Chowning files in a downstream layer is expensive and slow!

Let's encourage the best practice of non-root containers by creating
`dependabot` directly in the `Dockerfile`.

Co-authored-by: Pete Wagner <thepwagner@github.com>
@jurre jurre mentioned this pull request Apr 12, 2021
4 tasks
@thepwagner
Copy link
Contributor Author

feelepxyz requested a review from hmarr 3 hours ago

Per this, I'm considering this blocked in @hmarr 's review.

Copy link
Contributor

@hmarr hmarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@feelepxyz and I chatted about this separately, I'm 👍 on the change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants