Why is the generated dpkg_parser having a #!/usr/bin/python3 shebang? #434

Closed
kovkev opened this issue Nov 4, 2019 · 19 comments · Fixed by #479

Comments

@kovkev

kovkev commented Nov 4, 2019

I don't have a file at /usr/bin/python3

@kovkev kovkev changed the title Why is the generated dpkg_parser having a #!/bin/usr/python3 shebang? Why is the generated dpkg_parser having a #!/usr/bin/python3 shebang? Nov 4, 2019
@chanseokoh
Member

chanseokoh commented Nov 5, 2019

Maybe that depends on where you build. Mine has /usr/bin/env python.

$ head -3 bazel-bin/package_manager/dpkg_parser
#!/usr/bin/env python

from __future__ import print_function

If this is a problem for you, perhaps file an issue against Subpar, which I think generates the binary through the par_binary rule?

@kovkev
Author

kovkev commented Nov 8, 2019

I'll give it to @chanseokoh: I get the same for bazel build //package_manager:dpkg_parser. However...

I explored the issue a little bit more

git clone https://github.com/GoogleContainerTools/distroless
cd distroless
Users-MacBook-Pro:distroless kevin$ bazel build //examples/python2.7:hello_py
INFO: Writing tracer profile to '/private/var/tmp/_bazel_kevin/989ade18a91c241905dfd2e46f1ddbd5/command.profile.gz'
INFO: Invocation ID: 7b903e50-9885-45d1-964e-4707648a4db4
INFO: Call stack for the definition of repository 'debian10_security' which is a _dpkg_src (rule definition at /Users/kevin/odev/distroless/package_manager/dpkg.bzl:58:13):
 - /Users/kevin/odev/distroless/package_manager/dpkg.bzl:80:5
 - /Users/kevin/odev/distroless/WORKSPACE:359:1
INFO: Call stack for the definition of repository 'debian_stretch_security' which is a _dpkg_src (rule definition at /Users/kevin/odev/distroless/package_manager/dpkg.bzl:58:13):
 - /Users/kevin/odev/distroless/package_manager/dpkg.bzl:80:5
 - /Users/kevin/odev/distroless/WORKSPACE:44:1
ERROR: An error occurred during the fetch of repository 'debian10_security':
   dpkg_parser command failed: src/main/tools/process-wrapper-legacy.cc:58: "execvp(/private/var/tmp/_bazel_kevin/989ade18a91c241905dfd2e46f1ddbd5/external/dpkg_parser/file/downloaded, ...)": No such file or directory
 (/private/var/tmp/_bazel_kevin/989ade18a91c241905dfd2e46f1ddbd5/external/dpkg_parser/file/downloaded --download-and-extract-only=True --mirror-url= --arch= --distro= --snapshot= --packages-gz-url=https://snapshot.debian.org/archive/debian-security/20191028T085816Z/dists/buster/updates/main/binary-amd64/Packages.gz --package-prefix=https://snapshot.debian.org/archive/debian-security/20191028T085816Z/ --sha256=dace61a2f1c4031f33dbc78e416a7211fad9946a3d997e96256561ed92b034be)
ERROR: /Users/kevin/odev/distroless/examples/python2.7/BUILD:4:1: error loading package 'experimental/python2.7': Encountered error while reading extension file 'file/packages.bzl': no such package '@package_bundle_debian10//file': no such package '@debian10_security//file': dpkg_parser command failed: src/main/tools/process-wrapper-legacy.cc:58: "execvp(/private/var/tmp/_bazel_kevin/989ade18a91c241905dfd2e46f1ddbd5/external/dpkg_parser/file/downloaded, ...)": No such file or directory
 (/private/var/tmp/_bazel_kevin/989ade18a91c241905dfd2e46f1ddbd5/external/dpkg_parser/file/downloaded --download-and-extract-only=True --mirror-url= --arch= --distro= --snapshot= --packages-gz-url=https://snapshot.debian.org/archive/debian-security/20191028T085816Z/dists/buster/updates/main/binary-amd64/Packages.gz --package-prefix=https://snapshot.debian.org/archive/debian-security/20191028T085816Z/ --sha256=dace61a2f1c4031f33dbc78e416a7211fad9946a3d997e96256561ed92b034be) and referenced by '//examples/python2.7:hello_py'
ERROR: Analysis of target '//examples/python2.7:hello_py' failed; build aborted: error loading package 'experimental/python2.7': Encountered error while reading extension file 'file/packages.bzl': no such package '@package_bundle_debian10//file': no such package '@debian10_security//file': dpkg_parser command failed: src/main/tools/process-wrapper-legacy.cc:58: "execvp(/private/var/tmp/_bazel_kevin/989ade18a91c241905dfd2e46f1ddbd5/external/dpkg_parser/file/downloaded, ...)": No such file or directory
 (/private/var/tmp/_bazel_kevin/989ade18a91c241905dfd2e46f1ddbd5/external/dpkg_parser/file/downloaded --download-and-extract-only=True --mirror-url= --arch= --distro= --snapshot= --packages-gz-url=https://snapshot.debian.org/archive/debian-security/20191028T085816Z/dists/buster/updates/main/binary-amd64/Packages.gz --package-prefix=https://snapshot.debian.org/archive/debian-security/20191028T085816Z/ --sha256=dace61a2f1c4031f33dbc78e416a7211fad9946a3d997e96256561ed92b034be)
INFO: Elapsed time: 0.233s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (10 packages loaded, 0 targets configured)
    Fetching @package_bundle_debian10; Restarting.
    Fetching @package_bundle; Restarting.

I searched in the code a little and found the following chain:

Users-MacBook-Pro:distroless kevin$ head ~/Downloads/dpkg_parser.par
#!/usr/bin/python3
PK!
   __init__.pyPK!���!�!
                       __main__.py# Copyright 2017 Google Inc. All rights reserved.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
  • Tada! /usr/bin/python3 shebang!

Now we need to find how to use another dpkg_parser instead of this one.
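
The "No such file or directory" reported by execvp above is consistent with the interpreter named on the #! line being missing, rather than the downloaded file itself. A minimal sketch for checking that on any file follows; the helper names read_shebang and interpreter_exists are made up for illustration and are not part of distroless or subpar.

```python
import os
import tempfile


def read_shebang(path):
    """Return the text after '#!' on the first line of a file, or None."""
    with open(path, "rb") as f:
        first_line = f.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].strip().decode("utf-8", errors="replace")


def interpreter_exists(shebang):
    """Return True if the shebang's first token is an executable file."""
    tokens = shebang.split()
    if not tokens:
        return False
    return os.path.isfile(tokens[0]) and os.access(tokens[0], os.X_OK)


# Demo on a stand-in file: a shebang line followed by zip-like bytes,
# which is roughly what the head of a .par file looks like.
with tempfile.NamedTemporaryFile(suffix=".par", delete=False) as tmp:
    tmp.write(b"#!/usr/bin/python3\nPK\x03\x04")
    demo_path = tmp.name

shebang = read_shebang(demo_path)
print(shebang, "exists" if interpreter_exists(shebang) else "is missing")
os.unlink(demo_path)
```

Pointing read_shebang at the real downloaded file (the path in the error message) would show whether its interpreter is present on the machine.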

@chanseokoh
Member

@kovkev ah, you were talking about the par binary stored in the GCS bucket. Now I remember that. I don't know any history behind why it's there or why the repo is using it instead of the generated dpkg_parser.

@kovkev
Author

kovkev commented Nov 10, 2019

I think it might be because that file is difficult to generate from within the WORKSPACE and from within the repository_rule. It might be solvable with a two-step process. First, run bazel build //package_manager:dpkg_parser ; cp bazel-out/dpkg_parser.py /usr/lib/dpkg_parser.py. Then have dpkg_list and dpkg_src pick up dpkg_parser specifically from file:///usr/lib/dpkg_parser.py, using a local_file() rule instead of http_archive().

Very handwave-y explanation here, but I hope the general idea lands.

@mir3z

mir3z commented Dec 5, 2019

It's kinda blocking OSX users...

@chanseokoh
Member

chanseokoh commented Dec 5, 2019

Someone on OSX on the Distroless Slack channel used some hacky workaround:

I was able to work around the package_manager_repositories pulling in a version of dpkg_parser by creating a dumb host (python -m SimpleHTTPServer 3333) that hosts my built dpkg_parser. ugly but it works 😕

And bazel build --verbose_failures --host_force_python=PY2 //package_manager:dpkg_parser.par will build a local dpkg_parser. Putting python_version = "PY2" may be necessary as well to force using Python 2.

@@ -4,6 +4,7 @@ par_binary(
     name = "dpkg_parser",
     srcs = glob(["**/*.py"]),
     main = "dpkg_parser.py",
+    python_version = "PY2",
     visibility = ["//visibility:public"],
     deps = [
         ":parse_metadata",

@ppodolsky

ppodolsky commented Dec 27, 2019

Hi! I have the same issue and it is very annoying on MacOS.

There is no option to symlink /usr/bin/python3 to the proper python because Catalina has write protection for /usr/bin.

Apple vendored their own certificate bundle for the system python. And now we have no easy way to work around the "certificate verify failed: unable to get local issuer certificate" error for some dpkg_src urls (http://download.docker.com/linux/debian/dists/buster/stable/binary-amd64/Packages.gz in my case).
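
One way to see which CA locations a given interpreter would consult is to ask its ssl module. This is only a diagnostic sketch and does not fix the verification error; the function name describe_ca_paths is hypothetical, while ssl.get_default_verify_paths() is part of the standard library.

```python
import ssl
import sys


def describe_ca_paths():
    """Collect the default CA locations for the running interpreter."""
    paths = ssl.get_default_verify_paths()
    return {
        "interpreter": sys.executable,
        "openssl": ssl.OPENSSL_VERSION,
        # cafile/capath are None when the configured location does not exist.
        "cafile": paths.cafile,
        "capath": paths.capath,
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
    }


for key, value in describe_ca_paths().items():
    print(key + ":", value)
```

Running it under each candidate Python (the system one and any separately installed one) shows whether they look for certificates in different places.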

@mariusgrigoriu

How about reverting that script back to #!/usr/bin/env python or #!/usr/bin/env python3? That way it is up to the user to ensure the PATH points to the right interpreter.
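
For context, /usr/bin/env NAME runs the first executable called NAME found on PATH. The sketch below uses shutil.which to perform a comparable lookup, so you can see what such a shebang would resolve to on your machine; resolve_like_env is a hypothetical helper name.

```python
import shutil


def resolve_like_env(name, path=None):
    """Return the first executable called `name` on PATH, or None.

    `path` optionally overrides the search path (same format as $PATH).
    """
    return shutil.which(name, path=path)


for candidate in ("python", "python3"):
    print(candidate, "->", resolve_like_env(candidate))
```

A result of None for a name means a shebang of the form #!/usr/bin/env NAME would fail for that user until PATH is adjusted.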

@joprice

joprice commented Jan 18, 2020

@chanseokoh Following your steps, I get AttributeError: 'NoneType' object has no attribute 'split':

Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/private/var/tmp/_bazel_josephprice/290857ff87d2bcd4dba13b18fbba33d0/execroot/distroless/bazel-out/darwin-fastbuild/bin/package_manager/dpkg_parser.par/__main__.py", line 196, in <module>
  File "/private/var/tmp/_bazel_josephprice/290857ff87d2bcd4dba13b18fbba33d0/execroot/distroless/bazel-out/darwin-fastbuild/bin/package_manager/dpkg_parser.par/__main__.py", line 80, in main
  File "/private/var/tmp/_bazel_josephprice/290857ff87d2bcd4dba13b18fbba33d0/execroot/distroless/bazel-out/darwin-fastbuild/bin/package_manager/dpkg_parser.par/__main__.py", line 92, in download_dpkg
AttributeError: 'NoneType' object has no attribute 'split'

@Sineaggi
Contributor

According to google/subpar#87, you can override the interpreter when using the par_binary rule. We could check whether compiler_args = ["--interpreter", "/usr/bin/env python"] works for both the current python3 users and the python2 users. Alternatively, we could use /usr/bin/env python3 and require users to have python 3 installed.

@Sineaggi
Contributor

Sineaggi commented Feb 11, 2020

@chanseokoh @kovkev As an update to the comment about building the file locally and then sourcing it from within the package_manager.bzl file: it turns out that http_file(urls) supports file:// URLs. If you're testing dpkg_parser locally, you can use something like this:

def package_manager_repositories():
    http_file(
        name = "dpkg_parser",
        urls = ["file:///Users/cwalker/git/distroless/bazel-bin/package_manager/dpkg_parser.par"],
        executable = True,
    )

and ignore the sha256 (as that is a pain to remember to keep up-to-date).
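
If you would rather keep the sha256 than drop it, a small helper can print both the file:// URL and the current digest of the locally built file. This is a sketch under the assumption that the .par sits at the bazel-bin path used above; sha256_of and file_url are hypothetical names.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_url(path):
    """Return a file:// URL for a path, resolved to an absolute path."""
    return Path(path).resolve().as_uri()


# Assumed location of the locally built binary; adjust for your checkout.
par = Path("bazel-bin/package_manager/dpkg_parser.par")
if par.exists():
    print("urls   =", [file_url(par)])
    print("sha256 =", sha256_of(par))
else:
    print(par, "not found; build it first")
```

The printed values can be pasted into the urls and sha256 attributes of http_file after each rebuild.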

@james-stephenson

james-stephenson commented Feb 12, 2020

I did some more digging on this problem, and it seems that it goes deeper than subpar. It's actually something to do with Bazel's py_binary rule: https://docs.bazel.build/versions/master/be/python.html#py_binary (see the note in the python_version attribute).

It takes you to this comment: bazelbuild/bazel#4815 (comment)

The TL;DR is that it's a lack of appropriate toolchain support for Python 2 and 3 which is truly portable, so there are some workarounds to consider.

One approach is, as above, setting the compiler_args to specify an interpreter (thanks!). This requires pulling the distroless code and directly modifying the par_binary rule.

Another way, as that comment says, would be to add a py_runtime rule, with the correct paths for PY2/PY3, and then invoke on the command line with it.

Example WORKSPACE:

load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")

git_repository(
    name = "com_google_distroless",
    branch = "master",
    remote = "https://github.com/GoogleCloudPlatform/distroless.git",
)

git_repository(
    name = "subpar",
    remote = "https://github.com/google/subpar",
    tag = "2.0.0",
)

Example BUILD:

load("@subpar//:subpar.bzl", "par_binary")

py_runtime(
    name = "python_runtime",
    files = [],
    interpreter_path = select({
        # Update paths as appropriate for your system.
        "@bazel_tools//tools/python:PY2": "/usr/bin/python2",
        "@bazel_tools//tools/python:PY3": "/usr/local/bin/python3",
    }),
)

Invoke like so:
bazel build --incompatible_use_python_toolchains=false --python_top=//:python_runtime @com_google_distroless//package_manager:dpkg_parser.par

You have a valid dpkg_parser.par now at bazel-bin/external/com_google_distroless/package_manager/dpkg_parser.par

You still need to import dpkg_parser using http_file or a new_local_repository in some fashion (edit: dpkg_list and dpkg_src require it to be an http_file unless you change the _dpkg_parser attr to a different label). You could, however, do this in a new repository and build the dpkg_parser binary transitively rather than mucking with the distroless build code directly. Note that you cannot use /usr/bin/env here, because the string is interpreted literally as a single command, not as a command with arguments.
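
To illustrate that last point in general terms (this does not call Bazel or subpar): a setting that expects a path keeps "/usr/bin/env python3" as one filename, space included, whereas a command line would split it into a program and an argument. The variable names below are for illustration only.

```python
import os
import shlex

spec = "/usr/bin/env python3"

# Treated as a single path, the whole string (space included) must name a file.
as_single_path = spec

# Treated as a command line, it splits into a program and its argument.
as_argv = shlex.split(spec)

print(repr(as_single_path), "is a file:", os.path.isfile(as_single_path))
print("argv:", as_argv)
```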

@briandealwis
Member

There's a diagnosis and workaround available for macOS Catalina.

@briandealwis
Member

Hmm, rereading through this thread, people may have different reasons for wanting to use different python installations.

If you're trying to use an alternative Python because dpkg_parser is failing due to SSL certificate verification issues, see this workaround. I opened a separate issue with the workaround (#475) and pinned it.

Bazel makes it difficult to use tools outside of /usr/bin to promote hermetic builds. Otherwise different users have different packages installed, often leading to different build results — or failures. And hence we all had this certificate problem :-)

I'm going to close this issue — please open new issues if there are other outstanding reasons for wanting to replace the interpreter.

@briandealwis
Member

Re-opening: it may be that /usr/bin/python3 is installed by Xcode or the Xcode Command-Line Tools (xcode-select --install from the CLI).

@chanseokoh
Member

chanseokoh commented Apr 16, 2020

Finally, Distroless no longer downloads a pre-built dpkg_parser (#477) but instead will use one locally built from source. With the shebang fix (#479), I expect Mac users will no longer have trouble building Distroless without applying any kind of local hacks.

Sync with master and follow the new build instructions in CONTRIBUTING.md.

@mariusgrigoriu

I see that dpkg_parser.par needs to be built manually. What's keeping it from being automatically built as needed?

@chanseokoh
Member

@mariusgrigoriu it is an unfortunate consequence of this issue (from #477 (comment)):

More Context About the Issue

I asked the following question on the Bazel Slack Channel.

Our "Distroless" project has a kind of circular dependency.

  • A python binary is required to define a repository_rule. (I guess the rule will be evaluated in the Bazel loading phase?) The python binary is executed to make some files available as output of the rule (via exports_files).
  • However, this python binary is actually part of the "Distroless" project itself. It is defined as par_binary and is supposed to be built in the normal execution phase. At least, this par_binary doesn't depend on the repository_rule above, so it can be built first alone.

So, in other words, this python binary is really like an external dependency repo that can exist elsewhere on its own. But for now, it is embedded in the project. What we've been doing is

  1. Build the python binary first in a separate step (bazel build //:my_python_binary.par). Then we manually upload the binary to the Internet.
  2. As the second bazel build //:final_target step, we make the binary available (downloading with http_file) and use it to define a repository_rule.

Now we want to remove the "uploading" and "downloading" hackery. That is, I want Bazel to somehow build the python binary first and then use it to define a repository_rule in a single build step. Still, the only option I can think of is to run two separate build steps.

  1. Run bazel build //:my_python_binary.par first as a separate step.
  2. As the second bazel build //:final_target step, repository_rule picks up the locally built binary at <WORKSPACE>/bazel-bin/my_python_binary.par and uses it.

Is it possible to make this work with a single bazel build run? What are other options?

And check #477 (comment) for the final decision.
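
The two-step flow described in the quoted question could be scripted roughly as follows. This is a sketch, not the project's build tooling: the target names are the placeholders from the question, and run_steps is a hypothetical helper that defaults to subprocess.check_call.

```python
import subprocess

# Placeholder targets taken from the quoted question; replace with real ones.
BUILD_STEPS = [
    ["bazel", "build", "//:my_python_binary.par"],
    ["bazel", "build", "//:final_target"],
]


def run_steps(steps, runner=subprocess.check_call):
    """Run each command in order and return the ones that completed.

    With the default runner, a failing command raises CalledProcessError,
    so later steps are not attempted.
    """
    completed = []
    for argv in steps:
        runner(argv)
        completed.append(argv)
    return completed


# Dry run: print the commands instead of executing them.
run_steps(BUILD_STEPS, runner=lambda argv: print(" ".join(argv)))
```

Calling run_steps(BUILD_STEPS) without the runner argument would execute the two builds for real, the second only if the first succeeds.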
