--enable-experimental-jit fails to build: AssertionError: SHT_NOTE in 3.13.0b1 #118836

Closed
mgorny opened this issue May 9, 2024 · 13 comments
Labels: build (The build process and cross-build), topic-JIT, type-bug (An unexpected behavior, bug, or error)

@mgorny
Contributor

mgorny commented May 9, 2024

Bug report

Bug description:

When trying to build CPython with --enable-experimental-jit against LLVM 18.1.5, I'm getting the following error:

$ make
python3.13 ./Tools/jit/build.py x86_64-pc-linux-gnu

==========================================================
JIT support for x86_64-pc-linux-gnu is still experimental!
         Please report any issues you encounter.          
==========================================================

  + Exception Group Traceback (most recent call last):
  |   File "/home/mgorny/git/cpython/./Tools/jit/build.py", line 28, in <module>
  |     args.target.build(pathlib.Path.cwd(), comment=comment, force=args.force)
  |     ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File "/home/mgorny/git/cpython/Tools/jit/_targets.py", line 214, in build
  |     stencil_groups = asyncio.run(self._build_stencils())
  |   File "/usr/lib/python3.13/asyncio/runners.py", line 194, in run
  |     return runner.run(main)
  |            ~~~~~~~~~~^^^^^^
  |   File "/usr/lib/python3.13/asyncio/runners.py", line 118, in run
  |     return self._loop.run_until_complete(task)
  |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  |   File "/usr/lib/python3.13/asyncio/base_events.py", line 721, in run_until_complete
  |     return future.result()
  |            ~~~~~~~~~~~~~^^
  |   File "/home/mgorny/git/cpython/Tools/jit/_targets.py", line 189, in _build_stencils
  |     async with asyncio.TaskGroup() as group:
  |     ...<4 lines>...
  |             tasks.append(group.create_task(coro, name=opname))
  |   File "/usr/lib/python3.13/asyncio/taskgroups.py", line 154, in __aexit__
  |     raise me from None
  | ExceptionGroup: unhandled errors in a TaskGroup (2 sub-exceptions)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/home/mgorny/git/cpython/Tools/jit/_targets.py", line 181, in _compile
    |     return await self._parse(o)
    |            ^^^^^^^^^^^^^^^^^^^^
    |   File "/home/mgorny/git/cpython/Tools/jit/_targets.py", line 89, in _parse
    |     self._handle_section(wrapped_section["Section"], group)
    |     ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/mgorny/git/cpython/Tools/jit/_targets.py", line 349, in _handle_section
    |     assert section_type in {
    |            ^^^^^^^^^^^^^^^^^
    |     ...<5 lines>...
    |     }, section_type
    |     ^
    | AssertionError: SHT_NOTE
    +---------------- 2 ----------------
    | Traceback (most recent call last):
    |   File "/home/mgorny/git/cpython/Tools/jit/_targets.py", line 181, in _compile
    |     return await self._parse(o)
    |            ^^^^^^^^^^^^^^^^^^^^
    |   File "/home/mgorny/git/cpython/Tools/jit/_targets.py", line 89, in _parse
    |     self._handle_section(wrapped_section["Section"], group)
    |     ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/home/mgorny/git/cpython/Tools/jit/_targets.py", line 349, in _handle_section
    |     assert section_type in {
    |            ^^^^^^^^^^^^^^^^^
    |     ...<5 lines>...
    |     }, section_type
    |     ^
    | AssertionError: SHT_NOTE
    +------------------------------------
Exception ignored in: <function BaseSubprocessTransport.__del__ at 0x7f2fa5251760>
Traceback (most recent call last):
  File "/usr/lib/python3.13/asyncio/base_subprocess.py", line 127, in __del__
  File "/usr/lib/python3.13/asyncio/base_subprocess.py", line 104, in close
  File "/usr/lib/python3.13/asyncio/unix_events.py", line 603, in close
  File "/usr/lib/python3.13/asyncio/unix_events.py", line 627, in _close
  File "/usr/lib/python3.13/asyncio/base_events.py", line 829, in call_soon
  File "/usr/lib/python3.13/asyncio/base_events.py", line 552, in _check_closed
RuntimeError: Event loop is closed
make: *** [Makefile:3015: jit_stencils.h] Error 1

I don't recall which alpha I last tested this on, but I'm pretty sure it used to work (against LLVM 16). I get the same result on main as of 7c87ce7.

This is Gentoo Linux amd64.

I reproduced it by running the following in the git repo:

export PATH=/usr/lib/llvm/18/bin:${PATH}
./configure --enable-experimental-jit
make

Resulting log (70k): python-log.txt

CPython versions tested on:

3.13, CPython main branch

Operating systems tested on:

Linux


@mgorny added the type-bug label May 9, 2024
@sobolevn added the build label May 9, 2024
@sobolevn
Member

sobolevn commented May 9, 2024

cc @brandtbucher

@brandtbucher
Member

Thanks for the report!

@savannahostrowski: I think we don’t care about these sections, so it’s probably just a matter of updating the assert. (There may be other missing sections too.)

@brandtbucher
Member

This assert (and others) were originally added early on to make sure we didn’t miss any “important” sections when parsing. Not sure if it’s too fragile in its current form, or if SHT_NOTE is just a common section type that didn’t come up on the platforms we tested.
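
A minimal sketch of the kind of allowlist check being discussed, for readers who haven't looked at the build script. This is not the actual `Tools/jit/_targets.py` code: the function name `classify_section` and the contents of both sets are assumptions for illustration (the traceback elides the real set). The idea is that section types known to be irrelevant are skipped explicitly, while anything unrecognized still trips the assert instead of being silently ignored.

```python
# Hypothetical sketch; set contents are assumed, not copied from CPython.
HANDLED_SECTION_TYPES = frozenset({"SHT_PROGBITS", "SHT_RELA"})
IGNORED_SECTION_TYPES = frozenset(
    {"SHT_NULL", "SHT_STRTAB", "SHT_SYMTAB", "SHT_NOTE"}
)


def classify_section(section_type: str) -> str:
    """Return "handle" or "ignore"; fail loudly on unknown types."""
    if section_type in HANDLED_SECTION_TYPES:
        return "handle"
    # Same pattern as in the traceback: the assertion message is the
    # offending section type, so the error names what was missed.
    assert section_type in IGNORED_SECTION_TYPES, section_type
    return "ignore"
```

With `SHT_NOTE` in the ignored set, a note section is skipped rather than aborting the build, and a type nobody has seen before still fails with its name in the error.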

@brandtbucher self-assigned this May 9, 2024
@savannahostrowski
Member

savannahostrowski commented May 9, 2024

Thanks for the report!

Makes sense @brandtbucher, I can take a look later today if you'd like.

@savannahostrowski
Member

Alright, so I've probably spent too long trying to set up a Gentoo dev environment 😅 and I'm going to admit defeat.

@mgorny Would you be able to try adding that statement to the assert and let me know if that resolves the issue?

@thesamesam
Contributor

thesamesam commented May 11, 2024

I'm happy to help (either now or in future) if you need it, fwiw (and/or can give a Dockerfile where hopefully it's reproducible).

Would either of those help?

@savannahostrowski
Member

Ah! If you have a Dockerfile, that'd be amazing. I tried to write one of my own and got stuck in a loop of masked packages. Then, once I got LLVM to start installing, it ultimately failed...

@thesamesam
Contributor

Let me know how this goes...

FROM gentoo/stage3

# Disable bits which don't work within Docker.
RUN echo 'FEATURES="-ipc-sandbox -pid-sandbox -network-sandbox -usersandbox -mount-sandbox -sandbox"' >> /etc/portage/make.conf
# Speed things up a bit.
RUN echo 'FEATURES="${FEATURES} parallel-install parallel-fetch -merge-sync"' >> /etc/portage/make.conf
RUN echo 'EMERGE_DEFAULT_OPTS="--binpkg-respect-use=y --getbinpkg=y --autounmask-write --autounmask-continue --autounmask-keep-keywords=y --autounmask-use=y"' >> /etc/portage/make.conf
# XXX: Replace -j$(nproc) with some smaller -jN if you run out of RAM
RUN echo "MAKEOPTS='-j$(nproc) -l$(nproc)'" >> /etc/portage/make.conf

RUN emerge-webrsync --quiet
RUN getuto

RUN echo "*/*" >> /etc/portage/package.accept_keywords/all
# By doing this step first (just the deps, not Python itself), we avoid
# losing all our progress if/when Python fails to build.
RUN USE=jit emerge --verbose --oneshot --onlydeps dev-lang/python:3.13
RUN mkdir -p /etc/portage/patches/dev-lang/python:3.13

# You can put patches in /etc/portage/patches/dev-lang/python:3.13/aaa.patch
# to test them.
CMD USE=jit emerge --verbose --oneshot dev-lang/python:3.13

To build it:

docker build -t gentoo gentoo/
docker run -it gentoo

Building LLVM takes a little while - it needs LLVM 18 for the failure, I think, which isn't yet marked stable in Gentoo so there's no binary packages available for it unfortunately.

I added a note about MAKEOPTS in the Dockerfile -- if you get an issue, try reducing that first.

@savannahostrowski
Member

Thanks for writing the Dockerfile! Unfortunately, I landed in the same place that I did last night when trying to write my own Dockerfile. I keep hitting this error:

857.1  * ERROR: sys-devel/llvm-18.1.5::gentoo failed (compile phase):
857.1  *   ninja -v -j11 -l11 distribution failed
857.1  * 
857.1  * Call stack:
857.1  *     ebuild.sh, line  136:  Called src_compile
857.1  *   environment, line 3924:  Called multilib-minimal_src_compile
857.1  *   environment, line 2731:  Called multilib_foreach_abi 'multilib-minimal_abi_src_compile'
857.1  *   environment, line 2998:  Called multibuild_foreach_variant '_multilib_multibuild_wrapper' 'multilib-minimal_abi_src_compile'
857.1  *   environment, line 2691:  Called _multibuild_run '_multilib_multibuild_wrapper' 'multilib-minimal_abi_src_compile'
857.1  *   environment, line 2689:  Called _multilib_multibuild_wrapper 'multilib-minimal_abi_src_compile'
857.1  *   environment, line  600:  Called multilib-minimal_abi_src_compile
857.1  *   environment, line 2725:  Called multilib_src_compile
857.1  *   environment, line 3218:  Called tc-env_build 'cmake_build' 'distribution'
857.1  *   environment, line 4159:  Called cmake_build 'distribution'
857.1  *   environment, line 1425:  Called eninja 'distribution'
857.1  *   environment, line 1896:  Called die
857.1  * The specific snippet of code:
857.1  *       "$@" || die -n "${*} failed"
857.1  * 
857.1  * If you need support, post the output of `emerge --info '=sys-devel/llvm-18.1.5::gentoo'`,
857.1  * the complete build log and the output of `emerge -pqv '=sys-devel/llvm-18.1.5::gentoo'`.
857.1  * The complete build log is located at '/var/tmp/portage/sys-devel/llvm-18.1.5/temp/build.log'.
857.1  * The ebuild environment file is located at '/var/tmp/portage/sys-devel/llvm-18.1.5/temp/environment'.
857.1  * Working directory: '/var/tmp/portage/sys-devel/llvm-18.1.5/work/llvm_build-.arm64'
857.1  * S: '/var/tmp/portage/sys-devel/llvm-18.1.5/work/llvm'
857.1 
857.1 >>> Failed to emerge sys-devel/llvm-18.1.5, Log file:
857.1 
857.1 >>>  '/var/tmp/portage/sys-devel/llvm-18.1.5/temp/build.log'
857.1  * Messages for package sys-libs/libomp-18.1.5:
857.1  * Unable to find kernel sources at /usr/src/linux
857.1  * Unable to calculate Linux Kernel version for build, attempting to use running version
857.1  * Messages for package sys-devel/llvm-18.1.5:
857.1  * ERROR: sys-devel/llvm-18.1.5::gentoo failed (compile phase):
857.1  *   ninja -v -j11 -l11 distribution failed
857.1  * 
857.1  * Call stack:
857.1  *     ebuild.sh, line  136:  Called src_compile
857.1  *   environment, line 3924:  Called multilib-minimal_src_compile
857.1  *   environment, line 2731:  Called multilib_foreach_abi 'multilib-minimal_abi_src_compile'
857.1  *   environment, line 2998:  Called multibuild_foreach_variant '_multilib_multibuild_wrapper' 'multilib-minimal_abi_src_compile'
857.1  *   environment, line 2691:  Called _multibuild_run '_multilib_multibuild_wrapper' 'multilib-minimal_abi_src_compile'
857.1  *   environment, line 2689:  Called _multilib_multibuild_wrapper 'multilib-minimal_abi_src_compile'
857.1  *   environment, line  600:  Called multilib-minimal_abi_src_compile
857.1  *   environment, line 2725:  Called multilib_src_compile
857.1  *   environment, line 3218:  Called tc-env_build 'cmake_build' 'distribution'
857.1  *   environment, line 4159:  Called cmake_build 'distribution'
857.1  *   environment, line 1425:  Called eninja 'distribution'
857.1  *   environment, line 1896:  Called die
857.1  * The specific snippet of code:
857.1  *       "$@" || die -n "${*} failed"
857.1  * 
857.1  * If you need support, post the output of `emerge --info '=sys-devel/llvm-18.1.5::gentoo'`,
857.1  * the complete build log and the output of `emerge -pqv '=sys-devel/llvm-18.1.5::gentoo'`.
857.1  * The complete build log is located at '/var/tmp/portage/sys-devel/llvm-18.1.5/temp/build.log'.
857.1  * The ebuild environment file is located at '/var/tmp/portage/sys-devel/llvm-18.1.5/temp/environment'.
857.1  * Working directory: '/var/tmp/portage/sys-devel/llvm-18.1.5/work/llvm_build-.arm64'
857.1  * S: '/var/tmp/portage/sys-devel/llvm-18.1.5/work/llvm'
857.1 
857.1 
857.1 
857.1 
857.1 
857.1  * Regenerating GNU info directory index...
857.1  * Processed 104 info files.
857.1 
857.1  * IMPORTANT: 16 news items need reading for repository 'gentoo'.
857.1  * Use eselect news read to view new items.
857.1 
------
Dockerfile:17
--------------------
  15 |     # By doing this step first (just the deps, not Python itself), we avoid
  16 |     # losing all our progress if/when Python fails to build.
  17 | >>> RUN USE=jit emerge --verbose --oneshot --onlydeps dev-lang/python:3.13
  18 |     RUN mkdir -p /etc/portage/patches/dev-lang/python:3.13
  19 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c USE=jit emerge --verbose --oneshot --onlydeps dev-lang/python:3.13" did not complete successfully: exit code: 1

@thesamesam
Contributor

My guess is you OOMed (check dmesg?) -- could you try changing the MAKEOPTS line to -j4 or something?

(You want it to be roughly: min(threads_on_your_cpu, RAM/2GB).)
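
The rule of thumb above can be written out as a small helper. This is only a sketch: `suggested_jobs` is a hypothetical name, the 2 GiB-per-job figure is the one from the comment, and the RAM size is passed in by hand rather than detected.

```python
import os


def suggested_jobs(cpu_threads: int, ram_gib: float, gib_per_job: float = 2.0) -> int:
    """Return min(cpu_threads, ram / 2 GiB), never less than 1."""
    by_memory = int(ram_gib // gib_per_job)
    return max(1, min(cpu_threads, by_memory))


if __name__ == "__main__":
    threads = os.cpu_count() or 1
    # 8 GiB is only an example value; substitute the RAM actually
    # available to the container.
    print(f"MAKEOPTS='-j{suggested_jobs(threads, 8)}'")
```

For instance, a machine with 11 threads but only 8 GiB available to the container would get 4 jobs, rather than the `-j11` that `$(nproc)` produced in the log above.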

@mgorny
Contributor Author

mgorny commented May 11, 2024

> @mgorny Would you be able to try adding that statement to the assert and let me know if that resolves the issue?

I can confirm that CPython built correctly now (i.e. with that assert change). I'm testing some packages too, no regressions so far.

@savannahostrowski
Member

@thesamesam Thanks - I realized that right after sending that reply. That said, this doesn't repro on arm64 and when I try to build the image for amd64, I get an illegal exception (I'm on an M3 Pro).

@mgorny Thanks for taking a look and validating that we build correctly now with that change. If you're able to put up a PR, that'd be much appreciated. Happy to review!

miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 13, 2024
…pythonGH-119000)

(cherry picked from commit e04cd96)

Co-authored-by: Michał Górny <mgorny@gentoo.org>
brandtbucher pushed a commit that referenced this issue May 13, 2024
GH-119020)

(cherry picked from commit e04cd96)

Co-authored-by: Michał Górny <mgorny@gentoo.org>
@savannahostrowski
Member

Looks like this can be closed now, as #119000 and #119020 have been merged.

Thanks @mgorny and @thesamesam for your help!
