Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use env vars instead of CMake vars for Dawn/Node bindings #7422

Merged
merged 21 commits into from
May 2, 2023

Conversation

steven-johnson
Copy link
Contributor

The WebGPU bindings need to know the locations of two external files containing executable code:

  • WEBGPU_NATIVE_LIB, a shared library containing the 'native' Dawn implementation, for use with variants of host-webgpu
  • WEBGPU_NODE_BINDINGS, a Node extension that contains the Node/JS bindings for WebGPU, for use with variants of wasm-webgpu

Currently, these are specified via CMake configure-time defines, but this is problematic:

  • For JIT compilation, specifying WEBGPU_NATIVE_LIB as a CMake configure-time definition means there is no way to customize what Dawn variant you use (unless you build from source, which would significantly limit the usefulness of prebuilt Halide binaries with WebGPU at the present time)

  • for AOT compilation, having these as CMake defines is not unreasonable... assuming you are compiling C++ Generators using CMake. If you aren't using CMake, or are exclusively using the Python bindings, you'll have to figure out the right setup for these on your own, which would be painful.

This PR proposes to replace the CMake defines for these two things with plain old environment variables. I'm not wild about using env vars to point to executable files -- aside from the possible security risk, it's generally problematic IMHO to have build-time options controlled by a silent env var -- but I'm not sure that we have a better option here.

At some point, we may be able to simply assume that Node has the bindings built in, and Dawn (etc) would be available via a predictably-named shared library that can be presumed to be in the current search path... but neither of these is the case at the moment.

attn: @jrprice

Halide promises that you can crosscompile to *any* supported target from a 'stock' build of libHalide. Unfortunately, the initial landing of WebGPU support breaks that promise: we compile the webgpu runtime support (webgpu.cpp) with code that is predicated on `WITH_DAWN_NATIVE` (for Dawn vs Emscripten, respectively).

This means that if you build Halide with `WITH_DAWN_NATIVE` defined, you can *only* target Dawn with that build of Halide; similarly, if you build with `WITH_DAWN_NATIVE` not-defined, you can only target Emscripten. (Trying to use the 'wrong' version will produce link-time errors.)

For people who build everything from source, this isn't a big deal, but for people who just pull binary builds, this is a big problem.

This PR proposes a temporary workaround until the API discrepancies are resolved:
- Compile the existing webgpu.cpp runtime *both* ways
- in LLVM_Runtime_Linker.cpp, select the correct variant based on whether the Target is targeting wasm or not
- Profit!

This is a rather ugly hack, but it should hopefully be (relatively) temporary.
@steven-johnson steven-johnson added the buildbot_test_everything Buildbots should run all available tests on this PR (unless build_test_nothing is set). label Mar 15, 2023
Base automatically changed from srj/webgpu-two-variants to main March 16, 2023 21:35
@steven-johnson
Copy link
Contributor Author

(This should finally go green once buildbots get to it -- fingers crossed -- so please consider reviewing it)

@abadams abadams requested a review from alexreinking March 17, 2023 17:55
@abadams
Copy link
Member

abadams commented Mar 17, 2023

+Alex for the cmake change

@alexreinking
Copy link
Member

CMake should really keep using the cache variable, but initialize it from the environment variable.

No matter what we do, the generated build system (be it Make, Ninja, or VS) won't be sensitive to a change in environment variables (i.e. CMake will not regenerate), so we might as well provide a consistent, platform-independent way of setting these variables (i.e. the cache).

Moreover, this prevents unexpected changes to this value if you have to reconfigure for some other reason and do so from a different environment.

@alexreinking
Copy link
Member

alexreinking commented Mar 17, 2023

I would have expected to see something like this in dependencies/webgpu/CMakeLists.txt:

find_library(
  Halide_WEBGPU_NATIVE_LIB
  NAMES webgpu_dawn wgpu
  HINTS ENV HL_WEBGPU_NATIVE_LIB  # for compatibility with Make
)

# Note my review comment about the test launcher script.
# find_path(
#   Halide_WEBGPU_NODE_BINDINGS
#   NAMES dawn.node
#   HINTS ENV HL_WEBGPU_NODE_BINDINGS  # for compatibility with Make
# )

if (Halide_WEBGPU_NATIVE_LIB AND NOT TARGET Halide::webgpu)
  add_library(Halide::webgpu UNKNOWN IMPORTED GLOBAL)
  set_target_properties(
    Halide::webgpu
    PROPERTIES
    IMPORTED_LOCATION "${Halide_WEBGPU_NATIVE_LIB}"
  )
endif ()

Then you could link to Halide::webgpu conditionally if it exists, later.

cmake/HalideGeneratorHelpers.cmake Outdated Show resolved Hide resolved
@steven-johnson
Copy link
Contributor Author

CMake should really keep using the cache variable, but initialize it from the environment variable.

No matter what we do, the generated build system (be it Make, Ninja, or VS) won't be sensitive to a change in environment variables (i.e. CMake will not regenerate), so we might as well provide a consistent, platform-independent way of setting these variables (i.e. the cache).

Moreover, this prevents unexpected changes to this value if you have to reconfigure for some other reason and do so from a different environment.

I'm a little unclear on precisely what you are suggesting: is the idea that we have a cache variable in CMake that defaults to initialization from an env var? I see what you are getting at, but that opens up a different can of worms, namely, that code using the JIT will immediately recognize changes to the HL_WEBGPU_NATIVE_LIB env var, but code using AOT will ignore such changes unless you re-run CMake. (Correct me if I'm wrong, but even an 'ordinary' rebuild via ninja etc wouldn't trigger a cached cmake variable to pick up the env var change.) I find this to be unsettling, and while maybe it's not worse than the status quo of this PR, it doesn't strike me as better. Is there a detail I'm missing here?

@steven-johnson
Copy link
Contributor Author

I would have expected to see something like this in dependencies/webgpu/CMakeLists.txt:

At first I said to myself "but there isn't any such file", but then I realize you were suggesting I add one. Your suggestion looks reasonable, though it does potentially conflict with my concern about binding to a cmake cached var in a previous comment response.

@steven-johnson
Copy link
Contributor Author

Gentle review ping for the changes (which aren't complete but address concerns partially)

@alexreinking
Copy link
Member

alexreinking commented Apr 18, 2023

Gentle review ping for the changes (which aren't complete but address concerns partially)

This is better, but as I wrote in the review comments of #7512, using environment variables for project configuration in CMake is vexing. The generated build systems don't track dependencies on the values of environment variables for the regeneration step; they only track files: the cache file, the CMake code files that were executed as part of the build, and those files which were explicitly added to the CMAKE_CONFIGURE_DEPENDS properties (via any of the, like, 10 mechanisms that do this).

I understand you have some concerns about this state of affairs and I probably wouldn't die on this particular hill if the environment variable only affected Halide's own build. However, it's going downstream via HalideGeneratorHelpers.cmake and I really don't want to ship CMake anti-patterns to our users.

@steven-johnson
Copy link
Contributor Author

At this point I'm fine with revising it as you describe in #7512 [not #7152 as mentioned above], but I'm unclear on which comments are still relevant, since I've been away from this PR for a while now...

at one point you mention adding dependencies/webgpu/CMakeLists.txt, then subsequently point out that if we use the dependency in HalideGeneratorHelpers that we can't do that after all, and need to add a Find module... can I trouble you to synopsize what you think is necessary here to wrap this up The Right Way?

@alexreinking
Copy link
Member

can I trouble you to synopsize what you think is necessary here to wrap this up The Right Way?

Absolutely. I'll just do it and put a write up in the comments here.

alexreinking and others added 3 commits April 19, 2023 18:36
* Use a find-module to locate Dawn/WebGPU

* trigger buildbots

* Tickle buildbots

* Update Generator.cpp

* Update CodeGen_WebGPU_Dev.cpp
@steven-johnson
Copy link
Contributor Author

Looks like the find_package(Halide_WebGPU) is failing on the armbot: https://buildbot.halide-lang.org/master/#/builders/33/builds/238 -- not sure why yet

@steven-johnson
Copy link
Contributor Author

The env var is set correctly:

$ ls -l $HL_WEBGPU_NATIVE_LIB
-rwxr-xr-x  1 halidenightly  staff  88849 Mar 13 14:32 /Users/halidenightly/dawn/build-dawn/src/dawn/native/libwebgpu_dawn.dylib

@alexreinking
Copy link
Member

@steven-johnson -- I think that link is invalid after the build-master reset. Is there a new link?

@steven-johnson
Copy link
Contributor Author

steven-johnson commented Apr 21, 2023

@steven-johnson -- I think that link is invalid after the build-master reset. Is there a new link?

There will be shortly

@steven-johnson
Copy link
Contributor Author

@steven-johnson -- I think that link is invalid after the build-master reset. Is there a new link?

https://buildbot.halide-lang.org/master/#/builders/76/builds/11
https://buildbot.halide-lang.org/master/#/builders/83/builds/6

@steven-johnson
Copy link
Contributor Author

Doing some debugging by enabling CMAKE_FIND_DEBUG_MODE, it seems that CMake expects the hint to be the directory to look in, not the full path to the dylib (whereas HL_WEBGPU_NATIVE_LIB is explicitly a full path). I'll revise the code accordingly.

@steven-johnson
Copy link
Contributor Author

Hopeful fix pushed

Copy link
Member

@alexreinking alexreinking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it took so long to get to this, but it's good to land!

Copy link
Member

@alexreinking alexreinking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, just finished my coffee and thought of this... @steven-johnson - can you test this?

cmake/FindHalide_WebGPU.cmake Outdated Show resolved Hide resolved
@steven-johnson
Copy link
Contributor Author

ops, fixed?

Copy link
Member

@alexreinking alexreinking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming the buildbots like it

@alexreinking alexreinking merged commit 7cdbc71 into main May 2, 2023
@alexreinking alexreinking deleted the srj/webgpu-env-vars branch May 2, 2023 04:05
steven-johnson added a commit to halide/build_bot that referenced this pull request Nov 27, 2023
These are ancient and buggy and inconveniently located, and we now finally have redundancy for them, so let's remove them for good. (The mac bot in particular will never properly support WebGPU)

Also remove code for halide/Halide#7422 lands that has landed already
steven-johnson added a commit to halide/build_bot that referenced this pull request Nov 27, 2023
These are ancient and buggy and inconveniently located, and we now finally have redundancy for them, so let's remove them for good. (The mac bot in particular will never properly support WebGPU)

Also remove code for halide/Halide#7422 lands that has landed already
ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
AOT pipelines that rely on Dawn/WebGPU now depend on a new
Halide_WebGPU find-module. This module honors the make-ish
HL_WEBGPU_NATIVE_LIB variable as a means of initializing the
Halide_WebGPU_NATIVE_LIB cache variable. This is automatically handled
by add_halide_generator and add_halide_runtime and is available to
downstreams.

The JIT tests no longer read the HL_WEBGPU_NODE_BINDINGS environment
variable during the CMake configure or build phase. Instead, a test
launcher reads it at CTest runtime.

Co-authored-by: Alex Reinking <quic_areinkin@quicinc.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buildbot_test_everything Buildbots should run all available tests on this PR (unless build_test_nothing is set).
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants