RFC - Creating an openxla-nvgpu project #71
Comments
Fantastic! Looking forward to the collaboration. |
Will this sit directly under |
Yes, that is what I'm proposing. |
cc @nluehr |
Does that translate to “write access”? |
Yes (which we already do on these repos for other contributors), but I tried to phrase it more generally. Being a "component maintainer" in OpenXLA parlance grants some other privileges in terms of overall openxla direction/management. |
What will be the C++ namespace for all the new code in this project? |
Not sure I would speak to "all of the new code", but historically, dialects and components of IREE take a namespace like If not extending that namespacing scheme, I'd encourage at least something similar: |
I'm curious why the sub-directory names are repeating "openxla"? That is why |
It is just following the IREE convention of the include directory being rooted at the I've more or less resigned myself to the fact that the least bad thing is to repeat yourself exactly once in service of having globally unique include paths. |
I'm not a big fan of What if I want to write custom VM module under |
Oh I missed the point about the include directory convention, makes sense! |
Just throwing things out there... |
I'd go with |
I'm game to try it. Like I said, I'm pretty sure that what we pick, we come back in a few months with some code written and apply a bit of sed, but I do think we want globally unique: there are many use cases where these will be linked together for both recommended and unrecommended reasons. Let's not set ourselves up for accidental name collisions. Also, with C++17, the cost of namespaces (in terms of keystrokes) is a lot lower. |
I would be concerned with plugins and target specific components redefining symbols in shared namespace: I'm not sure what the benefits of eliding the target from the namespace buys us in practice? |
(ftr - we're not exporting such legacy rules to our new OSS projects) |
Another directory naming question, why Will we also have some kind of Will we depend on absl/tsl? E.g. logging, Status, StatusOr inside nvgpu compiler/runtime? Or in compiler use LLVM logging (almost non existent), and in runtime use IREE logging (no idea what's the status). |
What will be the license? |
That is being discussed, but we are biasing towards getting moving at the moment over busting out dependencies. We have some work to do to get the dev/versioning workflow going for the level of things we have, and I'd rather get some more mileage on that before we go too crazy with a lot of repositories.
The further "core-ward" we go, we have no plan to depend on absl/tsl, and I would be somewhat resistant to doing so because they have both proven to be problematic (so much so that we excised after thinking "how bad could it be?"). Concrete thoughts... I don't think that we should be mixing universes in the compiler code and need to "build up" from LLVM vs grafting other base libraries. The runtime code for nvgpu has some more give to it from a dependency standpoint, but for the level of things expected to be in there, I would like to avoid the complexity that comes from taking complicated deps if possible. Some of this stuff is preference and some has proven to be more trouble than it is worth in the past... The hard line that we can't cross in this repo is that dependencies must be thin and must have a well supported CMake build. |
I don't have a preference. |
Never waste a good opportunity to bike-shed 😄 Logging, as brought up by @ezhulenev, caught my eye. For the compiler, I would suggest we try to use the diagnostic handler infrastructure as much as possible and log warnings/errors there. That will force us to provide messages with good context. Regarding namespaces: I agree with @stellaraccident: Let's bias on progress rather than perfect choice for now. Having said that, I personally would use Excited to see this project spin up! |
This has been a heavy overhead week for me but I should finally get some coding time this afternoon, and since I can probably bootstrap the project somewhat efficiently, I'll take a stab at that. As noted, I'll stage it in an iree-samples directory first and will then hand off to someone Google-side to create the repo (which requires a bit of red tape). |
Requires IREE changes from: iree-org/iree#12888 Progress on openxla/community#71
All right... the above two commits seem to get me most of the way there. Things build, etc. Was a bit of a slog. |
…12888)
* Adds CMake scoped IREE_PACKAGE_ROOT_DIR and IREE_PACKAGE_ROOT_PREFIX to replace hard-coded path to namespace logic in iree_package_ns (and uses within IREE/removes the special casing).
* Adds support for `BAZEL_TO_CMAKE_PRESERVES_ALL_CONTENT_ABOVE_THIS_LINE` to bazel_to_cmake. Been carrying this patch for a while and ended up needing it.
* Further generalizes bazel_to_cmake target resolution so that it is customizable as needed out of tree.
* Moves the `iree::runtime::src::defs` target to `iree::defs` and puts it in the right place in the tree to avoid special casing.
* Ditto but for `iree::compiler::src::defs`
* Adds a bit more logging to `_DEBUG_IREE_PACKAGE_NAME` mode.
* Makes iree_tablegen_library consult a scoped `IREE_COMPILER_TABLEGEN_INCLUDE_DIRS` var for additional include directories (makes it possible to use out of tree).
* Adds `NVPTXDesc` and `NVPTXInfo` targets to HAL_Target_CUDA. No idea why this was triggering for me but was getting undefined deps. Must have been coming in elsewhere in a more full featured build.
* Fixes iree-opt initialization sequence with respect to command line options. Also fixed the test which should have been verifying this.
* Fixed pytype issue in bazel_to_cmake that could theoretically happen.
Fixes build issues related to the out of tree build for openxla/community#71
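For readers unfamiliar with the path-to-namespace idea mentioned in the first bullet, here is a rough Python model of it. This is only a sketch of the concept, not the CMake implementation: the function name `package_ns`, its parameters, and the example paths are all hypothetical, and the real `iree_package_ns` may differ in its details.

```python
from pathlib import PurePosixPath


def package_ns(current_dir: str, root_dir: str, root_prefix: str = "") -> str:
    """Derive a '::'-separated package namespace from a directory.

    Assumption: the namespace is the directory path relative to a scoped
    package root, optionally prepended with a prefix, with '/' turned
    into '::'. This is a model of the idea, not IREE's CMake code.
    """
    rel = PurePosixPath(current_dir).relative_to(PurePosixPath(root_dir))
    prefix_parts = [p for p in root_prefix.split("/") if p]
    # rel.parts is empty when current_dir is the root itself.
    return "::".join(prefix_parts + list(rel.parts))


# Hypothetical out-of-tree layout: the root is the project's source dir.
print(package_ns("/ws/plugin/src/openxla/compiler/nvgpu", "/ws/plugin/src"))
# Hypothetical prefixed root: targets under the root get "iree::..." names.
print(package_ns("/ws/core/src/defs", "/ws/core/src", "iree"))
```

With a scoped root and prefix, an out-of-tree project can set both per directory tree instead of relying on special cases baked into the macro.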
And initial CuDNN custom module that does nothing related to CuDNN yet: https://github.com/iree-org/iree-samples/pull/123/files |
Both of the initial commits have landed. @theadactyl Can we get someone with admin rights on the org to create the openxla-nvgpu repo? It should be populated with https://github.com/iree-org/iree-samples/tree/main/openxla-nvgpu |
Coming in late and I think just agreeing with what's already been decided, but strong positions on: not taking an absl or tsl dep in the compiler and scoping the [sub]namespace to the project (so not just "iree" or "openxla" or "compiler") I do think that the Google (Titus) advice on not having deeply nested namespaces is pretty good and not just a weird Google thing: https://abseil.io/tips/130. Every level of nesting gives us a fun new place for collisions. So I would vote for something like |
I don't like +HUGE to |
Now that RFC - Proposal to Build IREE Compiler Plugin Mechanism has been implemented and is minimally working with both in-tree and an early adopter out of tree implementation, we are ready to propose the official creation of a new `openxla-nvgpu` project to house:
Since this will be the first plugin project of note, we expect that, to a certain extent, it will co-evolve with the mechanisms.
Proposed Directory Structure
`openxla-nvgpu` will have a similar directory layout to IREE, upon which it depends:
The build setup will use `bazel_to_cmake` for consistency and interop with the upstream IREE project (now that iree-org/iree#12765 has taken the steps to allow it to be used out of tree). The corresponding IREE build macros will be extended as needed.
Dependencies
This project depends directly on:
It transitively depends on:
Releases
In the very short term, this project will not produce independent releases and will aim to be usable by project developers who are willing to build it from HEAD.
In the longer term, we would like to introduce a new top-level `openxla-compiler` releaser project which is responsible for packaging the platform (IREE) with stable versions of all supported vendor compiler plugins and making them available as a standalone binary release. Such a project would eventually depend on this one and would effectively be the plugin-aggregated release of the present day `iree-compiler` packages (which will continue to be released as the "vanilla" platform without out-of-kind platform specific dependencies).
Also in the longer term, as the PJRT plugin support evolves, we anticipate releasing `openxla-nvgpu-pjrt` binary packages that can be used to interface NVIDIA GPUs to supported ML frameworks via `pip install`.
Versioning and CI
Adding this top-level project pushes us firmly into a "many-repo" layout for OpenXLA projects. This will be further reinforced as IREE's dependencies and build tools are disaggregated over time and the top-level releaser projects are established.
As part of this, we will introduce a side by side workspace layout where dependencies are found relative to each other based on parent directory. Example:
Such a layout will be called an "OpenXLA workspace" and we will provide a `sync` script and CI tooling to help manage it. Each project will pin to green release tags in its parent (or other form of stable commit tracking) by maintaining a local metadata file of its openxla dependencies. The `sync` script will be both a simple way for developers to track known-good sync points for the workspace and for CIs to advance. There will be a CI bot which advances dependent projects to next stable sync points automatically. We expect that for projects that already have a strong release cadence like IREE, this will update pins to new nightly releases, and others will cascade from at-head commits.
This process of versioning will be developed over time with an eye towards being the one way to manage openxla project dependencies. It will likely be somewhat manual to start with. This will be derived in spirit from the PJRT plugin sync script and enhanced to provide better release tracking and version bump capabilities.
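To make the workspace idea concrete, here is a minimal Python sketch of how a project might locate its sibling dependencies and track pins. Everything specific in it is an assumption for illustration: the metadata file name `openxla_deps.json`, its JSON shape, and the function names are hypothetical and are not taken from the actual sync tooling.

```python
from __future__ import annotations

import json
from pathlib import Path

# Hypothetical file name; the RFC only says "a local metadata file".
PIN_FILE = "openxla_deps.json"


def workspace_root(project_dir: Path) -> Path:
    """Projects sit side by side, so the workspace is the parent directory."""
    return project_dir.resolve().parent


def load_pins(project_dir: Path) -> dict[str, str]:
    """Read {dependency name: pinned revision}; empty if no file exists yet."""
    pin_path = project_dir / PIN_FILE
    if not pin_path.exists():
        return {}
    return json.loads(pin_path.read_text())


def resolve_deps(project_dir: Path) -> dict[str, tuple[Path, str]]:
    """Map each pinned dependency to its sibling checkout and revision."""
    root = workspace_root(project_dir)
    resolved = {}
    for name, revision in load_pins(project_dir).items():
        dep_dir = root / name
        if not dep_dir.is_dir():
            raise FileNotFoundError(f"{name} is not checked out under {root}")
        resolved[name] = (dep_dir, revision)
    return resolved


def advance_pin(project_dir: Path, name: str, new_revision: str) -> None:
    """Record a new known-good revision, as a developer or CI bot might."""
    pins = load_pins(project_dir)
    pins[name] = new_revision
    text = json.dumps(pins, indent=2, sort_keys=True) + "\n"
    (project_dir / PIN_FILE).write_text(text)
```

In this sketch, a workspace containing `iree/` and `openxla-nvgpu/` side by side would have `openxla-nvgpu/openxla_deps.json` name `iree` and a revision, and `resolve_deps` would fail loudly if the sibling checkout is missing. Actually checking out the pinned revision (for example via git) is deliberately left out.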
Benchmarking
Benchmarks of the NVIDIA toolchain will be largely inherited and leveraged from the IREE project but run independently so as to provide a continuous view of the performance deltas and characteristics of the platform-independent upstream and the vendor-specific downstream.
Next steps
As a very next step, the `openxla-nvgpu` project will be bootstrapped in the iree-samples repository. It will be relocated, with history, to a new git repository once this RFC has matriculated.
Project Ownership
The project will be set up as a collaboration between Google and NVIDIA, and per OpenXLA governance, will share maintainer responsibility between contributors from both companies with the goal of NVIDIA engineers taking on core maintainer responsibility as the project bootstraps and evolves.