RFC - Proposal to Build IREE Compiler Plugin Mechanism #12520
Is there a mechanism by which we can override default codegen for a dispatch region and replace it with a plugin-provided compilation path? The Custom Dispatches section suggests that doing so would require intervention in the front-end compiler ("they expose themselves to the top-level of the program with IR structures that can be targeted by either originating systems or top-level compiler transformations"). I'd like to see a mechanism where the plug-in can decide, for a given dispatch region, whether to use the default path or provide an alternative one. This can be useful for opinionated implementations that we don't want to support in the baseline compiler (e.g., use Cutlass micro-kernels). EDIT: @benvanik suggests this might be done by contributing custom passes in the plug-in?
And an adjacent point, for my education: does IREE use a "nesting doll" blob? (i.e., can a custom pass plug-in "undo" or heavily modify one (probably default) pass's output using information from a previous pass?) EDIT: One example is some of the "auto-sharding" passes; there are projects that do a lot of custom wrangling of XLA to get things to work (e.g. the ALPA project). Having a pluggable custom pass that is able to view the full trace would mean that there would be no reason to fork the entire compiler.
There are various ways to accomplish that but I suspect you are thinking of a model that we don't use. In general, we try to hold to the mantra that it is far easier to make a good decision initially than attempt to undo it later... but of course, mantras rarely apply completely. That is why we are letting the plugin mechanism act as early as it needs to in order to get it right initially, and we expose the deeper IR layers up there so that if it wants to commit a decision, it can. IREE rides on top of MLIR, which is hierarchical and allows a number of workflows that are not dissimilar to what I think you are asking (i.e. you could snapshot and nest versions of the IR for use later). It just gets really hard to reason about things if they end up getting all tangled up and unstructured.
I enjoyed looking through the linked examples demonstrating host module extension in C++ and Python (I especially liked the possibility of defining custom types), as well as kernel dispatches. I believe the effort spent on implementing a robust infrastructure for extensibility of the IREE plugin mechanisms will go a long way towards opening R&D doors that previously were just not present. Robustness benefits aside, in my view one of the most salient features of this proposal is extending the MLIR passes. Do you think we should also think about extensibility of autotuning/kernel-selection logic at some point? The MLIR passes are static in nature and do not provide an entry point for such logic, at least in declarative form.
Yes - in my view once we get over this first hurdle, it becomes relatively easy to add additional extension points. Of course, it doesn't relieve us from needing to actually figure that stuff out, but it does create the outlet that we can connect them to in a low cost way. That should help a lot. I decided to draw the line at "passes" for this first cut at the mechanism because it is a mature and obviously useful cut point. There is early work for more declarative approaches to some of the other things, and I left them out of this initial RFC because it makes more sense to look at them as a successor once the mechanism is in place.
^ and pass pipelines with options are effectively like command lines with flags and act as a pretty nice way of setting things up. It also allows for breaking down the pipelines, building reproducers, and running them standalone with iree-opt-style tools. The pipelines themselves can have nested dynamic pass pipelines too, such that the top-level pipeline may only conditionally encode a pipeline based on the IR present in the program, attributes on the IR (target configuration), or the options provided to the pipeline. We use this in the compiler itself, for example, to make a single pass that translates all executables in the program work by forking out to pipelines that translate executables for a particular backend.
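A toy model of that conditional forking, in plain Python rather than the actual MLIR/IREE pass APIs (every name below is illustrative):

```python
# Schematic model of a top-level pass that forks each executable out to
# a backend-specific sub-pipeline based on an attribute found in the IR.
# This is plain Python for illustration only; the real mechanism uses
# MLIR's nested dynamic pass pipelines, and all names are hypothetical.

def cpu_pipeline(executable):
    executable["stages"].append("lowered-to-llvm")

def gpu_pipeline(executable):
    executable["stages"].append("lowered-to-spirv")

# Maps a target attribute on the IR to the pipeline that handles it.
BACKEND_PIPELINES = {"cpu": cpu_pipeline, "gpu": gpu_pipeline}

def translate_all_executables(module):
    """Single top-level 'pass' that conditionally dispatches each
    executable to the sub-pipeline selected by its target attribute."""
    for executable in module["executables"]:
        pipeline = BACKEND_PIPELINES.get(executable["target"])
        if pipeline is None:
            raise ValueError(f"no pipeline for target {executable['target']!r}")
        pipeline(executable)
    return module

module = {"executables": [
    {"target": "cpu", "stages": []},
    {"target": "gpu", "stages": []},
]}
translate_all_executables(module)
```

The point of the structure is that the top-level pipeline stays fixed while the nested pipelines it encodes vary with the program and its target configuration.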
There is always the option of injecting, say, PDL patterns via command-line flag into passes (or as additional module files that are ingested by the pipeline), and Ben makes a good point on nested dynamic pipelines. But I agree that passes provide a good starting point here and would get us quite far. Experimenting dynamically via Python or some such could help shorten the iteration cycle, and results could then be baked in further.
On modifying the pass manager, I have proposed a more dynamic way to register passes here. I appreciate that this is not even close to what this proposal is about, but it's something that would help prototyping without burdening IREE's CI and local developer builds. Right now, we have our own fork, which adds our project to IREE as a submodule. If this were ever accepted, every IREE developer would pay some cost (clone, update, and maybe even build and test). While we could add a lightweight plugin for our work into IREE, we couldn't just move our entire project inside IREE, so IREE would still need to clone and build it. Unless I'm missing something, this proposal could perhaps help us do that in a more integrated way?
The compiler side mechanisms won't require modifying IREE (you'll be able to pip install iree-compile and point it at your plugins, which can be built separately), and the runtime side will also not require modification (hacking on that right now), so you'll be able to use iree-benchmark-module/pjrt/other binding layers without needing to rebuild/modify those.
There's a short-term/long-term thing at play here, and this RFC is a very first step for modularizing the codebase from a build/dev/integration perspective. Where I think this first spike will get to is allowing a separate plugin repository to exist. If we end up relying on this plugin as part of a supported release, then yes, we will collectively be on the hook for lock-step integrates/updates, but there is a ramp to get there (and there needs to be a support agreement of some kind), and this RFC gives the optionality for it to exist independently. I'm specifically not committing this RFC to the more adventurous integrations like "download a binary of iree-compile and link to arbitrarily built C++ plugins". Things are possible there, but we need to step slowly, as that shared-library/deployment situation needs to improve in lock-step. I've also not scoped this RFC to ergonomics of the runtime side. Will be looking at that more as we walk down the path.
(I'll also reiterate the comment I made in the email linking to this RFC: happy to have an open meeting to discuss. Please speak up if valuable)
I think we're on the same page; let's tackle one problem at a time. We have a thread about exposing a stable API for injecting passes before any dynamic work is discussed: the idea is to separate the passes into layers (tensor, memref, scf, llvm) and to build IREE with them as just calls. This will allow us to adjust our transforms to match the rest of IREE's pipeline. Eventually we'll look at other frameworks, hoping to find a common pattern. If that doesn't quite work (because passes change too much, or we can't find a fixed point), then we'll have to try something different. An idea I discussed with @nicolasvasilache is to break the steps down across different tools, old-compiler style, piping IR through different tools. It's a lot easier to get a stable exchange format (MLIR) across tools if we serialize it. In that case we'd use IREE to convert Python models into MLIR and possibly run some passes; then our compiler would kick in and spit its result to the back-end of IREE, to run on CPU, GPU, whatever. This way we can easily control the toolchain and dynamic library loading, and we don't need to be in lock-step on LLVM sources, etc. We could still add plugins at the HAL level or module level, but those would be 100% IREE owned (we'd help with maintenance, obviously), so that the glue exists in IREE but needs some external tool to make it work. Again, all of these are long-term goals, and probably better ideas will come before we even get to work on them, so just thinking out loud.
* CMake plumbing to support static, linked-in plugins. * Example plugin added. * PluginManager wired into compiler driver. Per RFC #12520
I believe that means a plugin will also be able to offer an executable import provider and any extra dependencies for the executable loader? Currently, both of these are parameters when configuring IREE, and the current description of plugin capabilities does not mention them.
This is mostly describing the compiler side of things, which in your case is the part that decides to emit the calls to runtime-resolved imports. The runtime side is simpler in scope and mostly done (just have to fix a Windows thing and improve the sample): #12625. With this, runtime-linked imports are much easier to work with, as they no longer require rebuilding the IREE runtime or whatever is hosting it (PJRT, Python wheels, etc). Building a plugin just requires a single header and can be done in whatever build infra you want, with the caveat that system-library-based plugins (shared libraries, dlls, dylibs) require the people building the plugin to manage multi-targeting if their users want it. It'll still be best practice to do things purely with compiler plugins where possible - such as JITing kernels in your compiler plugin using the statically available information and attaching the .o for static linking - to avoid the deployment complexity/bloat, but such is the case with all software doing this stuff, and now IREE won't be standing in the way :)
* CMake plumbing to support static, linked-in plugins. * Example plugin added. * PluginManager wired into compiler driver and API. Per RFC #12520
Ok, the first step is now landed, which creates the basic mechanism and a (really) simple example plugin. Just enough to statically link plugins, activate them, bind to options, and register dialects/passes. I'll be working on a more comprehensive (out of tree) example next and using that to drive follow-ons. Overall readme: https://github.com/openxla/iree/blob/main/compiler/src/iree/compiler/PluginAPI/README.md
Includes updates to the plugin mechanism: * Enables building plugins at a specific path (in or out of repo) * Re-organizes initialization sequence to separate dialect registration from activation * Wires into iree-opt * Adds a mechanism for extending pipelines with additional passes and wires that into the preprocessing pipeline Progress on #12520
This reworks bazel_to_cmake so that it searches for a .bazel_to_cmake.cfg.py file and uses its location as the root of the repository. This file is also evaluated and provides repository-specific configuration for the tool. There may still be some work to fully generalize but this should get us pretty close to being able to use bazel_to_cmake in related OpenXLA projects. Progress on #12520 for letting us use bazel_to_cmake in out of tree plugin repositories.
This has been accepted/implemented.
The topic of how to extend the IREE compiler has been discussed in back channels for a long time. While the emergence of a mechanism to manage the complexity of multiple differentiation points has been inevitable, I have been holding off on instantiating such a thing for the last nine months while requirements and initial baseline support for key platforms emerge. In short, such mechanisms can be a massive boon to an ecosystem, but if introduced in the wrong way or at the wrong time, they can be a force for greater fragmentation and hollow out various core development activities needed for the long-term health of the ecosystem.
As such, we have been taking the strategy of incrementally enabling the extension points that a plugin can manipulate in order to meet some project/vendor/device-specific goals, while holding off on actually creating the point in the codebase where such specialization happens. This has allowed us to ensure that such extension points are well represented up to the input layer and exposed in a way that is testable, cross-platform, and independent.
The following extension points are generally available in the codebase today:
HAL Driver Target: Extension point for providing last-mile lowering of device executables to a concrete platform such that they can be instantiated and used by a corresponding HAL runtime component (which is not necessarily 1:1). While many of these exist in the same directory tree as the mainline compiler sources, a mechanism for contributing additional driver targets has existed for some time. See the ROCM driver target as an example.
Custom Modules: In IREE's breakdown of host and device processing, custom modules provide compiler support (and a defined correlation with the runtime) for arbitrary host-side extensions. This allows for arbitrary extension of both ops and data types in a way that flows through to all compiler outputs (VM, C sources) uniformly and mirrors runtime code that can be optionally depended on. Note that even on the CPU, IREE enforces a host-device split where "host" code is responsible for scheduling and arbitrary execution, while "device" code represents dispatched kernel work. See also the Python sample for an idea of what can be done if more assumptions about the hosting environment of the compiler and runtime are taken.
Custom Dispatches: Device side code generation can be arbitrarily front-loaded or extended via the custom dispatch mechanisms. This provides a platform-neutral extension nomenclature and IR for composing device code in various ways including:
These extension points all follow the same pattern: they expose themselves to the top-level of the program with IR structures that can be targeted by either originating systems or top-level compiler transformations.
Preprocessing Hooks: Simple facilities for experimentation and ad-hoc exploitation of the extension points. While not intended for a shipping compiler, using the preprocessing hooks is a low-code/low-debate way to prototype specific customizations that may be experimental or not suitable for general inclusion into a default compiler flow. These are primarily exposed via:
* Passes in the Preprocessing directory.
* A preprocessing pass pipeline flag (iree-preprocessing-pass-pipeline).
* An executable preprocessing flag (--iree-hal-preprocess-executables-with=).
Composing Extension Points
As has been mentioned, work to date has been focused on creating the extension points, not the means of exploiting them (beyond command line flags
and manipulation of the input program). This RFC proposes defining such a mechanism, formalized as named "compiler plugins".
Goals:
Enable both in-tree and out-of-tree plugins as a directory of source code that can be systematically included in the compiler in a configurable way.
Mechanisms for activating a specific list of available plugins as part of a compiler invocation (i.e. command-line flags, C API support, program level attributes).
Ability for the plugin to contribute:
Enable the plugin to configure passes that can be run at defined pipeline hook points in the main compilation flow.
Define a development process and categorization for out-of-tree, in-tree and default available plugins.
Support statically building and linking plugins into a resultant compiler so that they are available without further action.
Development support for dynamically linking plugins apart from the compiler and dynamically loading them (i.e. via an env var, etc).
Non-Goals:
Concrete Design Proposal
High-level points:
* A new PluginAPI library (in the compiler source tree).
We will start by creating an in-tree example plugin by elaborating the following directory structure:
The PluginRegistration.cpp file will expose a CAPI like:
Note that the definition as a plain C API is intended to provide a way for dynamic linking of plugins, not to imply a stable ABI. Most likely, this boilerplate will be generated by some form of registration macro in order to get export and visibility flags correct.
When built statically, the main compiler build system will generate code that calls these named registration functions and it will add the necessary target as a link dependency.
When built dynamically, the main compiler will consult a command line flag to determine plugin DSOs to be loaded, load each, and dynamically invoke the registration function. A to-be-determined naming correspondence should exist between the DSO name and the plugin ID in order to reduce configuration boilerplate. We will only support dynamic loading as a development activity and only via the -DIREE_COMPILER_BUILD_SHARED_LIBS=ON mode.
The iree_plugin_registrar_t *registrar pointer will be exchangeable for a C++ class that provides facilities:
There will be a CMake setting like -DIREE_COMPILER_EXTERNAL_PLUGINS=dir1;dir2 which enables a list of plugin directories to link into the compiler. We will also provide a short-hand for in-tree plugins like -DIREE_COMPILER_PLUGINS=foo;bar
Code Organization and Principles
The mechanism itself does not imply any principles for how plugins are developed in the ecosystem beyond putting in-tree and out-of-tree plugins on equal footing.
For an out-of-tree plugin, literally anything can be done and the IREE project developers set no policy. However, many plugins will want to migrate towards some level of supported or bundled status, and therefore certain principles and standards must be developed. It is expected that this will be an evolving area as we learn how to use the mechanism, so what follows is just the
starting point.
Support Tiers
In-tree Plugin Principles
Many platforms in this space have an "exciting" array of ways to target them. These can range from vendor-driven and external compiler/codegen solutions to (more typically) a plethora of libraries (often overlapping and at many levels of definition).
IREE seeks to make the platform usable for any of these, but we have certain design standards that bias the kinds of things we are willing to accept as an in-tree/Enabled plugin:
Implications for other efforts
Extending the project in this way will create further pressure on testing and CI. Both will likely need work in order to expand to this greater degree of generality.