[TIR] SplitHostDevice, handle subroutines #14918

Merged on May 26, 2023 (17 commits)

Lunderberg
Contributor

This PR refactors SplitHostDevice into three separate transformations. Previously, SplitHostDevice would replace each extracted device region with a builtin::tvm_call_packed() node. After this PR, this process is performed in three separate steps.

  1. AnnotateDeviceRegions: Annotate the regions that should be executed on another target.
  2. SplitHostDevice: Extract the annotated region into an independent PrimFunc, with a GlobalVar to represent the call into the new subroutine.
  3. LowerDeviceKernelLaunch: For any subroutine call where the caller and callee are on different devices, replace the call with a device kernel launch.

Prior to this commit, the `TargetNode::host` could be specified in
TVMScript as part of the config dictionary, under the key `"host"`.
However, this required all other device parameters to be explicitly
specified, rather than using any of the short-hand string
representations.  This commit forwards the `host` argument from TVMScript's
`T.target` method to `tvm.target.Target`, allowing both the device and
host to be specified using the shorthand string representation.

```python
@T.prim_func
def before_this_commit():
    T.func_attr(
        {
            "target": T.target(
                {
                    "arch": "sm_86",
                    "host": {"keys": ["cpu"], "kind": "llvm", "tag": ""},
                    "keys": ["cuda", "gpu"],
                    "kind": "cuda",
                    "max_num_threads": 1024,
                    "tag": "",
                    "thread_warp_size": 32,
                }
            )
        }
    )
    T.evaluate(0)

@T.prim_func
def after_this_commit():
    T.func_attr({"target": T.target("cuda", host="llvm")})
    T.evaluate(0)
```
[TIR] SplitHostDevice, handle missing kGlobalSymbol

Previously, the symbol name of the extracted compute kernel was
defined based on the `kGlobalSymbol` attribute, which was required to
be present.  This commit updates `SplitHostDevice` to generate the
symbol name using `kGlobalSymbol` if present, and to fall back to the
name of the `tvm::GlobalVar` for internal functions.
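
The fallback described above can be sketched in a few lines of plain Python. This is only an illustrative model, not the C++ code in `SplitHostDevice`: the function name `base_symbol_name` and the use of a dictionary for the function attributes are assumptions made for the example. The key `"global_symbol"` is the string behind `kGlobalSymbol`.

```python
from typing import Mapping

# String key behind tvm::attr::kGlobalSymbol.
GLOBAL_SYMBOL_ATTR = "global_symbol"


def base_symbol_name(func_attrs: Mapping[str, str], global_var_name: str) -> str:
    """Pick the name from which an extracted kernel's symbol is derived.

    Hypothetical helper: prefer the explicit global symbol when the function
    has one, otherwise fall back to the name of the GlobalVar, which is the
    case for internal functions.
    """
    symbol = func_attrs.get(GLOBAL_SYMBOL_ATTR)
    return symbol if symbol is not None else global_var_name


# An externally exposed function carries the attribute.
exposed = base_symbol_name({"global_symbol": "main"}, "mod_main")
# An internal subroutine has no attribute, so the GlobalVar name is used.
internal = base_symbol_name({}, "subroutine")
```

With this shape, an internal function no longer needs to carry the attribute just so that a kernel can be extracted from it.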
[TIR] Refactor SplitHostDevice into three separate passes

First pass, `AnnotateDeviceRegions`.  This pass decides which portions
of a PrimFunc should be run on the device, and annotates them with the
`kTarget` attribute, indicating which target should be used for later
lowering steps.

Second pass, `SplitHostDevice`.  This pass extracts the annotated
region into an independent PrimFunc.  The `kTarget` attribute of the
extracted kernel is defined by the `kTarget` annotation inserted by
`AnnotateDeviceRegions`.  The host function is marked by the
`tvm::tir::attr::kIsHostFunc` attribute, allowing it to be recognized
by later host-only lowering passes.

Third pass, `LowerDeviceKernelLaunch`.  This pass identifies
subroutine calls that call into device kernels, and rewrites them into
`T.tvm_call_packed`.
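
To make the division of labor between the three passes concrete, here is a toy model in plain Python. It does not use TVM and does not mirror the real implementation: the `Stmt` and `Func` classes, the `on_device` flag, and the `<name>_kernel_<index>` naming scheme are all assumptions made for illustration. Only the overall flow (annotate, extract, then lower cross-target calls) follows the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Stmt:
    text: str
    on_device: bool = False        # input hint: this statement needs the device
    target: Optional[str] = None   # stands in for the kTarget annotation
    callee: Optional[str] = None   # set when the statement is a subroutine call


@dataclass
class Func:
    target: str
    body: List[Stmt] = field(default_factory=list)


def annotate_device_regions(func: Func, device_target: str) -> None:
    """Pass 1: mark device statements with the target they should be lowered for."""
    for stmt in func.body:
        if stmt.on_device:
            stmt.target = device_target


def split_host_device(module: Dict[str, Func], name: str) -> None:
    """Pass 2: move each annotated statement into its own function and call it."""
    func = module[name]
    host_body = []
    for index, stmt in enumerate(func.body):
        if stmt.target is None:
            host_body.append(stmt)
            continue
        kernel_name = f"{name}_kernel_{index}"  # hypothetical naming scheme
        module[kernel_name] = Func(stmt.target, [Stmt(stmt.text)])
        host_body.append(Stmt(f"call {kernel_name}", callee=kernel_name))
    func.body = host_body


def lower_device_kernel_launch(module: Dict[str, Func]) -> None:
    """Pass 3: rewrite calls that cross targets into a packed-call launch."""
    for func in module.values():
        for stmt in func.body:
            if stmt.callee is not None and module[stmt.callee].target != func.target:
                stmt.text = f'tvm_call_packed("{stmt.callee}")'
                stmt.callee = None


module = {
    "main": Func("llvm", [Stmt("prepare"), Stmt("compute", on_device=True), Stmt("finish")])
}
annotate_device_regions(module["main"], "cuda")
split_host_device(module, "main")
lower_device_kernel_launch(module)
host_text = [stmt.text for stmt in module["main"].body]
```

After the second step the host function still contains an ordinary subroutine call. Only the third step decides, by comparing caller and callee targets, that the call has to become a kernel launch, so calls between functions on the same target are left untouched.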
@tvm-bot
Collaborator

tvm-bot commented May 22, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@Lunderberg
Contributor Author

This PR is a subset of the functionality in #14862. As that PR has grown to a difficult-to-review size, I'm carving off independent chunks of it into separate PRs, after which #14862 can be rebased to a more manageable size.

In addition, this PR depends on #14915 for the new unit tests, and should be rebased onto main after #14915 lands.

Previously, the SplitHostDevice pass added the
`tir::attr::kKernelLaunchParams` attribute, and the
LowerDeviceKernelLaunch pass filled in the values for it.  This
cleanup makes the kernel launch params be the sole responsibility of
LowerDeviceKernelLaunch.
@Lunderberg Lunderberg force-pushed the split_host_device_handle_subroutines branch from 7dabdd0 to a856076 Compare May 22, 2023 19:44
PRs apache#14913 and
apache#14914 made analogous changes to
`MakePackedAPI` and `MakeUnpackedAPI` to handle subroutine calls.
Both PRs introduced the same symbol,
`tvm::tir::SubroutineCallRewriter`, a local utility to update internal
calls to a modified function.  While each PR passed CI individually,
and was therefore able to merge, having both changes caused a
duplicate symbol.

This commit updates `MakePackedAPI` and `MakeUnpackedAPI` to place
their local utilities into anonymous namespaces, avoiding the
conflict.
Comment on lines 71 to 72
bool is_host_func =
func->GetAttr<Bool>(tvm::tir::attr::kIsHostFunc).value_or(Bool(false))->value;
Contributor

Re. discussion offline, do we want to update to not rely on kIsHostFunc?

Contributor Author

Agreed, and updated. The SplitHostDevice and LowerDeviceKernelLaunch passes no longer use the kIsHostFunc attribute at all.

Update to use the `tvm::tir::IsHostFunc` utility function, rather than
the `kIsHostFunc` attribute.  Per discussion on
apache#14020, the `kIsHostFunc` attribute
should only be used in `BindTarget`, and should not be re-introduced
in `SplitHostDevice`.
@Lunderberg
Contributor Author

(Merged #14950 into this PR to allow CI to run. No conflicts expected after CI, so github's squash/merge will remove it from the final commit.)

Contributor

@csullivan csullivan left a comment

LGTM, thank you for the refactoring! Looking forward to the additional functionality having SplitHostDevice split out into separate steps will enable during lowering.

@csullivan csullivan merged commit 6eb0779 into apache:main May 26, 2023
@tqchen
Member

tqchen commented May 27, 2023

There seems to be a regression caused by, or related to, this PR. Please look into it.

https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-gpu/detail/main/715/pipeline

@Lunderberg
Contributor Author

@tqchen Thank you for the heads up, and looking into it.

Lunderberg added a commit to Lunderberg/tvm that referenced this pull request May 30, 2023
This resolves an issue introduced by the combination of
apache#14918 and
apache#14945.  The bug occurred for
targets that do not require device-side codegen, but do require a
`device_type` other than `kDLCPU`.  It wasn't caught by CI, as the
issue only occurred with the combination of both PRs.

1. apache#14918 updated `SplitHostDevice` to only modify the `"target"`
   attribute when a device-side function has been extracted.

2. For VTA, there is no device-side function, as everything is done
   through host-side API calls.

3. From (1) and (2), the VTA examples kept the target
   `T.target("ext_dev", host="llvm")` after the `SplitHostDevice`
   pass, instead of being updated to `T.target("llvm")`.

4. apache#14945 restricted CombineContextCall to only apply to host-side
   functions.

5. From (3) and (4), the `CombineContextCall` pass was no longer
   applied to the VTA context calls.

This PR fixes `SplitHostDevice`, updating the target from
`T.target("ext_dev", host="llvm")` to `T.target("llvm")`, even if no
device sections have been extracted from the function.
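
The interaction between the two PRs can be reproduced with a small stand-alone model. This is a sketch under stated assumptions, not TVM code: the `Target` class, the `fixed` flag, and the rule that only `"llvm"` counts as a host target are simplifications invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    kind: str
    host: Optional[str] = None


# Toy assumption: only these target kinds count as host-side.
HOST_KINDS = {"llvm"}


def host_side_target(target: Target, extracted_device_func: bool, fixed: bool) -> Target:
    """Target of the host-side function after the split.

    With fixed=False the target is only rewritten when a device function was
    extracted (step 1 above). With fixed=True it is always rewritten.
    """
    if target.host is None:
        return target
    if extracted_device_func or fixed:
        return Target(target.host)
    return target


def combine_context_call_applies(target: Target) -> bool:
    """Stand-in for the host-only restriction described in step 4."""
    return target.kind in HOST_KINDS and target.host is None


vta = Target("ext_dev", host="llvm")

# Before the fix: nothing is extracted for VTA, so the target is left alone
# and the host-only pass skips the function.
before = host_side_target(vta, extracted_device_func=False, fixed=False)

# After the fix: the target becomes the host target even without extraction.
after = host_side_target(vta, extracted_device_func=False, fixed=True)
```

In this model `before` is still `Target("ext_dev", host="llvm")` and fails the host check, while `after` is `Target("llvm")` and passes it. Neither change is wrong in isolation, which matches the observation that each PR passed CI on its own.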
@Lunderberg
Contributor Author

Found the issue. It was a combination of this PR and #14945, which is why it wasn't caught by CI. A full description of the issue, and a bugfix, are in PR #14982.

@Lunderberg Lunderberg deleted the split_host_device_handle_subroutines branch May 30, 2023 14:22
Lunderberg added a commit to Lunderberg/tvm that referenced this pull request May 30, 2023
The functionality tested in this commit was added across several
recent PRs, each of which tested their features in isolation.  This PR
adds unit tests to validate the end-to-end behavior of TIR subroutine
calls.

PRs building up to this point:

- TVMScript
  - apache#14889
  - apache#14915
  - apache#14919
  - apache#14941

- Functionality improvements of existing TIR passes
  - apache#14913
  - apache#14914
  - apache#14918
  - apache#14951

- Changes to the TIR lowering flow
  - apache#14942
  - apache#14985

- Codegen updates
  - apache#14958
  - apache#14901

- Compatibility updates/fixes
  - apache#14892
  - apache#14950
  - apache#14943
  - apache#14944
  - apache#14945
  - apache#14952
  - apache#14982
  - apache#14949
Lunderberg added a commit that referenced this pull request May 30, 2023
…14982)

mei-ye pushed a commit to mei-ye/tvm that referenced this pull request Jun 1, 2023
* PR#14915 [TVMScript] Allow T.target("device", host="host") in TVMScript

* [Target] Added WithoutHost method

* [TIR] SplitHostDevice, handle missing kGlobalSymbol

* [TIR] Refactor SplitHostDevice into three separate passes

* Add unit tests specifically for SplitHostDevice behavior

* Added unit test specifically for AnnotateDeviceRegions

* Added unit tests for LowerDeviceKernelLaunch

* Minor cleanup, moved all kernel launch collection into one spot

* Updated unit tests for LowerWarpMemory

* Updated unit tests for ThreadSync

* Updated unit test for inject ptx async copy

* [Bugfix] Avoid symbol conflicts in MakePackedAPI/MakeUnpackedAPI

* Maintain "tir.is_global_func" attr in device-side entry point

* SplitHostDevice, update the host-side target to be the host

* [TIR] Update LowerDeviceKernelLaunch to avoid kIsHostFunc

* Remove is_host_func from SplitHostDevice tests
mei-ye pushed a commit to mei-ye/tvm that referenced this pull request Jun 1, 2023
…pache#14982)

masahi pushed a commit that referenced this pull request Jun 2, 2023
* [Bugfix][TIR][VTA] Update host-side target, even without device func

* [CodegenC] Updated unit test for sorted CodegenC output

Previously, this unit test generated a `Map<tvm::Target, IRModule>`
whose default iteration order was not sorted by function name, built
the `Map` of modules, then validated whether the resulting C code was
a sorted list of 4 elements.  However, this condition was stricter
than necessary, as it depended on the number of items added to the
`Map` until it was unsorted.

This commit updates the test to instead validate that `std::is_sorted`
returns true.

* Ignore __tvm_main__ in unit test
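
The relaxed check can be illustrated in Python. The actual test is written in C++ and uses `std::is_sorted`; the helper below is a hypothetical equivalent that only shows the idea of checking order without fixing the number of elements, while skipping `__tvm_main__`.

```python
from typing import Iterable, Sequence


def function_names_are_sorted(
    names: Sequence[str], ignored: Iterable[str] = ("__tvm_main__",)
) -> bool:
    """Return True if the names, minus the ignored ones, are in sorted order.

    Works for any number of names, so the check does not depend on how many
    functions were generated.
    """
    skip = set(ignored)
    relevant = [name for name in names if name not in skip]
    return all(a <= b for a, b in zip(relevant, relevant[1:]))


sorted_ok = function_names_are_sorted(["a", "__tvm_main__", "b"])
unsorted = function_names_are_sorted(["b", "a"])
```

Because the check compares neighbouring names only, it holds for a list of any length, which is the property the stricter "sorted list of 4 elements" condition lacked.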
Lunderberg added a commit that referenced this pull request Jun 3, 2023
* [Bugfix][TIR][VTA] Update host-side target, even without device func

* [TIR] Restrict tir.transform.LowerTVMBuiltin to host functions

Previously, the `tir.transform.LowerTVMBuiltin` pass applied to all
functions in an `IRModule`, but was only applied to modules that
contain only host functions.  This commit updates
`tir.transform.LowerTVMBuiltin` to apply only to host functions.

* Updated "stackvm" target to have "cpu" key.

Since the presence or absence of the "cpu" key in a target is used to
determine whether host-only calls should be run, the "stackvm" target
needs to include it as well.

* Update IsHostFunc() to use "host" tag instead of "cpu"

The current CI failures are due to LowerTVMBuiltin not running on the
"hexagon" target, and it would be preferable to avoid conflating cpu and host.

* Avoid "host" tag for now

* Update HEXAGON_AOT_LLVM_TARGET to be recognized as host
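
As a rough illustration of restricting a pass to host functions by inspecting target keys, consider the following plain-Python sketch. It is not the TVM implementation: the module representation (a dictionary from function name to a pair of target keys and body text) and the `host_only` wrapper are assumptions for the example. The only detail taken from the commit messages above is that the "cpu" key decides whether a function is treated as a host function.

```python
from typing import Callable, Dict, List, Tuple

# Toy function representation: (target keys, body text).
Func = Tuple[List[str], str]
Module = Dict[str, Func]


def is_host_func(target_keys: List[str]) -> bool:
    """Toy host check based on the "cpu" key of the function's target."""
    return "cpu" in target_keys


def host_only(transform: Callable[[str], str]) -> Callable[[Module], Module]:
    """Wrap a body transform so that it only touches host functions."""

    def run(module: Module) -> Module:
        return {
            name: (keys, transform(body) if is_host_func(keys) else body)
            for name, (keys, body) in module.items()
        }

    return run


# str.upper stands in for a host-only lowering step.
lower_builtin = host_only(str.upper)
module = {
    "main": (["cpu"], "call kernel"),
    "main_kernel": (["cuda", "gpu"], "compute"),
}
lowered = lower_builtin(module)
```

In this model a target without the "cpu" key is silently skipped, which is why targets such as "stackvm" need the key for host-only lowering to run on them.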
@tqchen
Member

tqchen commented Dec 14, 2023

Note that we should revisit some of the assumptions here; see #16237.
