Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MergeFunctions LLVM pass can generate invalid function calls under calling convention #57356

Open
peterhj opened this issue Jan 5, 2019 · 3 comments
Assignees
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@peterhj
Copy link
Contributor

peterhj commented Jan 5, 2019

Basically, the MergeFunctions LLVM pass can rewrite functions to generate calls that are not valid under the calling convention of the target, e.g. extern "ptx-kernel" functions should not call other extern "ptx-kernel" functions in NVPTX.

This is an LLVM bug, described here (thanks @nikic): https://bugs.llvm.org/show_bug.cgi?id=40232. A PR also adds a target option and a -Z flag to control MergeFunctions: #57268.

Example: in the following Rust source, the functions foo and bar get merged by MergeFunctions:

#![crate_type = "lib"]

#![feature(abi_ptx)]
#![feature(lang_items)]
#![feature(link_llvm_intrinsics)]
#![feature(naked_functions)]
#![feature(no_core)]

#![no_core]

#[lang = "sized"]
trait Sized {}

#[lang = "copy"]
trait Copy {}

#[allow(improper_ctypes)]
extern "C" {
  #[link_name = "llvm.nvvm.barrier0"]
  fn syncthreads() -> ();
}

#[inline]
pub unsafe fn _syncthreads() -> () {
  syncthreads()
}

#[no_mangle]
pub unsafe extern "ptx-kernel" fn foo() {
  _syncthreads();
  _syncthreads();
}

#[no_mangle]
pub unsafe extern "ptx-kernel" fn bar() {
  _syncthreads();
  _syncthreads();
}

to yield the incorrect PTX assembly, as the call.uni bar instruction is not valid since a kernel is calling another kernel (note this requires rustc -Z merge-functions=trampolines from the above PR):

//
// Generated by LLVM NVPTX Back-End
//

.version 3.2
.target sm_35
.address_size 64

	// .globl	bar             // -- Begin function bar
                                        // @bar
.visible .entry bar()
{


// %bb.0:                               // %start
	bar.sync 	0;
	bar.sync 	0;
	ret;
                                        // -- End function
}
	// .globl	foo             // -- Begin function foo
.visible .entry foo()                   // @foo
{


// %bb.0:
	{ // callseq 0, 0
	.reg .b32 temp_param_reg;
	// XXX: `call.uni bar` is not a valid call!
	call.uni 
	bar, 
	(
	);
	} // callseq 0
	ret;
                                        // -- End function
}

Disabling MergeFunctions (e.g. using rustc -Z merge-functions=disabled) yields correct PTX assembly:

//
// Generated by LLVM NVPTX Back-End
//

.version 3.2
.target sm_35
.address_size 64

	// .globl	foo             // -- Begin function foo
                                        // @foo
.visible .entry foo()
{


// %bb.0:                               // %start
	bar.sync 	0;
	bar.sync 	0;
	ret;
                                        // -- End function
}
	// .globl	bar             // -- Begin function bar
.visible .entry bar()                   // @bar
{


// %bb.0:                               // %start
	bar.sync 	0;
	bar.sync 	0;
	ret;
                                        // -- End function
}

P.S. Currently the default operation of MergeFunctions is to emit function aliases which are not supported by NVPTX, so controlling MergeFunctions via the merge-functions flag is necessary to generate any of the PTX assembly above.

Meta

I'm on a patched rustc so this may not be so helpful, but here it is anyway:

rustc --version --verbose:

rustc 1.33.0-nightly (fb86d604b 2018-12-27)
binary: rustc
commit-hash: fb86d604bf65c3becd16180b56267a329cf268d5
commit-date: 2018-12-27
host: x86_64-unknown-linux-gnu
release: 1.33.0-nightly
LLVM version: 8.0
peterhj added a commit to peterhj/rust that referenced this issue Jan 5, 2019
"trampolines", or "aliases (the default)) to allow targets to opt out of
the MergeFunctions LLVM pass. Also add a corresponding -Z option with
the same name and values.

This works around: rust-lang#57356

Motivation:

Basically, the problem is that the MergeFunctions pass, which rustc
currently enables by default at -O2 and -O3, and `extern "ptx-kernel"`
functions (specific to the NVPTX target) are currently not compatible
with each other. If the MergeFunctions pass is allowed to run, rustc can
generate invalid PTX assembly (i.e. a PTX file that is not accepted by
the native PTX assembler ptxas). Therefore we would like a way to opt
out of the MergeFunctions pass, which is what our target option does.

Related work:

The current behavior of rustc is to enable MergeFunctions at -O2 and -O3,
and also to enable the use of function aliases within MergeFunctions.
MergeFunctions both with and without function aliases is incompatible with
the NVPTX target.

clang's "solution" is to have a "-fmerge-functions" flag that opts in to
the MergeFunctions pass, but it is not enabled by default.
@Centril Centril added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness labels Jan 5, 2019
@nagisa
Copy link
Member

nagisa commented Jan 6, 2019

Removing I-unsound, because this, to our knowledge, does not result in any unsoundness. All that happens is target failing to assemble the machine code.

@nagisa nagisa removed the I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness label Jan 6, 2019
Centril added a commit to Centril/rust that referenced this issue Jan 15, 2019
Add a target option "merge-functions", and a corresponding -Z flag (works around rust-lang#57356)

This commit adds a target option "merge-functions", which takes values in ("disabled", "trampolines", or "aliases" (default is "aliases")), to allow targets to opt out of the MergeFunctions LLVM pass. Additionally, the latest commit also adds an optional -Z flag, "merge-functions", which takes the same values and has precedence over the target option when both are specified.

This works around rust-lang#57356.

cc @eddyb @japaric @oli-obk @nox @nagisa

Also thanks to @denzp and @gnzlbg for discussing this on rust-cuda!

### Motivation

Basically, the problem is that the MergeFunctions pass, which rustc currently enables by default at -O2 and -O3 [1], and `extern "ptx-kernel"` functions (specific to the NVPTX target) are currently not compatible with each other. If the MergeFunctions pass is allowed to run, rustc can generate invalid PTX assembly (i.e. a PTX file that is not accepted by the native PTX assembler `ptxas`). Therefore we would like a way to opt out of the MergeFunctions pass, which is what our target option does.

### Related work

The current behavior of rustc is to enable MergeFunctions at -O2 and -O3 [1], and also to enable the use of function aliases within MergeFunctions [2] [3]. MergeFunctions seems to have some benefits, such as reducing code size and fixing a crash [4], which is why it is enabled. However, MergeFunctions both with and without function aliases is incompatible with the NVPTX target; a more detailed example for both cases is given below.

clang's "solution" is to have a "-fmerge-functions" flag that opts in to the MergeFunctions pass, but it is not enabled by default.

### Examples/more details

Consider an example Rust lib using `extern "ptx-kernel"` functions: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore.rs. If we try to compile this with nightly rustc, we get the following compiler error:

    LLVM ERROR: Module has aliases, which NVPTX does not support.

This error happens because: (1) functions `foo` and `bar` have the same body, so are candidates to be merged by MergeFunctions; and (2) rustc configures MergeFunctions to generate function aliases using the "mergefunc-use-aliases" LLVM option [2] [3], but the NVPTX backend does not support those aliases.

Okay, so we can try omitting "mergefunc-use-aliases", and then rustc will happily emit PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-mergefunc-nousealiases-bad.ptx. However, this PTX is invalid! When we try to assemble it with `ptxas` (I'm on the CUDA 9.2 toolchain), we get an assembler error:

    ptxas nocore-mergefunc-nousealiases-bad.ptx, line 38; error   : Illegal call target, device function expected
    ptxas fatal   : Ptx assembly aborted due to errors

What's happening is that MergeFunctions rewrites the `bar` function to call `foo`. However, directly calling an `extern "ptx-kernel"` function from another `extern "ptx-kernel"` is wrong.

If we disable the MergeFunctions pass from running at all, rustc generates correct PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-nomergefunc-ok.ptx

[1] https://github.com/rust-lang/rust/blob/a36b960df626cbb8bea74f01243318b73f0bd201/src/librustc_codegen_ssa/back/write.rs#L155
[2] https://github.com/rust-lang/rust/blob/a36b960df626cbb8bea74f01243318b73f0bd201/src/librustc_codegen_llvm/llvm_util.rs#L64
[3] rust-lang#56358
[4] rust-lang#49479
bors added a commit that referenced this issue Jan 16, 2019
Rollup of 6 pull requests

Successful merges:

 - #56884 (rustdoc: overhaul code block lexing errors)
 - #57065 (Optimize try_mark_green and eliminate the lock on dep node colors)
 - #57107 (Add a regression test for mutating a non-mut #[thread_local])
 - #57268 (Add a target option "merge-functions", and a corresponding -Z flag (works around #57356))
 - #57551 (resolve: Add a test for issue #57539)
 - #57598 (Add missing unpretty option help message)

Failed merges:

r? @ghost
Centril added a commit to Centril/rust that referenced this issue Jan 17, 2019
Add a target option "merge-functions", and a corresponding -Z flag (works around rust-lang#57356)

This commit adds a target option "merge-functions", which takes values in ("disabled", "trampolines", or "aliases" (default is "aliases")), to allow targets to opt out of the MergeFunctions LLVM pass. Additionally, the latest commit also adds an optional -Z flag, "merge-functions", which takes the same values and has precedence over the target option when both are specified.

This works around rust-lang#57356.

cc @eddyb @japaric @oli-obk @nox @nagisa

Also thanks to @denzp and @gnzlbg for discussing this on rust-cuda!

### Motivation

Basically, the problem is that the MergeFunctions pass, which rustc currently enables by default at -O2 and -O3 [1], and `extern "ptx-kernel"` functions (specific to the NVPTX target) are currently not compatible with each other. If the MergeFunctions pass is allowed to run, rustc can generate invalid PTX assembly (i.e. a PTX file that is not accepted by the native PTX assembler `ptxas`). Therefore we would like a way to opt out of the MergeFunctions pass, which is what our target option does.

### Related work

The current behavior of rustc is to enable MergeFunctions at -O2 and -O3 [1], and also to enable the use of function aliases within MergeFunctions [2] [3]. MergeFunctions seems to have some benefits, such as reducing code size and fixing a crash [4], which is why it is enabled. However, MergeFunctions both with and without function aliases is incompatible with the NVPTX target; a more detailed example for both cases is given below.

clang's "solution" is to have a "-fmerge-functions" flag that opts in to the MergeFunctions pass, but it is not enabled by default.

### Examples/more details

Consider an example Rust lib using `extern "ptx-kernel"` functions: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore.rs. If we try to compile this with nightly rustc, we get the following compiler error:

    LLVM ERROR: Module has aliases, which NVPTX does not support.

This error happens because: (1) functions `foo` and `bar` have the same body, so are candidates to be merged by MergeFunctions; and (2) rustc configures MergeFunctions to generate function aliases using the "mergefunc-use-aliases" LLVM option [2] [3], but the NVPTX backend does not support those aliases.

Okay, so we can try omitting "mergefunc-use-aliases", and then rustc will happily emit PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-mergefunc-nousealiases-bad.ptx. However, this PTX is invalid! When we try to assemble it with `ptxas` (I'm on the CUDA 9.2 toolchain), we get an assembler error:

    ptxas nocore-mergefunc-nousealiases-bad.ptx, line 38; error   : Illegal call target, device function expected
    ptxas fatal   : Ptx assembly aborted due to errors

What's happening is that MergeFunctions rewrites the `bar` function to call `foo`. However, directly calling an `extern "ptx-kernel"` function from another `extern "ptx-kernel"` is wrong.

If we disable the MergeFunctions pass from running at all, rustc generates correct PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-nomergefunc-ok.ptx

[1] https://github.com/rust-lang/rust/blob/a36b960df626cbb8bea74f01243318b73f0bd201/src/librustc_codegen_ssa/back/write.rs#L155
[2] https://github.com/rust-lang/rust/blob/a36b960df626cbb8bea74f01243318b73f0bd201/src/librustc_codegen_llvm/llvm_util.rs#L64
[3] rust-lang#56358
[4] rust-lang#49479
Centril added a commit to Centril/rust that referenced this issue Jan 17, 2019
Add a target option "merge-functions", and a corresponding -Z flag (works around rust-lang#57356)

This commit adds a target option "merge-functions", which takes values in ("disabled", "trampolines", or "aliases" (default is "aliases")), to allow targets to opt out of the MergeFunctions LLVM pass. Additionally, the latest commit also adds an optional -Z flag, "merge-functions", which takes the same values and has precedence over the target option when both are specified.

This works around rust-lang#57356.

cc @eddyb @japaric @oli-obk @nox @nagisa

Also thanks to @denzp and @gnzlbg for discussing this on rust-cuda!

### Motivation

Basically, the problem is that the MergeFunctions pass, which rustc currently enables by default at -O2 and -O3 [1], and `extern "ptx-kernel"` functions (specific to the NVPTX target) are currently not compatible with each other. If the MergeFunctions pass is allowed to run, rustc can generate invalid PTX assembly (i.e. a PTX file that is not accepted by the native PTX assembler `ptxas`). Therefore we would like a way to opt out of the MergeFunctions pass, which is what our target option does.

### Related work

The current behavior of rustc is to enable MergeFunctions at -O2 and -O3 [1], and also to enable the use of function aliases within MergeFunctions [2] [3]. MergeFunctions seems to have some benefits, such as reducing code size and fixing a crash [4], which is why it is enabled. However, MergeFunctions both with and without function aliases is incompatible with the NVPTX target; a more detailed example for both cases is given below.

clang's "solution" is to have a "-fmerge-functions" flag that opts in to the MergeFunctions pass, but it is not enabled by default.

### Examples/more details

Consider an example Rust lib using `extern "ptx-kernel"` functions: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore.rs. If we try to compile this with nightly rustc, we get the following compiler error:

    LLVM ERROR: Module has aliases, which NVPTX does not support.

This error happens because: (1) functions `foo` and `bar` have the same body, so are candidates to be merged by MergeFunctions; and (2) rustc configures MergeFunctions to generate function aliases using the "mergefunc-use-aliases" LLVM option [2] [3], but the NVPTX backend does not support those aliases.

Okay, so we can try omitting "mergefunc-use-aliases", and then rustc will happily emit PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-mergefunc-nousealiases-bad.ptx. However, this PTX is invalid! When we try to assemble it with `ptxas` (I'm on the CUDA 9.2 toolchain), we get an assembler error:

    ptxas nocore-mergefunc-nousealiases-bad.ptx, line 38; error   : Illegal call target, device function expected
    ptxas fatal   : Ptx assembly aborted due to errors

What's happening is that MergeFunctions rewrites the `bar` function to call `foo`. However, directly calling an `extern "ptx-kernel"` function from another `extern "ptx-kernel"` is wrong.

If we disable the MergeFunctions pass from running at all, rustc generates correct PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-nomergefunc-ok.ptx

[1] https://github.com/rust-lang/rust/blob/a36b960df626cbb8bea74f01243318b73f0bd201/src/librustc_codegen_ssa/back/write.rs#L155
[2] https://github.com/rust-lang/rust/blob/a36b960df626cbb8bea74f01243318b73f0bd201/src/librustc_codegen_llvm/llvm_util.rs#L64
[3] rust-lang#56358
[4] rust-lang#49479
Centril added a commit to Centril/rust that referenced this issue Jan 17, 2019
Add a target option "merge-functions", and a corresponding -Z flag (works around rust-lang#57356)

This commit adds a target option "merge-functions", which takes values in ("disabled", "trampolines", or "aliases" (default is "aliases")), to allow targets to opt out of the MergeFunctions LLVM pass. Additionally, the latest commit also adds an optional -Z flag, "merge-functions", which takes the same values and has precedence over the target option when both are specified.

This works around rust-lang#57356.

cc @eddyb @japaric @oli-obk @nox @nagisa

Also thanks to @denzp and @gnzlbg for discussing this on rust-cuda!

### Motivation

Basically, the problem is that the MergeFunctions pass, which rustc currently enables by default at -O2 and -O3 [1], and `extern "ptx-kernel"` functions (specific to the NVPTX target) are currently not compatible with each other. If the MergeFunctions pass is allowed to run, rustc can generate invalid PTX assembly (i.e. a PTX file that is not accepted by the native PTX assembler `ptxas`). Therefore we would like a way to opt out of the MergeFunctions pass, which is what our target option does.

### Related work

The current behavior of rustc is to enable MergeFunctions at -O2 and -O3 [1], and also to enable the use of function aliases within MergeFunctions [2] [3]. MergeFunctions seems to have some benefits, such as reducing code size and fixing a crash [4], which is why it is enabled. However, MergeFunctions both with and without function aliases is incompatible with the NVPTX target; a more detailed example for both cases is given below.

clang's "solution" is to have a "-fmerge-functions" flag that opts in to the MergeFunctions pass, but it is not enabled by default.

### Examples/more details

Consider an example Rust lib using `extern "ptx-kernel"` functions: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore.rs. If we try to compile this with nightly rustc, we get the following compiler error:

    LLVM ERROR: Module has aliases, which NVPTX does not support.

This error happens because: (1) functions `foo` and `bar` have the same body, so are candidates to be merged by MergeFunctions; and (2) rustc configures MergeFunctions to generate function aliases using the "mergefunc-use-aliases" LLVM option [2] [3], but the NVPTX backend does not support those aliases.

Okay, so we can try omitting "mergefunc-use-aliases", and then rustc will happily emit PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-mergefunc-nousealiases-bad.ptx. However, this PTX is invalid! When we try to assemble it with `ptxas` (I'm on the CUDA 9.2 toolchain), we get an assembler error:

    ptxas nocore-mergefunc-nousealiases-bad.ptx, line 38; error   : Illegal call target, device function expected
    ptxas fatal   : Ptx assembly aborted due to errors

What's happening is that MergeFunctions rewrites the `bar` function to call `foo`. However, directly calling an `extern "ptx-kernel"` function from another `extern "ptx-kernel"` is wrong.

If we disable the MergeFunctions pass from running at all, rustc generates correct PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-nomergefunc-ok.ptx

[1] https://github.com/rust-lang/rust/blob/a36b960df626cbb8bea74f01243318b73f0bd201/src/librustc_codegen_ssa/back/write.rs#L155
[2] https://github.com/rust-lang/rust/blob/a36b960df626cbb8bea74f01243318b73f0bd201/src/librustc_codegen_llvm/llvm_util.rs#L64
[3] rust-lang#56358
[4] rust-lang#49479
Centril added a commit to Centril/rust that referenced this issue Jan 19, 2019
Add a target option "merge-functions", and a corresponding -Z flag (works around rust-lang#57356)

This commit adds a target option "merge-functions", which takes values in ("disabled", "trampolines", or "aliases" (default is "aliases")), to allow targets to opt out of the MergeFunctions LLVM pass. Additionally, the latest commit also adds an optional -Z flag, "merge-functions", which takes the same values and has precedence over the target option when both are specified.

This works around rust-lang#57356.

cc @eddyb @japaric @oli-obk @nox @nagisa

Also thanks to @denzp and @gnzlbg for discussing this on rust-cuda!

### Motivation

Basically, the problem is that the MergeFunctions pass, which rustc currently enables by default at -O2 and -O3 [1], and `extern "ptx-kernel"` functions (specific to the NVPTX target) are currently not compatible with each other. If the MergeFunctions pass is allowed to run, rustc can generate invalid PTX assembly (i.e. a PTX file that is not accepted by the native PTX assembler `ptxas`). Therefore we would like a way to opt out of the MergeFunctions pass, which is what our target option does.

### Related work

The current behavior of rustc is to enable MergeFunctions at -O2 and -O3 [1], and also to enable the use of function aliases within MergeFunctions [2] [3]. MergeFunctions seems to have some benefits, such as reducing code size and fixing a crash [4], which is why it is enabled. However, MergeFunctions both with and without function aliases is incompatible with the NVPTX target; a more detailed example for both cases is given below.

clang's "solution" is to have a "-fmerge-functions" flag that opts in to the MergeFunctions pass, but it is not enabled by default.

### Examples/more details

Consider an example Rust lib using `extern "ptx-kernel"` functions: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore.rs. If we try to compile this with nightly rustc, we get the following compiler error:

    LLVM ERROR: Module has aliases, which NVPTX does not support.

This error happens because: (1) functions `foo` and `bar` have the same body, so are candidates to be merged by MergeFunctions; and (2) rustc configures MergeFunctions to generate function aliases using the "mergefunc-use-aliases" LLVM option [2] [3], but the NVPTX backend does not support those aliases.

Okay, so we can try omitting "mergefunc-use-aliases", and then rustc will happily emit PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-mergefunc-nousealiases-bad.ptx. However, this PTX is invalid! When we try to assemble it with `ptxas` (I'm on the CUDA 9.2 toolchain), we get an assembler error:

    ptxas nocore-mergefunc-nousealiases-bad.ptx, line 38; error   : Illegal call target, device function expected
    ptxas fatal   : Ptx assembly aborted due to errors

What's happening is that MergeFunctions rewrites the `bar` function to call `foo`. However, directly calling an `extern "ptx-kernel"` function from another `extern "ptx-kernel"` is wrong.

If we disable the MergeFunctions pass from running at all, rustc generates correct PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-nomergefunc-ok.ptx

[1] https://github.com/rust-lang/rust/blob/a36b960df626cbb8bea74f01243318b73f0bd201/src/librustc_codegen_ssa/back/write.rs#L155
[2] https://github.com/rust-lang/rust/blob/a36b960df626cbb8bea74f01243318b73f0bd201/src/librustc_codegen_llvm/llvm_util.rs#L64
[3] rust-lang#56358
[4] rust-lang#49479
bors added a commit that referenced this issue Jan 19, 2019
Rollup of 10 pull requests

Successful merges:

 - #57268 (Add a target option "merge-functions", and a corresponding -Z flag (works around #57356))
 - #57476 (Move glob map use to query and get rid of CrateAnalysis)
 - #57501 (High priority resolutions for associated variants)
 - #57573 (Querify `entry_fn`)
 - #57610 (Fix nested `?` matchers)
 - #57634 (Remove an unused function argument)
 - #57653 (Make the contribution doc reference the guide more)
 - #57666 (Generalize `huge-enum.rs` test and expected stderr for more cross platform cases)
 - #57698 (Fix typo bug in DepGraph::try_mark_green().)
 - #57746 (Update README.md)

Failed merges:

r? @ghost
VardhanThigle pushed a commit to jethrogb/rust that referenced this issue Jan 31, 2019
"trampolines", or "aliases (the default)) to allow targets to opt out of
the MergeFunctions LLVM pass. Also add a corresponding -Z option with
the same name and values.

This works around: rust-lang#57356

Motivation:

Basically, the problem is that the MergeFunctions pass, which rustc
currently enables by default at -O2 and -O3, and `extern "ptx-kernel"`
functions (specific to the NVPTX target) are currently not compatible
with each other. If the MergeFunctions pass is allowed to run, rustc can
generate invalid PTX assembly (i.e. a PTX file that is not accepted by
the native PTX assembler ptxas). Therefore we would like a way to opt
out of the MergeFunctions pass, which is what our target option does.

Related work:

The current behavior of rustc is to enable MergeFunctions at -O2 and -O3,
and also to enable the use of function aliases within MergeFunctions.
MergeFunctions both with and without function aliases is incompatible with
the NVPTX target.

clang's "solution" is to have a "-fmerge-functions" flag that opts in to
the MergeFunctions pass, but it is not enabled by default.
@tesuji
Copy link
Contributor

tesuji commented Mar 22, 2019

Is this issue closed by #57268?

@peterhj
Copy link
Contributor Author

peterhj commented Mar 22, 2019

This issue is for an LLVM bug that affects rust, whereas #57268 is a workaround but doesn't actually fix the underlying bug. Properly fixing this would require patching LLVM (specifically, the NVPTX backend).

@Enselic Enselic added the C-bug Category: This is a bug. label Nov 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants