Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NVPTX support for new asm! #72439

Merged
merged 10 commits into from
May 30, 2020
Merged

NVPTX support for new asm! #72439

merged 10 commits into from
May 30, 2020

Conversation

westernmagic
Copy link
Contributor

@westernmagic westernmagic commented May 21, 2020

This PR implements the new asm! syntax for the nvptx64-nvidia-cuda target.

r? @Amanieu

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @davidtwco (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 21, 2020
@Mark-Simulacrum Mark-Simulacrum assigned Amanieu and unassigned davidtwco May 21, 2020
src/test/assembly/asm/nvptx-types.rs Outdated Show resolved Hide resolved
src/librustc_target/asm/nvptx.rs Outdated Show resolved Hide resolved
src/librustc_target/asm/nvptx.rs Outdated Show resolved Hide resolved
src/test/assembly/asm/nvptx-types.rs Show resolved Hide resolved
@Amanieu
Copy link
Member

Amanieu commented May 21, 2020

You are missing an nvptx-modifiers.rs test which tests that register names are rendered properly in the output asm.

@westernmagic
Copy link
Contributor Author

You are missing an nvptx-modifiers.rs test which tests that register names are rendered properly in the output asm.

As discussed on zulip, NVPTX does not support any modifiers, therefore no tests are needed.

src/test/assembly/asm/nvptx-types.rs Outdated Show resolved Hide resolved
src/librustc_target/asm/nvptx.rs Show resolved Hide resolved
src/librustc_target/asm/mod.rs Outdated Show resolved Hide resolved
@Amanieu
Copy link
Member

Amanieu commented May 22, 2020

The code looks good!

Can you update the specification in src/doc/unstable-book/src/library-features/asm.md to include the new register classes?

@westernmagic
Copy link
Contributor Author

This is false, I checked on NVPTX and values are not zero-extended. The upper bits are undefined. https://rust.godbolt.org/z/9yU64A

They are, implicitly:

A destination register wider than the specified type may be used. The value loaded is sign-extended to the destination register width for signed integers, and is zero-extended to the destination register width for unsigned and bit-size types. See Table 25 for a description of these relaxed type-checking rules.

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-ld
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#operand-size-exceeding-instruction-type-size

Since clang gives the registers bit sizes, they are all zero extended in this implementation.

Here's a small clang experiment as proof:

#include <cstdint>
#include <cstdio>

__global__ void foo(uint32_t x, uint64_t * y) {
	asm volatile (
		"mov.u64 %0, %1;"
		: "=l"(*y) : "l"(x)
	);
}

__host__ int main() {
	uint64_t y = UINT64_MAX;
	uint64_t * yd;
	cudaMalloc(&yd, sizeof(uint64_t));
	cudaMemcpy(yd, &y, sizeof(uint64_t), cudaMemcpyHostToDevice);
	foo<<<1, 1>>>(0xBEEF, yd);
	cudaMemcpy(&y, yd, sizeof(uint64_t), cudaMemcpyDeviceToHost);
	cudaFree(yd);
	printf("%lx\n", y);
}

If you wish, I can add the links I referenced to the documentation

@Amanieu
Copy link
Member

Amanieu commented May 23, 2020

Again, look at the assembly output from godbolt: https://rust.godbolt.org/z/-8yrgo

.visible .func  (.param .b64 func_retval0) foo(
        .param .b64 foo_param_0
)
{

        ld.param.u64    %rd2, [foo_param_0];
        mov.u64 %rd1, %rd2
        st.param.b64    [func_retval0+0], %rd1;
        ret;

}

The input value is not zero-extended anywhere in this code.

@westernmagic
Copy link
Contributor Author

Ah, I see now. Looking at this though, the problem seems to be more complicated though: https://rust.godbolt.org/z/ojcbJV

@Amanieu
Copy link
Member

Amanieu commented May 23, 2020

This is the same as every other architecture: when you put a value into a register that is smaller than the register size, the upper bits are UNDEFINED. Since they are undefined, their exact value depends on what the optimizer decides to do.

@Amanieu
Copy link
Member

Amanieu commented May 23, 2020

@bors r+

@bors
Copy link
Contributor

bors commented May 23, 2020

📌 Commit 8706f76020e84863e8afc4e25dc014decc0f5a2f has been approved by Amanieu

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 23, 2020
@bors
Copy link
Contributor

bors commented May 24, 2020

☔ The latest upstream changes (presumably #72516) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 24, 2020
@Amanieu
Copy link
Member

Amanieu commented May 25, 2020

@bors r+

@bors
Copy link
Contributor

bors commented May 25, 2020

📌 Commit e18054d has been approved by Amanieu

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 25, 2020
RalfJung added a commit to RalfJung/rust that referenced this pull request May 29, 2020
NVPTX support for new asm!

This PR implements the new `asm!` syntax for the `nvptx64-nvidia-cuda` target.

r? @Amanieu
bors added a commit to rust-lang-ci/rust that referenced this pull request May 29, 2020
Rollup of 9 pull requests

Successful merges:

 - rust-lang#67460 (Tweak impl signature mismatch errors involving `RegionKind::ReVar` lifetimes)
 - rust-lang#71095 (impl From<[T; N]> for Box<[T]>)
 - rust-lang#71500 (Make pointer offset methods/intrinsics const)
 - rust-lang#71804 (linker: Support `-static-pie` and `-static -shared`)
 - rust-lang#71862 (Implement RFC 2585: unsafe blocks in unsafe fn)
 - rust-lang#72103 (borrowck `DefId` -> `LocalDefId`)
 - rust-lang#72407 (Various minor improvements to Ipv6Addr::Display)
 - rust-lang#72413 (impl Step for char (make Range*<char> iterable))
 - rust-lang#72439 (NVPTX support for new asm!)

Failed merges:

r? @ghost
@bors bors merged commit 3789455 into rust-lang:master May 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants