
[ROCm] prefer hip interfaces over roc during hipify #22394

Open. jeffdaily wants to merge 2 commits into base: main

Conversation

jeffdaily (Contributor)

Description

Change the hipify step to remove the -roc option to hipify-perl, so that hipblas interfaces are preferred over rocblas. rocblas can still be called directly where needed, such as in TunableOp.

Motivation and Context

HIP interfaces are preferred over roc interfaces when porting from CUDA to HIP; calling roc interfaces directly is meant for ROCm-specific enhancements or extensions.
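The practical effect of dropping -roc can be sketched with a toy name-mapping in Python. This is a simplified illustration, not the actual hipify-perl implementation, and the mapping tables below are abbreviated examples:

```python
# Toy sketch of hipify's name translation. hipify-perl rewrites CUDA API
# names in source text; with -roc it targets the roc* libraries, and
# without it the hip* wrapper libraries. Tables are illustrative only.
CUDA_TO_HIP = {
    "cublasSgemm": "hipblasSgemm",
    "cublasCreate": "hipblasCreate",
}
CUDA_TO_ROC = {
    "cublasSgemm": "rocblas_sgemm",
    "cublasCreate": "rocblas_create_handle",
}

def hipify(source: str, prefer_roc: bool = False) -> str:
    """Replace CUDA names with hip* (default) or roc* (old -roc behavior)."""
    mapping = CUDA_TO_ROC if prefer_roc else CUDA_TO_HIP
    for cuda_name, hip_name in mapping.items():
        source = source.replace(cuda_name, hip_name)
    return source

src = "cublasSgemm(handle, a, b, c);"
print(hipify(src))                   # hipblasSgemm(handle, a, b, c);
print(hipify(src, prefer_roc=True))  # rocblas_sgemm(handle, a, b, c);
```

With the -roc flag removed, the hipified ONNX Runtime sources go through the hipblas wrappers by default, while direct rocblas calls remain possible where the code opts in explicitly.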

TedThemistokleous (Contributor) left a comment:

Guessing we're keeping the ROCBLAS_CALL and other templates/defines to ensure that, if someone tries to build the older version, support is still there?

tianleiwu (Contributor):

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

tianleiwu (Contributor):

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline

tianleiwu (Contributor):

/azp run Big Models,Linux Android Emulator QNN CI Pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline


Azure Pipelines successfully started running 5 pipeline(s).


Azure Pipelines successfully started running 8 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).

tianleiwu (Contributor):

Python format failed. Please follow https://github.com/microsoft/onnxruntime/blob/main/docs/Coding_Conventions_and_Standards.md#linting to fix

jeffdaily (Contributor, Author):

@tianleiwu Done.

TedThemistokleous (Contributor) left a comment:

Understood. Thanks for answering questions. LGTM

tianleiwu (Contributor):

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

tianleiwu (Contributor):

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline


Azure Pipelines successfully started running 10 pipeline(s).


Azure Pipelines successfully started running 8 pipeline(s).

tianleiwu (Contributor):

/azp run Big Models,Linux Android Emulator QNN CI Pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline, Linux ROCm CI Pipeline, Linux MIGraphX CI Pipeline


Azure Pipelines successfully started running 7 pipeline(s).

tianleiwu previously approved these changes Oct 11, 2024

tianleiwu (Contributor) commented Oct 11, 2024:

@jeffdaily, Please take a look at the unit test failure in ROCm pipeline:
https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1521002&view=logs&j=f2f63060-d9d6-52d0-adee-b97db5a9ab91&t=28e21ca6-87a4-5e1e-0441-72b5e8326f2d&l=8852

 [ RUN      ] MatmulIntegerOpTest.MatMulInteger_int8_t
1: 2024-10-11 20:32:23.974607957 [E:onnxruntime:Default, rocm_call.cc:126 RocmCall] 
1: 2024-10-11 20:32:23.974636057 [E:onnxruntime:MatMulInteger, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running MatMulInteger node. Name:'node1' Status Message: 
1: /onnxruntime_src/onnxruntime/test/providers/base_tester.cc:337: Failure
1: Expected equality of these values:
1:   expect_result
1:     Which is: 4-byte object <00-00 00-00>
1:   ExpectResult::kExpectFailure
1:     Which is: 4-byte object <01-00 00-00>
1: Run failed but expected success: Non-zero status code returned while running MatMulInteger node. Name:'node1' Status 

Is it int

tianleiwu dismissed their stale review October 11, 2024 22:13:

rocm pipeline failure
