[Issue]: No performance improvement using hipGraph #77

Open
harrisonsz opened this issue Apr 28, 2024 · 6 comments

Comments

@harrisonsz

Problem Description

GPU: RX6400 (I could not find this model among the GPU options offered)

I was trying to use hipGraph instead of hipStream to accelerate some computation, but the performance difference between the stream and graph versions is minor. I tested the same program written in CUDA on an NVIDIA GPU and saw a significant improvement, so I know my program is written correctly. The program ran on ROCm 5.6.0; I then upgraded to 5.7.0 and saw no difference in performance. In which ROCm version, if any, is hipGraph optimized? Also, since I'm using a relatively outdated AMD GPU (the RX6400), I wonder whether hipGraph only has a significant effect on certain models.

Operating System

Ubuntu 22.04.3 LTS(Jammy Jellyfish)

CPU

11th Gen Intel(R) Core(TM) i5-11400

GPU

AMD Radeon VII

ROCm Version

ROCm 5.7.0

ROCm Component

clr, HIP

Steps to Reproduce

I wrote two simple programs to test performance: one uses a stream and the other uses a graph. I attached them as .txt files because GitHub doesn't allow uploading .cpp files. Rename them to .cpp, then compile and run both programs to compare the output.
hip_only_stream.txt
hip_using_graph.txt
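
For anyone reproducing the comparison, here is a minimal host-side timing sketch in C++. It is not taken from the attached files: `average_ms` and `improvement_percent` are hypothetical helper names, and the sketch assumes each variant (stream or graph) is wrapped in a callable that synchronizes with the GPU, for example via `hipStreamSynchronize`, before it returns. Without that synchronization the timer would likely measure only submission time.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Average wall-clock milliseconds per call of `work`.
// `warmup` calls run first and are not timed, so one-time setup costs
// (assumed here: module loading, first-launch initialization) are excluded.
// Assumption: `work` blocks until the GPU has finished its submitted work.
double average_ms(const std::function<void()>& work, int warmup, int reps) {
    assert(reps > 0);
    for (int i = 0; i < warmup; ++i) {
        work();
    }
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / reps;
}

// Relative improvement of `candidate_ms` over `baseline_ms`, in percent.
// Positive values mean the candidate is faster; negative values mean it is slower.
double improvement_percent(double baseline_ms, double candidate_ms) {
    assert(baseline_ms > 0.0);
    return (baseline_ms - candidate_ms) / baseline_ms * 100.0;
}
```

Passing the stream version as the baseline and the graph version as the candidate gives a single percentage that can be compared across GPUs and ROCm versions.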

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

@harrisonsz
Author

I've also tested on an RX7900XTX, and there is still no improvement.

@ppanchad-amd

Hi @harrisonsz, an internal ticket has been created to investigate this issue. Thanks!

@schung-amd

Hi @harrisonsz, thanks for pointing this out! Unfortunately, this is a known issue. I actually see a performance loss with the graph version of your code on ROCm 6.2 with a 7900XTX! When I scale the problem up to N = 1024 * 1024 * 100, the graph version outperforms the stream version by only 2%.

While I don't think we have any public-facing documentation about this, hipGraph currently does not provide as much of an advantage as CUDA graphs. We're working on improving this, although I am not aware of any definite timelines. I'll reach out to our internal teams to see if they have any additional information.

@schung-amd

A quick update from the internal team: we have been making a lot of good progress with hipGraph performance, but the focus has been on the MI300, so for now many of the improvements are only seen there and not on Radeon systems like yours or my repro system.

@harrisonsz
Author

Thank you for your reply. Do you have plans to also improve hipGraph on Radeon systems in the future?

@schung-amd

I'm checking with the internal team to find out what our plans are on the Radeon front. I'll update here when I have that information.
