Performance relative to CuPy and on Jetson natively #748

dzeleznikar · 2024-09-17T17:55:36Z

I was reading the comments on this old Hacker News post about MatX (https://news.ycombinator.com/item?id=37756281) and saw a reference to a "comparison to numPy/cuPy, and we do have a table showing the comparison in the docs." However, I could not find a comparison table backing the claim that "between MatX and cuPy we see a 3-4x performance difference on average." Is it hiding somewhere, or was it deprecated in a newer version?

I'm primarily interested in what the Jetson Orin board can do as an alternative to an FPGA platform, and some benchmarks would help. The only public results I could find are things like https://openbenchmarking.org/result/2409108-NE-JETSONORI73, and it sounds like that one uses VkFFT, an open-source alternative to cuFFT whose published comparison covers an Nvidia A100 and an AMD MI250 using VkFFT, cuFFT, and rocFFT. It would be really interesting to see how performance compares on a Jetson Orin with MatX and/or CuPy, since these seem like much more approachable paths to developing with CUDA for my use cases, where I have MATLAB code today.

cliffburdick (Maintainer) · 2024-09-17T18:03:33Z

Hi @dzeleznikar, that comment was showing a comparison of syntax here: https://nvidia.github.io/MatX/basics/matlabpython.html

We do not have a performance comparison because it varies so much, but as a general rule of thumb we expect MatX to be faster than CuPy for all workloads. If it's not, that's a bug and should be fixed. A good example is here.

FFT performance is heavily dependent on the size of the FFT and the number of batches. In general, CuPy and MatX won't differ much there because they should launch the exact same cuFFT kernel under the same circumstances. At that point it becomes more of a comparison across libraries and hardware, as you pointed out. I'd be happy to run some tests on an Orin with a specific size in MatX and CuPy, but be warned that MatX should not outperform CuPy by much in a synthetic test like that. If your real workload does a lot more than just FFTs, then I would expect MatX to outperform CuPy.
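To make the size and batch dependence concrete, here is a rough sketch of how a batched FFT could be timed in CuPy and normalized so that runs with different sizes are comparable. It is not taken from the MatX or CuPy docs: the helper names (`fft_flops`, `gflops`, `time_cupy_fft`) are made up for illustration, the `5 * N * log2(N)` figure is only the nominal FLOP count conventionally used when reporting FFT benchmarks, and the GPU timing function assumes CuPy is installed and a CUDA device is present.

```python
import math
import time


def fft_flops(n: int, batch: int) -> float:
    """Nominal FLOPs for `batch` complex FFTs of length `n`.

    Uses the conventional 5*N*log2(N) benchmarking estimate; this is a
    normalization, not a count of what cuFFT actually executes.
    """
    return 5.0 * n * math.log2(n) * batch


def gflops(n: int, batch: int, seconds: float) -> float:
    """Convert a measured time per batched FFT into nominal GFLOP/s."""
    return fft_flops(n, batch) / seconds / 1e9


def time_cupy_fft(n: int, batch: int, repeats: int = 20) -> float:
    """Mean seconds per batched complex64 FFT on the GPU (hypothetical helper).

    Assumes CuPy and a CUDA device are available. The import is lazy so the
    pure-Python helpers above still work on a machine without a GPU.
    """
    import cupy as cp

    x = (cp.random.random((batch, n))
         + 1j * cp.random.random((batch, n))).astype(cp.complex64)

    cp.fft.fft(x, axis=-1)             # warm-up run, excluded from the timing
    cp.cuda.Stream.null.synchronize()  # wait for the GPU before starting the clock

    start = time.perf_counter()
    for _ in range(repeats):
        cp.fft.fft(x, axis=-1)
    cp.cuda.Stream.null.synchronize()  # GPU work is asynchronous; wait before stopping
    return (time.perf_counter() - start) / repeats
```

On a machine with a GPU, something like `gflops(4096, 1024, time_cupy_fft(4096, 1024))` would give one data point. Sweeping `n` and `batch` shows how strongly the result depends on both, which is why a single headline number is hard to give.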

Is your MATLAB code available for viewing?
