Recommendation GPU -> File #544

aklacar1 · 2023-10-23T08:56:54Z

I am using VPF to speed up RobustVideoMatting, and currently it is working perfectly. However, I have found a bottleneck: writing the tensors to image files.

Use case:
I have a loop over PyTorch CUDA tensors from a video, so 900 tensors for a 30-second video. The basic RVM calculation is pretty fast; however, I ran into a problem when writing the results to files, since I need to generate 900 images from them.

I would appreciate some recommendations on best practices in Python with PyTorch and VPF.

What I am currently doing:

Each iteration, I move the tensors from GPU to CPU and convert them to NumPy arrays (uint8).

After all iterations are done, I write them to image files using cv2 with ThreadPoolExecutor.

My ideas are to monitor memory consumption better and to parallelize this further with ThreadPoolExecutor, but I am not sure whether that will speed things up by much, or at all. Each tensor is about 32 MiB, so they fill up the GPU quite fast if I do not move them to CPU NumPy arrays.
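One way to keep memory in check while still parallelizing is to submit each write as soon as its frame is ready, and to cap how many frames may be waiting at once. The following is only a minimal sketch of that idea using the standard library. The names `write_frame`, `write_frames_bounded` and `max_in_flight` are hypothetical, and the raw-byte writer is a stand-in for the real encoder call (for example `cv2.imwrite`), not the code from this thread.

```python
import os
import tempfile
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def write_frame(path, payload):
    """Hypothetical stand-in for the real image writer; dumps raw bytes."""
    with open(path, "wb") as f:
        f.write(payload)
    return path


def write_frames_bounded(frames, out_dir, max_workers=4, max_in_flight=8):
    """Write frames in worker threads, keeping at most max_in_flight queued."""
    written = []
    pending = set()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for index, payload in enumerate(frames):
            # Block the producer once the cap is reached, so finished frames
            # cannot pile up in host memory faster than they are written.
            if len(pending) >= max_in_flight:
                done, pending = wait(pending, return_when=FIRST_COMPLETED)
                written.extend(f.result() for f in done)
            path = os.path.join(out_dir, f"frame_{index:05d}.bin")
            pending.add(pool.submit(write_frame, path, payload))
        # Drain whatever is still in flight after the last frame.
        done, _ = wait(pending)
        written.extend(f.result() for f in done)
    return sorted(written)


if __name__ == "__main__":
    # Fake 1 KiB "frames" produced lazily, as a processing loop would.
    fake_frames = (bytes([i % 256]) * 1024 for i in range(32))
    with tempfile.TemporaryDirectory() as tmp:
        paths = write_frames_bounded(fake_frames, tmp)
        print(len(paths), "files written")
```

With this shape the loop never holds more than `max_in_flight` frames plus the one being produced, so the cap can be tuned against available RAM. Whether it improves throughput depends on how much of the time is spent in the encoder versus the disk.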

I was looking at the VPF Encoder, but I am not sure whether it would help me at all, or how to use it.

Processing speed is currently around 12.85 frames per second. (If I do it all on the CPU, it would be 2-3 frames per second.)

RomanArzumanyan · 2023-10-23T13:45:25Z

Hi @aklacar1

First you need to decide what kind of output you want your program to generate ))

If that's a multitude of JPEG files, then I recommend finding a library or framework which supports nvJPEG (VPF doesn't).
As a rule of thumb, you want to avoid Device to Host memory copies as much as possible. nvJPEG is a GPU-based JPEG encoder, so it allows you to eliminate DtoH IO.

Otherwise, if you're open to different output formats, you can convert your torch tensors back to VPF Surfaces and encode them as shown in https://github.com/NVIDIA/VideoProcessingFramework/blob/master/samples/SamplePyTorch.py.

NVENC can output videos of decent quality comparable to JPEG, or even lossless video. Lossless is slower than the usual lossy H.265, but it will still be faster than Device -> Host -> JPEG.

Also, don't forget about pixel formats.
The native pixel format for JPEG is YUV420, and it has a 2 times smaller memory footprint than RGB. If you want to encode your JPEGs on the CPU anyway, just convert your tensors from RGB float to YUV420 uint8, and do your DtoH copy after that. Reduce your PCIe traffic as much as possible.
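The footprint difference can be checked with a little arithmetic. This is a rough sketch, assuming 1920x1080 frames (the thread does not state the resolution) and the 900-frame count from the question. `frame_bytes` is a hypothetical helper, not part of VPF or PyTorch.

```python
def frame_bytes(width, height, fmt):
    """Bytes needed to hold one frame in the given pixel format."""
    pixels = width * height
    if fmt == "rgb_float32":
        return pixels * 3 * 4  # 3 channels, 4 bytes each
    if fmt == "rgb_uint8":
        return pixels * 3      # 3 channels, 1 byte each
    if fmt == "yuv420_uint8":
        # Full-resolution Y plane plus U and V planes, each subsampled
        # by 2 horizontally and vertically.
        return pixels + 2 * ((width // 2) * (height // 2))
    raise ValueError(f"unknown pixel format: {fmt}")


# Assumed values for illustration only.
WIDTH, HEIGHT, FRAMES = 1920, 1080, 900

if __name__ == "__main__":
    for fmt in ("rgb_float32", "rgb_uint8", "yuv420_uint8"):
        per_frame = frame_bytes(WIDTH, HEIGHT, fmt)
        total_mib = per_frame * FRAMES / 2**20
        print(f"{fmt:>13}: {per_frame / 2**20:6.2f} MiB/frame, "
              f"{total_mib:9.1f} MiB for {FRAMES} frames")
```

Under these assumptions, YUV420 uint8 is half the size of RGB uint8 and one eighth of RGB float32, which is why converting on the GPU before the DtoH copy cuts PCIe traffic.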

One more piece of advice is to use multiple CUDA streams.
DtoH transmission is blocking by default, so by using multiple streams you would be able to better utilize your PCIe bandwidth.

2 replies

aklacar1 Oct 23, 2023
Author

What I need is the PNG format because of the alpha channel, since I would like to avoid a green-key background. I need to use these PNGs inside FFmpeg.

aklacar1 Oct 23, 2023
Author

However, from a quick search, it looks like nvJPEG is strictly for JPEG 2000, and PNG is supported on the CPU.

RomanArzumanyan · 2023-10-23T14:15:38Z

@aklacar1

Thanks for the clarification, now it makes much more sense. Video codecs + alpha channel usually aren't a great match.
You can probably get away with VPF > CV-CUDA > VPF; take a look here: https://cvcuda.github.io/
Overlay can be done by CV-CUDA, then VPF can encode your final image without alpha channel.

What's your target output format? I'm basically searching for an approach that keeps everything on GPU.

1 reply

aklacar1 Oct 23, 2023
Author

Thank you, I will take everything you said into consideration and find the best solution. I will also check out nvJPEG2000; I just did a quick search, and it should hopefully come in handy.
