Improve optional remote video transcription network usage #1802

lfcnassif · 2023-08-06T17:12:38Z

lfcnassif · 2023-08-06T17:36:28Z

What would be a good compressed audio format to send, without taking too long to convert the audio? Or should we just send the original audio channel extracted from the videos as is?

wladimirleite · 2023-08-07T01:54:55Z

In theory, FLAC would be a good format (fast encoder/decoder, good quality and compression).
However, I think MPlayer can only decode it. In fact, I couldn't find any suitable output option in MPlayer other than PCM (already used on the server side), but I only took a quick look, so maybe there is a solution.

Another option would be to use other tools, like FFmpeg or MEncoder, but that would add another dependency for a very specific purpose.

Maybe we could use the same conversion done on the server side (PCM), and leave the use of a better format as a future improvement. Although the WAV files are large, they are usually much smaller than the videos themselves, and it won't be necessary to run another conversion on the server side.

One important thing, the current command already includes -vo null -vc null, but I would also add -novideo.
Testing with a set of large videos from various sources and formats, using -novideo made the audio extraction several times faster.
Obviously it depends on the file length, used hardware, video and audio formats used, but this parameter seems to help a lot in general.
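
For illustration only, the extraction command discussed above could be assembled as in the sketch below. The function name `build_audio_extract_cmd` is hypothetical, and the `-ao pcm:fast:file=...` part is an assumption about how the PCM/WAV output is requested; only `-vo null -vc null` and `-novideo` come from this thread. Output paths containing `:` or `,` may need escaping for MPlayer's suboption parser, which this sketch does not handle.

```python
def build_audio_extract_cmd(video_path, wav_path, mplayer="mplayer"):
    """Build an MPlayer argument list that writes the audio track as a WAV file.

    Hypothetical helper: it only builds the argument list and does not run MPlayer.
    """
    return [
        mplayer,
        "-novideo",                       # skip the video stream entirely
        "-vo", "null",                    # no video output driver
        "-vc", "null",                    # no video decoding
        "-ao", f"pcm:fast:file={wav_path}",  # assumed PCM/WAV output settings
        video_path,
    ]


if __name__ == "__main__":
    print(" ".join(build_audio_extract_cmd("input.mp4", "audio.wav")))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting problems with file names that contain spaces.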

lfcnassif · 2023-08-07T13:23:55Z

> One important thing, the current command already includes -vo null -vc null, but I would also add -novideo.
> Testing with a set of large videos from various sources and formats, using -novideo made the audio extraction several times faster.
> Obviously it depends on the file length, used hardware, video and audio formats used, but this parameter seems to help a lot in general.

Hi @tc-wleite, thanks for the performance tests! Sure, we can add the -novideo option.

About the audio format to send, is it possible to extract the audio from videos as is, without any conversion, using MPlayer? That way we could benefit from the original compression.

wladimirleite · 2023-08-07T14:07:15Z

> About the audio format to send, is it possible to extract the audio from videos as is, without any conversion, using mplayer? So we could benefit from the original used compression.

I tried to do that, but couldn't find a way to do it with MPlayer. It is focused on playback, so it supports a lot of input formats but not many output ones.

wladimirleite · 2023-08-08T12:24:24Z

I found MPlayer's -dumpaudio option, which dumps the compressed audio stream from a video as it originally is. However, the resulting file is only playable in very specific cases. In practice, for the ~20 videos of several formats that I used to test, the extracted files can't be played, identified or converted to PCM by MPlayer. So it doesn't seem useful for what we need.

lfcnassif · 2023-08-08T12:52:11Z

That's bad news, thanks for investigating @tc-wleite!

lfcnassif · 2023-08-08T23:13:35Z

Hi @tc-wleite, a simple idea would be to use a general-purpose file compression algorithm already supported by Apache Commons Compress. I ran a few compression algorithms using 7zip on a slice of the TEDx pt-BR test set (1033 audios):

| WAV | FLAC | ZIP | BZIP2 | LZMA2 |
|---|---|---|---|---|
| 433MB | 238MB | 341MB | 277MB | 272MB |

The FLAC conversion was done using FFmpeg with default options.

Apache Commons Compress also supports other compression schemes.

PS: Measuring running times using 1 thread now...
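
As a rough sketch of this kind of measurement, the snippet below compresses a WAV payload with three general-purpose codecs and records size and timings. It uses Python's standard-library `zlib`, `bz2` and `lzma` modules as stand-ins, not Apache Commons Compress or 7zip, and a synthetic 440 Hz tone instead of real speech, so its numbers are not comparable to the figures in this thread. The names `make_test_wav` and `benchmark` are hypothetical.

```python
import bz2
import io
import lzma
import math
import struct
import time
import wave
import zlib


def make_test_wav(seconds=0.5, rate=16000):
    """Return the bytes of a mono 16-bit WAV file holding a 440 Hz tone."""
    n = int(seconds * rate)
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * 440 * i / rate)))
        for i in range(n)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 16-bit samples
        w.setframerate(rate)
        w.writeframes(frames)
    return buf.getvalue()


# Stand-in codecs from the standard library (an assumption of this sketch).
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def benchmark(data):
    """Map codec name to (compressed size, compress seconds, decompress seconds)."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        restored = decompress(packed)
        t2 = time.perf_counter()
        if restored != data:
            raise ValueError(f"{name} round trip changed the data")
        results[name] = (len(packed), t1 - t0, t2 - t1)
    return results


if __name__ == "__main__":
    wav = make_test_wav()
    print(f"wav: {len(wav)} bytes")
    for name, (size, c_s, d_s) in benchmark(wav).items():
        print(f"{name}: {size} bytes, compress {c_s:.4f}s, decompress {d_s:.4f}s")
```

A pure tone compresses far better than speech, so a meaningful comparison would need real extracted audio as input.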

lfcnassif · 2023-08-08T23:38:33Z

Running times using one 7z thread (non-solid mode) and FFmpeg executed multiple times for each file to convert to/from FLAC:

| | WAV | FLAC | ZIP | BZIP2 | LZMA2 |
|---|---|---|---|---|---|
| Size | 433MB | 238MB | 341MB | 277MB | 272MB |
| Compression | - | 73s | 30s | 77s | 170s |
| Decompression | - | 50s | 5s | 14s | 14s |

Of course, running times with Apache Commons Compress will likely differ from the ones above.
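
To make the trade-off easier to read, the measurements above can be turned into space savings and throughput relative to the 433MB of WAV input. This is a small sketch that only re-derives numbers from the reported figures; the function name `summarize` is hypothetical.

```python
# Figures copied from the measurements reported above.
SIZES_MB = {"WAV": 433, "FLAC": 238, "ZIP": 341, "BZIP2": 277, "LZMA2": 272}
COMPRESS_S = {"FLAC": 73, "ZIP": 30, "BZIP2": 77, "LZMA2": 170}
DECOMPRESS_S = {"FLAC": 50, "ZIP": 5, "BZIP2": 14, "LZMA2": 14}


def summarize():
    """Map format to (percent saved, compress MB/s, decompress MB/s).

    Throughput is measured against the uncompressed WAV size.
    """
    base = SIZES_MB["WAV"]
    rows = {}
    for name, secs in COMPRESS_S.items():
        saved_pct = 100 * (1 - SIZES_MB[name] / base)
        rows[name] = (saved_pct, base / secs, base / DECOMPRESS_S[name])
    return rows


if __name__ == "__main__":
    for name, (saved, c_mbs, d_mbs) in summarize().items():
        print(f"{name}: saves {saved:.1f}%, compress {c_mbs:.1f} MB/s, "
              f"decompress {d_mbs:.1f} MB/s")
```

In these terms FLAC saves about 45% of the bytes, LZMA2 about 37%, BZIP2 about 36% and ZIP about 21%, while ZIP is the fastest to compress at roughly 14 MB/s on a single thread.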

wladimirleite · 2023-08-08T23:53:50Z

I thought about that too.
It seems like a good option that will save some network bandwidth without much overhead (in terms of required code, additional libraries and processing time).

lfcnassif added the enhancement label Aug 6, 2023

lfcnassif changed the title ~~Improve optional video remote transcription network usage~~ Improve optional remote video transcription network usage Aug 7, 2023
