"Stream was too long" while processing video files larger than 2 GB #25

Open
JohnstonJ opened this issue Jul 29, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@JohnstonJ

Describe the bug
Attempting to process a large video immediately results in a crash: "IOException: Stream was too long".

To Reproduce
Steps to reproduce the behavior:

  1. Process a large video file. Larger than 2 GB should be enough to trigger it.

Expected behavior
WinWhisper should process the video without crashing.

Screenshots

Welcome to WinWhisper (1.3.2.0). Generate subtitles with ease using WhisperAI.
Enter the path where you want the subtitles to be saved...
Leave empty to save the subtitles in the ./Subtitles folder

Enter the video path or the folder path that contains the videos you want to process...
C:\Users\JOHNST~1\AppData\Local\Temp\mpout\Projects\myfile.mkv
In which language code (en,nl etc) is the audio for video: myfile.mkv? Leave empty to auto detect
en
Do you want to translate the subtitles to English? (yes/no) Default: no

Processing video: Johnston #4 - 2004 Forest Lake Academy, Big Bend.mkv
Extracting audio from video file located at: C:\Users\JOHNST~1\AppData\Local\Temp\mpout\Projects\myfile.mkv
This might take a while depending on the file size and drive speed...
An error occured. Please report the following on our GitHub page: https://github.com/GewoonJaap/WinWhisper/issues/new?assignees=&labels=&projects=&template=bug_report.md&title=
Error details:
========== Start Of Error ==========
Error name:
IOException
Error message:
Stream was too long.
Error stacktrace:
   at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   at System.IO.Stream.CopyTo(Stream destination, Int32 bufferSize)
   at AudioExtractor.Extractor.ExtractAudioFromVideoFile(String videoFilePath)
   at Program.<>c__DisplayClass0_0.<<Main>b__1>d.MoveNext()
--- End of stack trace from previous location ---
   at Utility.LoopUtil.ForEachAsync[T](List`1 list, Func`2 func)
   at Program.Main(String[] args)
Error inner exception:

Error inner exception stacktrace:

========== End Of Error ==========
Press any key to exit...

Desktop (please complete the following information):

  • OS: Windows 10 22H2 (OS Build 19045.4651)

Additional context
At https://github.com/GewoonJaap/WinWhisper/blob/5a77de230f93e5abab8b54e33f2a7dd0206ce895/AudioExtractor/Extractor.cs#L16C36-L16C48 it looks like you are copying the entire input file into a MemoryStream. MemoryStream has a documented maximum length of roughly Int32.MaxValue bytes (about 2 GB), per https://learn.microsoft.com/en-us/dotnet/api/system.io.memorystream.setlength?view=net-8.0, because it is backed by a single byte array and that is about the largest array .NET allows.
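
For anyone who wants to confirm the cap without a 2 GB file, here is a minimal sketch. The class and method names are made up for illustration and are not part of WinWhisper. It relies on the documented behavior of MemoryStream.SetLength, which rejects a length above the maximum with an ArgumentOutOfRangeException before allocating anything. The "Stream was too long" IOException in the stack trace above comes from the Write path instead, but as far as I can tell it enforces the same cap.

```csharp
using System;
using System.IO;

static class MemoryStreamLimitSketch
{
    // Hypothetical helper: returns true if MemoryStream refuses the requested length.
    public static bool RejectsLength(long length)
    {
        using var stream = new MemoryStream();
        try
        {
            // Throws ArgumentOutOfRangeException when length exceeds the
            // MemoryStream maximum, per the SetLength documentation.
            stream.SetLength(length);
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }
}
```

Calling `MemoryStreamLimitSketch.RejectsLength((long)int.MaxValue + 1)` should return true, while a small value such as 1024 should return false.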

Even if MemoryStream did not have this limit, copying the entire file into memory is probably still not a scalable approach, since the size of the video file might exceed physical memory.
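
One possible direction, sketched under assumptions: copy the data to a temporary file in fixed-size chunks instead of buffering it in memory. `CopyToTempFile` is a hypothetical helper, not existing WinWhisper code, and I have not checked how it would fit into Extractor.cs. Memory use is bounded by the copy buffer rather than by the file size.

```csharp
using System;
using System.IO;

static class StreamingCopySketch
{
    // Hypothetical helper: streams 'source' into a new temp file and returns its path.
    // Only one buffer (80 KB here) is held in memory at a time.
    public static string CopyToTempFile(Stream source)
    {
        string tempPath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        using (var destination = new FileStream(
            tempPath, FileMode.CreateNew, FileAccess.Write, FileShare.None))
        {
            source.CopyTo(destination, 81920);
        }
        return tempPath;
    }
}
```

The caller would be responsible for deleting the temp file once processing is finished.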

@GewoonJaap
Owner

Hi, thanks for reporting this bug.
I will try to get this fixed :) I noticed, however, that Whisper expects a stream containing a valid WAV file, so I will have to see whether I can either use a different kind of stream or split the WAV file into multiple parts.
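
A rough sketch of the "different stream" option, assuming the extracted WAV audio has already been written to disk and that the consumer accepts any readable `Stream`. `OpenWavForReading` is a hypothetical helper name. A FileStream is seekable and reports its length as a 64-bit value, so it does not share the MemoryStream cap. Note that the classic RIFF/WAV header uses 32-bit size fields, so a single WAV file has its own limit of about 4 GB.

```csharp
using System;
using System.IO;

static class WavStreamSketch
{
    // Hypothetical helper: opens an existing WAV file as a read-only, seekable stream.
    // SequentialScan hints to the OS that the file is read front to back.
    public static FileStream OpenWavForReading(string wavPath)
    {
        return new FileStream(
            wavPath, FileMode.Open, FileAccess.Read, FileShare.Read,
            81920, FileOptions.SequentialScan);
    }
}
```

The returned stream should be disposed by the caller, for example with a `using` statement around whatever code consumes it.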

@GewoonJaap GewoonJaap added the bug Something isn't working label Jul 31, 2024
@GewoonJaap GewoonJaap self-assigned this Jul 31, 2024
2 participants