Speech To Text processing timeout #1032
-
Thanks for asking your question. Please be sure to reply with as much detail as possible so the community can assist you efficiently.
-
Hey there! It looks like you haven't connected your GitHub account to your Deepgram account. You can do this at https://community.deepgram.com - being verified through this process will allow our team to help you in a much more streamlined fashion.
-
It looks like we're missing some important information to help debug your issue. Would you mind providing us with the following details in a reply?
-
Hi @Hadymohammed, we do offer a managed Whisper service for broader language coverage, but the Whisper models are significantly less efficient than Deepgram's own. Since our own models don't support Arabic, you'll need to use Whisper.
I would recommend submitting requests at off-peak hours (US nights and weekends), when we have lower traffic load and can better serve Whisper requests. How long is the audio you're sending? I would recommend sending shorter audio to Whisper, ideally under 60 minutes, or under 30 minutes if possible. I wouldn't recommend sending audio much over an hour in length, as it will be more likely to time out.
Ultimately, there is going to be a limit to how far you can scale using Deepgram's managed Whisper: processing times are slow, and we have a rate limit of 5 concurrent Whisper requests.
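To make the chunk-length and concurrency advice above concrete, here is a rough sketch (not an official recipe): it splits a long recording into 30-minute segments with ffmpeg and sends each segment to the pre-recorded `/v1/listen` endpoint with at most five requests in flight. The file names, the `whisper-large` / `language=ar` / `diarize=true` parameters, and the environment-variable key handling are assumptions for illustration; please check the current API docs for the exact options that apply to your account.

```python
# Sketch only: chunk a long file, then transcribe the chunks through the
# managed Whisper model without exceeding 5 concurrent requests.
# Assumptions (not from this thread): input file "long_audio.wav",
# API key in the DEEPGRAM_API_KEY environment variable.

import glob
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

import requests

API_URL = "https://api.deepgram.com/v1/listen"
API_KEY = os.environ["DEEPGRAM_API_KEY"]

# 1. Split into 30-minute (1800 s) segments without re-encoding.
subprocess.run(
    ["ffmpeg", "-i", "long_audio.wav", "-f", "segment",
     "-segment_time", "1800", "-c", "copy", "chunk_%03d.wav"],
    check=True,
)

def transcribe(path: str) -> str:
    """Send one chunk to the managed Whisper model and return its transcript."""
    with open(path, "rb") as f:
        resp = requests.post(
            API_URL,
            params={"model": "whisper-large", "language": "ar",
                    "diarize": "true", "punctuate": "true"},
            headers={"Authorization": f"Token {API_KEY}",
                     "Content-Type": "audio/wav"},
            data=f,
            timeout=600,  # Whisper processing is slow; allow a generous timeout
        )
    resp.raise_for_status()
    return resp.json()["results"]["channels"][0]["alternatives"][0]["transcript"]

# 2. At most 5 workers, matching the 5-concurrent-request Whisper limit.
chunks = sorted(glob.glob("chunk_*.wav"))
with ThreadPoolExecutor(max_workers=5) as pool:
    transcripts = list(pool.map(transcribe, chunks))

print("\n".join(transcripts))
```

Keeping the worker pool at 5 means extra chunks simply queue client-side instead of being rejected by the rate limit, and the 30-minute segments stay well under the length where timeouts become likely.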
-
Seeking Support for Optimizing Long Audio Transcriptions with Whisper Large Model
Hello everyone,
We are working on a project to transcribe our internally produced video content, which currently amounts to about 40 hours per month. Our ultimate goal is to scale this system for external users, which could substantially increase our transcription volume.
At present, we’re using the Whisper Large model, primarily because of its support for Arabic transcription and speaker diarization — both are critical features for our workflow. However, we’re running into some significant challenges due to processing constraints:
What We’re Looking For
We’re seeking guidance or suggestions on the following:
Any insights, workarounds, or suggestions would be greatly appreciated.
Thank you!