Improve Speech-To-Text (Rules-Based Approach) #17
Labels: enhancement, good first issue, good-new-member-issue, help wanted
Objective
The DeepSpeech Speech-To-Text system needs to be improved to handle uncommon and non-English words. The rules-based approach is to inspect the output of the current DeepSpeech model and build a mapping from the text it actually produces to the expected output. The post-processing step looks up the substrings of the transcribed text in that mapping and, whenever a substring matches a known error, replaces it with the expected text. You can store these mappings in a JSON file.
Key Result
Using the run_stt function of stream_deepspeech.py, return a correctly transcribed string for the audio input.
swanton/stream_deepspeech.py
Line 16 in b8e5502
Example
Expected Transcription: "What is Casa Verde used for?" => DeepSpeech transcription: "What is cause uh very day used for?"
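A minimal sketch of the substring replacement described in the Objective, applied to the example above. The names `MAPPINGS_JSON`, `load_mappings`, and `correct_transcript` are hypothetical and not part of stream_deepspeech.py. The mapping is inlined as a string here; in practice it would be read from the JSON file. Longest-match-first ordering and word-boundary matching are design assumptions, not requirements from this issue.

```python
import json
import re

# Hypothetical mapping: mis-transcribed phrase -> expected phrase.
# In practice this would be loaded from the JSON mapping file.
MAPPINGS_JSON = '{"cause uh very day": "Casa Verde"}'


def load_mappings(text):
    """Parse the JSON mapping and lowercase the keys for case-insensitive lookup."""
    return {key.lower(): value for key, value in json.loads(text).items()}


def correct_transcript(transcript, mappings):
    """Replace every known mis-transcribed substring with its expected text."""
    if not mappings:
        return transcript
    # Try longer phrases first so they win over shorter overlapping ones.
    keys = sorted(mappings, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(key) for key in keys) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: mappings[m.group(0).lower()], transcript)


mappings = load_mappings(MAPPINGS_JSON)
print(correct_transcript("What is cause uh very day used for?", mappings))
# prints: What is Casa Verde used for?
```

The same function could be applied to the string returned by run_stt before it is handed back to the caller.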
Details
Correctly transcribe all QA pairs from the question-answer pairs Google Sheet.
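One way to track progress toward this goal is a small check that runs a correction function over every pair and reports the ones that still differ. This is only a sketch: `find_failures`, the sample rows, and the stand-in `correct` function are hypothetical, and the real rows would come from the Google Sheet and from actual DeepSpeech output.

```python
def find_failures(pairs, correct):
    """Return (raw, expected, got) for each pair whose corrected text still differs."""
    failures = []
    for raw, expected in pairs:
        got = correct(raw)
        # Compare case-insensitively and ignore surrounding whitespace.
        if got.strip().lower() != expected.strip().lower():
            failures.append((raw, expected, got))
    return failures


# Hypothetical rows: (raw DeepSpeech output, expected question text).
pairs = [
    ("what is cause uh very day used for", "what is casa verde used for"),
    ("where is the library", "where is the library"),
]


def correct(text):
    """Stand-in for the real rules-based correction step."""
    return text.replace("cause uh very day", "casa verde")


for raw, expected, got in find_failures(pairs, correct):
    print(f"MISMATCH: {raw!r} -> {got!r}, expected {expected!r}")
```

Each reported mismatch is a candidate for a new entry in the mapping.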
You will need the following DeepSpeech model and DeepSpeech scorer to use run_stt. Ensure that these files are in the same directory as the stream_deepspeech.py program.
If you need assistance, please ask @chidiewenike.