Improve Speech-To-Text (Rules-Based Approach) #17

Open
chidiewenike opened this issue Oct 17, 2020 · 3 comments
Labels
enhancement New feature or request good first issue Good for newcomers good-new-member-issue help wanted Extra attention is needed

Comments

@chidiewenike
Collaborator

chidiewenike commented Oct 17, 2020

Objective

The DeepSpeech speech-to-text system needs to handle uncommon and non-English words better. The rules-based approach is to inspect the output of the current DeepSpeech model and build a mapping from each mis-transcribed phrase to the expected text. After a transcription is produced, look up its substrings in that mapping and replace any that match a known error. The mappings can be stored in a JSON file.

Key Result

Using the run_stt function of stream_deepspeech.py, return a correctly transcribed string for the audio input.

def run_stt(time_len=TIME_LEN):

Example

Expected Transcription: "What is Casa Verde used for?" => DeepSpeech transcription: "What is cause uh very day used for?"

mapping = {
    "cause uh very day": "Casa Verde"
}
transcription_substring = "cause uh very day"
print(mapping[transcription_substring])  # Output: Casa Verde
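
The lookup above only covers an exact key. A minimal sketch of how the "look up all the substrings" step from the Objective might work on a whole transcription is below. The function name correct_transcription and the max_words limit are hypothetical, and the sketch assumes the transcription is lowercase text without punctuation. It slides a window of words over the text and tries the longest window first, so a longer phrase wins over a shorter one that overlaps it.

```python
def correct_transcription(text, mapping, max_words=4):
    """Replace known mis-transcribed word sequences using `mapping`.

    `max_words` is an assumed upper bound on the number of words in a key.
    """
    words = text.split()
    corrected = []
    i = 0
    while i < len(words):
        # Try the longest window first so longer phrases take priority.
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in mapping:
                corrected.append(mapping[candidate])
                i += n
                break
        else:
            # No window starting at this word matched; keep the word as-is.
            corrected.append(words[i])
            i += 1
    return " ".join(corrected)


mapping = {"cause uh very day": "Casa Verde"}
result = correct_transcription("what is cause uh very day used for", mapping)
print(result)  # what is Casa Verde used for
```

Matching on whole words avoids replacing text in the middle of an unrelated word, which a plain character-level replace could do.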

Details

Correctly transcribe all QA pairs from the question-answer pairs Google Sheet.

You will need the following DeepSpeech model and DeepSpeech scorer to use run_stt. Ensure that these files are in the same directory as the stream_deepspeech.py program.

If in need of assistance, please ask @chidiewenike

@chidiewenike chidiewenike added enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed good-new-member-issue labels Oct 17, 2020
@chidiewenike
Collaborator Author

Algorithms to consider in the future:

  • Levenshtein Distances/Fuzzy Matching
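
As a rough illustration of how Levenshtein distance could complement the exact mapping, the sketch below finds the mapping key closest to a transcribed phrase and accepts it only within a small edit distance. The names levenshtein and closest_correction and the max_distance threshold are assumptions for illustration, not part of the project.

```python
def levenshtein(a, b):
    """Return the edit distance (insert, delete, substitute) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def closest_correction(phrase, mapping, max_distance=2):
    """Return the correction for the nearest key, or None if none is close enough."""
    best_key = min(mapping, key=lambda k: levenshtein(phrase, k), default=None)
    if best_key is not None and levenshtein(phrase, best_key) <= max_distance:
        return mapping[best_key]
    return None


mapping = {"cause uh very day": "Casa Verde"}
print(closest_correction("cause a very day", mapping))  # Casa Verde
```

A threshold that is too loose would rewrite correct text, so max_distance would need tuning against real transcriptions.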

@chidiewenike
Collaborator Author

chidiewenike commented Oct 28, 2020

Storing JSON from Python Dict

import json

mapper = {
    "ramona roderigo": "Ramon Rodriguez"
}

# Write the mapping to a JSON file.
with open("test_json.json", "w") as out_json:
    json.dump(mapper, out_json)

print(mapper["ramona roderigo"])  # Output => Ramon Rodriguez
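
The snippet above only writes the file. A small sketch of the full round trip, reading the mapping back with json.load, might look like the following. The temporary directory is an assumption made so the example does not touch the repository; the issue does not say where the mapping file should live. Passing ensure_ascii=False is a suggestion for keeping non-English characters readable in the stored file.

```python
import json
import os
import tempfile

mapper = {"ramona roderigo": "Ramon Rodriguez"}

# Hypothetical location for the mapping file.
path = os.path.join(tempfile.mkdtemp(), "test_json.json")

with open(path, "w", encoding="utf-8") as out_json:
    # ensure_ascii=False writes accented characters as-is instead of \u escapes.
    json.dump(mapper, out_json, ensure_ascii=False, indent=4)

with open(path, encoding="utf-8") as in_json:
    loaded = json.load(in_json)

print(loaded["ramona roderigo"])  # Ramon Rodriguez
```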

Pseudo-Python for Substring Mapper

from stream_deepspeech import run_stt

def stt_mapper(predicted):
    # Maps a known mis-transcription to the expected text.
    mapper = {
        "romona roderigo": "ramon rodriguez"
    }

    # Check each known error against the transcription.
    # str.replace returns a new string, so the result must be reassigned.
    for wrong, right in mapper.items():
        if wrong in predicted:
            predicted = predicted.replace(wrong, right)

    return predicted

predicted = run_stt(5)
# predicted => "romona roderigo is an operator at the ranch"
correct = stt_mapper(predicted)
# correct => "ramon rodriguez is an operator at the ranch"

@chidiewenike
Collaborator Author

@akimminavarro @taylor-nguyen-987 Can you folks try out Swanton, Swanton Pacific, and Swanton Pacific Ranch? Maybe start with those?

taylor-nguyen-987 added a commit that referenced this issue Apr 23, 2021