Identifying Spoken Language #4903

Sasha-Bachynskyi · 2022-09-08T11:47:22Z

Hello, developers.
Is there a model or some other way to identify the spoken language? For example, how can I tell whether a speaker is speaking English or Russian?
I looked through the tutorials and found nothing.
I would appreciate any help.

nithinraok · 2022-09-21T03:21:28Z

@fayejf is the model published? Please point to the docs.

jnnnnn · 2022-09-30T10:00:45Z

It looks like there is a labeller, see https://github.com/NVIDIA/NeMo/blob/main/examples/asr/speech_classification/speech_to_label.py#L81

fayejf · 2022-10-05T17:10:50Z

@jnnnnn @Sasha-Bachynskyi The model is published. Thanks for your patience. #5080

Sasha-Bachynskyi · 2022-11-15T08:49:30Z

Hi, @fayejf!

I can't figure out how to use this model. There is only an example of how to initialize it.
Could you give an example of which method I should call and how to pass in the audio file?

Thank you in advance for helping!

nithinraok · 2022-11-15T18:33:13Z

Hi @Sasha-Bachynskyi, the PR that adds this information to the docs should be merged soon.
#5366

You may infer the label using EncDecSpeakerLabelModel class. https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/api.html#nemo.collections.asr.models.EncDecSpeakerLabelModel

For inference on a single audio file, use the get_label method. For inference on multiple files, use batch_inference instead.
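
For the multi-file case, here is a minimal sketch of preparing the input. It assumes batch_inference reads a NeMo-style JSON-lines manifest (one object per line with audio_filepath, duration and label). The helper names wav_duration and write_manifest, the placeholder label, and the commented call at the end are illustrative assumptions rather than something confirmed in this thread, so check the signature in your installed NeMo version.

```python
import json
import wave


def wav_duration(path):
    """Return the duration in seconds of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def write_manifest(audio_paths, manifest_path, placeholder_label="unknown"):
    """Write one JSON object per line, one line per audio file."""
    with open(manifest_path, "w", encoding="utf-8") as manifest:
        for path in audio_paths:
            entry = {
                "audio_filepath": path,
                "offset": 0,
                "duration": wav_duration(path),
                # Placeholder only: the true language is what we want to predict.
                "label": placeholder_label,
            }
            manifest.write(json.dumps(entry) + "\n")
    return manifest_path


# Hypothetical usage with the model (assumed call, not executed here):
# write_manifest(["a.wav", "b.wav"], "manifest.json")
# outputs = langid_model.batch_inference("manifest.json")
```

The manifest keeps file paths and durations outside the model call, so the same file list can be reused across runs.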

Sasha-Bachynskyi · 2023-02-22T12:49:48Z

Hi @nithinraok, I'm sorry for bothering you.
I want to identify the spoken language in a single file.

I use the following instruction

Below is my code:

import nemo.collections.asr as nemo_asr

langid_model = nemo_asr.models.EncDecSpeakerLabelModel.from_pretrained(model_name="langid_ambernet")

lang = langid_model.get_label('audio.wav')

But, I get an error:

Traceback (most recent call last):
  File "/home/denis/test_lang/test-lang.py", line 5, in <module>
    lang = vad_model.get_label('audio.wav')
  File "/home/denis/anaconda3/envs/nemo2/lib/python3.9/site-packages/nemo/collections/asr/models/label_models.py", line 455, in get_label
    _, logits = self.infer_file(path2audio_file=path2audio_file)
  File "/home/denis/anaconda3/envs/nemo2/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/denis/anaconda3/envs/nemo2/lib/python3.9/site-packages/nemo/collections/asr/models/label_models.py", line 427, in infer_file
    audio = librosa.core.resample(audio, sr, target_sr)
TypeError: resample() takes 1 positional argument but 3 were given

It seems that something is wrong with librosa.

System info:
GPU - NVIDIA A40
NeMo - main branch, installed on 22 February 2023
librosa - 0.10.0

What could be causing this? Thanks in advance for any help.

nithinraok · 2023-02-22T23:20:17Z

It looks like the newest librosa version requires these arguments to be passed by keyword. Lower your librosa version or use the fix provided in #6086.
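
To illustrate why the positional call fails, here is a minimal sketch using a stand-in function whose arguments after the first are keyword-only, which is what the TypeError in the traceback indicates for resample in librosa 0.10.0. The name resample_stub is hypothetical and the function does no resampling; only the calling convention matters.

```python
def resample_stub(y, *, orig_sr, target_sr):
    """Stand-in with keyword-only sample rates; returns its arguments unchanged."""
    return y, orig_sr, target_sr


audio, sr, target_sr = [0.0, 0.1, 0.2], 44100, 16000

# Positional call, in the same style as the failing line in the traceback.
message = None
try:
    resample_stub(audio, sr, target_sr)
except TypeError as err:
    message = str(err)
    print(message)

# Keyword call: accepted.
result = resample_stub(audio, orig_sr=sr, target_sr=target_sr)
```

With the real library, the equivalent keyword call would be librosa.resample(audio, orig_sr=sr, target_sr=target_sr).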

Sasha-Bachynskyi added the bug label Sep 8, 2022

Sasha-Bachynskyi changed the title ~~Define Spoken Language~~ Identify Spoken Language Sep 8, 2022

Sasha-Bachynskyi changed the title ~~Identify Spoken Language~~ Identifying Spoken Language Sep 8, 2022

fayejf closed this as completed Oct 5, 2022
