Try to use TTS with the Czech language and the latest num2words dependency.
It crashes due to an unsupported language.
Expected behavior
Czech language should work
Logs
File "/usr/local/lib/python3.10/site-packages/TTS/api.py", line 366, in tts_to_file
wav = self.tts(
File "/usr/local/lib/python3.10/site-packages/TTS/api.py", line 312, in tts
wav = self.synthesizer.tts(
File "/usr/local/lib/python3.10/site-packages/TTS/utils/synthesizer.py", line 406, in tts
outputs = self.tts_model.synthesize(
File "/usr/local/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 410, in synthesize
return self.full_inference(text, speaker_wav, language, **settings)
File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 479, in full_inference
return self.inference(
File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 525, in inference
text_tokens = torch.IntTensor(self.tokenizer.encode(sent, lang=language)).unsqueeze(0).to(self.device)
File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 666, in encode
txt = self.preprocess_text(txt, lang)
File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 652, in preprocess_text
txt = multilingual_cleaners(txt, lang)
File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 573, in multilingual_cleaners
text = expand_numbers_multilingual(text, lang)
File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 562, in expand_numbers_multilingual
text = re.sub(_number_re, lambda m: _expand_number(m, lang), text)
File "/usr/local/lib/python3.10/re.py", line 209, in sub
return _compile(pattern, flags).sub(repl, string, count)
File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 562, in <lambda>
text = re.sub(_number_re, lambda m: _expand_number(m, lang), text)
File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 542, in _expand_number
return num2words(int(m.group(0)), lang=lang if lang != "cs" else "cz")
File "/usr/local/lib/python3.10/site-packages/num2words/__init__.py", line 98, in num2words
raise NotImplementedError()
Thanks for the investigation! This repository is no longer updated, but if you like you can open a PR with that fix in our fork. Otherwise I can take care of it in 1-2 weeks.
Describe the bug
Due to a change in the num2words package, `cz` is no longer a valid language code; `cs` should be used now. See savoirfairelinux/num2words#587 for the change.
To Reproduce
Environment
Additional context
A simple fix would be to remove the workaround that was probably applied in the past to get around the non-standard language code in num2words:
https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/layers/xtts/tokenizer.py#L482
https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/layers/xtts/tokenizer.py#L487
https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/layers/xtts/tokenizer.py#L515
https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/layers/xtts/tokenizer.py#L519
The num2words release that contains the fix:
https://github.com/savoirfairelinux/num2words/releases/tag/v0.5.14
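If removing the workaround outright would break installs that still pin an older num2words, a version-tolerant variant might be an option. The sketch below tries the standard `cs` code first and falls back to the legacy `cz` code only when the converter raises `NotImplementedError` (the exception shown in the traceback). The names `expand_number` and `converter` are hypothetical and not part of TTS; `converter` stands in for `num2words.num2words`, and I am assuming older releases reject `cs` with the same exception.

```python
def expand_number(value, lang, converter):
    """Convert an integer to words, tolerating both Czech language codes.

    `converter` is any num2words-like callable taking (number, lang=...).
    Hypothetical helper for illustration, not the actual TTS code.
    """
    candidates = [lang]
    if lang == "cs":
        # Legacy code accepted by num2words releases before the rename.
        candidates.append("cz")

    for code in candidates:
        try:
            return converter(value, lang=code)
        except NotImplementedError:
            # This code is not supported by the installed converter; try the next.
            continue

    raise NotImplementedError(f"no number converter available for language {lang!r}")
```

With num2words installed, this would be called as `expand_number(42, "cs", num2words)` after `from num2words import num2words`. Passing the converter in as an argument keeps the helper easy to test without the dependency.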