Add play and speed to cli options #3027

Merged: 18 commits, Oct 16, 2023


David-bfg
Contributor

A quality of life addition for the cli tool.
Add play argument to play TTS after it is generated.
Additionally include speed argument to be used with Coqui Studio models.

--play argument uses simpleaudio to play the tts wav
--speed <float 0.0-2.0> passes speed argument to Coqui Studio models
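As a rough illustration of the two options described above, the argument definitions could look something like the sketch below. This is not the code from this PR: `speed_value`, `build_parser`, the `tts-sketch` program name, and the default of 1.0 are assumptions made for the example. Only the `--play` and `--speed` flags and the 0.0-2.0 range come from the description.

```python
import argparse


def speed_value(text: str) -> float:
    """Parse a speed value and check that it lies in the 0.0-2.0 range (hypothetical helper)."""
    value = float(text)
    if not 0.0 <= value <= 2.0:
        raise argparse.ArgumentTypeError(f"speed must be between 0.0 and 2.0, got {value}")
    return value


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser holding only the options discussed here (hypothetical)."""
    parser = argparse.ArgumentParser(prog="tts-sketch")
    parser.add_argument("--text", type=str, default=None, help="Text to synthesize.")
    parser.add_argument("--play", action="store_true", help="Play the wav after it is generated.")
    # The default of 1.0 is an assumption, not taken from the PR.
    parser.add_argument("--speed", type=speed_value, default=1.0, help="Speed factor, 0.0-2.0.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--text", "hello", "--play", "--speed", "1.5"])
    print(args.play, args.speed)
```

Validating the range in the `type` callable means an out-of-range value is rejected by argparse itself with a normal usage error, before any model is loaded.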
@CLAassistant

CLAassistant commented Oct 3, 2023

CLA assistant check
All committers have signed the CLA.

@erogol
Member

erogol commented Oct 6, 2023

Thanks for the PR @David-bfg

Would that work on different OSs? On linux you could just pipe to play commands I guess. It'd be better than introducing a new dependency.

@David-bfg
Contributor Author

> Would that work on different OSs?

https://simpleaudio.readthedocs.io/en/latest/
simpleaudio is cross-platform: Windows, Linux, and macOS.

> you could just pipe to play commands I guess.

Yes, I think it would be preferable to just add a --pipe_out arg.

Most or all of the logs are print calls, so they would need to be suppressed for that to work. I wasn't aware of a way to do that beyond wrapping each log in an if statement, but a cursory Google search suggests there is something more manageable.

I'll look into it and circle back. I'm just hoping there is a function for writing the raw wav data to stdout that is as simple as the one for saving it to a file.
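A minimal sketch of the idea, using only the standard library: `contextlib.redirect_stdout` moves print-based logs off stdout, and the `wave` module builds the WAV container in memory so the bytes can be written to any binary stream. The function names, the 22050 Hz sample rate, and the 16-bit mono format are assumptions for illustration and are not taken from the TTS code base.

```python
import contextlib
import io
import sys
import wave


def wav_bytes(pcm: bytes, sample_rate: int = 22050) -> bytes:
    """Wrap raw 16-bit mono PCM samples in a WAV container, in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono (assumed)
        wav_file.setsampwidth(2)  # 16-bit samples (assumed)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def synthesize_quietly(pcm: bytes, pipe_out=None) -> bytes:
    """Keep print() output off stdout, then write the WAV bytes to pipe_out if given."""
    # Everything printed inside this block goes to stderr instead of stdout.
    with contextlib.redirect_stdout(sys.stderr):
        print("synthesizing...")  # stand-in for the existing print-based logs
        data = wav_bytes(pcm)
    if pipe_out is not None:
        pipe_out.write(data)
        pipe_out.flush()
    return data
```

Passing `sys.stdout.buffer` as `pipe_out` writes the binary data to standard output, which is what a downstream player would read from the pipe.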

Considering a conversion to pipe wav data to another program, such as aplay, for audio playback.

This is incomplete code, posted to get feedback before proceeding with the implementation.
pipe_out = sys.stdout if args.pipe_out else None

with contextlib.redirect_stdout(None if args.pipe_out else sys.stdout):
    # Late-import to make things load faster
Contributor Author

@David-bfg David-bfg Oct 6, 2023


Lines 368-417 and 420-531 are indentation changes only.
Use "Hide whitespace" to see the actual changes more clearly.

@David-bfg
Contributor Author

@erogol do you have any comments on commit f1b1f4a?

I looked into sending the wav file to standard out instead of the logs. This is roughly what I'm looking to implement. I just wanted to check that the idea looks reasonable before cleaning it up and fully implementing it.
Thanks.

@erogol
Member

erogol commented Oct 9, 2023

@David-bfg I think piping with standard out makes sense. But we should drop the play argument to keep things simpler.
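With playback left to an external program, the consuming side stays small. The sketch below assumes a player that reads WAV data from its standard input, as aplay does when given no file name. `pipe_to_player` is a hypothetical helper written for this example and is not part of the TTS CLI.

```python
import subprocess
from typing import Sequence


def pipe_to_player(wav_data: bytes, player_cmd: Sequence[str]) -> int:
    """Feed WAV bytes to an external command's stdin and return its exit code."""
    # check=False: report the player's exit code instead of raising on failure.
    result = subprocess.run(list(player_cmd), input=wav_data, check=False)
    return result.returncode
```

For example, `pipe_to_player(data, ["aplay"])` would hand the bytes to aplay on a Linux system where it is installed, so playback needs no extra Python dependency.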

@David-bfg
Contributor Author

@erogol unless there are further code comments, this should be complete or at its last stage.

@erogol
Member

erogol commented Oct 13, 2023

@David-bfg I'll review it next Monday. Thanks for the update.

@erogol erogol merged commit a151d70 into coqui-ai:dev Oct 16, 2023
@omega3

omega3 commented Jan 21, 2024

I installed via
pip install TTS
and
tts --text "companies seeking competitive advantage." --model_name "tts_models/en/vctk/vits" --speed 1.5 --out_path "$out" --speaker_idx p230

shows
tts: error: unrecognized arguments: --speed 1.5

@David-bfg
Contributor Author

@omega3 https://pypi.org/project/TTS/ and the latest README show that speed has been removed from the docs. It only worked with the ⓍTTS voice model, if I recall.
