Fix Loudness Issue #232

Open
wants to merge 3 commits into
base: main
Conversation

alastorid

@alastorid alastorid commented Jan 19, 2025

Tested with Chinese and found no clipping.

[Images: waveform before the PR and waveform after the PR]

Text used:

我要給阿Q做正傳,已經不止一兩年了。但一面要做,一面又往回想,這足見我不是一個「立言」的人,因為從來不朽之筆,須傳不朽之人,於是人以文傳,文以人傳——究竟誰靠誰傳,漸漸的不甚瞭然起來,而終於歸接到傳阿Q,彷彿思想裡有鬼似的。

Edit: This PR is based on PR #221 and normalizes the audio stream without affecting audio quality. However, the implementation is now different.
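For readers unfamiliar with the idea, a minimal sketch of one dependency-light way to normalize a waveform is shown below. This is not the code from this PR or from #221; the function name `normalize_peak`, the `target_peak` value, and the choice of simple peak scaling are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def normalize_peak(audio, target_peak=0.95):
    """Scale a float waveform so its largest absolute sample equals target_peak.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    audio = np.asarray(audio, dtype=np.float32)
    # Largest absolute sample; treat empty input like silence.
    peak = float(np.max(np.abs(audio))) if audio.size else 0.0
    if peak == 0.0:
        return audio  # silence or empty input: nothing to scale
    # One gain for the whole signal keeps the waveform shape and length intact.
    return audio * (target_peak / peak)
```

Because a single gain factor is applied to every sample, the output has the same number of samples as the input and no sample exceeds `target_peak` in magnitude, so the result cannot clip as long as `target_peak` is below 1.0.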

@HarryBXie

Hello, @alastorid

I have been working with the Melo TTS source recently and am trying to deploy this model on an edge device. After following your PR, I ran into the problem below. Please take a look when you have time, thank you!
In melo/app.py

  • audio_list.append(utils.fix_loudness(audio,self.hps.data.sampling_rate))

causes almost all of the audio content to be lost; only about the last 3 seconds of audio are saved. Environment: aarch64, conda, Python 3.9, Ubuntu 22.04. All dependencies were installed as listed in the requirements.
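One way to narrow down a symptom like this is to check that the per-segment processing step returns as many samples as it receives before the result is appended. The sketch below is only an illustration: `append_checked` is a hypothetical helper, and `process` stands in for whatever per-segment function is being called (for example, a wrapper around `utils.fix_loudness` with the sampling rate already bound). NumPy is assumed to be available.

```python
import numpy as np


def append_checked(audio_list, audio, process):
    """Apply `process` to one segment and append it, verifying the length is unchanged.

    Hypothetical debugging helper; `process` is any callable taking and
    returning a waveform array.
    """
    original_shape = np.shape(audio)
    processed = np.asarray(process(audio))
    if processed.shape != original_shape:
        # A shorter output here would explain missing audio in the final file.
        raise ValueError(
            f"segment shape changed from {original_shape} to {processed.shape}"
        )
    audio_list.append(processed)
    return audio_list
```

If the check raises, the samples are being dropped inside the processing step itself rather than during concatenation or file writing.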

@alastorid
Author

Hello @HarryBXie ,
Thanks for reporting the issue. I’ve reworked the solution, and it now works without the need for additional dependencies, unlike before. Could you please pull the latest version of this PR and try again?

@HarryBXie

@alastorid,
Thank you for your quick response; the new PR works well.
I have tried it on two development platforms, and the PR is effective on both.
