Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

porting the dithering function from kaldi #40

Merged
merged 3 commits into from
Mar 15, 2024

Conversation

KarelVesely84
Copy link
Contributor

No description provided.

@KarelVesely84
Copy link
Contributor Author

KarelVesely84 commented Mar 14, 2024

okay, I added the missing part, the local build including the test binaries was okay...

@KarelVesely84
Copy link
Contributor Author

the failed linux-macos test seems unrelated

there is a return at unusual place in that failing test test_rfft.py

- 1.0 is too much, and would break the systems
@KarelVesely84
Copy link
Contributor Author

for me it is ready, i just changed the default dithering constant

k2 uses audio samples [-1..+1]
kaldi was using audio samples [-32k..+32k], hence the constant is 2^15x smaller

@csukuangfj
Copy link
Owner

Thanks! Will check it tomorrow morning.

@csukuangfj
Copy link
Owner

Thank you for your contribution!

@csukuangfj csukuangfj merged commit 2c7ecc6 into csukuangfj:master Mar 15, 2024
4 of 7 checks passed
@KarelVesely84 KarelVesely84 deleted the add_dithering branch March 20, 2024 17:36
@nshmyrev
Copy link

nshmyrev commented Apr 9, 2024

Hey, can we please change dither constant to be 1.0 like it was before. 0.00003 vs 1.0 can cause a lot of confusion in existing code.

@csukuangfj
Copy link
Owner

Could you create a PR to change it to 1.0?

@KarelVesely84
Copy link
Contributor Author

KarelVesely84 commented Apr 12, 2024

The problem is, that in some situations 1.0 is the correct value (for +/- 32k audio signal), while in other situation 0.00003 is the correct value (+/- 1.0 audio signal range), and both options are possible use-cases. So there will be confusion either on one or the other side. And the detection cannot be 100% reliable, as there can be an input signal with hard zero's in it.

So it is better to document it, and give a hint to the users.

But, it is true that if default is 1.0, being internally divided into 0.00003. And, occasionally it is used with +/-32k input singal, it will effectively disable the dithering. While as it is now, using 1.0 will break the system with +/-1.0 audio signal. So, changing the default to 1.0 is a safer way...

@nshmyrev
Copy link

Let it stay then, sorry for confusion and thanks for the patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants