Clap processor: remove wasteful np.stack operations #27454
np.stack was being called on a large 1-D array, causing ~0.5s processing time on short audio (<10s), compared to 0.02s for medium-length audio.
Thanks for updating @m-bain!
Could you provide an example snippet you used to run this for future reference of anyone visiting this PR?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
@amyeroberts how's this?

import time
import numpy as np

# Stand-in for a short 1-D waveform
waveform = np.random.rand(100_000)
n_repeat = 10

# Previous implementation: tile, then a redundant np.stack
t1_p = time.time()
prev_impl = np.stack(np.tile(waveform, n_repeat))
t2_p = time.time()

# New implementation: tile only
t1_n = time.time()
new_impl = np.tile(waveform, n_repeat)
t2_n = time.time()

# Both implementations produce identical output
assert (prev_impl == new_impl).all()
print(f"Time to process [prev. impl.]: {t2_p-t1_p:.3f}s")
print(f"Time to process [new. impl.]: {t2_n-t1_n:.3f}s")
@m-bain Thanks!
Thanks for the catch! 🤗
What does this PR do?
Profiling showed a strange result: the ClapProcessor was taking 0.5s to apply _get_input_mel(...) on short audio (less than 10s), whereas medium-length audio (10s-20s) was taking only 0.02s.

As it turns out, there was a wasteful np.stack operation on the 1-D waveform NumPy array, meaning that the 1-D array was unpacked and then stacked back together again, with no effect. This PR removes this wasteful op, and short audio is now also processed in 0.02s.
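The no-op described in the PR description can be checked in isolation. This is a minimal sketch on a small stand-in array; the array contents and sizes are illustrative assumptions and are not taken from ClapProcessor itself:

```python
import numpy as np

# Stand-in for a short mono waveform (illustrative values, not real audio)
waveform = np.arange(6, dtype=np.float32)

# Repeat the waveform three times; the result is still a 1-D array
tiled = np.tile(waveform, 3)

# np.stack expects a sequence of arrays. Given a 1-D array, it iterates over
# it element by element, treats every element as a 0-d array, and joins them
# along a new axis 0, which rebuilds the same 1-D array.
restacked = np.stack(tiled)

assert restacked.shape == tiled.shape
assert restacked.dtype == tiled.dtype
assert np.array_equal(restacked, tiled)
```

Because the per-element work happens in Python, the cost of the redundant call grows with the number of samples, which would be consistent with the slowdown reported for the repeat-padded short clips.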
Who can review?
@ArthurZucker
@sanchit-gandhi