ISTFT #27
Comments
Great. Did you test the reconstruction error?
I think lines 33-40 can be replaced with a transposed convolution. (I've done that in my private repository, but also didn't make a PR to PyTorch since it's not written in C like torch.stft.)
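For illustration, a minimal sketch of that idea (not the private-repo code): the overlap-add of already-windowed frames can be expressed as a 1-D transposed convolution with an identity kernel. The helper name overlap_add and the tensor layout are assumptions, and the result is checked against a naive loop.

```python
import torch
import torch.nn.functional as F

def overlap_add(frames: torch.Tensor, hop: int) -> torch.Tensor:
    # frames: (batch, frame_len, n_frames), e.g. the per-frame inverse DFTs,
    # already multiplied by the synthesis window.
    batch, frame_len, n_frames = frames.shape
    # Identity kernel: input channel i writes its value at offset i of each frame.
    kernel = torch.eye(frame_len, dtype=frames.dtype, device=frames.device).unsqueeze(1)
    out = F.conv_transpose1d(frames, kernel, stride=hop)  # (batch, 1, (n_frames-1)*hop + frame_len)
    return out.squeeze(1)

# Sanity check against a naive overlap-add loop
torch.manual_seed(0)
frames, hop = torch.randn(2, 8, 5), 4
fast = overlap_add(frames, hop)
slow = torch.zeros_like(fast)
for f in range(frames.shape[-1]):
    slow[:, f * hop: f * hop + 8] += frames[:, :, f]
assert torch.allclose(fast, slow, atol=1e-6)
```

The per-frame inverse DFT and the squared-window normalization are separate steps on top of this.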
@faroit Could you possibly test it yourself? I did, and the error was …
@seungwonpark Yes. I think there are pros and cons, though. That said, we can simply time them and compare :) So yes, please share it! Even better if you can measure them.
@keunwoochoi Thanks for pointing out the difference! I'll let you know when I'm ready to share.
Here it is. I've also measured the time performance, and it looks like your implementation is much faster! By the way, I can't figure out the actual cause of the difference. The major difference is that mine uses a transposed convolution rather than rfft.
@keunwoochoi Cool, I tested yours in a small experiment where I ran the STFT of one library (librosa, scipy, tf, torch) against the ISTFT of another and measured the reconstruction RMSE.
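Not the actual experiment code, but a minimal sketch of that kind of cross-library round trip, shown here for the librosa/torch pair only and assuming recent librosa and PyTorch (whose torch.istft did not exist at the time of this thread). With an explicit window and reflect padding, both libraries use the same unscaled, center-padded convention, so the pairs can be mixed directly:

```python
import numpy as np
import librosa
import torch

x = np.random.default_rng(0).standard_normal(22050).astype(np.float32)
n_fft, hop = 1024, 256
win = torch.hann_window(n_fft)

def stft_librosa(y):
    return librosa.stft(y, n_fft=n_fft, hop_length=hop, window="hann",
                        center=True, pad_mode="reflect")

def istft_librosa(X):
    return librosa.istft(X, hop_length=hop, window="hann", center=True, length=len(x))

def stft_torch(y):
    return torch.stft(torch.from_numpy(y), n_fft, hop_length=hop, window=win,
                      center=True, pad_mode="reflect", return_complex=True).numpy()

def istft_torch(X):
    return torch.istft(torch.from_numpy(X), n_fft, hop_length=hop, window=win,
                       center=True, length=len(x)).numpy()

for sname, stft in [("librosa", stft_librosa), ("torch", stft_torch)]:
    for iname, istft in [("librosa", istft_librosa), ("torch", istft_torch)]:
        x_hat = istft(stft(x))
        rmse = np.sqrt(np.mean((x - x_hat) ** 2))
        print(f"{sname} STFT -> {iname} ISTFT: RMSE = {rmse:.2e}")
```

The scipy and tf pairs would follow the same adapter pattern, but their scaling and padding conventions need to be matched first.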
I think this is because of the combination of half overlap (you used //4) and the …
@f0k By the way, do you know a good way to write a unit test for STFT/ISTFT errors due to windowing when a mask was applied before the ISTFT, e.g. normalizing by the window instead of the squared window? I was always wondering how to catch these...
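One possible direction (a sketch only, with hypothetical helper names): a numpy-only round trip that overlap-adds doubly-windowed frames and normalizes either by the overlap-added squared window or by the plain window. Only the former reconstructs the signal, so checking the interior error catches the mistake:

```python
import numpy as np
from scipy.signal.windows import hann

def frame(x, n_fft, hop):
    n_frames = 1 + (len(x) - n_fft) // hop
    return np.stack([x[i * hop: i * hop + n_fft] for i in range(n_frames)])

def overlap_add(frames, hop, length):
    out = np.zeros(length)
    for i, f in enumerate(frames):
        out[i * hop: i * hop + len(f)] += f
    return out

def reconstruct(x, n_fft, hop, squared_norm=True):
    w = hann(n_fft, sym=False)                        # periodic Hann
    frames = frame(x, n_fft, hop) * w                 # analysis window
    y = overlap_add(frames * w, hop, len(x))          # synthesis window + OLA
    w_norm = w ** 2 if squared_norm else w
    norm = overlap_add(np.tile(w_norm, (len(frames), 1)), hop, len(x))
    nonzero = norm > 1e-8
    y[nonzero] /= norm[nonzero]
    return y

rng = np.random.default_rng(0)
n_fft, hop = 1024, 256
x = rng.standard_normal(8192)
interior = slice(n_fft, -n_fft)                       # skip the un-normalized edges
for squared_norm in (True, False):
    err = np.max(np.abs((x - reconstruct(x, n_fft, hop, squared_norm))[interior]))
    print(f"squared-window norm: {squared_norm}, max interior error: {err:.2e}")
```

An assertion on the two errors (tiny for the squared-window case, large for the plain-window case) then fails loudly if the normalization convention is changed by accident.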
It’s probably due to the boundary. I don't pre-pad; if you ignore the first and last half-windows in the evaluation, it should be much better for the moment.
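For example, a small (hypothetical) evaluation helper that drops half a window at each end before computing the error:

```python
import numpy as np

def boundary_trimmed_rmse(x, x_hat, n_fft):
    # Ignore the first and last half-windows, where the un-padded
    # overlap-add is not fully normalized.
    trim = n_fft // 2
    n = min(len(x), len(x_hat)) - trim
    a, b = x[trim:n], x_hat[trim:n]
    return np.sqrt(np.mean((a - b) ** 2))
```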
It would be interesting to compare with https://github.com/wslihgt/pyfasst/tree/master/pyfasst/tftransforms
Cool. There is also https://github.com/nils-werner/stft, which I used for quite a while. I think we should stick to scipy.signal and librosa since they are heavily tested. However, I'm super interested in spotting the differences and discussing which one has the best reconstruction performance... but maybe that's out of scope for this repo ;-)
Hi, we recently found out that the previous time-profiling code in my repo for comparing Choi's implementation (rfft) and mine (deconvolution with parallel ops) was wrong, and fixed it. The new conclusion is: deconvolution is much faster!
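A common source of such profiling mistakes is timing GPU kernels without warm-up or synchronization. A minimal sketch of a fairer comparison (istft_rfft and istft_deconv are placeholders, not the actual implementations):

```python
import time
import torch

def benchmark(fn, spec, n_warmup=10, n_runs=100):
    # Warm up (cuDNN autotuning, allocator, caches) before timing.
    for _ in range(n_warmup):
        fn(spec)
    if spec.is_cuda:
        torch.cuda.synchronize()          # make sure queued kernels finished
    t0 = time.perf_counter()
    for _ in range(n_runs):
        fn(spec)
    if spec.is_cuda:
        torch.cuda.synchronize()          # wait for the timed kernels, too
    return (time.perf_counter() - t0) / n_runs

# spec = ...  # a spectrogram tensor in whatever layout the implementations expect
# print("rfft:  ", benchmark(istft_rfft, spec))
# print("deconv:", benchmark(istft_deconv, spec))
```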
I think we should close this then and move this initiative over to pytorch/pytorch#3775. @seungwonpark, are you willing to help out over there?
Yeah, we could close it, maybe re-open once we wanna work on a wrapper for it.
@faroit Hey, so after closing it I think we sort of lost track. I don't think it's actually necessary to wait for the PyTorch istft. There are also some implementation details to settle (e.g., deconv like @seungwonpark's vs. rfft as I've done), but maybe we could have our own tentative implementation with a less-tentative API? Seems like the critical issues were resolved (faroit/stft-istft-experiments#1), and we can follow the API of …
Okay, what about opening a PR with an API proposal to address ISTFT in pytorch/pytorch#3775? Then we could …
You mean a PR here? (Yes, let's do it)
On pytorch. Yes, we could just add this ISTFT, but it's rather hacky (using conv layers), which would mean it won't be that easy to make it clean and readable. And once we reached that point, someone would have to start from scratch using libtorch, so it wouldn't be particularly sustainable ;-)
Ok, I'm not really sure if they'd care about an API only. Given the history, they've been adopting what was implemented outside of PyTorch quite actively (in other words, somewhat blindly if it seems good), at least in the case of audio processing. So, how about we do it here, maybe …? It wouldn't block follow-up development while we refine it.
(continued) ...Or, BOTH?! Since we have control and flexibility here -- and the API for …
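For concreteness, one hypothetical shape such an API could take, mirroring torch.stft's keyword arguments so that istft(stft(x, ...), ...) round-trips with the same settings. This is an illustration, not an agreed proposal:

```python
from typing import Optional
import torch

def istft(stft_matrix: torch.Tensor,
          n_fft: int,
          hop_length: Optional[int] = None,
          win_length: Optional[int] = None,
          window: Optional[torch.Tensor] = None,
          center: bool = True,
          normalized: bool = False,
          onesided: bool = True,
          length: Optional[int] = None) -> torch.Tensor:
    """Inverse of torch.stft for the same argument settings.

    `length` trims/pads the output so istft(stft(x)) matches x exactly.
    The backend (rfft-based or transposed-convolution-based) is an
    implementation detail hidden behind this signature.
    """
    raise NotImplementedError  # either backend plugs in here
```

(For what it's worth, the istft that eventually landed in PyTorch uses essentially this set of arguments.)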
Minor clarification question: If the STFT coefficients are allowed to evolve and train via autograd (I assume they are, yes?), then is the idea for this ISTFT module to be either (a) sort of a 'trainable' ISTFT, which may not remain a perfect inverse of the trainable STFT, i.e. it is independent from the STFT routine (@seungwonpark's deconv version looks that way), or (b) an ISTFT that is kept tied to the STFT so it remains its exact inverse? I'm guessing (a). Yes? Mainly because I have no idea how you'd do (b), can think of plenty of uses for (a), and proper training could enforce arbitrary levels of inversion quality if you wanted it.
I also guess it'd be (a) in this case, i.e., the provided ISTFT is a static, non-trained module.
Ohhh, static & non-trained. Interesting. Thanks for the clarification.
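To make the "static, non-trained" point concrete, here is a sketch (hypothetical class name) where the overlap-add kernel is registered as a buffer rather than a Parameter, so an optimizer never updates it, while gradients still flow through it to whatever produced the frames:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FixedOverlapAdd(nn.Module):
    """Overlap-add with a fixed (non-trained) kernel."""
    def __init__(self, frame_len: int, hop: int):
        super().__init__()
        self.hop = hop
        # A buffer, not a Parameter: moves with .to()/.cuda() but is never trained.
        self.register_buffer("kernel", torch.eye(frame_len).unsqueeze(1))

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, frame_len, n_frames)
        return F.conv_transpose1d(frames, self.kernel, stride=self.hop).squeeze(1)

ola = FixedOverlapAdd(8, 4)
print(list(ola.parameters()))    # [] -- nothing to train
frames = torch.randn(1, 8, 5, requires_grad=True)
ola(frames).sum().backward()
print(frames.grad.shape)         # gradients still reach the upstream module
```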
ISTFT is being developed in the main repo: pytorch/audio#130
https://gist.github.com/keunwoochoi/2f349e72cc941f6f10d4adf9b0d3f37e
It works, but I'm probably not going to make a PR to PyTorch (pytorch/pytorch#3775) because it's written in Python (torch.stft is written in C). Just for future reference/usage.
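The gist above is the reference; for readers who just want the shape of the idea, here is a rough sketch of the same rfft-plus-overlap-add approach using the modern torch.fft API. It is not the gist's code, and it omits the centering/padding handling a full implementation needs:

```python
import torch

def istft_rfft(spec: torch.Tensor, n_fft: int, hop: int,
               window: torch.Tensor) -> torch.Tensor:
    # spec: complex tensor of shape (n_fft // 2 + 1, n_frames)
    frames = torch.fft.irfft(spec, n=n_fft, dim=0) * window.unsqueeze(1)  # (n_fft, n_frames)
    n_frames = spec.shape[1]
    length = (n_frames - 1) * hop + n_fft
    out = torch.zeros(length, dtype=frames.dtype)
    norm = torch.zeros(length, dtype=frames.dtype)
    wsq = window ** 2
    for f in range(n_frames):
        out[f * hop: f * hop + n_fft] += frames[:, f]   # overlap-add
        norm[f * hop: f * hop + n_fft] += wsq           # squared-window envelope
    return out / norm.clamp(min=1e-8)

# Round trip against an un-centered frame decomposition
x = torch.randn(8192)
n_fft, hop = 1024, 256
window = torch.hann_window(n_fft, periodic=True)
frames = x.unfold(0, n_fft, hop) * window                 # (n_frames, n_fft)
spec = torch.fft.rfft(frames, dim=-1).transpose(0, 1)     # (n_fft//2 + 1, n_frames)
x_hat = istft_rfft(spec, n_fft, hop, window)
print((x[n_fft:-n_fft] - x_hat[n_fft:-n_fft]).abs().max())  # small in the interior
```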