Fix pytorch istft #1
Ok, I figured it out: it's windowing. My torch ISTFT implementation assumes an ISTFT matrix with a rectangular window (which is not intended), i.e., without specifying a window (see the commented code).
Can't quickly find how to fix it though, gotta check later.
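For readers following along, here is a minimal sketch of what the windowing mismatch means, assuming a PyTorch recent enough to ship torch.istft (which it did not at the time of this thread); the signal, FFT size, and hop are made up:

```python
import torch

x = torch.randn(1, 16000)                 # dummy mono signal (batch, time)
n_fft, hop = 1024, 256
win = torch.hann_window(n_fft)

# Analysis and synthesis with an explicit Hann window: torch.istft
# normalizes by the sum of squared, shifted windows and recovers x.
X = torch.stft(x, n_fft, hop_length=hop, window=win, return_complex=True)
x_hat = torch.istft(X, n_fft, hop_length=hop, window=win, length=x.shape[-1])
print((x - x_hat).abs().max())            # small float32 round-off error

# Omitting `window` is equivalent to a rectangular window; an ISTFT that
# silently assumes this rectangle cannot invert a windowed STFT.
X_rect = torch.stft(x, n_fft, hop_length=hop, return_complex=True)
```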
@faroit @keunwoochoi
@tz579 yes, I know about this paper. I would love to see your implementation here. Maybe we can test-drive it in this repo and then think about how we could help PyTorch implement the ISTFT...
@faroit I did a bunch of modifications/simplifications, in addition to implementing the synthesis window:

test_torch --> test_torch 1.0049663e-07

Note that I have not taken the time to configure tensorflow yet. Also, for the librosa version, I shrank the data type from complex128/float64 to complex64/float32 to be comparable with the other versions, so the results are slightly different from yours in a previous post. There is more to it, so I attached all 4 files: test_{torch, librosa, scipy}.py as well as the main file test.py, since all of them are slightly or heavily modified. The above results can be reproduced with the 4 uploaded files. The only remaining problem is that the error is still a bit higher for the pytorch version (3~4 times larger than the non-torch ones), for both the "cross" and the "within" comparisons.
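The effect of the float32/complex64 narrowing on the round-trip error can be reproduced roughly like this (a sketch; the sizes and error metric are illustrative, not the exact ones from test.py):

```python
import numpy as np
import librosa

y = np.random.randn(16384).astype(np.float32)    # keep everything in float32
n_fft, hop = 1024, 512

S = librosa.stft(y, n_fft=n_fft, hop_length=hop)          # complex64 output
y_hat = librosa.istft(S, hop_length=hop, length=len(y))

# "within"-library round-trip error; with float32/complex64 this lands
# around 1e-7 instead of the ~1e-16 that float64/complex128 would give.
print(S.dtype, np.max(np.abs(y - y_hat)))
```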
@tz579 thanks for the update.
I've updated the conda environment; it should now install tensorflow correctly.
That looks great. Could you add this as a PR to this repo?
@tz579 this is a good point. I actually forgot to test for different hop sizes. I've updated the code and yes, I can confirm that higher overlaps will result in reconstruction errors. While all of the implementations seem to be able to do perfect reconstruction for overlaps higher than *2, the differences between the packages when it comes to how they handle the synthesis windows are quite big. E.g. a hop of /4 of the ...

@bmcfee do you have some insights into the differences between librosa and scipy for the ISTFT, or how to get there by adapting the parameters? See librosa (stft-istft-experiments/test_librosa.py, lines 6 to 11 in 7577397) vs. scipy (stft-istft-experiments/test_scipy.py, lines 6 to 24 in 7577397).

@nils-werner I remember you spent some time understanding how scipy has implemented the ISTFT. Are they doing something "non-standard" over there?
@faroit Thanks, and I will make a PR once I'm ready to write up a brief description of the modifications.
Also, scipy gives weird results with Blackman/Hamming and "normal" results only with Hann/Bartlett, as mentioned in the same post of mine. It seems that scipy has some "non-canonical" implementation not only for the stride/overlap, but also for the window functions?
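One way to probe whether a given window/overlap combination is even invertible is scipy's own constraint checks (check_NOLA needs scipy >= 1.2); the segment and overlap values below are just examples:

```python
from scipy.signal import check_COLA, check_NOLA, get_window

nperseg = 1024
for name in ("hann", "hamming", "blackman", "bartlett"):
    win = get_window(name, nperseg)
    for noverlap in (nperseg // 2, 3 * nperseg // 4):
        print(name, noverlap,
              "COLA:", check_COLA(win, nperseg, noverlap),
              "NOLA:", check_NOLA(win, nperseg, noverlap))
```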
Not off the top of my head, no. Do you have a sense of where the errors pop up? Are they due to boundary effects, or do they pop up all the way through?
A pull request has been made from my fork of your repo. I have made several modifications to remove the unnecessary changes on my side, leaving only the "useful" ones. The only file with significant modifications is test_torch.py, while test_scipy.py is also slightly modified to add the ability to choose the window type.

Edit on Apr. 08, 2019: The pull request has been updated with error fixes, as well as better handling of input/output for an optional batch dimension, numpy/torch.Tensor array types, and cpu/cuda devices for torch.Tensor. Also, I can confirm that after your fix of test_scipy.py, the scipy version can handle a non-1/2 stride now. However, the scipy version still produces a significant reconstruction error with a Hamming window, just as it has behaved throughout our discussion.
@bmcfee it was just a scale issue. I fixed it in 7242420. Now librosa and scipy have perfect reconstruction in any combination. Maybe you would like to update your gist?
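For readers who want to reproduce the "perfect reconstruction in any combination" claim, here is a sketch of such a round trip with one made-up parameter set applied to both libraries (not the exact code from the commit):

```python
import numpy as np
import librosa
from scipy import signal

x = np.random.randn(16384)
n_fft, hop = 1024, 256

# librosa round trip
S = librosa.stft(x, n_fft=n_fft, hop_length=hop, window="hann")
x_lr = librosa.istft(S, hop_length=hop, window="hann", length=len(x))

# scipy round trip with the equivalent segment/overlap settings
_, _, Z = signal.stft(x, window="hann", nperseg=n_fft, noverlap=n_fft - hop)
_, x_sp = signal.istft(Z, window="hann", nperseg=n_fft, noverlap=n_fft - hop)

print(np.max(np.abs(x - x_lr)), np.max(np.abs(x - x_sp[:len(x)])))
```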
@tz579 thanks for your effort. Maybe we could bring in @f0k and @ksanjeevan... @f0k the goal here is not to find a good API or to write a wrapper that works with all STFT flavors, but rather to find STFT/ISTFT parameters that would allow one to interchange the libraries. This is useful, e.g., if you want to run the synthesis operation in the deep learning framework. I think it would be okay to settle on a working set of parameters, like hann with half-overlap/quarter-overlap, which would work for most applications.
Hey, am I too demanding in the torchaudio redesign? :)
In order to find good defaults for pytorch's istft? From what I've read in the thread, you can do good reconstructions with many combinations now? Sorry, I guess I haven't understood what the question is. Just a brief note that might be relevant, but might be known to you anyway:

>>> np.allclose(np.hamming(k), torch.hamming_window(k))
False
>>> np.allclose(np.hamming(k), torch.hamming_window(k, periodic=False))
True
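To make the periodic-vs-symmetric convention explicit across the three stacks (a small sketch; that librosa builds its windows through scipy.signal.get_window is my understanding, not something verified in this repo):

```python
import numpy as np
import torch
from scipy.signal import get_window

k = 1024
# np.hamming is a symmetric window, torch.hamming_window defaults to
# periodic=True, and get_window(..., fftbins=True) is also periodic.
print(np.allclose(np.hamming(k), torch.hamming_window(k, periodic=False).numpy()))
print(np.allclose(get_window("hamming", k, fftbins=True), torch.hamming_window(k).numpy()))
```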
@tz579 any update on hopsize != win/2? Did you try the updated deconv implementation from @seungwonpark?
For the non-1/2 hopsize issue, please look at my previous post to you, where I confirmed that your modification has resolved this problem for scipy. As for the deconv implementation by @seungwonpark, I have tried an older version but not the newest one, since it does not fit the needs of my current project. However, for the version in the pull request I made to your project, I have made several acceleration attempts, mainly to reduce the levels of looping (the rfft function is now called only once during the whole process), and in my project it has been measured to be significantly faster. You could merge my pull request into a private branch and try it.
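A hedged sketch of the "call rfft only once" idea, assuming a PyTorch new enough to ship the torch.fft module; the names and shapes are illustrative, and this is not the code from the pull request:

```python
import torch

def batched_stft(x, n_fft, hop, window):
    # x: (batch, time). Frame every signal in the batch in one shot ...
    frames = x.unfold(1, n_fft, hop)        # (batch, n_frames, n_fft)
    frames = frames * window                # broadcast the analysis window
    # ... and run a single rfft over all frames instead of looping per frame.
    return torch.fft.rfft(frames, dim=-1)   # (batch, n_frames, n_fft // 2 + 1)

x = torch.randn(4, 16000)
spec = batched_stft(x, 1024, 256, torch.hann_window(1024))
print(spec.shape)
```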
Continuing the discussion from here: keunwoochoi/torchaudio-contrib#27 (comment)
@keunwoochoi I already modified the unpad, see here: https://github.com/faroit/stft-istft-experiments/blob/master/test_torch.py#L85 (that didn't help, though).
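For context, the unpad in a center-padded STFT round trip is usually just dropping n_fft // 2 samples from each end after overlap-add; a generic sketch, not the linked test_torch.py code:

```python
import numpy as np

def unpad_center(y_padded, n_fft, length=None):
    # center=True STFTs pad n_fft // 2 samples at both ends, so the
    # reconstruction has to drop them again (and optionally trim to length).
    y = y_padded[n_fft // 2 : -(n_fft // 2)]
    return y[:length] if length is not None else y
```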