Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update MelResNet #751

Merged
merged 3 commits into from
Jun 29, 2020
Merged

Update MelResNet #751

merged 3 commits into from
Jun 29, 2020

Conversation

jimchen90
Copy link
Contributor

@jimchen90 jimchen90 commented Jun 26, 2020

This is to update the variable names and docstring in #705

Stack:

Add MelResNet Block #705, #751
Add Upsampling Block #724
Add WaveRNN Model #735
Add example pipeline with WaveRNN #749

@jimchen90 jimchen90 requested a review from vincentqb June 26, 2020 13:15
@codecov
Copy link

codecov bot commented Jun 26, 2020

Codecov Report

Merging #751 into master will decrease coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #751      +/-   ##
==========================================
- Coverage   89.24%   89.22%   -0.02%     
==========================================
  Files          32       32              
  Lines        2519     2515       -4     
==========================================
- Hits         2248     2244       -4     
  Misses        271      271              
Impacted Files Coverage Δ
torchaudio/models/_wavernn.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 793eeab...434ba88. Read the comment docs.

@jimchen90 jimchen90 changed the title Update variable names and docstring in MelResNet Update MelResNet Jun 26, 2020
Copy link
Contributor

@vincentqb vincentqb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM. I pointed out a few nits. Please mark as "ready for review" when ready.

Florian Stimberg, Aaron van den Oord, Sander Dieleman, Koray Kavukcuoglu. arXiv:1802.08435, 2018.
r"""ResNet block based on "Deep Residual Learning for Image Recognition"

The paper link is https://arxiv.org/pdf/1512.03385.pdf. The input signal is spectrogram.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "is a spectrogram"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.
I also updated this change in the _wavernn class in #735

n_batch = 2
n_time = 200
n_freq = 100
n_output = 256
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the output shape changing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I ran the test, I tried a different input value to double check if the test passes. Do you think I need to change it back to n_output = 128?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for clarifying. Let's change it back to what it was.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Thanks.

torchaudio/models/_wavernn.py Outdated Show resolved Hide resolved
torchaudio/models/_wavernn.py Outdated Show resolved Hide resolved
Examples::
>>> resblock = _ResBlock(num_dims=128)
Examples
>>> resblock = _ResBlock()
>>> input = torch.rand(10, 128, 512)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the example, let's make it clear we are discussing a spectrogram:

input = torch.rand(10, 128, 512)  # a random spectrogram
output = resblock(input)  # shape: (10, 128, 512)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.
I also updated this change in #735 and #724 .

@@ -36,18 +36,20 @@ def test_mfcc(self):
class TestMelResNet(common_utils.TorchaudioTestCase):

def test_waveform(self):
"""test the output dimensions after _MelResNet block.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "Validate the output dimensions of a _MelResNet block."

Copy link
Contributor Author

@jimchen90 jimchen90 Jun 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.
I also updated this change in #735 and #724 with their own parts.

torchaudio/models/_wavernn.py Outdated Show resolved Hide resolved
torchaudio/models/_wavernn.py Outdated Show resolved Hide resolved
@jimchen90 jimchen90 marked this pull request as ready for review June 26, 2020 16:51
Copy link
Contributor

@vincentqb vincentqb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Please revert the n_output value in the test, and kernel_size if changed just for experimenting.

Florian Stimberg, Aaron van den Oord, Sander Dieleman, Koray Kavukcuoglu. arXiv:1802.08435, 2018.
r"""ResNet block based on "Deep Residual Learning for Image Recognition"

The paper link is https://arxiv.org/pdf/1512.03385.pdf. The input signal is a spectrogram.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: after re-reading this, I would simply remove "The input signal is a spectrogram." since you are already describing the arguments below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I removed this sentence. Thanks.

@jimchen90 jimchen90 requested a review from mthrok June 26, 2020 17:03
@jimchen90 jimchen90 mentioned this pull request Jun 26, 2020
@jimchen90 jimchen90 merged commit 878d3da into pytorch:master Jun 29, 2020
mthrok pushed a commit to mthrok/audio that referenced this pull request Feb 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants