[TTS] porting VITS implementation #5600

Merged Jan 25, 2023 (282 commits)
Changes from 250 commits
0a7a3cb
Disable loss typecheck
jasonjjl1999 Nov 24, 2021
e56a539
Fix spectrogram lengths
jasonjjl1999 Dec 23, 2021
df7e996
Remove Precision 16 requirement
jasonjjl1999 Dec 23, 2021
56ae317
Address lgtm alerts
Jan 20, 2022
340d9f0
clean up unused code
jasonjjl1999 Jan 20, 2022
ec57957
Merge branch 'vits' of https://github.com/jasonjjl1999/NeMo into vits
jasonjjl1999 Jan 20, 2022
73dbb98
Address lgtm alerts
Jan 20, 2022
03cd8fa
Fix logging issues
Jan 20, 2022
6aa5fc9
Refactor audio_to_mel_torch method
Jan 20, 2022
817db70
Use NeMo FilterBank to get melspec
Jan 20, 2022
37cb9e5
Fix filterbank max frequency to match with original VITS
jasonjjl1999 Jan 20, 2022
153524b
merge
jasonjjl1999 Jan 20, 2022
b6e24ae
Fix filterbank features correct length
jasonjjl1999 Jan 20, 2022
05627e4
Address lgtm issues
Jan 20, 2022
697331b
Remove print statements
jasonjjl1999 Jan 20, 2022
4ea74b8
Remove stft_pad_amount
jasonjjl1999 Jan 20, 2022
4cf6ffe
new structure for tts datasets in script folder
Oktai15 Jan 21, 2022
b37f56e
remove cmudict downloading
Oktai15 Jan 24, 2022
5675ba2
Merge branch 'main' into upd_tts_ds_processing
Oktai15 Jan 24, 2022
d2d7c6d
rename mixertts dataset, add vocoder dataset
Oktai15 Jan 24, 2022
e7f95f8
Merge branch 'main' into upd_tts_ds_processing
Oktai15 Jan 24, 2022
b9a3677
Merge branch 'main' into upd_tts_ds_processing
Oktai15 Jan 24, 2022
7eb3dab
Merge branch 'upd_tts_ds_processing' of github.com:NVIDIA/NeMo into u…
Oktai15 Jan 24, 2022
8164da4
add libritts processing
Oktai15 Jan 24, 2022
cd5c041
update tts dataset and libritts get data
Oktai15 Jan 25, 2022
a865637
fix bugs in vocoder ds
Oktai15 Jan 25, 2022
6749a3d
add ds
treacker Jan 26, 2022
a1e2bec
changed vits yaml
treacker Jan 26, 2022
8fb55a2
upd
treacker Jan 26, 2022
1f5b367
rm yaml
treacker Jan 26, 2022
a2d7726
merged vits
treacker Jan 26, 2022
34f3429
fix yaml and model
treacker Jan 27, 2022
33d21d3
Added scaler
treacker Jan 28, 2022
845257e
refactored yaml
treacker Jan 28, 2022
d6ff4c7
managed to run in fp16
treacker Jan 28, 2022
d6da3c9
Merge branch 'vits_exp' of github.com:NVIDIA/NeMo into vits_exp
Oktai15 Jan 29, 2022
733e6b4
refactoring
Oktai15 Jan 29, 2022
70f3171
fix small bugs and add new todos
Oktai15 Jan 30, 2022
6be1cee
fix optimizers
Oktai15 Jan 30, 2022
3f8ca4c
Port Variational Inference with Adversarial Learning (VITS) to NeMo T…
jasonjjl1999 Feb 3, 2022
012d88f
make new commit
blisc Feb 3, 2022
cd68360
add copyright headers
blisc Feb 3, 2022
4ab8463
style
blisc Feb 3, 2022
ac0e33f
merge with evgeny's branch
blisc Feb 3, 2022
d7a6ffb
Merge commit '9f95457e3f3781b58878f09d6c251bdaea5c4a55' into vits_exp
Oktai15 Feb 4, 2022
1160fe7
rename README
Oktai15 Feb 4, 2022
e842e04
Merge remote-tracking branch 'blisc_nemo/vits_model_merged' into vits…
Oktai15 Feb 4, 2022
00ca79e
fix style without vits_modules
Oktai15 Feb 4, 2022
7025270
add numba code, fix style and add todos
Oktai15 Feb 4, 2022
3d17d6f
small fix
treacker Feb 7, 2022
eb8bea5
Merge branch 'vits_exp' of https://github.com/NVIDIA/NeMo into vits_exp
treacker Feb 7, 2022
9700bbe
fix some todos
treacker Feb 7, 2022
24df2c5
added numba mas
treacker Feb 14, 2022
35b0cd4
added DDP sampler
treacker Feb 14, 2022
6e436d0
specified versions
treacker Feb 14, 2022
9f6ff8f
fixed for new librosa version
treacker Feb 14, 2022
97cd6cc
added feature loss
treacker Feb 14, 2022
7c01dca
added IPA phonemizer
treacker Feb 21, 2022
9ae3f9d
refactored IPA g2p
treacker Feb 22, 2022
cfd95c0
added vits losses
treacker Feb 24, 2022
df4a866
some ref
treacker Mar 15, 2022
dcd2596
fix
treacker Mar 28, 2022
783a9a9
added checkpointing
treacker Apr 5, 2022
2bc0ac1
cp
treacker Apr 6, 2022
4b491bb
cfg
treacker Apr 7, 2022
0736b27
merged some 1.8.0 fixes
treacker Apr 7, 2022
38a93b4
plt fix
treacker Apr 8, 2022
0f50fce
fix logging
treacker Apr 9, 2022
e312e96
fix checkpoint loading
treacker Apr 10, 2022
fe3901a
refactored inference
treacker Apr 20, 2022
5e38eba
fp32 run
treacker Apr 21, 2022
47741d4
update branch
ericharper May 2, 2022
a775320
update package info
ericharper May 2, 2022
e944e13
new exp
treacker May 2, 2022
bfd8a36
pull main
ericharper May 4, 2022
d6883e3
update branch
ericharper May 4, 2022
0623b01
Restored tests previously disabled for 22.03 base (#4109)
borisfom May 4, 2022
c10fed4
add augmentation to label models (#4113)
nithinraok May 5, 2022
d4f6d75
Call register_bert_model after assigning self.bert_model variable (#4…
ramanathan831 May 5, 2022
f3df343
Tutorial on ITN with Thutmose tagger and small fixes (#4117)
bene-ges May 5, 2022
6c16139
cleaned up TN/ ITN doc (#4119)
yzhang123 May 6, 2022
0fc2044
Check implicit grad acc in GLUE dataset building (#4123)
MaximumEntropy May 6, 2022
6fd6254
update the default (#4135)
ekmb May 9, 2022
54f6bbf
Draft: Fix restoring from checkpoint for case when `model.common_data…
PeganovAnton May 9, 2022
9f5b443
fix typo (#4140)
yzhang123 May 10, 2022
3389242
Fix/punctuation avoid overwritting tmp files (#4144)
PeganovAnton May 10, 2022
c339c04
bug_fix_diarization_manifest_creation (#4125)
yzhang123 May 10, 2022
df33239
fix doc (#4146)
yzhang123 May 11, 2022
30db4d4
Tacotron2 retrain (#4103)
treacker May 11, 2022
1f3788e
Multiprocess improvements (#4127)
nithinraok May 11, 2022
46162c6
WaveGlow input type fixes (#4151)
redoctopus May 11, 2022
b34609f
notebooks' link, typo and import fix (#4158)
fayejf May 12, 2022
0704e14
Thutmose tagger bug fixes (#4162)
bene-ges May 12, 2022
4bbe6fb
update speaker docs (#4164)
nithinraok May 13, 2022
52e5b25
changed to vits g2p
treacker May 15, 2022
deb8267
refactoring
treacker May 15, 2022
2d16d9e
Merge branch 'r1.9.0' of https://github.com/NVIDIA/NeMo into vits_exp
treacker May 15, 2022
861b81f
Merge branch 'main' of https://github.com/NVIDIA/NeMo into vits_exp
treacker May 23, 2022
58b2f4e
added cosineLR
treacker May 25, 2022
57b0c8b
Updated whitelist path
treacker May 25, 2022
f087eb7
added vanilla torch grad scaler
treacker Jun 8, 2022
e476eb0
Fixed lightning version
treacker Jun 8, 2022
af0c679
added warmup and wd
treacker Jun 8, 2022
17b4d4e
switched to cosineLR
treacker Jun 12, 2022
7510e71
refactored data classes for vits
treacker Jun 20, 2022
8d0f725
some fixes
treacker Jun 20, 2022
aadcb32
fixed import
treacker Jun 20, 2022
8bfe370
changeg train loop
treacker Jun 20, 2022
3be013c
fixed scheduler bug
treacker Jun 22, 2022
fd03723
refactoring for exps
treacker Jul 12, 2022
5d43cc3
Refactored loss logic
treacker Jul 26, 2022
97ac086
Ref for exps
treacker Aug 2, 2022
16eeacb
added coqui stuff
treacker Aug 3, 2022
8769c57
exps
treacker Aug 6, 2022
d49aeed
bugfix
treacker Aug 8, 2022
a68e965
added side file
treacker Aug 22, 2022
5d6fd1f
bugfix
treacker Aug 22, 2022
44be3ad
reverted
treacker Aug 23, 2022
1d37c1e
fixed sampler behaviour
treacker Aug 29, 2022
11c9828
updated for ptl 1.7.2
treacker Aug 30, 2022
7b6b95c
refactored dataloader func
treacker Sep 5, 2022
e30f91e
some cleaning
treacker Sep 6, 2022
170c76b
reverted to vanilla loss
treacker Sep 6, 2022
c4e537e
modified for pickling
treacker Sep 13, 2022
7934c5b
added dataset class
treacker Sep 17, 2022
68c93f9
fixed torch version
treacker Sep 18, 2022
600decb
added autocast for fp training
treacker Sep 19, 2022
7017638
removed coqui files
treacker Sep 20, 2022
abc1b28
'Fixed tokenizer'
treacker Oct 14, 2022
ec5a668
Fix tokenizer
treacker Oct 14, 2022
f806673
Merge remote-tracking branch 'origin/main' into vits_exp
treacker Oct 14, 2022
ae529c5
update branch
ericharper Oct 25, 2022
15a53ec
Fix link to inference notebook (#5247)
redoctopus Oct 26, 2022
16800d4
Update ASR scores table (#5254)
titu1994 Oct 27, 2022
2612f48
Fix links to speaker identification notebook (#5260)
SeanNaren Oct 27, 2022
ead8cc4
Minor typo fixes in TTS tutorial (#5266)
redoctopus Oct 28, 2022
7a072ff
Pcla tutorial fixes (#5271)
jubick1337 Oct 28, 2022
498b61d
Fix bug into Dialogue tutorial (#5277)
Zhilin123 Oct 29, 2022
80bf342
Typo fix (#5288)
jubick1337 Oct 31, 2022
26e3e1d
Fix dialogue tutorial bug (#5297)
Zhilin123 Nov 1, 2022
42f6ac9
small bugfix for r1.13.0 (#5310)
fayejf Nov 4, 2022
df95923
Add italian model checkpoints (#5316)
Kipok Nov 4, 2022
fbd17ad
[STT] Add Ru ASR Conformer-CTC and Conformer-Transducer (#5340)
ssh-meister Nov 7, 2022
8184bea
Pcla tutorial fixes (#5313)
jubick1337 Nov 8, 2022
b791efc
a lot of refactoring
treacker Nov 8, 2022
5b4773e
Merge remote-tracking branch 'origin/r1.13.0' into vits_exp
treacker Nov 8, 2022
5866211
strict ptl version
treacker Nov 8, 2022
93a73ec
strict ptl version
treacker Nov 8, 2022
67d2317
reverted plt version
treacker Nov 8, 2022
e2adafb
Added base text2audio class
treacker Nov 8, 2022
f67fe95
Fix issue with HF Model upload tutorial (#5359)
titu1994 Nov 8, 2022
4eb4351
tutorial fixes (#5354)
jubick1337 Nov 8, 2022
9c3f358
Add SDP documentation (#5274)
erastorgueva-nv Nov 9, 2022
6d9a8d2
[Bugfix] Added rm -f / wget- nc command in multispeaker sim notebook …
tango4j Nov 9, 2022
5d97264
Rename Speech Dataset Processor to Speech Data Processor (#5378)
erastorgueva-nv Nov 9, 2022
e57b3ec
fix for num worker 0 causing issues in losses after 1 epoch (#5379)
arendu Nov 10, 2022
f8f31a1
Fixed bug in notebook (#5382)
vadam5 Nov 10, 2022
4c9c858
Force MHA QKV onto fp32 (#5391)
titu1994 Nov 10, 2022
f789334
Added scheduling variety
treacker Nov 14, 2022
a77b1b3
ref
treacker Nov 14, 2022
dbe41af
Fix for prompt table restore error (#5393)
vadam5 Nov 14, 2022
1b5fac4
Fix args (#5410)
MaximumEntropy Nov 15, 2022
acd34da
bugfix
treacker Nov 15, 2022
9d98c52
import tests
treacker Nov 15, 2022
0718b17
Add temporary fix for CUDA issue in Dockerfile (#5421)
yaoyu-33 Nov 15, 2022
68cd1a7
Megatron Export Update (#5343)
Davood-M Nov 15, 2022
cbf6862
disable pc test (#5426)
ekmb Nov 15, 2022
b211849
Fix GPT generation when using sentencepiece tokenizer (#5413)
MaximumEntropy Nov 15, 2022
01cd8b6
Disable sync_batch_comm in validation_step for GPT (#5397)
ericharper Nov 16, 2022
792fc8a
Revert "Add temporary fix for CUDA issue in Dockerfile (#5421)" (#5431)
yaoyu-33 Nov 16, 2022
988dedb
Revert workaround for T5 that sets number of workers to 0 & sync_batc…
MaximumEntropy Nov 16, 2022
8615ab6
Fixed discrepancies
treacker Nov 16, 2022
f682acf
updated Jenkisfile
treacker Nov 16, 2022
719c55f
updated Jenkisfile
treacker Nov 16, 2022
ad67753
Cleaning
treacker Nov 16, 2022
738e37d
fixed the onnx bug in conformer for non-streaming models. (#5242) (#5…
artbataev Nov 17, 2022
c170e03
Set sync_batch_comm in other places (#5448)
MaximumEntropy Nov 17, 2022
542ab14
Radtts 1.13 (#5451)
borisfom Nov 18, 2022
4b48ea8
Radtts 1.13 plus (#5457)
borisfom Nov 21, 2022
8552c95
Add num layers check (#5470)
MaximumEntropy Nov 21, 2022
4a523ad
Change to kwargs (#5475)
MaximumEntropy Nov 22, 2022
959bddf
Support for finetuning and finetuning inference with .ckpt files & ba…
MaximumEntropy Nov 22, 2022
109fa13
export_utils bugfix (#5480)
Davood-M Nov 23, 2022
10966a1
Export fixes for Riva (#5496)
borisfom Nov 23, 2022
0418a1b
minor bug fix (#5521)
Davood-M Nov 29, 2022
53dae72
added set_start_method + function param bugfix (#5539)
Davood-M Dec 5, 2022
d9e0934
remove notebook (#5548)
ericharper Dec 5, 2022
cc49cda
Remove broadcast (#5558)
MaximumEntropy Dec 7, 2022
b0eec2b
cleaning
treacker Dec 7, 2022
cbac4f8
Fix all gather while writing to a file during T5 finetuning (#5561)
MaximumEntropy Dec 7, 2022
1ff05cc
update readme
ericharper Dec 7, 2022
0d007d7
Merge remote-tracking branch 'origin/r1.13.0' into vits_exp
treacker Dec 7, 2022
fd05dd2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 7, 2022
c299066
added copyright
treacker Dec 7, 2022
258dd67
fixed imports
treacker Dec 7, 2022
4d86224
Merge branch 'vits_exp' of https://github.com/NVIDIA/NeMo into vits_exp
treacker Dec 7, 2022
833a522
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 7, 2022
dae9c4f
cleaning
treacker Dec 7, 2022
abb5eff
Merge branch 'vits_exp' of https://github.com/NVIDIA/NeMo into vits_exp
treacker Dec 7, 2022
c7fee0a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 7, 2022
ed4b0b5
fixed filesize check
treacker Dec 7, 2022
851c1bc
Merge branch 'vits_exp' of https://github.com/NVIDIA/NeMo into vits_exp
treacker Dec 7, 2022
0414f42
last cleaning
treacker Dec 11, 2022
b1c63ae
Merge remote-tracking branch 'origin/main' into vits_exp
treacker Dec 11, 2022
d667dfb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 11, 2022
7e98780
updated cmudict path
treacker Dec 11, 2022
2ab9f27
Merge branch 'vits_exp' of https://github.com/NVIDIA/NeMo into vits_exp
treacker Dec 11, 2022
67ba20a
fixed merge bug
treacker Dec 11, 2022
c65b956
warnings fix
treacker Dec 12, 2022
f539675
Merge branch 'main' into vits_exp
treacker Dec 12, 2022
fe63ee5
fix warnings
treacker Dec 12, 2022
411770b
Merge remote-tracking branch 'origin/main' into vits_exp
treacker Dec 13, 2022
591d74a
storing
treacker Dec 13, 2022
400326a
Merge branch 'main' into vits_exp
treacker Dec 13, 2022
9077510
updated version
treacker Dec 13, 2022
342b5d2
update Jenkinsfile versions
treacker Dec 13, 2022
0cfa929
fixed issues
treacker Dec 13, 2022
8926e54
Merge remote-tracking branch 'origin/main' into vits_exp
treacker Dec 14, 2022
d2ac6ed
fixed more issues
treacker Dec 14, 2022
8bfafea
more fixes
treacker Dec 14, 2022
269c444
added experimental tag
treacker Dec 14, 2022
70f1c9c
Clarification updates
treacker Dec 15, 2022
65a1a69
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 15, 2022
4dff876
fix
treacker Dec 15, 2022
164d1f3
Merge branch 'vits_exp' of https://github.com/NVIDIA/NeMo into vits_exp
treacker Dec 15, 2022
37bbd2e
remove old cython code
treacker Dec 15, 2022
1cd4041
remove old cython code
treacker Dec 15, 2022
c2e16ca
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 15, 2022
bf522e8
Merge branch 'vits_exp' of https://github.com/NVIDIA/NeMo into vits_exp
treacker Dec 15, 2022
089fdf6
docstring fix
treacker Dec 15, 2022
65d7886
Enhancements
treacker Dec 22, 2022
828e5d8
Enhancements
treacker Dec 22, 2022
736666d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 22, 2022
7355a7f
imports fix
treacker Dec 22, 2022
d16d79f
fix imports
treacker Dec 22, 2022
fd67e21
Merge remote-tracking branch 'origin/main' into vits_exp
treacker Dec 22, 2022
6d4a3db
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 22, 2022
fec3f99
fix typo
treacker Dec 22, 2022
2f05c63
Merge branch 'vits_exp' of https://github.com/NVIDIA/NeMo into vits_exp
treacker Dec 22, 2022
725f24f
excessive comtutations fix
treacker Dec 22, 2022
4763c7b
typecheck fix
treacker Dec 23, 2022
d6deedb
Small refactoring
treacker Jan 9, 2023
d52e6d5
Small refactoring
treacker Jan 13, 2023
ab868d9
reversed exp_manager params
treacker Jan 13, 2023
c5672d3
Merge remote-tracking branch 'origin/main' into vits_exp
treacker Jan 17, 2023
120a736
Fixed call for new function signature
treacker Jan 18, 2023
4788bfd
merging main
treacker Jan 19, 2023
e6547cb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 19, 2023
ddb6259
Merge remote-tracking branch 'origin/main' into vits_exp
treacker Jan 24, 2023
4b47ca3
Merge remote-tracking branch 'origin/main' into vits_exp
treacker Jan 25, 2023
2 changes: 1 addition & 1 deletion Jenkinsfile
@@ -4472,4 +4472,4 @@ assert_frame_equal(training_curve, gt_curve, rtol=1e-3, atol=1e-3)"'''
cleanWs()
}
}
}
}
215 changes: 215 additions & 0 deletions examples/tts/conf/vits.yaml
@@ -0,0 +1,215 @@
# This config contains the default values for training VITS model on LJSpeech dataset.
# If you want to train model on other dataset, you can change config values according to your dataset.
# Most dataset-specific arguments are in the head of the config file, see below.

# TODO: remove unnecessary arguments, refactoring

name: VITS

train_dataset: ???
validation_datasets: ???
sup_data_path: null
sup_data_types: null

phoneme_dict_path: "scripts/tts_dataset_files/ipa_cmudict-0.7b_nv22.10.txt"
heteronyms_path: "scripts/tts_dataset_files/heteronyms-052722"
whitelist_path: "nemo_text_processing/text_normalization/en/data/whitelist/lj_speech.tsv"

# Default values from librosa.pyin
pitch_fmin: 65.40639132514966
pitch_fmax: 2093.004522404789
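
The two pitch bounds above line up with the equal-tempered frequencies of the notes C2 and C7, which are the values the librosa documentation recommends passing to `pyin`. A minimal sketch in plain Python, assuming A4 = 440 Hz and MIDI note numbers 36 (C2) and 96 (C7); the helper `midi_to_hz` is illustrative and not part of NeMo or librosa:

```python
import math

# Values copied from the config above.
PITCH_FMIN = 65.40639132514966
PITCH_FMAX = 2093.004522404789


def midi_to_hz(midi_note: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency of a MIDI note, with A4 at MIDI note 69."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12)


c2_hz = midi_to_hz(36)  # assumed to correspond to pitch_fmin
c7_hz = midi_to_hz(96)  # assumed to correspond to pitch_fmax

print("C2:", c2_hz, "matches pitch_fmin:", math.isclose(c2_hz, PITCH_FMIN, rel_tol=1e-9))
print("C7:", c7_hz, "matches pitch_fmax:", math.isclose(c7_hz, PITCH_FMAX, rel_tol=1e-9))
```

If a dataset contains voices outside this range, these are the two values to revisit.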

sample_rate: 22050
n_mel_channels: 80
n_window_size: 1024
n_window_stride: 256
n_fft: 1024
lowfreq: 0
highfreq: null
window: hann
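
For orientation, the analysis parameters above translate into the following time and frequency resolution. This is only arithmetic on the config values. Treating `highfreq: null` as "up to the Nyquist frequency" is an assumption about how the preprocessor interprets it, not something the config itself states.

```python
# Values copied from the config above.
SAMPLE_RATE = 22050
N_FFT = 1024
N_WINDOW_SIZE = 1024
N_WINDOW_STRIDE = 256

window_ms = 1000 * N_WINDOW_SIZE / SAMPLE_RATE   # length of one analysis window
hop_ms = 1000 * N_WINDOW_STRIDE / SAMPLE_RATE    # time step between frames
frames_per_second = SAMPLE_RATE / N_WINDOW_STRIDE
freq_bins = N_FFT // 2 + 1                       # one-sided spectrum size
bin_width_hz = SAMPLE_RATE / N_FFT
nyquist_hz = SAMPLE_RATE / 2                     # assumed upper edge when highfreq is null

print(f"window: {window_ms:.2f} ms, hop: {hop_ms:.2f} ms")
print(f"frames per second: {frames_per_second:.2f}")
print(f"linear frequency bins: {freq_bins}, bin width: {bin_width_hz:.2f} Hz")
print(f"Nyquist frequency: {nyquist_hz:.0f} Hz")
```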

model:
  pitch_fmin: ${pitch_fmin}
  pitch_fmax: ${pitch_fmax}

  sample_rate: ${sample_rate}
  n_mel_channels: ${n_mel_channels}
  n_window_size: ${n_window_size}
  n_window_stride: ${n_window_stride}
  n_fft: ${n_fft}
  lowfreq: ${lowfreq}
  highfreq: ${highfreq}
  window: ${window}
  mel_fmin: 0.0
  mel_fmax: null

  n_speakers: 0
  segment_size: 8192
  c_mel: 45
  c_kl: 1.
  use_spectral_norm: false

  text_normalizer:
    _target_: nemo_text_processing.text_normalization.normalize.Normalizer
    lang: en
    input_case: cased
    whitelist: ${whitelist_path}

  text_normalizer_call_kwargs:
    verbose: false
    punct_pre_process: true
    punct_post_process: true

  text_tokenizer:
    _target_: nemo.collections.common.tokenizers.text_to_speech.tts_tokenizers.IPATokenizer
    punct: true
    apostrophe: true
    pad_with_space: false
    g2p:
      _target_: nemo_text_processing.g2p.modules.IPAG2P
      phoneme_dict: ${phoneme_dict_path}
      heteronyms: ${heteronyms_path}
      phoneme_probability: 0.8
      # Relies on the heteronyms list for anything that needs to be disambiguated
      ignore_ambiguous_words: false
      use_chars: true
      use_stresses: true

  train_ds:
    dataset:
      _target_: "nemo.collections.tts.torch.data.TTSDataset"
      manifest_filepath: ${train_dataset}
      sample_rate: ${model.sample_rate}
      sup_data_path: ${sup_data_path}
      sup_data_types: ${sup_data_types}
      n_fft: ${model.n_fft}
      win_length: ${model.n_window_size}
      hop_length: ${model.n_window_stride}
      window: ${model.window}
      n_mels: ${model.n_mel_channels}
      lowfreq: ${model.lowfreq}
      highfreq: ${model.highfreq}
      max_duration: null
      min_duration: 0.1
      ignore_file: null
      trim: False
      pitch_fmin: ${model.pitch_fmin}
      pitch_fmax: ${model.pitch_fmax}

    dataloader_params:
      num_workers: 8
      pin_memory: false

    batch_sampler:
      batch_size: 32
      boundaries: [32,300,400,500,600,700,800,900,1000]
      num_replicas: ${trainer.devices}
      shuffle: true

  validation_ds:
    dataset:
      _target_: "nemo.collections.tts.torch.data.TTSDataset"
      manifest_filepath: ${validation_datasets}
      sample_rate: ${model.sample_rate}
      sup_data_path: ${sup_data_path}
      sup_data_types: ${sup_data_types}
      n_fft: ${model.n_fft}
      win_length: ${model.n_window_size}
      hop_length: ${model.n_window_stride}
      window: ${model.window}
      n_mels: ${model.n_mel_channels}
      lowfreq: ${model.lowfreq}
      highfreq: ${model.highfreq}
      max_duration: null
      min_duration: 0.1
      ignore_file: null
      trim: False
      pitch_fmin: ${model.pitch_fmin}
      pitch_fmax: ${model.pitch_fmax}

    dataloader_params:
      drop_last: false
      shuffle: false
      batch_size: 16
      num_workers: 4
      pin_memory: false

  preprocessor:
    _target_: nemo.collections.asr.parts.preprocessing.features.FilterbankFeatures
    nfilt: ${model.n_mel_channels}
    highfreq: ${model.highfreq}
    log: true
    log_zero_guard_type: clamp
    log_zero_guard_value: 1e-05
    lowfreq: ${model.lowfreq}
    n_fft: ${model.n_fft}
    n_window_size: ${model.n_window_size}
    n_window_stride: ${model.n_window_stride}
    pad_to: 1
    pad_value: 0
    sample_rate: ${model.sample_rate}
    window: ${model.window}
    normalize: null
    preemph: null
    dither: 0.0
    frame_splicing: 1
    stft_conv: false
    nb_augmentation_prob : 0
    mag_power: 1.0
    exact_pad: true
    use_grads: true

  synthesizer:
    _target_: nemo.collections.tts.modules.vits_modules.SynthesizerTrn
    inter_channels: 192
    hidden_channels: 192
    filter_channels: 768
    n_heads: 2
    n_layers: 6
    kernel_size: 3
    p_dropout: 0.1
    resblock: "1"
    resblock_kernel_sizes: [3,7,11]
    resblock_dilation_sizes: [[1,3,5], [1,3,5], [1,3,5]]
    upsample_rates: [8,8,2,2]
    upsample_initial_channel: 512
    upsample_kernel_sizes: [16,16,4,4]
    n_speakers: ${model.n_speakers}
    gin_channels: 256 # for multi-speaker

  optim:
    _target_: torch.optim.AdamW
    lr: 2e-4
    betas: [0.9, 0.99]
    eps: 1e-9

    sched:
      name: ExponentialLR
      lr_decay: 0.999875
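
To get a feel for how slowly the schedule above decays, the sketch below evaluates the closed form of an exponential schedule. It assumes `lr_decay` acts as the multiplicative factor (the role `gamma` plays in `torch.optim.lr_scheduler.ExponentialLR`) and is applied once per scheduler step. Whether a step is an epoch or an optimizer update is not stated in the config, so the step counts below are illustrative only.

```python
import math

# Values copied from the optim/sched settings above.
BASE_LR = 2e-4
LR_DECAY = 0.999875


def lr_after(steps: int) -> float:
    """Learning rate after `steps` scheduler steps of pure exponential decay."""
    return BASE_LR * LR_DECAY ** steps


# Number of scheduler steps until the learning rate has halved.
half_life_steps = math.log(0.5) / math.log(LR_DECAY)

for steps in (0, 1_000, 10_000):
    print(f"after {steps:>6} steps: lr = {lr_after(steps):.3e}")
print(f"half-life: about {half_life_steps:.0f} steps")
```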

trainer:
  num_nodes: 1
  devices: 2
  accelerator: gpu
  strategy: ddp
  precision: 32
  # amp_backend: 'apex'
  # amp_level: 'O2'
  # benchmark: true
  max_epochs: -1
  accumulate_grad_batches: 1
  enable_checkpointing: false # Provided by exp_manager
  logger: false # Provided by exp_manager
  log_every_n_steps: 50
  check_val_every_n_epoch: 1
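
With these trainer defaults, the number of utterances contributing to one optimizer update can be estimated as below. The sketch assumes that `batch_sampler.batch_size` (32) is a per-replica size, which is how DDP-style samplers commonly behave but is not confirmed by the config. `effective_batch_size` is an illustrative helper, not a NeMo function.

```python
def effective_batch_size(per_replica: int, devices: int, num_nodes: int, accumulate: int) -> int:
    """Samples contributing to one optimizer update across all replicas."""
    return per_replica * devices * num_nodes * accumulate


# Values copied from the train_ds.batch_sampler and trainer settings.
default_setup = effective_batch_size(per_replica=32, devices=2, num_nodes=1, accumulate=1)
print("effective batch size with the defaults:", default_setup)

# Hypothetical single-GPU run that keeps the same effective batch size
# by accumulating gradients over two batches.
single_gpu = effective_batch_size(per_replica=32, devices=1, num_nodes=1, accumulate=2)
print("single GPU with accumulate_grad_batches=2:", single_gpu)
```

When changing `devices`, adjusting `accumulate_grad_batches` in this way is one option for keeping the update size comparable; whether that matters for VITS training quality is not something the config settles.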

exp_manager:
  exp_dir: ???
  name: ${name}
  create_tensorboard_logger: true
  create_checkpoint_callback: true
  checkpoint_callback_params:
    monitor: loss_gen_all
    mode: min
  resume_if_exists: false
  resume_ignore_no_checkpoint: false
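
A few values in this config have to agree with each other when the dataset-specific settings are changed. The sketch below checks two of them, assuming (as in the original VITS and HiFi-GAN design) that the decoder expands each latent frame into one hop of waveform samples and that `segment_size` is measured in samples. The variable names are illustrative.

```python
from math import prod

# Values copied from the config above.
N_WINDOW_STRIDE = 256
SEGMENT_SIZE = 8192
UPSAMPLE_RATES = [8, 8, 2, 2]

# Total upsampling factor of the decoder: the product of the per-stage rates.
total_upsampling = prod(UPSAMPLE_RATES)

# Number of spectrogram frames covered by one training segment.
frames_per_segment, leftover_samples = divmod(SEGMENT_SIZE, N_WINDOW_STRIDE)

print("total upsampling factor:", total_upsampling)
print("matches hop length:", total_upsampling == N_WINDOW_STRIDE)
print("frames per segment:", frames_per_segment, "leftover samples:", leftover_samples)
```

Under these assumptions, changing `n_window_stride` for a different sample rate would also call for a matching change to `upsample_rates`, and `segment_size` would be kept a multiple of the hop length.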