From c5bfee9b73bb43125d55769cf8e94ce4ff09a8b7 Mon Sep 17 00:00:00 2001
From: bshall
Date: Tue, 19 May 2020 12:17:10 +0200
Subject: [PATCH] Update README

---
 README.md                | 10 +++++-----
 config/training/cpc.yaml |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index d1bc587..79f2343 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,14 @@
 # Vector-Quantized Contrastive Predictive Coding
 
-To learn discrete representations of speech for the [ZeroSpeech challenges](https://zerospeech.com/), we propose vector-quantized contrastive predictive coding.
-An encoder maps input speech into a discrete sequence of codes.
-Next, an autoregressive model summarises the latent representation (up until time t) into a context vector.
-Using this context, the model learns to discriminate future frames from negative examples sampled randomly from other utterances.
-Finally, an RNN based vocoder is trained to generate audio from the discretized representation.
+Train and evaluate the VQ-VAE model for our submission to the [ZeroSpeech 2020 challenge](https://zerospeech.com/).
+Voice conversion samples can be found [here](https://bshall.github.io/VectorQuantizedCPC/).
+Pretrained weights for the 2019 English and Indonesian datasets can be found [here](https://github.com/bshall/VectorQuantizedCPC/releases/tag/v0.1).
+Leader-board for the ZeroSpeech 2020 challenge can be found [here](https://zerospeech.com/2020/results.html).

 [figure: VQ-CPC model summary]
+Fig 1: VQ-CPC model architecture.

 ## Requirements
diff --git a/config/training/cpc.yaml b/config/training/cpc.yaml
index 9fbcdae..eba7d52 100644
--- a/config/training/cpc.yaml
+++ b/config/training/cpc.yaml
@@ -4,7 +4,7 @@ training:
   n_utterances_per_speaker: 8
   n_prediction_steps: 12
   n_negatives: 17
-  n_epochs: 25000
+  n_epochs: 22000
   scheduler:
     warmup_epochs: 150
     initial_lr: 1e-5
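
Note on the `n_prediction_steps` and `n_negatives` keys shown as context in the hunk above: the README text being removed describes the CPC objective, where a context vector must pick the true future frame out of randomly sampled negatives for each of several prediction steps. Below is a minimal sketch of that contrastive step, assuming a list of per-step linear predictors; the function and argument names are illustrative and are not the repository's actual code.

```python
import torch
import torch.nn.functional as F

def cpc_loss(context, z, predictors, n_prediction_steps=12, n_negatives=17):
    """Sketch of an InfoNCE-style CPC loss.

    context: (B, T, C) summaries from the autoregressive model.
    z: (B, T, D) encoded frames.
    predictors: one linear map per prediction step, C -> D (hypothetical).
    """
    B, T, D = z.shape
    losses = []
    for k in range(1, n_prediction_steps + 1):
        c = context[:, : T - k, :]            # contexts that have a frame k steps ahead
        pred = predictors[k - 1](c)           # predicted future frame, (B, T-k, D)
        pos = z[:, k:, :]                     # true future frames, (B, T-k, D)
        # draw negatives uniformly from the batch (the README describes sampling
        # from other utterances; uniform sampling is a simplification here)
        idx = torch.randint(0, B * T, (B, T - k, n_negatives))
        neg = z.reshape(B * T, D)[idx]        # (B, T-k, n_negatives, D)
        pos_logit = (pred * pos).sum(-1, keepdim=True)    # (B, T-k, 1)
        neg_logit = (pred.unsqueeze(2) * neg).sum(-1)     # (B, T-k, n_negatives)
        logits = torch.cat([pos_logit, neg_logit], dim=-1)
        labels = torch.zeros(B, T - k, dtype=torch.long, device=z.device)
        losses.append(F.cross_entropy(logits.reshape(-1, n_negatives + 1),
                                      labels.reshape(-1)))
    return torch.stack(losses).mean()

# Example wiring (dimensions are illustrative):
# predictors = torch.nn.ModuleList(torch.nn.Linear(256, 64) for _ in range(12))
# loss = cpc_loss(context, z, predictors)
```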
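The `scheduler` keys in the same hunk suggest a learning-rate warmup starting from `initial_lr` over the first `warmup_epochs` epochs. A rough sketch of such a schedule follows; the peak rate `max_lr` is an assumption and is not visible in the hunk above.

```python
def warmup_lr(epoch: int, warmup_epochs: int = 150,
              initial_lr: float = 1e-5, max_lr: float = 4e-4) -> float:
    # max_lr is assumed for illustration; it does not appear in the config snippet.
    if epoch >= warmup_epochs:
        return max_lr
    # linear ramp from initial_lr to max_lr over the first warmup_epochs epochs
    return initial_lr + (max_lr - initial_lr) * epoch / warmup_epochs
```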