DNN-based source separation (PyTorch implementation)
- v0.6.3
- Updated results.
- Added an example for the MDX Challenge 2021.
Module | Reference | Implemented |
---|---|---|
Depthwise-separable convolution | | ✔ |
Gated Linear Units | | ✔ |
FiLM (Feature-wise Linear Modulation) | FiLM: Visual Reasoning with a General Conditioning Layer | ✔ |
PoCM (Point-wise Convolutional Modulation) | LaSAFT: Latent Source Attentive Frequency Transformation for Conditioned Source Separation | ✔ |
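To make the first three modules in the table concrete, here is a minimal pure-Python sketch. It is not the repository's PyTorch code: the function names (`conv1d_params`, `depthwise_separable_params`, `glu`, `film`) are hypothetical and operate on plain lists purely for illustration. It shows the weight count saved by a depthwise-separable 1-D convolution, the gating used by a gated linear unit, and the per-channel affine modulation used by FiLM.

```python
import math


def conv1d_params(in_channels, out_channels, kernel_size):
    """Weight count of a standard 1-D convolution (bias ignored)."""
    return in_channels * out_channels * kernel_size


def depthwise_separable_params(in_channels, out_channels, kernel_size):
    """Weight count of a depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = in_channels * kernel_size    # one kernel per input channel
    pointwise = in_channels * out_channels   # 1x1 conv that mixes channels
    return depthwise + pointwise


def glu(values):
    """Gated linear unit: the first half is gated by sigmoid(second half)."""
    if len(values) % 2 != 0:
        raise ValueError("GLU needs an even number of values")
    half = len(values) // 2
    content, gate = values[:half], values[half:]
    return [x / (1.0 + math.exp(-g)) for x, g in zip(content, gate)]


def film(features, gamma, beta):
    """FiLM: scale and shift each channel, gamma[c] * x + beta[c]."""
    return [
        [g * x + b for x in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


if __name__ == "__main__":
    print(conv1d_params(512, 512, 3))
    print(depthwise_separable_params(512, 512, 3))
    print(glu([2.0, 4.0, 0.0, 0.0]))
    print(film([[1.0, 2.0], [3.0, 4.0]], gamma=[2.0, 0.5], beta=[0.0, 1.0]))
```

In a real model the scale `gamma` and shift `beta` would be predicted by a conditioning network rather than passed in as constants.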
Method | Reference | Implemented |
---|---|---|
Permutation invariant training (PIT) | Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks | ✔ |
One-and-rest PIT | Recursive Speech Separation for Unknown Number of Speakers | ✔ |
Probabilistic PIT | Probabilistic Permutation Invariant Training for Speech Separation | |
Sinkhorn PIT | Towards Listening to 10 People Simultaneously: An Efficient Permutation Invariant Training of Audio Source Separation Using Sinkhorn's Algorithm | ✔ |
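The idea behind basic PIT can be sketched in a few lines: compute the loss for every assignment of estimated sources to target sources and keep the smallest. The sketch below is an illustrative pure-Python version, not the repository's implementation. The names `mse` and `pit_loss` are hypothetical, and mean squared error stands in for whatever criterion is actually used. Because the permutation is chosen once over the whole signal, it corresponds to the utterance-level variant.

```python
from itertools import permutations


def mse(estimate, target):
    """Mean squared error between two equal-length signals."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def pit_loss(estimates, targets, criterion=mse):
    """Return (best_loss, best_perm) over all source permutations.

    best_perm[i] is the index of the target matched to estimates[i].
    """
    n_sources = len(targets)
    best_loss, best_perm = None, None
    for perm in permutations(range(n_sources)):
        # Average the per-source loss under this assignment.
        loss = sum(
            criterion(estimates[i], targets[p]) for i, p in enumerate(perm)
        ) / n_sources
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


if __name__ == "__main__":
    targets = [[1.0, 0.0], [0.0, 1.0]]
    estimates = [[0.0, 1.0], [1.0, 0.0]]  # correct sources, swapped order
    print(pit_loss(estimates, targets))
```

The exhaustive search visits all N! permutations for N sources, which is why the variants listed in the table, such as Sinkhorn PIT, aim to reduce that cost.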
Example: source separation with Conv-TasNet on the LibriSpeech dataset
Other tutorials are available under <REPOSITORY_ROOT>/egs/tutorials/.
cd <REPOSITORY_ROOT>/egs/tutorials/common/
. ./prepare_librispeech.sh --dataset_root <DATASET_DIR> --n_sources <#SPEAKERS>
cd <REPOSITORY_ROOT>/egs/tutorials/conv-tasnet/
. ./train.sh --exp_dir <OUTPUT_DIR>
To resume training from a saved model:
. ./train.sh --exp_dir <OUTPUT_DIR> --continue_from <MODEL_PATH>
cd <REPOSITORY_ROOT>/egs/tutorials/conv-tasnet/
. ./test.sh --exp_dir <OUTPUT_DIR>
cd <REPOSITORY_ROOT>/egs/tutorials/conv-tasnet/
. ./demo.sh