
Commit dd866c1

Merge pull request #349 from claritychallenge/347-bug-icassp-2024-is-not-generating-all-the-data

347 bug icassp 2024 is not generating all the data

groadabike authored Sep 20, 2023
2 parents 8c195bf + 32f86ba
Showing 5 changed files with 64 additions and 61 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -18,10 +18,11 @@
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/claritychallenge/clarity/main.svg)](https://results.pre-commit.ci/latest/github/claritychallenge/clarity/main)
[![Downloads](https://pepy.tech/badge/pyclarity)](https://pepy.tech/project/pyclarity)

-[![PyPI](https://img.shields.io/static/v1?label=ICASSP%202024%20Cadenza%20Challenge%20-%20pypi&message=v0.4.0&color=orange)](https://pypi.org/project/pyclarity/0.4.0/)
+[![PyPI](https://img.shields.io/static/v1?label=ICASSP%202024%20Cadenza%20Challenge%20-%20pypi&message=v0.4.1&color=orange)](https://pypi.org/project/pyclarity/0.4.1/)
[![PyPI](https://img.shields.io/static/v1?label=CAD1%20and%20CPC2%20Challenges%20-%20pypi&message=v0.3.4&color=orange)](https://pypi.org/project/pyclarity/0.3.4/)
[![PyPI](https://img.shields.io/static/v1?label=ICASSP%202023%20Challenge%20-%20pypi&message=v0.2.1&color=orange)](https://pypi.org/project/pyclarity/0.2.1/)
[![PyPI](https://img.shields.io/static/v1?label=CEC2%20Challenge%20-%20pypi&message=v0.1.1&color=orange)](https://pypi.org/project/pyclarity/0.1.1/)

[![ORDA](https://img.shields.io/badge/ORDA--DOI-10.15131%2Fshef.data.23230694.v.1-lightgrey)](https://figshare.shef.ac.uk/articles/software/clarity/23230694/1)
</p>

@@ -88,7 +89,7 @@
pip install -e git+https://github.com/claritychallenge/clarity.git@main

Current challenge

-- [The ICASSP 2024 Cadenza CHallenge](./recipes/cad_icassp_2024)
+- [The ICASSP 2024 Cadenza Challenge](./recipes/cad_icassp_2024)

Previous challenges

80 changes: 34 additions & 46 deletions recipes/cad_icassp_2024/baseline/README.md
@@ -10,9 +10,8 @@
The ICASSP 2024 Cadenza Challenge dataset is based on the MUSDB18-HQ dataset.
To download the data, please visit the [Download data and software](https://cadenzachallenge.org/docs/icassp_2024/take_part/download)
webpage.

-The data is split into four packages: `cadenza_icassp2024_core.v1_0.tgz`,
-`cadenza_icassp2024_augmentation_medleydb.tar.gz`, `cadenza_icassp2024_augmentation_bach10.tar.gz`
-and `cadenza_icassp2024_augmentation_fma_small.tar.gz`.
+The data is split into three packages: `cad_icassp_2024_core.v1.1.tgz`,
+`cad_icassp_2024_train.v1.0.tgz`, and `cad_icassp_2024_medleydb.tgz`.

Unpack the packages under the same root directory using

@@ -22,43 +21,48 @@
tar -xvzf <PACKAGE_NAME>
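If you prefer to script the unpacking, here is a minimal sketch (assuming the three archives listed above sit in the current directory and `cadenza_data_root` is a hypothetical destination):

```python
import tarfile
from pathlib import Path

# Package names as listed above; adjust if your downloads differ.
packages = [
    "cad_icassp_2024_core.v1.1.tgz",
    "cad_icassp_2024_train.v1.0.tgz",
    "cad_icassp_2024_medleydb.tgz",
]

root = Path("cadenza_data_root")  # hypothetical root directory
root.mkdir(exist_ok=True)

for package in packages:
    # Same effect as running `tar -xvzf <PACKAGE_NAME>` inside the root
    with tarfile.open(package, "r:gz") as archive:
        archive.extractall(root)
```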

### 1.1 Necessary data

-* **Core** contains the metadata and audio signal to generate the ICASSP 2024 dataset.
+* **Core** contains the metadata and HRTF signals to generate the ICASSP 2024 dataset.

```text
cadenza_data
├───audio
-|   ├───hrtf (336 kB)
-|   |   |   BTE_fr-VP_E1-n22.5.wav
-|   |   |   BTE_fr-VP_E1-n30.0.wav
-|   |   |   ...
-|   |
-|   └───music
-|       └───train (20.2 GB)
-|           ├───A Classic Education - NightOwl
-|           |   |   bass.wav
-|           |   |   drums.wav
-|           |   |   other.wav
-|           |   |   vocals.wav
-|           |   |   mixture.wav
-|           |
-|           ├───...
+|   └───hrtf (336 kB)
+|       |   BTE_fr-VP_E1-n22.5.wav
+|       |   BTE_fr-VP_E1-n30.0.wav
+|       |   ...
|
└───metadata (328 kB)
    |   gains.json
    |   head_positions.json
    |   listeners.train.json
    |   listeners.valid.json
    |   musdb18.train.json
    |   musdb18.valid.json
    |   scene_listeners.train.json
    |   scenes.train.json
    |   ...
```

+* **train** contains the MUSDB18 train split signals to generate the ICASSP 2024 dataset.
+
+```text
+cadenza_data
+└───audio
+    └───music (22 GB)
+        └─── Train
+            ├─── A Classic Education - NightOwl
+            |   |   Bass.wav
+            |   |   Drums.wav
+            |   |   Other.wav
+            |   |   Vocals.wav
+            |   |   Mixture.wav
+            ├─── ...
+```
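With both packages unpacked, a song's signals can be read straight from this layout. A minimal sketch using `soundfile` (the path and file names follow the tree above and are not part of the recipe code):

```python
from pathlib import Path

import soundfile as sf

song_dir = Path("cadenza_data/audio/music/Train/A Classic Education - NightOwl")

# Each song ships four stems plus the full mixture, stereo at 44.1 kHz
signals = {}
for name in ("Bass", "Drums", "Other", "Vocals", "Mixture"):
    audio, sample_rate = sf.read(song_dir / f"{name}.wav")
    signals[name] = audio  # numpy array of shape (n_samples, 2)
```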

### 1.2 Additional optional data

-If you need additional music data for training your model, please restrict to the use of [MedleyDB](https://medleydb.weebly.com/) [[5](#references)] [[6](#references)],
+If you need additional music data for training your model, please restrict to the use of
+[MedleyDB](https://medleydb.weebly.com/) [[5](#references)] [[6](#references)],
[BACH10](https://labsites.rochester.edu/air/resource.html) [[7](#references)] and [FMA-small](https://github.com/mdeff/fma) [[8](#references)].

**Keeping the augmentation data restricted to these datasets will ensure that the evaluation is fair for all participants**.
@@ -73,33 +77,22 @@
cadenza_data
└───Metadata
```

-* **BACH10** contains the BACH10 dataset [[7](#references)].
+* **BACH10** [[7](#references)].

Tracks from the BACH10 dataset are not included in MUSDB18-HQ and can all be used as training augmentation data.

-```text
-cadenza_data
-└───audio
-    └───Bach10 (150 MB)
-        ├───01-AchGottundHerr
-        ├───...
-```
+The Bach10 dataset can be downloaded from
+[Download data and software](https://cadenzachallenge.org/docs/icassp_2024/take_part/download#b1-download-the-packages)
+on the Challenge website.

* **FMA Small** contains the FMA small subset of the FMA dataset [[8](#references)].

+The FMA small dataset can be downloaded from
+[Download data and software](https://cadenzachallenge.org/docs/icassp_2024/take_part/download#b1-download-the-packages)
+on the Challenge website.

Tracks from the FMA small dataset are not included in MUSDB18-HQ.
This dataset does not provide independent stems but only the full mix.
However, it can be used to train an unsupervised model to better initialise a supervised model.

-```text
-cadenza_data
-└───audio
-    └───fma_small (8 GB)
-        ├───000
-        ├───001
-        ├───...
-```

## 2. Baseline

In the `baseline/` folder, we provide code for running the baseline enhancement system and performing the objective evaluation.
@@ -113,7 +106,7 @@
the VDBO (vocals, drums, bass and others) stems for each song-listener pair.
For each estimated stem, the baseline applies the gains and remixes the signal.
A simple NAL-R [2] fitting amplification is applied to the final remix.
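A rough sketch of the gain-and-remix step (the gain values and function names here are illustrative, not the recipe's actual code):

```python
import numpy as np


def apply_gains_and_remix(
    stems: dict[str, np.ndarray], gains: dict[str, float]
) -> np.ndarray:
    """Scale each estimated stem by its gain (in dB) and sum into a remix."""
    remix = np.zeros_like(next(iter(stems.values())))
    for name, signal in stems.items():
        remix += signal * 10 ** (gains[name] / 20)  # dB to linear amplitude
    return remix


# Illustrative gains; the real values come from metadata/gains.json
gains = {"vocals": 0.0, "drums": -3.0, "bass": 3.0, "other": 0.0}
# remix = apply_gains_and_remix(estimated_stems, gains)
# enhanced = nalr_amplify(remix, listener)  # placeholder for the NAL-R fitting step
```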

-The basile offers 2 source separation options:
+The baseline offers 2 source separation options:

1. [Hybrid Demucs](https://github.com/facebookresearch/demucs) [[1](#references)] distributed on [TorchAudio](https://pytorch.org/audio/main/tutorials/hybrid_demucs_tutorial.html)
2. [Open-Unmix](https://github.com/sigsep/open-unmix-pytorch) [[2](#references)] distributed through PyTorch Hub.
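Both models can be loaded without any recipe code; a minimal sketch (assumes recent `torch`/`torchaudio` releases and network access for the pretrained weights):

```python
import torch
import torchaudio

# 1. Hybrid Demucs, as packaged by TorchAudio
bundle = torchaudio.pipelines.HDEMUCS_HIGH_MUSDB_PLUS
demucs = bundle.get_model()

# 2. Open-Unmix, via PyTorch Hub ("umxhq" is the MUSDB18-HQ-trained variant)
open_unmix = torch.hub.load("sigsep/open-unmix-pytorch", "umxhq")
```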
@@ -161,11 +154,6 @@
Please note: you will not get identical HAAQI scores for the same signals if the random seed is not defined
(in the given recipe, the random seed for each signal is set as the last eight digits of the song md5).
Random noise is generated within HAAQI, but the differences should be sufficiently small.
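A sketch of how such a per-song seed can be derived (the exact recipe code may differ in detail; this only illustrates the idea):

```python
import hashlib


def song_seed(song_name: str) -> int:
    """Reproducible seed: the last eight decimal digits of the song's md5."""
    md5_as_int = int(hashlib.md5(song_name.encode("utf-8")).hexdigest(), 16)
    return md5_as_int % 10**8
```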

-The average validation score for the baseline is:
-
-* Demucs = 0.6496 HAAQI
-* Open-Unmix = 0.5822 HAAQI

## References

* [1] Défossez, A. "Hybrid Spectrogram and Waveform Source Separation". Proceedings of the ISMIR 2021 Workshop on Music Source Separation. [doi:10.48550/arXiv.2111.03600](https://arxiv.org/abs/2111.03600)
8 changes: 4 additions & 4 deletions recipes/cad_icassp_2024/baseline/config.yaml
@@ -4,10 +4,10 @@ path:
  music_dir: ${path.root}/audio/at_mic_music
  gains_file: ${path.metadata_dir}/gains.json
  head_positions_file: ${path.metadata_dir}/head_positions.json
-  listeners_file: ${path.metadata_dir}/listeners.valid.json
-  music_file: ${path.metadata_dir}/at_mic_music.valid.json
-  scenes_file: ${path.metadata_dir}/scenes.valid.json
-  scene_listeners_file: ${path.metadata_dir}/scene_listeners.valid.json
+  listeners_file: ${path.metadata_dir}/listeners.train.json
+  music_file: ${path.metadata_dir}/at_mic_music.train.json
+  scenes_file: ${path.metadata_dir}/scenes.train.json
+  scene_listeners_file: ${path.metadata_dir}/scene_listeners.train.json
  exp_folder: ./exp # folder to store enhanced signals and final results

sample_rate: 44100
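The `${path...}` values are Hydra/OmegaConf interpolations, resolved when the config is accessed, so switching back to the `valid` split only means editing these four file entries. A small illustration of the mechanism (placeholder values):

```python
from omegaconf import OmegaConf

cfg = OmegaConf.create(
    {
        "path": {
            "root": "/data/cadenza_data",  # placeholder root
            "metadata_dir": "${path.root}/metadata",
            "listeners_file": "${path.metadata_dir}/listeners.train.json",
        }
    }
)

print(cfg.path.listeners_file)  # /data/cadenza_data/metadata/listeners.train.json
```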
@@ -118,7 +118,12 @@ def find_precreated_samples(source_dir: str | Path) -> list[str]:
    if not source_dir.exists():
        return []

-    return [f.name for f in source_dir.glob("*/*")]
+    previous_tracks = []
+    for song_path in source_dir.glob("train/*"):
+        if len(list(song_path.glob("*"))) >= 5:
+            previous_tracks.append(song_path.name)
+
+    return previous_tracks
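# A minimal, self-contained usage sketch (illustrative, not part of the commit):
# a song counts as pre-created only when all five expected signals
# (bass, drums, other, vocals and mixture) are present under train/.
#
#     import tempfile
#     from pathlib import Path
#
#     with tempfile.TemporaryDirectory() as tmp:
#         complete = Path(tmp) / "train" / "SongA"
#         partial = Path(tmp) / "train" / "SongB"
#         complete.mkdir(parents=True)
#         partial.mkdir(parents=True)
#         for name in ("bass", "drums", "other", "vocals", "mixture"):
#             (complete / f"{name}.wav").touch()
#         (partial / "bass.wav").touch()
#         find_precreated_samples(Path(tmp))  # ["SongA"]; "SongB" is regenerated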


@hydra.main(config_path="", config_name="config")
@@ -145,14 +150,14 @@ def run(cfg: DictConfig) -> None:
    music_metadata = {m["Track Name"]: m for m in music_metadata}

    # Load the head positions metadata
-    with open(cfg.path.head_positions_file, encoding="utf-8") as f:
+    with open(cfg.path.head_loudspeaker_positions_file, encoding="utf-8") as f:
        head_positions_metadata = json.load(f)

    # From the scenes, get the samples names and parameters
    toprocess_samples = {
-        f"{v['music']}-{v['head_position']}": {
+        f"{v['music']}-{v['head_loudspeaker_positions']}": {
            "music": v["music"],
-            "head_position": v["head_position"],
+            "head_loudspeaker_positions": v["head_loudspeaker_positions"],
        }
        for _, v in scenes_metadata.items()
    }
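    # Keying the comprehension by music + head position de-duplicates scenes:
    # scenes that share a song and a head position map to one sample, so each
    # at-mic signal is generated once. A toy illustration (made-up values):
    #
    #     scenes = {
    #         "s1": {"music": "SongA", "head_loudspeaker_positions": "hlp_01"},
    #         "s2": {"music": "SongA", "head_loudspeaker_positions": "hlp_01"},
    #         "s3": {"music": "SongA", "head_loudspeaker_positions": "hlp_02"},
    #     }
    #     samples = {
    #         f"{v['music']}-{v['head_loudspeaker_positions']}": v
    #         for v in scenes.values()
    #     }
    #     len(samples)  # 2, not 3: s1 and s2 collapse into one sample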
@@ -162,7 +167,7 @@
    for idx, sample in enumerate(toprocess_samples.items(), 1):
        sample_name, sample_detail = sample
        music = music_metadata[sample_detail["music"]]
-        head_position = sample_detail["head_position"]
+        head_position = sample_detail["head_loudspeaker_positions"]

        out_music[sample_name] = {
            "Track Name": sample_name,
@@ -20,11 +20,20 @@
@pytest.fixture(name="temp_dir_with_samples")
def fixture_temp_dir_with_samples(tmp_path):
    source_dir = Path(tmp_path)
-    sample_dirs = ["sample_dir1", "sample_dir2"]
-    for sample_dir in sample_dirs:
+    sample_dirs = {
+        "sample_dir1": [
+            "sample1.wav",
+            "sample2.wav",
+            "sample3.wav",
+            "sample4.wav",
+            "sample5.wav",
+        ],
+        "sample_dir2": ["sample1.wav"],
+    }
+
+    for sample_dir, sample_files in sample_dirs.items():
        sample_dir_path = Path(source_dir) / "train" / sample_dir
        sample_dir_path.mkdir(exist_ok=True, parents=True)
-        sample_files = ["sample1.wav", "sample2.wav"]
        for sample_file in sample_files:
            with open(sample_dir_path / sample_file, "w", encoding="utf-8") as f:
                f.write("Sample content")
@@ -147,7 +156,7 @@ def test_find_precreated_samples(temp_dir_with_samples):

    # Check if the expected sample files are in the result
    assert "sample_dir1" in precreated_samples
-    assert "sample_dir1" in precreated_samples
+    assert "sample_dir2" not in precreated_samples


def test_find_precreated_samples_empty_directory(tmp_path):
