
Commit dd866c1

Merge pull request #349 from claritychallenge/347-bug-icassp-2024-is-not-generating-all-the-data

347 bug icassp 2024 is not generating all the data

groadabike authored Sep 20, 2023
2 parents 8c195bf + 32f86ba
Showing 5 changed files with 64 additions and 61 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -18,10 +18,11 @@
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/claritychallenge/clarity/main.svg)](https://results.pre-commit.ci/latest/github/claritychallenge/clarity/main)
[![Downloads](https://pepy.tech/badge/pyclarity)](https://pepy.tech/project/pyclarity)

-[![PyPI](https://img.shields.io/static/v1?label=ICASSP%202024%20Cadenza%20Challenge%20-%20pypi&message=v0.4.0&color=orange)](https://pypi.org/project/pyclarity/0.4.0/)
+[![PyPI](https://img.shields.io/static/v1?label=ICASSP%202024%20Cadenza%20Challenge%20-%20pypi&message=v0.4.1&color=orange)](https://pypi.org/project/pyclarity/0.4.1/)
[![PyPI](https://img.shields.io/static/v1?label=CAD1%20and%20CPC2%20Challenges%20-%20pypi&message=v0.3.4&color=orange)](https://pypi.org/project/pyclarity/0.3.4/)
[![PyPI](https://img.shields.io/static/v1?label=ICASSP%202023%20Challenge%20-%20pypi&message=v0.2.1&color=orange)](https://pypi.org/project/pyclarity/0.2.1/)
[![PyPI](https://img.shields.io/static/v1?label=CEC2%20Challenge%20-%20pypi&message=v0.1.1&color=orange)](https://pypi.org/project/pyclarity/0.1.1/)

[![ORDA](https://img.shields.io/badge/ORDA--DOI-10.15131%2Fshef.data.23230694.v.1-lightgrey)](https://figshare.shef.ac.uk/articles/software/clarity/23230694/1)
</p>

@@ -88,7 +89,7 @@
pip install -e git+https://github.com/claritychallenge/clarity.git@main

Current challenge

-- [The ICASSP 2024 Cadenza CHallenge](./recipes/cad_icassp_2024)
+- [The ICASSP 2024 Cadenza Challenge](./recipes/cad_icassp_2024)

Previous challenges

80 changes: 34 additions & 46 deletions recipes/cad_icassp_2024/baseline/README.md
@@ -10,9 +10,8 @@
The ICASSP 2024 Cadenza Challenge dataset is based on the MUSDB18-HQ dataset.
To download the data, please visit the [Download data and software](https://cadenzachallenge.org/docs/icassp_2024/take_part/download)
webpage.

-The data is split into four packages: `cadenza_icassp2024_core.v1_0.tgz`,
-`cadenza_icassp2024_augmentation_medleydb.tar.gz`, `cadenza_icassp2024_augmentation_bach10.tar.gz`
-and `cadenza_icassp2024_augmentation_fma_small.tar.gz`.
+The data is split into three packages: `cad_icassp_2024_core.v1.1.tgz`,
+`cad_icassp_2024_train.v1.0.tgz`, and `cad_icassp_2024_medleydb.tgz`.

Unpack the packages under the same root directory using

@@ -22,43 +21,48 @@
tar -xvzf <PACKAGE_NAME>
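If you prefer to script the unpacking, here is a minimal sketch (assuming the three archives listed above sit in the current directory and `cadenza_data_root` is a hypothetical destination):

```python
import tarfile
from pathlib import Path

# Package names as listed above; adjust if your downloads differ.
packages = [
    "cad_icassp_2024_core.v1.1.tgz",
    "cad_icassp_2024_train.v1.0.tgz",
    "cad_icassp_2024_medleydb.tgz",
]

root = Path("cadenza_data_root")  # hypothetical root directory
root.mkdir(exist_ok=True)

for package in packages:
    # Same effect as running `tar -xvzf <PACKAGE_NAME>` inside the root
    with tarfile.open(package, "r:gz") as archive:
        archive.extractall(root)
```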

### 1.1 Necessary data

-* **Core** contains the metadata and audio signal to generate the ICASSP 2024 dataset.
+* **Core** contains the metadata and HRTF signals to generate the ICASSP 2024 dataset.

```text
cadenza_data
├───audio
-|   ├───hrtf (336 kB)
-|   |   |   BTE_fr-VP_E1-n22.5.wav
-|   |   |   BTE_fr-VP_E1-n30.0.wav
-|   |   |   ...
-|   |
-|   └───music
-|       └───train (20.2 GB)
-|           ├───A Classic Education - NightOwl
-|           |   |   bass.wav
-|           |   |   drums.wav
-|           |   |   other.wav
-|           |   |   vocals.wav
-|           |   |   mixture.wav
-|           |
-|           ├───...
+|   └───hrtf (336 kB)
+|       |   BTE_fr-VP_E1-n22.5.wav
+|       |   BTE_fr-VP_E1-n30.0.wav
+|       |   ...
|
└───metadata (328 kB)
    |   gains.json
    |   head_positions.json
    |   listeners.train.json
    |   listeners.valid.json
    |   musdb18.train.json
    |   musdb18.valid.json
    |   scene_listeners.train.json
    |   scenes.train.json
    |   ...
```

+* **train** contains the MUSDB18 train split signals to generate the ICASSP 2024 dataset.
+
+```text
+cadenza_data
+└───audio
+    └───music (22 GB)
+        └─── Train
+            ├─── A Classic Education - NightOwl
+            |   |   Bass.wav
+            |   |   Drums.wav
+            |   |   Other.wav
+            |   |   Vocals.wav
+            |   |   Mixture.wav
+            ├─── ...
+```
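With both packages unpacked, a song's signals can be read straight from this layout. A minimal sketch using `soundfile` (the path and file names follow the tree above and are not part of the recipe code):

```python
from pathlib import Path

import soundfile as sf

song_dir = Path("cadenza_data/audio/music/Train/A Classic Education - NightOwl")

# Each song ships four stems plus the full mixture, stereo at 44.1 kHz
signals = {}
for name in ("Bass", "Drums", "Other", "Vocals", "Mixture"):
    audio, sample_rate = sf.read(song_dir / f"{name}.wav")
    signals[name] = audio  # numpy array of shape (n_samples, 2)
```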

### 1.2 Additional optional data

-If you need additional music data for training your model, please restrict to the use of [MedleyDB](https://medleydb.weebly.com/) [[5](#references)] [[6](#references)],
+If you need additional music data for training your model, please restrict to the use of
+[MedleyDB](https://medleydb.weebly.com/) [[5](#references)] [[6](#references)],
[BACH10](https://labsites.rochester.edu/air/resource.html) [[7](#references)] and [FMA-small](https://github.com/mdeff/fma) [[8](#references)].

**Keeping the augmentation data restricted to these datasets will ensure that the evaluation is fair for all participants**.
@@ -73,33 +77,22 @@
cadenza_data
└───Metadata
```

-* **BACH10** contains the BACH10 dataset [[7](#references)].
+* **BACH10** [[7](#references)].

Tracks from the BACH10 dataset are not included in MUSDB18-HQ and can all be used as training augmentation data.

-```text
-cadenza_data
-└───audio
-    └───Bach10 (150 MB)
-        ├───01-AchGottundHerr
-        ├───...
-```
+The Bach10 dataset can be downloaded from
+[Download data and software](https://cadenzachallenge.org/docs/icassp_2024/take_part/download#b1-download-the-packages)
+on the Challenge website.

* **FMA Small** contains the FMA small subset of the FMA dataset [[8](#references)].

+The FMA small dataset can be downloaded from
+[Download data and software](https://cadenzachallenge.org/docs/icassp_2024/take_part/download#b1-download-the-packages)
+on the Challenge website.

Tracks from the FMA small dataset are not included in MUSDB18-HQ.
This dataset does not provide independent stems but only the full mix.
However, it can be used to train an unsupervised model to better initialise a supervised model.

-```text
-cadenza_data
-└───audio
-    └───fma_small (8 GB)
-        ├───000
-        ├───001
-        ├───...
-```

## 2. Baseline

In the `baseline/` folder, we provide code for running the baseline enhancement system and performing the objective evaluation.
@@ -113,7 +106,7 @@
the VDBO (vocals, drums, bass and others) stems for each song-listener pair.
For each estimated stem, the baseline applies the gains and remixes the signal.
A simple NAL-R [2] fitting amplification is applied to the final remix.
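A rough sketch of the gain-and-remix step (the gain values and function names here are illustrative, not the recipe's actual code):

```python
import numpy as np


def apply_gains_and_remix(
    stems: dict[str, np.ndarray], gains: dict[str, float]
) -> np.ndarray:
    """Scale each estimated stem by its gain (in dB) and sum into a remix."""
    remix = np.zeros_like(next(iter(stems.values())))
    for name, signal in stems.items():
        remix += signal * 10 ** (gains[name] / 20)  # dB to linear amplitude
    return remix


# Illustrative gains; the real values come from metadata/gains.json
gains = {"vocals": 0.0, "drums": -3.0, "bass": 3.0, "other": 0.0}
# remix = apply_gains_and_remix(estimated_stems, gains)
# enhanced = nalr_amplify(remix, listener)  # placeholder for the NAL-R fitting step
```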

-The basile offers 2 source separation options:
+The baseline offers 2 source separation options:

1. [Hybrid Demucs](https://github.com/facebookresearch/demucs) [[1](#references)] distributed on [TorchAudio](https://pytorch.org/audio/main/tutorials/hybrid_demucs_tutorial.html)
2. [Open-Unmix](https://github.com/sigsep/open-unmix-pytorch) [[2](#references)] distributed through PyTorch Hub.
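Both models can be loaded without any recipe code; a minimal sketch (assumes recent `torch`/`torchaudio` releases and network access for the pretrained weights):

```python
import torch
import torchaudio

# 1. Hybrid Demucs, as packaged by TorchAudio
bundle = torchaudio.pipelines.HDEMUCS_HIGH_MUSDB_PLUS
demucs = bundle.get_model()

# 2. Open-Unmix, via PyTorch Hub ("umxhq" is the MUSDB18-HQ-trained variant)
open_unmix = torch.hub.load("sigsep/open-unmix-pytorch", "umxhq")
```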
@@ -161,11 +154,6 @@
Please note: you will not get identical HAAQI scores for the same signals if the random seed is not defined
(in the given recipe, the random seed for each signal is set as the last eight digits of the song md5).
Random noise is generated within HAAQI, but the differences should be sufficiently small.
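A sketch of how such a per-song seed can be derived (the exact recipe code may differ in detail; this only illustrates the idea):

```python
import hashlib


def song_seed(song_name: str) -> int:
    """Reproducible seed: the last eight decimal digits of the song's md5."""
    md5_as_int = int(hashlib.md5(song_name.encode("utf-8")).hexdigest(), 16)
    return md5_as_int % 10**8
```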

-The average validation score for the baseline is:
-
-* Demucs = 0.6496 HAAQI
-* Open-Unmix = 0.5822 HAAQI

## References

* [1] Défossez, A. "Hybrid Spectrogram and Waveform Source Separation". Proceedings of the ISMIR 2021 Workshop on Music Source Separation. [doi:10.48550/arXiv.2111.03600](https://arxiv.org/abs/2111.03600)
8 changes: 4 additions & 4 deletions recipes/cad_icassp_2024/baseline/config.yaml
@@ -4,10 +4,10 @@ path:
  music_dir: ${path.root}/audio/at_mic_music
  gains_file: ${path.metadata_dir}/gains.json
  head_positions_file: ${path.metadata_dir}/head_positions.json
-  listeners_file: ${path.metadata_dir}/listeners.valid.json
-  music_file: ${path.metadata_dir}/at_mic_music.valid.json
-  scenes_file: ${path.metadata_dir}/scenes.valid.json
-  scene_listeners_file: ${path.metadata_dir}/scene_listeners.valid.json
+  listeners_file: ${path.metadata_dir}/listeners.train.json
+  music_file: ${path.metadata_dir}/at_mic_music.train.json
+  scenes_file: ${path.metadata_dir}/scenes.train.json
+  scene_listeners_file: ${path.metadata_dir}/scene_listeners.train.json
  exp_folder: ./exp # folder to store enhanced signals and final results

sample_rate: 44100
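The `${path...}` values are Hydra/OmegaConf interpolations, resolved when the config is accessed, so switching back to the `valid` split only means editing these four file entries. A small illustration of the mechanism (placeholder values):

```python
from omegaconf import OmegaConf

cfg = OmegaConf.create(
    {
        "path": {
            "root": "/data/cadenza_data",  # placeholder root
            "metadata_dir": "${path.root}/metadata",
            "listeners_file": "${path.metadata_dir}/listeners.train.json",
        }
    }
)

print(cfg.path.listeners_file)  # /data/cadenza_data/metadata/listeners.train.json
```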
@@ -118,7 +118,12 @@ def find_precreated_samples(source_dir: str | Path) -> list[str]:
    if not source_dir.exists():
        return []

-    return [f.name for f in source_dir.glob("*/*")]
+    previous_tracks = []
+    for song_path in source_dir.glob("train/*"):
+        if len(list(song_path.glob("*"))) >= 5:
+            previous_tracks.append(song_path.name)
+
+    return previous_tracks
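# A minimal, self-contained usage sketch (illustrative, not part of the commit):
# a song counts as pre-created only when all five expected signals
# (bass, drums, other, vocals and mixture) are present under train/.
#
#     import tempfile
#     from pathlib import Path
#
#     with tempfile.TemporaryDirectory() as tmp:
#         complete = Path(tmp) / "train" / "SongA"
#         partial = Path(tmp) / "train" / "SongB"
#         complete.mkdir(parents=True)
#         partial.mkdir(parents=True)
#         for name in ("bass", "drums", "other", "vocals", "mixture"):
#             (complete / f"{name}.wav").touch()
#         (partial / "bass.wav").touch()
#         find_precreated_samples(Path(tmp))  # ["SongA"]; "SongB" is regenerated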


@hydra.main(config_path="", config_name="config")
@@ -145,14 +150,14 @@ def run(cfg: DictConfig) -> None:
    music_metadata = {m["Track Name"]: m for m in music_metadata}

    # Load the head positions metadata
-    with open(cfg.path.head_positions_file, encoding="utf-8") as f:
+    with open(cfg.path.head_loudspeaker_positions_file, encoding="utf-8") as f:
        head_positions_metadata = json.load(f)

    # From the scenes, get the samples names and parameters
    toprocess_samples = {
-        f"{v['music']}-{v['head_position']}": {
+        f"{v['music']}-{v['head_loudspeaker_positions']}": {
            "music": v["music"],
-            "head_position": v["head_position"],
+            "head_loudspeaker_positions": v["head_loudspeaker_positions"],
        }
        for _, v in scenes_metadata.items()
    }
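    # Keying the comprehension by music + head position de-duplicates scenes:
    # scenes that share a song and a head position map to one sample, so each
    # at-mic signal is generated once. A toy illustration (made-up values):
    #
    #     scenes = {
    #         "s1": {"music": "SongA", "head_loudspeaker_positions": "hlp_01"},
    #         "s2": {"music": "SongA", "head_loudspeaker_positions": "hlp_01"},
    #         "s3": {"music": "SongA", "head_loudspeaker_positions": "hlp_02"},
    #     }
    #     samples = {
    #         f"{v['music']}-{v['head_loudspeaker_positions']}": v
    #         for v in scenes.values()
    #     }
    #     len(samples)  # 2, not 3: s1 and s2 collapse into one sample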
@@ -162,7 +167,7 @@
    for idx, sample in enumerate(toprocess_samples.items(), 1):
        sample_name, sample_detail = sample
        music = music_metadata[sample_detail["music"]]
-        head_position = sample_detail["head_position"]
+        head_position = sample_detail["head_loudspeaker_positions"]

        out_music[sample_name] = {
            "Track Name": sample_name,
@@ -20,11 +20,20 @@
@pytest.fixture(name="temp_dir_with_samples")
def fixture_temp_dir_with_samples(tmp_path):
    source_dir = Path(tmp_path)
-    sample_dirs = ["sample_dir1", "sample_dir2"]
-    for sample_dir in sample_dirs:
+    sample_dirs = {
+        "sample_dir1": [
+            "sample1.wav",
+            "sample2.wav",
+            "sample3.wav",
+            "sample4.wav",
+            "sample5.wav",
+        ],
+        "sample_dir2": ["sample1.wav"],
+    }
+
+    for sample_dir, sample_files in sample_dirs.items():
        sample_dir_path = Path(source_dir) / "train" / sample_dir
        sample_dir_path.mkdir(exist_ok=True, parents=True)
-        sample_files = ["sample1.wav", "sample2.wav"]
        for sample_file in sample_files:
            with open(sample_dir_path / sample_file, "w", encoding="utf-8") as f:
                f.write("Sample content")
@@ -147,7 +156,7 @@ def test_find_precreated_samples(temp_dir_with_samples):

    # Check if the expected sample files are in the result
    assert "sample_dir1" in precreated_samples
-    assert "sample_dir1" in precreated_samples
+    assert "sample_dir2" not in precreated_samples


def test_find_precreated_samples_empty_directory(tmp_path):
