Code repository for: https://arxiv.org/abs/2503.02585
Repository based on https://github.com/WUT-AI/hypersound
Implicit neural representations (INRs) have gained prominence for efficiently encoding multimedia data, yet their applications to audio signals remain limited. This study introduces the Kolmogorov-Arnold Network (KAN), a novel architecture using learnable activation functions, as an effective INR model for audio representation. KAN demonstrates superior perceptual performance over previous INRs, achieving the lowest Log-Spectral Distance of 1.29 and the highest Perceptual Evaluation of Speech Quality of 3.57 for 1.5 s audio. To extend KAN's utility, we propose FewSound, a hypernetwork-based architecture that enhances INR parameter updates. FewSound outperforms the state-of-the-art HyperSound with a 33.3% improvement in MSE and a 60.87% improvement in SI-SNR. These results show that KAN is a robust and adaptable audio representation with the potential for scalability and integration into various hypernetwork frameworks.
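For intuition, below is a minimal, self-contained sketch of a KAN-style INR fitting a single waveform. It illustrates the general idea (learnable univariate activations on the edges of the network, here parameterized with Gaussian RBF bases) and is not the repository's implementation: the class names, basis choice, and hyperparameters are all assumptions.

# Minimal KAN-style INR sketch (illustrative only, not the paper's code).
import torch
import torch.nn as nn

class KANLayer(nn.Module):
    """Edge-wise learnable activations built from Gaussian RBF bases."""
    def __init__(self, in_dim, out_dim, n_bases=8):
        super().__init__()
        self.centers = nn.Parameter(torch.linspace(-1, 1, n_bases))  # shared grid
        self.log_width = nn.Parameter(torch.zeros(1))
        # One coefficient per (input, output, basis) triple: phi_ij(x_j).
        self.coef = nn.Parameter(torch.randn(in_dim, out_dim, n_bases) * 0.1)

    def forward(self, x):  # x: (batch, in_dim)
        width = self.log_width.exp()
        # RBF features of each scalar input: (batch, in_dim, n_bases).
        phi = torch.exp(-((x.unsqueeze(-1) - self.centers) / width) ** 2)
        # Sum the learnable univariate functions over inputs: (batch, out_dim).
        return torch.einsum('bik,iok->bo', phi, self.coef)

class KANAudioINR(nn.Module):
    """Maps a time coordinate t in [-1, 1] to an audio amplitude."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(KANLayer(1, hidden), KANLayer(hidden, 1))

    def forward(self, t):
        return self.net(t)

# Fit one recording: t -> waveform sample (hypothetical training loop).
model = KANAudioINR()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
t = torch.linspace(-1, 1, 16000).unsqueeze(-1)  # 1 s at 16 kHz (assumed)
target = torch.sin(2 * torch.pi * 440 * t)      # placeholder audio signal
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(t), target)
    loss.backward()
    opt.step()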
Set up the conda environment:
conda env create -f environment.yml
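Activate the environment before configuring it (the environment name is defined in environment.yml; fewsound is assumed here):
conda activate fewsound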
Set environment variables in the conda environment, for example:
conda env config vars set DATA_DIR=~/datasets
conda env config vars set RESULTS_DIR=~/results
conda env config vars set WANDB_ENTITY=my_wandb_entity
conda env config vars set WANDB_PROJECT=fewsound
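Newly set variables take effect on the next activation, so reactivate the environment and verify them with:
conda env config vars list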
Make sure that pytorch-yard is installed in the appropriate version (defined in train.py). If not, install the correct package version with something like:
pip install --force-reinstall pytorch-yard==2022.9.1
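To check which version is currently installed:
pip show pytorch-yard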
Default experiment:
python train.py
Custom settings:
python train.py cfg.learning_rate=0.01 cfg.pl.max_epochs=100
Isolated training of a target network on a single recording:
python train_inr.py cfg.pl.max_epochs=100 cfg.inr_audio_path=PATH_TO_WAV