Acoustic Scene Classification using Kervolution-Based SubSpectralNet
In this paper, a kervolution-based SubSpectralNet model is proposed for Acoustic Scene Classification (ASC). SubSpectralNet is a competitive model that divides the mel spectrogram into horizontal slices, termed sub-spectrograms, which are used as input to a Convolutional Neural Network (CNN). In this work, the linear convolution operation of SubSpectralNet is replaced with a non-linear operation using the kernel trick. The resulting model is referred to as kervolution (kernel convolution)-based SubSpectralNet. The performance of the proposed method is evaluated on the DCASE (Detection and Classification of Acoustic Scenes and Events) 2018 development dataset. The proposed method achieves 73.52% and 75.76% accuracy with polynomial and Gaussian kernels, respectively.
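The two ideas in the abstract, slicing the mel spectrogram into sub-spectrograms and replacing the linear convolution with a kernel function, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`split_sub_spectrograms`, `extract_patches`, `kervolve`), the band size and hop, the 3x3 filter, and the hyperparameters `c`, `degree`, and `gamma` are all assumptions chosen for illustration. The kernels use their common textbook forms, a polynomial kernel (x·w + c)^d and a Gaussian kernel exp(-γ‖x - w‖²), applied to each patch in place of the dot product x·w.

```python
import numpy as np


def split_sub_spectrograms(mel, band_size, hop):
    """Slice a (n_mels, n_frames) mel spectrogram into overlapping
    horizontal bands along the frequency axis (hypothetical helper)."""
    n_mels = mel.shape[0]
    starts = range(0, n_mels - band_size + 1, hop)
    return [mel[s:s + band_size, :] for s in starts]


def extract_patches(x, k):
    """Return every k x k patch of a 2-D array (valid mode), flattened.
    Output shape: (H - k + 1, W - k + 1, k * k)."""
    windows = np.lib.stride_tricks.sliding_window_view(x, (k, k))
    return windows.reshape(windows.shape[0], windows.shape[1], k * k)


def kervolve(x, w, kernel="linear", c=1.0, degree=2, gamma=0.1):
    """Apply a square filter w to x, replacing the patch/filter dot product
    with a kernel function. Hyperparameter defaults are illustrative only."""
    patches = extract_patches(x, w.shape[0])
    w_flat = w.ravel()
    if kernel == "linear":
        # Ordinary (cross-correlation style) convolution: <patch, w>.
        return patches @ w_flat
    if kernel == "polynomial":
        # Polynomial kernel: (<patch, w> + c) ** degree.
        return (patches @ w_flat + c) ** degree
    if kernel == "gaussian":
        # Gaussian (RBF) kernel: exp(-gamma * ||patch - w||^2).
        sq_dist = np.sum((patches - w_flat) ** 2, axis=-1)
        return np.exp(-gamma * sq_dist)
    raise ValueError(f"unknown kernel: {kernel}")


# Usage with synthetic data standing in for a log-mel spectrogram.
rng = np.random.default_rng(0)
mel = rng.standard_normal((40, 500))       # assumed: 40 mel bins, 500 frames
bands = split_sub_spectrograms(mel, band_size=20, hop=10)
w = 0.1 * rng.standard_normal((3, 3))      # assumed 3x3 filter

for band in bands:
    poly_map = kervolve(band, w, kernel="polynomial")
    gauss_map = kervolve(band, w, kernel="gaussian")
    print(band.shape, poly_map.shape, gauss_map.shape)
```

In a trained network the filter `w` would be learned and each sub-spectrogram would pass through its own stack of such layers; the sketch only shows how the response of a single filter changes when the inner product is swapped for a kernel.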
For reproduction or queries, please contact the authors Ritika Nandi (ritika.nandi77@gmail.com) and Shashank Shekhar (shashankshekhar90210@gmail.com).
Cite as: Nandi, R., Shekhar, S., Mulimani, M. (2021) Acoustic Scene Classification Using Kervolution-Based SubSpectralNet. Proc. Interspeech 2021, 561-565, doi: 10.21437/Interspeech.2021-656