Environmental Sound Classification 50


Audio classification is often framed as an MFCC classification problem. With this dataset, we want to draw attention to classification from raw audio, as performed in the WaveNet network.


The dataset consists of 50 WAV files sampled at 16 kHz, one for each of the 50 classes.

Each class has 40 audio samples of 5 seconds each. These samples have been concatenated by class, giving 50 WAV files of 3 min 20 s each.
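
The arithmetic behind these figures, and one way to recover the individual 5-second samples from a per-class file, can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes the samples are concatenated back to back with no padding, and the names split_into_clips, SAMPLE_RATE, CLIP_SECONDS and CLIPS_PER_CLASS are hypothetical.

```python
SAMPLE_RATE = 16_000    # Hz, as stated in the description
CLIP_SECONDS = 5        # length of one original sample
CLIPS_PER_CLASS = 40    # samples concatenated into each per-class file

# 40 clips x 5 s = 200 s = 3 min 20 s per class file
SECONDS_PER_FILE = CLIPS_PER_CLASS * CLIP_SECONDS
# 200 s x 16,000 samples/s = 3,200,000 samples per class file
SAMPLES_PER_FILE = SECONDS_PER_FILE * SAMPLE_RATE


def split_into_clips(samples, sample_rate=SAMPLE_RATE, clip_seconds=CLIP_SECONDS):
    """Cut a 1-D sequence of samples into consecutive fixed-length clips.

    Assumes the clips were concatenated without gaps; any trailing
    samples that do not fill a whole clip are dropped.
    """
    clip_len = sample_rate * clip_seconds
    n_clips = len(samples) // clip_len
    return [samples[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]
```

Applied to a full per-class file, this should return 40 clips of 80,000 samples each if the assumption above holds.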

In our example notebook, we show how to access the data and visualize a piece of it.
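
The notebook itself is not reproduced here. As a rough sketch of what accessing the data can look like, the following reads one WAV file using only the Python standard library. It assumes the files are 16-bit PCM, which the description does not state, and read_wav is a hypothetical helper name.

```python
import array
import sys
import wave


def read_wav(path):
    """Return (sample_rate, n_channels, samples) for a 16-bit PCM WAV file.

    Assumption: the file stores 16-bit samples; other widths are rejected.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        sample_rate = wav.getframerate()
        n_channels = wav.getnchannels()
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")   # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()       # WAV sample data is little-endian
    return sample_rate, n_channels, samples
```

The returned samples can then be passed to any plotting library to visualize a portion of the waveform.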


We deserve little credit for this dataset. Most of the work was done by the authors of the ESC-50 Dataset for Environmental Sound Classification. To fit the data on Kaggle, we processed the files with the to_wav.py script from the original repository. You might also notice that we converted the data from OGG to WAV, as OGG did not seem to be supported in Anaconda.


You can use this dataset to test how well your algorithms classify raw audio ;)

Data and Resources

Additional Information

Source https://www.kaggle.com/mmoreaux/environmental-sound-classification-50
Author marc moreaux
Last updated May 2, 2021, 06:42 (UTC)
Created May 2, 2021, 06:42 (UTC)
kaggle_id 3151
kaggle_lastUpdated 2018-10-26T15:54:57.473Z
kaggle_ref mmoreaux/environmental-sound-classification-50