| Paper ID | AUD-12.3 |
| Paper Title | UNSUPERVISED AND SEMI-SUPERVISED FEW-SHOT ACOUSTIC EVENT CLASSIFICATION |
| Authors | Hsin-Ping Huang, University of California, Merced, United States; Krishna Puvvada, Ming Sun, Chao Wang, Amazon Alexa, United States |
| Session | AUD-12: Detection and Classification of Acoustic Scenes and Events 1: Few-shot learning |
| Location | Gather.Town |
| Session Time | Wednesday, 09 June, 15:30 - 16:15 |
| Presentation Time | Wednesday, 09 June, 15:30 - 16:15 |
| Presentation | Poster |
| Topic | Audio and Acoustic Signal Processing: [AUD-CLAS] Detection and Classification of Acoustic Scenes and Events |
| Abstract | Few-shot Acoustic Event Classification (AEC) aims to learn a model that recognizes novel acoustic events from very limited labeled data. Previous work has relied on supervised pre-training and meta-learning approaches, both of which depend heavily on labeled data. Here, we study unsupervised and semi-supervised learning approaches for few-shot AEC. Our work builds on recent advances in unsupervised representation learning introduced for speech recognition and language modeling. We learn audio representations from a large amount of unlabeled data and use the resulting representations for few-shot AEC. We further extend our model in a semi-supervised fashion. Our unsupervised representation learning approach outperforms supervised pre-training methods, and our semi-supervised learning approach outperforms meta-learning methods for few-shot AEC. We also show that our approach is more robust under domain mismatch. |
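
To make the two-stage recipe in the abstract concrete, here is a minimal sketch that pairs contrastive pre-training on unlabeled audio (in the spirit of the unsupervised representation-learning methods the abstract references) with nearest-prototype few-shot classification on top of the learned embeddings. Everything below is an illustrative assumption rather than the paper's actual method: the toy encoder, the InfoNCE objective, the noise-based augmentations, and the prototype classifier stand in for whatever the authors used.

```python
# Sketch: unsupervised contrastive pre-training on unlabeled clips, then
# few-shot classification via class prototypes. All architecture and
# hyperparameter choices are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AudioEncoder(nn.Module):
    """Toy encoder: maps a (batch, 1, n_mels, frames) spectrogram to a unit embedding."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, embed_dim)

    def forward(self, x):
        h = self.conv(x).flatten(1)
        return F.normalize(self.proj(h), dim=-1)

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE between two augmented views of the same unlabeled clips."""
    logits = z1 @ z2.t() / temperature      # (B, B) cosine-similarity matrix
    targets = torch.arange(z1.size(0))      # matching views sit on the diagonal
    return F.cross_entropy(logits, targets)

def few_shot_predict(encoder, support_x, support_y, query_x, n_classes):
    """Nearest-prototype classification using the pre-trained embeddings."""
    with torch.no_grad():
        s = encoder(support_x)              # (n_support, D)
        q = encoder(query_x)                # (n_query, D)
    protos = torch.stack([s[support_y == c].mean(0) for c in range(n_classes)])
    return (q @ F.normalize(protos, dim=-1).t()).argmax(dim=-1)

# Usage sketch on random tensors standing in for log-mel spectrograms.
encoder = AudioEncoder()
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
for _ in range(10):                                  # unsupervised pre-training loop
    clips = torch.randn(16, 1, 64, 100)              # unlabeled batch
    view1 = clips + 0.05 * torch.randn_like(clips)   # stand-in augmentations
    view2 = clips + 0.05 * torch.randn_like(clips)
    loss = info_nce_loss(encoder(view1), encoder(view2))
    opt.zero_grad()
    loss.backward()
    opt.step()

support_x = torch.randn(10, 1, 64, 100)              # 5-way, 2-shot support set
support_y = torch.arange(5).repeat_interleave(2)
query_x = torch.randn(8, 1, 64, 100)
print(few_shot_predict(encoder, support_x, support_y, query_x, n_classes=5))
```

The key design point this illustrates is the one the abstract argues for: the encoder never sees a label during pre-training, and the few labeled support examples are used only at inference time to form class prototypes, so almost all of the learning signal comes from unlabeled audio.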