2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper ID SPE-8.4
Paper Title SYNAUG: SYNTHESIS-BASED DATA AUGMENTATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION
Authors Chenpeng Du, Bing Han, Shuai Wang, Yanmin Qian, Kai Yu, Shanghai Jiao Tong University, China
Session SPE-8: Speaker Recognition 2: Channel and Domain Robustness
Location Gather.Town
Session Time: Tuesday, 08 June, 14:00 - 14:45
Presentation Time: Tuesday, 08 June, 14:00 - 14:45
Presentation Poster
Topic Speech Processing: [SPE-SPKR] Speaker Recognition and Characterization
Abstract Text-dependent speaker verification systems trained on large amounts of labelled data exhibit remarkable performance. However, collecting speech from many speakers with the target transcript is a lengthy and expensive process. In this work, we propose a synthesis-based data augmentation method (SynAug) to expand the training set with more speakers and text-controlled synthesized speech. The performance of SynAug is evaluated on the RSR2015 dataset. Experimental results show that, for the i-vector framework, the proposed method can boost system performance significantly, especially in the low-resource condition where the amount of genuine speech is extremely limited. Moreover, when combined with traditional data augmentation methods such as adding noise and reverberation, the systems can be further strengthened in extremely limited-resource situations.