| Paper ID | AUD-14.1 |
| Paper Title | SESQA: SEMI-SUPERVISED LEARNING FOR SPEECH QUALITY ASSESSMENT |
| Authors | Joan Serrà, Jordi Pons, Santiago Pascual, Dolby Laboratories, Spain |
| Session | AUD-14: Quality and Intelligibility Measures |
| Location | Gather.Town |
| Session Time | Wednesday, 09 June, 15:30 - 16:15 |
| Presentation Time | Wednesday, 09 June, 15:30 - 16:15 |
| Presentation | Poster |
| Topic | Audio and Acoustic Signal Processing: [AUD-QIM] Quality and Intelligibility Measures |
| Abstract | Automatic speech quality assessment is an important, transversal task whose progress is hampered by the scarcity of human annotations, poor generalization to unseen recording conditions, and a lack of flexibility of existing approaches. In this work, we tackle these problems with a semi-supervised learning approach, combining available annotations with programmatically generated data, and using 3 different optimization criteria together with 5 complementary auxiliary tasks. Our results show that such a semi-supervised approach can cut the error of existing methods by more than 36%, while providing additional benefits in terms of reusable features or auxiliary outputs. Improvement is further corroborated with an out-of-sample test showing promising generalization capabilities. |