| Paper ID | SPE-22.2 |
| Paper Title | MIXTURE OF INFORMED EXPERTS FOR MULTILINGUAL SPEECH RECOGNITION |
| Authors | Neeraj Gaur, Brian Farris, Parisa Haghani, Isabel Leal, Pedro J. Moreno, Manasa Prasad, Bhuvana Ramabhadran, Yun Zhu, Google Inc., United States |
| Session | SPE-22: Speech Recognition 8: Multilingual Speech Recognition |
| Location | Gather.Town |
| Session Time | Wednesday, 09 June, 15:30 - 16:15 |
| Presentation Time | Wednesday, 09 June, 15:30 - 16:15 |
| Presentation | Poster |
| Topic | Speech Processing: [SPE-MULT] Multilingual Recognition and Identification |
| Abstract |
When trained on related or low-resource languages, multilingual speech recognition models often outperform their monolingual counterparts. However, these models can suffer a loss in performance on high-resource or unrelated languages. We investigate the use of a mixture-of-experts approach to assign per-language parameters in the model, increasing network capacity in a structured fashion. We introduce a novel variant of this approach, 'informed experts', which attempts to tackle inter-task conflicts by eliminating gradients from other tasks in these task-specific parameters. We conduct experiments on a real-world task with English, French, and four dialects of Arabic to show the effectiveness of our approach. Our model matches or outperforms the monolingual models for almost all languages, with gains of as much as 31% relative. It also outperforms the baseline multilingual model for all languages, with gains as large as 9% relative.
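To make the 'informed experts' idea concrete, below is a minimal PyTorch sketch of one possible reading of the abstract: each language owns its own expert feed-forward block, and the gate is a one-hot vector derived from the known language ID rather than a learned router. Because an expert's output is scaled by a zero gate weight for batches in other languages, those batches contribute zero gradient to its parameters. This is an illustrative assumption about the mechanism, not the authors' code; all class, function, and parameter names (`InformedExpertsLayer`, `num_langs`, `lang_id`, etc.) are hypothetical.

```python
import torch
import torch.nn as nn

class InformedExpertsLayer(nn.Module):
    """Sketch of language-informed experts (illustrative, not the paper's code).

    One expert feed-forward block per language; routing uses the known
    language ID as a one-hot gate, so gradients from a batch in language k
    update only expert k's task-specific parameters.
    """

    def __init__(self, num_langs: int, dim: int, hidden: int):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
            for _ in range(num_langs)
        )

    def forward(self, x: torch.Tensor, lang_id: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim) encoder activations
        # lang_id: (batch,) integer language labels
        gate = nn.functional.one_hot(lang_id, len(self.experts)).to(x.dtype)
        out = torch.zeros_like(x)
        for k, expert in enumerate(self.experts):
            # A zero gate weight zeroes both the expert's contribution and
            # its gradient, eliminating cross-task updates to its parameters.
            out = out + gate[:, k].view(-1, 1, 1) * expert(x)
        return out
```

With a hard one-hot gate, the forward pass behaves as if only the matching expert ran, and the chain rule guarantees that non-matching experts receive no gradient, which is one straightforward way to realize "eliminating gradients from other tasks" in the per-language parameters.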