| Paper ID | SPE-22.1 |
| Paper Title | Code-Switch Speech Rescoring With Monolingual Data |
| Authors | Guoyu Liu, Lixin Cao, Tencent, China |
| Session | SPE-22: Speech Recognition 8: Multilingual Speech Recognition |
| Location | Gather.Town |
| Session Time | Wednesday, 09 June, 15:30 - 16:15 |
| Presentation Time | Wednesday, 09 June, 15:30 - 16:15 |
| Presentation | Poster |
| Topic | Human Language Technology: [HLT-LANG] Language Modeling |
| Abstract | In automatic speech recognition (ASR), handling code-switch speech remains an open problem. Code-switch speech recognition is challenging because of data scarcity and the diverse syntactic structures across languages. In this paper, we focus on code-switch speech recognition in mainland China, whose linguistic characteristics differ markedly from those of Hong Kong and Southeast Asia. We propose a novel approach that uses only monolingual data for second-pass code-switch speech recognition, also known as language model rescoring. The approach converts a code-switch sentence into a monolingual sentence through a word mapping step and a language model determination step, so the scarcity of code-switch data no longer needs to be addressed. The word pairs used in the word mapping step are produced by a carefully designed generation process that incorporates machine translation, word alignment, and other components. We show that the proposed approach achieves a relative WER reduction of over 7.23% compared with monolingual language model (MLM) rescoring on our test set. |
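
The rescoring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word-pair table, the toy bigram scores, the function names (`lm_score`, `best_monolingual`, `rescore`), the `lm_weight` interpolation, and the example sentences are all assumptions made for illustration. Each code-switch hypothesis is mapped to candidate monolingual sentences, a monolingual language model picks the best candidate, and that score is combined with the first-pass score.

```python
from itertools import product

# Hypothetical word-pair table (the paper builds its pairs with machine
# translation and word alignment; these entries are made up).
WORD_PAIRS = {
    "meeting": ["会议", "会"],
}

# Toy monolingual bigram log-probabilities, invented for illustration.
BIGRAM_LOGP = {
    ("<s>", "我"): -0.5,
    ("我", "有"): -1.0,
    ("有", "会议"): -1.2,
    ("有", "会"): -2.5,
    ("会议", "</s>"): -0.7,
    ("会", "</s>"): -1.5,
}
UNSEEN_LOGP = -6.0  # assumed flat penalty for bigrams not in the table


def lm_score(words):
    """Sum bigram log-probabilities over a sentence with boundary tokens."""
    padded = ["<s>"] + list(words) + ["</s>"]
    return sum(BIGRAM_LOGP.get(pair, UNSEEN_LOGP)
               for pair in zip(padded, padded[1:]))


def best_monolingual(words):
    """Word mapping, then LM determination: enumerate every monolingual
    candidate and keep the one the monolingual LM scores highest."""
    options = [WORD_PAIRS.get(w, [w]) for w in words]
    candidates = [list(c) for c in product(*options)]
    return max(candidates, key=lm_score)


def rescore(nbest, lm_weight=0.5):
    """Return (total_score, original_words, mapped_words) for the best
    hypothesis in an n-best list of (words, first_pass_score) pairs."""
    scored = []
    for words, first_pass in nbest:
        mapped = best_monolingual(words)
        total = first_pass + lm_weight * lm_score(mapped)
        scored.append((total, words, mapped))
    return max(scored, key=lambda item: item[0])


nbest = [
    (["我", "有", "meeting"], -10.0),
    (["我", "又", "meeting"], -9.8),
]
total, original, mapped = rescore(nbest)
print(original, mapped, round(total, 2))
```

In this toy setup the second hypothesis has the better first-pass score, but the monolingual LM penalizes its mapped form heavily enough that the first hypothesis wins after rescoring. Exhaustive enumeration of candidates is only practical for short sentences with few mapped words; a real system would need pruning or a search procedure.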