2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper ID: HLT-2.2
Paper Title: LARGE MARGIN TRAINING IMPROVES LANGUAGE MODELS FOR ASR
Authors: Jilin Wang, Boston University, United States; Jiaji Huang, Kenneth Church, Baidu Research, United States
Session: HLT-2: Language Modeling 2: Neural Language Models
Location: Gather.Town
Session Time: Tuesday, 08 June, 13:00 - 13:45
Presentation Time: Tuesday, 08 June, 13:00 - 13:45
Presentation: Poster
Topic: Human Language Technology: [HLT-LANG] Language Modeling
Abstract: Language models (LMs) are widely deployed in modern ASR systems. An LM is often trained by minimizing its perplexity on speech transcripts. However, few studies train the LM to discriminate a "gold" reference from inferior hypotheses. In this work, we propose a large margin language model (LMLM). LMLM is a general framework that trains an LM to assign a higher score to the "gold" reference and a lower one to an inferior hypothesis. We apply the framework to three pretrained LM architectures: left-to-right LSTM, transformer encoder, and transformer decoder. Results show that LMLM significantly outperforms traditional LMs trained by minimizing perplexity, especially in cases where domain shift exists and more robustness is required. Finally, among the three architectures, the transformer encoder achieves the best performance.
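
The abstract does not give the exact training objective, so the following is only a minimal sketch of one common way to express a large margin criterion: a hinge penalty on the gap between the score of the "gold" reference and the score of each inferior hypothesis. The function name `large_margin_loss`, the `margin` parameter, and the example scores are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def large_margin_loss(
    gold_score: float,
    hypothesis_scores: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Average hinge penalty over the inferior hypotheses.

    Assumption: scores are LM log-probabilities, so higher is better.
    A hypothesis contributes zero penalty once the gold reference
    outscores it by at least `margin`.
    """
    if not hypothesis_scores:
        return 0.0
    penalties = [
        max(0.0, margin - (gold_score - score)) for score in hypothesis_scores
    ]
    return sum(penalties) / len(penalties)


# Made-up scores: one hypothesis is well separated, one falls inside
# the margin, and one outscores the gold reference.
loss = large_margin_loss(gold_score=-10.0, hypothesis_scores=[-12.0, -10.5, -9.0])
```

In this sketch, only hypotheses that score within `margin` of the reference, or above it, contribute to the loss, which is the behavior the abstract describes informally as assigning a higher score to the reference and a lower one to the inferior hypothesis.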