SS-11: On-device AI for Audio and Speech Applications |
| Session Type: Poster |
| Time: Thursday, 10 June, 14:00 - 14:45 |
| Location: Gather.Town |
| Session Chairs: Anurag Kumar, Facebook Research; Buye Xu, Facebook Reality Labs; and DeLiang Wang, The Ohio State University |
| SS-11.1: COMPRESSING DEEP NEURAL NETWORKS FOR EFFICIENT SPEECH ENHANCEMENT |
| Ke Tan; The Ohio State University |
| DeLiang Wang; The Ohio State University |
| SS-11.2: IMPROVED MASK-CTC FOR NON-AUTOREGRESSIVE END-TO-END ASR |
| Yosuke Higuchi; Waseda University |
| Hirofumi Inaguma; Kyoto University |
| Shinji Watanabe; Johns Hopkins University |
| Tetsuji Ogawa; Waseda University |
| Tetsunori Kobayashi; Waseda University |
| SS-11.3: MEMORY-EFFICIENT SPEECH RECOGNITION ON SMART DEVICES |
| Ganesh Venkatesh; Facebook |
| Alagappan Valliappan; Facebook |
| Jay Mahadeokar; Facebook |
| Yuan Shangguan; Facebook |
| Christian Fuegen; Facebook |
| Mike Seltzer; Facebook |
| Vikas Chandra; Facebook |
| SS-11.4: EXPEDITING DISCOVERY IN NEURAL ARCHITECTURE SEARCH BY COMBINING LEARNING WITH PLANNING |
| Farzaneh S. Fard; Fluent.ai |
| Vikrant Tomar; Fluent.ai |
| SS-11.5: SPECIALIZED EMBEDDING APPROXIMATION FOR EDGE INTELLIGENCE: A CASE STUDY IN URBAN SOUND CLASSIFICATION |
| Sangeeta Srivastava; The Ohio State University |
| Dhrubojyoti Roy; The Ohio State University |
| Mark Cartwright; New York University |
| Juan Pablo Bello; New York University |
| Anish Arora; The Ohio State University |
| SS-11.6: LIGHT-TTS: LIGHTWEIGHT MULTI-SPEAKER MULTI-LINGUAL TEXT-TO-SPEECH |
| Song Li; Xiamen University |
| Beibei Ouyang; Xiamen University |
| Lin Li; Xiamen University |
| Qingyang Hong; Xiamen University |