SPE-4: Speech Synthesis 2: Controllability | 
| Session Type: Poster | 
| Time: Tuesday, 8 June, 13:00 - 13:45 | 
| Location: Gather.Town | 
| Session Chair: Yu Zhang, Google | 
| SPE-4.1: PARALLEL TACOTRON: NON-AUTOREGRESSIVE AND CONTROLLABLE TTS | 
| Isaac Elias; Google | 
| Heiga Zen; Google | 
| Jonathan Shen; Google | 
| Yu Zhang; Google | 
| Ye Jia; Google | 
| Ron Weiss; Google | 
| Yonghui Wu; Google | 
| SPE-4.2: FCL-TACO2: TOWARDS FAST, CONTROLLABLE AND LIGHTWEIGHT TEXT-TO-SPEECH SYNTHESIS | 
| Disong Wang; The Chinese University of Hong Kong | 
| Liqun Deng; Huawei Noah's Ark Lab | 
| Yang Zhang; Huawei Noah's Ark Lab | 
| Nianzu Zheng; Huawei Noah's Ark Lab | 
| Yu Ting Yeung; Huawei Noah's Ark Lab | 
| Xiao Chen; Huawei Noah's Ark Lab | 
| Xunying Liu; The Chinese University of Hong Kong | 
| Helen Meng; The Chinese University of Hong Kong | 
| SPE-4.3: PROSODIC CLUSTERING FOR PHONEME-LEVEL PROSODY CONTROL IN END-TO-END SPEECH SYNTHESIS | 
| Alexandra Vioni; Innoetics, Samsung Electronics | 
| Myrsini Christidou; Innoetics, Samsung Electronics | 
| Nikolaos Ellinas; Innoetics, Samsung Electronics | 
| Georgios Vamvoukakis; Innoetics, Samsung Electronics | 
| Panos Kakoulidis; Innoetics, Samsung Electronics | 
| Taehoon Kim; Mobile Communications Business, Samsung Electronics | 
| June Sig Sung; Mobile Communications Business, Samsung Electronics | 
| Hyoungmin Park; Mobile Communications Business, Samsung Electronics | 
| Aimilios Chalamandaris; Innoetics, Samsung Electronics | 
| Pirros Tsiakoulis; Innoetics, Samsung Electronics | 
| SPE-4.4: IMPROVING NATURALNESS AND CONTROLLABILITY OF SEQUENCE-TO-SEQUENCE SPEECH SYNTHESIS BY LEARNING LOCAL PROSODY REPRESENTATIONS | 
| Cheng Gong; Tianjin University | 
| Longbiao Wang; Tianjin University | 
| Zhenhua Ling; University of Science and Technology of China | 
| Shaotong Guo; Tianjin University | 
| Ju Zhang; Huiyan Technology (Tianjin) Co., Ltd | 
| Jianwu Dang; Japan Advanced Institute of Science and Technology | 
| SPE-4.5: MULTI-SPEAKER EMOTIONAL SPEECH SYNTHESIS WITH FINE-GRAINED PROSODY MODELING | 
| Chunhui Lu; Samsung Research China-Beijing | 
| Xue Wen; Samsung Research China-Beijing | 
| Ruolan Liu; Samsung Research China-Beijing | 
| Xiao Chen; Samsung Research China-Beijing | 
| SPE-4.6: EMOTION CONTROLLABLE SPEECH SYNTHESIS USING EMOTION-UNLABELED DATASET WITH THE ASSISTANCE OF CROSS-DOMAIN SPEECH EMOTION RECOGNITION | 
| Xiong Cai; Tsinghua University | 
| Dongyang Dai; Tsinghua University | 
| Zhiyong Wu; Tsinghua University | 
| Xiang Li; Tsinghua University | 
| Jingbei Li; Tsinghua University | 
| Helen Meng; The Chinese University of Hong Kong |