| HLT-10: Multi-modality in Language |
| Session Type: Poster |
| Time: Wednesday, 9 June, 16:30 - 17:15 |
| Location: Gather.Town |
| Session Chair: Mahnoosh Mehrabani, Interactions Research |
| HLT-10.1: INCORPORATING SYNTACTIC AND PHONETIC INFORMATION INTO MULTIMODAL WORD EMBEDDINGS USING GRAPH CONVOLUTIONAL NETWORKS |
| Wenhao Zhu; Shanghai University |
| Shuang Liu; Shanghai University |
| Chaoming Liu; Shanghai University |
| HLT-10.2: LIFI: TOWARDS LINGUISTICALLY INFORMED FRAME INTERPOLATION |
| Aradhya Mathur; IIIT Delhi |
| Devansh Batra; IIIT Delhi |
| Yaman Kumar Singla; IIIT Delhi; Adobe; State University of New York at Buffalo |
| Rajiv Ratn Shah; IIIT Delhi |
| Changyou Chen; State University of New York at Buffalo |
| Roger Zimmermann; NUS |
| HLT-10.3: TRIPLE SEQUENCE GENERATIVE ADVERSARIAL NETS FOR UNSUPERVISED IMAGE CAPTIONING |
| Yucheng Zhou; Fudan University |
| Wei Tao; Fudan University |
| Wenqiang Zhang; Fudan University |
| HLT-10.4: ALIGN OR ATTEND? TOWARD MORE EFFICIENT AND ACCURATE SPOKEN WORD DISCOVERY USING SPEECH-TO-IMAGE RETRIEVAL |
| Liming Wang; University of Illinois, Urbana-Champaign |
| Xinsheng Wang; Delft University of Technology |
| Mark Hasegawa-Johnson; University of Illinois, Urbana-Champaign |
| Odette Scharenborg; Delft University of Technology |
| Najim Dehak; Johns Hopkins University |
| HLT-10.5: TOWARDS PRACTICAL LIPREADING WITH DISTILLED AND EFFICIENT MODELS |
| Pingchuan Ma; Imperial College London |
| Brais Martinez; Samsung AI Research Center |
| Stavros Petridis; Imperial College London |
| Maja Pantic; Imperial College London |
| HLT-10.6: END-TO-END AUDIO-VISUAL SPEECH RECOGNITION WITH CONFORMERS |
| Pingchuan Ma; Imperial College London |
| Stavros Petridis; Imperial College London |
| Maja Pantic; Imperial College London |