MLSP-10: Deep Learning for Speech and Audio
| Session Type: Poster |
| Time: Tuesday, 8 June, 16:30 - 17:15 |
| Location: Gather.Town |
| Session Chair: Ritwik Giri, Amazon |
| MLSP-10.1: HIGH-FREQUENCY ADVERSARIAL DEFENSE FOR SPEECH AND AUDIO |
| Raphael Olivier; Carnegie Mellon University |
| Bhiksha Raj; Carnegie Mellon University |
| Muhammad Shah; Carnegie Mellon University |
| MLSP-10.2: LEARNING SEPARABLE TIME-FREQUENCY FILTERBANKS FOR AUDIO CLASSIFICATION |
| Jie Pu; Imperial College London |
| Yannis Panagakis; University of Athens |
| Maja Pantic; Imperial College London |
| MLSP-10.3: UPSAMPLING ARTIFACTS IN NEURAL AUDIO SYNTHESIS |
| Jordi Pons; Dolby Laboratories |
| Santiago Pascual; Dolby Laboratories |
| Giulio Cengarle; Dolby Laboratories |
| Joan Serrà; Dolby Laboratories |
| MLSP-10.4: DEEP CONVOLUTIONAL AND RECURRENT NETWORKS FOR POLYPHONIC INSTRUMENT CLASSIFICATION FROM MONOPHONIC RAW AUDIO WAVEFORMS |
| Kleanthis Avramidis; National Technical University of Athens |
| Agelos Kratimenos; National Technical University of Athens |
| Christos Garoufis; National Technical University of Athens |
| Athanasia Zlatintsi; National Technical University of Athens |
| Petros Maragos; National Technical University of Athens |
| MLSP-10.5: LEARNING AUDIO EMBEDDINGS WITH USER LISTENING DATA FOR CONTENT-BASED MUSIC RECOMMENDATION |
| Ke Chen; University of California, San Diego |
| Beici Liang; Tencent Music Entertainment |
| Xiaoshuan Ma; Tencent Music Entertainment |
| Minwei Gu; Tencent Music Entertainment |
| MLSP-10.6: EFFICIENT SPEECH EMOTION RECOGNITION USING MULTI-SCALE CNN AND ATTENTION |
| Zixuan Peng; Zhuiyi Technology |
| Yu Lu; Zhuiyi Technology |
| Shengfeng Pan; Zhuiyi Technology |
| Yunfeng Liu; Zhuiyi Technology |