2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Paper Detail

Paper ID: ASPS-7.1
Paper Title: End-to-End Learning for Convolutive Multi-Channel Wiener Filtering
Authors: Masahito Togami, LINE Corporation, Japan
Session: ASPS-7: Data Science & Machine Learning
Location: Gather.Town
Session Time: Thursday, 10 June, 16:30 - 17:15
Presentation Time: Thursday, 10 June, 16:30 - 17:15
Presentation: Poster
Topic: Applied Signal Processing Systems: Signal Processing Systems [DIS-EMSA]
Abstract: In this paper, we propose a dereverberation and speech source separation method based on a deep neural network (DNN). Unlike a cascade connection of dereverberation and speech source separation, the proposed method performs both jointly with a unified convolutive multi-channel Wiener filter (CMWF). The proposed method adopts a time-varying CMWF to achieve better dereverberation and separation performance than a time-invariant CMWF. The time-varying CMWF requires time-frequency masks and time-frequency activities, which are inferred by a unified DNN. The DNN is trained to optimize the output signal of the time-varying CMWF with a loss function based on a negative log-posterior probability density function. We also show that the time-varying CMWF can be computed efficiently using the Sherman-Morrison-Woodbury formula. Experimental results show that the proposed time-varying CMWF separates speech sources in reverberant environments better than the cascade-connection-based method and the time-invariant CMWF.
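
The abstract's efficiency claim rests on rank-1 inverse updates. As a rough illustration only (not the authors' implementation), the sketch below assembles a time-varying multi-channel Wiener filter for a single frequency bin frame by frame, replacing a full per-frame matrix inversion with Sherman-Morrison rank-1 updates. It simplifies the paper's convolutive formulation (which stacks past frames to handle reverberation) down to the instantaneous case; the function names, the rank-1 spatial model per source, the DNN-provided activities, and the fixed noise floor are assumptions made purely for the example.

```python
import numpy as np

def sherman_morrison_update(A_inv, u, v):
    """Rank-1 inverse update: returns (A + u v^H)^{-1} given A^{-1}."""
    Au = A_inv @ u                       # A^{-1} u
    vA = v.conj() @ A_inv                # v^H A^{-1}
    denom = 1.0 + v.conj() @ Au          # 1 + v^H A^{-1} u
    return A_inv - np.outer(Au, vA) / denom

def time_varying_mwf(obs, steering, activities, noise_psd=1e-3):
    """
    Illustrative time-varying multi-channel Wiener filter for one
    frequency bin (hypothetical helper, not the paper's CMWF).

    obs        : (T, M) observed multi-channel STFT frames
    steering   : (K, M) per-source spatial vectors (rank-1 source models)
    activities : (T, K) per-frame source activities (e.g. from a DNN)
    """
    T, M = obs.shape
    K = steering.shape[0]
    out = np.zeros((T, K), dtype=complex)
    base_inv = np.eye(M) / noise_psd     # inverse of the stationary noise covariance
    for t in range(T):
        Rx_inv = base_inv.copy()
        # Fold each source's rank-1 term v_k(t) a_k a_k^H into R_x^{-1}
        # via Sherman-Morrison instead of re-inverting R_x at every frame.
        for k in range(K):
            u = activities[t, k] * steering[k]
            Rx_inv = sherman_morrison_update(Rx_inv, u, steering[k])
        for k in range(K):
            # Rank-1 Wiener filter for source k: w_k = v_k(t) R_x^{-1} a_k (up to scale)
            w = activities[t, k] * (Rx_inv @ steering[k])
            out[t, k] = w.conj() @ obs[t]
    return out
```

In the convolutive setting of the paper, the observation vector would additionally stack delayed frames so that dereverberation and separation are handled by one filter; the same low-rank update argument then applies to the larger stacked covariance.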