2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Paper Detail

Paper ID IVMSP-21.5
Paper Title MULTI-SCALE FEATURE-GUIDED STEREOSCOPIC VIDEO QUALITY ASSESSMENT BASED ON 3D CONVOLUTIONAL NEURAL NETWORK
Authors Yingjie Feng, Sumei Li, Yongli Chang, Tianjin University, China
Session IVMSP-21: Image & Video Quality
Location Gather.Town
Session Time: Thursday, 10 June, 14:00 - 14:45
Presentation Time: Thursday, 10 June, 14:00 - 14:45
Presentation Poster
Topic Image, Video, and Multidimensional Signal Processing: [IVSMR] Image & Video Sensing, Modeling, and Representation
Abstract With the rapid development of stereoscopic video technology, research on stereoscopic video quality assessment (SVQA) has become important for advancing stereoscopic video systems. In recent years, many SVQA methods based on convolutional neural networks (CNNs) have emerged. In this paper, we propose a multi-scale feature-guided 3D convolutional neural network for SVQA that not only uses 3D convolution to capture spatio-temporal features but also aggregates multi-scale information through a new multi-scale unit. In addition, we employ a multi-stage growing attention mechanism in the network to learn more critical deep semantic information. The proposed method is tested on two public stereoscopic video quality datasets, and the results show that it correlates highly with human visual perception and outperforms state-of-the-art methods by a large margin.
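
The abstract does not say which measures were used to quantify agreement with human visual perception. Quality assessment work commonly reports the Pearson linear correlation coefficient (PLCC) and the Spearman rank-order correlation coefficient (SROCC) between predicted scores and subjective scores, so the sketch below assumes those two. The function names and the sample scores are hypothetical and serve only to illustrate the calculation; they are not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson linear correlation (PLCC) between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical predicted scores and subjective scores, for illustration only.
predicted = [1.0, 2.0, 3.0, 4.0]
subjective = [10.0, 20.0, 30.0, 45.0]
plcc = pearson(predicted, subjective)
srocc = spearman(predicted, subjective)
```

Here the subjective scores rise monotonically with the predictions but not linearly, so SROCC reaches 1 while PLCC stays slightly below it.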