| Paper ID | IVMSP-26.6 |
| Paper Title | WEBLY SUPERVISED DEEP ATTENTIVE QUANTIZATION |
| Authors | Jinpeng Wang, Bin Chen, Tao Dai, Shutao Xia, Tsinghua University, China |
| Session | IVMSP-26: Attention for Vision |
| Location | Gather.Town |
| Session Time | Thursday, 10 June, 16:30 - 17:15 |
| Presentation Time | Thursday, 10 June, 16:30 - 17:15 |
| Presentation | Poster |
| Topic | Image, Video, and Multidimensional Signal Processing: [IVARS] Image & Video Analysis, Synthesis, and Retrieval |
| Abstract | Learning to hash has been widely applied to large-scale image retrieval. Although current deep hashing methods yield state-of-the-art performance, their heavy dependence on ground-truth annotations makes them difficult to deploy in practical applications such as social media. To solve this problem, we propose a novel method termed Webly Supervised Deep Attentive Quantization (WSDAQ), in which deep quantization is trained on web images associated with user-provided weak tags, without consulting any ground-truth labels. Specifically, we design a tag processing module that leverages the semantic information in tags to better supervise quantization learning. In addition, we propose an end-to-end trainable Attentive Product Quantization Module (APQM) to quantize deep image features. Furthermore, we use a noise-contrastive estimation loss to train the model from the perspective of contrastive learning. Experiments validate that WSDAQ is superior to state-of-the-art compact-coding baselines trained on weakly tagged web images. |
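The abstract names three ingredients: an attention-based, end-to-end trainable product quantization module, tag-derived supervision, and a noise-contrastive estimation loss. The sketch below is a minimal, hypothetical PyTorch illustration of how such pieces could fit together; the class names, hyper-parameters (number of codebooks, codebook size, temperatures), and the specific soft-assignment formulation are assumptions for illustration only, not the authors' implementation of WSDAQ.

```python
# Hypothetical sketch: soft product quantization with attention over codewords,
# trained with an InfoNCE-style noise-contrastive loss against tag embeddings.
# All names and hyper-parameters are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftProductQuantizer(nn.Module):
    """Split a feature into M sub-vectors and replace each with an
    attention-weighted combination of K learnable codewords (differentiable PQ)."""

    def __init__(self, feat_dim=512, num_codebooks=8, codebook_size=256, temperature=0.1):
        super().__init__()
        assert feat_dim % num_codebooks == 0
        self.M = num_codebooks
        self.K = codebook_size
        self.d = feat_dim // num_codebooks
        self.tau = temperature
        # One codebook of K codewords per sub-space.
        self.codebooks = nn.Parameter(torch.randn(self.M, self.K, self.d) * 0.02)

    def forward(self, x):
        b = x.size(0)
        sub = x.view(b, self.M, self.d)                        # (B, M, d)
        # Attention weights over codewords via scaled similarities.
        logits = torch.einsum('bmd,mkd->bmk', sub, self.codebooks) / self.tau
        attn = logits.softmax(dim=-1)                          # (B, M, K)
        quantized = torch.einsum('bmk,mkd->bmd', attn, self.codebooks)
        return quantized.reshape(b, -1)                        # (B, feat_dim)


def nce_loss(image_codes, tag_embeddings, temperature=0.07):
    """Noise-contrastive objective: each quantized image code should match its own
    tag embedding and mismatch the other samples in the batch (the noise)."""
    img = F.normalize(image_codes, dim=-1)
    txt = F.normalize(tag_embeddings, dim=-1)
    logits = img @ txt.t() / temperature                       # (B, B) similarities
    targets = torch.arange(img.size(0), device=img.device)
    return F.cross_entropy(logits, targets)


if __name__ == "__main__":
    # Toy forward/backward pass; random tensors stand in for CNN image features
    # and tag-embedding outputs.
    feats = torch.randn(16, 512)
    tags = torch.randn(16, 512)
    pq = SoftProductQuantizer()
    loss = nce_loss(pq(feats), tags)
    loss.backward()
    print(f"contrastive loss: {loss.item():.4f}")
```

In this reading, the soft assignment keeps quantization differentiable so the codebooks and the backbone can be trained end to end, while the contrastive loss supplies supervision from weak tags instead of ground-truth class labels.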