2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information



Paper Detail

Paper ID: IVMSP-28.6
Paper Title: IMAGE GENERATION BASED ON TEXTURE GUIDED VAE-AGAN FOR REGIONS OF INTEREST DETECTION IN REMOTE SENSING IMAGES
Authors: Libao Zhang, Yanan Liu, Beijing Normal University, China
Session: IVMSP-28: Image Synthesis
Location: Gather.Town
Session Time: Friday, 11 June, 11:30 - 12:15
Presentation Time: Friday, 11 June, 11:30 - 12:15
Presentation: Poster
Topic: Image, Video, and Multidimensional Signal Processing: [IVARS] Image & Video Analysis, Synthesis, and Retrieval
Abstract: Deep learning has shown great strength in region of interest (ROI) detection for remote sensing images (RSIs). However, for most RSIs, the unbalanced distribution of positive and negative samples greatly limits the performance of deep learning-based methods. To cope with this issue, we propose a novel method based on a texture guided variational autoencoder-attention wise generative adversarial network (VAE-AGAN) to augment the training data for ROI detection. First, to generate realistic texture details of RSIs, we propose a texture guidance block that embeds texture prior information into the encoder and decoder networks. Second, we introduce channel-wise and spatial-wise attention layers in the discriminator to adaptively recalibrate the varying importance of different channels and spatial regions of the input RSIs. Finally, we apply the RSI dataset balanced by our proposal to a weakly supervised ROI detection method. Experimental results demonstrate that the proposal not only improves the performance of ROI detection but also outperforms other competing augmentation methods.
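
The abstract does not specify how the channel-wise and spatial-wise attention layers are built, so the following is only a minimal NumPy sketch of the general idea of recalibrating a feature map per channel and per spatial location. It assumes a squeeze-and-excitation style channel gate and a parameter-free spatial gate; the function names, the weight matrices `w1` and `w2`, and the tensor sizes are hypothetical and are not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Rescale each channel of feat (C, H, W) by a gate in (0, 1).

    w1 (C//r, C) and w2 (C, C//r) stand in for learned weights (hypothetical).
    """
    squeezed = feat.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)  # bottleneck with ReLU
    gate = sigmoid(w2 @ hidden)              # one weight per channel
    return feat * gate[:, None, None]

def spatial_attention(feat):
    """Rescale each spatial location of feat (C, H, W) by a gate in (0, 1).

    Simplifying assumption: the gate is computed from the channel-wise mean
    and max without a learned convolution.
    """
    avg_map = feat.mean(axis=0)              # (H, W)
    max_map = feat.max(axis=0)               # (H, W)
    gate = sigmoid(avg_map + max_map)        # one weight per location
    return feat * gate[None, :, :]

# Toy feature map standing in for a discriminator activation (assumed sizes).
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1

out = spatial_attention(channel_attention(feat, w1, w2))
```

Both gates only scale activations, so the output keeps the input shape and no value grows in magnitude; in a trained network the weights would be learned jointly with the discriminator.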