ICASSP 2021 Grand Challenges are held under the general IEEE SPS umbrella of the IEEE Challenges and Data Committee Program, whose activities are detailed at: https://signalprocessingsociety.org/publications-resources/challenges-and-data-collections
Text-to-speech (TTS), or speech synthesis, has witnessed significant performance improvements with the help of deep learning. The latest advances in the end-to-end text-to-speech paradigm and neural vocoders have made it possible to produce very realistic, natural-sounding synthetic speech approaching human parity. However, this remarkable ability is still limited to ideal scenarios with a large, less-expressive, single-speaker training set. Speech quality, target-speaker similarity, expressiveness, and robustness remain unsatisfactory for synthetic speech covering different speakers and various styles, especially in real-world low-resource conditions, e.g., when each speaker has only a few samples at hand. Current open solutions are also not robust to unseen speakers. We call this challenging task multi-speaker multi-style voice cloning (M2VoC).
Recent advances in transfer learning, style transfer, speaker embedding and factor disentanglement have shed light on the potential solutions to low-resource voice cloning.
As an ICASSP 2021 Signal Processing Grand Challenge, the M2VoC challenge aims to provide a sizable common dataset as well as a fair testbed for benchmarking the voice cloning task. We highly encourage researchers from both academia and industry to join the challenge and engage in in-depth discussion and collaboration.
Further details: http://challenge.ai.iqiyi.com/detail?raceId=5fb2688224954e0b48431fe0
In today’s digital age, network security is critical, as billions of computers around the world are connected to one another over networks. Symantec’s Internet Security Threat Report indicates a 56% increase in the number of network attacks in 2019. Network anomaly detection (NAD) attempts to detect anomalous network traffic by observing traffic data over time, establishing what “normal” traffic looks like, and picking out potentially anomalous behavior that deviates from it.
Signature-based or rule-based NAD is conventionally employed to identify anomalous behaviors and can generally be divided into two categories based on the detection principle: (1) flow-based methods analyze a network connection session, which may include the connection protocol, connection time, the total number of packets sent, and so forth; (2) packet-based methods analyze the content of each packet. However, signatures and rules are fundamentally insufficient for network threat detection, because they can deal only with known attacks, and the differences between anomalous behavior and normal traffic are often subtle.
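To make the flow-based view concrete, a minimal sketch (with hypothetical packet fields, not the challenge's actual data format) of summarizing one connection session as a fixed-length feature vector might look like:

```python
# Sketch of flow-based feature extraction (hypothetical representation).
# Each packet is (timestamp_seconds, size_bytes); the output is the kind of
# session-level statistics a flow-based detector would consume.

def flow_features(packets):
    """Summarize one connection session as a feature vector."""
    timestamps = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    duration = max(timestamps) - min(timestamps)
    total_packets = len(packets)
    total_bytes = sum(sizes)
    mean_size = total_bytes / total_packets
    return [duration, total_packets, total_bytes, mean_size]

session = [(0.00, 60), (0.05, 1500), (0.10, 1500), (0.90, 60)]
print(flow_features(session))  # [0.9, 4, 3120, 780.0]
```

A packet-based method, by contrast, would inspect the payload bytes of each packet individually rather than aggregate statistics over the session.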
In recent years, deep learning methods have received much attention, since deep neural networks can learn complex anomaly patterns directly from network traffic data. However, real-world network traffic data exhibit properties such as large scale, noisy labels, and class imbalance, which challenge deep learning algorithms. For example, anomalies are rare and the vast majority of traffic is normal (anomalies typically account for only 0.001-1% of the data), and learning from such imbalanced data is still an open challenge.
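One common (though by no means the only) way to cope with such imbalance is to reweight the training loss inversely to class frequency, so that rare anomalies contribute as much to the gradient as abundant normal traffic. A minimal sketch, assuming a binary normal/anomaly labeling:

```python
import math

def class_weights(labels):
    """Inverse-frequency weights: rare classes get proportionally larger weight."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}

def weighted_bce(p, y, w):
    """Binary cross-entropy for one sample, scaled by its class weight."""
    return -w[y] * (math.log(p) if y == 1 else math.log(1.0 - p))

# 999 normal (0) samples vs. 1 anomalous (1) sample:
labels = [0] * 999 + [1]
w = class_weights(labels)
print(w[1] / w[0])  # ~999: misclassifying the anomaly costs ~1000x more
```

Other standard remedies, such as over-sampling the minority class or anomaly-score thresholding, address the same issue from the data side rather than the loss side.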
Therefore, the ZYELL-NCTU Network Anomaly Detection Challenge is a joint activity between research teams from the ZYELL group and National Chiao Tung University. In this challenge, we release a million-scale dataset of real-world network traffic for anomaly detection and aim to leverage solutions across the industrial and academic communities to help advance the field of network security.
Further details: https://nad2021.nctu.edu.tw/index.html
Novel Coronavirus (COVID-19) has overwhelmed more than 200 countries around the world, affecting millions and claiming more than 1.5 million human lives since its first emergence in late 2019. This highly contagious disease spreads easily and, if not controlled in a timely fashion, can rapidly incapacitate healthcare systems.
The main objective of the 2021 IEEE SPGC-COVID is the development of fully automated frameworks to identify/classify COVID-19 infections using only volumetric chest CT scans. The introduced SPGC-COVID dataset is a large dataset of COVID-19, community acquired pneumonia (CAP), and normal cases acquired with various imaging settings from different medical centers. The challenge is to design advanced and robust learning models that classify the given CT scans into the three classes of COVID-19, CAP, and normal. The developed learning models need to perform accurately and robustly over such a heterogeneous set of CT scans, which includes images with different slice thickness, radiation dose, and noise level. In addition to acquisition and visual variations, the SPGC-COVID dataset contains CT scans that, besides COVID-19 infections, include manifestations related to heart problems/operations.
Any team can participate in the competition and should complete their submission by March 1st, 2021. The five best teams will be selected and announced by May 25th, 2021. Three finalist teams will be judged at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2021, which will be held June 6-12, 2021, in Toronto, Canada. In addition to algorithmic performance, demonstration and presentation quality will also affect the final ranking.
Further details: http://i-sip.encs.concordia.ca/2021SPGC-COVID19/index.html
The ICASSP 2021 Acoustic Echo Cancellation Challenge is intended to stimulate research in the area of acoustic echo cancellation (AEC), which is an important part of speech enhancement and still a top issue in audio communication and conferencing systems. Many recent AEC studies report reasonable performance on synthetic datasets where the train and test samples come from the same underlying distribution. However, the AEC performance often degrades significantly on real recordings. Also, most of the conventional objective metrics such as echo return loss enhancement (ERLE) and perceptual evaluation of speech quality (PESQ) do not correlate well with subjective speech quality tests in the presence of background noise and reverberation found in realistic environments.
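For reference, ERLE measures how much an echo canceller attenuates the echo during far-end single talk. A minimal sketch of the standard computation, assuming the microphone signal and the AEC residual are time-aligned sample sequences:

```python
import math

def erle_db(mic, residual):
    """Echo return loss enhancement in dB: 10*log10(mic power / residual power).
    Meaningful during far-end single talk, when the mic picks up only echo."""
    p_mic = sum(s * s for s in mic)
    p_res = sum(s * s for s in residual)
    return 10.0 * math.log10(p_mic / p_res)

mic = [0.5, -0.3, 0.4, -0.2]
residual = [s * 0.1 for s in mic]  # AEC attenuated the echo amplitude 10x
print(round(erle_db(mic, residual), 1))  # 20.0 dB
```

As the challenge description notes, such signal-level metrics can diverge from subjective quality in realistic noisy, reverberant conditions, which motivates the P.808-based evaluation below.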
In this challenge, we open source two large datasets to train AEC models under both single talk and double talk scenarios. These datasets consist of recordings from more than 2,500 real audio devices and human speakers in real environments, as well as a synthetic dataset. We open source an online subjective test framework based on ITU-T P.808 for researchers to quickly test their results. The winners of this challenge will be selected based on the average P.808 Mean Opinion Score (MOS) achieved across all different single talk and double talk scenarios.
Please use Microsoft Conference Management Toolkit for submitting the results. After logging in, complete the following steps to submit the results:
Submission deadline: Oct 9, 2020, 11:59pm (anywhere on Earth)
For questions, please contact firstname.lastname@example.org
The ICASSP 2021 Deep Noise Suppression (DNS) challenge is designed to foster innovation in the field of noise suppression to achieve superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH 2020, where we open sourced training and test datasets for researchers to train their noise suppression models. We also open sourced a subjective evaluation framework and used the tool to evaluate and pick the final winners. Many researchers from academia and industry made significant contributions to push the field forward. The results of the INTERSPEECH DNS Challenge show that we still have a long way to go in achieving superior speech quality in challenging noisy conditions. In this challenge, we will be adding over 20 hours of clean speech with singing and providing more information about the characteristics of the noise based on stationarity. We will also provide over 100,000 synthetic and real room impulse responses (RIRs) curated from other datasets.
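Training data for noise suppression of this kind is typically synthesized by mixing clean speech with noise at a chosen signal-to-noise ratio (and optionally convolving with an RIR first). A minimal sketch of the SNR-controlled mixing step, assuming equal-length sample sequences (not the challenge's actual data-generation scripts):

```python
import math

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so the mixture has the requested speech-to-noise ratio."""
    p_speech = sum(s * s for s in speech)
    p_noise = sum(n * n for n in noise)
    # Gain such that the scaled noise power equals p_speech / 10^(snr_db/10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]

speech = [0.4, -0.5, 0.3, -0.2]
noise = [0.1, 0.1, -0.1, 0.1]
noisy = mix_at_snr(speech, noise, snr_db=10.0)  # mixture at 10 dB SNR
```

The model is then trained to recover the clean speech from such mixtures, with RIR convolution adding the reverberation found in real rooms.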
We will have two tracks for this challenge:
Participants are forbidden from using the blind test set to retrain or tweak their models. They must not submit clips enhanced using any speech enhancement method that is not being submitted to ICASSP 2021 by the authors. Failing to adhere to these rules will lead to disqualification from the challenge.
Please send an email to email@example.com stating that you are interested to participate in the challenge. Please include the following details in your email:
The top three winning teams from each track will be awarded prizes as outlined in the description of the rules.
Please email us if you have any questions or need clarification about any aspect of the challenge.