SPE-30: Speech Processing 2: General Topics |
| Session Type: Poster |
| Time: Wednesday, 9 June, 16:30 - 17:15 |
| Location: Gather.Town |
| Session Chair: Vikramjit Mitra, Apple Inc. |
| SPE-30.1: HUMANACGAN: CONDITIONAL GENERATIVE ADVERSARIAL NETWORK WITH HUMAN-BASED AUXILIARY CLASSIFIER AND ITS EVALUATION IN PHONEME PERCEPTION |
| Yota Ueda; University of Tokyo |
| Kazuki Fujii; National Institute of Technology, Tokuyama College |
| Yuki Saito; University of Tokyo |
| Shinnosuke Takamichi; University of Tokyo |
| Yukino Baba; University of Tsukuba |
| Hiroshi Saruwatari; University of Tokyo |
| SPE-30.2: IMPROVING AUDIO ANOMALIES RECOGNITION USING TEMPORAL CONVOLUTIONAL ATTENTION NETWORK |
| Qiang Huang; University of Sheffield |
| Thomas Hain; University of Sheffield |
| SPE-30.3: GENERATIVE SPEECH CODING WITH PREDICTIVE VARIANCE REGULARIZATION |
| W Bastiaan Kleijn; Victoria University of Wellington |
| Andrew Storus; Google |
| Michael Chinen; Google |
| Tom Denton; Google |
| Felicia S. C. Lim; Google |
| Alejandro Luebs; Google |
| Jan Skoglund; Google |
| Hengchin Yeh; Google |
| SPE-30.4: HOW TO MAKE TEXT-TO-SPEECH SYSTEM PRONOUNCE “VOLDEMORT”: AN EXPERIMENTAL APPROACH OF FOREIGN WORD PHONEMIZATION IN VIETNAMESE |
| Dang-Khoa Mac; Vingroup Big Data Institute |
| Van-Huy Nguyen; Vingroup Big Data Institute |
| Dinh-Nghi Nguyen; Vingroup Big Data Institute |
| Kim-Anh Nguyen; Vingroup Big Data Institute |
| SPE-30.5: HOW SIMILAR OR DIFFERENT IS RAKUGO SPEECH SYNTHESIZER TO PROFESSIONAL PERFORMERS? |
| Shuhei Kato; National Institute of Informatics |
| Yusuke Yasuda; National Institute of Informatics |
| Xin Wang; National Institute of Informatics |
| Erica Cooper; National Institute of Informatics |
| Junichi Yamagishi; National Institute of Informatics |
| SPE-30.6: DNSMOS: A NON-INTRUSIVE PERCEPTUAL OBJECTIVE SPEECH QUALITY METRIC TO EVALUATE NOISE SUPPRESSORS |
| Chandan Karadagur Ananda Reddy; Microsoft |
| Vishak Gopal; Microsoft |
| Ross Cutler; Microsoft |
| SPE-30.7: A CAUSAL DEEP LEARNING FRAMEWORK FOR CLASSIFYING PHONEMES IN COCHLEAR IMPLANTS |
| Kevin Chu; Duke University |
| Leslie Collins; Duke University |
| Boyla Mainsah; Duke University |