|| SPEAKING RATE AND TONAL REALIZATION IN MANDARIN CHINESE: WHAT CAN WE LEARN FROM LARGE SPEECH CORPORA?
||Jiahong Yuan, Kenneth Church, Baidu Research, United States|
|Session||SPE-29: Speech Processing 1: Production|
|Session Time:||Wednesday, 09 June, 16:30 - 17:15|
|Presentation Time:||Wednesday, 09 June, 16:30 - 17:15|
|| Speech Processing: [SPE-SPRD] Speech Production|
|| Two Mandarin speech corpora were used to investigate tonal realization in terms of duration and pitch. The data consist of nearly 1000 hours of speech from more than 1600 speakers. The two corpora, both developed for ASR, differ in speaking rate by approximately 25%. This provides an opportunity to examine the influence of speaking rate on the realization of tones in natural speech. Our analysis found two differences for slower speaking rates: (1) lower "static" tones and (2) more change for "dynamic" tones. Tone 1 was higher and Tone 3 was lower on the first syllable of disyllabic words, suggesting a metrical structure of left-prominence. On the other hand, the second syllable was longer, and the slope of Tone 2 and Tone 4 was higher on the second syllable in one of the corpora, both of which suggest right-prominence. We also found a shift from right-prominence to left-prominence, with respect to the realization of the "dynamic" tones, when the speaking rate became slower. Our study demonstrated that both phrasing and metrical structure play an important role in tonal realization.
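The two quantities the abstract relies on, speaking rate and the slope of a "dynamic" tone, can be illustrated with a minimal Python sketch. The abstract does not say how either was measured, so everything here is an assumption made for illustration: speaking rate is taken as syllables per second, the slope is a least-squares line fit through F0 samples in semitones, and the function names `speaking_rate`, `relative_rate_difference`, and `f0_slope` are hypothetical.

```python
from statistics import mean


def speaking_rate(n_syllables, duration_s):
    """Speaking rate in syllables per second (assumed definition)."""
    return n_syllables / duration_s


def relative_rate_difference(rate_fast, rate_slow):
    """Fractional difference of the faster rate relative to the slower one."""
    return (rate_fast - rate_slow) / rate_slow


def f0_slope(times_s, f0_semitones):
    """Least-squares slope of an F0 contour, in semitones per second.

    Assumes F0 has already been extracted and converted to semitones;
    this is an illustrative measure, not necessarily the authors' one.
    """
    t_bar = mean(times_s)
    f_bar = mean(f0_semitones)
    num = sum((t - t_bar) * (f - f_bar) for t, f in zip(times_s, f0_semitones))
    den = sum((t - t_bar) ** 2 for t in times_s)
    return num / den


# Toy numbers, not data from the paper.
fast = speaking_rate(10, 2.0)   # 10 syllables in 2.0 s
slow = speaking_rate(8, 2.0)    # 8 syllables in 2.0 s
print(relative_rate_difference(fast, slow))
print(f0_slope([0.0, 0.1, 0.2], [0.0, 1.0, 2.0]))
```

With a measure like this, a rising Tone 2 yields a positive slope and a falling Tone 4 a negative one, so comparing "more change" across corpora would likely use the magnitude of the slope rather than its signed value.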