Abstract
To synthesize natural-sounding speech with voice quality variations, we propose a concatenative synthesis method based on stored formant/antiformant templates of vowel-consonant-vowel (VCV) segments and on sophisticated control of voice source parameters. Using the parametric Rosenberg-Klatt (RK) model to generate the voiced source waveform and an autoregressive exogenous (ARX) model to represent the voiced speech production process, we have devised a new adaptive pitch-synchronous analysis method that estimates the model parameters from which the templates are semiautomatically created. The Kalman filter algorithm handles the ARX model identification, and a simulated annealing method performs the nonlinear optimization that estimates the voice source parameters. The method has been tested on synthetic speech sounds by comparing it with several other approaches in terms of the accuracy of the estimated parameter values. Preliminary synthesis experiments have shown that natural-sounding speech with various voice qualities can be generated with the proposed method by manipulating the voice source parameters.
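As a rough illustration of the source-filter structure the abstract describes, the sketch below generates one pitch period of an RK-style glottal flow derivative and passes it through an all-pole recursion driven by that source. It covers only the forward (synthesis) direction, not the Kalman-filter identification or the simulated annealing search. The polynomial form and its coefficients follow the commonly cited Klatt formulation and are an assumption here, since the abstract gives no equations. The function names, the parameter names (`av`, `oq`), the formant values, and the omission of antiformant (exogenous-input) coefficients and spectral tilt are all illustrative choices, not taken from the paper.

```python
import math

def rk_glottal_derivative(f0, fs, av=1.0, oq=0.6):
    """One pitch period of an RK-style glottal flow derivative (assumed form).

    The flow is modeled as u(t) = a*t**2 - b*t**3 during the open phase
    (0 <= t < oq*T0) and zero afterwards; this returns du/dt sampled at fs.
    """
    t0 = 1.0 / f0                          # pitch period in seconds
    n_period = int(round(fs * t0))         # samples per pitch period
    open_len = oq * t0                     # duration of the open phase
    a = 27.0 * av / (4.0 * oq ** 2 * t0)
    b = 27.0 * av / (4.0 * oq ** 3 * t0 ** 2)
    out = []
    for n in range(n_period):
        t = n / fs
        out.append(2.0 * a * t - 3.0 * b * t * t if t < open_len else 0.0)
    return out

def resonator(freq, bw, fs):
    """AR coefficients [a1, a2] of a two-pole resonator (one formant)."""
    r = math.exp(-math.pi * bw / fs)       # pole radius from bandwidth
    return [-2.0 * r * math.cos(2.0 * math.pi * freq / fs), r * r]

def arx_synthesize(source, ar_coeffs):
    """All-pole recursion y[n] = u[n] - sum_i a_i * y[n-i].

    The source u[n] plays the role of the exogenous input; antiformant
    terms (weights on past inputs) are left out of this sketch.
    """
    y = []
    for n, u in enumerate(source):
        acc = u
        for i, ai in enumerate(ar_coeffs, start=1):
            if n - i >= 0:
                acc -= ai * y[n - i]
        y.append(acc)
    return y

# Illustrative values only: 100 Hz pitch, 16 kHz sampling, one 500 Hz formant.
source = rk_glottal_derivative(f0=100.0, fs=16000.0)
speech = arx_synthesize(source, resonator(500.0, 80.0, 16000.0))
```

Under these assumptions, lowering or raising `oq` changes the shape of the source pulse while the filter coefficients stay fixed, which is the kind of voice-source manipulation the abstract refers to.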
Original language | English |
---|---|
Pages | 159-162 |
Number of pages | 4 |
Publication status | Published - 1994 |
Externally published | Yes |
Event | 3rd International Conference on Spoken Language Processing, ICSLP 1994 - Yokohama, Japan. Duration: 1994 Sept 18 → 1994 Sept 22 |
Conference
Conference | 3rd International Conference on Spoken Language Processing, ICSLP 1994 |
---|---|
Country/Territory | Japan |
City | Yokohama |
Period | 94/9/18 → 94/9/22 |
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language