The articulatory approach is a captivating research topic, but it is relatively hard and builds on a large body of multidisciplinary work. Relevant papers and books are often old or difficult to find. This is my list of selected resources:
Books
- Gunnar Fant – Acoustic Theory of Speech Production
- James L. Flanagan – Speech Analysis, Synthesis and Perception
- Kenneth N. Stevens – Acoustic Phonetics
- Paul Boersma – Functional Phonology
- J. M. Pickett – The Acoustics of Speech Communication
- D. G. Childers – Speech Processing and Synthesis Toolboxes
- A. Seikel, D. King and D. Drumright – Anatomy and Physiology for Speech, Language and Hearing. Lovely book.
Papers, Theses and Slides
- P. Mermelstein – Articulatory Model for the Study of Speech Production. Mermelstein’s paper is a classic and a mandatory read for anyone interested in articulatory synthesis and inversion. It describes the most widely used articulatory model to date. Although 3D articulatory models are the current trend, Mermelstein’s model is still very useful for research.
- Michael Portnoff – A Quasi-one-dimensional Digital Simulation for the Time-varying Vocal Tract
- J. Dang and K. Honda – Construction and Control of a Physiological Articulatory Model
- I. Howard and M. Huckvale – Learning to control an articulatory synthesizer by imitating real speech
- Olov Engwall – Vocal tract modeling in 3D
- R. Sproat, M. Ostendorf and A. Hunt (Editors) – The Need for Increased Speech Synthesis Research
- W. Hess – Artikulatorische und akustische Phonetik (German). This concrete presentation of articulatory and acoustic phonetics by Professor Wolfgang Hess is one of the best on the web.
- Qiguang Lin – Speech Production Theory and Articulatory Speech Synthesis.
A few of my own documents
My research is just a drop compared to the oceans above. Although some of the ideas and assumptions I followed are now dated, someone may find them useful:
- Genetic Learning of Vocal Tract Area Functions for Articulatory Synthesis of Spanish Vowels.
- Multipopulation genetic learning of midsagittal articulatory models for speech synthesis.
- Técnicas de Aprendizaje Artificial Aplicadas al Problema Inverso de la Síntesis Articulatoria de Voz por Computadora (Spanish). My doctoral dissertation: Parte I, Parte II, Parte III.
Finally, please note that this list is by no means complete or comprehensive. Some very important works are missing, such as Flanagan and Ishizaka’s vocal fold modeling, Maeda’s simulation of the vocal tract, Rubin’s description of a synthesizer, Sorokin’s model, and others.