Recently, we barely scratched the surface of characterizing fricative phonemes. Now, how can we characterize nasal consonants acoustically? A good answer to this question would require plenty of explanation. I suggest you check pages 487-514 of Acoustic Phonetics by Professor Kenneth Stevens. But I’ll provide you with some hints, anyway. Nasal consonants are sonorant phonemes, but they exhibit significant losses due to the nasal tract coupling. Further, nasal spectra are relatively stable during the oral tract closure (acoustic changes are minimal). Typically, F1 is located near 250 Hz, F2 is weak, and F3 is near 2 kHz. Remember that for these phonemes the acoustic energy also travels through the nasal cavities. These cavities have their own frequency properties. But the oral tract, albeit closed, also alters the acoustic transfer function. For simple phonemes such as vowels, this transfer function includes only poles. However, when the oral tract is closed, it acts as a side branch of the main acoustic path, and the transfer function also includes zeros. And that changes the output a great deal. The location of the first spectral zero of nasal consonants depends on the point of oral closure (for instance, the point of closure for /m/ is more anterior than that for /n/).
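To make the pole/zero remark a bit more concrete, here is a minimal sketch in Python (NumPy only). It compares an all-pole transfer function with the same function after adding one pair of zeros. Everything numeric in it is an assumption for illustration: the sampling rate, the pole frequencies and bandwidths, and especially the zero at 800 Hz are made-up values, not measurements of any real nasal. The helper names `second_order_section` and `magnitude_db` are hypothetical too.

```python
import numpy as np

FS = 10_000  # sampling rate in Hz (assumed for this sketch)

def second_order_section(freq_hz, bandwidth_hz, fs=FS):
    """Coefficients of 1 - 2 r cos(theta) z^-1 + r^2 z^-2 for a conjugate root pair."""
    r = np.exp(-np.pi * bandwidth_hz / fs)   # root radius from the bandwidth
    theta = 2.0 * np.pi * freq_hz / fs       # root angle from the center frequency
    return np.array([1.0, -2.0 * r * np.cos(theta), r * r])

def magnitude_db(b, a, freqs_hz, fs=FS):
    """Magnitude of B(z)/A(z) in dB, evaluated on the unit circle."""
    z_inv = np.exp(-2j * np.pi * np.asarray(freqs_hz, dtype=float) / fs)
    num = np.polyval(b[::-1], z_inv)         # b[0] + b[1] z^-1 + b[2] z^-2 + ...
    den = np.polyval(a[::-1], z_inv)
    return 20.0 * np.log10(np.abs(num / den))

# Denominator: three resonances (poles). Illustrative values only.
a = np.array([1.0])
for f_pole, bw in [(250.0, 100.0), (1200.0, 300.0), (2000.0, 200.0)]:
    a = np.convolve(a, second_order_section(f_pole, bw))

b_all_pole = np.array([1.0])                      # vowel-like case: no zeros
b_pole_zero = second_order_section(800.0, 100.0)  # hypothetical zero pair at 800 Hz

freqs = np.linspace(0.0, FS / 2, 501)             # 10 Hz grid up to Nyquist
all_pole_db = magnitude_db(b_all_pole, a, freqs)
pole_zero_db = magnitude_db(b_pole_zero, a, freqs)

# The difference isolates what the zero pair does to the spectrum.
difference_db = pole_zero_db - all_pole_db
dip_hz = freqs[np.argmin(difference_db)]
print(f"Deepest notch added by the zero pair: {dip_hz:.0f} Hz")
```

Moving the hypothetical zero up or down in frequency moves the notch with it, which is the kind of change the point of oral closure would produce in the output spectrum.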
Is it possible to invert fricatives by using Childers’ Toolboxes?
At first sight, I think the answer is that you can’t. IIRC, Childers’ toolbox allowed for inversion of the sentence “we were away a year ago”. But that’s a very convenient sentence to invert, because most of its relevant acoustic information can be clearly seen with a formant analysis. That’s not the case for fricatives, however (and nasals, for instance, have other interesting problems too).
For my thesis, I developed my own inversion toolbox. But no matter the toolbox, you require a “source” of information for inversion. That information may be spectral energy distribution, formants, etc. For fricatives, formants are out of the question. The spectrum of a fricative differs substantially from that of a voiced phoneme, as you know. When we utter fricatives, the oral tract naturally adopts a specific “constriction” configuration… and such a configuration would yield a formant structure. The problem is that the turbulence generated in the oral tract hides the resonances, and that’s why formant tracking is misleading in such cases.
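As a rough illustration of what a “spectral energy distribution” source could look like, here is a small Python sketch (NumPy only). It is not the toolbox from my thesis, just one plausible way to describe a noisy frame without formants: the fraction of energy per frequency band, plus the spectral centroid. The sampling rate, the number of bands, and the two synthetic signals (`hiss` and `hush`, plain filtered noise, not real fricatives) are assumptions for the example.

```python
import numpy as np

def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame."""
    windowed = frame * np.hanning(len(frame))
    return np.abs(np.fft.rfft(windowed)) ** 2

def band_energy_fractions(frame, n_bands=8):
    """Fraction of total energy in n_bands roughly equal-width bands, 0 to fs/2."""
    power = power_spectrum(frame)
    energy = np.array([band.sum() for band in np.array_split(power, n_bands)])
    return energy / energy.sum()

def spectral_centroid(frame, fs):
    """Power-weighted mean frequency of the frame, in Hz."""
    power = power_spectrum(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    return float(np.sum(freqs * power) / np.sum(power))

FS = 16_000                      # sampling rate in Hz (assumed)
rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
noise = rng.standard_normal(2048)

# Two synthetic stand-ins for noise with different spectral tilt.
hiss = np.diff(noise)                                      # first difference: high-frequency emphasis
hush = np.convolve(noise, np.ones(4) / 4.0, mode="valid")  # moving average: low-frequency emphasis

for name, frame in [("hiss", hiss), ("hush", hush)]:
    fractions = np.round(band_energy_fractions(frame), 2)
    print(name, f"centroid = {spectral_centroid(frame, FS):.0f} Hz", fractions)
```

Neither descriptor needs a pitch or formant estimate, which is why this kind of measure stays usable when turbulence masks the resonances.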
A list of important documents for Articulatory Speech Synthesis and Inversion research.
The articulatory approach is a very captivating research topic, but it’s relatively hard, and it builds on a hefty amount of multidisciplinary literature and results. The germane papers and books are somewhat old or difficult to find. This is my list of selected resources:
- Gunnar Fant – Acoustic Theory of Speech Production
- James L. Flanagan – Speech Analysis, Synthesis and Perception
- Kenneth N. Stevens – Acoustic Phonetics
- Paul Boersma – Functional Phonology
- J. M. Pickett – The Acoustics of Speech Communication
- D. G. Childers – Speech Processing and Synthesis Toolboxes
- A. Seikel, D. King and D. Drumright – Anatomy and Physiology for Speech, Language and Hearing. Lovely book.