Novel learning-based multi-modal methods for human voice apparatus imaging
Details
- Title: Subtitle
- Novel learning-based multi-modal methods for human voice apparatus imaging
- Creators
- Rushdi Zahid Rusho
- Contributors
- Sajan Goud Lingala (Advisor); Sean B. Fain (Committee Member); Mathews Jacob (Committee Member); David Meyer (Committee Member); Sarah C. Vigmostad (Committee Member)
- Resource Type
- Dissertation
- Degree Awarded
- Doctor of Philosophy (PhD), University of Iowa
- Degree in
- Biomedical Engineering
- Date degree season
- Autumn 2024
- DOI
- 10.25820/etd.007602
- Publisher
- University of Iowa
- Number of pages
- xx, 103 pages
- Copyright
- Copyright 2024 Rushdi Zahid Rusho
- Grant note
- This research was conducted using an MRI instrument funded by the NIH under grant 1S10OD025025-01. Additionally, this research was partially supported by the National Institutes of Health Predoctoral Training Grant T32 HL 144461, with PIs Eric A. Hoffman and Joseph M. Reinhardt; the University of Iowa Radiology Pilot Grant; and the University of Iowa OVPR Jump Start Award.
- Language
- English
- Date submitted
- 09/12/2024
- Description illustrations
- illustrations, graphs
- Description bibliographic
- Includes bibliographical references (pages 86-103).
- Public Abstract (ETD)
Human speech production is a complex process that involves generating vibrations at the vocal folds and modulating breath through vocal tract shaping to produce meaningful sounds, primarily for communication and social interaction. The vocal apparatus refers to the human vocal tract anatomy responsible for sound generation, and consists of several structures such as the vocal cords, tongue, lips, and nose. Disorders in any of these components can compromise one’s ability to produce effective language and communication. Visualizing the complex movements of the vocal apparatus is crucial to advance our understanding of the speech production mechanism, diagnose speech disorders, optimize speech therapy, aid in planning surgical procedures on speech organs, and improve various technologies related to speech synthesis and recognition. However, current imaging modalities have several technical limitations. X-ray Computed Tomography (CT) can clearly detect air-tissue boundaries and bony structures, but it is limited by radiation exposure. Ultrasound can safely image rapidly moving organs, but cannot capture deeper structures and has poor image quality for soft tissues. Electromagnetic articulography (EMA) and electropalatography (EPG) track the movement and position of articulators in real time, but they are invasive and do not provide anatomical views. Optical endoscopy uses a flexible optical fiber with a camera to visualize the nose and larynx, but it invasively deforms the anatomy. Magnetic Resonance Imaging (MRI) is emerging as a powerful modality for dynamic vocal apparatus imaging during speech due to its excellent soft-tissue image quality, lack of radiation exposure, and capability to image along arbitrary plane orientations, but challenges remain for its widespread adoption.
For example, due to the physical working principles of the device, MRI image quality deteriorates at air-tissue boundaries, and MRI struggles to faithfully capture vocal apparatus motion when reconstructing fast, arbitrary speech and breathing tasks. There is, therefore, a critical need for an imaging modality that can safely and comprehensively visualize vocal tract shaping during speech production with high spatial and temporal fidelity. This thesis develops novel vocal apparatus MRI techniques for speech and voice production tasks on scanners with 3 Tesla field strength. Specifically, this thesis has four components related to imaging various aspects of the vocal apparatus: (1) Development and evaluation of a novel MRI method for speech imaging with improved image quality and motion capture capabilities; (2) Development of a novel technique using MRI that provides a three-dimensional view of vocal tract shaping during speech; (3) Feasibility study and development of novel MRI methods for imaging the larynx to visualize overall changes in vocal fold configuration during speech and breathing; (4) Development of a framework to combine bony structures from ultra-low dose (very low radiation exposure) CT and soft-tissue structures from MRI to create a three-dimensional, high image quality hybrid CT-MRI model of the vocal tract. The effectiveness of our methods was validated using several experimental datasets, blinded image quality analysis by experts, and in-vivo experiments in a range of speech science applications.
- Academic Unit
- Roy J. Carver Department of Biomedical Engineering
- Record Identifier
- 9984774766902771