Journal article
Deep Learning for Neuromuscular Control of Vocal Source for Voice Production
Applied sciences, Vol.14(2), p.769
01/01/2024
DOI: 10.3390/app14020769
PMCID: PMC11281313
PMID: 39071945
Abstract
A computational neuromuscular control system that generates lung pressure and three intrinsic laryngeal muscle activations (cricothyroid, thyroarytenoid, and lateral cricoarytenoid) to control the vocal source was developed. In the current study, LeTalker, a biophysical computational model of the vocal system, was used as the physical plant. In the LeTalker, a three-mass vocal fold model was used to simulate self-sustained vocal fold oscillation. A constant /ə/ vowel was used for the vocal tract shape. The trachea was modeled after MRI measurements. The neuromuscular control system generates control parameters to achieve four acoustic targets (fundamental frequency, sound pressure level, normalized spectral centroid, and signal-to-noise ratio) and four somatosensory targets (vocal fold length, and longitudinal fiber stress in the three vocal fold layers). The deep-learning-based control system comprises one acoustic feedforward controller and two feedback (acoustic and somatosensory) controllers. Fifty thousand steady speech signals were generated using the LeTalker for training the control system. The results demonstrated that the control system was able to generate the lung pressure and the three muscle activations such that the four acoustic and four somatosensory targets were reached with high accuracy. After training, the motor command corrections from the feedback controllers were minimal compared to the feedforward controller except for thyroarytenoid muscle activation.
Details
- Title: Subtitle
- Deep Learning for Neuromuscular Control of Vocal Source for Voice Production
- Creators
- Anil Palaparthi - University of Utah
- Rishi K. Alluri - University of Utah
- Ingo R. Titze - University of Utah
- Resource Type
- Journal article
- Publication Details
- Applied sciences, Vol.14(2), p.769
- DOI
- 10.3390/app14020769
- PMID
- 39071945
- PMCID
- PMC11281313
- NLM abbreviation
- Appl Sci (Basel)
- ISSN
- 2076-3417
- eISSN
- 2076-3417
- Publisher
- MDPI
- Number of pages
- 18
- Grant note
- NIH/NIDCD; United States Department of Health & Human Services; National Institutes of Health (NIH) - USA; NIH National Institute on Deafness & Other Communication Disorders (NIDCD)
- Language
- English
- Date published
- 01/01/2024
- Academic Unit
- School of Music; Communication Sciences and Disorders
- Record Identifier
- 9984719754602771