Screening Voice Disorders: Acoustic Voice Quality Index, Cepstral Peak Prominence, and Machine Learning

Ahmed M Yousef; Adrián Castillo-Allendes; Mark L Berardi; Juliana Codino; Adam D Rubin; Eric J Hunter

doi:10.1159/000544852

Journal article

Open access

Peer reviewed

Folia phoniatrica et logopaedica, Vol.77(5), pp.480-494

10/2025

DOI: 10.1159/000544852

PMCID: PMC12353333

PMID: 39987907

Abstract

Introduction: The Acoustic Voice Quality Index (AVQI) and smoothed Cepstral Peak Prominence (CPPs) have been reported to effectively support the assessment of voice quality in persons seeking voice care across many languages. This study aims to evaluate the diagnostic accuracy of these two measures in detecting voice disorders in American English speakers, comparing their performance to machine learning (ML) models.

Methods: This retrospective study included a cohort of 187 participants: 138 patients with clinically diagnosed voice disorders and 49 vocally healthy individuals. Each participant completed two voicing tasks, sustaining the vowel [a:] and producing a running speech sample, which were then concatenated. These samples were analyzed using VOXplot software for AVQI-3 (version 03.01) and CPPs. Additionally, four ML models (Random Forest (RF), k-Nearest Neighbors (k-NN), Support Vector Machine (SVM), and Decision Tree (DT)) were trained for comparison. The diagnostic accuracy of the two measures and the models was assessed using several evaluation metrics, including the receiver operating characteristic curve and the Youden index.

Results: Cutoff scores of 1.54 for the AVQI-3 (with 55% sensitivity and 80% specificity) and 14.35 dB for CPPs (with 65% sensitivity and 78% specificity) were identified for detecting voice disorders. Compared to an average ML sensitivity of 89% and specificity of 55%, CPPs offered the best balance between sensitivity and specificity, outperforming AVQI-3 and nearly matching the average ML performance.

Conclusions: Machine learning shows great potential for supporting voice disorder diagnostics, especially as models become more generalizable and easier to interpret. However, current tools like AVQI-3 and CPPs remain more practical and accessible for clinical use in evaluating voice quality than commonly implemented models. CPPs, in particular, offers distinct advantages for identifying voice disorders, making it a recommended and feasible choice for clinics with limited resources.
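
The abstract says cutoffs were assessed with the receiver operating characteristic curve and the Youden index, but it does not describe the procedure in detail. The following is a minimal sketch, not the authors' implementation: it assumes the common definition J = sensitivity + specificity - 1 and a simple scan over every observed score as a candidate cutoff. The function names `youden_j` and `best_cutoff`, the `higher_is_disordered` flag, and the toy score values are all hypothetical. The flag reflects the assumption that higher AVQI values but lower CPPs values indicate a more disordered voice.

```python
from typing import Sequence, Tuple


def youden_j(sensitivity: float, specificity: float) -> float:
    """Youden index under its usual definition: J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def best_cutoff(
    scores: Sequence[float],
    labels: Sequence[int],
    higher_is_disordered: bool = True,
) -> Tuple[float, float, float, float]:
    """Try each observed score as a cutoff; return (cutoff, sens, spec, J) with maximal J.

    labels: 1 = voice disorder, 0 = vocally healthy.
    Ties keep the lowest candidate cutoff.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one disordered and one healthy sample")

    best = None
    for cutoff in sorted(set(scores)):
        if higher_is_disordered:
            flagged = [s >= cutoff for s in scores]  # AVQI-like direction (assumed)
        else:
            flagged = [s <= cutoff for s in scores]  # CPPs-like direction (assumed)
        tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
        tn = sum(1 for f, y in zip(flagged, labels) if not f and y == 0)
        sens, spec = tp / n_pos, tn / n_neg
        j = youden_j(sens, spec)
        if best is None or j > best[3]:
            best = (cutoff, sens, spec, j)
    return best


# Youden index at the operating points reported in the abstract.
reported = {"AVQI-3": (0.55, 0.80), "CPPs": (0.65, 0.78), "ML average": (0.89, 0.55)}
for name, (sens, spec) in reported.items():
    print(f"{name}: J = {youden_j(sens, spec):.2f}")

# Made-up toy CPPs values in dB (NOT study data), only to show the cutoff scan.
toy_cpps = [10.2, 12.5, 13.9, 14.1, 15.0, 15.8, 16.4, 17.1]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
print(best_cutoff(toy_cpps, toy_labels, higher_is_disordered=False))
```

Under this definition the reported operating points give J of about 0.35 for AVQI-3, 0.43 for CPPs, and 0.44 for the ML average, which is consistent with the statement that CPPs outperformed AVQI-3 and nearly matched the average ML performance.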

Voice disorders

Machine learning

Speech acoustics

Acoustic Voice Quality Index

Cepstral Peak Prominence

Details

Title: Screening Voice Disorders: Acoustic Voice Quality Index, Cepstral Peak Prominence, and Machine Learning
Creators: Ahmed M Yousef
Adrián Castillo-Allendes
Mark L Berardi
Juliana Codino
Adam D Rubin
Eric J Hunter
Resource Type: Journal article
Publication Details: Folia phoniatrica et logopaedica, Vol.77(5), pp.480-494
DOI: 10.1159/000544852
PMID: 39987907
PMCID: PMC12353333
NLM abbreviation: Folia Phoniatr Logop
ISSN: 1421-9972
eISSN: 1421-9972
Publisher: KARGER
Grant note: National Institute of Deafness and Other Communication Disorders of the National Institutes of Health: R01DC012315
This work was supported by the National Institute of Deafness and Other Communication Disorders of the National Institutes of Health (Award No. R01DC012315). The funder had no role in the design, data collection, data analysis, and reporting of this study. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Language: English
Electronic publication date: 02/21/2025
Date published: 10/2025
Academic Unit: Communication Sciences and Disorders; Teaching and Learning; Otolaryngology
Record Identifier: 9984795372902771
