Decoding speech sounds from neurophysiological data: Practical considerations and theoretical implications

McCall E. Sarrett; Joseph C. Toscano

doi:10.1111/psyp.14475

Journal article

Peer reviewed

Psychophysiology, Vol.61(4), pp.e14475-n/a

04/2024

PMID: 37947235

Abstract

Machine learning techniques have proven to be a useful tool in cognitive neuroscience. However, their implementation in scalp-recorded electroencephalography (EEG) is relatively limited. To address this, we present three analyses using data from a previous study that examined event-related potential (ERP) responses to a wide range of naturally-produced speech sounds. First, we explore which features of the EEG signal best maximize machine learning accuracy for a voicing distinction, using a support vector machine (SVM). We manipulate three dimensions of the EEG signal as input to the SVM: number of trials averaged, number of time points averaged, and polynomial fit. We discuss the trade-offs in using different feature sets and offer some recommendations for researchers using machine learning. Next, we use SVMs to classify specific pairs of phonemes, finding that we can detect differences in the EEG signal that are not otherwise detectable using conventional ERP analyses. Finally, we characterize the timecourse of phonetic feature decoding across three phonological dimensions (voicing, manner of articulation, and place of articulation), and find that voicing and manner are decodable from neural activity, whereas place of articulation is not. This set of analyses both addresses practical considerations in applying machine learning to EEG, particularly for speech studies, and sheds light on current issues regarding the nature of perceptual representations of speech.

This work is a methodological contribution to the growing field of applying machine learning techniques to EEG data. We compare different approaches to EEG decoding and offer recommendations for best practices to researchers who are interested in using classification analyses. Then, we demonstrate how these techniques can be applied to questions in spoken language processing, providing theoretical insights into the nature of perceptual representations of speech that conventional analysis methods have been unable to address.
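The three input dimensions named in the abstract (number of trials averaged, number of time points averaged, and polynomial fit) can be illustrated with a minimal sketch. This record does not describe how the authors implemented these steps, so everything below is an assumption for illustration: the function names are hypothetical, the data are synthetic noise, and each epoch is treated as a single-channel array of shape (n_trials, n_times).

```python
import numpy as np


def average_trials(epochs, n_avg):
    """Average non-overlapping groups of n_avg trials.

    epochs: array (n_trials, n_times) -> (n_trials // n_avg, n_times).
    Leftover trials that do not fill a complete group are dropped.
    """
    n_groups = epochs.shape[0] // n_avg
    trimmed = epochs[: n_groups * n_avg]
    return trimmed.reshape(n_groups, n_avg, -1).mean(axis=1)


def average_time_points(epochs, n_avg):
    """Average non-overlapping windows of n_avg consecutive time points.

    epochs: array (n_trials, n_times) -> (n_trials, n_times // n_avg).
    Leftover samples at the end of each epoch are dropped.
    """
    n_bins = epochs.shape[1] // n_avg
    trimmed = epochs[:, : n_bins * n_avg]
    return trimmed.reshape(epochs.shape[0], n_bins, n_avg).mean(axis=2)


def polynomial_features(epochs, degree):
    """Fit one polynomial per trial and return its coefficients.

    Time is rescaled to [-1, 1] for numerical stability (an assumption).
    Returns an array (n_trials, degree + 1), highest-order term first.
    """
    t = np.linspace(-1.0, 1.0, epochs.shape[1])
    coeffs = np.polyfit(t, epochs.T, degree)  # shape (degree + 1, n_trials)
    return coeffs.T


# Synthetic placeholder data: 40 trials x 100 time points of Gaussian noise.
rng = np.random.default_rng(0)
epochs = rng.normal(size=(40, 100))

pseudo_trials = average_trials(epochs, n_avg=4)          # shape (10, 100)
binned = average_time_points(pseudo_trials, n_avg=10)    # shape (10, 10)
poly = polynomial_features(pseudo_trials, degree=3)      # shape (10, 4)
```

Either `binned` or `poly` could then serve as the feature matrix for a classifier such as scikit-learn's `sklearn.svm.SVC`. Averaging more trials tends to raise signal-to-noise but leaves fewer samples for training, which is one form of the trade-off the abstract mentions.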

Life Sciences & Biomedicine

Neurosciences

Neurosciences & Neurology

Physiology

Psychology

Psychology, Biological

Psychology, Experimental

Science & Technology

Social Sciences

Details

Title: Decoding speech sounds from neurophysiological data: Practical considerations and theoretical implications
Creators: McCall E. Sarrett - Villanova University; Joseph C. Toscano - Villanova University
Resource Type: Journal article
Publication Details: Psychophysiology, Vol.61(4), pp.e14475-n/a
DOI: 10.1111/psyp.14475
PMID: 37947235
NLM abbreviation: Psychophysiology
ISSN: 0048-5772
eISSN: 1540-5958
Publisher: Wiley
Number of pages: 23
Grant note: National Science Foundation (NSF) 2018933; National Science Foundation (NSF) 1945069 / This material is based upon work supported by the National Science Foundation under Grant No. 1945069. This work used the Augie High-Performance Computing cluster, funded by NSF Grant No. 2018933, at Villanova University. We would like to thank Bob McMu
Language: English
Date published: 04/2024
Academic Unit: Psychological and Brain Sciences
Record Identifier: 9984627197302771
