Online processing of acoustic cues used in speech perception: Comparing statistical and neural network models

Joseph Toscano; Bob McMurray

doi:10.1121/1.4782535

The Journal of the Acoustical Society of America, Vol.124(4), pp.2437-2437

10/2008

Abstract

Most phonological contrasts are signaled by multiple acoustic cues, yet it is unclear how these cues are combined during speech perception. Formal computational modeling offers a useful tool for studying this process. Two computational approaches are presented here. The first is a mixture of Gaussians (MOG) model that forms categories and combines cues based on their statistical distributions [Toscano and McMurray, Proceedings of the Cognitive Science Society (2008)]. The second is a neural network model that combines statistical learning and dynamic online processing [McMurray and Spivey, Proceedings of the Chicago Linguistic Society (1999)]. Both the MOG and the network use the statistical distributions of speech sounds to form categories. The MOG offers transparency in that its categories correspond directly to distributional statistics measured from phonetic data. However, it does not capture the online processing observed in behavioral experiments that suggest that the speech system makes preliminary commitments before all cues are available [McMurray, Clayards, Tanenhaus, and Aslin (submitted)]. The network offers an approach that may allow us to observe this processing. Thus, while the MOG may better clarify the relationship between acoustics and phonological categories, the network may better model the process of speech perception.
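The cue-combination idea behind a mixture of Gaussians can be illustrated with a minimal sketch. This is not the authors' implementation: the category names, cue names (`vot_ms`, `vowel_ms`), means, standard deviations, and mixing weights below are hypothetical values chosen only for illustration, and the cues are assumed to be conditionally independent within a category. Each category is one Gaussian per cue, and the posterior over categories is computed from whichever cues have been observed so far.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Hypothetical two-category mixture. Every number here is an illustrative
# assumption, not a value measured from phonetic data or taken from the source.
CATEGORIES = {
    "voiced": {
        "weight": 0.5,
        "cues": {"vot_ms": (0.0, 10.0), "vowel_ms": (200.0, 40.0)},
    },
    "voiceless": {
        "weight": 0.5,
        "cues": {"vot_ms": (50.0, 15.0), "vowel_ms": (170.0, 40.0)},
    },
}


def posterior(observed):
    """Return P(category | observed cues) for a dict of cue name -> value.

    Only the cues present in `observed` contribute, so the function can be
    called with a partial set of cues. Cues are treated as independent
    within each category (a simplifying assumption of this sketch).
    """
    scores = {}
    for name, category in CATEGORIES.items():
        score = category["weight"]
        for cue, value in observed.items():
            mean, sd = category["cues"][cue]
            score *= gaussian_pdf(value, mean, sd)
        scores[name] = score
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


if __name__ == "__main__":
    # One cue available, then both cues available.
    print(posterior({"vot_ms": 20.0}))
    print(posterior({"vot_ms": 20.0, "vowel_ms": 170.0}))
```

Calling `posterior` first with one cue and then with both shows how a category estimate could shift as more cues arrive, but the sketch only recomputes a static posterior; it does not model the time course of processing that the abstract attributes to the network.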

Details

Title: Online processing of acoustic cues used in speech perception: Comparing statistical and neural network models
Creators: Joseph Toscano; Bob McMurray
Resource Type: Abstract
Publication Details: The Journal of the Acoustical Society of America, Vol.124(4), pp.2437-2437
DOI: 10.1121/1.4782535
ISSN: 0001-4966
eISSN: 1520-8524
Language: English
Date published: 10/2008
Academic Unit: Communication Sciences and Disorders; Psychological and Brain Sciences; Linguistics; Iowa Neuroscience Institute; Otolaryngology
Record Identifier: 9984071658002771
