Conference proceeding
Efficient Segmental Cascades for Speech Recognition: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5, pp.1903-1907
Interspeech
01/01/2016
DOI: 10.21437/Interspeech.2016-1298
Abstract
Discriminative segmental models offer a way to incorporate flexible feature functions into speech recognition. However, their appeal has been limited by their computational requirements, due to the large number of possible segments to consider. Multi-pass cascades of segmental models introduce features of increasing complexity in different passes, where in each pass a segmental model rescores lattices produced by a previous (simpler) segmental model. In this paper, we explore several ways of making segmental cascades efficient and practical: reducing the feature set in the first pass, frame subsampling, and various pruning approaches. In experiments on phonetic recognition, we find that with a combination of such techniques, it is possible to maintain competitive performance while greatly reducing decoding, pruning, and training time.
Details
- Title: Subtitle
- Efficient Segmental Cascades for Speech Recognition: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES
- Creators
- Hao Tang - Toyota Technol Inst Chicago, Chicago, IL 60637 USA; Weiran Wang - Toyota Technol Inst Chicago, Chicago, IL 60637 USA; Kevin Gimpel - Toyota Technol Inst Chicago, Chicago, IL 60637 USA; Karen Livescu - Toyota Technol Inst Chicago, Chicago, IL 60637 USA
- Resource Type
- Conference proceeding
- Publication Details
- 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5, pp.1903-1907
- Publisher
- Isca-Int Speech Communication Assoc
- Series
- Interspeech
- DOI
- 10.21437/Interspeech.2016-1298
- ISSN
- 2308-457X
- Number of pages
- 5
- Grant note
- Google faculty research award; Google Incorporated IIS-1433485 / NSF; National Science Foundation (NSF)
- Language
- English
- Date published
- 01/01/2016
- Academic Unit
- Computer Science
- Record Identifier
- 9984696570602771