Conference proceeding
Acoustic Feature Learning via Deep Variational Canonical Correlation Analysis: SITUATED INTERACTION
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6, pp.1656-1660
Interspeech
01/01/2017
DOI: 10.21437/Interspeech.2017-1581
Abstract
We study the problem of acoustic feature learning in the setting where we have access to another (non-acoustic) modality during feature learning but not at test time. We use deep variational canonical correlation analysis (VCCA), a recently proposed deep generative method for multi-view representation learning. We also extend VCCA with improved latent variable priors and with adversarial learning. Compared to other techniques for multi-view feature learning, VCCA's advantages include an intuitive latent variable interpretation and a variational lower bound objective that can be trained end-to-end efficiently. We compare VCCA and its extensions with previous feature learning methods on the University of Wisconsin X-ray Microbeam Database, and show that VCCA-based feature learning improves over previous methods for speaker-independent phonetic recognition.
Details
- Title: Subtitle
- Acoustic Feature Learning via Deep Variational Canonical Correlation Analysis: SITUATED INTERACTION
- Creators
- Qingming Tang - Toyota Technol Inst, Chicago, IL 60637 USA
- Weiran Wang - Toyota Technol Inst, Chicago, IL 60637 USA
- Karen Livescu - Toyota Technol Inst, Chicago, IL 60637 USA
- Resource Type
- Conference proceeding
- Publication Details
- 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6, pp.1656-1660
- Publisher
- Isca-Int Speech Communication Assoc
- Series
- Interspeech
- DOI
- 10.21437/Interspeech.2017-1581
- ISSN
- 2308-457X
- Number of pages
- 5
- Grant note
- IIS-1321015 / NSF; National Science Foundation (NSF)
- Language
- English
- Date published
- 01/01/2017
- Academic Unit
- Computer Science
- Record Identifier
- 9984696581202771