Unsupervised learning of acoustic features via deep canonical correlation analysis

Weiran Wang; Raman Arora; Karen Livescu; Jeff A. Bilmes

doi:10.1109/ICASSP.2015.7178840

Conference proceeding

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vol.2015-, pp.4590-4594

04/01/2015

Abstract

It has previously been shown that, when both acoustic and articulatory training data are available, it is possible to improve phonetic recognition accuracy by learning acoustic features from this multi-view data with canonical correlation analysis (CCA). In contrast with previous work based on linear or kernel CCA, we use the recently proposed deep CCA, where the functional form of the feature mapping is a deep neural network. We apply the approach to a speaker-independent phonetic recognition task using data from the University of Wisconsin X-ray Microbeam Database. Using a tandem-style recognizer on this task, deep CCA features improve over earlier multi-view approaches as well as over articulatory inversion and typical neural network-based tandem features. We also present a new stochastic training approach for deep CCA, which produces both faster training and better-performing features.
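For orientation, the linear CCA baseline that the abstract contrasts with deep CCA can be sketched as follows. This is a minimal illustration of standard linear CCA (whitening each view, then taking an SVD of the cross-covariance), not the paper's deep CCA or its stochastic training procedure. The function name `linear_cca`, the ridge term `reg`, and the idea that `X` holds acoustic frames and `Y` holds articulatory frames are assumptions made for the example.

```python
import numpy as np

def linear_cca(X, Y, k, reg=1e-6):
    """Hypothetical helper: linear CCA between two views.

    X: (n, dx) array, e.g. acoustic features; Y: (n, dy) array, e.g.
    articulatory measurements (both are assumptions for illustration).
    Returns projections A (dx, k), B (dy, k) and the top-k canonical
    correlations.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Regularized within-view covariances and the cross-covariance.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Sxx_is = inv_sqrt(Sxx)
    Syy_is = inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is, full_matrices=False)
    A = Sxx_is @ U[:, :k]
    B = Syy_is @ Vt.T[:, :k]
    return A, B, s[:k]
```

In a setup like the one the abstract describes, only the acoustic-view projection would be applied at test time, since articulatory data is available only for training. Deep CCA replaces the linear maps with neural networks trained to maximize the same kind of correlation objective.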

articulatory measurements

Artificial intelligence

deep canonical correlation analysis

Kernel

Mel frequency cepstral coefficient

multi-view learning

neural networks

Principal component analysis

Speech

Speech recognition

Training

XRMB

Details

Title: Unsupervised learning of acoustic features via deep canonical correlation analysis
Creators: Weiran Wang - Toyota Technological Institute at Chicago
Raman Arora - Johns Hopkins University
Karen Livescu - Toyota Technological Institute at Chicago
Jeff A. Bilmes - University of Washington
Resource Type: Conference proceeding
Publication Details: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vol.2015-, pp.4590-4594
Publisher: IEEE
DOI: 10.1109/ICASSP.2015.7178840
ISSN: 1520-6149
eISSN: 2379-190X
Language: English
Date published: 04/01/2015
Academic Unit: Computer Science
Record Identifier: 9984696721602771

Metrics

74 Times Cited - Web of Science