Conference proceeding
Incorporating Scalability in Unsupervised Spatio-Temporal Feature Learning
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vol.2018-, pp.1503-1507
09/10/2018
DOI: 10.1109/ICASSP.2018.8461758
Abstract
Deep neural networks are efficient learning machines that leverage large amounts of manually labeled data to learn discriminative features. However, acquiring a substantial amount of labeled data, especially for videos, can be tedious across various computer vision tasks. This necessitates learning visual features from videos in an unsupervised setting. In this paper, we propose a computationally simple, yet effective, framework to learn spatio-temporal feature embeddings from unlabeled videos. We train a Convolutional 3D Siamese network using positive and negative pairs mined from videos under certain probabilistic assumptions. Experimental results on three datasets demonstrate that our proposed framework learns weights that can be used on the same dataset as well as across datasets and tasks.
Details
- Title: Subtitle
- Incorporating Scalability in Unsupervised Spatio-Temporal Feature Learning
- Creators
- Sujoy Paul - University of California, Riverside; Sourya Roy - University of California, Riverside; Amit K. Roy-Chowdhury - University of California, Riverside
- Resource Type
- Conference proceeding
- Publication Details
- 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vol.2018-, pp.1503-1507
- Publisher
- IEEE
- DOI
- 10.1109/ICASSP.2018.8461758
- ISSN
- 1520-6149
- eISSN
- 2379-190X
- Language
- English
- Date published
- 09/10/2018
- Academic Unit
- Computer Science
- Record Identifier
- 9984446421802771