Lexicon-free fingerspelling recognition from video: Data, models, and signer adaptation

Taehwan Kim; Jonathan Keane; Weiran Wang; Hao Tang; Jason Riggle; Gregory Shakhnarovich; Diane Brentari; Karen Livescu

doi:10.1016/j.csl.2017.05.009

Journal article

Open access

Peer reviewed

Computer speech & language, Vol.46, pp.209-232

11/01/2017

https://doi.org/10.1016/j.csl.2017.05.009

Published (Version of record) Open Access

Abstract

We study the problem of recognizing video sequences of fingerspelled letters in American Sign Language (ASL). Fingerspelling comprises a significant but relatively understudied part of ASL. Recognizing fingerspelling is challenging for a number of reasons: it involves quick, small motions that are often highly coarticulated; it exhibits significant variation between signers; and little continuous fingerspelling data has been collected. In this work we collect and annotate a new data set of continuous fingerspelling videos, compare several types of recognizers, and explore the problem of signer variation. Our best-performing models are segmental (semi-Markov) conditional random fields using deep neural network-based features. In the signer-dependent setting, our recognizers achieve up to about 92% letter accuracy. The multi-signer setting is much more challenging, but with neural network adaptation we achieve up to 83% letter accuracy in this setting. (C) 2017 Elsevier Ltd. All rights reserved.
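
The abstract reports results as letter accuracy but this record does not define the metric. A minimal sketch follows, assuming the definition common in sequence recognition: one minus the edit distance between the reference and hypothesized letter sequences, divided by the reference length. That definition and the function names `edit_distance` and `letter_accuracy` are assumptions for illustration, not taken from this record.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of substitutions,
    insertions, and deletions needed to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def letter_accuracy(ref: str, hyp: str) -> float:
    """Assumed definition: 1 - edit_distance / reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return 1.0 - edit_distance(ref, hyp) / len(ref)
```

Under this assumed definition, a hypothesis with many inserted letters can score below zero, so the value is not bounded like a per-letter classification accuracy.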

Computer Science

Computer Science, Artificial Intelligence

Science & Technology

Technology

Details

Title: Lexicon-free fingerspelling recognition from video: Data, models, and signer adaptation
Creators: Taehwan Kim - Kenwood (United Kingdom)
Jonathan Keane - University of Chicago
Weiran Wang - Toyota Technological Institute at Chicago
Hao Tang - Toyota Technological Institute at Chicago
Jason Riggle - University of Chicago
Gregory Shakhnarovich - Toyota Technological Institute at Chicago
Diane Brentari - University of Chicago
Karen Livescu - Toyota Technological Institute at Chicago
Resource Type: Journal article
Publication Details: Computer speech & language, Vol.46, pp.209-232
DOI: 10.1016/j.csl.2017.05.009
ISSN: 0885-2308
eISSN: 1095-8363
Publisher: Elsevier
Number of pages: 24
Grant note: NSF grants 1433485, 1409886, and 1251807, National Science Foundation (NSF); Google Faculty Award, Google Incorporated. Grants 1433485 and 1409886: Division of Information & Intelligent Systems, NSF Directorate for Computer & Information Science & Engineering (CISE). Grant 1251807: Division of Behavioral and Cognitive Sciences, NSF Directorate for Social, Behavioral & Economic Sciences (SBE).
Language: English
Date published: 11/01/2017
Academic Unit: Computer Science
Record Identifier: 9984696565102771

Metrics

33 Times Cited - Web of Science