Book chapter
Hybrid Crowd-Machine Methods as Alternatives to Pooling and Expert Judgments
Information Retrieval Technology, pp.60-72
Lecture Notes in Computer Science, Springer International Publishing
2014
DOI: 10.1007/978-3-319-12844-3_6
Abstract
Pooling is a document sampling strategy commonly used to collect relevance judgments when multiple retrieval/ranking algorithms are involved. A fixed number of top-ranked documents from each algorithm form a pool. Traditionally, expensive experts judge the pool of documents for relevance. We propose and test two hybrid algorithms as effective alternatives that reduce assessment costs. The machine part selects documents to judge from the full set of retrieved documents. The human part uses inexpensive crowd workers to make judgments. We present a clustered and a non-clustered approach for document selection and two experiments testing our algorithms. The first is designed to be statistically robust, controlling for variations across crowd workers, collections, domains and topics. The second is designed along natural lines and investigates more topics. Our results demonstrate that high-quality judgments can be achieved at low cost. Moreover, this can be done by judging far fewer documents than with pooling. Precision, recall, F-scores and LAM are very strong, indicating that our algorithms with crowdsourcing offer viable alternatives to collecting judgments via pooling with expert assessments.
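To make the pooling baseline in the abstract concrete, the following is a minimal sketch of depth-k pooling and of scoring a set of judgments against a reference set with precision, recall and F1. It is not the chapter's hybrid selection algorithm; the function names, run names and document IDs are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def build_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Return the union of the top-`depth` documents from every ranked run."""
    pool: Set[str] = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool


def precision_recall_f1(judged_relevant: Set[str],
                        reference_relevant: Set[str]) -> Tuple[float, float, float]:
    """Compare one set of relevance judgments against a reference set."""
    true_positives = len(judged_relevant & reference_relevant)
    precision = true_positives / len(judged_relevant) if judged_relevant else 0.0
    recall = true_positives / len(reference_relevant) if reference_relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical ranked lists from two retrieval algorithms (illustrative only).
runs = {
    "run_a": ["d1", "d2", "d3", "d4"],
    "run_b": ["d2", "d5", "d1", "d6"],
}
pool = build_pool(runs, depth=2)
```

Because the pool is a union, documents ranked highly by several algorithms are judged only once, so the pool is usually smaller than depth times the number of runs.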
Details
- Title: Subtitle
- Hybrid Crowd-Machine Methods as Alternatives to Pooling and Expert Judgments
- Creators
- Christopher G Harris - Department of Computer Science, SUNY Oswego, Oswego, USA
- Padmini Srinivasan - Department of Computer Science, University of Iowa, Iowa City, USA
- Resource Type
- Book chapter
- Publication Details
- Information Retrieval Technology, pp.60-72
- Publisher
- Springer International Publishing; Cham
- Series
- Lecture Notes in Computer Science
- DOI
- 10.1007/978-3-319-12844-3_6
- eISSN
- 1611-3349
- ISSN
- 0302-9743
- Language
- English
- Date published
- 2014
- Academic Unit
- Nursing; Business Analytics; Computer Science
- Record Identifier
- 9984003182802771