Journal article
Assessing reliability of social media data: lessons from mining TripAdvisor hotel reviews
Information technology & tourism, Vol.18(1), pp.43-59
04/2018
DOI: 10.1007/s40558-017-0098-z
Abstract
As an emerging research paradigm, big data analytics has been gaining currency in various fields. However, the existing hospitality and tourism literature contains little discussion of data quality, which may affect the validity and generalizability of research findings. This study examines the reliability of online hotel reviews on TripAdvisor by developing a text classifier that predicts travel purpose (i.e., business vs. leisure) from the review text. The classifier is tested over a range of cities and data sizes to examine its sensitivity to data samples. The findings show that, while the classifier’s performance is consistent across different cities, it varies with data size and sampling method. More importantly, a considerable amount of noise is found in the data, which leads to misclassification. A novel approach is then developed to address the misclassification problem resulting from data noise. This study reveals important data quality issues and contributes to the theoretical development of social media analytics in hospitality and tourism.
Details
- Title: Subtitle
- Assessing reliability of social media data: lessons from mining TripAdvisor hotel reviews
- Creators
- Zheng Xiang - 0000 0001 2214 9197 grid.411618.b Collaborative Innovation Center of eTourism, Beijing Union University, Beijing, China
- Qianzhou Du - 0000 0001 0694 4940 grid.438526.e Department of Business Information Technology, Pamplin College of Business, Virginia Tech, Blacksburg, VA, USA
- Yufeng Ma - 0000 0001 0694 4940 grid.438526.e Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- Weiguo Fan - 0000 0001 0694 4940 grid.438526.e Department of Accounting and Information Systems, Pamplin College of Business, Virginia Tech, Blacksburg, VA, USA
- Resource Type
- Journal article
- Publication Details
- Information technology & tourism, Vol.18(1), pp.43-59
- Publisher
- Springer Berlin Heidelberg
- DOI
- 10.1007/s40558-017-0098-z
- ISSN
- 1098-3058
- eISSN
- 1943-4294
- Grant note
- 71373023 / Natural Science Foundation of China; SM201611417001 / Beijing Municipal Commission of Education (http://dx.doi.org/10.13039/501100002888)
- Language
- English
- Date published
- 04/2018
- Academic Unit
- Business Analytics
- Record Identifier
- 9984083298002771