Mining the Demographics of Craigslist Casual Sex Ads to Inform Public Health Policy
Conference proceeding

Jason A Fries, Philip M Polgreen and Alberto M Segre
2014 IEEE International Conference on Healthcare Informatics, pp.61-70
09/2014
DOI: 10.1109/ICHI.2014.16

Abstract

Anonymous sexual encounters negotiated via the Internet present many challenges to public health officials addressing outbreaks of sexually transmitted infections. The anonymity and potential geographic scale of encounters weaken traditional tools like contact tracing and partner notification. These developments complicate interventions within the men who have sex with men (MSM) population, which has seen increasing health disparities in HIV and syphilis incidence rates over the last decade. This paper presents text-mining methods for conducting public health surveillance of the anonymous MSM populations using the online classified advertisement website Craigslist to negotiate casual sexual encounters. We analyze 2.5 years of Craigslist data (134 million ads) and present machine learning and rule-based approaches for efficiently mining race/ethnicity and age information from Craigslist text. Using previous work in geographic entity recognition, we link ads with specific locations and generate Craigslist MSM summary statistics for race/ethnicity and age cohorts in urban and rural geographic areas. This data is then compared to demographic information from the 2010 U.S. Census to quantify how well it reflects the known, underlying population. We find significant correlations between Craigslist and census population statistics, suggesting our approach's utility for surveillance applications.
Cities and towns; Human immunodeficiency virus; Internet; Knowledge discovery; Natural language processing; Public healthcare; Sociology; Statistics; Supervised learning; Terminology; Text mining
