Using Online Classified Ads to Identify the Geographic Footprints of Anonymous, Casual Sex-Seeking Individuals
Conference proceeding

J. A. Fries, A. M. Segre and P. M. Polgreen
2012 International Conference on Privacy, Security, Risk and Trust and 2012 International Conference on Social Computing, pp. 402-410
09/2012
DOI: 10.1109/SocialCom-PASSAT.2012.86

Abstract

This paper describes a method of using Craigslist personal ads to better understand the movement behavior of anonymous, casual sex-seeking individuals within the men-who-have-sex-with-men community. Given recent dramatic increases in HIV and sexually transmitted disease within this community, gaining insight into how sexual networks connect neighborhoods and cities is important for formulating public health interventions. Because subsets of Craigslist ads exhibit a high degree of similarity, and because a set of near-identical ads, when not spam, presumably originates from the same author, we can apply techniques for efficient near-duplicate detection to identify clusters of near-identical ads. By examining each of these clusters and identifying differences in user-supplied location tags, we can then reconstruct an approximation of an anonymous individual's movement footprint over time, as well as estimate the rate at which ad authors seek sexual encounters. For the state of California, we find that 86% of all encounter requests for a given set occur within a 50-mile area, with fewer than 4% of messages reflecting long-distance travel of over 250 miles. 60% of all detected clusters reposted ads within 2 weeks of the first detected post. We show that even in the relatively noisy, unstructured data environment of anonymous personal ads, it is still possible to extract meaningful signal and identify useful social network properties for analysis.
Keywords: Cities and towns; Communities; Craigslist; Educational institutions; Entropy; Feature extraction; geography; Human immunodeficiency virus; near-duplicate detection; networks; Public healthcare; sexual behavior; sexually transmitted diseases
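
The abstract does not say which near-duplicate detection technique the authors use. As a rough illustration only, the general idea of clustering near-identical ads and then reading off each cluster's location tags might be sketched with word shingles and Jaccard similarity. Everything here is an assumption made for illustration, not taken from the paper: the function names, the 0.6 threshold, the shingle size, and the sample ads are all hypothetical.

```python
from itertools import combinations

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in a text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_ads(ads, threshold=0.6, k=3):
    """Group ads whose shingle sets have Jaccard similarity >= threshold.

    ads is a list of (text, location_tag) pairs; the result is a list of
    clusters, each a list of indices into ads. The threshold is an
    assumed value, not one reported in the abstract.
    """
    sets = [shingles(text, k) for text, _ in ads]
    parent = list(range(len(ads)))  # union-find forest over ad indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Naive all-pairs comparison; fine for a toy example only.
    for i, j in combinations(range(len(ads)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(ads)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

def footprints(ads, clusters):
    """For each cluster with at least two ads, list its distinct location tags."""
    result = []
    for members in clusters:
        if len(members) < 2:
            continue
        tags = []
        for i in members:
            tag = ads[i][1]
            if tag not in tags:
                tags.append(tag)
        result.append(tags)
    return result

# Hypothetical sample data: two near-identical ads with different tags,
# plus one unrelated ad.
ads = [
    ("visiting this weekend and looking to meet someone new downtown", "san francisco"),
    ("visiting this weekend and looking to meet someone new nearby", "oakland"),
    ("selling a used bicycle in good condition with new tires", "san jose"),
]
clusters = cluster_ads(ads)
print(footprints(ads, clusters))
```

The all-pairs loop is quadratic in the number of ads, so it would not scale to a large corpus. The "efficient" detection the abstract refers to would more plausibly rely on a hashing-based scheme such as MinHash with locality-sensitive hashing, though the abstract does not confirm this.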
