Other
Appendices. Connecting family trees to construct a population-scale and longitudinal geo-social network for the U.S.
Taylor & Francis
01/01/2021
DOI: 10.6084/m9.figshare.13026504
Abstract
We collected 92,832 user-contributed and publicly available family trees from rootsweb.com, including 250 million individuals who were born in North America and Europe between 1630 and 1930. We cleaned and connected the family trees to create a population-scale and longitudinal family tree dataset using a workflow of data collection and cleaning, geocoding, fuzzy record linkage, and a relation-based iterative search for connecting trees and deduplicating records. With a largest connected component of nearly 40 million individuals and a total of 80 million individuals, we generated what is, to date, the largest population-scale and longitudinal geo-social network spanning centuries. We evaluated the representativeness of the family tree dataset for historical population demography and mobility by comparing the data to the 1880 Census. Our results showed that the family trees were biased towards males, the elderly, farmers, and native-born white segments of the population. Individuals were highly mobile: in our 1880 sample of parent-child pairs in which both were born in the U.S., 47% were born in different states. Our findings agreed with prior studies that people migrated from East to West in horizontal bands, and the trend was reflected in the dialects and regional structure of the U.S.
Details
- Title: Subtitle
- Appendices. Connecting family trees to construct a population-scale and longitudinal geo-social network for the U.S.
- Creators
- Caglar Koylu; Diansheng Guo; Yuan Huang; Alice Bee Kasakoff; Jack Grieve
- Resource Type
- Other
- DOI
- 10.6084/m9.figshare.13026504
- Publisher
- Taylor & Francis
- Language
- English
- Date published
- 01/01/2021
- Academic Unit
- Center for Social Science Innovation; Geographical and Sustainability Sciences
- Record Identifier
- 9984528097802771