Journal article
ARTCDP: An automated data platform for monitoring emerging patterns concerning road traffic crashes in China
Accident analysis and prevention, Vol.174, p.106727
09/01/2022
DOI: 10.1016/j.aap.2022.106727
PMID: 35667199
Abstract
- High-quality data are critical for road traffic injury prevention.
- Traditional road traffic injury data have clear limitations.
- We developed an online platform that automatically collects media-reported data.
- The platform provides valuable and timely data to supplement traditional official data.
Online media reports provide valuable information for road traffic injury prevention, but technical challenges concerning data acquisition and processing limit analysis and interpretation of such data. Integrating injury epidemiology theory and big data technology, we developed a data platform consisting of four layers (data acquisition, data processing, application and data storage) to automatically collect reports from online Chinese media concerning road traffic crashes every 24 h. We built a text classification model using 20,000 manually annotated news stories based on the Bidirectional Encoder Representations from Transformers (BERT) and then used natural language processing algorithms to extract data concerning 27 structured variables from the news sources. The accuracy of the BERT-based text classification model was 0.9271, with information extraction accuracy exceeding 80% for 22 variables. As of November 30, 2021, the data platform had collected 244,650 eligible media reports covering all 333 prefecture-level divisions in China. These reports were from 37,073 websites or social media accounts, which were geographically located in all 31 provinces and over 98% of prefecture-level divisions. Data availability varied greatly, from 0.9% to 100%, across the 27 structured variables. Additionally, the platform identified 645,787 potentially relevant keywords when applying natural language processing techniques to the textual media reports. Platform data were highly correlated with road police data in province-based road traffic crash statistics (crashes, rs = 0.799; non-fatal injuries, rs = 0.802; deaths, rs = 0.775). In particular, the platform offers valuable data (such as crashes involving electric vehicles) that are not included in official road traffic crash statistics. The new automated data platform shows great potential for timely detection of emerging characteristics of road traffic crashes. Further research is needed to improve the platform and apply it to real-time monitoring and analysis of road traffic injuries.
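The rs values reported in the abstract read as Spearman rank correlation coefficients between province-level platform counts and road police counts. The following is a minimal sketch of how such a coefficient can be computed, not the authors' code: the function names (`average_ranks`, `pearson`, `spearman_rs`) are hypothetical, and the five pairs of counts are invented toy numbers for illustration only, not data from the article.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman_rs(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rs: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Toy, made-up counts for five hypothetical provinces (NOT from the article).
platform_counts = [120, 45, 300, 80, 15]
police_counts = [1000, 700, 2500, 600, 200]
print(round(spearman_rs(platform_counts, police_counts), 3))
```

Ranking before correlating makes the measure insensitive to the large difference in scale between media-reported and police-reported counts; only the ordering of provinces matters.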
Details
- Title: ARTCDP: An automated data platform for monitoring emerging patterns concerning road traffic crashes in China
- Creators:
  - Peixia Cheng - Central South University
  - Wangxin Xiao - Central South University
  - Peishan Ning - Central South University
  - Li Li - Central South University
  - Zhenzhen Rao - Central South University
  - Lei Yang - Central South University
  - David C. Schwebel - University of Alabama at Birmingham
  - Yang Yang - University of Florida
  - Yun Huang - Central South University
  - Guoqing Hu - Central South University
- Resource Type: Journal article
- Publication Details: Accident analysis and prevention, Vol.174, p.106727
- DOI: 10.1016/j.aap.2022.106727
- PMID: 35667199
- NLM abbreviation: Accid Anal Prev
- ISSN: 0001-4575
- eISSN: 1879-2057
- Publisher: Elsevier Ltd
- Language: English
- Date published: 09/01/2022
- Academic Unit: Research Administration
- Record Identifier: 9984949203302771