Journal article
Using AI to Summarize US Presidential Campaign TV Advertisement Videos, 1952-2012
Scientific data, Vol.12(1), 1552
09/24/2025
DOI: 10.1038/s41597-025-05558-9
PMCID: PMC12460618
PMID: 40993151
Abstract
This paper introduces the largest and most comprehensive dataset of US presidential campaign television advertisements, available in digital format. The dataset also includes machine-searchable transcripts and high-quality summaries designed to facilitate a variety of academic research. To date, there has been great interest in collecting and analyzing US presidential campaign advertisements, but the need for manual procurement and annotation has led many to rely on smaller subsets. We design a large-scale, parallelized, AI-based analysis pipeline that automates the laborious process of preparing, transcribing, storyboarding, and summarizing videos. We then apply this methodology to the 9,707 presidential ads from the Julian P. Kanter Political Commercial Archive. We conduct extensive human evaluations to show that these transcripts and summaries match the quality of manually generated alternatives. We illustrate the value of this data by including an application that tracks the genesis and evolution of current focal issue areas over seven decades of presidential elections. Our analysis pipeline and codebase also show how to use LLM-based tools to obtain high-quality summaries for other video datasets.
Details
- Title: Subtitle
- Using AI to Summarize US Presidential Campaign TV Advertisement Videos, 1952-2012
- Creators
- Adam Breuer - Dartmouth College; Bryce J Dietrich - Purdue University West Lafayette; Michael H Crespin - University of Oklahoma; Matthew Butler - University of Iowa, Digital Scholarship & Publishing Studio, Iowa City, IA, USA; J A Pryse - University of Oklahoma; Kosuke Imai - Harvard University
- Resource Type
- Journal article
- Publication Details
- Scientific data, Vol.12(1), 1552
- DOI
- 10.1038/s41597-025-05558-9
- PMID
- 40993151
- PMCID
- PMC12460618
- NLM abbreviation
- Sci Data
- ISSN
- 2052-4463
- eISSN
- 2052-4463
- Publisher
- NATURE PORTFOLIO
- Grant note
- 2147635 / National Science Foundation (NSF); 2148928 / National Science Foundation (NSF); 2148202 / National Science Foundation (NSF)
- Language
- English
- Date published
- 09/24/2025
- Academic Unit
- Digital Scholarship and Publishing Studio
- Record Identifier
- 9984966343202771