Dissertation
Applications of machine learning to marketing science
University of Iowa
Doctor of Philosophy (PhD), University of Iowa
Spring 2023
DOI: 10.25820/etd.007036
Abstract
Machine learning can efficiently process large volumes of data and identify trends and complex patterns that traditional statistical models may miss. In recent years, there have been several attempts to use machine learning models in the marketing literature; however, such models tend to be “black boxes” and have been criticized for their lack of interpretability. The primary motivation for my dissertation is the notion that marketing science models should be interpretable. This study therefore develops marketing science models, built on machine learning algorithms, that offer interpretable solutions to marketing-related problems.
Essay 1 examines how certain financial behaviors affect consumer credit scores. I analyze a large volume of credit report data containing 314 variables related to the financial activities of 93,798 individuals. The number of predictors and the volume of data motivated the use of a machine learning model. Among the five candidates I compared, LightGBM, which is categorized as a black box model, predicted credit scores most accurately. I then use SHapley Additive exPlanations (SHAP) to enhance the black box model’s interpretability. This combined approach allowed me to distill a large amount of credit report data into actionable insights, which can help consumers more fully understand both the negative consequences of risky financial behavior and the sound decisions that can improve their credit score.
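To illustrate the idea behind the Shapley-based attribution mentioned in Essay 1, the following is a minimal sketch that computes exact Shapley values by brute force for a single prediction. It is not the dissertation's pipeline and does not use LightGBM or the SHAP library; the scoring function `toy_score`, its three features, and the baseline values are all hypothetical and chosen only so the arithmetic is easy to follow.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are replaced by their baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for tiny illustrative models.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model (an assumption, not taken from the dissertation).
def toy_score(z):
    utilization, late_payments, account_age = z
    return (700 - 120 * utilization - 25 * late_payments
            + 3 * account_age - 40 * utilization * late_payments)


x = [0.8, 2, 5]           # the consumer being explained (hypothetical)
baseline = [0.3, 0, 10]   # reference consumer (hypothetical)
phi = shapley_values(toy_score, x, baseline)

for name, contribution in zip(["utilization", "late_payments", "account_age"], phi):
    print(f"{name}: {contribution:+.2f}")
print("sum of contributions:", round(sum(phi), 6))
print("score difference:", round(toy_score(x) - toy_score(baseline), 6))
```

The contributions add up to the difference between the explained prediction and the baseline prediction, which is the additive property that makes this kind of attribution readable as per-behavior advice.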
In Essay 2, I develop a methodology that can identify market segmentation using a large volume of consumer panel data. The existing marketing science models used to identify consumer heterogeneity generally do not scale well to large data sets. A more promising approach is to use machine learning methods, which can handle large quantities of data efficiently. The problem is that machine learning models are not based on marketing science theory and often produce results that managers cannot interpret. Given these challenges, I describe an approach to calibrating a choice model with unobserved consumer heterogeneity in a Big Data context. The proposed Autoencoder-Latent Class Model (ALCM) begins by using an autoencoder, a type of feedforward deep neural network, to create parsimonious latent representations of a large number of consumers based on their different shopping patterns. I then use the latent representation with stratified sampling to produce a representative sample of the original data. The choice model is then fit to this representative sample. Using simulation, I show that, for a variety of retail marketing mix scenarios, the latent representation generated by the autoencoder always provides a clustering of consumers that more accurately reflects differences in not only their decision rules but also the store environment, with up to a 50% decrease in sampling errors compared to benchmark models. Furthermore, a latent class model fit to the representative sample accurately recovers the segmentation structure of the population dataset. Moreover, the computational time is minimal compared to the same analysis of the large population dataset.
I also apply the ALCM procedure to an analysis of market segmentation in the carbonated soft drink industry using a large dataset from the states of Texas and California in 2018 and 2019. As in the simulations, compared to the traditional model using raw data, the proposed model uncovers the structure of market segmentation and predicts consumer segmentation with 82% accuracy while reducing computational time by approximately 95%. I argue that the modeling approach offers new opportunities for large retailers and researchers who have encountered challenges in fitting conventional marketing science models in a Big Data context.
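The sampling step described for Essay 2 can be pictured with a small sketch of proportional stratified sampling. The abstract does not specify how strata are formed, so this sketch assumes each consumer has already been assigned a discrete cluster label derived from the latent representation; the function name `stratified_sample`, the label counts, and the 10% sampling fraction are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(labels, fraction, seed=0):
    """Draw the same fraction of consumers from every stratum.

    `labels[i]` is the stratum (assumed here to be a cluster label) of
    consumer i. Returns the sorted indices of the sampled consumers.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for idx, label in enumerate(labels):
        strata[label].append(idx)

    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Keep at least one consumer so small strata are not lost.
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sorted(sample)


# Hypothetical cluster labels for 1,000 consumers in three strata.
labels = [0] * 600 + [1] * 300 + [2] * 100
sample = stratified_sample(labels, fraction=0.1)

# The sample keeps the population's stratum proportions.
print(Counter(labels[i] for i in sample))
```

Because every stratum is sampled at the same rate, a model fit to the smaller sample sees the same mix of shopping patterns as the full dataset, which is the property the abstract relies on when fitting the choice model to the sample instead of the population.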
Details
- Title: Subtitle
- Applications of machine learning to marketing science
- Creators
- Seung Wook Kim
- Contributors
- Gary Russell (Advisor); Thomas Gruca (Advisor); Hyeong-Tak Lee (Committee Member); Tong Wang (Committee Member)
- Resource Type
- Dissertation
- Degree Awarded
- Doctor of Philosophy (PhD), University of Iowa
- Degree in
- Business Administration (Marketing)
- Date degree season
- Spring 2023
- DOI
- 10.25820/etd.007036
- Publisher
- University of Iowa
- Number of pages
- x, 97 pages
- Copyright
- Copyright 2023 Seung-Wook Kim
- Language
- English
- Date submitted
- 04/25/2023
- Date approved
- 06/30/2023
- Description illustrations
- illustrations, tables, graphs
- Description bibliographic
- Includes bibliographical references (pages 86-97).
- Public Abstract (ETD)
- This research uses recent advances in machine learning to explore marketing-related problems that cannot be solved through traditional statistical models, and thereby provides novel information that generates insights for both managers and consumers. This involves developing a quantitative model that can both handle Big Data and yield a deeper understanding of consumer behavior in the domains of financial decision-making and retail shopping. Essay 1 aims to understand how consumers’ financial activities influence their credit scores. I analyze a large volume of credit report data containing 314 variables related to the financial activities of 93,798 individuals. The number of predictors and the volume of data motivated the use of a machine learning model. Using an interpretable machine learning model, I identify key determinants of credit scores and quantify their effects. Based on my robust empirical findings, I provide consumers with financial advice tied to specific behaviors. This study contributes to the financial decision-making and transformative consumer research literature by improving the quality of consumers’ decisions and enhancing consumer welfare. Essay 2 develops a methodology that can be used to identify market segmentation using a large volume of consumer panel data. The existing marketing science models used to identify consumer heterogeneity generally do not scale well to large data sets. Also, machine learning prediction models cannot account for consumer heterogeneity unless detailed demographic information about consumers is available. My modeling strategy therefore uses an autoencoder (a specific type of feedforward neural network) to extract a representative sample from the large dataset and then calibrates the latent class model on this sample. I demonstrate that my proposed model outperforms benchmarks in discovering informative representations of consumers’ complex shopping patterns. By reducing sampling error, the latent class model can highlight differences in consumers’ choice rules with high accuracy. This study suggests that my proposed model presents innovative opportunities for large retailers and researchers who face challenges when attempting to fit conventional marketing science models in the context of Big Data.
- Academic Unit
- Tippie College of Business
- Record Identifier
- 9984425312902771