The e-commerce industry has witnessed exponential growth due to its numerous benefits, including convenience, time efficiency, and readily available online information. As millions of users now frequent e-commerce platforms, accurately predicting and understanding user behavior is paramount. Decoding user behavior patterns has therefore become a critical area of research and application, especially when businesses aim to venture into new markets or product categories. Machine learning techniques have transformed the e-commerce industry by addressing cold-start issues and deciphering purchase decision-making.
The cold-start problem refers to situations where businesses have minimal or no user data in a new category, which makes it challenging to build accurate machine learning models for predicting user behavior. This issue particularly concerns businesses branching out into new markets or product categories. While existing literature has demonstrated the successful prediction of user behavior within individual product categories, predicting user behavior across different product domains remains difficult due to data scarcity. Furthermore, user behavior is influenced by numerous complex factors, from high-level elements dictating user intentions to low-level factors characterizing user preferences when implementing an intention. Learning user representations that reveal and disentangle these latent factors can enhance robustness, interpretability, and controllability, yet this challenging task has been largely overlooked in the existing literature.
This thesis investigates user behavior in the e-commerce industry and advances our understanding of it through cross-domain knowledge transfer and interpretable user representations. Cross-domain knowledge transfer techniques, such as transfer learning and representation learning, employ existing data from related categories or products to predict user behavior in new categories. We examine the success of cross-domain transfer between different product domains and explore the motivation and necessity of utilizing information from existing product categories to predict user behavior in new categories. Beyond performing cross-domain knowledge transfer, understanding why such transfer succeeds is also an essential area of research. Existing cross-domain recommendation systems have primarily focused on developing methods without thoroughly explaining how the transfer mechanisms work. This thesis aims to understand the transfer process among various domains and provide insights into users’ decision-making processes. By generating interpretable user representations, we can gain insight into the fundamental user behavior patterns and improve our ability to predict future behaviors accurately.
This thesis consists of two main chapters. Chapter 2 investigates transferring customer behavior knowledge to a different product domain. Chapter 3 then explains why such transfer works and what information the model captures during the transfer process.
In Chapter 2, we explore the possibility of transferring customer behavior knowledge between distinct product domains using traditional and machine learning techniques. Specifically, we investigate whether the knowledge obtained from the consumer goods domain can be applied to the financial products domain to enhance prediction accuracy. Utilizing data from one of China’s largest online shopping platforms, we develop a transfer learning-based neural network model to facilitate knowledge transfer. The results reveal that users’ browsing and shopping history in consumer goods significantly improves the prediction accuracy of mutual fund purchases for both existing and new users. We also examine how the lift in prediction performance varies across users with different socioeconomic statuses and investment risk preferences. Our findings suggest that information from the consumer goods domain yields a greater lift in prediction performance for users in the high socioeconomic group. Finally, we compare the predictive power of various information sources and find that browsing and shopping history for consumer goods is more predictive than profile features. This research holds significant implications for the financial industry and online platforms seeking to expand their product domains.
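As a rough illustration of the transfer setup described above, the following minimal sketch pretrains a small shared encoder on a source task and then reuses it, with a fresh output layer, on a target task. Everything in it is an assumption made for illustration: the synthetic data, the one-hidden-layer architecture, the function names (`init_params`, `train`, `new_head`), and the choice to freeze the encoder are hypothetical and are not taken from the thesis, which used real platform data and whose architecture is not described in this abstract.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(n_in, n_hidden, rng):
    """Small random weights for a one-hidden-layer binary classifier."""
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)),  # shared encoder
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(0.0, 0.1, n_hidden),          # task-specific head
        "b2": 0.0,
    }


def forward(params, X):
    H = np.tanh(X @ params["W1"] + params["b1"])    # hidden representation
    p = sigmoid(H @ params["w2"] + params["b2"])    # predicted probability
    return H, p


def log_loss(params, X, y):
    _, p = forward(params, X)
    p = np.clip(p, 1e-9, 1.0 - 1e-9)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def train(params, X, y, epochs=300, lr=0.1, freeze_encoder=False):
    """Full-batch gradient descent on the mean log loss; returns new params."""
    params = dict(params)  # shallow copy; updates below rebind, never mutate
    n = len(y)
    for _ in range(epochs):
        H, p = forward(params, X)
        err = (p - y) / n  # gradient of the mean log loss w.r.t. the logits
        if not freeze_encoder:
            dH = np.outer(err, params["w2"]) * (1.0 - H ** 2)
            params["W1"] = params["W1"] - lr * (X.T @ dH)
            params["b1"] = params["b1"] - lr * dH.sum(axis=0)
        params["w2"] = params["w2"] - lr * (H.T @ err)
        params["b2"] = params["b2"] - lr * err.sum()
    return params


def new_head(params, rng):
    """Keep the pretrained encoder, re-initialise the output layer."""
    out = dict(params)
    out["w2"] = rng.normal(0.0, 0.1, params["w2"].shape)
    out["b2"] = 0.0
    return out


# Synthetic stand-in data (hypothetical): one feature matrix, two related labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))          # stand-in for source-domain behavior features
score = X @ rng.normal(size=6)         # hidden factor shared by both tasks
y_src = (score + rng.normal(size=500) > 0.0).astype(float)  # source-task label
y_tgt = (score + rng.normal(size=500) > 0.5).astype(float)  # target-task label

pretrained = train(init_params(6, 8, rng), X, y_src)  # learn encoder on source task
start = new_head(pretrained, rng)                     # reuse encoder, fresh head
tuned = train(start, X, y_tgt, freeze_encoder=True)   # fit only the head on target

print("target log loss before fine-tuning:", log_loss(start, X, y_tgt))
print("target log loss after fine-tuning: ", log_loss(tuned, X, y_tgt))
```

Passing `freeze_encoder=False` in the second call would fine-tune the encoder as well; which variant is preferable would depend on how much target-domain data is available.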
In Chapter 3, we employ the β-VAE model to gain insight into user behavior related to mutual fund interactions. This approach enables us to derive disentangled user representations that enhance the predictive performance for existing user categories and permits a closer exploration of the latent information encoded within these representations. We also investigate “knowledge migration” from consumer goods to mutual funds, scrutinizing what specific information is transferred and how this transferred knowledge influences performance in the mutual fund domain. The focal point of our research is the user representation learned through our model.
Initially, our hypothesis centered on the idea that the learned user representation would act as a rich repository of intricate, user-specific details. We believed that these detailed representations would arm our model with the ability to make highly accurate predictions. However, our subsequent observations challenged this initial premise. We found that the performance of our model was surprisingly invariant to the user representation. This suggested that rather than harnessing the wealth of information encapsulated in the user representations, the model was primarily leaning on the user’s identity, as indicated by the “user id”, to make its predictions.
These findings suggest that training shifted toward memorization: the model memorizes specific user patterns in the training data but struggles to generalize from the underlying latent features. One potential explanation for this behavior is the inherent nature of the β-VAE model itself. As a completely unsupervised learning method most often applied to image data, it may struggle to extract from non-image data the meaningful information needed for accurate predictions.
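For reference, the standard β-VAE objective, as commonly stated in the literature and not reproduced from the thesis, can be written as follows. Here $x$ is an input (for example, a user's behavior record), $z$ is the latent representation, $q_\phi$ is the encoder, $p_\theta$ is the decoder, and $p(z)$ is the prior, typically a standard Gaussian. The notation is an assumption; the thesis may use different symbols.

```latex
\mathcal{L}(\theta, \phi; x, \beta)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  - \beta \, D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)
```

Setting $\beta = 1$ recovers the ordinary VAE objective, while $\beta > 1$ weights the KL term more heavily, which is what encourages disentangled latent factors. The objective contains no term involving a prediction label, so nothing in it forces the learned factors to align with what a downstream predictor needs.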
In essence, this chapter brings together our deep dive into latent factor interpretation, user representation learning, and cross-domain transfer to broaden our understanding of user behavior and its predictable patterns. We balance theory and practice, aiming to enrich the academic discourse and provide actionable insights for industry applications.
Details
Title: Subtitle
Understanding how and why machine learning can transfer customer behavior across product domains
Creators
Shenghao Wang
Contributors
Tong Wang (Advisor)
Patrick Fan (Advisor)
Yu Jeffrey Hu (Committee Member)
Kang Zhao (Committee Member)
Resource Type
Dissertation
Degree Awarded
Doctor of Philosophy (PhD), University of Iowa
Degree in
Business Administration (Business Analytics)
Date degree season
Autumn 2023
Publisher
University of Iowa
DOI
10.25820/etd.007009
Number of pages
xvi, 153 pages
Copyright
Copyright 2023 Shenghao Wang
Language
English
Date submitted
11/24/2023
Description illustrations
Illustrations, tables, graphs, charts
Description bibliographic
Includes bibliographical references (pages 104-137).
Public Abstract (ETD)
As the world of online shopping continues to grow, it becomes more and more important to understand the behavior of the millions of users who browse and buy from e-commerce platforms. In this research, we examine how data about how users interact with online platforms can help us predict what they might do next. Specifically, we’re interested in how well we can use information about how users shop in one category to predict what they might do in a completely different category. For example, can we use data about a user’s behavior while shopping for groceries to predict whether they might be interested in investing in a mutual fund?
We also want to understand how and why this prediction works. For instance, what kind of information from a user’s shopping history is most useful when predicting their behavior in another category? And can we interpret this information to understand why users make certain decisions?
To answer these questions, we analyzed data from one of China’s largest online shopping platforms. We used advanced computational techniques to train a model to predict a user’s behavior based on their past behavior. Our results showed that the more we know about a user’s browsing and shopping history, the better we can predict their behavior in another category. Interestingly, we also found that browsing and shopping history are more useful for making these predictions than demographic information.
In a separate analysis, we used a special model to represent user behavior. This representation helped us understand which factors are most important when users are making decisions. For example, we found that factors like a user’s socioeconomic status and preferred shopping style play a big role in their behavior.
Our research has many potential applications, especially for businesses looking to expand into new markets. For example, a company that sells consumer goods might use our findings to predict which customers might also be interested in their new line of financial products. As online shopping evolves, we believe understanding user behavior will become more important. Our research contributes to this understanding and provides tools to help businesses succeed in this changing landscape.