From distributionally robust optimization to broader machine learning applications
Details
- Title: Subtitle
- From distributionally robust optimization to broader machine learning applications
- Creators
- Dixian Zhu
- Contributors
- Tianbao Yang (Advisor); Kasturi Varadarajan (Committee Member); Bijaya Adhikari (Committee Member); Qihang Lin (Committee Member); Xun Zhou (Committee Member)
- Resource Type
- Dissertation
- Degree Awarded
- Doctor of Philosophy (PhD), University of Iowa
- Degree in
- Computer Science
- Date degree season
- Spring 2023
- Publisher
- University of Iowa
- DOI
- 10.25820/etd.007289
- Number of pages
- xii, 178 pages
- Copyright
- Copyright 2023 Dixian Zhu
- Language
- English
- Date submitted
- 04/25/2023
- Date approved
- 06/30/2023
- Description illustrations
- illustrations (some color)
- Description bibliographic
- Includes bibliographical references (pages 86-102).
- Public Abstract (ETD)
When training machine learning models, practitioners conventionally define a differentiable loss function for each data sample and minimize the average empirical loss. However, people have been questioning whether this is the only way to train a model. In the last decade, an alternative approach based on a natural philosophy has been proposed. This approach trades off between optimizing the average individual loss and the maximal individual loss. By focusing more on the harder data samples, the resulting model can be more robust and perform better.
Inspired by this philosophy, we propose to apply this high-level idea to various machine learning applications. For instance, the philosophy can be used not only to guide model training but also to query data for labeling under the active learning paradigm. Additionally, we have discovered that the hard-attention mechanism adapts naturally to optimizing the partial area under the ROC curve (partial AUC), which is especially significant for machine learning on imbalanced datasets such as medical and healthcare data. We have also found that this philosophy is related to the multi-class classification problem and its commonly used loss functions. To this end, we propose a unified loss function and investigate its properties to enhance multi-class classification performance. Lastly, we propose to apply this philosophy to multiple instance learning, where we aim to classify bags of data in which only a limited number of instances display the features of interest.
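The trade-off between the average and the maximal individual loss described in the abstract can be illustrated with a small sketch. It assumes a KL-regularized objective, which is one common formulation of distributionally robust optimization and not necessarily the one used in the dissertation. The function names `kl_dro_loss` and `sample_weights` and the parameter `lam` are hypothetical, chosen only for this illustration.

```python
import math


def kl_dro_loss(losses, lam):
    """Assumed KL-regularized robust loss: lam * log(mean(exp(loss_i / lam))).

    A large lam gives a value close to the average loss;
    a small lam gives a value close to the maximal loss.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    m = max(losses)
    # Subtract the maximum before exponentiating for numerical stability.
    total = sum(math.exp((loss - m) / lam) for loss in losses)
    return m + lam * math.log(total / len(losses))


def sample_weights(losses, lam):
    """Implied per-sample weights: a softmax over losses with temperature lam.

    Harder samples (larger loss) receive larger weight.
    """
    m = max(losses)
    raw = [math.exp((loss - m) / lam) for loss in losses]
    norm = sum(raw)
    return [r / norm for r in raw]


# Hypothetical per-sample losses: three easy samples and one hard sample.
losses = [0.1, 0.2, 0.3, 2.0]

print(kl_dro_loss(losses, lam=100.0))  # close to the average loss, 0.65
print(kl_dro_loss(losses, lam=0.01))   # close to the maximal loss, 2.0
print(sample_weights(losses, lam=1.0)) # largest weight on the hard sample
```

In this sketch, `lam` plays the role of the trade-off: it controls how strongly the objective concentrates on the harder samples.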
- Academic Unit
- Computer Science
- Record Identifier
- 9984437257402771