Dissertation
Deep AUC maximization
University of Iowa
Doctor of Philosophy (PhD), University of Iowa
Spring 2023
DOI: 10.25820/etd.007059
Abstract
AUC (Area Under the ROC Curve) is one of the most widely used metrics for evaluating the capability of a binary classifier. AUC measures how well a model ranks the prediction scores of positive examples above those of negative examples, and it is particularly useful for imbalanced datasets. From the optimization perspective, directly optimizing AUC can potentially lead to the largest improvements in the model’s performance. To this end, we propose Deep AUC Maximization (DAM) to make stochastic AUC maximization more practical for learning deep neural networks in challenging real-world tasks.
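As a reference point, the ranking interpretation above can be computed directly: AUC is the fraction of positive–negative pairs in which the positive example receives the higher score (ties counted as half). A minimal NumPy sketch (the function name `pairwise_auc` is illustrative, not from the dissertation):

```python
import numpy as np

def pairwise_auc(scores, labels):
    """AUC as the probability that a random positive example scores
    above a random negative one (ties counted as half a pair)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    return np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

print(pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```

In the example, three of the four positive–negative pairs are ranked correctly, giving AUC = 0.75; a perfect ranker would score 1.0 regardless of class imbalance.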
Our recent study reveals that the existing AUC formulation with a square-based surrogate loss (named the AUC-square loss) in DAM is adversely affected by easy data and is sensitive to noisy data. To address this issue, we propose a new margin-based surrogate loss function for optimizing the AUC score (named the AUC-margin loss), which is more robust than the commonly used AUC-square loss. Despite the success of DAM with the new AUC-margin loss in many applications, such as medical image classification tasks, we observed that using DAM to train deep neural networks from scratch does not necessarily yield satisfactory performance. To tackle this challenge, existing solutions adopt a two-stage learning framework, i.e., using the cross-entropy loss to pretrain the model for learning feature representations and using DAM to fine-tune it for better classification performance. However, this learning strategy has several issues: (1) it needs extensive experiments to determine the end of the first stage and the start of the second stage, as well as the specific model layers used for fine-tuning; and (2) it can lead to lower-quality representations and poor initializations for DAM in downstream tasks when pretraining on limited labeled data. To address the first challenge, we propose a novel compositional training framework for end-to-end DAM training, namely Compositional DAM. The key idea of compositional training is to minimize a compositional objective function, where the outer function corresponds to an AUC loss and the inner function represents a gradient descent step for minimizing a traditional loss, e.g., the cross-entropy (CE) loss. To address the second challenge, we leverage unlabeled data to enhance DAM by proposing a memory-efficient Stochastic Optimization algorithm for solving the Global objective of Contrastive Learning of Representations, named SogCLR.
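The margin-based surrogate can be sketched as a min-max objective: squared deviations of positive and negative scores from learnable centers, plus a margin term weighted by a dual variable that is maximized. The mini-batch sketch below is a simplified illustration of such a formulation, not the dissertation's exact loss; the centers `a`, `b`, the dual variable `alpha`, and the default `margin` are assumptions for demonstration:

```python
import numpy as np

def aucm_loss(scores, labels, a, b, alpha, margin=1.0):
    """Mini-batch estimate of a margin-based AUC objective:
    positive scores are pulled toward center a, negative scores
    toward center b, and the dual variable alpha (maximized over
    alpha >= 0) enforces a margin between the two class means."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    return (np.mean((pos - a) ** 2)
            + np.mean((neg - b) ** 2)
            + 2 * alpha * (margin - np.mean(pos) + np.mean(neg))
            - alpha ** 2)

print(aucm_loss([0.8, 0.2], [1, 0], a=0.8, b=0.2, alpha=0.5))  # ≈ 0.15
```

Unlike a pairwise loss over all positive–negative pairs, this decomposes over individual samples, which is what makes stochastic optimization with mini-batches tractable.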
This algorithm enables us to train self-supervised contrastive models with limited computational resources, such as small batch sizes, which significantly facilitates the use of DAM in various downstream tasks.
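The reason small batches suffice can be illustrated with a moving-average estimator: instead of estimating each anchor's contrastive denominator from the current mini-batch alone, a per-sample running estimate is blended with each fresh mini-batch estimate, so the quantity tracked over time approximates the global (all-negatives) denominator. The sketch below is a hypothetical illustration of this idea; the function name, `temperature`, and `gamma` defaults are assumptions, not the SogCLR API:

```python
import numpy as np

def moving_avg_update(u, idx, sim, temperature=0.1, gamma=0.9):
    """Blend a per-sample running estimate u[idx] of the contrastive
    denominator with a fresh estimate computed from in-batch
    negative similarities sim (one row per anchor in the batch)."""
    # Mini-batch estimate of the mean exponentiated similarity per anchor.
    g = np.mean(np.exp(sim / temperature), axis=1)
    u[idx] = (1.0 - gamma) * u[idx] + gamma * g
    return u

u = np.zeros(4)                       # running estimates, one per sample
u = moving_avg_update(u, np.array([0, 2]), np.zeros((2, 3)))
print(u)  # only anchors 0 and 2 were updated
```

Because the estimator accumulates information across iterations, the gradient quality no longer hinges on how many negatives fit in a single batch.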
Furthermore, we observe that most existing training pipelines for optimizing non-decomposable measures like AUC rely on random data sampling, where samples are drawn uniformly from the dataset, and on mini-batch techniques, where the model updates its parameters based on the gradient of a mini-batch loss as an estimate of the true gradient over the whole dataset; such pipelines may not converge well or may require large batch sizes to achieve good convergence. To address these issues, we develop a deep learning library named LibAUC with a new training pipeline that includes two novel components: (1) controlled data samplers, which balance the number of positive and negative samples in each mini-batch to boost model convergence, and (2) dynamic mini-batch losses, which use moving-average estimators to improve stochastic gradient estimation and make training robust to changes in batch size. LibAUC supports optimizing a wide range of risk functions (denoted X-risks), including surrogate losses for AUROC, AUPRC/AP, and partial AUROC that are suitable for Classification for Imbalanced Data (CID); surrogate losses for NDCG, top-K NDCG, and listwise losses used in Learning to Rank (LTR); and global contrastive losses for Contrastive Learning of Representations (CLR). To the best of our knowledge, LibAUC is the first deep learning library that provides easy-to-use APIs for optimizing such a wide range of performance measures and losses.
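The controlled-sampling idea can be sketched in a few lines: draw a fixed fraction of positives per mini-batch (with replacement when the minority class is small), so that even a heavily imbalanced dataset yields class-balanced batches. This is an illustrative sketch only, not the LibAUC sampler API; the function name and `pos_ratio` parameter are assumptions:

```python
import random

def controlled_sampler(pos_idx, neg_idx, batch_size, pos_ratio=0.5, seed=0):
    """Draw one mini-batch with a fixed fraction of positive indices.
    Sampling is with replacement, so a tiny positive class can still
    fill its quota in every batch."""
    rng = random.Random(seed)
    n_pos = max(1, int(batch_size * pos_ratio))
    n_neg = batch_size - n_pos
    batch = rng.choices(pos_idx, k=n_pos) + rng.choices(neg_idx, k=n_neg)
    rng.shuffle(batch)  # avoid positives clustering at the front
    return batch

# 5 positives vs 95 negatives, yet every batch is half positive.
batch = controlled_sampler(list(range(5)), list(range(5, 100)), batch_size=32)
print(sum(i < 5 for i in batch))  # 16 positives out of 32
```

With uniform random sampling, a batch of 32 from this dataset would contain fewer than two positives on average, making pairwise AUC-style gradient estimates extremely noisy; controlling the ratio removes that source of variance.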
Details
- Title
- Deep AUC maximization
- Creators
- Zhuoning Yuan
- Contributors
- Tianbao Yang (Advisor)
- Xun Zhou (Committee Member)
- Tong Wang (Committee Member)
- Milan Sonka (Committee Member)
- Bijaya Adhikari (Committee Member)
- Resource Type
- Dissertation
- Degree Awarded
- Doctor of Philosophy (PhD), University of Iowa
- Degree in
- Computer Science
- Date degree season
- Spring 2023
- Publisher
- University of Iowa
- DOI
- 10.25820/etd.007059
- Number of pages
- xiv, 169 pages
- Copyright
- Copyright 2023 Zhuoning Yuan
- Language
- English
- Date submitted
- 04/25/2023
- Date approved
- 06/30/2023
- Description illustrations
- illustrations, tables, graphs
- Description bibliographic
- Includes bibliographical references (pages 145-169).
- Public Abstract (ETD)
- AUC (Area Under the Curve) is a widely used measure to evaluate the capability of an AI model to distinguish between two classes. AUC measures the success of various applications involving imbalanced data, where one class contains significantly fewer examples than the other, such as diagnosing COVID-19. Directly optimizing AUC can potentially lead to the largest improvements in these applications. To this end, we propose Deep AUC Maximization (DAM) to make stochastic AUC optimization more practical for learning deep neural networks. However, deploying DAM presents several challenges. First, existing works use a square-based surrogate loss to formulate AUC, which is less robust to noisy data and sensitive to easy data. Second, existing works adopt a two-stage training framework (i.e., pretraining and fine-tuning) to tackle complex tasks; however, this requires extensive experiments to find the optimal configuration for each stage. Third, supervised pretraining may not be suitable for datasets with limited labeled samples, where models can learn low-quality representations that negatively affect the performance of DAM in downstream tasks. Fourth, the standard training pipeline relies on random data sampling and mini-batch techniques, which require large batch sizes or additional effort to achieve good performance. In this dissertation, we aim to address the first three challenges by developing efficient stochastic algorithms for optimizing AUC and tackle the fourth challenge by developing a novel deep learning library named LibAUC.
- Academic Unit
- Computer Science
- Record Identifier
- 9984425390602771