Nonconvex min-max optimization in deep learning: algorithms and applications
Abstract
Details
- Title: Subtitle
- Nonconvex min-max optimization in deep learning: algorithms and applications
- Creators
- Mingrui Liu
- Contributors
- Tianbao Yang (Advisor)Qihang Lin (Committee Member)Kasturi Varadarajan (Committee Member)Suely Oliveira (Committee Member)Weiyu Xu (Committee Member)
- Resource Type
- Dissertation
- Degree Awarded
- Doctor of Philosophy (PhD), University of Iowa
- Degree in
- Computer Science
- Date degree season
- Summer 2020
- DOI
- 10.17077/etd.005562
- Publisher
- University of Iowa
- Number of pages
- xv, 168 pages
- Copyright
- Copyright 2020 Mingrui Liu
- Comment
- This thesis has been optimized for improved web viewing. If you require the original version, contact the University Archives at the University of Iowa: https://www.lib.uiowa.edu/sc/contact/
- Language
- English
- Description illustrations
- color illustrations
- Description bibliographic
- Includes bibliographical references (pages 149-168).
- Public Abstract (ETD)
Nonconvex min-max optimization receives increasing attention in modern machine learning, especially in the context of deep learning. Examples include stochastic AUC maximization with deep neural networks and Generative Adversarial Nets (GANs), which correspond to a nonconvex-concave and nonconvex-nonconcave min-max problem respectively. The classical theory of min-max optimization mainly focuses on convex-concave setting, which is not applicable for deep learning applications with nonconvex min-max formulation. A natural question is proposed—how to design provably efficient algorithms for nonconvex min-max problems in deep learning? To answer this question, this dissertation focuses on solving two important deep learning applications: stochastic AUC maximization with deep neural networks, and GANs. For stochastic AUC maximization with deep neural network, we approach this problem as a nonconvex-concave min-max problem, explore an special property called Polyak-Lojasiewicz (PL) that has been proved and observed in deep learning, and develop provably fast stochastic algorithms with practical step size scheme. For GANs, we view this problem as a nonconvex-nonconcave min-max problem and tackle it from the following three perspectives: the first-order convergence theory for weakly-convex-weakly-concave min-max problem with corresponding algorithms, a provably efficient adaptive gradient algorithm for training GANs, and a decentralized parallel algorithm for training GANs.
- Academic Unit
- Computer Science
- Record Identifier
- 9983988296702771