Optimizing neural network structures: faster speed, smaller size, less tuning
Details
- Title: Subtitle
- Optimizing neural network structures: faster speed, smaller size, less tuning
- Creators
- Zhe Li - University of Iowa
- Contributors
- Tianbao Yang (Advisor); Qihang Lin (Committee Member); Suely Oliveira (Committee Member); Kasturi R. Varadarajan (Committee Member); Padmini Srinivasan (Committee Member)
- Resource Type
- Dissertation
- Degree Awarded
- Doctor of Philosophy (PhD), University of Iowa
- Degree in
- Computer Science
- Date degree season
- Autumn 2018
- DOI
- 10.17077/etd.rgwxihal
- Publisher
- University of Iowa
- Number of pages
- xv, 94 pages
- Copyright
- Copyright © 2018 Zhe Li
- Language
- English
- Date submitted
- 11/19/2018
- Description illustrations
- color illustrations
- Description bibliographic
- Includes bibliographical references (pages 85-94).
- Public Abstract (ETD)
Deep neural networks have achieved tremendous success in many domains (e.g., computer vision [48, 79, 19], speech recognition [33, 14], natural language processing [14, 11], and games [78, 77]). However, many challenges remain in the deep learning community, such as how to speed up the training of large deep neural networks, how to compress large neural networks for mobile and embedded devices without performance loss, and how to automatically design the optimal network structure for a given task.
To speed up the training of large neural networks, we propose to use multinomial sampling for dropout, i.e., sampling features or neurons according to a multinomial distribution with different probabilities for different features/neurons. We further propose an efficient adaptive dropout (named evolutional dropout) that computes the sampling probabilities on the fly from a mini-batch of examples, to tackle the evolving distribution of neurons in deep learning.
To compress large neural network structures, we propose a simple yet powerful method for compressing deep CNNs based on parameter binarization. We also design a new block structure, codenamed the pattern residual block, that adds transformed feature maps generated by 1 × 1 convolutions to the pattern feature maps generated by k × k convolutions. Based on this block, we design a small network with roughly 1 million parameters.
To automatically design neural networks, we study a genetic programming approach for optimizing the structure of a CNN for a given task under limited computational resources, without imposing strong restrictions on the search space. To reduce the computational cost, we propose two general strategies that are observed to be helpful: (i) aggressively selecting the strongest individuals for survival and reproduction, and killing weaker individuals at a very early age; (ii) increasing the mutation frequency to encourage diversity and faster evolution. To further design networks with improved performance at a given model size and reduced computational cost, we propose an ecologically inspired genetic approach for neural network structure search that includes two types of succession, primary and secondary, as well as accelerated extinction.
- Academic Unit
- Computer Science
- Record Identifier
- 9983777230602771
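
The multinomial dropout idea described in the abstract can be illustrated with a small sketch. It is not the dissertation's implementation: the abstract does not give the formula for the sampling probabilities, so this sketch assumes they are proportional to each feature's root mean square over the mini-batch, and the function names (`sampling_probabilities`, `multinomial_mask`, `apply_dropout`) and the number of draws `k` are hypothetical. The mask is scaled by `1 / (k * p_i)` so that its expected value is 1 for every feature.

```python
import math
import random
from collections import Counter


def sampling_probabilities(batch, eps=1e-12):
    """Per-feature sampling probabilities computed from one mini-batch.

    Assumption of this sketch (not stated in the abstract): p_i is
    proportional to the root mean square of feature i over the batch.
    """
    n = len(batch)
    d = len(batch[0])
    rms = [math.sqrt(sum(row[i] ** 2 for row in batch) / n) + eps
           for i in range(d)]
    total = sum(rms)
    return [r / total for r in rms]


def multinomial_mask(p, k, rng):
    """Draw k features with replacement according to p.

    Returns the scaled mask m_i / (k * p_i), where m_i counts how often
    feature i was drawn. Since E[m_i] = k * p_i, each entry has mean 1.
    """
    counts = Counter(rng.choices(range(len(p)), weights=p, k=k))
    return [counts[i] / (k * p[i]) for i in range(len(p))]


def apply_dropout(batch, k, rng):
    """Recompute p from the current mini-batch, then mask every example."""
    p = sampling_probabilities(batch)
    mask = multinomial_mask(p, k, rng)
    dropped = [[v * m for v, m in zip(row, mask)] for row in batch]
    return dropped, p, mask


rng = random.Random(0)
batch = [[1.0, 0.1, 2.0], [3.0, 0.2, 0.0]]  # 2 examples, 3 features
dropped, p, mask = apply_dropout(batch, k=2, rng=rng)
```

Because the probabilities are recomputed for each mini-batch, features that currently carry larger activations are sampled more often, which is one way to read "on the fly" in the abstract. Features that are never drawn get a mask value of 0 and are dropped for that batch.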