Training Autoencoders Using Stochastic Hessian-Free Optimization with LSMR

Ibrahim Emirahmetoglu; David E Stewart

doi:10.48550/arxiv.2504.13302

Preprint

Open access

arXiv.org

Cornell University

04/17/2025

https://doi.org/10.48550/arxiv.2504.13302

Preprint (Author's original). This preprint has not been evaluated by subject experts through peer review. Preprints may undergo extensive changes and/or become peer-reviewed journal articles. Open Access.

Abstract

Hessian-free (HF) optimization has been shown to effectively train deep autoencoders (Martens, 2010). In this paper, we aim to accelerate HF training of autoencoders by reducing the amount of data used in training. Standard HF uses the conjugate gradient algorithm to estimate update directions; we propose using the LSMR method instead, which is known for effectively solving large sparse linear systems. We also incorporate the improved preconditioner for HF optimization of Chapelle & Erhan (2011). In addition, we introduce a new mini-batch selection algorithm to mitigate overfitting. Our algorithm starts with a small subset of the training data and gradually increases the mini-batch size based on (i) variance estimates obtained during the computation of a mini-batch gradient (Byrd et al., 2012) and (ii) the relative decrease in objective value for the validation data. Our experimental results demonstrate that our stochastic Hessian-free optimization, using the LSMR method and the new sample selection algorithm, leads to rapid training of deep autoencoders with improved generalization error.
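
The variance-based criterion (i) can be illustrated with a minimal Python sketch of a norm test in the style of Byrd et al. (2012): the mini-batch is considered large enough when the summed component-wise sample variance of the per-example gradients, divided by the batch size, is at most theta squared times the squared norm of the mini-batch gradient. This is not the authors' implementation. The function names, the threshold `theta`, and the multiplicative `growth` rule are assumptions made for illustration, and the validation-based criterion (ii) is not modelled.

```python
import numpy as np


def variance_test_passes(per_example_grads, theta=0.5):
    """Return True if the mini-batch gradient is deemed accurate enough.

    per_example_grads: array of shape (batch_size, n_params), one gradient
    per training example (batch_size must be at least 2).
    theta: hypothetical tolerance; smaller values demand larger batches.
    """
    g = np.asarray(per_example_grads, dtype=float)
    n = g.shape[0]
    batch_grad = g.mean(axis=0)
    # Unbiased sample variance of each gradient component across examples.
    component_var = g.var(axis=0, ddof=1)
    lhs = component_var.sum() / n
    rhs = theta ** 2 * float(np.dot(batch_grad, batch_grad))
    return bool(lhs <= rhs)


def next_batch_size(per_example_grads, theta=0.5, growth=2.0, max_size=None):
    """Keep the batch size if the test passes, otherwise enlarge it.

    The multiplicative growth rule is an assumption for this sketch, not
    the update rule used in the paper.
    """
    n = len(per_example_grads)
    if variance_test_passes(per_example_grads, theta):
        return n
    new_size = int(np.ceil(growth * n))
    return min(new_size, max_size) if max_size is not None else new_size
```

In a training loop, `next_batch_size` would be called once per outer iteration with the per-example gradients already computed for the current mini-batch, so the test adds little cost beyond the gradient computation itself.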

Computer Science - Learning

Mathematics - Optimization and Control

Details

Title: Training Autoencoders Using Stochastic Hessian-Free Optimization with LSMR
Creators: Ibrahim Emirahmetoglu - University of Iowa; David E Stewart - University of Iowa
Resource Type: Preprint
Publication Details: arXiv.org
DOI: 10.48550/arxiv.2504.13302
ISSN: 2331-8422
Publisher: Cornell University; Ithaca, New York
Language: English
Date posted: 04/17/2025
Academic Unit: Mathematics
Record Identifier: 9984813150402771
