MLAFormer: Multi-scale transformer with local convolutional auto-correlation and pre-training for time series forecasting
Journal article   Peer reviewed

Ying Jin, Ming Gao, Jiafu Tang, Weiguo Fan, Yanwu Yang and Jingmin An
Information Processing & Management, Vol. 63(6), 104741
09/2026
DOI: 10.1016/j.ipm.2026.104741

Highlights

• We propose MLAFormer, a multi-scale Transformer model for time series forecasting.
• Local Convolutional Auto-Correlation enhances local context perception by combining causal convolution with frequency-aware auto-correlation.
• Asymmetric self-supervised pre-training enables the model to automatically infer positional information and learn richer temporal structures.
• MLAFormer reduces MAE by 7.4% on average compared with a recent state-of-the-art model, with an additional 3.0% gain from pre-training.

Abstract

Transformer-based models have achieved strong performance in long-term time series forecasting, yet they often struggle to jointly model short-term local patterns and long-range dependencies under a unified representation, especially when temporal dynamics vary across resolutions. To address this challenge, we propose MLAFormer, a multi-scale encoder-decoder framework for long-sequence forecasting. MLAFormer learns temporal representations at multiple resolutions through a cross multi-scale architecture, enabling information exchange between fine-grained and coarse-grained temporal features. In addition, we propose a Local Convolutional Auto-Correlation mechanism that enhances local contextual modeling while efficiently capturing periodic dependencies. To further improve representation quality, we incorporate self-supervised pre-training within the encoder-decoder framework and investigate different masking strategies for time series reconstruction. Extensive experiments on nine benchmark datasets demonstrate that MLAFormer consistently outperforms recent state-of-the-art methods in long-term forecasting. On average, MLAFormer reduces MAE by 7.4% compared with the most recent advanced baseline, with an additional 3.0% improvement from pre-training.
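The cross multi-scale architecture is described only at a high level here. As a rough illustration of the kind of multi-resolution input such a model consumes, the following is a minimal PyTorch sketch that builds coarse-grained views of a series by average pooling along the time axis; the scale set and the helper name are assumptions for illustration, not MLAFormer's actual design, and the cross-scale information exchange itself is not reproduced.

```python
import torch
import torch.nn.functional as F

def multi_scale_views(x, scales=(1, 2, 4)):
    """Hypothetical helper: coarse-to-fine views of a series via average pooling.

    x: (batch, length, channels) -> list of tensors of shape (batch, length // s, channels)
    """
    views = []
    for s in scales:
        if s == 1:
            views.append(x)
        else:
            # Pool over time; channels-last -> channels-first for pooling, then back.
            pooled = F.avg_pool1d(x.transpose(1, 2), kernel_size=s, stride=s)
            views.append(pooled.transpose(1, 2))
    return views
```

A multi-scale encoder would then process each view and let fine- and coarse-grained features interact, which is the exchange the abstract attributes to the cross multi-scale architecture.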
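The highlights state that Local Convolutional Auto-Correlation combines causal convolution with frequency-aware auto-correlation. The sketch below is one plausible reading, assuming an Autoformer-style auto-correlation (top-k lag selection in the frequency domain, value aggregation by time-delay rolling) applied after depthwise causal convolutions on queries and keys; the class name, kernel size, and top-k default are illustrative assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalConvAutoCorrelation(nn.Module):
    """Hypothetical sketch of Local Convolutional Auto-Correlation."""

    def __init__(self, d_model: int, kernel_size: int = 3, top_k: int = 4):
        super().__init__()
        self.kernel_size = kernel_size
        self.top_k = top_k
        # Depthwise convolutions inject local context into queries and keys.
        self.q_conv = nn.Conv1d(d_model, d_model, kernel_size, groups=d_model)
        self.k_conv = nn.Conv1d(d_model, d_model, kernel_size, groups=d_model)

    def _causal_conv(self, x, conv):
        # x: (batch, length, d_model); left-pad so outputs depend only on the past.
        x = F.pad(x.transpose(1, 2), (self.kernel_size - 1, 0))
        return conv(x).transpose(1, 2)

    def forward(self, q, k, v):
        # q, k, v: (batch, length, d_model)
        B, L, D = q.shape
        q = self._causal_conv(q, self.q_conv)
        k = self._causal_conv(k, self.k_conv)
        # Auto-correlation via the Wiener-Khinchin theorem: FFT along time.
        q_f = torch.fft.rfft(q, dim=1)
        k_f = torch.fft.rfft(k, dim=1)
        corr = torch.fft.irfft(q_f * torch.conj(k_f), n=L, dim=1)   # (B, L, D)
        # Score each lag by its mean correlation and keep the top-k lags.
        lag_scores = corr.mean(dim=-1)                               # (B, L)
        weights, lags = torch.topk(lag_scores, self.top_k, dim=-1)   # (B, k)
        weights = torch.softmax(weights, dim=-1)
        # Aggregate values rolled by each selected lag.
        out = torch.zeros_like(v)
        for i in range(self.top_k):
            for b in range(B):
                rolled = torch.roll(v[b], shifts=-int(lags[b, i]), dims=0)
                out[b] += weights[b, i] * rolled
        return out
```

For example, `LocalConvAutoCorrelation(d_model=64)` applied to a `(8, 96, 64)` tensor as queries, keys, and values returns a tensor of the same shape; the FFT-based correlation keeps the per-lag cost near O(L log L) rather than the O(L^2) of dot-product attention.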
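The abstract reports investigating different masking strategies for reconstruction-based pre-training but does not spell out the asymmetric scheme. As a generic point of reference, the following is a minimal sketch of one common strategy, patch masking with an MSE loss on masked positions only; the function name, mask ratio, and patch length are assumptions, and MLAFormer's actual asymmetric masking may differ.

```python
import torch

def masked_reconstruction_loss(model, series, mask_ratio=0.25, patch_len=8):
    # series: (batch, length, channels). Hypothetical objective: zero out random
    # contiguous patches and train the model to reconstruct them.
    B, L, D = series.shape
    n_patches = L // patch_len
    # Decide per sample which patches to hide.
    patch_mask = torch.rand(B, n_patches, device=series.device) < mask_ratio
    mask = torch.zeros(B, L, dtype=torch.bool, device=series.device)
    mask[:, : n_patches * patch_len] = patch_mask.repeat_interleave(patch_len, dim=1)
    mask = mask.unsqueeze(-1).expand(B, L, D)
    corrupted = series.masked_fill(mask, 0.0)
    recon = model(corrupted)          # assumed: model maps (B, L, D) -> (B, L, D)
    sq_err = (recon - series) ** 2
    # Average the reconstruction error over masked positions only.
    return sq_err[mask].mean() if mask.any() else sq_err.mean()
```

Because the loss is computed only where values were hidden, the model cannot rely on copying visible inputs and must infer the missing temporal structure, which is the representation-learning effect the pre-training stage targets.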
Keywords: Multi-scale; Pre-training; Time series forecasting; Transformer
