Align-Consistency: Improving Non-autoregressive and Semi-supervised ASR with Consistency Regularization
Preprint   Open access


Wanting Huang and Weiran Wang
arXiv.org
Cornell University
02/26/2026
DOI: 10.48550/arxiv.2602.23171
https://doi.org/10.48550/arxiv.2602.23171

Abstract

Consistency regularization (CR) improves the robustness and accuracy of Connectionist Temporal Classification (CTC) by ensuring predictions remain stable across input perturbations. In this work, we propose Align-Consistency, an extension of CR designed for Align-Refine – a non-autoregressive (non-AR) model that performs iterative refinement of frame-level hypotheses. This method leverages the speed of parallel inference while significantly boosting recognition performance. The effectiveness of Align-Consistency is demonstrated in two settings. First, in the fully supervised setting, our results indicate that applying CR to both the base CTC model and the subsequent refinement steps is critical, and the accuracy improvements from non-AR decoding and CR are mutually additive. Second, for semi-supervised ASR, we employ fast non-AR decoding to generate online pseudo-labels on unlabeled data, which are used to further refine the supervised model and lead to substantial gains.
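To make the core idea concrete: consistency regularization penalizes divergence between the model's frame-level posteriors under two perturbations of the same input. The sketch below is an illustrative NumPy implementation, not the authors' code; the symmetric-KL form, the frame/vocabulary shapes, and the random perturbation are all assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax over the label dimension."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def consistency_loss(logits_a, logits_b):
    """Symmetric KL between frame-level posteriors of two perturbed views.

    logits_a, logits_b: (T, V) arrays of per-frame logits
    (T frames, V output symbols including the CTC blank).
    Returns a scalar penalty; 0 when the two views agree exactly.
    """
    p = softmax(logits_a)
    q = softmax(logits_b)
    kl_pq = np.sum(p * (np.log(p) - np.log(q)), axis=-1)
    kl_qp = np.sum(q * (np.log(q) - np.log(p)), axis=-1)
    return 0.5 * float(np.mean(kl_pq + kl_qp))

# Identical views incur zero penalty; diverging views are penalized.
rng = np.random.default_rng(0)
logits = rng.normal(size=(50, 32))      # 50 frames, 32-symbol vocabulary
perturbed = logits + rng.normal(scale=0.5, size=logits.shape)

print(consistency_loss(logits, logits))     # exactly 0.0
print(consistency_loss(logits, perturbed))  # strictly positive
```

In training, this penalty would be added to the CTC loss (and, per the abstract, also applied at each Align-Refine refinement step); the perturbations would come from data augmentation such as SpecAugment rather than the Gaussian noise used here for illustration.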
