Finite-time optimal policy identification for the stochastic shortest path problem
Journal article   Peer reviewed

Wangzhi Zhou, Yuanqiu Mo and Soura Dasgupta
IEEE control systems letters, Vol.10, pp.409-414
2026
DOI: 10.1109/LCSYS.2026.3697256

Abstract

This letter investigates finite-time optimal policy identification for stochastic shortest path (SSP) problems. Rather than relying on classical assumptions such as graph acyclicity or the prior requirement that all policies be proper, the proposed approach employs a Lyapunov-like function to guarantee that the optimal policy is obtained within a finite number of iterations under both value iteration (VI) and policy iteration (PI). The proposed condition is further shown to be both necessary and sufficient for policy properness. Moreover, within this framework, the expected hitting time to the terminal state is guaranteed to be finite under any stationary policy. Simulation results validate the theoretical findings.
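
For readers unfamiliar with the setting, the following is a minimal sketch of textbook value iteration on a small SSP instance. It is not the authors' method and does not implement their Lyapunov-like condition. The three-state problem, the arrays P and C, and the function name value_iteration are all hypothetical, chosen only for illustration. In this toy instance the cost-to-go values converge only asymptotically, while the greedy policy settles on the optimal action after a few sweeps, which is the kind of finite-time behavior the abstract refers to.

```python
import numpy as np

# Hypothetical SSP: states 0 and 1 are non-terminal, state 2 is terminal.
# P[a, s, t] is the probability of moving from s to t under action a.
P = np.array([
    [[0.0, 1.0, 0.0],   # action 0 in state 0: move to state 1
     [0.0, 0.5, 0.5],   # action 0 in state 1: reach terminal w.p. 0.5, else stay
     [0.0, 0.0, 1.0]],  # terminal state is absorbing
    [[0.0, 0.0, 1.0],   # action 1 in state 0: jump straight to terminal
     [0.0, 0.0, 1.0],   # action 1 in state 1: jump straight to terminal
     [0.0, 0.0, 1.0]],
])
# C[s, a] is the stage cost of action a in state s; the terminal state is cost-free.
C = np.array([[1.0, 5.0],
              [1.0, 3.0],
              [0.0, 0.0]])


def value_iteration(P, C, terminal, tol=1e-10, max_iter=10_000):
    """Standard value iteration for an SSP with a single terminal state."""
    J = np.zeros(C.shape[0])
    sweeps = 0
    for sweeps in range(1, max_iter + 1):
        # Q[s, a] = C[s, a] + sum_t P[a, s, t] * J[t]
        Q = C + np.einsum("ast,t->sa", P, J)
        J_new = Q.min(axis=1)
        J_new[terminal] = 0.0  # cost-to-go at the terminal state is zero
        converged = np.max(np.abs(J_new - J)) < tol
        J = J_new
        if converged:
            break
    # Greedy policy with respect to the final cost-to-go estimate.
    policy = (C + np.einsum("ast,t->sa", P, J)).argmin(axis=1)
    return J, policy, sweeps


J, policy, sweeps = value_iteration(P, C, terminal=2)
print("cost-to-go:", J)
print("greedy policy:", policy)
print("sweeps:", sweeps)
```

Solving this toy instance by hand gives an optimal cost-to-go of 2 from state 1 (from J = 1 + 0.5 J) and 3 from state 0, with action 0 optimal in both non-terminal states; the numerical result should agree up to the tolerance.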
