Journal article
Finite-time optimal policy identification for the stochastic shortest path problem
IEEE Control Systems Letters, Vol.10, pp.409-414
2026
DOI: 10.1109/LCSYS.2026.3697256
Abstract
This letter investigates finite-time optimal policy identification for stochastic shortest path (SSP) problems. Rather than relying on classical assumptions such as graph acyclicity or the prior requirement that all policies be proper, the proposed approach employs a Lyapunov-like function to guarantee that both value iteration (VI) and policy iteration (PI) obtain the optimal policy within a finite number of iterations. The proposed condition is further shown to be both necessary and sufficient for policy properness. Moreover, within this framework, the expected hitting time to the terminal state is guaranteed to be finite under any stationary policy. Simulation results are provided to validate the theoretical findings.
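For readers unfamiliar with the setting, the following is a minimal sketch of generic value iteration on a toy SSP instance. It is not the letter's method and does not implement its Lyapunov-like condition. The three-state problem, the arrays `P` and `c`, and the function name `value_iteration` are all hypothetical and chosen only for illustration. The sketch records the greedy policy at every iteration, so one can see the point after which the policy stops changing even though the cost estimates are still converging.

```python
import numpy as np

# Hypothetical toy SSP: states 0 and 1 are non-terminal, state 2 is terminal
# (absorbing, zero cost). P[a, s, s2] is the probability of moving s -> s2
# under action a; c[a, s] is the cost of taking action a in state s.
P = np.array([
    [[0.0, 1.0, 0.0],   # action 0 in state 0: move to state 1
     [0.0, 0.5, 0.5],   # action 0 in state 1: stay or terminate, 50/50
     [0.0, 0.0, 1.0]],  # terminal state is absorbing
    [[0.0, 0.0, 1.0],   # action 1 in state 0: terminate directly
     [0.0, 0.0, 1.0],   # action 1 in state 1: terminate directly
     [0.0, 0.0, 1.0]],
])
c = np.array([[1.0, 1.0, 0.0],   # costs of action 0
              [4.0, 3.0, 0.0]])  # costs of action 1


def value_iteration(P, c, tol=1e-10, max_iter=10_000):
    """Standard value iteration; returns the cost estimate and the greedy
    policy recorded at every iteration."""
    J = np.zeros(P.shape[1])
    policies = []
    for _ in range(max_iter):
        Q = c + P @ J                 # Q[a, s] = c(s, a) + sum_s2 P(s2 | s, a) J(s2)
        J_new = Q.min(axis=0)         # Bellman update
        policies.append(Q.argmin(axis=0))
        done = np.max(np.abs(J_new - J)) < tol
        J = J_new
        if done:
            break
    return J, policies


J, policies = value_iteration(P, c)

# Index of the earliest iteration from which the greedy policy never changes.
final = policies[-1]
k = len(policies) - 1
while k > 0 and np.array_equal(policies[k - 1], final):
    k -= 1

print("cost-to-go estimate:", J)
print("greedy policy on non-terminal states:", final[:2])
print("policy unchanged from iteration", k + 1, "of", len(policies))
```

In this toy instance the greedy policy is already fixed at the first iteration, while the cost estimate needs many more iterations to meet the tolerance. That gap between policy identification and value convergence is the kind of behavior the abstract refers to, although the conditions under which it is guaranteed are those of the letter, not of this sketch.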
Details
- Title: Subtitle
- Finite-time optimal policy identification for the stochastic shortest path problem
- Creators
- Wangzhi Zhou - Southeast University
- Yuanqiu Mo - Southeast University
- Soura Dasgupta - University of Iowa
- Resource Type
- Journal article
- Publication Details
- IEEE Control Systems Letters, Vol.10, pp.409-414
- DOI
- 10.1109/LCSYS.2026.3697256
- ISSN
- 2475-1456
- eISSN
- 2475-1456
- Publisher
- IEEE
- Language
- English
- Electronic publication date
- 05/26/2026
- Date published
- 2026
- Academic Unit
- Electrical and Computer Engineering
- Record Identifier
- 9985166828202771