Journal article
Multivariate online regression analysis with heterogeneous streaming data
Canadian journal of statistics, Vol.51(1), pp.111-133
03/2023
DOI: 10.1002/cjs.11667
Abstract
New data collection and storage technologies have given rise to a new field of streaming data analytics, called real-time statistical methodology for online data analyses. Most existing online learning methods are based on homogeneity assumptions, which require the samples in a sequence to be independent and identically distributed. However, inter-data batch correlation and dynamically evolving batch-specific effects are among the key defining features of real-world streaming data such as electronic health records and mobile health data. This article is built under a state-space mixed model framework in which the observed data stream is driven by a latent state process that follows a Markov process. In this setting, online maximum likelihood estimation is made challenging by high-dimensional integrals and complex covariance structures. In this article, we develop a real-time Kalman-filter-based regression analysis method that updates both point estimates and their standard errors for fixed population average effects while adjusting for dynamic hidden effects. Both theoretical justification and numerical experiments demonstrate that our proposed online method has statistical properties similar to those of its offline counterpart and enjoys great computational efficiency. We also apply this method to analyze an electronic health record dataset.
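The abstract does not give the model equations, so the following is only a minimal sketch of a generic Kalman-filter update for batch-wise regression, not the authors' method. It assumes a hypothetical model in which each batch has response `y = X @ beta + theta_b + noise`, with fixed effects `beta` held constant across batches and a scalar batch-specific effect `theta_b` following an AR(1) process. The names `kalman_batch_update`, `phi`, `q`, and `r` are illustrative, and the variance parameters are treated as known.

```python
import numpy as np

def kalman_batch_update(s, P, X, y, phi, q, r):
    """One predict/update step for the state s = [beta, theta_b].

    Assumed (hypothetical) model:
      theta_b = phi * theta_{b-1} + w,    w ~ N(0, q)
      y       = X @ beta + theta_b + e,   e ~ N(0, r * I)
    """
    n, p = X.shape
    # Predict: beta is static, theta evolves as AR(1).
    F = np.eye(p + 1)
    F[p, p] = phi
    Q = np.zeros((p + 1, p + 1))
    Q[p, p] = q
    s = F @ s
    P = F @ P @ F.T + Q
    # Update with the new batch; H maps the state to the batch mean.
    H = np.hstack([X, np.ones((n, 1))])
    S = H @ P @ H.T + r * np.eye(n)      # innovation covariance
    K = np.linalg.solve(S, H @ P).T      # Kalman gain P H^T S^{-1}
    s = s + K @ (y - H @ s)
    P = P - K @ H @ P
    P = 0.5 * (P + P.T)                  # keep P symmetric numerically
    return s, P

# Simulated stream (all values below are made up for illustration).
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0])
phi, q, r = 0.8, 0.5, 1.0
p = beta_true.size

s = np.zeros(p + 1)
P = np.diag([1e4, 1e4, q / (1 - phi**2)])  # vague prior on beta
theta = 0.0
for _ in range(200):                       # 200 batches of 20 observations
    theta = phi * theta + rng.normal(scale=np.sqrt(q))
    X = rng.normal(size=(20, p))
    y = X @ beta_true + theta + rng.normal(scale=np.sqrt(r), size=20)
    s, P = kalman_batch_update(s, P, X, y, phi, q, r)

beta_hat = s[:p]                           # running point estimate
se = np.sqrt(np.diag(P)[:p])               # running standard errors
```

Each batch is processed once and then discarded; only the state estimate `s` and its covariance `P` are carried forward, which is what makes this style of update suitable for streaming data.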
Details
- Title: Subtitle
- Multivariate online regression analysis with heterogeneous streaming data
- Creators
- Lan Luo - University of Iowa
- Peter X.-K. Song - University of Michigan
- Resource Type
- Journal article
- Publication Details
- Canadian journal of statistics, Vol.51(1), pp.111-133
- DOI
- 10.1002/cjs.11667
- ISSN
- 0319-5724
- eISSN
- 1708-945X
- Grant note
- DOI: 10.13039/100000001, name: National Science Foundation, award: DMS 1811734, DMS 2113564
- Language
- English
- Electronic publication date
- 12/03/2021
- Date published
- 03/2023
- Academic Unit
- Statistics and Actuarial Science
- Record Identifier
- 9984199859502771