Conference proceeding
Provably Efficient Interaction-Grounded Learning with Personalized Reward
Advances in Neural Information Processing Systems 37 - 38th Conference on Neural Information Processing Systems, NeurIPS 2024, Vol.37
2024
Abstract
Interaction-Grounded Learning (IGL) [Xie et al., 2021] is a powerful framework in which a learner aims at maximizing unobservable rewards through interacting with an environment and observing reward-dependent feedback on the taken actions. To deal with personalized rewards that are ubiquitous in applications such as recommendation systems, Maghakian et al. [2022] study a version of IGL with context-dependent feedback, but their algorithm does not come with theoretical guarantees. In this work, we consider the same problem and provide the first provably efficient algorithms with sublinear regret under realizability. Our analysis reveals that the step-function estimator of prior work can deviate uncontrollably due to finite-sample effects. Our solution is a novel Lipschitz reward estimator which underestimates the true reward and enjoys favorable generalization performances. Building on this estimator, we propose two algorithms, one based on explore-then-exploit and the other based on inverse-gap weighting. We apply IGL to learning from image feedback and learning from text feedback, which are reward-free settings that arise in practice. Experimental results showcase the importance of using our Lipschitz reward estimator and the overall effectiveness of our algorithms.
Details
- Title: Subtitle
- Provably Efficient Interaction-Grounded Learning with Personalized Reward
- Creators
- Mengxiao Zhang - University of Iowa, United StatesYuheng Zhang - University of Illinois Urbana-ChampaignHaipeng Luo - University of Southern CaliforniaPaul Mineiro - Microsoft Research New York City (United States)
- Resource Type
- Conference proceeding
- Publication Details
- Advances in Neural Information Processing Systems 37 - 38th Conference on Neural Information Processing Systems, NeurIPS 2024, Vol.37
- ISSN
- 1049-5258
- Grant note
- 1943607 / National Sleep Foundation (100003187) National Science Foundation (http://data.elsevier.com/vocabulary/SciValFunders/100000001) IIS-1943607 / National Science Foundation (http://data.elsevier.com/vocabulary/SciValFunders/100000001)
- Language
- English
- Date published
- 2024
- Academic Unit
- Business Analytics
- Record Identifier
- 9985024139802771
Metrics
21 Record Views