“Regard is enough”

Abstract: The provocative interpretation of the reward hypothesis in “Reward is enough” [Silver et al., 2021] is due to an overly wide scope “of what we mean by goals and purposes”, whereas in a more restricted setting the validity of the reward hypothesis can be established precisely [Bowling et al., 2023]. There are, of course, applications of reinforcement learning where the validity of the reward hypothesis is less clear, such as multi-agent, multi-objective, and biological reinforcement learning, as well as learning under uncertainty or from human feedback. Likewise, the inverse reinforcement learning problem depends on the interpretability of a given data set in terms of the reward hypothesis. We present a framework that combines reward with an attentive (“regard”) component to improve robustness, exploration, and engagement, and that allows for different stances on the reinforcement learning problem. We will also discuss the relevance of this biologically inspired approach in the context of human-robot interaction.

Date: 
Thursday, 23 January, 2025 - 13:00
Speaker: 
Michael Herrmann
Affiliation: 
University of Edinburgh
Location: 
Informatics Forum, G.03