AToM-Bot: Embodied Fulfillment of Unspoken Human Needs with Affective Theory of Mind
Published in RSS SIHR Workshop Spotlight, 2024
Wei Ding*, Fanhong Li*, Ziteng Ji, Zhengrong Xue, Jia Liu
* (Equal contribution)
We propose AToM-Bot, a novel task generation and execution framework for proactive robot-human interaction that leverages the human mental and physical state inference capabilities of a Vision Language Model (VLM) prompted with Affective Theory of Mind (AToM). Without requiring explicit commands from humans, AToM-Bot proactively generates and executes feasible tasks to improve general human well-being. When around humans, AToM-Bot first detects current human needs based on inferred human states and observations of the surrounding environment; it then generates tasks to fulfill these needs, taking its embodied constraints into account.
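The perceive-infer-generate loop described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the real system prompts a VLM with AToM, whereas here `infer_need` is a hard-coded rule table standing in for that inference, and all class, function, and capability names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    human_state: str       # inferred affective/physical state, e.g. "thirsty"
    environment: list      # objects visible in the surrounding scene

def infer_need(obs: Observation):
    """Stand-in for the AToM-prompted VLM: maps the inferred human state
    plus scene context to an unspoken need (rule table is illustrative)."""
    rules = {
        ("tired", "sofa"): "rest comfortably",
        ("thirsty", "water bottle"): "drink water",
    }
    for (state, obj), need in rules.items():
        if obs.human_state == state and obj in obs.environment:
            return need
    return None

def generate_task(need, capabilities):
    """Turns a detected need into a task, keeping only actions within the
    robot's embodied constraints (modeled here as a capability set)."""
    candidates = {
        "drink water": "fetch water bottle",
        "rest comfortably": "dim the lights",
    }
    task = candidates.get(need)
    if task and task.split()[0] in capabilities:
        return task
    return None

# Proactive loop: observe, infer an unspoken need, generate a feasible task.
obs = Observation(human_state="thirsty", environment=["water bottle", "desk"])
need = infer_need(obs)                       # -> "drink water"
task = generate_task(need, {"fetch", "grasp"})
print(task)                                  # -> fetch water bottle
```

The key design point the sketch mirrors is the separation of need detection (state plus environment) from task generation (needs filtered by embodied constraints), so the same inferred need can yield different tasks on robots with different capabilities.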