Separating Signal from Noise: Denoised World Models for Generalization in Reinforcement Learning
Sunday 14.06 16:00 - 16:30
- Graduate Student Seminar
Cognitive Robotics Lab, Cooper Building
Abstract: A central challenge in reinforcement learning is generalization: agents trained on a set of tasks often fail catastrophically when deployed on new ones, even when the underlying structure is approximately shared. We hypothesize that one key reason for this failure is spurious correlations: during training, the agent associates exogenous details (e.g. colors, backgrounds, and moving parts) with the task at hand, so it fails at test time when these correlations break down. Learning under exogenous noise is an active research area in reinforcement learning, with algorithms that provably identify the endogenous MDP. However, none of this work addresses how exogenous noise hinders generalization in a multi-task setting. We introduce the Contextual EXogenous Block MDP (CEX-BMDP), a formal framework that separates the true task dynamics from exogenous details and explains why naive agents conflate the two. We then describe a model-based algorithm that learns disentangled endogenous and exogenous dynamics, isolating the exogenous noise and its spurious correlation with the task. The learned representation is used to train a policy that generalizes better, zero-shot, to unseen tasks from the task distribution.