The Paths Perspective on Value Learning
Introduction In the last few years, reinforcement learning (RL) has made remarkable progress, including beating world-champion Go players, controlling robotic hands, and even painting pictures. One of the key sub-problems of RL is value estimation – learning the long-term consequences of being in a state. This can be tricky because future returns are generally noisy, […]