New preprint! 🚨
Performance of standard reinforcement learning (RL) algorithms depends on the scale of the rewards they aim to maximize.
Inspired by human cognitive processes, we leverage a cognitive bias to develop scale-invariant RL algorithms: reward range normalization.
Curious? Have a read!👇
about 1 year ago