Random Network Distillation in PyTorch

An example of an improvement in reward when transfer is done. I developed a PyTorch version of OpenAI’s Random Network Distillation (RND) with Proximal Policy Optimization (PPO) and trained it on Montezuma’s Revenge to address the challenge of sparse rewards.
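
To make the idea behind RND concrete, here is a minimal sketch of the exploration bonus. It is not the code from this project. A fixed, randomly initialized target network embeds each observation, a predictor network is trained to match that embedding, and the prediction error serves as the intrinsic reward. The names RNDBonus and make_net, the small MLP layers, and the 8-dimensional random observations are all assumptions made for illustration. An agent for Montezuma’s Revenge would typically use convolutional networks on game frames, plus observation and reward normalization, all of which are left out here.

```python
import torch
import torch.nn as nn


def make_net(obs_dim: int, feat_dim: int) -> nn.Sequential:
    # Hypothetical small MLP; stands in for whatever encoder the agent uses.
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, feat_dim))


class RNDBonus(nn.Module):
    """Illustrative RND module: frozen random target plus trainable predictor."""

    def __init__(self, obs_dim: int, feat_dim: int = 32):
        super().__init__()
        self.target = make_net(obs_dim, feat_dim)
        self.predictor = make_net(obs_dim, feat_dim)
        # The target network keeps its random initialization and is never trained.
        for p in self.target.parameters():
            p.requires_grad_(False)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            target_feat = self.target(obs)
        pred_feat = self.predictor(obs)
        # Per-observation squared error: used both as the predictor's loss
        # and (detached) as the intrinsic reward.
        return (pred_feat - target_feat).pow(2).mean(dim=1)


torch.manual_seed(0)
rnd = RNDBonus(obs_dim=8)
optimizer = torch.optim.Adam(rnd.predictor.parameters(), lr=1e-3)

# Placeholder batch of observations the agent has "visited".
seen = torch.randn(256, 8)
error_before = rnd(seen).mean().item()

# Train only the predictor to imitate the target on visited observations.
for _ in range(200):
    loss = rnd(seen).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

error_after = rnd(seen).mean().item()

# Intrinsic reward for each observation, with no gradient attached.
with torch.no_grad():
    intrinsic_reward = rnd(seen)
```

As the predictor trains on states the agent visits often, their prediction error shrinks and so does the bonus. States it has rarely seen keep a larger error, which is what gives the agent a learning signal when the game's own reward is sparse. In a PPO setup this bonus would be combined with the environment reward when computing advantages.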