I developed a PyTorch implementation of OpenAI’s Random Network Distillation (RND), combined with Proximal Policy Optimization (PPO), and trained it on Montezuma’s Revenge to address the challenge of sparse rewards.
See the source code.
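For readers unfamiliar with RND, the core idea can be sketched in a few lines: a predictor network is trained to match the output of a fixed, randomly initialized target network, and the prediction error on an observation serves as an exploration bonus. The sketch below is illustrative only and is not taken from the project's source code. The names `RND`, `make_net`, and `demo` are hypothetical, it assumes small MLPs on flat observation vectors rather than convolutional networks on Atari frames, and it leaves out PPO and the observation and reward normalization used in practice.

```python
import torch
import torch.nn as nn


def make_net(obs_dim: int, feat_dim: int) -> nn.Sequential:
    # Small MLP; a hypothetical stand-in for the conv nets used on game frames.
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, feat_dim))


class RND(nn.Module):
    def __init__(self, obs_dim: int, feat_dim: int = 32):
        super().__init__()
        self.target = make_net(obs_dim, feat_dim)     # fixed, randomly initialized
        self.predictor = make_net(obs_dim, feat_dim)  # trained to imitate the target
        for p in self.target.parameters():
            p.requires_grad_(False)

    def intrinsic_reward(self, obs: torch.Tensor) -> torch.Tensor:
        # Per-observation mean squared prediction error, shape (batch,).
        with torch.no_grad():
            target_feat = self.target(obs)
        pred_feat = self.predictor(obs)
        return (pred_feat - target_feat).pow(2).mean(dim=1)


def demo(steps: int = 200):
    # Train the predictor on a fixed batch and report the mean bonus before and after.
    torch.manual_seed(0)
    rnd = RND(obs_dim=8)
    opt = torch.optim.Adam(rnd.predictor.parameters(), lr=1e-3)
    seen = torch.randn(64, 8)  # stand-in for frequently visited states
    before = rnd.intrinsic_reward(seen).mean().item()
    for _ in range(steps):
        loss = rnd.intrinsic_reward(seen).mean()  # the same error is the training loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    after = rnd.intrinsic_reward(seen).mean().item()
    return before, after
```

Calling `demo()` returns the mean bonus on the same batch before and after training the predictor. The bonus shrinks for states the predictor has already seen, which is what pushes the agent toward unfamiliar states when extrinsic rewards are sparse. When the bonus is used as a reward signal rather than as a loss, it would typically be detached from the computation graph first.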