We developed Deep Q-Networks (DQNs) with transfer learning to adapt knowledge from single-player to two-player game environments, resulting in improved training efficiency and performance across ten Atari 2600 games.
See our paper.
See the source code.
I developed a PyTorch version of OpenAI’s RND (Random Network Distillation, combined with Proximal Policy Optimization) and trained it on Montezuma’s Revenge to address the challenge of sparse rewards.
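
As a rough illustration of the idea behind RND (not the code from this project), the sketch below computes an exploration bonus as the prediction error between a fixed, randomly initialized target network and a trained predictor network. It is written in plain Python with linear "networks" to stay dependency-free; the class name `RNDBonus`, its parameters, and the toy observations are all assumptions made for this example, and the PPO side is omitted entirely.

```python
import random


def make_linear(n_in, n_out, rng):
    """Random weight matrix (n_out rows, n_in columns) for a bias-free linear map."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def forward(weights, obs):
    """Apply the linear map to one observation vector."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


class RNDBonus:
    """Hypothetical minimal RND module: fixed target, trainable predictor."""

    def __init__(self, obs_dim, feat_dim=4, lr=0.1, seed=0):
        rng = random.Random(seed)
        self.target = make_linear(obs_dim, feat_dim, rng)     # never updated
        self.predictor = make_linear(obs_dim, feat_dim, rng)  # trained to match target
        self.lr = lr

    def intrinsic_reward(self, obs):
        """Mean squared error between predictor and target features."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t)) / len(t)

    def update(self, obs):
        """One gradient-descent step on the predictor for this observation."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        for row, pi, ti in zip(self.predictor, p, t):
            grad_scale = 2.0 * (pi - ti) / len(t)  # d(MSE)/d(output_k)
            for j, x in enumerate(obs):
                row[j] -= self.lr * grad_scale * x


# Toy usage: a state seen many times earns a shrinking bonus,
# while an unseen state keeps a comparatively large one.
bonus = RNDBonus(obs_dim=3)
familiar, novel = [1.0, 0.0, 0.5], [0.0, 1.0, 0.0]
for _ in range(200):
    bonus.update(familiar)
print("familiar:", bonus.intrinsic_reward(familiar))
print("novel:   ", bonus.intrinsic_reward(novel))
```

In a sparse-reward game, a bonus of this kind would typically be added to the environment reward so the agent is still pushed toward states it has rarely visited, even when the game itself gives no score for long stretches.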