We designed and implemented an adaptive Monte Carlo tree search algorithm enhanced with temporal difference learning. The goal is to improve performance, without requiring pre-training, in a simplified version of the game XCOM.
See our paper.
See the source code.
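
For readers unfamiliar with the combination, the following is a minimal sketch of how a temporal difference update can replace the usual averaged backup in Monte Carlo tree search. It is not the algorithm from the paper: the toy counting game (`TARGET`, `ACTIONS`), the step size `alpha`, the discount `gamma`, and all function names are assumptions made for illustration, and the adaptive component is not modelled.

```python
import math
import random

# Hypothetical toy game (not XCOM): start from a counter, add 1 or 2 per move,
# and earn reward 1.0 only for landing exactly on TARGET.
TARGET = 10
ACTIONS = (1, 2)


class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}   # action -> Node
        self.visits = 0
        self.value = 0.0     # value estimate, updated by TD steps


def is_terminal(state):
    return state >= TARGET


def reward(state):
    return 1.0 if state == TARGET else 0.0


def rollout(state, rng):
    # Uniformly random playout until the game ends.
    while not is_terminal(state):
        state += rng.choice(ACTIONS)
    return reward(state)


def select_child(node, c=1.4):
    # Standard UCB1 selection over the TD value estimates.
    def ucb(child):
        if child.visits == 0:
            return math.inf
        return child.value + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children.values(), key=ucb)


def search(root_state, iterations=500, alpha=0.1, gamma=1.0, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not is_terminal(node.state) and len(node.children) == len(ACTIONS):
            node = select_child(node)
        # Expansion: add one untried action.
        if not is_terminal(node.state):
            untried = [a for a in ACTIONS if a not in node.children]
            action = rng.choice(untried)
            child = Node(node.state + action, parent=node)
            node.children[action] = child
            node = child
        # Evaluation: terminal reward, or a random rollout from the new leaf.
        if is_terminal(node.state):
            target = reward(node.state)
        else:
            target = rollout(node.state, rng)
        # TD backup: move each value a step of size alpha toward its target,
        # then bootstrap the parent's target from the child's updated estimate.
        while node is not None:
            node.visits += 1
            node.value += alpha * (target - node.value)
            target = gamma * node.value
            node = node.parent
    return root


def best_action(root):
    # Recommend the most-visited action at the root.
    return max(root.children, key=lambda a: root.children[a].visits)
```

Calling `best_action(search(0))` returns the recommended first move from the start of this toy game. Because each ancestor is updated toward the discounted estimate of its child rather than toward the raw rollout return, value information is learned during the search itself, so no pre-trained value function is needed.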