Intro

This post simply ports the code from Andrej Karpathy’s excellent post to PyTorch.

Pong game

(The left paddle is the game’s built-in AI; the right one is played by a simple 2-layer fully connected network.)

The main differences from Andrej’s code are:

  • It uses PyTorch autograd and some other handy functions
  • It is organized around specifying a loss function weighted by the advantage instead of manipulating gradients directly
  • It runs some of the computation on the GPU
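
To make the second point concrete, here is a minimal sketch of an advantage-weighted loss. It is not the notebook code: the layer sizes (6400 inputs, 200 hidden units), the single "move up" logit, and the names `policy` and `policy_loss` are assumptions made for illustration. The idea is that the loss is the negative log-probability of each action taken, scaled by its advantage, and autograd derives the gradients that would otherwise be computed by hand.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical 2-layer FC policy; the sizes are assumptions, not taken from the notebooks.
policy = nn.Sequential(
    nn.Linear(6400, 200),
    nn.ReLU(),
    nn.Linear(200, 1),  # one logit: log-odds of the "up" action
)

def policy_loss(logits, actions, advantages):
    """Sum over steps of advantage * (-log p(action taken))."""
    # With reduction="none" this returns -log p(action) for each step.
    nll = F.binary_cross_entropy_with_logits(logits, actions, reduction="none")
    return (nll * advantages).sum()

# Made-up batch of 5 steps, only to exercise the code.
obs = torch.randn(5, 6400)
actions = torch.tensor([1.0, 0.0, 1.0, 1.0, 0.0])      # 1 = up, 0 = down
advantages = torch.tensor([0.5, -1.0, 1.2, -0.3, 0.8])

logits = policy(obs).squeeze(1)
loss = policy_loss(logits, actions, advantages)
loss.backward()  # autograd fills .grad for every parameter
```

For a single step, the gradient of this loss with respect to the logit is `advantage * (sigmoid(logit) - action)`, which is the negative of the hand-computed signal `(action - p) * advantage` used when manipulating gradients directly, so minimizing the loss pushes in the same direction.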

The code is in Jupyter notebooks and is located here.

It takes a very long time to converge (many hours). Currently, its use of the GPU is very inefficient, since it first generates games on the CPU and then sends relatively small amounts of data to the GPU.
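
The CPU-then-GPU flow described above might look roughly like the sketch below. The helper `to_device_batch` and the shapes are hypothetical, not taken from the notebooks. It collects per-step data on the CPU during a game, then stacks it and moves it to the device once per batch; the code falls back to the CPU when no GPU is available.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def to_device_batch(frames, actions, rewards, device):
    """Stack per-step CPU data and move each array to `device` in one transfer."""
    obs = torch.stack(frames).to(device)  # (steps, 6400)
    acts = torch.tensor(actions, dtype=torch.float32, device=device)
    rews = torch.tensor(rewards, dtype=torch.float32, device=device)
    return obs, acts, rews

# Made-up 4-step game generated on the CPU.
frames = [torch.rand(6400) for _ in range(4)]
actions = [1, 0, 0, 1]
rewards = [0.0, 0.0, 0.0, -1.0]

obs, acts, rews = to_device_batch(frames, actions, rewards, device)
```

Each batch is only a few thousand floats per step, so the GPU spends most of its time waiting for the CPU to finish playing the games.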