Speed-running games using AI

Nischal Madiraju
Published in DataDrivenInvestor
2 min read · May 31, 2020


Many of us have come across AI speed-runs of our favourite arcade games. I always used to wonder how those systems worked, so I started looking into some existing open-source AI speed-runners. This is what I found:

When creating an AI to play games, you face two challenges. The first is to create the interface between your AI agent and the game. In other words, you need a way to capture screens (or the memory state) and send actions back to the game. You usually do not have to build this yourself: a toolkit such as the Gym project, maintained by OpenAI, wraps game emulators behind a common interface and takes care of this for you. You can find it at:

http://gym.openai.com/

Under the ‘Environments’ tab, you can find many game environments, such as ‘Atari’ (http://gym.openai.com/envs/#atari), which lists favourite childhood arcade games like Pac-Man, Pinball, Breakout, Boxing, Tennis, Double Dunk, Road Runner and so on.

You can use these environments with your AI agent to build your speed-run.
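To make the agent-game interface concrete, here is a minimal sketch of the loop such an environment expects. `ToyGame` is a hypothetical stand-in I made up so the example runs without installing anything; it only mimics the classic Gym interface, where `reset()` returns an observation and `step(action)` returns an observation, a reward, a done flag and an info dictionary. With Gym installed, you would create a real environment with `gym.make(...)` instead.

```python
import random


class ToyGame:
    """Hypothetical stand-in for a Gym environment (not part of Gym itself)."""

    n_actions = 2  # 0 = stand still, 1 = move right

    def __init__(self, goal=5, max_steps=50):
        self.goal = goal
        self.max_steps = max_steps

    def reset(self):
        """Start a new episode and return the first observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply one action; return (observation, reward, done, info)."""
        self.steps += 1
        self.position += action
        reached = self.position >= self.goal
        reward = 1.0 if reached else 0.0
        done = reached or self.steps >= self.max_steps
        return self.position, reward, done, {}


def run_episode(env, policy):
    """Play one episode: observe, act, collect the reward, repeat."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward, env.steps


# An agent that presses buttons at random, as a baseline.
rng = random.Random(0)
total, steps = run_episode(ToyGame(), lambda obs: rng.randrange(ToyGame.n_actions))
print("reward:", total, "steps:", steps)
```

The agent here is just a function from observation to action, so a learned policy can be swapped in for the random one without touching the loop.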

The second challenge is choosing a learning paradigm that lets your model learn how to play the game. The most commonly used one is reinforcement learning.

What is reinforcement learning?

Reinforcement learning is an artificial intelligence paradigm in which an intelligent agent learns to execute tasks by trial and error while interacting with an environment. Instead of learning from a labelled dataset, as in supervised learning, the agent observes the state of the environment, performs an action based on that information, and then receives a reinforcement signal (a reward) that indicates how good the action was.
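The observe-act-reward loop can be sketched with tabular Q-learning, one of the simplest reinforcement learning algorithms. Everything in this example is an assumption made for illustration: the "game" is a made-up corridor of six positions where the agent starts at position 0 and is rewarded only for reaching position 5, and the hyperparameters are arbitrary. Real game-playing agents typically replace the table with a neural network, but the update rule follows the same idea.

```python
import random

N_STATES = 6      # corridor positions 0..5; position 5 is the goal
ACTIONS = [0, 1]  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore at random sometimes (and whenever the values are tied);
            # otherwise exploit the action that currently looks best.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Move the estimate towards reward + discounted best future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


q_table = train()
# Best action in every non-goal position after training.
greedy = [0 if row[0] > row[1] else 1 for row in q_table[:-1]]
print(greedy)
```

Nothing tells the agent that "right" is the correct button. It only sees rewards, and the learned values end up favouring the moves that lead to the goal.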

For example, in this video, SethBling explains how the MarI/O neural network learns to play ‘Super Mario World’ by trial and error. Strictly speaking, MarI/O evolves its networks with NEAT, an evolutionary method, rather than with a classic reinforcement learning algorithm, but the feedback loop is similar. He explains that in the beginning the neural net would not press any buttons at all. Whenever Mario stopped moving, the algorithm switched to a different candidate network that performed different actions. Candidates that performed the right kind of actions and got Mario further received a higher score, the reinforcement signal or reward, which told the algorithm which behaviour was worth keeping.
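The try, score and keep-what-works loop described above can be sketched as a small evolutionary search. This is not MarI/O's code: MarI/O evolves neural networks, whereas this sketch evolves plain button sequences to stay short. The level layout, the `PITS` positions, the scoring and all parameters are hypothetical. As in the video, every candidate starts out pressing nothing.

```python
import random

LEVEL_LENGTH = 20
PITS = {5, 11, 16}  # hypothetical pit positions; landing on one ends the run
MOVES = {"none": 0, "right": 1, "jump": 2}
BUTTONS = list(MOVES)


def fitness(buttons):
    """Score a button sequence by how far along the level it gets."""
    position = 0
    for button in buttons:
        position += MOVES[button]
        if position in PITS:
            break  # fell into a pit: the run ends here
        if position >= LEVEL_LENGTH:
            return LEVEL_LENGTH  # reached the end of the level
    return position


def evolve(generations=200, population_size=20, genome_length=30, seed=0):
    """Keep the best candidates each generation and mutate copies of them."""
    rng = random.Random(seed)
    # Every candidate starts by pressing no buttons at all.
    population = [["none"] * genome_length for _ in range(population_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: population_size // 4]
        children = []
        while len(survivors) + len(children) < population_size:
            child = list(rng.choice(survivors))
            # Mutation: replace one button press with a random one.
            child[rng.randrange(genome_length)] = rng.choice(BUTTONS)
            children.append(child)
        population = survivors + children
    return max(population, key=fitness), history


best, history = evolve()
print("best distance per generation:", history[0], "->", fitness(best))
```

Because the best candidates survive unchanged into the next generation, the best score never goes down. Progress comes entirely from random changes that happen to get further.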


Writes about Artificial Intelligence, Machine Learning and Deep Learning. Pursuing an MSc in Artificial Intelligence at the University of Groningen.