Here’s a great explanation of reinforcement learning and AlphaGo Zero, and of how reinforcement learning compares to other forms of machine learning.

For example, to learn to play (the action) the game of Go (the environment), AlphaGo first learned to mimic human Go players from a large data set of historical games (apprentice learning). It then improved its play through trial and error (reinforcement learning) by playing large numbers of Go games against independent instances of itself.
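
To make the two stages concrete, here is a toy sketch in Python. It is not AlphaGo's method: AlphaGo used deep neural networks and tree search on Go, while this sketch assumes a tiny stand-in game (players alternately remove 1 to 3 stones from a pile, and whoever takes the last stone wins) and a plain lookup table of move scores. The demonstration data, the function names, and the parameters `alpha` and `epsilon` are all hypothetical, chosen only for illustration.

```python
import random
from collections import defaultdict

ACTIONS = (1, 2, 3)  # stones a player may remove per turn (toy game, not Go)


def legal_moves(stones):
    """Moves that do not remove more stones than are left."""
    return [a for a in ACTIONS if a <= stones]


def imitate(demonstrations):
    """Stage 1 (imitation): score each (pile, move) by how often demonstrators chose it."""
    counts = defaultdict(int)
    totals = defaultdict(int)
    for stones, move in demonstrations:
        counts[(stones, move)] += 1
        totals[stones] += 1
    # Unseen (pile, move) pairs default to a score of 0.0.
    return defaultdict(float, {k: c / totals[k[0]] for k, c in counts.items()})


def choose(q, stones, epsilon, rng):
    """Epsilon-greedy: usually the best-scored move, sometimes a random one."""
    moves = legal_moves(stones)
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=lambda a: q[(stones, a)])


def self_play(q, episodes=5000, start=10, alpha=0.1, epsilon=0.2, seed=0):
    """Stage 2 (trial and error): both sides share q and learn from game outcomes."""
    rng = random.Random(seed)
    for _ in range(episodes):
        stones, history = start, []
        while stones > 0:
            move = choose(q, stones, epsilon, rng)
            history.append((stones, move))
            stones -= move
        # Whoever made the last move won. Walk backwards through the game,
        # flipping the reward at each step because the players alternate.
        reward = 1.0
        for state_action in reversed(history):
            q[state_action] += alpha * (reward - q[state_action])
            reward = -reward
    return q


if __name__ == "__main__":
    # Hypothetical "historical games": (pile size, move chosen) pairs.
    demos = [(10, 1), (10, 1), (10, 3), (5, 1), (3, 3)]
    q = self_play(imitate(demos))
    for pile in range(1, 11):
        print(pile, choose(q, pile, 0.0, random.Random(0)))
```

Using the imitation scores as the starting point for self-play is a simplification; it only mirrors the order of the two stages described above, where mimicking recorded games comes first and trial-and-error improvement follows.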
