The great AI Wizard Siraj Raval explains Move 37, reinforcement learning, and the future of human work in this video.
Code Bullet has a video on how an AI learned to play a hill racing game. Some foul language, Linkin Park soundbites, and random shenanigans. Entertaining.
In case you didn’t know, I write a monthly column for MSDN Magazine on AI called “Artificially Intelligent.”
In the last two articles, I covered one of the most exciting topics in AI these days: reinforcement learning.
Here’s a snippet and link to the full articles on MSDN.
In previous articles, I’ve mentioned both supervised learning and unsupervised learning algorithms. Beyond these two methods of machine learning lies another type: Reinforcement Learning (RL). Formally defined, RL is a computational approach to goal-oriented learning through interaction with the environment under ideal learning conditions.
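The interaction loop at the heart of that definition can be sketched in a few lines of Python. Everything below is a minimal stand-in I wrote for illustration, not any particular library’s API: the environment dynamics and the agent’s policy are hypothetical stubs.

```python
# The canonical RL interaction loop: at each timestep the agent observes a
# state, chooses an action, and the environment answers with a reward and
# the next state. Both functions below are illustrative stubs.

def environment_step(state, action):
    # Stub dynamics: the "environment" adds the action to the state
    # and rewards the agent only for landing exactly on 10.
    next_state = state + action
    reward = 1.0 if next_state == 10 else 0.0
    return next_state, reward

def choose_action(state):
    # Stub policy: always move up by 1. A real agent would learn this
    # from the rewards it receives.
    return 1

state, total_reward = 0, 0.0
for t in range(10):
    action = choose_action(state)
    state, reward = environment_step(state, action)
    total_reward += reward

print(total_reward)  # the agent lands on state 10 at the final step -> 1.0
```

Every RL algorithm, from Q-Tables to AlphaGo Zero, is ultimately some way of improving `choose_action` using the stream of rewards this loop produces.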
Like other aspects of AI, many of the algorithms and approaches actively used today trace their origins back to the 1980s (bit.ly/2NZP177). With the advent of inexpensive storage and on-demand compute power, reinforcement learning techniques have re-emerged.
In last month’s column, I explored a few basic concepts of reinforcement learning (RL), first trying a strictly random approach to navigating a simple environment and then implementing a Q-Table to remember both past actions and which actions led to which rewards. In the demo, an agent acting randomly reached the goal state approximately 1 percent of the time, compared with roughly half the time when it used a Q-Table to remember previous actions. However, that experiment only scratched the surface of the promising and expanding field of RL.
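The Q-Table idea described above can be sketched in a few dozen lines. This is not the demo’s actual code, just a minimal tabular Q-learning example on a toy corridor environment I made up for illustration; the state count and hyperparameters are assumptions.

```python
import random

random.seed(0)

# Toy environment: a 1-D corridor of 6 states; the agent starts at 0
# and earns a reward of 1 only when it reaches the goal state 5.
N_STATES, GOAL = 6, 5
ACTIONS = [-1, +1]  # move left, move right

def step(state, action):
    next_state = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

# The Q-Table: one row per state, one column per action, all zeros to start.
q_table = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit remembered values, sometimes
        # explore; break ties among equal values at random.
        if random.random() < epsilon:
            a = random.randrange(len(ACTIONS))
        else:
            best = max(q_table[state])
            a = random.choice(
                [i for i, v in enumerate(q_table[state]) if v == best])
        next_state, reward, done = step(state, ACTIONS[a])
        # Q-learning update: nudge the stored value toward the reward plus
        # the discounted best value of the next state.
        best_next = max(q_table[next_state])
        q_table[state][a] += alpha * (
            reward + gamma * best_next - q_table[state][a])
        state = next_state

# After training, the greedy policy reads the table and always moves
# right toward the goal from every non-goal state.
```

The table is the agent’s entire memory: each cell answers “how good did taking this action from this state turn out to be?”, which is exactly what lets a trained agent beat a purely random one.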
In my session from Azure Data Fest Reston 2018, I explore reinforcement learning. As an added bonus, Andy (who’s holding the camera) chimes in now and then.
Press the play button below to listen here or visit the show page at DataDriven.tv
In this video, a follow-up to his intro video on reinforcement learning, Arxiv dives into three advanced papers that address the sparse reward problem in deep reinforcement learning and pose interesting research directions for mastering unsupervised learning in autonomous agents.
By leveraging powerful prior knowledge about how the world works, humans can quickly figure out efficient strategies in new and unseen environments.
Currently, even state-of-the-art reinforcement learning algorithms typically don’t have strong priors, and this is one of the fundamental challenges in current research on transfer learning.
If you’re a reader of this blog, then you know that I have mentioned AlphaGo Zero before on a few occasions. However, I think this video by the incomparable Siraj Raval explains it best. Watch it to get a technical overview of its neural components.
In case you didn’t already know, DeepMind’s AlphaGo Zero algorithm beat the best Go player in the world by training entirely by self-play. It played against itself repeatedly, getting better over time with no human gameplay input. AlphaGo Zero was a remarkable moment in AI history, a moment that will always be remembered.
The folks over at OpenAI have done some very interesting work training robots to improve their dexterity via reinforcement learning.
Their robot, Dactyl, has learned to manipulate objects with unprecedented levels of dexterity.
If my post yesterday about DeepMind’s AlphaZero piqued your interest but answered too few questions, then check out this video from the 2017 NIPS conference where Dr. David Silver delivers the keynote.
Dr. David Silver leads the reinforcement learning research group at DeepMind and is lead researcher on AlphaGo.
First, AlphaGo beat the best human player in the world by studying thousands of human vs. human games.
Then AlphaGo Zero came along and taught itself to be even better without any human-generated data. By the way, it beat AlphaGo.
This is the power of Reinforcement Learning.