
Deep Q-Network Solves Cart and Pole – Reinforcement Learning Code Project
- Frank
- March 28, 2022
In this episode, learn how to use a deep Q-network to solve the Cart and Pole environment.
AlphaFold: The making of a scientific breakthrough
Here is the inside story of the DeepMind team of scientists and engineers who created AlphaFold, an AI system recognized as a solution to the protein folding problem, a grand scientific challenge for more than 50 years.
DeepMind’s New AI MuZero Mastered More Than 50 Games
Two Minute Papers explores the paper “Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model.”
DeepMind’s AlphaStar: A Grandmaster Level StarCraft 2 AI
Two Minute Papers explores the paper “AlphaStar: Grandmaster level in StarCraft II using multi-agent reinforcement learning” in this video.
Reinforcement learning explained
Here’s a great explanation of reinforcement learning, AlphaGo Zero, and how reinforcement learning compares to other forms of machine learning. For example, to learn to play (the action) the game of Go (the environment), AlphaGo first learned to mimic human Go players from a large data set of historical games (apprenticeship learning). It then […]
How DeepMind’s AlphaZero Can Master Games Without Human Knowledge
If my post yesterday about DeepMind’s AlphaZero piqued your interest but answered too few questions, then check out this video from the 2017 NIPS conference, where Dr. David Silver delivers the keynote. Silver leads the reinforcement learning research group at DeepMind and is the lead researcher on AlphaGo.
How AlphaGo Zero Taught Itself to Be a Go Master
First, AlphaGo beat the best human player in the world by studying thousands of human vs. human games. Then AlphaGo Zero came along and taught itself to be even better without any human-generated data. By the way, it beat AlphaGo. This is the power of reinforcement learning.