AlphaZero information

AlphaZero is a computer program developed by the artificial intelligence research company DeepMind to master the games of chess, shogi and go. The algorithm uses an approach similar to that of AlphaGo Zero.

On December 5, 2017, the DeepMind team released a preprint introducing AlphaZero, which within 24 hours of training achieved a superhuman level of play in all three games by defeating the world-champion programs Stockfish (chess) and Elmo (shogi) and the three-day version of AlphaGo Zero (go). In each case it made use of custom tensor processing units (TPUs) that the Google programs were optimized to use.^[1] AlphaZero was trained solely via self-play, with no access to opening books or endgame tables, using 5,000 first-generation TPUs to generate the games and 64 second-generation TPUs to train the neural networks, all in parallel. After four hours of training, DeepMind estimated that AlphaZero was playing chess at a higher Elo rating than Stockfish 8; after nine hours of training, the algorithm defeated Stockfish 8 in a time-controlled 100-game tournament (28 wins, 0 losses, and 72 draws).^[1]^[2]^[3] For play, the trained algorithm ran on a single machine with four TPUs.
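
To give a rough sense of scale for the 100-game result quoted above, the match score can be converted into an approximate Elo gap. The sketch below assumes the standard logistic Elo model, in which an expected score s corresponds to a rating difference of 400·log10(s / (1 − s)). The function name `elo_difference` is hypothetical, and the resulting figure is an illustration derived only from the win/draw/loss counts, not a number reported by DeepMind.

```python
import math


def elo_difference(wins: int, draws: int, losses: int) -> float:
    """Approximate Elo gap implied by a match result (logistic Elo model).

    Assumption: a draw counts as half a point and the observed score
    fraction is treated as the expected score.
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("at least one game is required")
    score = (wins + 0.5 * draws) / games
    if not 0.0 < score < 1.0:
        # A 0% or 100% score implies an unbounded rating gap.
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # Counts taken from the tournament described above: 28 wins, 72 draws, 0 losses.
    gap = elo_difference(wins=28, draws=72, losses=0)
    print(f"implied Elo gap: about {gap:.0f} points")
```

With 28 wins and 72 draws the score fraction is 0.64, which this model maps to a gap of roughly 100 Elo points over the opponent in that particular match; a single 100-game sample carries considerable uncertainty.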

DeepMind's paper on AlphaZero was published in the journal Science on December 7, 2018;^[4] however, the AlphaZero program itself has not been made available to the public.^[5] In 2019, DeepMind published a paper detailing MuZero, a new algorithm that generalises AlphaZero's approach, playing both Atari and board games without knowledge of the rules or representations of the game.^[6]

[1] Silver, David; Hubert, Thomas; Schrittwieser, Julian; Antonoglou, Ioannis; Lai, Matthew; Guez, Arthur; Lanctot, Marc; Sifre, Laurent; Kumaran, Dharshan; Graepel, Thore; Lillicrap, Timothy; Simonyan, Karen; Hassabis, Demis (December 5, 2017). "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm". arXiv:1712.01815 [cs.AI].
[2] Knapton, Sarah; Watson, Leon (December 6, 2017). "Entire human chess knowledge learned and surpassed by DeepMind's AlphaZero in four hours". Telegraph.co.uk. Retrieved December 6, 2017.
[3] Vincent, James (December 6, 2017). "DeepMind's AI became a superhuman chess player in a few hours, just for fun". The Verge. Retrieved December 6, 2017.
[4] Silver, David; Hubert, Thomas; Schrittwieser, Julian; Antonoglou, Ioannis; Lai, Matthew; Guez, Arthur; Lanctot, Marc; Sifre, Laurent; Kumaran, Dharshan; Graepel, Thore; Lillicrap, Timothy; Simonyan, Karen; Hassabis, Demis (December 7, 2018). "A general reinforcement learning algorithm that masters chess, shogi, and go through self-play". Science. 362 (6419): 1140–1144. Bibcode:2018Sci...362.1140S. doi:10.1126/science.aar6404. PMID 30523106.
[5] "Chess Terms: AlphaZero". Chess.com. Retrieved July 30, 2022.
[6] Schrittwieser, Julian; Antonoglou, Ioannis; Hubert, Thomas; Simonyan, Karen; Sifre, Laurent; Schmitt, Simon; Guez, Arthur; Lockhart, Edward; Hassabis, Demis; Graepel, Thore; Lillicrap, Timothy (2020). "Mastering Atari, Go, chess and shogi by planning with a learned model". Nature. 588 (7839): 604–609. arXiv:1911.08265. Bibcode:2020Natur.588..604S. doi:10.1038/s41586-020-03051-4. PMID 33361790. S2CID 208158225.
