AlphaZero
AlphaZero is a computer program developed by artificial intelligence research company DeepMind to master the games of chess, shogi and go. This algorithm uses an approach similar to AlphaGo Zero. On December 5, 2017, the DeepMind team released a preprint introducing AlphaZero, which within 24 hours of training achieved a superhuman level of play in these three games by defeating world-champion programs Stockfish, elmo, and the three-day version of AlphaGo Zero. In each case it made use of custom tensor processing units (TPUs) that the Google programs were optimized to use. AlphaZero was trained solely via self-play using 5,000 first-generation TPUs to generate the games and 64 second-generation TPUs to train the neural networks, all in parallel, with no access to opening books or endgame tables. After four hours of training, DeepMind estimated AlphaZero was playing chess at a higher Elo rating than Stockfish 8; after nine hours of training, the algorithm defeated ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Stockfish (chess)
Stockfish is a free and open-source chess engine, available for various desktop and mobile platforms. It can be used in chess software through the Universal Chess Interface. Stockfish has consistently ranked first or near the top of most chess-engine rating lists and, as of October 2022, is the strongest CPU chess engine in the world. It has won the Top Chess Engine Championship 13 times and the Chess.com Computer Chess Championship 19 times. Stockfish is developed by Marco Costalba, Joona Kiiski, Gary Linscott, Tord Romstad, Stéphane Nicolet, Stefan Geschwentner, and Joost VandeVondele, with many contributions from a community of open-source developers. It is derived from Glaurung, an open-source engine by Tord Romstad released in 2004. Features Stockfish can use up to 1024 CPU threads in multiprocessor systems. The maximal size of its transposition table is 32 TB. Stockfish implements an advanced alpha–beta search and uses bitboards. Compared to other engines, it is ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
AlphaGo Zero
AlphaGo Zero is a version of DeepMind's Go software AlphaGo. AlphaGo's team published an article in the journal ''Nature'' on 19 October 2017, introducing AlphaGo Zero, a version created without using data from human games, and stronger than any previous version. By playing games against itself, AlphaGo Zero surpassed the strength of AlphaGo Lee in three days by winning 100 games to 0, reached the level of AlphaGo Master in 21 days, and exceeded all the old versions in 40 days. Training artificial intelligence (AI) without datasets derived from human experts has significant implications for the development of AI with superhuman skills because expert data is "often expensive, unreliable or simply unavailable." Demis Hassabis, the co-founder and CEO of DeepMind, said that AlphaGo Zero was so powerful because it was "no longer constrained by the limits of human knowledge". Furthermore, AlphaGo Zero performed better than standard reinforcement deep learning models (such as DQN implem ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
MuZero
MuZero is a computer program developed by artificial intelligence research company DeepMind to master games without knowing their rules. Its release in 2019 included benchmarks of its performance in go, chess, shogi, and a standard suite of Atari games. The algorithm uses an approach similar to AlphaZero. It matched AlphaZero's performance in chess and shogi, improved on its performance in Go (setting a new world record), and improved on the state of the art in mastering a suite of 57 Atari games (the Arcade Learning Environment), a visually-complex domain. MuZero was trained via self-play, with no access to rules, opening books, or endgame tablebases. The trained algorithm used the same convolutional and residual algorithms as AlphaZero, but with 20% fewer computation steps per node in the search tree. History On November 19, 2019, the DeepMind team released a preprint introducing MuZero. Derivation from AlphaZero MuZero (MZ) is a combination of the high-performance p ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
DeepMind
DeepMind Technologies is a British artificial intelligence subsidiary of Alphabet Inc. and research laboratory founded in 2010. DeepMind was acquired by Google in 2014 and became a wholly owned subsidiary of Alphabet Inc, after Google's restructuring in 2015. The company is based in London, with research centres in Canada, France, and the United States. DeepMind has created a neural network that learns how to play video games in a fashion similar to that of humans, as well as a Neural Turing machine, or a neural network that may be able to access an external memory like a conventional Turing machine, resulting in a computer that mimics the short-term memory of the human brain. DeepMind made headlines in 2016 after its AlphaGo program beat a human professional Go player Lee Sedol, a world champion, in a five-game match, which was the subject of a documentary film. A more general program, AlphaZero, beat the most powerful programs playing go, chess and shogi (Japanese chess) ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Monte Carlo Tree Search
In computer science, Monte Carlo tree search (MCTS) is a heuristic search algorithm for some kinds of decision processes, most notably those employed in software that plays board games. In that context MCTS is used to solve the game tree. MCTS was combined with neural networks in 2016 and has been used in multiple board games like Chess, Shogi, Checkers, Backgammon, Contract Bridge, Computer Go, Scrabble, and Clobber as well as in turn-based-strategy video games (such as Total War: Rome II's implementation in the high level campaign AI). History Monte Carlo method The Monte Carlo method, which uses random sampling for deterministic problems which are difficult or impossible to solve using other approaches, dates back to the 1940s. In his 1987 PhD thesis, Bruce Abramson combined minimax search with an ''expected-outcome model'' based on random game playouts to the end, instead of the usual static evaluation function. Abramson said the expected-outcome model "is shown t ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Chess
Chess is a board game for two players, called White and Black, each controlling an army of chess pieces in their color, with the objective to checkmate the opponent's king. It is sometimes called international chess or Western chess to distinguish it from related games, such as xiangqi (Chinese chess) and shogi (Japanese chess). The recorded history of chess goes back at least to the emergence of a similar game, chaturanga, in seventh-century India. The rules of chess as we know them today emerged in Europe at the end of the 15th century, with standardization and universal acceptance by the end of the 19th century. Today, chess is one of the world's most popular games, played by millions of people worldwide. Chess is an abstract strategy game that involves no hidden information and no use of dice or cards. It is played on a chessboard with 64 squares arranged in an eight-by-eight grid. At the start, each player controls sixteen pieces: one king, one queen, two rooks, ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Tensor Processing Unit
Tensor Processing Unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google for neural network machine learning, using Google's own TensorFlow software. Google began using TPUs internally in 2015, and in 2018 made them available for third party use, both as part of its cloud infrastructure and by offering a smaller version of the chip for sale. Overview The tensor processing unit was announced in May 2016 at Google I/O, when the company said that the TPU had already been used inside their data centers for over a year. The chip has been specifically designed for Google's TensorFlow framework, a symbolic math library which is used for machine learning applications such as neural networks."TensorFlow: Open source machine learning" "It is machine learning software being use ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Elmo (shogi Engine)
Elmo, stylized as elmo (the name is a blend of ''elastic'' and ''monkey''), is a computer shogi evaluation function and book file ('' joseki'') created by Makoto Takizawa (). It is designed to be used with a third-party shogi alpha–beta search engine. Combined with the ''yaneura ou'' () search, Elmo became the champion of the 27th annual World Computer Shogi Championship () in May 2017. However, in the Den Ō tournament () in November 2017, Elmo was not able to make it to the top five engines losing to (1st), shotgun (2nd), ponanza (3rd), (4th), and Qhapaq_conflated (5th). In October 2017, DeepMind claimed that its program AlphaZero, after two hours of massively parallel training (700,000 steps or 10,300,000 games), began to exceed Elmo's performance. With a full nine hours of training (24 million games), AlphaZero defeated Elmo in a 100-game match, winning 90, losing 8, and drawing two. Elmo is free software that may be run on shogi engine interface GUIs such as Shogidoko ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Tensor Processing Unit
Tensor Processing Unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google for neural network machine learning, using Google's own TensorFlow software. Google began using TPUs internally in 2015, and in 2018 made them available for third party use, both as part of its cloud infrastructure and by offering a smaller version of the chip for sale. Overview The tensor processing unit was announced in May 2016 at Google I/O, when the company said that the TPU had already been used inside their data centers for over a year. The chip has been specifically designed for Google's TensorFlow framework, a symbolic math library which is used for machine learning applications such as neural networks."TensorFlow: Open source machine learning" "It is machine learning software being use ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Shogi
, also known as Japanese chess, is a strategy board game for two players. It is one of the most popular board games in Japan and is in the same family of games as Western chess, '' chaturanga, Xiangqi'', Indian chess, and '' janggi''. ''Shōgi'' means general's (''shō'' ) board game (''gi'' ). Western chess is sometimes called (''Seiyō Shōgi'' ) in Japan. Shogi was the earliest chess-related historical game to allow captured pieces to be returned to the board by the capturing player. This drop rule is speculated to have been invented in the 15th century and possibly connected to the practice of 15th century mercenaries switching loyalties when captured instead of being killed. The earliest predecessor of the game, chaturanga, originated in India in the sixth century, and the game was likely transmitted to Japan via China or Korea sometime after the Nara period."Shogi". ''Encyclopædia Britannica''. 2002. Shogi in its present form was played as early as the 16th century, ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Demis Hassabis
Demis Hassabis (born 27 July 1976) is a British artificial intelligence researcher and entrepreneur. In his early career he was a video game AI programmer and designer, and an expert player of board games. He is the chief executive officer and co-founder of DeepMind and Isomorphic Labs, and a UK Government AI Advisor. Early life and education Hassabis was born to a Greek Cypriot father and a Chinese Singaporean mother and grew up in North London. A child prodigy in chess from the age of 4, Hassabis reached master standard at the age of 13 with an Elo rating of 2300 and captained many of the England junior chess teams. He represented the University of Cambridge in the Oxford-Cambridge varsity chess matches of 1995, 1996 and 1997, winning a half blue. Hassabis was briefly home-schooled by his parents, during which time he bought his first computer, a ZX Spectrum 48K funded from chess winnings, and taught himself how to program from books. He went on to be educated at Christ' ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Self-play (reinforcement Learning Technique)
Self-play is a technique for improving the performance of reinforcement learning agents. Intuitively, agents learn to improve their performance by playing "against themselves". Definition and motivation In multi-agent reinforcement learning experiments, researchers try to optimize the performance of a learning agent on a given task, in cooperation or competition with one or more agents. These agents learn by trial-and-error, and researchers may choose to have the learning algorithm play the role of two or more of the different agents. When successfully executed, this technique has a double advantage: # It provides a straightforward way to determine the actions of the other agents, resulting in a meaningful challenge. # It increases the amount of experience that can be used to improve the policy, by a factor of two or more, since the viewpoints of each of the different agents can be used for learning. Usage Self-play is used by the AlphaZero program to improve its perform ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |