Alpha Zero
AlphaZero is a computer program developed by artificial intelligence research company DeepMind to master the games of chess, shogi and go. This algorithm uses an approach similar to AlphaGo Zero. On December 5, 2017, the DeepMind team released a preprint introducing AlphaZero, which within 24 hours of training achieved a superhuman level of play in these three games by defeating world-champion programs Stockfish, elmo, and the three-day version of AlphaGo Zero. In each case it made use of custom tensor processing units (TPUs) that the Google programs were optimized to use. AlphaZero was trained solely via self-play using 5,000 first-generation TPUs to generate the games and 64 second-generation TPUs to train the neural networks, all in parallel, with no access to opening books or endgame tables. After four hours of training, DeepMind estimated AlphaZero was playing chess at a higher Elo rating than Stockfish 8; after nine hours of training, the algorithm defeated Stockfi ...


Monte Carlo Tree Search
In computer science, Monte Carlo tree search (MCTS) is a heuristic search algorithm for some kinds of decision processes, most notably those employed in software that plays board games. In that context MCTS is used to solve the game tree. MCTS was combined with neural networks in 2016 and has been used in multiple board games such as chess, shogi, checkers, backgammon, contract bridge, Go, Scrabble, and Clobber, as well as in turn-based strategy video games (such as the high-level campaign AI in Total War: Rome II).
History
Monte Carlo method
The Monte Carlo method, which uses random sampling for deterministic problems that are difficult or impossible to solve using other approaches, dates back to the 1940s. In his 1987 PhD thesis, Bruce Abramson combined minimax search with an expected-outcome model based on random game playouts to the end, instead of the usual static evaluation function. Abramson said the expected-outcome model "is shown to b ...
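The idea of scoring a position by random playouts to the end of the game can be sketched in a few lines. This is only the playout-based evaluation step, not a full MCTS with tree growth and selection. The game used here (a single-pile Nim variant where players remove 1 to 3 stones and whoever takes the last stone wins) and the function names are assumptions chosen for illustration.

```python
import random


def random_playout(stones, to_move, rng):
    """Play uniformly random legal moves until the pile is empty.

    Returns the index (0 or 1) of the player who takes the last stone.
    Assumes stones >= 1.
    """
    while True:
        take = rng.randint(1, min(3, stones))  # legal moves: remove 1..3 stones
        stones -= take
        if stones == 0:
            return to_move
        to_move = 1 - to_move


def expected_outcome(stones, to_move, playouts=1000, seed=0):
    """Estimate the win rate of `to_move` by averaging random playouts."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    wins = sum(
        random_playout(stones, to_move, rng) == to_move
        for _ in range(playouts)
    )
    return wins / playouts
```

With one stone left, the player to move always wins, so `expected_outcome(1, 0)` is exactly 1.0. For larger piles the value is a noisy estimate that takes the place of a hand-written static evaluation function.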


Computer Program
A computer program is a sequence or set of instructions in a programming language for a computer to execute. Computer programs are one component of software, which also includes documentation and other intangible components. A computer program in its human-readable form is called source code. Source code needs another computer program to execute because computers can only execute their native machine instructions. Therefore, source code may be translated to machine instructions using the language's compiler. (Assembly language programs are translated using an assembler.) The resulting file is called an executable. Alternatively, source code may execute within the language's interpreter. If the executable is requested for execution, then the operating system loads it into memory and starts a process. The central processing unit will soon switch to this process so it can fetch, decode, and then execute each machine instruction. If the source code is requested for execution, ...


Elo Rating
The Elo rating system is a method for calculating the relative skill levels of players in zero-sum games such as chess. It is named after its creator Arpad Elo, a Hungarian-American physics professor. The Elo system was invented as an improved chess-rating system over the previously used Harkness system, but is also used as a rating system in association football, American football, baseball, basketball, pool, table tennis, and various board games and esports. The difference in the ratings between two players serves as a predictor of the outcome of a match. Two players with equal ratings who play against each other are expected to score an equal number of wins. A player whose rating is 100 points greater than their opponent's is expected to score 64%; if the difference is 200 points, then the expected score for the stronger player is 76%. A player's Elo rating is represented by a number which may change depending on the outcome of rated games played. After every game, the winni ...
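The 64% and 76% figures follow from the logistic expected-score formula commonly used with Elo ratings, with a scale of 400 points. A minimal sketch is given below. The function names and the K-factor of 20 in the update step are assumptions for illustration, since rating bodies use different K values.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score of a player rated `rating_diff` points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def updated_rating(rating: float, expected: float, actual: float, k: float = 20.0) -> float:
    """New rating after one game; `actual` is 1 for a win, 0.5 for a draw, 0 for a loss.

    The K-factor `k` is an illustrative assumption.
    """
    return rating + k * (actual - expected)


print(round(expected_score(100), 2))  # 0.64
print(round(expected_score(200), 2))  # 0.76
```

Equal ratings give an expected score of 0.5. A player who scores above expectation gains points and one who scores below it loses points.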


Demis Hassabis
Demis Hassabis (born 27 July 1976) is a British artificial intelligence researcher and entrepreneur. In his early career he was a video game AI programmer and designer, and an expert player of board games. He is the chief executive officer and co-founder of DeepMind and Isomorphic Labs, and a UK Government AI Advisor.
Early life and education
Hassabis was born to a Greek Cypriot father and a Chinese Singaporean mother and grew up in North London. A child prodigy in chess from the age of 4, Hassabis reached master standard at the age of 13 with an Elo rating of 2300 and captained many of the England junior chess teams. He represented the University of Cambridge in the Oxford-Cambridge varsity chess matches of 1995, 1996 and 1997, winning a half blue. Hassabis was briefly home-schooled by his parents, during which time he bought his first computer, a ZX Spectrum 48K funded from chess winnings, and taught himself how to program from books. He went on to be educated at Christ ...


Reinforcement Learning
Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected. Instead the focus is on finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). The environment is typically stated in the form of a Markov decision process (MDP), because many reinforcement learning algorithms for this context use dynamic programming techniques. The main difference between the classical dynamic programming methods and reinforcement learning algorithms is that the latter do not assume knowledge of an exact mathematica ...
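One simple way to balance exploration against exploitation is an epsilon-greedy rule: with probability epsilon pick a random action, otherwise pick the action with the highest current value estimate. The text above does not name this rule, so it is shown here only as one common choice, and the function and parameter names are assumptions.

```python
import random


def epsilon_greedy(action_values, epsilon, rng):
    """Return the index of the chosen action.

    With probability `epsilon` explore (uniform random action);
    otherwise exploit (action with the highest estimated value).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))  # explore
    # exploit: index of the largest value estimate
    return max(range(len(action_values)), key=action_values.__getitem__)


rng = random.Random(0)
estimates = [0.1, 0.7, 0.3]
choice = epsilon_greedy(estimates, epsilon=0.1, rng=rng)
```

Setting `epsilon=0` gives purely greedy behaviour, while `epsilon=1` ignores the value estimates entirely.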


Tensor Processing Unit
Tensor Processing Unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google for neural network machine learning, using Google's own TensorFlow software. Google began using TPUs internally in 2015, and in 2018 made them available for third-party use, both as part of its cloud infrastructure and by offering a smaller version of the chip for sale.
Overview
The tensor processing unit was announced in May 2016 at Google I/O, when the company said that the TPU had already been used inside their data centers for over a year. The chip has been specifically designed for Google's TensorFlow framework, a symbolic math library which is used for machine learning applications such as neural networks.


Tord Romstad
Tord is a given name derived from the element thor, meaning thunder or thunder god, and a second element meaning peace, beautiful, fair. The name developed as a short form of Thorfrid (Old Norse). Notable people with the name include:
* Tord Andersson (born 1942), Swedish diver
* Tord Bernheim (1914–1992), Swedish film actor
* Tord Bonde (c. 1350s–1417), medieval Swedish magnate
* Tord Boontje (born 1968), Dutch industrial product designer
* Tord Filipsson (born 1950), Swedish former cyclist
* Tord Ganelius (1925–2016), Swedish mathematician
* Tord Asle Gjerdalen (born 1983), Norwegian cross-country skier
* Tord Godal (1909–2002), Norwegian theologian and bishop
* Tord Grip (born 1938), Swedish football coach and manager
* Tord Gustavsen (born 1970), Norwegian jazz pianist and composer
* Tord Hagen (1914–2008), Swedish diplomat and ambassador
* Tord Hall (1910–1987), Swedish mathematician
* Tord Henriksson (born 1965), Swedish triple jumper
* Tord Holmgren (born 1957), Swedish footballer
* Tor ...


Hash Table
In computing, a hash table, also known as hash map, is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored. Ideally, the hash function will assign each key to a unique bucket, but most hash table designs employ an imperfect hash function, which might cause hash collisions where the hash function generates the same index for more than one key. Such collisions are typically accommodated in some way. In a well-dimensioned hash table, the average time complexity for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key–value pairs, ...
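The bucket lookup and collision handling described above can be sketched with separate chaining, where each bucket holds a small list of key–value pairs. Chaining is only one way to accommodate collisions. The class name, method names, and fixed bucket count are assumptions for illustration, and the sketch omits resizing.

```python
class ChainedHashTable:
    """Minimal hash table using separate chaining (no resizing)."""

    def __init__(self, num_buckets=8):
        self._buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # Hash the key, then reduce it to a bucket index.
        return hash(key) % len(self._buckets)

    def put(self, key, value):
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite existing entry
                return
        bucket.append((key, value))  # new key, possibly colliding with others

    def get(self, key, default=None):
        for existing_key, value in self._buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default

    def delete(self, key):
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                return True
        return False


table = ChainedHashTable(num_buckets=2)
for k in (0, 2, 4):  # small ints hash to themselves, so all land in bucket 0
    table.put(k, str(k))
```

With only two buckets the three keys collide, yet each remains retrievable because lookup compares keys within the bucket. Lookup stays fast on average only while the chains stay short, which is why practical designs grow the bucket array as entries are added.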




Top Chess Engine Championship
Top Chess Engine Championship, formerly known as Thoresen Chess Engines Competition (TCEC or nTCEC), is a computer chess tournament that has been run since 2010. It was organized, directed, and hosted by Martin Thoresen until the end of Season 6; from Season 7 onward it has been organized by Chessdom. It is often regarded as the Unofficial World Computer Chess Championship because of its strong participant line-up and long time-control matches on high-end hardware, giving rise to very high-class chess. The tournament has attracted nearly all the top engines compared to the World Computer Chess Championship. After a short break in 2012, TCEC was restarted in early 2013 (as nTCEC) and is currently active (renamed as TCEC in early 2014) with 24/7 live broadcasts of chess matches on its website. Since season 5, TCEC has been sponsored by Chessdom Arena.
Overview
Basic structure of competition
The TCEC competition is divided into seasons, where each season happens over a cour ...


Draw (chess)
In chess, there are a number of ways that a game can end in a draw, neither player winning. Draws are codified by various rules of chess including stalemate (when the player to move is not in check but has no legal move), threefold repetition (when the same position occurs three times with the same player to move), and the fifty-move rule (when the last fifty successive moves made by both players contain no capture or pawn move). Under the standard FIDE rules, a draw also occurs in a dead position (when no sequence of legal moves can lead to checkmate), most commonly when neither player has sufficient material to checkmate the opponent. Unless specific tournament rules forbid it, players may agree to a draw at any time. Ethical considerations may make a draw uncustomary in situations where at least one player has a reasonable chance of winning. For example, a draw could be called after a move or two, but this would likely be thought unsporting. In the 19th century, some tournaments, notably ...


Hyperparameter (machine Learning)
In machine learning, a hyperparameter is a parameter whose value is used to control the learning process. By contrast, the values of other parameters (typically node weights) are derived via training. Hyperparameters can be classified as model hyperparameters, that cannot be inferred while fitting the machine to the training set because they refer to the model selection task, or algorithm hyperparameters, that in principle have no influence on the performance of the model but affect the speed and quality of the learning process. An example of a model hyperparameter is the topology and size of a neural network. Examples of algorithm hyperparameters are learning rate and batch size as well as mini-batch size. Batch size can refer to the full data sample where mini-batch size would be a smaller sample set. Different model training algorithms require different hyperparameters, some simple algorithms (such as ordinary least squares regression) require none. Given these hyperparameters ...
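The split between hyperparameters and learned parameters can be illustrated with a tiny gradient-descent fit of a one-weight model y = w·x. Here the learning rate, batch size, and step count are hyperparameters set before training, while w is the parameter derived from the data. The function name, the default values, and the toy data are assumptions for illustration.

```python
def fit_slope(xs, ys, learning_rate=0.05, batch_size=3, steps=200):
    """Fit y = w * x by mini-batch gradient descent on squared error.

    `learning_rate`, `batch_size` and `steps` are hyperparameters;
    `w` is the parameter learned from the data.
    """
    w = 0.0
    n = len(xs)
    for step in range(steps):
        # Cycle through the data in consecutive mini-batches.
        start = (step * batch_size) % n
        batch = [(start + j) % n for j in range(batch_size)]
        # Gradient of the mean squared error over the mini-batch.
        grad = sum(2.0 * xs[i] * (w * xs[i] - ys[i]) for i in batch) / batch_size
        w -= learning_rate * grad
    return w


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated by the true slope w = 2
w_full = fit_slope(xs, ys, batch_size=3)  # batch covers the full sample
w_mini = fit_slope(xs, ys, batch_size=1)  # one example per step
```

Both settings recover a slope close to 2 on this data, but by different update paths. Raising `learning_rate` to 0.3 with the full batch makes the updates overshoot and diverge, which shows how a hyperparameter affects the quality of the learning process without being a learned quantity itself.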


MuZero
MuZero is a computer program developed by artificial intelligence research company DeepMind to master games without knowing their rules. Its release in 2019 included benchmarks of its performance in Go, chess, shogi, and a standard suite of Atari games. The algorithm uses an approach similar to AlphaZero. It matched AlphaZero's performance in chess and shogi, improved on its performance in Go (setting a new world record), and improved on the state of the art in mastering a suite of 57 Atari games (the Arcade Learning Environment), a visually-complex domain. MuZero was trained via self-play, with no access to rules, opening books, or endgame tablebases. The trained algorithm used the same convolutional and residual algorithms as AlphaZero, but with 20% fewer computation steps per node in the search tree.
History
On November 19, 2019, the DeepMind team released a preprint introducing MuZero.
Derivation from AlphaZero
MuZero (MZ) is a combination of the high-performance pl ...