Neurogammon
Neurogammon is a computer backgammon program written by Gerald Tesauro at IBM's Thomas J. Watson Research Center. It was the first viable computer backgammon program implemented as a neural net, and it set a new standard in computer backgammon play. It won the 1st Computer Olympiad in London in 1989, handily defeating all opponents. Its level of play was that of an intermediate-level human player. Neurogammon contains seven separate neural networks, each with a single hidden layer. One network makes doubling-cube decisions; the other six choose moves at different stages of the game. The networks were trained by backpropagation from transcripts of 400 games in which the author played against himself. The author's move was taught as the best move in each position. In 1992, Tesauro completed TD-Gammon, which combined a form of reinforcement learning ...


TD-Gammon
TD-Gammon is a computer backgammon program developed in 1992 by Gerald Tesauro at IBM's Thomas J. Watson Research Center. Its name comes from the fact that it is an artificial neural net trained by a form of temporal-difference learning, specifically TD-Lambda. TD-Gammon achieved a level of play just slightly below that of the top human backgammon players of the time. It explored strategies that humans had not pursued and led to advances in the theory of correct backgammon play.

Algorithm for play and learning
During play, TD-Gammon examines on each turn all possible legal moves and all their possible responses (two-ply lookahead), feeds each resulting board position into its evaluation function, and chooses the move that leads to the board position with the highest score. In this respect, TD-Gammon is no different from almost any other computer board-game program. TD-Gammon's innovation was in how it learned its evaluation function. TD-Gammon's learning algorithm consists ...
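
The move-selection step described above (score every position a move can lead to, then pick the best) can be sketched roughly as follows. This is a simplified one-ply illustration, not TD-Gammon's code: it omits the opponent-response lookahead, and the names `choose_move`, `toy_legal_moves`, `toy_apply_move`, and `toy_evaluate` are hypothetical stand-ins for a real backgammon move generator and a trained evaluation network.

```python
def choose_move(position, legal_moves, apply_move, evaluate):
    """Return the legal move whose resulting position scores highest.

    legal_moves, apply_move and evaluate are caller-supplied functions;
    in a real program evaluate would be the learned evaluation function.
    """
    best_move, best_score = None, float("-inf")
    for move in legal_moves(position):
        score = evaluate(apply_move(position, move))
        if score > best_score:
            best_move, best_score = move, score
    return best_move


# Toy stand-in game (an assumption for illustration only): a position is an
# integer, a move adds 1, 2 or 3 to it, and positions closer to 10 score higher.
def toy_legal_moves(position):
    return [1, 2, 3]


def toy_apply_move(position, move):
    return position + move


def toy_evaluate(position):
    return -abs(10 - position)


if __name__ == "__main__":
    print(choose_move(8, toy_legal_moves, toy_apply_move, toy_evaluate))
```

Keeping the game rules and the evaluation function as parameters mirrors the point made in the text: the search loop is generic, and the interesting part is how the evaluation function is obtained.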


1st Computer Olympiad
The Computer Olympiad is a multi-games event in which computer programs compete against each other. For many games, the Computer Olympiads are an opportunity to claim the "world's best computer player" title. The event was first contested in 1989. The majority of the games are board games, but other games such as bridge take place as well. In 2010, several puzzles were included in the competition.

History
Developed in the 1980s by David Levy, the first Computer Olympiad took place in 1989 at the Park Lane Hotel in London. The games ran on a yearly basis until after the 1992 games, when the Olympiad's ruling committee was unable to find a new organiser. This resulted in the games being suspended until 2000, when the Mind Sports Olympiad resurrected them. Recently, the International Computer Games Association (ICGA) has adopted the Computer Olympiad and tries to organise the event on an annual basis.

Games contested
The games which have been contested at each olympiad are: 1st Compute ...


Computer
A computer is a machine that can be programmed to carry out sequences of arithmetic or logical operations (computation) automatically. Modern digital electronic computers can perform generic sets of operations known as programs. These programs enable computers to perform a wide range of tasks. A computer system is a nominally complete computer that includes the hardware, operating system (main software), and peripheral equipment needed and used for full operation. This term may also refer to a group of computers that are linked and function together, such as a computer network or computer cluster. A broad range of industrial and consumer products use computers as control systems. Simple special-purpose devices like microwave ovens and remote controls are included, as are factory devices like industrial robots and computer-aided design, as well as general-purpose devi ...


Backgammon
Backgammon is a two-player board game played with counters and dice on tables boards. It is the most widespread Western member of the large family of tables games, whose ancestors date back nearly 5,000 years to the regions of Mesopotamia and Persia. The earliest record of backgammon itself dates to 17th-century England, being descended from the 16th-century game of Irish (Forgeng, Johnson and Cram (2003), p. 269). Backgammon is a two-player game of contrary movement in which each player has fifteen pieces, known traditionally as 'men' (short for 'tablemen') but increasingly known as 'checkers' in the US in recent decades. These pieces move along twenty-four 'points' according to the roll of two dice. The objective of the game is to move the fifteen pieces around the board and be first to bear off, i.e., remove them from the board. The achievement of this while the opponent is still a long way behind results in a triple wi ...


Gerald Tesauro
Gerald is a male Germanic given name meaning "rule of the spear" from the prefix "ger-" ("spear") and suffix "-wald" ("rule"). Variants include the English given name Jerrold, the feminine nickname Jeri and the Welsh language Gerallt and Irish language Gearalt. Gerald is less common as a surname. The name is also found in French as Gérald. Geraldine is the feminine equivalent.

Given name
People with the name Gerald include:

Politicians
* Gerald Boland, Ireland's longest-serving Minister for Justice
* Gerald Ford, 38th President of the United States
* Gerald Gardiner, Baron Gardiner, Lord Chancellor from 1964 to 1970
* Gerald Häfner, German MEP
* Gerald Klug, Austrian politician
* Gerald Lascelles, several people
* Gerald Nabarro, British Conservative politician
* Gerald S. McGowan, US Ambassador to Portugal
* Gerald Wellesley, 7th Duke of Wellington, British diplomat, soldier, and architect

Sports
* Gerald Asamoah, Ghanaian-born German football player
* Ge ...


Thomas J
Clarence Thomas (born June 23, 1948) is an American jurist who serves as an associate justice of the Supreme Court of the United States. He was nominated by President George H. W. Bush to succeed Thurgood Marshall and has served since 1991. After Marshall, Thomas is the second African American to serve on the Court and its longest-serving member since Anthony Kennedy's retirement in 2018. Thomas was born in Pin Point, Georgia. After his father abandoned the family, he was raised by his grandfather in a poor Gullah community near Savannah. Growing up as a devout Catholic, Thomas originally intended to be a priest in the Catholic Church but was frustrated over the church's insufficient attempts to combat racism. He abandoned his aspiration of becoming a clergyman to attend the College of the Holy Cross and, later, Yale Law School, where he was influenced by a number of conservative authors, notably Thomas Sowell, who dramatically shifted his worldview from progressive to ...


Artificial Neural Network
Artificial neural networks (ANNs), usually simply called neural networks (NNs) or neural nets, are computing systems inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron receives signals, processes them, and can signal the neurons connected to it. The "signal" at a connection is a real number, and the output of each neuron is computed by some non-linear function of the sum of its inputs. The connections are called edges. Neurons and edges typically have a weight that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection. Neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold. Typically ...
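
As a rough illustration of the single-neuron computation described above, the sketch below forms a weighted sum of the inputs and passes it through a non-linear function. The logistic sigmoid, the `bias` term, and the function names are assumptions chosen for the example; the text does not prescribe a particular non-linearity.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs passed through a logistic sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def threshold_neuron(inputs, weights, threshold):
    """Send a signal (1.0) only if the aggregate input crosses the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 if total > threshold else 0.0


if __name__ == "__main__":
    print(neuron_output([1.0, 0.5], [0.4, -0.2]))
    print(threshold_neuron([1.0, 1.0], [0.6, 0.6], threshold=1.0))
```

Learning then amounts to adjusting the entries of `weights`, which strengthens or weakens the contribution of each incoming signal.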


Backpropagation
In machine learning, backpropagation (backprop, BP) is a widely used algorithm for training feedforward artificial neural networks. Generalizations of backpropagation exist for other artificial neural networks (ANNs), and for functions generally. These classes of algorithms are all referred to generically as "backpropagation". In fitting a neural network, backpropagation computes the gradient of the loss function with respect to the weights of the network for a single input–output example, and does so efficiently, unlike a naive direct computation of the gradient with respect to each weight individually. This efficiency makes it feasible to use gradient methods for training multilayer networks, updating weights to minimize loss; gradient descent, or variants such as stochastic gradient descent, are commonly used. The backpropagation algorithm works by ...
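
To make the gradient computation above concrete, here is a minimal sketch for a network with one hidden layer and a single input–output example. The architecture (tanh hidden units, one linear output, squared-error loss), the learning rate, and all function names are assumptions made for the illustration, not details taken from the text.

```python
import math


def forward(W1, w2, x):
    """Hidden activations and scalar output of a one-hidden-layer network."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    output = sum(w * h for w, h in zip(w2, hidden))
    return hidden, output


def loss_and_gradients(W1, w2, x, target):
    """Squared-error loss and its gradients for one input-output example."""
    hidden, output = forward(W1, w2, x)
    error = output - target                  # dLoss/dOutput
    loss = 0.5 * error ** 2
    grad_w2 = [error * h for h in hidden]    # output-layer weights
    grad_W1 = []
    for j, h in enumerate(hidden):
        # Chain rule back through the output weight and the tanh derivative.
        delta = error * w2[j] * (1.0 - h * h)
        grad_W1.append([delta * xi for xi in x])
    return loss, grad_W1, grad_w2


def gradient_step(W1, w2, grad_W1, grad_w2, lr=0.1):
    """One plain gradient-descent update; returns new weight containers."""
    new_W1 = [[w - lr * g for w, g in zip(row, grow)]
              for row, grow in zip(W1, grad_W1)]
    new_w2 = [w - lr * g for w, g in zip(w2, grad_w2)]
    return new_W1, new_w2


if __name__ == "__main__":
    W1, w2 = [[0.1, -0.2], [0.4, 0.3]], [0.5, -0.5]
    x, target = [1.0, 0.5], 0.3
    loss, gW1, gw2 = loss_and_gradients(W1, w2, x, target)
    W1, w2 = gradient_step(W1, w2, gW1, gw2)
    print(loss, loss_and_gradients(W1, w2, x, target)[0])
```

The backward pass reuses the single error term for every weight, which is the efficiency the text contrasts with computing each weight's gradient independently.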


Reinforcement Learning
Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected. Instead the focus is on finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). The environment is typically stated in the form of a Markov decision process (MDP), because many reinforcement learning algorithms for this context use dynamic programming techniques. The main difference between the classical dynamic programming methods and reinforcement learning algorithms is that the latter do not assume knowledge of an exact mathematica ...
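
The balance between exploration and exploitation mentioned above can be illustrated with a small multi-armed bandit. The epsilon-greedy rule used here is one common strategy picked for the example, not something the text specifies, and the reward probabilities, `epsilon`, and function name are all hypothetical.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Learn arm values from reward alone, with no labelled correct actions."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                           # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        # Bernoulli reward drawn from the arm's hidden success probability.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental running average of the rewards seen for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print(estimates, counts, total)
```

The agent is never told which arm is best; it only sees rewards, which is the contrast with supervised learning drawn in the text.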