Performance rating (abbreviated as Rp) in chess is the level a player performed at in a tournament or match based on the number of games played, their total score in those games, and the Elo ratings of their opponents. It is the Elo rating a player would have if their performance resulted in no net rating change.
Due to the difficulty of computing performance rating in this manner, however, the linear method and FIDE method for calculating performance rating are in much more widespread use. With these simpler methods, only the average rating (abbreviated as Ra) factors into the calculation instead of the rating of each individual opponent. Regardless of the method, only the total score is used to determine performance rating instead of individual game results.
FIDE performance ratings are also used to determine if a player has achieved a norm for FIDE titles such as Grandmaster (GM).
Definition
A player's performance rating in a series of games is the Elo rating a player would need to have to expect to get their actual total score against the opponents they faced in those games. A practical way to understand the performance rating is to note that a player's actual rating changes after each game played. By the definition, the only way a player's actual rating would not change after this series of games is if their rating at the start of these games was already their performance rating over the series. With this definition, individual game results do not directly factor into the calculation. Unlike the linear and FIDE methods, however, the ratings of individual opponents do affect the calculation.
Mathematical definition
Given a total score $s$ over a series of $n$ games and opponent ratings $r_1, \dots, r_n$, the perfect performance rating is the number $R_p$ where the expected score (calculated from opponent ratings) matches the actual score $s$:

$$\sum_{i=1}^{n} \frac{1}{1 + 10^{(r_i - R_p)/400}} = s$$

Note that in the two limiting cases:
* If a player loses all their games ($s = 0$), their performance rating is $-\infty$.
* If a player wins all their games ($s = n$), their performance rating is $+\infty$.
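As an illustrative special case (not part of the definition itself), suppose every opponent has the same rating $r$, and write $s$ for the total score, $n$ for the number of games, and $R_p$ for the performance rating. Assuming the standard logistic Elo expected-score formula with a 400-point scale, the defining equation can then be solved in closed form:

```latex
\begin{aligned}
\frac{n}{1 + 10^{(r - R_p)/400}} &= s \\
10^{(r - R_p)/400} &= \frac{n - s}{s} \\
R_p &= r + 400 \log_{10} \frac{s}{n - s}
\end{aligned}
```

As $s \to 0$ the logarithm tends to $-\infty$, and as $s \to n$ it tends to $+\infty$, which reproduces the two limiting cases. At an even score, $s = n/2$, the logarithm vanishes and $R_p = r$.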
Calculation
Since the expected score is a monotonically increasing function of the player's own rating, we can find the performance rating $R_p$ by performing binary search over the domain. We start by setting a reasonable lower and upper bound for ratings (here, 0 to 4000) and then check the expected score at the midpoint (2000). If the actual score is higher than this expectation, it indicates the player's performance was better than 2000, so we repeat the search on the upper half (2000 to 4000, midpoint at 3000). This process repeats until the bounds have narrowed to the desired precision around $R_p$.
A sample implementation in Python follows:
def expected_score(opponent_ratings: list[float], own_rating: float) -> float:
    """How many points we expect to score in a tourney with these opponents"""
    return sum(
        1 / (1 + 10**((opponent_rating - own_rating) / 400))
        for opponent_rating in opponent_ratings
    )


def performance_rating(opponent_ratings: list[float], score: float) -> int:
    """Calculate mathematically perfect performance rating with binary search"""
    lo, hi = 0, 4000  # lower and upper bounds of the search interval
    while hi - lo > 0.001:
        mid = (lo + hi) / 2
        if expected_score(opponent_ratings, mid) < score:
            lo = mid  # expectation too low: the performance rating is higher
        else:
            hi = mid  # expectation high enough: the performance rating is not higher
    return round(mid)


print(performance_rating([1851, 2457, 1989, 2379, 2407], 4))  # should be 2551
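A bounded binary search of this kind cannot return the infinite values of the two limiting cases; for a zero or perfect score it simply converges on one of its bounds. The following is a minimal sketch of one way to handle that, assuming the same logistic expected-score formula. The function name `performance_rating_with_limits` and the parameters `lo`, `hi`, and `tol` are hypothetical names chosen for this illustration.

```python
import math


def expected_score(opponent_ratings, own_rating):
    """Expected total score against the given opponents (logistic Elo curve)."""
    return sum(1 / (1 + 10 ** ((r - own_rating) / 400)) for r in opponent_ratings)


def performance_rating_with_limits(opponent_ratings, score, lo=0.0, hi=4000.0, tol=0.001):
    """Binary search for the performance rating, with explicit limiting cases."""
    n = len(opponent_ratings)
    if score <= 0:
        return -math.inf  # zero score: no finite rating expects this result
    if score >= n:
        return math.inf  # perfect score: no finite rating expects this result
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_score(opponent_ratings, mid) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2  # unrounded, so the result can be checked below


opponents = [1851, 2457, 1989, 2379, 2407]
rp = performance_rating_with_limits(opponents, 4)
# At the performance rating, the expected score matches the actual score.
print(round(expected_score(opponents, rp), 2))  # 4.0
```

Returning the unrounded value makes it easy to confirm the defining property: plugging the result back into `expected_score` recovers the actual score to within the search tolerance.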
FIDE performance rating
FIDE calculates a player's performance rating $R_p$ as $R_p = R_a + d_p$, where $R_a$ is the average rating of the opponents and $d_p$ is an additional rating difference based on the player's total score divided by the number of rounds played. That fractional score is called $p$. There is no analytic expression for $d_p$. Instead, FIDE provides a lookup table for the values of $d_p$ based on the values of $p$ rounded to the nearest hundredth. The values of $d_p$ for common lengths of tournaments (eight to eleven rounds) are listed below under rating difference examples.
Like the true definition, the FIDE method also does not depend on individual game results. Unlike the true definition, the FIDE method does not depend on individual opponent ratings.
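The FIDE calculation can be sketched as an average plus a table lookup. The example below is only an illustration under stated assumptions: `DP_EXCERPT` is a hypothetical four-entry excerpt, not the full FIDE table. It holds the zero, even, and perfect scores from the FIDE table, plus the 83% entry implied by the 2773 figure in the comparison example later in this article. The function name and the choice of integer percentage keys are likewise assumptions made for this sketch.

```python
# Hypothetical excerpt of the rating-difference table, keyed by percentage score.
# The complete table is published in the FIDE Handbook and is not reproduced here.
DP_EXCERPT = {0: -800, 50: 0, 83: 273, 100: 800}


def fide_performance_rating(opponent_ratings, score, dp_table=DP_EXCERPT):
    """Average opponent rating plus the tabulated rating difference."""
    n = len(opponent_ratings)
    average = sum(opponent_ratings) / n
    # Fractional score as a whole percentage. Python's round() resolves exact
    # .5 ties to the even integer, which may differ from FIDE's convention.
    percent = round(100 * score / n)
    if percent not in dp_table:
        raise KeyError(f"no table entry for a {percent}% score in this excerpt")
    return round(average + dp_table[percent])


print(fide_performance_rating([2400, 2500, 2600], 2.5))  # 2773
```

Only the average of the opponents' ratings enters the calculation, so any set of opponents with the same average and the same total score gives the same result.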
Rating difference examples
Note: Zero scores have a rating difference of $-800$, even scores have $0$, and perfect scores have $+800$.
Use in norms
One of the requirements to earn a FIDE title in a standard manner is to achieve a certain number of norms. A norm in chess is awarded if a player has a performance rating in a tournament at or above a threshold rating. As an example, for the Grandmaster (GM) title, a player must achieve three GM norms corresponding to performance ratings of at least 2600 against opponents with an average rating of at least 2380 and must also have reached a required peak live rating of 2500. These norms are calculated with the FIDE performance rating method.
Linear performance rating
Because the FIDE method needs a lookup table to calculate the rating difference, a simpler method instead calculates the rating difference as $d_p = 800p - 400$, where $p$ is the percentage score expressed as a fraction between 0 and 1. The overall performance rating is then calculated as $R_p = R_a + d_p$, where $R_a$ is the average rating of the opponents, the same as in the FIDE method.
An equivalent way to calculate this performance rating is by taking the average of
* Opponent's rating + 400 for each win
* Opponent's rating - 400 for each loss
* Just the opponent's rating for each draw
A notable drawback of this approach is that winning against a low-rated player can actually ''lower'' your performance rating.
This method is sometimes called the linear method due to the linear dependence on the percentage score $p$. Like the true definition, the linear method also does not depend on individual game results. Unlike the true definition, the linear method does not depend on individual opponent ratings.
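The two equivalent formulations, and the drawback noted above, can be illustrated with a short sketch. It assumes the rating difference is 800 times the fractional score minus 400, which is what the per-game rule of adding 400 for a win and subtracting 400 for a loss averages out to. The function names are hypothetical.

```python
def linear_performance_rating(opponent_ratings, score):
    """Average opponent rating plus a rating difference linear in the score."""
    n = len(opponent_ratings)
    average = sum(opponent_ratings) / n
    p = score / n  # fractional score between 0 and 1
    return average + 800 * p - 400


def linear_from_results(results):
    """Same value from per-game (opponent_rating, points) pairs.

    points is 1 for a win, 0.5 for a draw, 0 for a loss, so each game
    contributes the opponent's rating +400, +0, or -400 respectively.
    """
    return sum(rating + 800 * points - 400 for rating, points in results) / len(results)


before = linear_performance_rating([2400, 2500, 2600], 2.5)
print(round(before))  # 2767

# Adding a win against a much lower-rated opponent lowers the result.
after = linear_performance_rating([2400, 2500, 2600, 1500], 3.5)
print(round(after))  # 2550
print(after < before)  # True
```

The extra win raises the fractional score only slightly, while the 1500-rated opponent pulls the average down by 250 points, so the overall result drops.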
Comparison between methods
Different methods for calculating the performance rating generally give similar results. The only score in which all methods give exactly the same result is an even score against opponents with no skew away from their average rating, in which case the performance rating is the average of the opponents' ratings. There are larger discrepancies closer to zero scores or perfect scores, or a larger variance in the individual ratings (in which case the individual ratings have a larger effect). The true definition of the performance rating gives $-\infty$ for a zero score and $+\infty$ for a perfect score, whereas the other methods yield finite values.
As a specific example, if a player scores 2½/3 against three opponents rated 2400, 2500, and 2600, their performance ratings with the different methods are 2785 (true definition), 2773 (FIDE), and 2767 (linear).
External links
FIDE Handbook: Rating System