Risk score (or risk scoring) is the name given to a general practice in applied statistics, biostatistics, econometrics and other related disciplines, of creating an easily calculated number (the score) that reflects the level of risk in the presence of some risk factors (e.g. risk of mortality or disease in the presence of symptoms or genetic profile, risk of financial loss considering credit and financial history, etc.).
Risk scores are designed to be:
* Simple to calculate: In many cases all you need to calculate a score is a pen and a piece of paper (although some scores rely on more sophisticated or less transparent calculations that require a computer program).
* Easily interpreted: The result of the calculation is a single number, and a higher score usually means higher risk. Furthermore, many scoring methods enforce some form of monotonicity along the measured risk factors to allow a straightforward interpretation of the score (e.g. risk of mortality only increases with age, risk of payment default only increases with the amount of total debt the customer has, etc.).
* Actionable: Scores are designed around a set of possible actions that should be taken as a result of the calculated score. Effective score-based policies can be designed and executed by setting thresholds on the value of the score and associating them with escalating actions.
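The threshold-based policy described in the last item can be sketched as a simple lookup. This is a minimal illustration only: the cut-off values, the action names and the function `action_for` are hypothetical and are not taken from any particular scoring system.

```python
from bisect import bisect_right

# Hypothetical cut-offs and escalating actions, chosen only for illustration.
THRESHOLDS = [3, 6, 9]
ACTIONS = ["no action", "monitor", "review", "escalate"]


def action_for(score):
    """Return the action for the band that contains the score.

    A score equal to a threshold falls into the higher band.
    """
    return ACTIONS[bisect_right(THRESHOLDS, score)]


print(action_for(2))   # band below the first threshold
print(action_for(8))   # band between the second and third thresholds
```

Because the actions are ordered by severity and the lookup is monotone in the score, a higher score can never lead to a milder action.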
Formal definition
A typical scoring method is composed of 3 components:
# A set of consistent rules (or weights) that assign a numerical value ("points") to each risk factor, reflecting our estimate of the underlying risk.
# A formula (typically a simple sum of all accumulated points) that calculates the score.
# A set of thresholds that helps to translate the calculated score into a level of risk, or an equivalent formula or set of rules to translate the calculated score back into probabilities (leaving the nominal evaluation of severity to the practitioner).
Items 1 and 2 can be achieved by using some form of regression, which provides both the risk estimates and the formula to calculate the score. Item 3 requires setting an arbitrary set of thresholds and will usually involve expert opinion.
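The three components can be illustrated with a small sketch. All risk factors, point values and thresholds below are made up for the example (in practice the points would typically come from a fitted regression and the thresholds from expert opinion); the function names are likewise hypothetical.

```python
def points(age, smoker, systolic_bp):
    """Component 1: rules assigning points to each risk factor.

    Each rule is monotone: more of the risk factor never gives fewer points.
    All cut-offs and point values here are illustrative assumptions.
    """
    age_pts = 0 if age < 50 else 1 if age < 65 else 2
    smoker_pts = 2 if smoker else 0
    bp_pts = 0 if systolic_bp < 140 else 1
    return [age_pts, smoker_pts, bp_pts]


def score(age, smoker, systolic_bp):
    """Component 2: the score is the simple sum of all points."""
    return sum(points(age, smoker, systolic_bp))


def risk_level(total):
    """Component 3: thresholds translating the score into a risk level."""
    if total <= 1:
        return "low"
    if total <= 3:
        return "medium"
    return "high"


total = score(age=55, smoker=True, systolic_bp=130)
print(total, risk_level(total))
```

Keeping the three components in separate functions mirrors the definition above: the point rules and thresholds can be revised independently without changing the summation formula.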
Estimating risk with GLM
Risk scores are designed to represent an underlying probability of an adverse event, denoted $\lbrace Y = 1 \rbrace$, given a vector $\mathbf{X}$ of explanatory variables containing measurements of the relevant risk factors. To establish the connection between the risk factors and this probability, a set of weights $\beta$ is estimated using a generalized linear model:
: $\operatorname{E}(Y \mid \mathbf{X}) = \Pr(Y = 1 \mid \mathbf{X}) = g^{-1}(\mathbf{X}\beta)$
Where