Strength Of Association

An odds ratio (OR) is a statistic that quantifies the strength of the association between two events, A and B. It is defined as the ratio of the odds of A in the presence of B to the odds of A in the absence of B or, equivalently (due to symmetry), the ratio of the odds of B in the presence of A to the odds of B in the absence of A. Two events are independent if and only if the OR equals 1, i.e., the odds of one event are the same in either the presence or absence of the other event. If the OR is greater than 1, then A and B are associated (correlated) in the sense that, compared to the absence of B, the presence of B raises the odds of A, and symmetrically the presence of A raises the odds of B. Conversely, if the OR is less than 1, then A and B are negatively correlated, and the presence of one event reduces the odds of the other event.

Note that the odds ratio is symmetric in the two events, and no causal direction is implied (correlation does not imply causation): an OR greater than 1 does not establish that B causes A, or that A causes B.

Two similar statistics that are often used to quantify associations are the relative risk (RR) and the absolute risk reduction (ARR). Often, the parameter of greatest interest is actually the RR, which is the ratio of the probabilities analogous to the odds used in the OR. However, available data frequently do not allow for the computation of the RR or the ARR but do allow for the computation of the OR, as in case-control studies, as explained below. On the other hand, if one of the properties (A or B) is sufficiently rare (in epidemiology this is called the rare disease assumption), then the OR is approximately equal to the corresponding RR. The OR plays an important role in the logistic model.


Definition and basic properties


A motivating example, in the context of the rare disease assumption

Suppose a radiation leak in a village of 1,000 people increased the incidence of a rare disease. The total number of people exposed to the radiation was V_E=400, out of which D_E=20 developed the disease and H_E=380 stayed healthy. The total number of people not exposed was V_N=600, out of which D_N=6 developed the disease and H_N=594 stayed healthy. We can organize this in a table:

: \begin{array}{|r|cc|} \hline & \text{Diseased} & \text{Healthy} \\ \hline \text{Exposed} & 20 & 380 \\ \text{Not exposed} & 6 & 594 \\ \hline \end{array}

The ''risk'' of developing the disease given exposure is D_E/V_E = 20/400 = .05 and of developing the disease given non-exposure is D_N/V_N = 6/600 = .01. One obvious way to compare the risks is to use the ratio of the two, the ''relative risk'' (another way is to look at the absolute difference, .05 - .01 = .04).

: \text{RR} = \frac{D_E/V_E}{D_N/V_N} = \frac{20/400}{6/600} = \frac{.05}{.01} = 5\,.

The odds ratio is different. The ''odds'' of getting the disease if exposed are D_E/H_E = 20/380 \approx .052, and the odds if ''not'' exposed are D_N/H_N = 6/594 \approx .010. The ''odds ratio'' is the ratio of the two,

: \text{OR} = \frac{D_E/H_E}{D_N/H_N} = \frac{20/380}{6/594} \approx \frac{.052}{.010} = 5.2\,.

As illustrated by this example, in a rare-disease case like this, the relative risk and the odds ratio are almost the same. By definition, rare disease implies that V_E \approx H_E and V_N \approx H_N. Thus, the denominators in the relative risk and odds ratio are almost the same (400 \approx 380 and 600 \approx 594).

Relative risk is easier to understand than the odds ratio, but one reason to use the odds ratio is that data on the entire population are usually not available and random sampling must be used. In the example above, if it were very costly to interview villagers and find out if they were exposed to the radiation, then the prevalence of radiation exposure would not be known, and neither would the values of V_E or V_N. One could take a random sample of fifty villagers, but quite possibly such a random sample would not include anybody with the disease, since only 2.6% of the population are diseased. Instead, one might use a case-control study in which all 26 diseased villagers are interviewed as well as a random sample of 26 who do not have the disease. The results might turn out as follows ("might", because this is a random sample):

: \begin{array}{|r|cc|} \hline & \text{Diseased} & \text{Healthy} \\ \hline \text{Exposed} & 20 & 10 \\ \text{Not exposed} & 6 & 16 \\ \hline \end{array}

The odds in this sample of getting the disease given that someone is exposed are 20/10, and the odds given that someone is not exposed are 6/16. The odds ratio is thus \frac{20/10}{6/16} \approx 5.3. The relative risk, however, cannot be calculated, because it is the ratio of the risks of getting the disease and we would need V_E and V_N to figure those out. Because the study selected for people with the disease, half the people in the sample have the disease, which is known to be more than the population-wide prevalence.

It is standard in the medical literature to calculate the odds ratio and then use the rare-disease assumption (which is usually reasonable) to claim that the relative risk is approximately equal to it. This not only allows for the use of case-control studies, but also makes it easier to control for confounding variables such as weight or age using regression analysis, and it has the desirable properties of invariance and insensitivity to the type of sampling discussed in other sections of this article.
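The arithmetic of the motivating example can be sketched in a few lines of Python. The helper names `risk` and `odds` are illustrative choices, not part of the original text; the counts are the ones given in the village and case-control tables.

```python
def risk(cases, group_size):
    """Probability of disease: cases divided by everyone in the group."""
    return cases / group_size


def odds(cases, non_cases):
    """Odds of disease: cases divided by non-cases."""
    return cases / non_cases


# Whole-village counts from the example (D = diseased, H = healthy).
D_E, H_E = 20, 380   # exposed
D_N, H_N = 6, 594    # not exposed
V_E, V_N = D_E + H_E, D_N + H_N

relative_risk = risk(D_E, V_E) / risk(D_N, V_N)
village_odds_ratio = odds(D_E, H_E) / odds(D_N, H_N)

# Case-control sample: 26 cases and 26 sampled controls.
# V_E and V_N are unknown here, so only the odds ratio is computed.
case_control_odds_ratio = odds(20, 10) / odds(6, 16)

print(f"village RR = {relative_risk:.2f}")
print(f"village OR = {village_odds_ratio:.2f}")
print(f"case-control OR = {case_control_odds_ratio:.2f}")
```

The case-control odds ratio lands close to the village-wide one even though half of the sample is diseased, which is the point of the example.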


Definition in terms of group-wise odds

The odds ratio is the ratio of the odds of an event occurring in one group to the odds of it occurring in another group. The term is also used to refer to sample-based estimates of this ratio. These groups might be men and women, an experimental group and a control group, or any other dichotomous classification. If the probabilities of the event in each of the groups are p_1 (first group) and p_2 (second group), then the odds ratio is:

: \frac{p_1/(1-p_1)}{p_2/(1-p_2)} = \frac{p_1/q_1}{p_2/q_2} = \frac{p_1 q_2}{p_2 q_1},

where q_x = 1 - p_x. An odds ratio of 1 indicates that the condition or event under study is equally likely to occur in both groups. An odds ratio greater than 1 indicates that the condition or event is more likely to occur in the first group, and an odds ratio less than 1 indicates that it is less likely to occur in the first group. The odds ratio must be nonnegative if it is defined. It is undefined if p_2 q_1 equals zero, i.e., if p_2 equals zero or q_1 equals zero.
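A minimal sketch of the group-wise definition follows. The function name `group_odds_ratio` and the choice to raise `ValueError` in the undefined case are assumptions made for illustration.

```python
def group_odds_ratio(p1, p2):
    """Odds ratio (p1*q2)/(p2*q1) for event probabilities p1 and p2."""
    q1, q2 = 1 - p1, 1 - p2
    denominator = p2 * q1
    if denominator == 0:
        # Matches the text: undefined when p2 = 0 or q1 = 0.
        raise ValueError("odds ratio is undefined when p2 = 0 or p1 = 1")
    return (p1 * q2) / denominator


# Risks from the motivating example: 0.05 (exposed) vs 0.01 (not exposed).
print(group_odds_ratio(0.05, 0.01))
# Equal probabilities give an odds ratio of 1.
print(group_odds_ratio(0.3, 0.3))
```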


Definition in terms of joint and conditional probabilities

The odds ratio can also be defined in terms of the joint probability distribution of two binary random variables. The joint distribution of binary random variables ''X'' and ''Y'' can be written

: \begin{array}{c|cc} & Y = 1 & Y = 0 \\ \hline X = 1 & p_{11} & p_{10} \\ X = 0 & p_{01} & p_{00} \end{array}

where p_{11}, p_{10}, p_{01} and p_{00} are non-negative "cell probabilities" that sum to one. The odds for ''Y'' within the two subpopulations defined by ''X'' = 1 and ''X'' = 0 are defined in terms of the conditional probabilities given ''X'', ''i.e.'', P(Y \mid X):

: \begin{array}{c|cc} & Y = 1 & Y = 0 \\ \hline X = 1 & \frac{p_{11}}{p_{11}+p_{10}} & \frac{p_{10}}{p_{11}+p_{10}} \\ X = 0 & \frac{p_{01}}{p_{01}+p_{00}} & \frac{p_{00}}{p_{01}+p_{00}} \end{array}

Thus the odds ratio is

: \dfrac{p_{11}/(p_{11}+p_{10})}{p_{10}/(p_{11}+p_{10})} \bigg/ \dfrac{p_{01}/(p_{01}+p_{00})}{p_{00}/(p_{01}+p_{00})} = \dfrac{p_{11}p_{00}}{p_{10}p_{01}}.

The simple expression on the right, above, is easy to remember as the product of the probabilities of the "concordant cells" (''X'' = ''Y'') divided by the product of the probabilities of the "discordant cells" (''X'' ≠ ''Y''). However, note that in some applications the labeling of categories as zero and one is arbitrary, so there is nothing special about concordant versus discordant values in these applications.


Symmetry

If we had calculated the odds ratio based on the conditional probabilities given ''Y'',

: \begin{array}{c|cc} & Y = 1 & Y = 0 \\ \hline X = 1 & \frac{p_{11}}{p_{11}+p_{01}} & \frac{p_{10}}{p_{10}+p_{00}} \\ X = 0 & \frac{p_{01}}{p_{11}+p_{01}} & \frac{p_{00}}{p_{10}+p_{00}} \end{array}

we would have obtained the same result

: \dfrac{p_{11}/(p_{11}+p_{01})}{p_{01}/(p_{11}+p_{01})} \bigg/ \dfrac{p_{10}/(p_{10}+p_{00})}{p_{00}/(p_{10}+p_{00})} = \dfrac{p_{11}p_{00}}{p_{10}p_{01}}.

Other measures of effect size for binary data, such as the relative risk, do not have this symmetry property.
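The symmetry can be checked numerically. The sketch below computes the odds ratio three ways for one joint distribution: from the cross-product of cell probabilities, from the conditionals given ''X'', and from the conditionals given ''Y''. The function names and the example cell probabilities are illustrative assumptions.

```python
def cross_product_or(p11, p10, p01, p00):
    """Concordant cells over discordant cells."""
    return (p11 * p00) / (p10 * p01)


def or_conditioning_on_x(p11, p10, p01, p00):
    """Odds of Y=1 when X=1, divided by odds of Y=1 when X=0."""
    odds_x1 = (p11 / (p11 + p10)) / (p10 / (p11 + p10))
    odds_x0 = (p01 / (p01 + p00)) / (p00 / (p01 + p00))
    return odds_x1 / odds_x0


def or_conditioning_on_y(p11, p10, p01, p00):
    """Odds of X=1 when Y=1, divided by odds of X=1 when Y=0."""
    odds_y1 = (p11 / (p11 + p01)) / (p01 / (p11 + p01))
    odds_y0 = (p10 / (p10 + p00)) / (p00 / (p10 + p00))
    return odds_y1 / odds_y0


def rr_of_y_given_x(p11, p10, p01, p00):
    """Relative risk of Y=1 comparing X=1 with X=0."""
    return (p11 / (p11 + p10)) / (p01 / (p01 + p00))


def rr_of_x_given_y(p11, p10, p01, p00):
    """Relative risk of X=1 comparing Y=1 with Y=0."""
    return (p11 / (p11 + p01)) / (p10 / (p10 + p00))


cells = (0.4, 0.1, 0.2, 0.3)  # hypothetical p11, p10, p01, p00
print(cross_product_or(*cells),
      or_conditioning_on_x(*cells),
      or_conditioning_on_y(*cells))
# The two relative risks differ, unlike the odds ratio.
print(rr_of_y_given_x(*cells), rr_of_x_given_y(*cells))
```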


Relation to statistical independence

If ''X'' and ''Y'' are independent, their joint probabilities can be expressed in terms of their marginal probabilities p_x = P(X = 1) and p_y = P(Y = 1), as follows

: \begin{array}{c|cc} & Y = 1 & Y = 0 \\ \hline X = 1 & p_x p_y & p_x(1-p_y) \\ X = 0 & (1-p_x)p_y & (1-p_x)(1-p_y) \end{array}

In this case, the odds ratio equals one, and conversely the odds ratio can only equal one if the joint probabilities can be factored in this way. Thus the odds ratio equals one if and only if ''X'' and ''Y'' are independent.
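As a quick numerical check of the forward direction, the sketch below builds the product-form table from two marginals and confirms that its cross-product ratio is 1. The function names and the marginal values are arbitrary illustrations.

```python
def independent_table(p_x, p_y):
    """Cell probabilities (p11, p10, p01, p00) when X and Y are independent."""
    return (p_x * p_y,
            p_x * (1 - p_y),
            (1 - p_x) * p_y,
            (1 - p_x) * (1 - p_y))


def table_odds_ratio(p11, p10, p01, p00):
    return (p11 * p00) / (p10 * p01)


for p_x, p_y in [(0.5, 0.5), (0.1, 0.8), (0.37, 0.02)]:
    print(p_x, p_y, table_odds_ratio(*independent_table(p_x, p_y)))
```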


Recovering the cell probabilities from the odds ratio and marginal probabilities

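The odds ratio is a function of the cell probabilities, and conversely, the cell probabilities can be recovered given knowledge of the odds ratio and the marginal probabilities P(X = 1) = p_{11} + p_{10} and P(Y = 1) = p_{11} + p_{01}. If the odds ratio ''R'' differs from 1, then

: p_{11} = \frac{1 + (p_{1\cdot} + p_{\cdot 1})(R-1) - S}{2(R-1)}

where p_{1\cdot} = p_{11} + p_{10}, p_{\cdot 1} = p_{11} + p_{01}, and

: S = \sqrt{(1 + (p_{1\cdot} + p_{\cdot 1})(R-1))^2 + 4R(1-R)p_{1\cdot}p_{\cdot 1}}.

In the case where ''R'' = 1, we have independence, so p_{11} = p_{1\cdot}p_{\cdot 1}. Once we have p_{11}, the other three cell probabilities can easily be recovered from the marginal probabilities.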
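The recovery can be sketched as a small function. It solves the quadratic that results from setting the cross-product ratio equal to ''R'' and takes the root with the minus sign in front of the square root. The function and argument names are illustrative assumptions.

```python
import math


def recover_cells(R, p_row, p_col):
    """Recover (p11, p10, p01, p00) from the odds ratio R and the
    marginals p_row = P(X=1) and p_col = P(Y=1)."""
    if R == 1:
        p11 = p_row * p_col          # independence
    else:
        t = 1 + (p_row + p_col) * (R - 1)
        S = math.sqrt(t * t + 4 * R * (1 - R) * p_row * p_col)
        p11 = (t - S) / (2 * (R - 1))
    p10 = p_row - p11
    p01 = p_col - p11
    p00 = 1 - p11 - p10 - p01
    return p11, p10, p01, p00


# Round trip on a hypothetical table with OR = (0.4*0.3)/(0.1*0.2) = 6.
print(recover_cells(6, 0.5, 0.6))
```

The same call with an odds ratio below 1 works unchanged, because the sign of the denominator flips together with the sign of the numerator.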


Example

Suppose that in a sample of 100 men, 90 drank wine in the previous week (so 10 did not), while in a sample of 80 women only 20 drank wine in the same period (so 60 did not). This forms the contingency table (''M'' = 1 for men, ''D'' = 1 for those who drank wine):

: \begin{array}{c|cc} & M = 1 & M = 0 \\ \hline D = 1 & 90 & 20 \\ D = 0 & 10 & 60 \end{array}

The odds ratio (OR) can be directly calculated from this table as:

: \text{OR} = \frac{90 \times 60}{10 \times 20} = 27

Alternatively, the odds of a man drinking wine are 90 to 10, or 9:1, while the odds of a woman drinking wine are only 20 to 60, or 1:3 ≈ 0.33. The odds ratio is thus 9/(1/3), or 27, showing that men are much more likely to drink wine than women. The detailed calculation is:

: \text{OR} = \frac{90/10}{20/60} = \frac{9}{1/3} = 27

This example also shows how odds ratios are sometimes sensitive in stating relative positions: in this sample men are (90/100)/(20/80) = 3.6 times as likely to have drunk wine as women, but have 27 times the odds. The logarithm of the odds ratio, the difference of the logits of the probabilities, tempers this effect, and also makes the measure symmetric with respect to the ordering of groups. For example, using natural logarithms, an odds ratio of 27/1 maps to 3.296, and an odds ratio of 1/27 maps to −3.296.
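The wine example can be reproduced directly from the four counts. The variable names below are illustrative; the numbers are those in the table.

```python
import math

men_drank, men_not = 90, 10
women_drank, women_not = 20, 60

odds_men = men_drank / men_not          # 9 to 1
odds_women = women_drank / women_not    # 1 to 3
wine_odds_ratio = odds_men / odds_women

# Ratio of probabilities, for comparison with the odds ratio.
wine_relative_risk = (men_drank / (men_drank + men_not)) / (
    women_drank / (women_drank + women_not))

# The log odds ratio changes sign when the groups are swapped.
log_or_men_vs_women = math.log(wine_odds_ratio)
log_or_women_vs_men = math.log(1 / wine_odds_ratio)

print(wine_odds_ratio, wine_relative_risk)
print(round(log_or_men_vs_women, 3), round(log_or_women_vs_men, 3))
```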


Statistical inference

Several approaches to statistical inference for odds ratios have been developed. One approach to inference uses large sample approximations to the sampling distribution of the log odds ratio (the natural logarithm of the odds ratio). If we use the joint probability notation defined above, the population log odds ratio is

: \log\left(\frac{p_{11}p_{00}}{p_{01}p_{10}}\right) = \log(p_{11}) + \log(p_{00}) - \log(p_{10}) - \log(p_{01}).

If we observe data in the form of a contingency table

: \begin{array}{c|cc} & Y = 1 & Y = 0 \\ \hline X = 1 & n_{11} & n_{10} \\ X = 0 & n_{01} & n_{00} \end{array}

then the probabilities in the joint distribution can be estimated as

: \begin{array}{c|cc} & Y = 1 & Y = 0 \\ \hline X = 1 & \hat{p}_{11} & \hat{p}_{10} \\ X = 0 & \hat{p}_{01} & \hat{p}_{00} \end{array}

where \hat{p}_{ij} = n_{ij}/n, with n = n_{11} + n_{10} + n_{01} + n_{00} being the sum of all four cell counts. The sample log odds ratio is

: L = \log\left(\frac{\hat{p}_{11}\hat{p}_{00}}{\hat{p}_{10}\hat{p}_{01}}\right) = \log\left(\frac{n_{11}n_{00}}{n_{10}n_{01}}\right).

The distribution of the log odds ratio is approximately normal with:

: L\ \sim\ \mathcal{N}(\log(OR),\,\sigma^2).

The standard error for the log odds ratio is approximately

: \text{SE} = \sqrt{\frac{1}{n_{11}} + \frac{1}{n_{10}} + \frac{1}{n_{01}} + \frac{1}{n_{00}}}.

This is an asymptotic approximation, and will not give a meaningful result if any of the cell counts are very small. If ''L'' is the sample log odds ratio, an approximate 95% confidence interval for the population log odds ratio is L \pm 1.96\,\text{SE}. This can be mapped to \exp(L - 1.96\,\text{SE}), \exp(L + 1.96\,\text{SE}) to obtain a 95% confidence interval for the odds ratio. If we wish to test the hypothesis that the population odds ratio equals one, the two-sided p-value is 2P(Z < -|L|/\text{SE}), where ''P'' denotes a probability, and ''Z'' denotes a standard normal random variable.

An alternative approach to inference for odds ratios looks at the distribution of the data conditionally on the marginal frequencies of ''X'' and ''Y''. An advantage of this approach is that the sampling distribution of the odds ratio can be expressed exactly.
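The large-sample procedure described above can be sketched with the standard library alone. The function names are illustrative assumptions, and the standard normal distribution function is built from `math.erfc`. Applying it to the wine-drinking counts from the earlier example is only an illustration of the mechanics.

```python
import math


def standard_normal_cdf(z):
    """P(Z < z) for a standard normal Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def log_odds_ratio_inference(n11, n10, n01, n00, z_crit=1.96):
    """Sample log odds ratio, its approximate standard error, an
    approximate 95% confidence interval for the odds ratio, and the
    two-sided p-value for the hypothesis OR = 1.

    All four counts must be positive; the approximation is poor when
    any of them is very small.
    """
    L = math.log((n11 * n00) / (n10 * n01))
    se = math.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    ci = (math.exp(L - z_crit * se), math.exp(L + z_crit * se))
    p_value = 2 * standard_normal_cdf(-abs(L) / se)
    return L, se, ci, p_value


L, se, ci, p_value = log_odds_ratio_inference(90, 10, 20, 60)
print(f"L = {L:.3f}, SE = {se:.3f}")
print(f"95% CI for OR: ({ci[0]:.1f}, {ci[1]:.1f})")
print(f"p-value = {p_value:.2e}")
```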


Role in logistic regression

Logistic regression is one way to generalize the odds ratio beyond two binary variables. Suppose we have a binary response variable ''Y'' and a binary predictor variable ''X'', and in addition we have other predictor variables Z_1, ..., Z_p that may or may not be binary. If we use multiple logistic regression to regress ''Y'' on ''X'', Z_1, ..., Z_p, then the estimated coefficient \hat{\beta}_x for ''X'' is related to a conditional odds ratio. Specifically, at the population level

: e^{\beta_x} = \exp(\beta_x) = \frac{P(Y=1 \mid X=1, Z_1, \ldots, Z_p)/P(Y=0 \mid X=1, Z_1, \ldots, Z_p)}{P(Y=1 \mid X=0, Z_1, \ldots, Z_p)/P(Y=0 \mid X=0, Z_1, \ldots, Z_p)},

so \exp(\hat{\beta}_x) is an estimate of this conditional odds ratio. The interpretation of \exp(\hat{\beta}_x) is as an estimate of the odds ratio between ''Y'' and ''X'' when the values of Z_1, ..., Z_p are held fixed.
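In the simplest case, with one binary predictor and no additional covariates, the link between the coefficient and the odds ratio can be checked by fitting the model by hand. The sketch below is a bare Newton-Raphson fit written for this illustration only (the function name, argument names and iteration count are assumptions); it is not how a statistics package would be used in practice, and it does not cover the multiple-covariate case described above.

```python
import math


def fit_logistic_binary(s1, n1, s0, n0, iterations=25):
    """Fit logit P(Y=1) = b0 + b1*X for a binary X by Newton-Raphson.

    s1, n1: number with Y=1 and group size among units with X=1
    s0, n0: number with Y=1 and group size among units with X=0
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p0 = 1 / (1 + math.exp(-b0))
        p1 = 1 / (1 + math.exp(-(b0 + b1)))
        # Score vector (gradient of the log-likelihood).
        g0 = (s0 - n0 * p0) + (s1 - n1 * p1)
        g1 = s1 - n1 * p1
        # Information matrix is [[w0 + w1, w1], [w1, w1]].
        w0 = n0 * p0 * (1 - p0)
        w1 = n1 * p1 * (1 - p1)
        det = w0 * w1
        # Newton step: inverse information matrix times the score.
        b0 += (w1 * g0 - w1 * g1) / det
        b1 += (-w1 * g0 + (w0 + w1) * g1) / det
    return b0, b1


# Wine example: Y = drank wine, X = 1 for men (90 of 100), X = 0 for women (20 of 80).
b0, b1 = fit_logistic_binary(90, 100, 20, 80)
print(math.exp(b1))
```

With these counts the exponentiated slope agrees with the sample odds ratio of 27 computed from the table.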


Insensitivity to the type of sampling

If the data form a "population sample", then the cell probabilities \hat{p}_{ij} are interpreted as the frequencies of each of the four groups in the population as defined by their ''X'' and ''Y'' values. In many settings it is impractical to obtain a population sample, so a selected sample is used. For example, we may choose to sample units with ''X'' = 1 with a given probability ''f'', regardless of their frequency in the population (which would necessitate sampling units with ''X'' = 0 with probability 1 − ''f''). In this situation, our data would follow the following joint probabilities:

: \begin{array}{c|cc} & Y = 1 & Y = 0 \\ \hline X = 1 & \frac{f p_{11}}{p_{11}+p_{10}} & \frac{f p_{10}}{p_{11}+p_{10}} \\ X = 0 & \frac{(1-f)p_{01}}{p_{01}+p_{00}} & \frac{(1-f)p_{00}}{p_{01}+p_{00}} \end{array}

The ''odds ratio'' for this distribution does not depend on the value of ''f''. This shows that the odds ratio (and consequently the log odds ratio) is invariant to non-random sampling based on one of the variables being studied. Note however that the standard error of the log odds ratio does depend on the value of ''f''. This fact is exploited in two important situations:

* Suppose it is inconvenient or impractical to obtain a population sample, but it is practical to obtain a convenience sample of units with different ''X'' values, such that within the ''X'' = 0 and ''X'' = 1 subsamples the ''Y'' values are representative of the population (i.e. they follow the correct conditional probabilities).
* Suppose the marginal distribution of one variable, say ''X'', is very skewed. For example, if we are studying the relationship between high alcohol consumption and pancreatic cancer in the general population, the incidence of pancreatic cancer would be very low, so it would require a very large population sample to get a modest number of pancreatic cancer cases. However, we could use data from hospitals to contact most or all of their pancreatic cancer patients, and then randomly sample an equal number of subjects without pancreatic cancer (this is called a "case-control study").

In both these settings, the odds ratio can be calculated from the selected sample, without biasing the results relative to what would have been obtained for a population sample.
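The invariance to the sampling fraction ''f'' can be verified numerically. In the sketch below the population cell probabilities are hypothetical and the function names are illustrative; the resampled table follows the joint probabilities given above.

```python
def joint_after_sampling(p11, p10, p01, p00, f):
    """Joint probabilities when units with X=1 are sampled with
    probability f and units with X=0 with probability 1 - f."""
    row1 = p11 + p10
    row0 = p01 + p00
    return (f * p11 / row1,
            f * p10 / row1,
            (1 - f) * p01 / row0,
            (1 - f) * p00 / row0)


def cell_odds_ratio(p11, p10, p01, p00):
    return (p11 * p00) / (p10 * p01)


population = (0.02, 0.08, 0.09, 0.81)   # hypothetical p11, p10, p01, p00
print("population OR:", cell_odds_ratio(*population))
for f in (0.1, 0.5, 0.9):
    sampled = joint_after_sampling(*population, f)
    print(f, cell_odds_ratio(*sampled))
```

Every value of ''f'' gives the same odds ratio as the population table, while the cell probabilities themselves change.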


Use in quantitative research

Due to the widespread use of logistic regression, the odds ratio is widely used in many fields of medical and social science research. The odds ratio is commonly used in survey research, in epidemiology, and to express the results of some clinical trials and of observational designs such as case-control studies. It is often abbreviated "OR" in reports. When data from multiple surveys are combined, the result is often expressed as a "pooled OR".


Relation to relative risk

As explained in the "Motivating Example" section, the relative risk is usually better than the odds ratio for understanding the relation between risk and some variable such as radiation or a new drug. That section also explains that if the rare disease assumption holds, the odds ratio is a good approximation to relative risk and that it has some advantages over relative risk. When the rare disease assumption does not hold, the odds ratio can overestimate the relative risk. If the absolute risk in the unexposed group is available, conversion between the two is calculated by:

: \text{RR} \approx \frac{\text{OR}}{1 - R_C + (R_C \times \text{OR})}

where R_C is the absolute risk of the unexposed group.

If the rare disease assumption does not apply, the odds ratio may be very different from the relative risk and can be misleading. Consider the death rate of men and women passengers when the Titanic sank. Of 462 women, 154 died and 308 survived. Of 851 men, 709 died and 142 survived. Clearly a man on the Titanic was more likely to die than a woman, but how much more likely? Since over half the passengers died, the rare disease assumption is strongly violated.

To compute the odds ratio, note that for women the odds of dying were 1 to 2 (154/308). For men, the odds were 5 to 1 (709/142). The odds ratio is 9.99 (4.99/.5). Men had ten times the odds of dying as women. For women, the probability of death was 33% (154/462). For men the probability was 83% (709/851). The relative risk of death is 2.5 (.83/.33). A man had 2.5 times a woman's probability of dying.

Which number correctly represents how much more dangerous it was to be a man on the Titanic? Relative risk has the advantage of being easier to understand and of better representing how people think.
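The Titanic figures and the conversion formula can be checked together. In this sketch women are treated as the unexposed group, so R_C is the women's probability of death; the function name `rr_from_or` is an illustrative assumption.

```python
def rr_from_or(odds_ratio, risk_unexposed):
    """Convert an odds ratio to a relative risk given the absolute
    risk in the unexposed group (R_C in the formula above)."""
    return odds_ratio / (1 - risk_unexposed + risk_unexposed * odds_ratio)


women_died, women_survived = 154, 308
men_died, men_survived = 709, 142

odds_women = women_died / women_survived
odds_men = men_died / men_survived
titanic_or = odds_men / odds_women

risk_women = women_died / (women_died + women_survived)
risk_men = men_died / (men_died + men_survived)
titanic_rr = risk_men / risk_women

converted_rr = rr_from_or(titanic_or, risk_women)

print(f"OR = {titanic_or:.2f}")
print(f"RR = {titanic_rr:.2f}")
print(f"RR recovered from OR and R_C = {converted_rr:.2f}")
```

When R_C is computed from the same table as the odds ratio, as here, the converted value coincides with the directly computed relative risk.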


Confusion and exaggeration

Odds ratios have often been confused with relative risk in medical literature. For non-statisticians, the odds ratio is a difficult concept to comprehend, and it gives a more impressive figure for the effect. However, most authors consider that the relative risk is readily understood. In one study, members of a national disease foundation were actually 3.5 times more likely than nonmembers to have heard of a common treatment for that disease – but the odds ratio was 24 and the paper stated that members were ‘more than 20-fold more likely to have heard of’ the treatment. A study of papers published in two journals reported that 26% of the articles that used an odds ratio interpreted it as a risk ratio.

This may reflect the simple process of uncomprehending authors choosing the most impressive-looking and publishable figure. But its use may in some cases be deliberately deceptive. It has been suggested that the odds ratio should only be presented as a measure of effect size when the risk ratio cannot be estimated directly.


Invertibility and invariance

The odds ratio has another unique property: it is directly mathematically invertible, whether the outcome is analyzed as disease survival or as disease onset incidence. The OR for survival is the direct reciprocal of the OR for risk, i.e., 1/OR. This is known as the 'invariance of the odds ratio'. In contrast, the relative risk does not possess this invertibility property when studying disease survival vs. onset incidence.

This contrast is best illustrated with an example. Suppose that in a clinical trial the adverse event risk is 4/100 in the drug group and 2/100 in the placebo group, yielding RR = 2 and OR = 2.04166 for drug-vs-placebo adverse risk. If the analysis is inverted and adverse events are instead analyzed as event-free survival, the drug group has a rate of 96/100 and the placebo group a rate of 98/100, yielding a drug-vs-placebo RR = 0.9796 for survival but an OR = 0.48979. An RR of 0.9796 is clearly not the reciprocal of an RR of 2, whereas an OR of 0.48979 is indeed the direct reciprocal of an OR of 2.04166. This is why an RR for survival is not interchangeable with an RR for risk, while the OR has this symmetrical property when analyzing either survival or adverse risk.

The danger to clinical interpretation of the OR arises when the adverse event rate is not rare: the OR then exaggerates differences, because the rare-disease assumption is not met. On the other hand, when the disease is rare, using an RR for survival (e.g. the RR = 0.9796 from the example above) can clinically hide an important doubling of adverse risk associated with a drug or exposure.
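The clinical-trial numbers in this section can be reproduced with a short calculation. The helper name `rr_and_or` is an illustrative assumption; the counts are those of the example (4 and 2 adverse events per 100 patients).

```python
def rr_and_or(events_a, n_a, events_b, n_b):
    """Relative risk and odds ratio of group A versus group B."""
    risk_a, risk_b = events_a / n_a, events_b / n_b
    rr = risk_a / risk_b
    odds_ratio = (risk_a / (1 - risk_a)) / (risk_b / (1 - risk_b))
    return rr, odds_ratio


# Outcome = adverse event: drug 4/100, placebo 2/100.
rr_risk, or_risk = rr_and_or(4, 100, 2, 100)
# Outcome = event-free survival: drug 96/100, placebo 98/100.
rr_survival, or_survival = rr_and_or(96, 100, 98, 100)

print(f"risk:     RR = {rr_risk:.4f}, OR = {or_risk:.5f}")
print(f"survival: RR = {rr_survival:.4f}, OR = {or_survival:.5f}")
print(f"OR product = {or_risk * or_survival:.6f}")
print(f"RR product = {rr_risk * rr_survival:.6f}")
```

The product of the two odds ratios is 1, while the product of the two relative risks is not.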


Estimators of the odds ratio


Sample odds ratio

The sample odds ratio n_{11}n_{00} / (n_{10}n_{01}) is easy to calculate, and for moderate and large samples performs well as an estimator of the population odds ratio. When one or more of the cells in the contingency table can have a small value, the sample odds ratio can be biased and exhibit high variance.


Alternative estimators

A number of alternative estimators of the odds ratio have been proposed to address limitations of the sample odds ratio. One alternative estimator is the conditional maximum likelihood estimator, which conditions on the row and column margins when forming the likelihood to maximize (as in Fisher's exact test). Another alternative estimator is the Mantel–Haenszel estimator.
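The text names the Mantel–Haenszel estimator without giving its form. As a hedged sketch, assuming the usual stratified definition (the sum over strata of n11·n00/n divided by the sum over strata of n10·n01/n), it can be written as follows. The function name and the two-stratum counts are hypothetical and serve only to show the mechanics.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio.

    strata: iterable of (n11, n10, n01, n00) count tuples, one per stratum.
    Assumes the usual definition: sum(n11*n00/n) / sum(n10*n01/n).
    """
    numerator = 0.0
    denominator = 0.0
    for n11, n10, n01, n00 in strata:
        n = n11 + n10 + n01 + n00
        numerator += n11 * n00 / n
        denominator += n10 * n01 / n
    return numerator / denominator


# With a single stratum the estimator reduces to the sample odds ratio.
print(mantel_haenszel_or([(90, 20, 10, 60)]))

# Hypothetical strata that each have a within-stratum odds ratio of 3.
print(mantel_haenszel_or([(10, 20, 5, 30), (40, 10, 20, 15)]))
```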


Numerical examples

The following four contingency tables contain observed cell counts, along with the corresponding sample odds ratio (''OR'') and sample log odds ratio (''LOR''):

The following joint probability distributions contain the population cell probabilities, along with the corresponding population odds ratio (''OR'') and population log odds ratio (''LOR''):


Numerical example


Related statistics

There are various other summary statistics for contingency tables that measure association between two events, such as Yule's ''Y'' and Yule's ''Q''; these two are normalized so that they are 0 for independent events, 1 for perfectly correlated events, and −1 for perfectly negatively correlated events. One author who studied these measures argued that they must be functions of the odds ratio, which he referred to as the cross-ratio.
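Both of Yule's measures can be written as functions of the odds ratio. The sketch below assumes the usual definitions, Q = (OR − 1)/(OR + 1) and Y = (√OR − 1)/(√OR + 1), which are not spelled out in the text above; the function names are illustrative.

```python
import math


def yule_q(odds_ratio):
    """Yule's Q, assuming Q = (OR - 1) / (OR + 1)."""
    if math.isinf(odds_ratio):
        return 1.0  # limiting value for a perfect positive association
    return (odds_ratio - 1) / (odds_ratio + 1)


def yule_y(odds_ratio):
    """Yule's Y, assuming Y = (sqrt(OR) - 1) / (sqrt(OR) + 1)."""
    if math.isinf(odds_ratio):
        return 1.0
    root = math.sqrt(odds_ratio)
    return (root - 1) / (root + 1)


for value in (1.0, 27.0, 1 / 27, 0.0, math.inf):
    print(value, yule_q(value), yule_y(value))
```

An odds ratio of 1 maps to 0 under both measures, and replacing the odds ratio by its reciprocal flips the sign.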


See also

* Cohen's h
* Cross-ratio
* Diagnostic odds ratio
* Forest plot
* Hazard ratio
* Likelihood ratio
* Rate ratio



External links


Odds Ratio Calculator – website



OpenEpi, a web-based program that calculates the odds ratio, both unmatched and pair-matched