Mendelian randomization

In epidemiology, Mendelian randomization (commonly abbreviated to MR) is a method using measured variation in genes to interrogate the causal effect of an exposure on an outcome. Under key assumptions (see below), the design reduces both reverse causation and confounding, which often substantially impede or mislead the interpretation of results from epidemiological studies. The study design was first proposed in 1986 and subsequently described by Gray and Wheatley as a method for obtaining unbiased estimates of the effects of a putative causal variable without conducting a traditional randomized controlled trial (i.e. the "gold standard" in epidemiology for establishing causality). These authors also coined the term "Mendelian randomization".


Motivation

One of the predominant aims of epidemiology is to identify modifiable causes of health outcomes and disease, especially those of public health concern. To ascertain whether modifying a particular trait (e.g. via an intervention, treatment or policy change) will convey a beneficial effect within a population, firm evidence that this trait causes the outcome of interest is required. However, many observational epidemiological study designs are limited in their ability to distinguish correlation from causation: specifically, whether a particular trait causes an outcome of interest, is simply related to that outcome (but does not cause it), or is a consequence of the outcome itself. Only in the first case will modifying the trait reduce the burden of disease in a public health setting. There are many epidemiological study designs that aim to understand relationships between traits within a population sample, each with shared and unique advantages and limitations in terms of providing causal evidence, with the "gold standard" being randomized controlled trials.

Well-known successful demonstrations of causal evidence consistent across multiple studies with different designs include the identified causal links between smoking and lung cancer, and between blood pressure and stroke. However, there have also been notable failures, in which exposures hypothesized to be causal risk factors for a particular outcome were later shown by well-conducted randomized controlled trials not to be causal. For instance, it was previously thought that hormone replacement therapy would prevent cardiovascular disease, but it is now known to have no such benefit and may even adversely affect health. Another notable example is that of selenium and prostate cancer. Some observational studies found an association between higher circulating selenium levels (usually acquired through various foods and dietary supplements) and lower risk of prostate cancer. However, the Selenium and Vitamin E Cancer Prevention Trial (SELECT) showed evidence that dietary selenium supplementation actually increased the risk of prostate cancer, including advanced prostate cancer, and had an additional off-target effect of increasing type 2 diabetes risk.

Such inconsistencies between observational epidemiological studies and randomized controlled trials are likely a function of social, behavioral, or physiological confounding factors in many observational epidemiological designs, which are particularly difficult to measure accurately and to control for. Moreover, randomized controlled trials are usually expensive, time-consuming and laborious, and many epidemiological findings cannot be ethically replicated in clinical trials.


Definition

Mendelian randomization (MR) is fundamentally an instrumental variables estimation method originating in econometrics. The method uses germline genetic variation (usually in the form of single nucleotide polymorphisms, or SNPs) that is strongly associated with a putative exposure as a "proxy" or "instrument" for that exposure, in order to test for and estimate a causal effect of the exposure on an outcome of interest from observational data. The genetic variation used will have either well-understood effects on exposure patterns (e.g. propensity to smoke heavily) or effects that mimic those produced by modifiable exposures (e.g. raised blood cholesterol). Importantly, the genotype must affect the disease status only indirectly, via its effect on the exposure of interest.

As genotypes are assigned randomly when passed from parents to offspring during meiosis, groups of individuals defined by genetic variation associated with an exposure at a population level should be largely unrelated to the confounding factors that typically plague observational epidemiology studies. Germline genetic variation (i.e. that which can be inherited) is also fixed at conception and not modified by the onset of any outcome or disease, precluding reverse causation. Additionally, given improvements in modern genotyping technologies, measurement error and systematic misclassification are often low with genetic data. In this regard Mendelian randomization can be thought of as analogous to "nature's randomized controlled trial".

Mendelian randomization requires three core instrumental variable assumptions, namely:

1. The genetic variant(s) being used as an instrument for the exposure is associated with the exposure. This is known as the "relevance" assumption.
2. There are no common causes (i.e. confounders) of the genetic variant(s) and the outcome of interest. This is known as the "independence" or "exchangeability" assumption.
3. There is no independent pathway between the genetic variant(s) and the outcome other than through the exposure. This is known as the "exclusion restriction" or "no horizontal pleiotropy" assumption.

To support the first core assumption, Mendelian randomization requires well-characterized associations between genetic variation and exposures of interest. These are usually sourced from genome-wide association studies, though they can also come from candidate gene studies. The second assumption relies on there being no population substructure (e.g. geographical factors that induce an association between the genotype and outcome), mate choice that is not associated with genotype (i.e. random mating or panmixia) and no dynastic effects (i.e. where the expression of parental genotype in the parental phenotype directly affects the offspring phenotype).
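The way the three core assumptions let a genetic variant stand in for a randomized allocation can be illustrated with a small simulation. The following is a minimal sketch on made-up data, not a recommended analysis pipeline: the allele frequency (0.3), the effect sizes, the sample size and the function names `simulate_cohort` and `covariance` are all assumptions chosen for illustration. The genotype is drawn independently of an unmeasured confounder and affects the outcome only through the exposure, so all three assumptions hold by construction.

```python
import random


def simulate_cohort(n=50_000, true_effect=0.5, seed=1):
    """Simulate genotype G, exposure X and outcome Y with an unmeasured confounder U.

    All numeric settings here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    genotype, exposure, outcome = [], [], []
    for _ in range(n):
        # Allele count (0, 1 or 2) at allele frequency 0.3, drawn independently
        # of the confounder (the independence assumption holds by construction).
        g = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        u = rng.gauss(0, 1)  # unmeasured confounder of X and Y
        # Relevance: the genotype shifts the exposure.
        x = 0.4 * g + u + rng.gauss(0, 1)
        # Exclusion restriction: the genotype reaches Y only through X.
        y = true_effect * x + u + rng.gauss(0, 1)
        genotype.append(g)
        exposure.append(x)
        outcome.append(y)
    return genotype, exposure, outcome


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


genotype, exposure, outcome = simulate_cohort()

# Observational slope of Y on X: biased because U drives both X and Y.
naive_estimate = covariance(exposure, outcome) / covariance(exposure, exposure)

# Instrumental variable (ratio) estimate using the genotype as the instrument.
iv_estimate = covariance(genotype, outcome) / covariance(genotype, exposure)

print(f"naive regression slope: {naive_estimate:.2f}")
print(f"genetic IV estimate:    {iv_estimate:.2f}")
```

In this simulated setting the naive slope is pulled well away from the true effect of 0.5 by the confounder, while the genotype-based estimate should land close to it. If the simulation were changed so that the genotype also affected the outcome directly, violating the exclusion restriction, the second estimate would be biased as well.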


Statistical analysis

Mendelian randomization is usually applied through instrumental variables estimation, with genetic variants acting as instruments for the exposure of interest. This can be implemented using data on the genetic variants, exposure and outcome of interest for a set of individuals in a single dataset, or using summary data on the association between the genetic variants and the exposure and the association between the genetic variants and the outcome in separate datasets. The method has also been used in economic research studying the effects of obesity on earnings and other labor market outcomes.

When a single dataset is used, the methods of estimation applied are those frequently used elsewhere in instrumental variable estimation, such as two-stage least squares. If multiple genetic variants are associated with the exposure, they can either be used individually as instruments or combined to create an allele score which is used as a single instrument.

Analysis using summary data often draws on genome-wide association studies. In this case the association between genetic variants and the exposure is taken from the summary results produced by a genome-wide association study for the exposure. The association between the same genetic variants and the outcome is then taken from the summary results produced by a genome-wide association study for the outcome. These two sets of summary results are then used to obtain the MR estimate.
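For the single-dataset case, two-stage least squares with an allele score can be sketched as follows. This is an illustrative sketch on simulated data, assuming an unweighted allele score (the plain sum of allele counts) and a single instrument. The helper names `simple_ols`, `allele_score`, `two_stage_least_squares` and `simulate_individuals`, and every numeric setting, are hypothetical.

```python
import random


def simple_ols(x, y):
    """Return (intercept, slope) from a least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def allele_score(genotype_rows):
    """Unweighted allele score: total count of exposure-raising alleles per person."""
    return [sum(row) for row in genotype_rows]


def two_stage_least_squares(instrument, exposure, outcome):
    """Point estimate of the effect of exposure on outcome using one instrument."""
    # Stage 1: predict the exposure from the instrument.
    intercept, slope = simple_ols(instrument, exposure)
    predicted_exposure = [intercept + slope * z for z in instrument]
    # Stage 2: regress the outcome on the genetically predicted exposure.
    # Note: standard errors taken naively from this second regression would be
    # wrong; this sketch returns the point estimate only.
    _, causal_estimate = simple_ols(predicted_exposure, outcome)
    return causal_estimate


def simulate_individuals(n=20_000, n_variants=5, true_effect=0.3, seed=7):
    """Hypothetical individual-level data with an unmeasured confounder."""
    rng = random.Random(seed)
    genotype_rows, exposure, outcome = [], [], []
    for _ in range(n):
        row = [int(rng.random() < 0.3) + int(rng.random() < 0.3)
               for _ in range(n_variants)]
        u = rng.gauss(0, 1)  # unmeasured confounder
        x = 0.2 * sum(row) + u + rng.gauss(0, 1)
        y = true_effect * x + u + rng.gauss(0, 1)
        genotype_rows.append(row)
        exposure.append(x)
        outcome.append(y)
    return genotype_rows, exposure, outcome


genotype_rows, exposure, outcome = simulate_individuals()
score = allele_score(genotype_rows)
tsls_estimate = two_stage_least_squares(score, exposure, outcome)
print(f"2SLS estimate with allele score: {tsls_estimate:.2f}")
```

Combining the variants into one score keeps the example to a single instrument. Using the variants individually as instruments would require a multiple regression in the first stage, which is omitted here for brevity.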
Given the following notation:

$\hat{\gamma}_g$ = estimated effect of genetic variant $g$ on the exposure ($X$);
$\hat{\Gamma}_g$ = estimated effect of genetic variant $g$ on the outcome ($Y$);
$\hat{\sigma}_g$ = estimated standard error of $\hat{\Gamma}_g$;
$\hat{\beta}_{MR}$ = MR estimate of the causal effect of the exposure $X$ on the outcome $Y$;

and considering the effect of a single genetic variant, the MR estimate can be obtained from the Wald ratio:

$$\hat{\beta}_{MR} = \frac{\hat{\Gamma}_g}{\hat{\gamma}_g}$$

When multiple genetic variants are used, the individual ratios for each genetic variant are combined using inverse variance weighting, where each individual ratio is weighted by the uncertainty in its estimation. This gives the IVW estimate, which can be calculated as:

$$\hat{\beta}_{IVW} = \frac{\sum_g \hat{\gamma}_g \hat{\Gamma}_g \hat{\sigma}_g^{-2}}{\sum_g \hat{\gamma}_g^{2} \hat{\sigma}_g^{-2}}$$

Alternatively, the same estimate can be obtained from a linear regression which uses the genetic variant-outcome association as the outcome and the genetic variant-exposure association as the exposure. This linear regression is weighted by the uncertainty in the genetic variant-outcome association and does not include a constant:

$$\hat{\Gamma}_g = \beta_{IVW}\,\hat{\gamma}_g + u_g \qquad \text{weighted by } 1/\hat{\sigma}_g^{2}$$

These methods only provide reliable estimates of the causal effect of the exposure on the outcome under the core instrumental variable assumptions. Alternative methods are available that are robust to a violation of the third assumption, i.e. that provide reliable results under some types of horizontal pleiotropy. Additionally, some biases that arise from violations of the second IV assumption, such as dynastic effects, can be overcome through the use of data which includes siblings or parents and their offspring.
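The Wald ratio and the inverse-variance weighted (IVW) combination described in this section can be sketched in a few lines. The summary statistics below are made-up numbers, not results from any real genome-wide association study, and the names `wald_ratio`, `ivw_estimate` and `ivw_from_ratios` are hypothetical. The sketch assumes the usual first-order weights, in which each ratio is weighted by the squared variant-exposure effect divided by the squared standard error of the variant-outcome effect.

```python
def wald_ratio(beta_exposure, beta_outcome):
    """Ratio estimate for one variant: variant-outcome effect over variant-exposure effect."""
    return beta_outcome / beta_exposure


def ivw_estimate(betas_exposure, betas_outcome, ses_outcome):
    """IVW estimate written directly as a ratio of weighted sums.

    Equivalent to a regression through the origin of the variant-outcome effects
    on the variant-exposure effects, weighted by 1 / se_outcome**2.
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(betas_exposure, betas_outcome, ses_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(betas_exposure, ses_outcome))
    return numerator / denominator


def ivw_from_ratios(betas_exposure, betas_outcome, ses_outcome):
    """The same estimate, written as a weighted mean of per-variant Wald ratios."""
    ratios = [wald_ratio(bx, by) for bx, by in zip(betas_exposure, betas_outcome)]
    weights = [bx ** 2 / se ** 2 for bx, se in zip(betas_exposure, ses_outcome)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


# Hypothetical summary statistics for three variants (illustration only).
betas_exposure = [0.10, 0.20, 0.15]   # variant-exposure effects
betas_outcome = [0.05, 0.08, 0.09]    # variant-outcome effects
ses_outcome = [0.01, 0.02, 0.015]     # standard errors of the variant-outcome effects

per_variant = [wald_ratio(bx, by) for bx, by in zip(betas_exposure, betas_outcome)]
combined = ivw_estimate(betas_exposure, betas_outcome, ses_outcome)
combined_alt = ivw_from_ratios(betas_exposure, betas_outcome, ses_outcome)

print("per-variant Wald ratios:", [round(r, 3) for r in per_variant])
print("IVW estimate:", round(combined, 3))
```

Writing the estimator both ways makes the equivalence between the weighted mean of ratios and the weighted regression through the origin easy to check numerically. Neither function guards against pleiotropy; both rely on the core instrumental variable assumptions holding for every variant.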


History

The Mendelian randomization method depends on two principles derived from the original work by Gregor Mendel on genetic inheritance. Its foundation comes from Mendel's laws, namely (1) the law of segregation, in which there is complete segregation of the two allelomorphs in equal numbers of germ cells of a heterozygote, and (2) the law that separate pairs of allelomorphs segregate independently of one another. These were first published as such in 1906 by Robert Heath Lock.

Another progenitor of Mendelian randomization is Sewall Wright, who introduced path analysis, a form of causal diagram used for making causal inference from non-experimental data. The method relies on causal anchors, and the anchors in the majority of his examples were provided by Mendelian inheritance, as is the basis of MR. Another component of the logic of MR is the instrumental gene, the concept of which was introduced by Thomas Hunt Morgan. This is important as it removed the need to understand the physiology of the gene in order to make inferences about genetic processes.

Since that time the literature has included examples of research using molecular genetics to make inferences about modifiable risk factors, which is the essence of MR. One example is the work of Gerry Lower and colleagues in 1979, who used the N-acetyltransferase phenotype as an anchor to draw inferences about various exposures, including smoking and amine dyes, as risk factors for bladder cancer. Another example is the work of Martijn Katan (then of Wageningen University & Research, Netherlands), in which he advocated a study design using apolipoprotein E alleles as an instrumental variable anchor to study the observed relationship between low blood cholesterol levels and increased risk of cancer.

The term "Mendelian randomization" was first used in print by Richard Gray and Keith Wheatley (both of Radcliffe Infirmary, Oxford, UK) in 1991 in a somewhat different context: a method allowing instrumental variable estimation, but one relying on Mendelian inheritance rather than genotype. In their 2003 paper, Shah Ebrahim and George Davey Smith used the term again to describe the method of using germline genetic variants for understanding causality in an instrumental variable analysis, and it is this methodology that is now widely used and to which the meaning is ascribed. The Mendelian randomization method is now widely adopted in causal epidemiology, and the number of MR studies reported in the scientific literature has grown every year since the 2003 paper. In 2021 the STROBE-MR guidelines were published to assist readers and reviewers of Mendelian randomization studies in evaluating the validity and utility of published studies.




External links


Making sense of Mendelian randomisation and its use in health research