
In statistical hypothesis testing, p-rep or ''p''_rep has been proposed as a statistical alternative to the classic p-value. Whereas a p-value is the probability of obtaining a result at least as extreme as the one actually observed, assuming the null hypothesis is correct, p-rep purports to compute the probability of replicating an effect. The derivation of p-rep contained significant mathematical errors. For a while, the Association for Psychological Science recommended that articles submitted to ''Psychological Science'' and its other journals report p-rep rather than the classic p-value, but this is no longer the case.


Calculation


Approximation from ''p''

The value of p-rep (''p''_rep) can be approximated from the p-value (''p'') as follows:

: p_\text{rep} = \left[ 1 + \left( \frac{p}{1-p} \right)^{2/3} \right]^{-1}

The above applies for one-tailed distributions.
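As a minimal sketch, the approximation can be evaluated numerically. The function name `p_rep_from_p` is hypothetical and chosen only for illustration. The code assumes a one-tailed p-value strictly between 0 and 1 and the form p_rep = [1 + (p / (1 - p))^(2/3)]^(-1).

```python
def p_rep_from_p(p: float) -> float:
    """Approximate p-rep from a one-tailed p-value.

    Assumes the approximation p_rep = 1 / (1 + (p / (1 - p)) ** (2/3)).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    odds = p / (1.0 - p)                      # odds form of the p-value
    return 1.0 / (1.0 + odds ** (2.0 / 3.0))  # approximate p-rep


if __name__ == "__main__":
    # Print the approximation for a few illustrative p-values.
    for p in (0.5, 0.05, 0.01):
        print(f"p = {p:<5} -> p_rep ~ {p_rep_from_p(p):.3f}")
```

Under this approximation, p = 0.5 maps to a p-rep of exactly 0.5, and p = 0.05 maps to a p-rep of about 0.88. Smaller p-values always give larger p-rep values.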


Criticism

Because p-rep has a one-to-one correspondence with the p-value, it conveys no information beyond the significance of the result. Killeen acknowledges this lack of information, but suggests that p-rep better captures the way naive experimenters conceptualize p-values and statistical hypothesis testing.

Another criticism is that, although p-rep attempts to estimate replicability, it ignores results from other studies that could inform this estimate. For example, an experiment on some unlikely paranormal phenomenon may yield a p-rep of 0.75. Most people would still not conclude that the probability of a replication was 75%; rather, they would conclude it is much closer to 0. Extraordinary claims require extraordinary evidence, and p-rep ignores this. Because of this, p-rep may in fact be harder to interpret than a classical p-value: it is valid only under assumptions about prior probabilities, which makes its interpretation complex. Killeen argues that new results should be evaluated in their own right, without the "burden of history", using flat priors, and that this is what p-rep yields. A more pragmatic estimate of replicability would include prior knowledge via, for instance, meta-analysis.

Critics have also underscored mathematical errors in the original Killeen paper. For example, the formula relating the effect sizes from two replications of a given experiment erroneously uses one of these random variables as a parameter of the probability distribution of the other, even though Killeen had previously hypothesized the two variables to be independent. These criticisms are addressed in Killeen's rejoinder (Killeen, P. R. (2005). "Replicability, Confidence, and Priors". ''Psychological Science'', 16, 1009–101).

A further criticism of the p-rep statistic involves the logic of experimentation. The scientific value of replicable data lies in adequately accounting for previously unmeasured factors (e.g., unmeasured participant variables, experimenter's bias, etc.). The idea that a single study can capture the likelihood of such unmeasured factors affecting the outcome, and thus the likelihood of replicability, is a logical fallacy.
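The one-to-one correspondence noted at the start of this section can be illustrated with a short sketch. It assumes the approximation p_rep = [1 + (p / (1 - p))^(2/3)]^(-1) for one-tailed tests and inverts it algebraically. Both function names are hypothetical helpers introduced here for illustration only.

```python
def p_rep_from_p(p: float) -> float:
    """Approximate p-rep from a one-tailed p-value (assumed formula)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    odds = p / (1.0 - p)
    return 1.0 / (1.0 + odds ** (2.0 / 3.0))


def p_from_p_rep(p_rep: float) -> float:
    """Recover the p-value by inverting the same approximation."""
    if not 0.0 < p_rep < 1.0:
        raise ValueError("p_rep must lie strictly between 0 and 1")
    ratio = (1.0 - p_rep) / p_rep   # equals (p / (1 - p)) ** (2/3)
    odds = ratio ** 1.5             # undo the 2/3 power
    return odds / (1.0 + odds)      # convert odds back to a probability


if __name__ == "__main__":
    # Round trip: each p-value is recovered from its p-rep.
    for p in (0.2, 0.05, 0.01):
        p_rep = p_rep_from_p(p)
        print(f"p = {p} -> p_rep = {p_rep:.4f} -> p = {p_from_p_rep(p_rep):.4f}")
```

Because each p-value can be recovered exactly from its p-rep under this approximation, the two quantities carry the same information. For instance, the p-rep of 0.75 in the paranormal example corresponds to a one-tailed p-value of roughly 0.16.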

