The median polish is a simple and robust
exploratory data analysis
In statistics, exploratory data analysis (EDA) is an approach of analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model can be used or not, but pr ...
procedure proposed by the statistician
John Tukey
John Wilder Tukey (; June 16, 1915 – July 26, 2000) was an American mathematician and statistician, best known for the development of the fast Fourier Transform (FFT) algorithm and box plot. The Tukey range test, the Tukey lambda distributi ...
. The purpose of median polish is to find an additively-fit model for data in a two-way layout table (usually, results from a
factorial experiment
In statistics, a full factorial experiment is an experiment whose design consists of two or more factors, each with discrete possible values or "levels", and whose experimental units take on all possible combinations of these levels across all s ...
) of the form row effect + column effect + overall median.
Median polish utilizes the medians obtained from the rows and the columns of a two-way table to iteratively calculate the row effect and column effect on the data. The results are not meant to be sensitive to the outliers, as the iterative procedure uses the medians rather than the means.
Model for a two-way table
Suppose an experiment observes the variable Y under the influence of two variables. We can arrange the data in a two-way table in which one variable is constant along the rows and the other variable constant along the columns. Let ''i'' and ''j'' denote the position of rows and columns (e.g. y
''ij'' denotes the value of y at the ''i''th row and the ''j''th column). Then we can obtain a simple linear regression equation:
:
where are constants, and and are values associated with rows and columns, respectively.
The equation can be further simplified if no and values are present for the analysis:
:
where and denote row effects and column effects, respectively.
Procedure
To carry out median polish:
(1) find the row medians for each row, find the median of the row medians, record this as the overall effect.
(2) subtract each element in a row by its row median, do this for all rows.
(3) subtract the overall effect from each row median.
(4) do the same for each column, and add the overall effect from column operations to the overall effect generated from row operations.
(5) repeat (1)-(4) until negligible change occur with row or column medians
References
*
Frederick Mosteller
Charles Frederick Mosteller (December 24, 1916 – July 23, 2006) was an American mathematician, considered one of the most eminent statisticians of the 20th century. He was the founding chairman of Harvard's statistics department from 1957 ...
and
John Tukey
John Wilder Tukey (; June 16, 1915 – July 26, 2000) was an American mathematician and statistician, best known for the development of the fast Fourier Transform (FFT) algorithm and box plot. The Tukey range test, the Tukey lambda distributi ...
(1977). "Data Analysis and Regression".
Reading, MA
Reading ( ) is a town in Middlesex County, Massachusetts, United States, north of central Boston. The population was 25,518 at the 2020 census.
History
Settlement and American independence
Many of the Massachusetts Bay Colony's original settle ...
:
Addison-Wesley
Addison-Wesley is an American publisher of textbooks and computer literature. It is an imprint of Pearson PLC, a global publishing and education company. In addition to publishing books, Addison-Wesley also distributes its technical titles through ...
. .
* J.D. Emerson and D.C. Hoaglin (1983). "Analysis of two-way tables by medians". In "Understanding Robust and Exploratory Data Analysis", eds D. C. Hoaglin, F. Mosteller and J. W. Tukey.
New York City
New York, often called New York City or NYC, is the List of United States cities by population, most populous city in the United States. With a 2020 population of 8,804,190 distributed over , New York City is also the L ...
:
John Wiley & Sons
John Wiley & Sons, Inc., commonly known as Wiley (), is an American multinational publishing company founded in 1807 that focuses on academic publishing and instructional materials. The company produces books, journals, and encyclopedias, in p ...
. . pp. 165–210.
* William N. Venables and
Brian D. Ripley
Brian David Ripley FRSE (born 29 April 1952) is a British statistician. From 1990, he was professor of applied statistics at the University of Oxford and is also a professorial fellow at St Peter's College, Oxford, St Peter's College. He retired ...
(2002)
Statistics Complements to Modern Applied Statistics with S p.4–5. .
* Anwar Fitrianto, Hari Wijayanto, Sohel Rana, and Cheong Yee Voon (2014). "Median Polish for Final Grades of MTH3000- and MTH4000- Level Courses". Applied Mathematical Sciences, Vol. 8, no. 126, pp. 6295-6302
{{Statistics
Exploratory data analysis