The NM-method, or Naszodi–Mendonca method, is an operation that can be applied in statistics, econometrics, economics, sociology, and demography to construct counterfactual contingency tables. The method finds the matrix $X$ ($X \in \mathbb{R}^{n \times m}$) which is “closest” to matrix $Z$ ($Z \in \mathbb{R}^{n \times m}$, called the seed table) in the sense of being ranked the same, but with the row and column totals of a target matrix $Y$ ($Y \in \mathbb{R}^{n \times m}$). While the row totals and column totals of $Y$ are known, matrix $Y$ itself may not be known.

Since the solution for matrix $X$ is unique, the NM-method is a function:

$$X=\text{NM}(Z, Y e^T_m, e_nY): \mathbb{R}^{n \times m} \times \mathbb{R}^{n} \times \mathbb{R}^{m} \mapsto \mathbb{R}^{n \times m},$$

where $e_n$ is an all-ones row vector of size $1\times n$, while $e^T_m$ is an all-ones column vector of size $m\times 1$.

The NM-method was developed by Naszodi and Mendonca (2021) to solve for matrix $X$ in problems where matrix $Z$ is not a sample from the population characterized by the row totals and column totals of matrix $Y$, but represents another population.

Definition of matrix ranking

The closeness between two matrices of the same size can be defined in several ways. The Euclidean distance and the Kullback-Leibler divergence are two well-known examples.

The NM-method is consistent with a definition relying on the ordinal Liu-Lu index, which is a slightly modified version of the Coleman index defined by Eq. (15) in Coleman (1958). According to this definition, matrix $X$ is closest to matrix $Z$ if their Liu-Lu values are the same, in other words, if they are ranked the same by the ordinal Liu-Lu index.

If matrix $Z$ is a 2-by-2 matrix, its scalar-valued Liu-Lu index is defined as

$$\text{LL}(Z)=\frac{Z_{1,1} - Q^-}{\min(Z_{1,\cdot}, Z_{\cdot,1}) - Q^-},$$

where $Z_{1,\cdot}= Z_{1,1}+ Z_{1,2}$; $Z_{\cdot,1}= Z_{1,1}+ Z_{2,1}$; $Z_{\cdot,\cdot}=Z_{1,\cdot}+Z_{2,\cdot}$ is the grand total; $Q=Z_{1,\cdot} Z_{\cdot,1}/Z_{\cdot,\cdot}$; and $Q^-=\text{int}(Q)$ is the integer part of $Q$.

Following Coleman (1958), this index is interpreted as the “actual minus expected over maximum minus expected”, where $Z_{1,1}$ is the actual value of the $(1,1)$ entry of the seed matrix $Z$, and $Q^-$ is its expected (integer) value under the counterfactual assumptions that the corresponding row total and column total of $Z$ are predetermined, while its interior is random. Finally, $\min(Z_{1,\cdot}, Z_{\cdot,1})$ is the maximum value of $Z_{1,1}$ (for a 2-by-2 matrix $Z$ with non-negative entries) for given row total $Z_{1,\cdot}$ and column total $Z_{\cdot,1}$.

For matrix $Z$ of size n-by-m ($n \geq 2$, $m \geq 2$), the Liu-Lu index was generalized by Naszodi and Mendonca (2021) to a matrix-valued index. One of the preconditions for the generalization is that the row variable and the column variable of matrix $Z$ have to be ordered. Equating the generalized, matrix-valued Liu-Lu index of $Z$ with that of matrix $X$ is equivalent to dichotomizing their ordered row variable and ordered column variable in all possible meaningful ways and equating the original, scalar-valued Liu-Lu indices of the 2-by-2 matrices obtained with the dichotomizations. That is, for any pair $(i,j)$ with $i \in \{1, \ldots, n-1\}$ and $j \in \{1, \ldots, m-1\}$, the restriction

$$\text{LL}(V_i X W^T_j) = \text{LL}(V_i Z W^T_j)$$

is imposed, where $V_i$ is the $2 \times n$ matrix

$$V_i=\begin{bmatrix} 1 & \cdots & 1 & 0 & \cdots & 0 \\ 0 & \cdots & 0 & 1 & \cdots & 1 \end{bmatrix}$$

with its first block being of size $2 \times i$ and its second block being of size $2 \times (n-i)$. Similarly, $W^T_j$ is the $m \times 2$ matrix given by the transpose of

$$W_j=\begin{bmatrix} 1 & \cdots & 1 & 0 & \cdots & 0 \\ 0 & \cdots & 0 & 1 & \cdots & 1 \end{bmatrix}$$

with its first block being of size $2 \times j$ and its second block being of size $2 \times (m-j)$.
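As a rough illustration of the definitions above, the following Python sketch computes the scalar Liu-Lu index of a 2-by-2 table and the matrix-valued index of a larger table by forming $V_i Z W^T_j$ explicitly. The function names (`liu_lu`, `v_matrix`, `collapse`, `liu_lu_matrix`) and the 3-by-3 table are hypothetical and are not taken from the authors' implementation. Reading $\text{int}(\cdot)$ as rounding down is an assumption, and the degenerate case of a zero denominator is not handled.

```python
import math

def liu_lu(t):
    """Scalar Liu-Lu index of a 2-by-2 table t (positive-sorting case)."""
    row1 = t[0][0] + t[0][1]                    # Z_{1,.}
    col1 = t[0][0] + t[1][0]                    # Z_{.,1}
    total = row1 + t[1][0] + t[1][1]            # Z_{.,.}
    q_minus = math.floor(row1 * col1 / total)   # assumed reading of int(Q)
    return (t[0][0] - q_minus) / (min(row1, col1) - q_minus)

def matmul(a, b):
    """Plain-Python product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def v_matrix(i, n):
    """2-by-n dichotomizing matrix: ones in the first i columns of row 1,
    ones in the last n-i columns of row 2."""
    return [[1] * i + [0] * (n - i), [0] * i + [1] * (n - i)]

def collapse(z, i, j):
    """V_i Z W_j^T: the 2-by-2 table obtained by cutting rows after i and columns after j."""
    n, m = len(z), len(z[0])
    w_t = [list(col) for col in zip(*v_matrix(j, m))]   # W_j^T, size m-by-2
    return matmul(matmul(v_matrix(i, n), z), w_t)

def liu_lu_matrix(z):
    """(n-1)-by-(m-1) matrix whose (i,j) entry is LL(V_i Z W_j^T)."""
    n, m = len(z), len(z[0])
    return [[liu_lu(collapse(z, i, j)) for j in range(1, m)] for i in range(1, n)]

# Hypothetical 3-by-3 seed table with ordered row and column categories.
z = [[20, 5, 5],
     [5, 20, 5],
     [5, 5, 30]]
print(collapse(z, 1, 2))
print(liu_lu_matrix(z))
```

Two tables are ranked the same in the sense used here when `liu_lu_matrix` returns the same matrix for both.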

Constraints on the row totals and column totals

Matrix $X$ should satisfy not only $\text{LL}(V_i X W^T_j)= \text{LL}(V_i Z W^T_j)$ but also the pair of constraints on its row totals and column totals: $Xe^T_m=Ye^T_m$ and $e_n X=e_n Y$.
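A short counting argument, added here as a reading aid rather than taken from the source, indicates why the restrictions on the Liu-Lu index and the constraints on the totals fit together: the two sets of totals share the grand total, so only $n+m-1$ of them are independent.

```latex
\begin{aligned}
\#\{\text{entries of } X\} &= nm, \\
\#\{\text{independent constraints on totals}\} &= n + m - 1
  \quad \text{(both sets of totals sum to } e_n Y e^T_m\text{)}, \\
nm - (n + m - 1) &= (n-1)(m-1)
  = \#\{(i,j) : 1 \le i \le n-1,\ 1 \le j \le m-1\}.
\end{aligned}
```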

Solution

Assuming that $\text{LL}(V_i Z W^T_j) \geq 0$ for all pairs $(i,j)$ (where $i \in \{1, \ldots, n-1\}$ and $j \in \{1, \ldots, m-1\}$), the solution for $X$ is unique, deterministic, and given by a closed-form formula.

For matrices $Y$ and $Z$ of size $2 \times 2$, the solution is

$$X_{1,1} = \frac{\left[Z_{1,1} - \text{int}\left(Z_{1,\cdot} Z_{\cdot,1}/Z_{\cdot,\cdot}\right)\right]\left[\min(Y_{1,\cdot}, Y_{\cdot,1}) - \text{int}\left(Y_{1,\cdot} Y_{\cdot,1}/Y_{\cdot,\cdot}\right)\right]}{\min(Z_{1,\cdot}, Z_{\cdot,1}) - \text{int}\left(Z_{1,\cdot} Z_{\cdot,1}/Z_{\cdot,\cdot}\right)} +\text{int}\left(Y_{1,\cdot} Y_{\cdot,1}/Y_{\cdot,\cdot}\right).$$

This follows from equating the Liu-Lu index of $X$ with that of $Z$ and substituting the target totals of $Y$ for the totals of $X$. The other 3 cells of $X$ are uniquely determined by the row totals and column totals. This is how the NM-method works for 2-by-2 seed tables.

For matrices $Y$ and $Z$ of size $n \times m$ ($n \geq 2$, $m \geq 2$), the solution is obtained by dichotomizing their ordered row variable and ordered column variable in all possible meaningful ways before solving $(n-1)(m-1)$ problems of the 2-by-2 form. Each problem is defined for a pair $(i,j)$ ($i \in \{1, \ldots, n-1\}$ and $j \in \{1, \ldots, m-1\}$) with $\text{LL}(V_i X W^T_j)= \text{LL}(V_i Z W^T_j)$, and the target row totals and column totals $V_i X e^T_m= V_i Y e^T_m$ and $e_n X W^T_j = e_n Y W^T_j$, respectively. Each problem is to be solved separately by the formula for $X_{1,1}$. The set of solutions determines $(n-1)(m-1)$ entries of matrix $X$. Its remaining $m+n-1$ elements are uniquely determined by the target row totals and column totals.

Next, let us see how the NM-method works if matrix $Z$ is such that the second precondition, $\text{LL}(V_i Z W^T_j) \geq 0$ for all pairs $(i,j)$, is not met. If $\text{LL}(V_i Z W^T_j) \leq 0$ for all pairs $(i,j)$, the solution for $X$ is also unique, deterministic, and given by a closed-form formula. However, the corresponding concept of matrix ranking is slightly different from the one discussed above. Liu and Lu (2006) define it as

$$\text{LL}^-(Z)=\frac{Z_{1,1} - Q^+}{Q^+ - \max(0, Z_{1,\cdot} - Z_{\cdot,2})},$$

where $Z_{\cdot,2}= Z_{1,2}+ Z_{2,2}$, and $Q^+$ is the smallest integer larger than or equal to $Q$.

Finally, neither the NM-method nor $\text{LL}(Z)$ is defined if there exists a pair $(i,j)$ such that $\text{LL}(V_i Z W^T_j) > 0$, while for some other pair $(k,l) \neq (i,j)$, $\text{LL}(V_k Z W^T_l) < 0$.
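The 2-by-2 closed form can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function name `nm_2x2` and the numbers are hypothetical, $\text{int}(\cdot)$ is read as rounding down, and only the case of a non-negative Liu-Lu index is covered.

```python
import math

def nm_2x2(z, y_row1, y_col1, y_total):
    """Counterfactual 2-by-2 table with the Liu-Lu index of z and the target totals.

    z is the seed table [[z11, z12], [z21, z22]]; the targets are the first row
    total, the first column total and the grand total of the (possibly unknown) Y.
    """
    z_row1 = z[0][0] + z[0][1]
    z_col1 = z[0][0] + z[1][0]
    z_total = z_row1 + z[1][0] + z[1][1]
    q_z = math.floor(z_row1 * z_col1 / z_total)   # assumed reading of int(.)
    q_y = math.floor(y_row1 * y_col1 / y_total)
    ll = (z[0][0] - q_z) / (min(z_row1, z_col1) - q_z)
    if ll < 0:
        raise ValueError("negative Liu-Lu index: the LL^- variant would be needed")
    x11 = ll * (min(y_row1, y_col1) - q_y) + q_y
    # The remaining three cells follow from the target totals.
    x12 = y_row1 - x11
    x21 = y_col1 - x11
    x22 = y_total - x11 - x12 - x21
    return [[x11, x12], [x21, x22]]

# Hypothetical seed table and targets (row totals 50/50, column totals 60/40).
z = [[28, 12], [12, 48]]
x = nm_2x2(z, y_row1=50, y_col1=60, y_total=100)
print(x)  # [[40.0, 10.0], [20.0, 30.0]]
```

In this example the seed table has $\text{LL}(Z) = (28-16)/(40-16) = 0.5$, and the result has the same index under the new totals: $(40-30)/(50-30) = 0.5$.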

A numerical example

Consider a 4-by-4 seed matrix $Z$ complemented with its row totals and column totals, together with the targets, i.e., the row totals and column totals of matrix $Y$. As a first step of the NM-method, $Z$ is multiplied by the $V_i$ and $W^T_j$ matrices for each pair $(i,j)$ ($i \in \{1,2,3\}$ and $j \in \{1,2,3\}$). This yields 9 matrices of size 2-by-2 with their target row totals and column totals. The next step is to calculate the generalized matrix-valued Liu-Lu index $\text{LL}(Z)$ (where $\text{LL}(Z)_{i,j}=\text{LL}(V_i Z W^T_j)$) by applying the formula of the original scalar-valued Liu-Lu index to each of the 9 matrices. In this example, matrix $\text{LL}(Z)$ is positive; therefore, the NM-method is defined. Solving each of the 9 problems of the 2-by-2 form yields 9 entries of the $X$ matrix. Its other 7 entries are uniquely determined by the target row totals and column totals.
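The steps of this example can be traced with a short Python sketch. The 4-by-4 table and the targets below are hypothetical and are not the figures of the original example. The sketch assumes that each of the 9 problems is solved for the $(1,1)$ entry of $V_i X W^T_j$, i.e. for a cumulative block sum of $X$, and that the cells are then recovered by differencing; it also reads $\text{int}(\cdot)$ as rounding down. Exact rational arithmetic is used only to keep the totals free of rounding error.

```python
import math
from fractions import Fraction

def block_sum(t, i, j):
    """Sum of the first i rows and first j columns of table t."""
    return sum(t[r][c] for r in range(i) for c in range(j))

def nm_method(z, y_rows, y_cols):
    """NM-method sketch for an n-by-m integer seed table z and integer target totals."""
    n, m = len(z), len(z[0])
    z_total, y_total = block_sum(z, n, m), sum(y_rows)
    if sum(y_cols) != y_total:
        raise ValueError("target row totals and column totals must agree")
    # s[i][j] is the (1,1) entry of V_i X W_j^T: a cumulative block sum of X.
    s = [[Fraction(0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            y_r, y_c = sum(y_rows[:i]), sum(y_cols[:j])
            if i == n or j == m:
                # Border sums are fixed by the target totals alone.
                s[i][j] = Fraction(y_c if i == n else y_r)
                continue
            z11, z_r, z_c = block_sum(z, i, j), block_sum(z, i, m), block_sum(z, n, j)
            q_z = math.floor(Fraction(z_r * z_c, z_total))   # assumed reading of int(.)
            q_y = math.floor(Fraction(y_r * y_c, y_total))
            ll = Fraction(z11 - q_z, min(z_r, z_c) - q_z)
            if ll < 0:
                raise ValueError("negative Liu-Lu index at cut (%d, %d)" % (i, j))
            s[i][j] = ll * (min(y_r, y_c) - q_y) + q_y
    # Difference the cumulative sums to recover the individual cells.
    return [[s[i][j] - s[i - 1][j] - s[i][j - 1] + s[i - 1][j - 1]
             for j in range(1, m + 1)] for i in range(1, n + 1)]

# Hypothetical seed table and target totals.
z = [[20, 6, 3, 1],
     [6, 20, 6, 3],
     [3, 6, 20, 6],
     [1, 3, 6, 20]]
y_rows, y_cols = [40, 30, 30, 30], [30, 30, 30, 40]
x = nm_method(z, y_rows, y_cols)
for row in x:
    print([round(float(v), 3) for v in row])
```

A negative cell in the returned table would indicate that the requested counterfactual is not meaningful for the given pair of seed table and targets.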

Implementation

The NM-method is implemented in Excel, Visual Basic, and R. It can be downloaded from Mendeley (currently in version 2).

Applications

The NM-method can be applied to study various phenomena including assortative mating, intergenerational mobility as a type of social mobility, residential segregation, recruitment and talent management.

In all of these applications, matrices $X$, $Y$, and $Z$ represent joint distributions of one-to-one matched entities (e.g. husbands and wives, or first-born children and mothers, or dwellings and main tenants, or CEOs and companies, or chess instructors and their best students) characterized either by a dichotomous categorical variable (e.g. taking values vegetarian/non-vegetarian, Grandmaster/or not), or by an ordered multinomial categorical variable (e.g. level of final educational attainment, skiers' ability level, income bracket, category of rental fee, credit rating, FIDE titles).

Although the NM-method has a wide range of applicability, all the examples presented next are about assortative mating along the education level. In these applications, it is not disputed that the two preconditions (an ordered trait variable, and positive assortative mating in all educational groups) are met.

Assume that matrix $Z$ characterizes the joint educational distribution of husbands and wives in Zimbabwe, while matrix $Y$ characterizes the same in Yemen. Matrix $X$, constructed with the NM-method, tells us what the joint educational distribution of couples in Zimbabwe would be if the educational distributions of husbands and wives were the same as in Yemen, while the overall desire for homogamy (also called aggregate marital preferences in economics, or marital matching social norms/social barriers in sociology) were unchanged.

In a second application, matrices $Z$ and $Y$ characterize the same country in two different years. Matrix $Z$ is the joint educational distribution of American newlyweds in 2040, where the husbands are from Generation Z and are young adults when observed. Matrix $Y$ is the same but for Generation Y observed in 2024. By constructing matrix $X$, one can study in the future what the educational distribution among newly married young American couples would be if they sorted into marriages the same way as the males in Generation Z and their partners do, while the education levels were the same as among the males in Generation Y and their partners.

In a third application, matrices $Z$ and $Y$ again characterize the same country in two different years. In this application, matrix $Z$ is the joint educational distribution of Portuguese young couples (where the male partners' age is between 30 and 34 years) in 2011, and $Y$ is the same but observed in 1981. One may aim to construct matrix $X$ in order to study what the educational distribution of Portuguese young couples would have been if they had sorted into marriages like their peers did in 2011, while their gender-specific educational distributions were the same as in 1981.

In each of the first two applications, matrix $X$ represents a counterfactual joint distribution. It can be used to quantify certain ceteris paribus effects: more precisely, to quantify on a cardinal scale, with a counterfactual decomposition, the difference between the directly unobservable degree of marital sorting in Zimbabwe and Yemen, or in Generation Z and Generation Y. For the decomposition, the counterfactual table $X$ is used to calculate the contribution of each of the driving forces (i.e., the observed structural availability of potential partners with various education levels determining the opportunities at the population level; and the unobservable aggregate matching preferences/desires/norms/barriers) and that of their interaction (i.e., the effect of changes in aggregate preferences/desires/norms/barriers due to changes in structural availability) to an observable cardinal-scaled statistic (e.g. the share of educationally homogamous couples).

The third application was used by Naszodi and Mendonca (2021) as an example of a nonsensical counterfactual: the education level has changed so drastically in Portugal over the three decades studied that this counterfactual is impossible to obtain.
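How a counterfactual table enters such a decomposition can be illustrated with a deliberately simplified Python sketch. It splits the difference in the share of homogamous couples between two hypothetical 2-by-2 populations into an availability part and a sorting part. This two-term, path-dependent split is an assumption made for illustration: it does not separate the interaction term mentioned above and is not claimed to be the decomposition used by Naszodi and Mendonca (2021). The tables, the function names and the rounding-down reading of $\text{int}(\cdot)$ are likewise hypothetical.

```python
import math

def nm_2x2(z, y_row1, y_col1, y_total):
    """2-by-2 NM closed form (non-negative Liu-Lu index assumed)."""
    z_row1, z_col1 = z[0][0] + z[0][1], z[0][0] + z[1][0]
    z_total = sum(z[0]) + sum(z[1])
    q_z = math.floor(z_row1 * z_col1 / z_total)
    q_y = math.floor(y_row1 * y_col1 / y_total)
    ll = (z[0][0] - q_z) / (min(z_row1, z_col1) - q_z)
    x11 = ll * (min(y_row1, y_col1) - q_y) + q_y
    return [[x11, y_row1 - x11],
            [y_col1 - x11, y_total - y_row1 - y_col1 + x11]]

def homogamy_share(t):
    """Share of couples on the main diagonal of a 2-by-2 table."""
    return (t[0][0] + t[1][1]) / (sum(t[0]) + sum(t[1]))

z = [[28, 12], [12, 48]]   # hypothetical population whose sorting is kept
y = [[45, 5], [15, 35]]    # hypothetical population whose totals are imposed
x = nm_2x2(z, sum(y[0]), y[0][0] + y[1][0], sum(y[0]) + sum(y[1]))

# Same sorting as in z, totals as in y: change attributed to availability.
availability = homogamy_share(x) - homogamy_share(z)
# Same totals as in y, sorting as in y instead of z: change attributed to sorting.
sorting = homogamy_share(y) - homogamy_share(x)
print(round(availability, 4), round(sorting, 4))
```

By construction the two parts add up to the observed difference `homogamy_share(y) - homogamy_share(z)`.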

Some features of the NM-method

First, the NM-method does not yield a meaningful solution if it reaches the limit of its applicability. For instance, in the third application, the NM-method signals with a negative entry in matrix $X$ that the counterfactual is impossible (see: AlternativeMethod_US_1980s_2010s_age3035_main.xls, sheet PT_A1981_P2011_Not_meaningful). In this respect, the NM-method is similar to the linear probability model, which signals the same with a predicted probability outside the unit interval $[0,1]$.

Second, the NM-method commutes with merging neighboring categories of the row variable and of the column variable. Merging rows leaves the column totals unchanged, so

$$\text{NM}(M_r Z, M_r Y e^T_m, e_nY)=M_r \text{NM}(Z, Y e^T_m, e_nY),$$

where $M_r$ is the row merging matrix of size $(n-1) \times n$. Likewise, merging columns leaves the row totals unchanged, so

$$\text{NM}(Z M_c, Y e^T_m, e_n Y M_c)=\text{NM}(Z, Y e^T_m, e_nY) M_c,$$

where $M_c$ is the column merging matrix of size $m \times (m-1)$.

Third, the NM-method works even if there are zero entries in matrix $Z$.
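One way to see why merging commutes with the method is sketched below. It relies on the reading that each $(i,j)$ problem is solved separately for a cumulative block sum of $X$; the symbol $S_{i,j}$ is introduced only for this note and does not appear in the source.

```latex
\begin{aligned}
S_{i,j} &:= \big(V_i X W^T_j\big)_{1,1} = \sum_{r \le i} \sum_{c \le j} X_{r,c}, \\
S_{i,j} &\ \text{is a function of } V_i Z W^T_j,\ V_i Y e^T_m \text{ and } e_n Y W^T_j \text{ only}, \\
X_{r,c} &= S_{r,c} - S_{r-1,c} - S_{r,c-1} + S_{r-1,c-1}
  \quad (S_{0,c} = S_{r,0} = 0).
\end{aligned}
```

Merging rows $k$ and $k+1$ removes the cut after row $k$ and leaves every other collapsed 2-by-2 problem unchanged. The remaining $S_{i,j}$, and hence the merged cells, are therefore the same whether the merging is done before or after applying the method; the same argument applies to columns.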

Comparison with the IPF

The iterative proportional fitting procedure (IPF) is also a function:

$$\text{IPF}(Z, Y e^T_m, e_nY): \mathbb{R}^{n \times m} \times \mathbb{R}^{n} \times \mathbb{R}^{m} \mapsto \mathbb{R}^{n \times m}.$$

It is the operation of finding the fitted matrix $F$ ($F \in \mathbb{R}^{n \times m}$) which fulfills a set of conditions similar to those met by matrix $X$ constructed with the NM-method. For example, matrix $F$ is the closest to matrix $Z$ but with the row and column totals of the target matrix $Y$.

However, there are differences between the IPF and the NM-method. The IPF defines closeness of matrices of the same size by the cross-entropy, or the Kullback-Leibler divergence. Accordingly, the IPF-compatible concept of distance between the 2-by-2 matrices $F$ and $Z$ is zero if their cross-product ratios (also known as odds ratios) are the same:

$$\frac{F_{1,1} F_{2,2}}{F_{1,2} F_{2,1}} = \frac{Z_{1,1} Z_{2,2}}{Z_{1,2} Z_{2,1}}.$$

To recall, the NM-method's condition for equal ranking of matrices $X$ and $Z$ is

$$\text{LL}(X)=\frac{X_{1,1} - Q^-_X}{\min(X_{1,\cdot}, X_{\cdot,1}) - Q^-_X} =\frac{Z_{1,1} - Q^-_Z}{\min(Z_{1,\cdot}, Z_{\cdot,1}) - Q^-_Z} =\text{LL}(Z),$$

where $Q^-_X=\text{int}(X_{1,\cdot} X_{\cdot,1}/X_{\cdot,\cdot})$ and $Q^-_Z=\text{int}(Z_{1,\cdot} Z_{\cdot,1}/Z_{\cdot,\cdot})$.

A numerical example with a seed matrix $Z$ and target row totals and column totals highlights that the IPF and the NM-method are not identical, $\text{NM}(Z, Y e^T_m, e_nY) \neq \text{IPF}(Z, Y e^T_m, e_nY)$: the matrix $X$ yielded by the NM-method differs from the matrix $F$ obtained with the IPF.

The IPF is equivalent to the maximum likelihood estimator of a joint population distribution, where matrix $F$ (the estimate for the joint population distribution) is calculated from matrix $Z$, the observed joint distribution in a random sample taken from the population characterized by the row totals and column totals of matrix $Y$. In contrast to the problem solved by the IPF, matrix $Z$ is not sampled from this population in the problem that the NM-method was developed to solve. In fact, in the NM-problem, matrices $Z$ and $Y$ characterize two different populations (either observed simultaneously, as in the application for Zimbabwe and Yemen, or observed at two different points in time, as in the application for the populations of Generation Z and Generation Y). This difference facilitates the choice between the NM-method and the IPF in empirical applications.

Deming and Stephan (1940), the inventors of the IPF, illustrated the application of their method on a classic maximum likelihood estimation problem, where matrix $Z$ was sampled from the population characterized by the row totals and column totals of matrix $Y$. They were aware of the fact that, in general, the IPF is not suitable for counterfactual predictions: they explicitly warned that their algorithm is “not by itself useful for prediction” (see Stephan and Deming 1940 p. 444).

Finally, the domains for which the IPF and the NM-method yield solutions are different. First, unlike the NM-method, the IPF does not provide a solution for all seed tables $Z$ with zero entries (Csiszár (1975) found necessary and sufficient conditions for general tables having zero entries). Second, unlike the IPF, the NM-method does not provide a meaningful solution for pairs of matrices $Z$ and $Y$ defining impossible counterfactuals. Third, the precondition of the NM-method (either $\text{LL}(V_i Z W^T_j) \geq 0$ for all pairs $(i,j)$, or $\text{LL}(V_i Z W^T_j) \leq 0$ for all pairs $(i,j)$) is not a precondition for the applicability of the IPF.
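The difference between the two procedures can be made concrete with a small Python sketch on a hypothetical 2-by-2 seed table. The `ipf` function is a plain textbook version of alternating row and column scaling with a fixed number of iterations, and `nm_x11` applies the 2-by-2 closed form with $\text{int}(\cdot)$ read as rounding down; the names, the numbers and the iteration count are assumptions made for illustration.

```python
import math

def ipf(z, row_targets, col_targets, iterations=200):
    """Iterative proportional fitting by alternating row and column scaling."""
    f = [[float(v) for v in row] for row in z]
    for _ in range(iterations):
        # Rescale each row to its target total.
        for r, target in enumerate(row_targets):
            s = sum(f[r])
            f[r] = [v * target / s for v in f[r]]
        # Rescale each column to its target total.
        for c, target in enumerate(col_targets):
            s = sum(row[c] for row in f)
            for row in f:
                row[c] *= target / s
    return f

def nm_x11(z, y_row1, y_col1, y_total):
    """(1,1) cell of the NM solution for a 2-by-2 seed table (LL >= 0 assumed)."""
    z_row1, z_col1 = z[0][0] + z[0][1], z[0][0] + z[1][0]
    z_total = sum(z[0]) + sum(z[1])
    q_z = math.floor(z_row1 * z_col1 / z_total)
    q_y = math.floor(y_row1 * y_col1 / y_total)
    ll = (z[0][0] - q_z) / (min(z_row1, z_col1) - q_z)
    return ll * (min(y_row1, y_col1) - q_y) + q_y

# Hypothetical seed table; target row totals 50/50 and column totals 60/40.
z = [[28, 12], [12, 48]]
f = ipf(z, [50, 50], [60, 40])
x11 = nm_x11(z, 50, 60, 100)

odds_ratio_z = z[0][0] * z[1][1] / (z[0][1] * z[1][0])
odds_ratio_f = f[0][0] * f[1][1] / (f[0][1] * f[1][0])
print(round(f[0][0], 6), x11)
print(round(odds_ratio_z, 6), round(odds_ratio_f, 6))
```

With these numbers the IPF keeps the odds ratio of the seed table and converges to a $(1,1)$ cell of 42, whereas the NM closed form keeps the Liu-Lu index and gives 40.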
