Relationship to stratified sampling
Benchmarking is sometimes referred to as 'post-stratification' because of its similarities to stratified sampling. The difference between the two is that in stratified sampling, we decide ''in advance'' how many units will be sampled from each stratum (the equivalent of a benchmarking cell); in benchmarking, we select units from the broader population, and the number chosen from each cell is a matter of chance. The advantage of stratified sampling is that the sample size in each stratum can be controlled to achieve the desired accuracy. Without this control, we may end up with too many sampled units in one stratum and too few in another. Indeed, it is possible that a sample will contain ''no'' members from a certain cell, in which case benchmarking fails because the number of sampled units in that cell is zero, leading to a divide-by-zero problem. In such cases, it is necessary to 'collapse' cells together so that each remaining cell has an adequate sample size.

For this reason, benchmarking is generally used in situations where stratified sampling is impractical. For instance, when selecting people from a telephone directory, we cannot tell how old they are, so we cannot easily stratify the sample by age. However, we can collect this information from the people sampled, allowing us to benchmark against demographic information.
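
The empty-cell failure and the collapsing remedy can be illustrated with a minimal Python sketch. It assumes the usual weighting, in which each sampled unit in a cell receives a weight equal to the cell's population count divided by its sample count. The function names, age bands and figures are hypothetical and chosen only for illustration.

```python
from collections import Counter


def benchmark_weights(population_counts, sample_cells):
    """Return one weight per cell: population count / sample count (assumed formula)."""
    sample_counts = Counter(sample_cells)
    weights = {}
    for cell, pop_count in population_counts.items():
        n = sample_counts.get(cell, 0)
        if n == 0:
            # An empty cell would put zero in the denominator.
            raise ZeroDivisionError(
                f"no sampled units in cell {cell!r}; collapse it with another cell"
            )
        weights[cell] = pop_count / n
    return weights


def collapse_cells(population_counts, sample_cells, mapping):
    """Merge cells; `mapping` sends an old cell label to its merged label."""
    merged_population = Counter()
    for cell, pop_count in population_counts.items():
        merged_population[mapping.get(cell, cell)] += pop_count
    merged_sample = [mapping.get(cell, cell) for cell in sample_cells]
    return dict(merged_population), merged_sample


# Hypothetical benchmark figures and a sample of 10 people.
population = {"18-34": 3000, "35-64": 5000, "65+": 2000}
sample = ["18-34"] * 4 + ["35-64"] * 6  # nobody aged 65+ was sampled

# benchmark_weights(population, sample) would raise ZeroDivisionError here,
# so the two older cells are collapsed into a single "35+" cell first.
merged_population, merged_sample = collapse_cells(
    population, sample, {"35-64": "35+", "65+": "35+"}
)
weights = benchmark_weights(merged_population, merged_sample)
```

After collapsing, every remaining cell contains sampled units, and the weights summed over the sample reproduce the total population count.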