
The International HapMap Project was an organization that aimed to develop a haplotype map (HapMap) of the human genome, to describe the common patterns of human genetic variation. HapMap is used to find genetic variants affecting health, disease and responses to drugs and environmental factors. The information produced by the project is made freely available for research.

The International HapMap Project was a collaboration among researchers at academic centers, non-profit biomedical research groups and private companies in Canada, China (including Hong Kong), Japan, Nigeria, the United Kingdom, and the United States. It officially started with a meeting on 27 to 29 October 2002, and was expected to take about three years. The work proceeded in phases: the complete data obtained in Phase I were published on 27 October 2005, and the analysis of the Phase II dataset was published in October 2007. The Phase III dataset was released in spring 2009, and the publication presenting the final results appeared in September 2010.


Background

Unlike with the rarer Mendelian diseases, combinations of different genes and the environment play a role in the development and progression of common diseases (such as diabetes, cancer, heart disease, stroke, depression, and asthma), or in the individual response to pharmacological agents. To find the genetic factors involved in these diseases, one could in principle do a genome-wide association study: obtain the complete genetic sequence of several individuals, some with the disease and some without, and then search for differences between the two sets of genomes. At the time, this approach was not feasible because of the cost of full genome sequencing. The HapMap project proposed a shortcut.

Although any two unrelated people share about 99.5% of their DNA sequence, their genomes differ at specific nucleotide locations. Such sites are known as single nucleotide polymorphisms (SNPs), and each of the possible resulting gene forms is called an allele. The HapMap project focuses only on common SNPs, those where each allele occurs in at least 1% of the population.

Each person has two copies of all chromosomes, except the sex chromosomes in males. For each SNP, the combination of alleles a person has is called a genotype.
Genotyping refers to uncovering what genotype a person has at a particular site. The HapMap project chose a sample of 269 individuals and selected several million well-defined SNPs, genotyped the individuals for these SNPs, and published the results.

The alleles of nearby SNPs on a single chromosome are correlated. Specifically, if the allele of one SNP for a given individual is known, the alleles of nearby SNPs can often be predicted, a process known as genotype imputation. This is because each SNP arose in evolutionary history as a single point mutation, and was then passed down on the chromosome surrounded by other, earlier, point mutations. SNPs that are separated by a large distance on the chromosome are typically not very well correlated, because recombination occurs in each generation and mixes the allele sequences of the two chromosomes. A sequence of consecutive alleles on a particular chromosome is known as a haplotype.

To find the genetic factors involved in a particular disease, one can proceed as follows. First, a region of interest in the genome is identified, possibly from earlier inheritance studies. In this region one locates a set of tag SNPs from the HapMap data; these are SNPs that are very well correlated with all the other SNPs in the region. From the tag SNPs, genotype imputation can determine (impute) the other SNPs, and thus the entire haplotype, with high confidence. Next, one determines the genotype for these tag SNPs in several individuals, some with the disease and some without. By comparing the two groups, one determines the likely locations and haplotypes that are involved in the disease.
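
The tag SNP idea described above can be illustrated with a minimal sketch. It assumes phased haplotypes coded as 0/1 alleles, measures the correlation between two SNPs with the squared correlation coefficient r² (a common linkage disequilibrium statistic that the text above does not name), and picks tag SNPs greedily using an assumed threshold of 0.8. The function names, the toy data and the threshold are illustrative assumptions, not the procedure used by the HapMap tools.

```python
from itertools import combinations


def r_squared(a, b):
    """Squared correlation (r^2) between two biallelic SNPs.

    a and b are equal-length lists of 0/1 alleles, one entry per
    phased haplotype (hypothetical encoding chosen for this sketch).
    """
    n = len(a)
    p_a = sum(a) / n                      # frequency of allele 1 at SNP a
    p_b = sum(b) / n                      # frequency of allele 1 at SNP b
    p_ab = sum(x & y for x, y in zip(a, b)) / n  # both carry allele 1
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                        # monomorphic SNP: treat as uncorrelated
    d = p_ab - p_a * p_b                  # linkage disequilibrium coefficient D
    return d * d / denom


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Pick tag SNPs so that every SNP has r^2 >= threshold with some tag.

    haplotypes maps a SNP name to its list of 0/1 alleles.
    The threshold of 0.8 is an assumption made for this example.
    """
    names = list(haplotypes)
    # captures[s] = SNPs that s could stand in for (always includes s itself)
    captures = {s: {s} for s in names}
    for s, t in combinations(names, 2):
        if r_squared(haplotypes[s], haplotypes[t]) >= threshold:
            captures[s].add(t)
            captures[t].add(s)

    untagged = set(names)
    tags = []
    while untagged:
        # choose the SNP that covers the most still-untagged SNPs
        best = max(names, key=lambda s: len(captures[s] & untagged))
        tags.append(best)
        untagged -= captures[best]
    return tags


# Toy region: four SNPs observed on eight haplotypes (made-up data).
toy = {
    "snp1": [0, 0, 0, 0, 1, 1, 1, 1],
    "snp2": [0, 0, 0, 0, 1, 1, 1, 1],  # identical to snp1
    "snp3": [1, 1, 1, 1, 0, 0, 0, 0],  # mirror image of snp1
    "snp4": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the others
}

print(greedy_tag_snps(toy))  # ['snp1', 'snp4']
```

In this toy data set snp1, snp2 and snp3 are perfectly correlated, so genotyping one of them is enough to predict the other two, while snp4 is uncorrelated with the rest and has to be genotyped itself.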


Samples used

Haplotypes are generally shared between populations, but their frequency can differ widely. Four populations were selected for inclusion in the HapMap: 30 adult-and-both-parents Yoruba trios from Ibadan, Nigeria (YRI), 30 trios of Utah residents of northern and western European ancestry (CEU), 44 unrelated Japanese individuals from Tokyo, Japan (JPT) and 45 unrelated Han Chinese individuals from Beijing, China (CHB). Although the haplotypes revealed from these populations should be useful for studying many other populations, parallel studies are currently examining the usefulness of including additional populations in the project.

All samples were collected through a community engagement process with appropriate informed consent. The community engagement process was designed to identify and attempt to respond to culturally specific concerns and give participating communities input into the informed consent and sample collection processes.

In Phase III, 11 global ancestry groups were assembled: ASW (African ancestry in Southwest USA); CEU (Utah residents with Northern and Western European ancestry from the CEPH collection); CHB (Han Chinese in Beijing, China); CHD (Chinese in Metropolitan Denver, Colorado); GIH (Gujarati Indians in Houston, Texas); JPT (Japanese in Tokyo, Japan); LWK (Luhya in Webuye, Kenya); MEX (Mexican ancestry in Los Angeles, California); MKK (Maasai in Kinyawa, Kenya); TSI (Tuscans in Italy); YRI (Yoruba in Ibadan, Nigeria). (International HapMap Consortium et al. (2010). Integrating common and rare genetic variation in diverse human populations. Nature, 467, 52-8.)

Three combined panels have also been created, which allow better identification of SNPs in groups outside the nine homogeneous samples: CEU+TSI (combined panel of Utah residents with Northern and Western European ancestry from the CEPH collection and Tuscans in Italy); JPT+CHB (combined panel of Japanese in Tokyo, Japan and Han Chinese in Beijing, China); and JPT+CHB+CHD (combined panel of Japanese in Tokyo, Japan, Han Chinese in Beijing, China and Chinese in Metropolitan Denver, Colorado). CEU+TSI, for instance, is a better model of UK British individuals than is CEU alone.


Scientific strategy

It was expensive in the 1990s to sequence patients' whole genomes. So the National Institutes of Health embraced the idea for a "shortcut", which was to look just at sites on the genome where many people have a variant DNA unit. The theory behind the shortcut was that, since the major diseases are common, so too would be the genetic variants that caused them. Natural selection keeps the human genome free of variants that damage health before children are grown, the theory held, but fails against variants that strike later in life, allowing them to become quite common. In 2002 the National Institutes of Health started a $138 million project called the HapMap to catalog the common variants in European, East Asian and African genomes.

For Phase I, one common SNP was genotyped every 5,000 bases. Overall, more than one million SNPs were genotyped. The genotyping was carried out by 10 centres using five different genotyping technologies. Genotyping quality was assessed by using duplicate or related samples and by having periodic quality checks where centres had to genotype common sets of SNPs.

The Canadian team was led by Thomas J. Hudson at McGill University in Montreal and focused on chromosomes 2 and 4p. The Chinese team was led by Huanming Yang in Beijing and Shanghai, and Lap-Chee Tsui in Hong Kong, and focused on chromosomes 3, 8p and 21. The Japanese team was led by Yusuke Nakamura at the University of Tokyo and focused on chromosomes 5, 11, 14, 15, 16, 17 and 19. The British team was led by David R. Bentley at the Sanger Institute and focused on chromosomes 1, 6, 10, 13 and 20. There were four United States genotyping centres: a team led by Mark Chee and Arnold Oliphant at Illumina Inc. in San Diego (studying chromosomes 8q, 9, 18q, 22 and X), a team led by David Altshuler and Mark Daly at the Broad Institute in Cambridge, Massachusetts (chromosomes 4q, 7q, 18p, Y and the mitochondrion), a team led by Richard Gibbs at the Baylor College of Medicine in Houston (chromosome 12), and a team led by Pui-Yan Kwok at the University of California, San Francisco (chromosome 7p).

To obtain enough SNPs to create the Map, the Consortium funded a large re-sequencing project to discover millions of additional SNPs. These were submitted to the public dbSNP database. As a result, by August 2006, the database included more than ten million SNPs, and more than 40% of them were known to be polymorphic. By comparison, at the start of the project, fewer than 3 million SNPs were identified, and no more than 10% of them were known to be polymorphic.

During Phase II, more than two million additional SNPs were genotyped throughout the genome by David R. Cox, Kelly A. Frazer and others at Perlegen Sciences, and 500,000 by the company Affymetrix.


Data access

All of the data generated by the project, including SNP frequencies, genotypes and haplotypes, were placed in the public domain and are available for download. The project website also contains a genome browser that allows users to find SNPs in any region of interest, together with their allele frequencies and their association with nearby SNPs. A tool that can determine tag SNPs for a given region of interest is also provided. These data can also be directly accessed from the widely used Haploview program.


Publications

* Secko, David (2005). "Phase I of the HapMap Complete". The Scientist.


See also

* Genealogical DNA test
* The 1000 Genomes Project
* Population groups in biomedicine
* Human Variome Project
* Human genetic variation


External links


International HapMap Project (HapMap Homepage)

National Human Genome Research Institute (NHGRI) HapMap Page

Browsing HapMap Data Using the Genome Browser

The Mexican Genome Diversity Project