
The 1000 Genomes Project (abbreviated as 1KGP), launched in January 2008, was an international research effort to establish by far the most detailed catalogue of human genetic variation. Scientists planned to sequence the genomes of at least one thousand anonymous participants from a number of different ethnic groups within the following three years, using newly developed technologies which were faster and less expensive. In 2010, the project finished its pilot phase, which was described in detail in a publication in the journal ''Nature''. In 2012, the sequencing of 1092 genomes was announced in a ''Nature'' publication. In 2015, two papers in ''Nature'' reported results and the completion of the project and opportunities for future research. Many rare variations, restricted to closely related groups, were identified, and eight structural-variation classes were analyzed.

The project unites multidisciplinary research teams from institutes around the world, including China, Italy, Japan, Kenya, Nigeria, Peru, the United Kingdom, and the United States. Each will contribute to the enormous sequence dataset and to a refined human genome map, which will be freely accessible through public databases to the scientific community and the general public alike. By providing an overview of all human genetic variation, the consortium will generate a valuable tool for all fields of biological science, especially in the disciplines of genetics, medicine, pharmacology, biochemistry, and bioinformatics. (G Spencer, International Consortium Announces the 1000 Genomes Project, EMBARGOED (2008) http://www.nih.gov/news/health/jan2008/nhgri-22.htm)


Background

Since the completion of the Human Genome Project, advances in human population genetics and comparative genomics have made it possible to gain increasing insight into the nature of genetic diversity. However, we are just beginning to understand how processes like the random sampling of gametes, structural variations (insertions/deletions (indels), copy number variations (CNV), retroelements), single-nucleotide polymorphisms (SNPs), and natural selection have shaped the level and pattern of variation within species and also between species. (JC Long, Human Genetic Variation: The mechanisms and results of microevolution, American Anthropological Association (2004))


Human genetic variation

The random sampling of gametes during sexual reproduction leads to genetic drift, a random fluctuation in the population frequency of a trait, in subsequent generations and would result in the loss of all variation in the absence of external influence. It is postulated that the rate of genetic drift is inversely proportional to population size, and that it may be accelerated in specific situations such as bottlenecks, where the population size is reduced for a certain period of time, and by the founder effect (individuals in a population tracing back to a small number of founding individuals).

Anzai et al. demonstrated that indels account for 90.4% of all observed variations in the sequence of the major histocompatibility locus (MHC) between humans and chimpanzees. After taking multiple indels into consideration, the high degree of genomic similarity between the two species (98.6% nucleotide sequence identity) drops to only 86.7%. For example, a large deletion of 95 kilobases (kb) between the loci of the human ''MICA'' and ''MICB'' genes results in a single hybrid chimpanzee ''MIC'' gene, linking this region to a species-specific handling of several retroviral infections and the resultant susceptibility to various autoimmune diseases. The authors conclude that instead of more subtle SNPs, indels were the driving mechanism in primate speciation.

Besides mutations, SNPs and other structural variants such as copy-number variants (CNVs) are contributing to the genetic diversity in human populations. Using microarrays, almost 1,500 copy number variable regions, covering around 12% of the genome and containing hundreds of genes, disease loci, functional elements and segmental duplications, have been identified in the HapMap sample collection. Although the specific function of CNVs remains elusive, the fact that CNVs span more nucleotide content per genome than SNPs emphasizes the importance of CNVs in genetic diversity and evolution. Investigating human genomic variations holds great potential for identifying genes that might underlie differences in disease resistance (e.g. MHC region) or drug metabolism.
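
The relationship between drift and population size described at the start of this section can be sketched with the idealized Wright-Fisher model, which is an assumption here and is not named in the source. In that model a diploid population of size N loses, on average, a fraction 1/(2N) of its heterozygosity per generation. The function names, population sizes and seed below are illustrative only.

```python
import random


def expected_heterozygosity(h0, pop_size, generations):
    """Expected heterozygosity after drift in an idealized Wright-Fisher
    population of `pop_size` diploid individuals (no selection or mutation)."""
    return h0 * (1.0 - 1.0 / (2 * pop_size)) ** generations


def simulate_drift(p0, pop_size, generations, rng):
    """Simulate one allele-frequency trajectory.

    Each generation, 2N gametes are drawn at random from the previous
    generation's allele frequency (the "random sampling of gametes").
    """
    copies = 2 * pop_size
    freq = p0
    trajectory = [freq]
    for _ in range(generations):
        count = sum(1 for _ in range(copies) if rng.random() < freq)
        freq = count / copies
        trajectory.append(freq)
    return trajectory


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so repeated runs give the same trajectories
    for n in (10, 100, 1000):  # hypothetical population sizes
        final = simulate_drift(0.5, n, 100, rng)[-1]
        print(n, round(expected_heterozygosity(0.5, n, 100), 4), final)
```

Smaller populations lose expected heterozygosity faster, which is the sense in which drift is stronger when population size is reduced, for example during a bottleneck.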


Natural selection

Natural selection in the evolution of a trait can be divided into three classes. Directional or positive selection refers to a situation where a certain allele has a greater fitness than other alleles, consequently increasing its population frequency (e.g. antibiotic resistance of bacteria). In contrast, stabilizing or negative selection (also known as purifying selection) lowers the frequency of alleles, or even removes them from a population, because of disadvantages associated with them relative to other alleles. Finally, a number of forms of balancing selection exist; these increase genetic variation within a species by being overdominant (heterozygous individuals are fitter than homozygous individuals, e.g. ''G6PD'', a gene that is involved in both hemolytic anaemia and malaria resistance) or can vary spatially within a species that inhabits different niches, thus favouring different alleles. (EE Harris et al., The molecular signature of selection underlying human adaptations, Yearbook of Physical Anthropology 49: 89-130 (2006)) Some genomic differences may not affect fitness. Neutral variation, previously thought to be "junk" DNA, is unaffected by natural selection, resulting in higher genetic variation at such sites when compared to sites where variation does influence fitness.

It is not fully clear how natural selection has shaped population differences; however, genetic candidate regions under selection have been identified recently. Patterns of DNA polymorphisms can be used to reliably detect signatures of selection and may help to identify genes that might underlie variation in disease resistance or drug metabolism. Barreiro et al. found evidence that negative selection has reduced population differentiation at the amino acid–altering level (particularly in disease-related genes), whereas positive selection has ensured regional adaptation of human populations by increasing population differentiation in gene regions (mainly nonsynonymous and 5'-untranslated region variants).

It is thought that most complex and Mendelian diseases (except diseases with late onset, assuming that older individuals no longer contribute to the fitness of their offspring) will have an effect on survival and/or reproduction; thus, genetic factors underlying those diseases should be influenced by natural selection. However, diseases that have late onset today could have been childhood diseases in the past, as genes delaying disease progression could have undergone selection. Gaucher disease (mutations in the ''GBA'' gene), Crohn's disease (mutation of ''NOD2'') and familial hypertrophic cardiomyopathy (mutations in ''MYH7'', ''TNNT2'', ''TPM1'' and ''MYBPC3'') are all examples of negative selection. These disease mutations are primarily recessive and segregate as expected at a low frequency, supporting the hypothesized negative selection. There is evidence that the genetic basis of type 1 diabetes may have undergone positive selection. Few cases have been reported where disease-causing mutations appear at the high frequencies supported by balancing selection. The most prominent example is mutations of the ''G6PD'' locus, which in the homozygous state result in G6PD enzyme deficiency and consequently hemolytic anaemia, but in the heterozygous state are partially protective against malaria. Other possible explanations for segregation of disease alleles at moderate or high frequencies include genetic drift and recent alterations towards positive selection due to environmental changes such as diet or genetic hitch-hiking.

Genome-wide comparative analyses of different human populations, as well as between species (e.g. human versus chimpanzee), are helping us to understand the relationship between diseases and selection and provide evidence of mutations in constrained genes being disproportionally associated with heritable disease phenotypes. Genes implicated in complex disorders tend to be under less negative selection than Mendelian disease genes or non-disease genes.
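
How overdominance can hold a deleterious allele at an appreciable frequency can be illustrated with the standard one-locus, two-allele selection recurrence. This is a minimal sketch under textbook assumptions (random mating, an effectively infinite population, constant fitnesses). The selection coefficients `s` and `t` are hypothetical values chosen for illustration and are not measurements for ''G6PD'' or any other locus.

```python
def next_frequency(p, s, t):
    """One generation of selection on allele A (frequency p).

    Genotype fitnesses: AA = 1 - s, Aa = 1 (fittest), aa = 1 - t.
    """
    q = 1.0 - p
    w_AA, w_Aa, w_aa = 1.0 - s, 1.0, 1.0 - t
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / mean_fitness


def iterate(p0, s, t, generations):
    """Apply the recurrence repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, s, t)
    return p


def equilibrium(s, t):
    """Stable internal equilibrium frequency of A under overdominance."""
    return t / (s + t)


if __name__ == "__main__":
    s, t = 0.1, 0.3  # hypothetical selection coefficients
    print(equilibrium(s, t))
    print(iterate(0.1, s, t, 2000), iterate(0.95, s, t, 2000))
```

Starting from either a low or a high frequency, the recurrence approaches the same intermediate value, so neither allele is lost. That is the defining behaviour of this form of balancing selection.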


Project description


Goals

There are two kinds of genetic variants related to disease. The first are rare genetic variants that have a severe effect predominantly on simple traits (e.g. cystic fibrosis, Huntington disease). The second, more common, genetic variants have a mild effect and are thought to be implicated in complex traits (e.g. cognition, diabetes, heart disease). Between these two types of genetic variants lies a significant gap of knowledge, which the 1000 Genomes Project is designed to address.

The primary goal of this project is to create a complete and detailed catalogue of human genetic variations, which in turn can be used for association studies relating genetic variation to disease. By doing so the consortium aims to discover >95% of the variants (e.g. SNPs, CNVs, indels) with minor allele frequencies as low as 1% across the genome and 0.1-0.5% in gene regions, as well as to estimate the population frequencies, haplotype backgrounds and linkage disequilibrium patterns of variant alleles. (Meeting Report: A Workshop to Plan a Deep Catalog of Human Genetic Variation, (2007) http://www.1000genomes.org/sites/1000genomes.org/files/docs/1000Genomes-MeetingReport.pdf)

Secondary goals will include the support of better SNP and probe selection for genotyping platforms in future studies and the improvement of the human reference sequence. Furthermore, the completed database will be a useful tool for studying regions under selection, variation in multiple populations and understanding the underlying processes of mutation and recombination.
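
A rough sense of why about a thousand genomes suits the 1% and 0.1-0.5% frequency targets comes from a simple sampling calculation: a variant with population frequency f is carried by at least one of 2N sampled chromosomes with probability 1 - (1 - f)^(2N). The sketch below assumes independent sampling of chromosomes from a single population and ignores sequencing coverage and variant-calling error. It therefore gives an upper bound on discovery and does not reproduce the consortium's own power analysis. The function names are illustrative.

```python
import math


def prob_sampled(freq, n_individuals):
    """Probability that a variant of frequency `freq` is carried by at
    least one of the 2N chromosomes in a sample of N diploid individuals."""
    return 1.0 - (1.0 - freq) ** (2 * n_individuals)


def min_individuals(freq, target):
    """Smallest N for which prob_sampled(freq, N) reaches `target`."""
    chromosomes = math.log(1.0 - target) / math.log(1.0 - freq)
    return math.ceil(chromosomes / 2)


if __name__ == "__main__":
    for f in (0.01, 0.005, 0.001):
        print(f, round(prob_sampled(f, 1000), 4), min_individuals(f, 0.95))
```

Under these assumptions a 1% variant is almost certain to be present among 1,000 individuals, while a 0.1% variant is present with a probability of roughly 0.86. This is consistent with the lowest frequencies being targeted only in gene regions, which are sequenced more deeply.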


Outline

The human genome consists of approximately 3 billion DNA base pairs and is estimated to carry around 20,000 protein-coding genes. In designing the study the consortium needed to address several critical issues regarding the project metrics, such as technology challenges, data quality standards and sequence coverage.

Over the course of the next three years, scientists at the Sanger Institute, BGI Shenzhen and the National Human Genome Research Institute's Large-Scale Sequencing Network are planning to sequence a minimum of 1,000 human genomes. Due to the large amount of sequence data that needs to be generated and analyzed, it is possible that other participants may be recruited over time. Almost 10 billion bases will be sequenced per day over the two-year production phase. This equates to more than two human genomes every 24 hours, a groundbreaking capacity. Challenging the leading experts of bioinformatics and statistical genetics, the sequence dataset will comprise 6 trillion DNA bases, 60-fold more sequence data than what has been published in DNA databases over the past 25 years.

To determine the final design of the full project, three pilot studies were designed and will be carried out within the first year of the project. The first pilot intends to genotype 180 people of 3 major geographic groups at low coverage (2x). For the second pilot study, the genomes of two nuclear families (both parents and an adult child) are going to be sequenced with deep coverage (20x per genome). The third pilot study involves sequencing the coding regions (exons) of 1,000 genes in 1,000 people with deep coverage (20x).

It has been estimated that the project would likely cost more than $500 million if standard DNA sequencing technologies were used. Therefore, several new technologies (e.g. Solexa, 454, SOLiD) will be applied, lowering the expected costs to between $30 million and $50 million. The major support will be provided by the Wellcome Trust Sanger Institute in Hinxton, England; the Beijing Genomics Institute, Shenzhen (BGI Shenzhen), China; and the NHGRI, part of the National Institutes of Health (NIH).

In keeping with the Fort Lauderdale principles, all genome sequence data (including variant calls) is freely available as the project progresses and can be downloaded via ftp from the 1000 genomes project webpage.
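
The throughput and coverage figures quoted in this outline can be checked with back-of-the-envelope arithmetic. The sketch below uses only numbers stated above (a genome of about 3 billion base pairs, almost 10 billion bases per day, a two-year production phase, and the first two pilot designs). The constant and function names are illustrative, and the calculation treats "almost 10 billion" as exactly 10 billion, so the totals are upper bounds. The third pilot is omitted because the combined exon length of the 1,000 target genes is not given.

```python
GENOME_SIZE_BP = 3_000_000_000   # approximate haploid genome size from the text
BASES_PER_DAY = 10_000_000_000   # "almost 10 billion", treated as an upper bound
PRODUCTION_DAYS = 2 * 365        # two-year production phase


def genome_equivalents_per_day():
    """Raw bases per day expressed in 1x genome equivalents."""
    return BASES_PER_DAY / GENOME_SIZE_BP


def total_production_bases():
    """Upper bound on raw bases produced over the production phase."""
    return BASES_PER_DAY * PRODUCTION_DAYS


def bases_required(n_samples, coverage):
    """Raw bases needed to sequence n_samples whole genomes at a given coverage."""
    return n_samples * coverage * GENOME_SIZE_BP


if __name__ == "__main__":
    print(genome_equivalents_per_day())   # more than two genomes per day
    print(total_production_bases())       # same order as the quoted 6 trillion
    print(bases_required(180, 2))         # pilot 1: 180 people at 2x
    print(bases_required(6, 20))          # pilot 2: two trios at 20x
```

The upper bound of about 7.3 trillion bases over two years is in line with the quoted dataset size of 6 trillion, given that the daily figure is stated as "almost" 10 billion.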


Human genome samples

Based on the overall goals for the project, the samples will be chosen to provide power in populations where
association studies Genetic association is when one or more genotypes within a population co-occur with a phenotypic trait more often than would be expected by chance occurrence. Studies of genetic association aim to test whether single-locus alleles or genotype fre ...
for common diseases are being carried out. Furthermore, the samples do not need to have medical or phenotype information since the proposed catalogue will be a basic resource on human variation. For the pilot studies human genome samples from the HapMap collection will be sequenced. It will be useful to focus on samples that have additional data available (such as ENCODE sequence, genome-wide genotypes, fosmid-end sequence, structural variation assays, and
gene expression Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product that enables it to produce end products, protein or non-coding RNA, and ultimately affect a phenotype, as the final effect. T ...
) to be able to compare the results with those from other projects. Complying with extensive ethical procedures, the 1000 Genomes Project will then use samples from volunteer donors. The following populations will be included in the study:
Yoruba in Ibadan, Nigeria (YRI); Japanese in Tokyo (JPT); Chinese in Beijing (CHB); Utah residents with ancestry from northern and western Europe (CEU); Luhya in Webuye, Kenya (LWK); Maasai in Kinyawa, Kenya (MKK); Toscani in Italy (TSI); Peruvians in Lima, Peru (PEL); Gujarati Indians in Houston (GIH); Chinese in metropolitan Denver (CHD); people of Mexican ancestry in Los Angeles (MXL); and people of African ancestry in the southwestern United States (ASW).
* Population that was collected in diaspora


Community meeting

Data generated by the 1000 Genomes Project is widely used by the genetics community, making the first 1000 Genomes Project paper one of the most cited papers in biology (C. King (2012), "The Hottest Research of 2011", ''Science Watch'', http://archive.sciencewatch.com/newsletter/2012/201203/hottest_research_2012/). To support this user community, the project held a community analysis meeting in July 2012 that included talks highlighting key project discoveries, their impact on population genetics and human disease studies, and summaries of other large-scale sequencing studies (1000 Genomes Project Community Analysis Meeting, http://1000gconference.sph.umich.edu/).


Project findings


Pilot phase

The pilot phase consisted of three projects:
* low-coverage whole-genome sequencing of 179 individuals from 4 populations
* high-coverage sequencing of 2 trios (mother-father-child)
* exon-targeted sequencing of 697 individuals from 7 populations
It was found that, on average, each person carries around 250–300 loss-of-function variants in annotated genes and 50–100 variants previously implicated in inherited disorders. Based on the two trios, the rate of de novo germline mutation is estimated to be approximately 10⁻⁸ per base per generation.
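To put the per-base mutation rate in perspective, it can be scaled up to a whole genome. The sketch below is a back-of-the-envelope calculation, not part of the project's analysis: the haploid genome length of about 3.1 billion bases is an assumed round figure that does not come from this article, and the function name is purely illustrative.

```python
# Rough estimate of new germline mutations per child.
# The genome length is an assumed round figure, not a value from the article.
MUTATION_RATE_PER_BASE = 1e-8   # per base per generation (pilot-phase estimate)
HAPLOID_GENOME_BASES = 3.1e9    # assumption: approximate haploid human genome length


def expected_de_novo_mutations(rate: float, haploid_bases: float, ploidy: int = 2) -> float:
    """Expected number of new mutations across both inherited genome copies."""
    return rate * haploid_bases * ploidy


estimate = expected_de_novo_mutations(MUTATION_RATE_PER_BASE, HAPLOID_GENOME_BASES)
print(f"about {estimate:.0f} de novo mutations per generation")
```

Under these assumptions the estimate comes out at a few dozen new mutations per child; the exact figure depends on the genome length used and on how much of the genome is actually callable.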


See also

* Human Genome Project
* HapMap Project
* Personal genomics
* Population groups in biomedicine
* 1000 Plant Genomes Project
* List of biological databases


External links


* 1000 Genomes: A Deep Catalog of Human Genetic Variation (official web page)
* International HapMap Project (official web page)
* Human Genome Project Information