GenePattern

GenePattern is a freely available, open-source computational biology software package originally created and developed at the Broad Institute for the analysis of genomic data. Designed to enable researchers to develop, capture, and reproduce genomic analysis methodologies, GenePattern was first released in 2004. It is currently developed at the University of California, San Diego.


Functionality

GenePattern is a scientific workflow system that provides access to hundreds of genomic analysis tools. These tools serve as building blocks for analysis pipelines that capture the methods, parameters, and data used to produce analysis results. Pipelines can be used to create, edit, and share reproducible in silico results.
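The building-block idea can be illustrated with a minimal Python sketch. It is not GenePattern's implementation or API: the `Step` and `Pipeline` classes and the two toy tools (`threshold`, `log2_transform`) are hypothetical names invented for this example. It only shows how chaining tools can record each step and its parameters alongside the result.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Step:
    """One pipeline stage: a named tool plus the parameters it runs with."""
    name: str
    func: Callable[..., Any]
    params: dict = field(default_factory=dict)


@dataclass
class Pipeline:
    """An ordered list of steps; running it also records what was done."""
    steps: list = field(default_factory=list)

    def add(self, name, func, **params):
        self.steps.append(Step(name, func, params))
        return self  # returning self allows chained .add() calls

    def run(self, data):
        log = []
        for step in self.steps:
            data = step.func(data, **step.params)
            log.append({"step": step.name, "params": step.params})
        return data, log


# Toy "analysis tools" (hypothetical stand-ins for real modules).
def threshold(values, floor):
    """Raise every value below `floor` up to `floor`."""
    return [max(v, floor) for v in values]


def log2_transform(values):
    """Apply a base-2 logarithm to every value."""
    return [math.log2(v) for v in values]


pipeline = (
    Pipeline()
    .add("threshold", threshold, floor=1.0)
    .add("log2", log2_transform)
)
result, log = pipeline.run([0.5, 2.0, 8.0])
```

Here `result` holds the transformed values and `log` lists the steps in execution order with their parameters, which is the kind of information a pipeline needs to keep for an analysis to be repeated.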


Project Objectives

# Accessibility: Run over 200 regularly updated analysis and visualization tools (supporting data preprocessing, gene expression analysis, proteomics, single nucleotide polymorphism (SNP) analysis, flow cytometry, and next-generation sequencing) and create analytic workflows without any programming through a point-and-click user interface.
# Reproducibility: Automated history and provenance tracking with versioning, so that any user can share, repeat, and understand a complete computational analysis.
# Extensibility: Computational users can import their methods and code for sharing, using tools that support easy creation and integration.
# Multiple interfaces: Web browser, application, and programmatic interfaces make analysis modules and pipelines available to a broad range of users; a public hosted server is also available.
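One way to picture provenance tracking with versioning is a record that names the module, its version, its parameters, and a hash of the input data, reduced to a single fingerprint. The sketch below is an illustration under that assumption, not GenePattern's actual provenance format. The record layout, the function names, and the module name `ExampleModule` are all hypothetical.

```python
import hashlib
import json


def provenance_record(module, version, params, input_bytes):
    """Describe one analysis run by what went into it."""
    return {
        "module": module,
        "version": version,
        "params": params,
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
    }


def fingerprint(record):
    """Stable hash of a record; sort_keys makes it independent of key order."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical input data and module name, for illustration only.
data = b"gene\tsample1\nTP53\t7.2\n"
first = provenance_record("ExampleModule", "1.0", {"floor": 1.0, "log": True}, data)
repeat = provenance_record("ExampleModule", "1.0", {"log": True, "floor": 1.0}, data)
newer = provenance_record("ExampleModule", "1.1", {"floor": 1.0, "log": True}, data)

same_run = fingerprint(first) == fingerprint(repeat)   # same inputs, same fingerprint
changed = fingerprint(first) != fingerprint(newer)     # a version bump changes it
```

Matching fingerprints indicate that two runs used the same module version, parameters, and input. A differing fingerprint shows that at least one of them changed.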


Features

* A regularly updated repository of hundreds of computational analysis modules that support data preprocessing, gene expression analysis, proteomics, single nucleotide polymorphism (SNP) analysis, flow cytometry, and short-read sequencing.
* A programmatic interface that makes analysis modules available to computational biologists and developers from Python, Java, MATLAB, and R.
* The GenePattern Notebook Environment: Built on the Jupyter Notebook environment, GenePattern Notebook allows researchers to run GenePattern analyses within notebooks that interleave text, graphics, and executable code, creating a single "research narrative."
* GParc: A repository and community for GenePattern users to share and discuss their own GenePattern modules.


Availability

GenePattern is available:
# As a free public web application, hosted on Amazon Web Services. Users can create accounts, perform analyses, and create pipelines on the server.
# As open-source software that can be downloaded and installed locally.
# On public web servers hosted by other organizations.


References

* The GenePattern Notebook Environment. Reich M, Tabor T, Liefeld T, Thorvaldsdóttir H, Hill B, Tamayo P, Mesirov JP. ''Cell Syst''. 2017 Aug 23;5(2):149-151.e1. Epub 2017 Aug 16.
* Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace. Qu K, Garamszegi S, Wu F, Thorvaldsdottir H, Liefeld T, Ocana M, Borges-Rivera D, Pochet N, Robinson JT, Demchak B, Hull T, Ben-Artzi G, Blankenberg D, Barber GP, Lee BT, Kuhn RM, Nekrutenko A, Segal E, Ideker T, Reich M, Regev A, Chang HY, Mesirov JP. ''Nat Methods''. 2016 Mar;13(3):245-247. Epub 2016 Jan 18.
* Using GenePattern for Gene Expression Analysis. Kuehn H, Liberzon A, Reich M, Mesirov JP. ''Current Protocols in Bioinformatics''. 2008. 22:7.12:7.12.1–7.12.39.
* GenePattern 2.0. Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP. ''Nature Genetics''. 2006;38:500-501.


External links


* Official GenePattern website
* Official GenePattern Notebook website
* GParc
* https://github.com/genepattern/genepattern-server

Related software:
* GenomeSpace
* Integrative Genomics Viewer