Translational bioinformatics (TBI) is a field of study within health informatics that emerged in the 2010s, focused on the convergence of molecular bioinformatics, biostatistics, statistical genetics and clinical informatics. Its focus is on applying informatics methodology to the increasing amount of biomedical and genomic data to formulate knowledge and medical tools that can be used by scientists, clinicians, and patients. Furthermore, it involves applying biomedical research to improve human health through the use of computer-based information systems.
TBI employs data mining and the analysis of biomedical information to generate clinical knowledge for application. Clinical knowledge includes finding similarities in patient populations and interpreting biological information to suggest therapies and predict health outcomes.
History
Translational bioinformatics is a relatively young field within translational research. Google Trends data indicate that use of the term "bioinformatics" has decreased since the mid-1990s, when it was suggested as a transformative approach to biomedical research. It was coined, however, close to ten years earlier. TBI was then presented as a means to facilitate data organization, accessibility and improved interpretation of the available biomedical research. It was considered a decision support tool that could integrate biomedical information into decision-making processes that otherwise would have been omitted due to the nature of human memory and thinking patterns.
Initially, the focus of TBI was on ontology and vocabulary designs for searching the mass data stores. However, this effort was largely unsuccessful, as preliminary attempts at automation resulted in misinformation. TBI needed to develop a baseline for cross-referencing data with higher-order algorithms in order to link data, structures and functions in networks. This went hand in hand with a focus on developing curricula for graduate-level programs and on capitalizing, for funding purposes, on the growing public acknowledgement of the potential opportunity in TBI.
When the first draft of the human genome was completed in the early 2000s, TBI continued to grow and demonstrate prominence as a means to bridge biological findings with clinical informatics, impacting the opportunities for both biology and healthcare. Expression profiling, text mining for trends analysis, population-based data mining providing biomedical insights, and ontology development have been explored, defined and established as important contributions to TBI. Achievements of the field that have been used for knowledge discovery include linking clinical records to genomics data, linking drugs with ancestry, whole genome sequencing for a group with a common disease, and semantics in literature mining. There has been discussion of cooperative efforts to create cross-jurisdictional strategies for TBI, particularly in Europe. The past decade has also seen the development of personalized medicine and data sharing in pharmacogenomics. These accomplishments have solidified public interest, generated funds for investment in training and further curriculum development, increased demand for skilled personnel in the field and pushed ongoing TBI research and development.
Benefits and opportunities
At present, TBI research spans multiple disciplines; however, the application of TBI in clinical settings remains limited. Currently, it is partially deployed in drug development, regulatory review, and clinical medicine. The opportunity for application of TBI is much broader, as medical journals increasingly mention the term "informatics" and discuss bioinformatics-related topics. TBI research draws on four main areas of discourse: clinical genomics, genomic medicine, pharmacogenomics, and genetic epidemiology. There are increasing numbers of conferences and forums focused on TBI to create opportunities for knowledge sharing and field development. General topics that appear in recent conferences include: (1) personal genomics and genomic infrastructure, (2) drug and gene research for adverse events, interactions and repurposing of drugs, (3) biomarkers and phenotype representation, (4) sequencing, science and systems medicine, (5) computational and analytical methodologies for TBI, and (6) applications bridging genetic research and clinical practice.
With the help of bioinformaticians, biologists are able to analyze complex data, set up websites for experimental measurements, facilitate sharing of the measurements, and correlate findings to clinical outcomes. Translational bioinformaticians studying a particular disease would have more sample data regarding that disease than an individual biologist studying the disease alone.
Since the completion of the human genome, new projects are now attempting to systematically analyze all the gene alterations in a disease like cancer rather than focusing on a few genes at a time. In the future, large-scale data will be integrated from different sources in order to extract functional information. The availability of a large number of human genomes will allow for statistical mining of their relation to lifestyles, drug interactions, and other factors. Translational bioinformatics is therefore transforming the search for disease genes and is becoming a crucial component of other areas of medical research including pharmacogenomics.
In a study evaluating the computational and economic characteristics of cloud computing for large-scale data integration and analysis in genomic medicine, cloud-based analysis had similar cost and performance in comparison to a local computational cluster. This suggests that cloud-computing technologies might be a valuable and economical means of facilitating large-scale translational research in genomic medicine.
Methodologies
Storage
Vast amounts of bioinformatics data are currently available and continue to increase. For instance, the GenBank database, funded by the National Institutes of Health (NIH), currently holds 82 billion nucleotides in 78 million sequences representing 270,000 species. The equivalent of GenBank for gene expression microarrays, known as the Gene Expression Omnibus (GEO), has over 183,000 samples from 7,200 experiments, and this number doubles or triples each year. The European Bioinformatics Institute (EBI) has a similar database called ArrayExpress, which has over 100,000 samples from over 3,000 experiments. Altogether, TBI has access to more than a quarter million microarray samples at present.
To extract relevant data from large data sets, TBI employs various methods such as data consolidation, data federation, and data warehousing. In the data consolidation approach, data is extracted from various sources and centralized in a single database. This approach enables standardization of heterogeneous data and helps address issues in interoperability and compatibility among data sets. However, users of this method often encounter difficulties in updating their databases, as the approach is based on a single data model. In contrast, the data federation approach links databases together and extracts data on a regular basis, then combines the data for queries. The benefit of this approach is that it enables the user to access real-time data on a single portal. Its limitation is that the data collected may not always be synchronized, as it is derived from multiple sources. Data warehousing provides a single unified platform for data curation. Data warehousing integrates data from multiple sources into a common format, and is typically used in bioscience exclusively for decision support purposes.
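The contrast between consolidation and federation can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two in-memory "databases", their field names, the records, and the helper functions are hypothetical and do not correspond to any real TBI system. Federation is simplified here to querying each source at request time.

```python
# Hypothetical source databases with different field names (made-up records).
source_a = [{"patient": "P1", "gene": "BRCA1"}, {"patient": "P2", "gene": "TP53"}]
source_b = [{"subject_id": "P3", "symbol": "EGFR"}]


def normalize_a(rec):
    """Map a source_a record onto the shared (hypothetical) data model."""
    return {"patient_id": rec["patient"], "gene": rec["gene"]}


def normalize_b(rec):
    """Map a source_b record onto the shared (hypothetical) data model."""
    return {"patient_id": rec["subject_id"], "gene": rec["symbol"]}


def consolidate():
    """Consolidation: copy all sources once into one central store."""
    return [normalize_a(r) for r in source_a] + [normalize_b(r) for r in source_b]


def federated_query(gene):
    """Federation (simplified): leave data in place and query each source on request."""
    hits = [normalize_a(r) for r in source_a if r["gene"] == gene]
    hits += [normalize_b(r) for r in source_b if r["symbol"] == gene]
    return hits


central = consolidate()

# A source is updated after the central copy was built.
source_b.append({"subject_id": "P4", "symbol": "TP53"})

consolidated_hits = [r for r in central if r["gene"] == "TP53"]
federated_hits = federated_query("TP53")
print(len(consolidated_hits), len(federated_hits))  # prints "1 2"
```

The central copy misses the later update until it is rebuilt, while the federated query sees it but has to reconcile differing field names on every request.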
Analytics
Analytic techniques serve to translate biological data produced by high-throughput techniques into clinically relevant information. Currently, numerous software tools and methodologies for querying data exist, and this number continues to grow as more studies are conducted and published in bioinformatics journals such as ''Genome Biology'', ''BMC Bioinformatics'', ''BMC Genomics'', and ''Bioinformatics''. To ascertain the best analytical technique, tools such as Weka have been created to sift through the array of software and select the most appropriate technique, abstracting away the need to know a specific methodology.
Integration
Data integration involves developing methods that use biological information for the clinical setting. Integrating data empowers clinicians with tools for data access, knowledge discovery, and decision support. Data integration serves to utilize the wealth of information available in bioinformatics to improve patient health and safety. An example of data integration is the use of decision support systems (DSS) based on translational bioinformatics. DSS used in this regard identify correlations in patient electronic medical records (EMR) and other clinical information systems to assist clinicians in their diagnoses.
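As a rough illustration of the kind of correlation such a system might surface, the following Python sketch compares adverse-event rates between carriers and non-carriers of a genetic variant. The records are invented, the variant and event are unnamed placeholders, and the 0.25 threshold is arbitrary; a real DSS would rely on validated statistical methods and much larger data sets.

```python
# Made-up EMR-style rows: (carries_variant, had_adverse_event).
records = [
    (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, True),
]


def event_rate(rows, carrier):
    """Fraction of patients in the chosen group who had the adverse event."""
    group = [event for variant, event in rows if variant == carrier]
    return sum(group) / len(group)


carrier_rate = event_rate(records, True)        # 2 of 3 carriers
non_carrier_rate = event_rate(records, False)   # 1 of 3 non-carriers

# Arbitrary illustrative threshold for raising a flag to the clinician.
flag_for_review = (carrier_rate - non_carrier_rate) > 0.25
print(flag_for_review)  # prints "True"
```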
Cost
Companies are now able to provide whole human genome sequencing and analysis as a simple outsourced service. Second- and third-generation versions of sequencing systems are planned to increase the number of genomes sequenced per day, per instrument, to 80. According to Cliff Reid, CEO of Complete Genomics, the total market for whole human genome sequencing around the world increased five-fold during 2009 and 2010, and was estimated to be 15,000 genomes for 2011. Furthermore, he maintained that if the price were to fall to $1,000 per genome, the company would still be able to make a profit. The company is also working on process improvements to bring down the internal cost to around $100 per genome, excluding sample-prep and labor costs.
According to the National Human Genome Research Institute (NHGRI), the cost to sequence an entire genome decreased significantly, from over $95 million in 2001 to $7,666 in January 2012. Similarly, the cost of determining one megabase (a million bases) decreased from over $5,000 in 2001 to $0.09 in 2012. In 2008, sequencing centers transitioned from Sanger-based (dideoxy chain termination) sequencing to 'second generation' (or 'next-generation') DNA sequencing technologies. This caused a significant drop in sequencing costs.
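The scale of these reductions can be checked with simple arithmetic. The short Python sketch below uses only the NHGRI figures quoted above; because the 2001 values are given as "over" a round amount, the results are lower bounds, on the order of a 12,000-fold drop per genome and a 55,000-fold drop per megabase.

```python
# Figures as quoted in the text; the 2001 values are stated as "over" these amounts.
genome_cost_2001 = 95_000_000   # US dollars per genome (lower bound)
genome_cost_2012 = 7_666        # US dollars per genome, January 2012
megabase_cost_2001 = 5_000      # US dollars per megabase (lower bound)
megabase_cost_2012 = 0.09       # US dollars per megabase

# Fold reduction = earlier cost divided by later cost.
genome_fold = genome_cost_2001 / genome_cost_2012
megabase_fold = megabase_cost_2001 / megabase_cost_2012

print(f"Per-genome cost fell at least {genome_fold:,.0f}-fold")
print(f"Per-megabase cost fell at least {megabase_fold:,.0f}-fold")
```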
Future directions
TBI has the potential to play a significant role in medicine; however, many challenges still remain. The overarching goal for TBI is to "develop informatics approaches for linking across traditionally disparate data and knowledge sources enabling both the generation and testing of new hypotheses". Current applications of TBI face challenges due to a lack of standards, which results in diverse data collection methodologies. Furthermore, analytic and storage capabilities are hindered by the large volumes of data present in current research. This problem is projected to grow with personal genomics, as it will create an even greater accumulation of data.
Challenges also exist in the research of drugs and biomarkers, genomic medicine, protein design, metagenomics, infectious disease discovery, data curation, literature mining, and workflow development. Continued belief in the opportunity and benefits of TBI justifies further funding for infrastructure, intellectual property protection and accessibility policies.
Available funding for TBI has increased in the past decade. The demand for translational bioinformatics research is due in part to growth in numerous areas of bioinformatics and health informatics and in part to popular support for projects like the Human Genome Project. This growth and influx of funding has enabled the industry to produce assets such as repositories of gene expression data and genomic-scale data, while also making progress towards the concept of a $1,000 genome and completing the Human Genome Project. It is believed by some that TBI will cause a cultural shift in the way scientific and clinical information is processed within the pharmaceutical industry, regulatory agencies, and clinical practice. It is also seen as a means to shift clinical trial designs away from case studies and towards EMR analysis.
Leaders in the field have presented numerous predictions regarding the direction TBI is taking, and should take. A collection of predictions is as follows:
#Lesko (2012) states that a strategy must be developed in the European Union to bridge the gap between academia and industry in the following ways (directly quoted):
##Validate and publish informatics data and technology models to accepted standards in order to facilitate adoption,
##Transform electronic health records to make them more accessible and interoperable,
##Encourage information sharing, engage regulatory agencies, and
##Encourage increasing financial support to grow and develop TBI
#Altman (2011), at the 2011 AMIA Summit on TBI, predicts that:
##Cloud computing will contribute to major biomedical discovery
##Informatics applications to stem cell science will increase
##Immune genomics will emerge as powerful data
##Flow cytometry informatics will grow
##Molecular and expression data will combine for drug repurposing
##Exome sequencing will persist longer than expected
##Progress in interpreting non-coding DNA variations
#Sarkar, Butte, Lussier, Tarczy-Hornoch and Ohno-Machado (2011) state that the future of TBI must establish a way to manage the large amount of available data and look to integrate findings from projects such as the eMERGE (Electronic Medical Records and Genomics) project funded by NIH, the Personal Genome Project, the Exome Project, the Million Veteran Program and the 1000 Genomes Project.
"In an information-rich world, the wealth of information means a dearth of something else—a scarcity of whatever it is that information consumes. What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention and a need to allocate that attention efficiently among the overabundance of information sources that it might consume" (Herbert Simon, 1971).
Associations, conferences and journals
Below is a list of existing associations, conferences and journals that are specific to TBI. The list is by no means all-inclusive and should be extended as others are identified.
;Associations
*American Medical Informatics Association:[http://www.amia.org/]
;Conferences (websites change yearly)
*AMIA Annual Symposium, Chicago, 2012
*AMIA Joint Summits on Translational Science, San Francisco, 2013
*AMIA Summit on Translational Bioinformatics (TBI), San Francisco, 2013
*AMIA Summit on Clinical Research Informatics (CRI), San Francisco, 2013
*TBC 2011: Translational Bioinformatics Conference, Seoul, Korea, 2011
*TBC 2012: Translational Bioinformatics Conference, Jeju Island, Korea, 2012
*TBC/ISCB-Asia 2013: Translational Bioinformatics Conference, Seoul, Korea, 2013
*TBC/ISB 2014: Translational Bioinformatics Conference, Qingdao, China, 2014
*TBC2015: Translational Bioinformatics Conference, Tokyo, Japan, 2015
*IFP/IMIA Working Conference, Interfacing bio- and medical informatics, Amsterdam, 2012
;Journals
*''Journal of the American Medical Informatics Association''
*''Journal of Biomedical Informatics''
*''Journal of Clinical Bioinformatics''
;Special Journal Issues on Translational Bioinformatics
*"Translational Bioinformatics", Lussier YA, Butte A, Hunter L, J Biomed Inform, Volume 43, Issue 3 (2010)
*"Translational Bioinformatics", Kann M, Lewitter F, Chen J, PLoS Comput Biol (2012)
Training and certification
A non-exhaustive list of training and certification programs specific to TBI is given below.
*Masters in Translational Bioinformatics: University of Southern California
*Oregon Clinical and Translational Institute
*Bioinformatics and Translational-Clinical Research Program: Boston University School of Medicine
*University of Pennsylvania, Smilow Center for Translational Research
*Division of Biomedical Informatics: University of California San Diego
*CPBMI (Certified Physician in Biomedical Informatics): The Korean Society of Medical Informatics