Systems biology is the computational and mathematical analysis and modeling of complex biological systems. It is a biology-based interdisciplinary field of study that focuses on complex interactions within biological systems, using a holistic approach (holism instead of the more traditional reductionism) to biological research.
This multifaceted research domain necessitates the collaborative efforts of chemists, biologists, mathematicians, physicists, and engineers to decipher the biology of intricate living systems by merging various quantitative molecular measurements with carefully constructed mathematical models. It offers a comprehensive method for understanding the complex relationships within biological systems. In contrast to conventional biological studies that typically center on isolated elements, systems biology seeks to combine different biological data to create models that illustrate and elucidate the dynamic interactions within a system. This methodology is essential for understanding the complex networks of genes, proteins, and metabolites that influence cellular activities and the traits of organisms. One of the aims of systems biology is to model and discover emergent properties of cells, tissues and organisms functioning as a system, properties whose theoretical description is only possible using techniques of systems biology.
By exploring how function emerges from dynamic interactions, systems biology bridges the gaps that exist between molecules and physiological processes.
As a paradigm, systems biology is usually defined in antithesis to the so-called reductionist paradigm (biological organisation), although it is consistent with the scientific method. The distinction between the two paradigms is referred to in these quotations: "the reductionist approach has successfully identified most of the components and many of the interactions but, unfortunately, offers no convincing concepts or methods to understand how system properties emerge ... the pluralism of causes and effects in biological networks is better addressed by observing, through quantitative measures, multiple components simultaneously and by rigorous data integration with mathematical models." (Sauer et al.) "Systems biology ... is about putting together rather than taking apart, integration rather than reduction. It requires that we develop ways of thinking about integration that are as rigorous as our reductionist programmes, but different. ... It means changing our philosophy, in the full sense of the term." (Denis Noble)

As a series of operational protocols used for performing research, systems biology is a cycle composed of theory, analytic or computational modelling to propose specific testable hypotheses about a biological system, experimental validation, and then use of the newly acquired quantitative description of cells or cell processes to refine the computational model or theory. Since the objective is a model of the interactions in a system, the experimental techniques that most suit systems biology are those that are system-wide and attempt to be as complete as possible. Therefore, transcriptomics, metabolomics, proteomics and high-throughput techniques are used to collect quantitative data for the construction and validation of models.
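The model, experiment, and refinement cycle can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the first-order decay model, the parameter name `k`, the function names, and the synthetic "measurements" that stand in for real quantitative data. The refinement step is a plain narrowing grid search, chosen only because it needs no external libraries.

```python
import math

def simulate(k, times, x0=100.0):
    """Model prediction: first-order decay x(t) = x0 * exp(-k * t)."""
    return [x0 * math.exp(-k * t) for t in times]

def sum_squared_error(predicted, observed):
    """Mismatch between model prediction and measurement."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def refine(k_guess, times, observed, rounds=40):
    """Refine k by repeatedly narrowing a search window around the best fit."""
    lo, hi = 0.0, 2.0 * k_guess + 1.0
    best = k_guess
    for _ in range(rounds):
        candidates = [lo + (hi - lo) * i / 20 for i in range(21)]
        best = min(
            candidates,
            key=lambda k: sum_squared_error(simulate(k, times), observed),
        )
        width = (hi - lo) / 10
        lo, hi = max(0.0, best - width), best + width
    return best

# Hypothetical data: generated from the model itself with k = 0.3,
# standing in for a quantitative time-course experiment.
times = [0, 1, 2, 4, 8]
observed = simulate(0.3, times)

# Start from a rough theoretical guess and refine it against the data.
k_refined = refine(0.1, times, observed)
print(f"refined decay constant: {k_refined:.4f}")
```

In practice the refined model would then be used to propose the next experiment, closing the loop described above.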
A comprehensive systems biology approach necessitates: (i) a thorough characterization of an organism concerning its molecular components, the interactions among these molecules, and how these interactions contribute to cellular functions; (ii) a detailed spatio-temporal molecular characterization of a cell (for example, component dynamics, compartmentalization, and vesicle transport); and (iii) an extensive systems analysis of the cell's 'molecular response' to both external and internal perturbations. Furthermore, the data from (i) and (ii) should be synthesized into mathematical models to test knowledge by generating predictions (hypotheses), uncovering new biological mechanisms, assessing the system's behavior derived from (iii), and ultimately formulating rational strategies for controlling and manipulating cells. To tackle these challenges, systems biology must incorporate methods and approaches from various disciplines that have not traditionally interfaced with one another. The emergence of multi-omics technologies has transformed systems biology by providing extensive datasets that cover different biological layers, including genomics, transcriptomics, proteomics, and metabolomics. These technologies enable the large-scale measurement of biomolecules, leading to a more profound comprehension of biological processes and interactions. Increasingly, methods such as network analysis, machine learning, and pathway enrichment are utilized to integrate and interpret multi-omics data, thereby improving our understanding of biological functions and disease mechanisms.
History
Holism vs. Reductionism
It is challenging to trace the origins and beginnings of systems biology. A comprehensive perspective on the human body was central to the medical practices of Greek, Roman, and East Asian traditions, where physicians and thinkers like Hippocrates believed that health and illness were linked to the equilibrium or disruption of bodily fluids known as humors. This holistic perspective persisted in the Western world throughout the 19th and 20th centuries, with prominent physiologists viewing the body as controlled by various systems, including the nervous system, the gastrointestinal system, and the cardiovascular system. In the latter half of the 20th century, however, this way of thinking was largely supplanted by reductionism: To grasp how the body functions properly, one needed to comprehend the role of each component, from tissues and cells to the complete set of intracellular molecular building blocks.
In the 17th century, the triumphs of physics and the advancement of mechanical clockwork prompted a reductionist viewpoint in biology, interpreting organisms as intricate machines made up of simpler elements.
Jan Smuts (1870–1950), naturalist/philosopher and twice Prime Minister of South Africa, coined the commonly used term holism. Whole systems such as cells, tissues, organisms, and populations were proposed to have unique (emergent) properties. On this view, the behavior of the whole could not be reassembled from the properties of the individual components, and new technologies were necessary to define and understand the behavior of systems.
Even though reductionism and holism are often contrasted with one another, they can be synthesized. One must understand how organisms are built (reductionism), while it is just as important to understand why they are so arranged (systems; holism). Each provides useful insights and answers different questions. However, the study of biological systems requires knowledge about control and design paradigms, as well as principles of structural stability, resilience, and robustness that are not directly inferred from mechanistic information. More profound insight will be gained by employing computer modeling to overcome the complexity in biological systems.
The reductionist perspective was nevertheless consistently balanced by thinkers who underscored the significance of organization and emergent traits in living systems. Reductionism has achieved remarkable success, and our understanding of biological processes has expanded with incredible speed and intensity. However, alongside these extraordinary advancements, science gradually came to understand that possessing complete information about molecular components alone would not suffice to elucidate the workings of life: the individual components rarely illustrate the function of a complex system. It is now commonly recognized that we need approaches for reconstructing integrated systems from their constituent parts and processes if we are to comprehend biological phenomena and manipulate them in a thoughtful, focused way.
Origin of systems biology as a field

In 1968, the term "systems biology" was first introduced at a conference. Those within the discipline soon recognized—and this understanding gradually became known to the wider public—that computational approaches were necessary to fully articulate the concepts and potential of systems biology. Specifically, these techniques needed to view biological phenomena as complex, multi-layered, adaptive, and dynamic systems. They had to account for transformations and intricate nonlinearities, thereby allowing for the smooth integration of smaller models ("modules") into larger, well-organized assemblies of models within complex settings. It became clear that mathematics and computation were vital for these methods. An acceleration of systems understanding came with the publication of the first ground-breaking text compiling molecular, physiological, and anatomical individuality in animals, which has been described as a revolution.
Initially, the wider scientific community was reluctant to accept the integration of computational methods and control theory in the exploration of living systems, believing that "biology was too complex to apply mathematics." However, as the new millennium neared, this viewpoint underwent a significant and lasting transformation.
More scientists started working on the integration of mathematical concepts to understand and solve biological problems. Systems biology has since been widely applied in several fields, including agriculture and medicine.
Approaches to systems biology
Top-down approach
Top-down systems biology identifies molecular interaction networks by analyzing the correlated behaviors observed in large-scale 'omics' studies. With the advent of 'omics', this top-down strategy has become a leading approach. It begins with an overarching perspective of the system's behavior – examining everything at once – by gathering genome-wide experimental data and seeks to unveil and understand biological mechanisms at a more granular level – specifically, the individual components and their interactions. In this framework of 'top-down' systems biology, the primary goal is to uncover novel molecular mechanisms through a cyclical process that initiates with experimental data, transitions into data analysis and integration to identify correlations among molecule concentrations and concludes with the development of hypotheses regarding the co- and inter-regulation of molecular groups. These hypotheses then generate new predictions of correlations, which can be explored in subsequent experiments or through additional biochemical investigations.
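The correlation step of this cycle can be sketched in a few lines. The example below is an illustration under stated assumptions, not a description of any particular tool: the gene names, the expression values, and the 0.9 threshold are all hypothetical, and Pearson correlation is only one of several measures that could be used.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_network(profiles, threshold=0.9):
    """Return edges (a, b, r) for pairs whose |r| reaches the threshold."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges.append((a, b, round(r, 3)))
    return edges

# Hypothetical abundance profiles across five samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}

edges = correlation_network(profiles)
for a, b, r in edges:
    print(a, b, r)
```

Each reported edge is a candidate for co-regulation or inter-regulation, that is, a hypothesis to be tested in a follow-up experiment rather than an established mechanism.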
The notable advantages of top-down systems biology lie in its potential to provide comprehensive (i.e., genome-wide) insights and its focus on the metabolome, fluxome, transcriptome, and/or proteome. Top-down methods prioritize overall system states as influencing factors in models and the computational (or optimality) principles that govern the dynamics of the global system. For instance, while the dynamics of motor control (neuro) emerge from the interactions of millions of neurons, one can also characterize the neural motor system as a sort of feedback control system, which directs a 'plant' (the body) and guides movement by minimizing 'cost functions' (e.g., achieving trajectories with minimal jerk).
Bottom-up approach
Bottom-up systems biology infers the functional characteristics that may arise from a subsystem characterized with a high degree of mechanistic detail using molecular techniques. This approach begins with the foundational elements by developing the interactive behavior (rate equation) of each component process (e.g., enzymatic processes) within a manageable portion of the system. It examines the mechanisms through which functional properties arise in the interactions of known components. Subsequently, these formulations are combined to understand the behavior of the system. The primary goal of this method is to integrate the pathway models into a comprehensive model representing the entire system - the top or whole. As research and understanding advance, these models are often expanded by incorporating additional processes with high mechanistic detail.
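A minimal sketch of this bottom-up construction follows. It assumes a hypothetical two-step pathway S → I → P in which each enzymatic step obeys a Michaelis-Menten rate equation; the parameter values, the species names, and the use of a simple forward Euler integrator are all illustrative choices rather than anything prescribed by the approach.

```python
def michaelis_menten(substrate, vmax, km):
    """Rate equation of a single enzymatic step."""
    return vmax * substrate / (km + substrate)

def simulate_pathway(s0=10.0, t_end=50.0, dt=0.01,
                     vmax1=1.0, km1=0.5, vmax2=0.8, km2=0.3):
    """Combine two rate equations into one pathway model and integrate it."""
    s, i, p = s0, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        v1 = michaelis_menten(s, vmax1, km1)  # S -> I
        v2 = michaelis_menten(i, vmax2, km2)  # I -> P
        # Forward Euler update of the coupled balance equations.
        s += dt * (-v1)
        i += dt * (v1 - v2)
        p += dt * v2
    return s, i, p

s_final, i_final, p_final = simulate_pathway()
print(f"S={s_final:.4f}  I={i_final:.4f}  P={p_final:.4f}")
```

The system-level behavior, here the transient accumulation of the intermediate followed by conversion to product, is not written into either rate equation; it appears only when the component equations are combined.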
The bottom-up approach facilitates the integration and translation of drug-specific in vitro findings to the in vivo human context. This encompasses data collected during the early phases of drug development, such as safety evaluations. When assessing cardiac safety, a purely bottom-up modeling and simulation method entails reconstructing the processes that determine exposure, which includes the plasma (or heart tissue) concentration-time profiles and their electrophysiological implications, ideally incorporating hemodynamic effects and changes in contractility. Achieving this necessitates various models, ranging from single-cell to advanced three-dimensional (3D) multiphase models. Information from multiple in vitro systems that serve as stand-ins for the in vivo absorption, distribution, metabolism, and excretion (ADME) processes enables predictions of drug exposure, while in vitro data on drug-ion channel interactions support the translation of exposure to body surface potentials and the calculation of important electrophysiological endpoints. The separation of data related to the drug, system, and trial design, which is characteristic of the bottom-up approach, allows for predictions of exposure-response relationships considering both inter- and intra-individual variability, making it a valuable tool for evaluating drug effects at a population level. Numerous successful instances of applying physiologically based pharmacokinetic (PBPK) modeling in drug discovery and development have been documented in the literature.
Associated disciplines

According to the interpretation of systems biology as the use of large data sets with interdisciplinary tools, a typical application is metabolomics, the study of the complete set of metabolic products (metabolites) in the system at the organism, cell, or tissue level.
Items that may be held in a computer database include:
phenomics, organismal variation in phenotype as it changes during its life span;
genomics, organismal deoxyribonucleic acid (DNA) sequence, including intra-organismal cell-specific variation (e.g., telomere length variation);
epigenomics/epigenetics, organismal and corresponding cell-specific transcriptomic regulating factors not empirically coded in the genomic sequence (e.g., DNA methylation, histone acetylation and deacetylation);
transcriptomics, organismal, tissue, or whole-cell gene expression measurements by DNA microarrays or serial analysis of gene expression;
interferomics, organismal, tissue, or cell-level transcript correcting factors (e.g., RNA interference);
proteomics, organismal, tissue, or cell-level measurements of proteins and peptides via two-dimensional gel electrophoresis, mass spectrometry, or multi-dimensional protein identification techniques (advanced HPLC systems coupled with mass spectrometry). Subdisciplines include phosphoproteomics, glycoproteomics, and other methods to detect chemically modified proteins;
glycomics, organismal, tissue, or cell-level measurements of carbohydrates;
lipidomics, organismal, tissue, or cell-level measurements of lipids.
The molecular interactions within the cell are also studied; this is called interactomics. One discipline in this field is the study of protein–protein interactions, although interactomics also includes the interactions of other molecules.
Neuroelectrodynamics, where the computer's or a brain's computing function as a dynamic system is studied along with its (bio)physical mechanisms; and fluxomics, measurements of the rates of metabolic reactions in a biological system (cell, tissue, or organism).
There are two main approaches to a systems biology problem: top-down and bottom-up. The top-down approach takes as much of the system into account as possible and relies largely on experimental results. The RNA-Seq technique is an example of an experimental top-down approach. Conversely, the bottom-up approach is used to create detailed models while also incorporating experimental data. An example of the bottom-up approach is the use of circuit models to describe a simple gene network.
Various technologies are utilized to capture dynamic changes in mRNA, proteins, and post-translational modifications.
Mechanobiology, forces and physical properties at all scales and their interplay with other regulatory mechanisms; biosemiotics, analysis of the system of sign relations of an organism or other biosystems; physiomics, a systematic study of the physiome in biology.
Cancer systems biology is an example of the systems biology approach, which can be distinguished by the specific object of study (tumorigenesis and treatment of cancer). It works with specific data (patient samples and high-throughput data, with particular attention to characterizing the cancer genome in patient tumour samples) and tools (immortalized cancer cell lines, mouse models of tumorigenesis, xenograft models, high-throughput sequencing methods, siRNA-based gene knockdown high-throughput screening, and computational modeling of the consequences of somatic mutations and genome instability).
The long-term objective of the systems biology of cancer is the ability to better diagnose cancer, classify it, and better predict the outcome of a suggested treatment, which is a basis for personalized cancer medicine and, in the more distant future, the virtual cancer patient. Significant efforts in computational systems biology of cancer have been made in creating realistic multi-scale in silico models of various tumours.
The systems biology approach often involves the development of mechanistic models, such as the reconstruction of dynamic systems from the quantitative properties of their elementary building blocks. For instance, a cellular network can be modelled mathematically using methods coming from chemical kinetics and control theory. Due to the large number of parameters, variables and constraints in cellular networks, numerical and computational techniques are often used (e.g., flux balance analysis).
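For orientation, flux balance analysis is commonly written as a linear program. The notation below is an assumption introduced here, not taken from the text above: S is the stoichiometric matrix (metabolites by reactions), v is the vector of reaction fluxes, c holds the weights of the objective (often a single biomass reaction), and the bounds encode reaction capacity and reversibility.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for every reaction } j .
\end{aligned}
```

The equality constraint expresses the steady-state assumption that every internal metabolite is produced as fast as it is consumed, which is what lets the method work without kinetic parameters.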
Other aspects of computer science, informatics, and statistics are also used in systems biology. These include new forms of computational models, such as the use of process calculi to model biological processes (notable approaches include stochastic π-calculus, BioAmbients, Beta Binders, BioPEPA, and Brane calculus) and constraint-based modeling; integration of information from the literature, using techniques of information extraction and text mining; development of online databases and repositories for sharing data and models; approaches to database integration and software interoperability via loose coupling of software, websites and databases, or commercial suites; network-based approaches for analyzing high-dimensional genomic data sets, for example weighted correlation network analysis, which is often used for identifying clusters (referred to as modules), modeling the relationship between clusters, calculating fuzzy measures of cluster (module) membership, identifying intramodular hubs, and studying cluster preservation in other data sets; and pathway-based methods for omics data analysis, e.g. approaches to identify and score pathways with differential activity of their gene, protein, or metabolite members.
Much of the analysis of genomic data sets also include identifying correlations. Additionally, as much of the information comes from different fields, the development of syntactically and semantically sound ways of representing biological models is needed.
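As a rough illustration of the network-based, correlation-driven analysis described above, the following sketch builds a small weighted co-expression network. Pairwise Pearson correlations are raised to a soft-threshold power to give edge weights, and each gene's connectivity is the sum of its edge weights. The gene names, expression values and the power `beta = 6` are made-up assumptions for illustration, and the sketch covers only the first steps of weighted correlation network analysis, not module detection or module preservation.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def weighted_network(expr, beta=6):
    """Edge weight = |correlation| ** beta (soft thresholding)."""
    return {
        (g1, g2): abs(pearson(expr[g1], expr[g2])) ** beta
        for g1, g2 in combinations(sorted(expr), 2)
    }

def connectivity(adjacency, genes):
    """Sum of the edge weights attached to each gene."""
    k = {g: 0.0 for g in genes}
    for (g1, g2), w in adjacency.items():
        k[g1] += w
        k[g2] += w
    return k

# Hypothetical expression values (one list of samples per gene).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.1, 9.8],   # tracks geneA closely
    "geneC": [5.0, 4.1, 2.9, 2.2, 1.0],   # anti-correlated with geneA
    "geneD": [3.0, 1.0, 4.0, 1.5, 2.0],   # largely unrelated
}

adj = weighted_network(expr)
k = connectivity(adj, expr)
hub = max(k, key=k.get)   # gene with the highest connectivity
```

Because the absolute correlation is used, strongly anti-correlated genes are also joined by a heavy edge (an unsigned network). Whether that is desirable depends on the biological question.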
Model and its types
What is a model?
A model serves as a conceptual depiction of objects or processes, highlighting certain characteristics of these items or activities. A model captures only certain facets of reality; however, when created correctly, this limited scope is adequate because the primary goal of modeling is to address specific inquiries. The saying, "essentially, all models are wrong, but some are useful," attributed to the statistician George Box, is a suitable principle for constructing models.
Types of models
* Boolean Models: These models are also known as logical models and represent biological systems using binary states, allowing for the analysis of gene regulatory networks and signaling pathways. They are advantageous for their simplicity and ability to capture qualitative behaviors.

* Petri nets (PN): A special type of bipartite graph with two kinds of nodes, places and transitions. When a transition fires, tokens are removed from its input places and added to its output places; the process is asynchronous and non-deterministic.
* Polynomial dynamical systems (PDS): An algebraic approach that represents a specific type of sequential finite dynamical system (FDS) operating over a finite field. Each transition function is an element of a polynomial ring defined over that field. The approach uses fast techniques from computer algebra and computational algebraic geometry, originating from the Buchberger algorithm, to compute Gröbner bases of ideals in these rings. An ideal is a set of polynomials that is closed under polynomial combinations.
* Differential equation models (ODE and PDE): Ordinary differential equations (ODEs) are commonly used to represent the temporal dynamics of networks, while partial differential equations (PDEs) describe behavior in both space and time, enabling the modeling of pattern formation. Such spatiotemporal reaction-diffusion systems demonstrate the emergence of self-organizing patterns, typically explained by the general local activity principle, which accounts for the factors contributing to complexity and self-organization observed in nature.
* Bayesian models: Models of this kind are commonly referred to as dynamic models. They take a probabilistic approach that enables the integration of prior knowledge through Bayes' theorem. Determining the direction of an interaction can be a challenge.
* Finite State Linear Model (FSLM): This model combines continuous variables (such as protein concentration) with discrete elements (such as promoter regions that have a limited number of states).
* Agent-based models (ABM): Initially created within the social sciences and economics, these models simulate the behavior of individual agents (such as genes, RNAs including mRNA, siRNA, miRNA and lncRNA, proteins, and transcription factors) and examine how their interactions influence the larger system, which in this case is the cell.
* Rule-based models: In this approach, molecular interactions are simulated using local rules that can be applied even when no specific network structure is available. Because the network does not have to be inferred first, these network-free methods avoid the complex challenges associated with network inference.
* Piecewise-linear differential equation models (PLDE): The model consists of a piecewise-linear representation of differential equations using step functions, together with a set of inequality constraints on the parameter values.

* Stochastic models: Models based on the chemical master equation describe the probability that a particular molecular species will have a given population or concentration at a specified future time. The Gillespie algorithm generates exact sample trajectories of this process. Because every reaction event is simulated individually, it is computationally intensive compared with deterministic approaches. The stochastic approach is preferred when the number of molecules is low or when the effects of molecular crowding are to be modeled.

* State Space Model (SSM): Linear or non-linear modeling techniques that utilize an abstract state space along with various algorithms, which include Bayesian and other statistical methods, autoregressive models, and Kalman filtering.
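To make the stochastic entry in the list above more concrete, here is a minimal sketch of the Gillespie algorithm for a single species that is produced at a constant rate and degraded in proportion to its copy number. The reaction scheme, the rate constants and the function name `gillespie_birth_death` are illustrative assumptions, not taken from a particular study.

```python
import random

def gillespie_birth_death(k_prod, k_deg, n0, t_end, seed=0):
    """Simulate 0 -> X (rate k_prod) and X -> 0 (rate k_deg * n)."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # propensity of production
        a_deg = k_deg * n        # propensity of degradation
        a_total = a_prod + a_deg
        if a_total == 0:
            break                # no reaction can fire
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

# Hypothetical parameters; the long-run mean copy number is k_prod / k_deg.
times, counts = gillespie_birth_death(k_prod=10.0, k_deg=0.1, n0=0, t_end=200.0)
```

Each run yields one possible trajectory. Estimating the probability distribution described by the chemical master equation requires many runs with different seeds, which is where the computational cost comes from.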
Creating biological models
Researchers begin by choosing a biological pathway and diagramming all of the proteins, genes, and/or metabolic pathways involved. After determining all of the interactions, mass action kinetics or enzyme kinetic rate laws are used to describe the speed of the reactions in the system. Using mass conservation, the differential equations for the biological system can be constructed. Experiments or parameter fitting can then be used to determine the parameter values in the differential equations; these values are the kinetic constants required to fully describe the model. The resulting model describes the behavior of species in the biological system and can bring new insight into the specific activities of the system. Sometimes it is not possible to measure all reaction rates of a system. In that case, unknown rates are estimated by simulating the model with the known parameters and searching for values of the unknown parameters that reproduce the target behavior.
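The workflow above can be sketched for a single reversible binding reaction, A + B <-> C. Mass-action kinetics gives the net rate `kf*[A]*[B] - kr*[C]`, and mass conservation means that `[A] + [C]` and `[B] + [C]` stay constant. The rate constants, initial concentrations and the choice of a fixed-step Runge-Kutta integrator are assumptions made for this example only.

```python
def rates(state, kf, kr):
    """Mass-action rate equations for A + B <-> C."""
    a, b, c = state
    v = kf * a * b - kr * c      # net forward reaction rate
    return (-v, -v, v)

def rk4_step(state, dt, kf, kr):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(x + h * dx for x, dx in zip(s, k))
    k1 = rates(state, kf, kr)
    k2 = rates(shifted(state, k1, dt / 2), kf, kr)
    k3 = rates(shifted(state, k2, dt / 2), kf, kr)
    k4 = rates(shifted(state, k3, dt), kf, kr)
    return tuple(
        x + dt / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
        for x, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
    )

def simulate(state, kf, kr, dt=0.01, t_end=20.0):
    """Integrate the rate equations and return the full trajectory."""
    trajectory = [state]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, kf, kr)
        trajectory.append(state)
    return trajectory

# Hypothetical rate constants and initial concentrations (arbitrary units).
kf, kr = 2.0, 0.5
trajectory = simulate(state=(1.0, 0.8, 0.0), kf=kf, kr=kr)
a, b, c = trajectory[-1]
```

At the end of the run the forward and reverse rates balance, so `kf * a * b` is approximately equal to `kr * c`. In a parameter-fitting setting, `kf` and `kr` would be adjusted until the simulated trajectory matches measured concentrations.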
Constraint-based reconstruction and analysis (COBRA) methods have become popular among systems biologists for simulating and predicting metabolic phenotypes using genome-scale models. One such method is flux balance analysis (FBA), which studies biochemical networks and analyzes the flow of metabolites through a particular metabolic network by optimizing an objective function of interest (e.g. maximizing biomass production to predict growth).
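In its usual formulation, flux balance analysis is a linear program. The notation below is conventional rather than taken from the text above: S is the stoichiometric matrix (metabolites by reactions), v the vector of reaction fluxes, c a weight vector that selects the objective (for example the biomass reaction), and v_min, v_max are the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& v_{\min} \le v \le v_{\max}.
\end{aligned}
```

As a toy example, consider a hypothetical three-reaction chain: uptake (v_1: → A), conversion (v_2: A → B) and biomass formation (v_3: B →). The steady-state condition S v = 0 forces v_1 = v_2 = v_3, so if uptake is bounded by v_1 ≤ 10, the maximal biomass flux is v_3 = 10.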
Tools and database
Applications in systems biology
Systems biology, an interdisciplinary field that combines biology, data analysis, and mathematical modeling, has influenced various sectors, including medicine, agriculture, and environmental science. By integrating omics data (genomics, proteomics, metabolomics, etc.), systems biology provides a holistic understanding of complex biological systems, enabling advances in drug discovery, crop improvement, and environmental impact assessment. This section explores the applications of systems biology across these domains, highlighting both industrial and academic research contributions. In agriculture, systems biology is used to identify the genetic and metabolic components of complex traits through trait dissection. It aids the understanding of plant-pathogen interactions in disease resistance, and it is used to enhance the nutritional content of crops through metabolic engineering.
Cancer
Approaches to cancer systems biology have made it possible to combine experimental data with computer algorithms effectively and, in exceptional cases, to apply actionable targeted medicines for the treatment of cancer. Comprehensive multi-omics data acquired through the sequencing of tumor samples and experimental model systems will be crucial for applying innovative cancer systems biology techniques and for improving their effectiveness in tailoring new, individualized cancer treatments.
Cancer systems biology has the potential to provide insights into intratumor heterogeneity and identify therapeutic options. In particular, enhanced cancer systems biology methods that incorporate not only multi-omics data from tumors, but also extensive experimental models derived from patients can assist clinicians in their decision-making processes, ultimately aiming to address treatment failures in cancer.
Drug development
Before the 1990s, phenotypic drug discovery formed the foundation of most research in drug discovery, using cellular and animal disease models to find drugs without focusing on a specific molecular target. However, following the completion of the Human Genome Project, target-based drug discovery has become the predominant approach in contemporary pharmaceutical research, for several reasons. Gene knockout and transgenic models enable researchers to investigate the function of targets and the mechanisms by which drugs operate at the molecular level. Target-based assays lend themselves better to high-throughput screening, which simplifies the identification of second-generation drugs, those that improve upon first-in-class drugs in aspects such as potency, selectivity, and half-life, especially when combined with structure-based drug design. In structure-based design, researchers use the three-dimensional structure of target proteins and computational models of interactions between small molecules and those targets to aid in the identification of superior compounds.
Cell systems biology represents a phenotypic drug discovery method that integrates the complexity of human disease biology with combinatorial design to develop assays. BioMAP® systems, founded on the principles of cell systems biology, consist of assays based on primary human cells that are designed to replicate intricate human disease and tissue biology in a feasible in vitro environment. Primary human cell types and co-cultures are activated using combinations of pathway activators to create cell signaling networks that align more closely with human disease. These systems are analyzed by assessing the levels of both secreted proteins and cell surface mediators. The distinct variations in protein readouts resulting from drug effects are recorded in a database that enables users to search for functional similarities (or biological 'read across'). In this method, inhibitors or activators targeting specific pathways are discovered to consistently affect the levels of multiple endpoints, often exhibiting a uniquely defined pattern, so that the resulting signatures can be linked to particular mechanisms of action.
Food safety and quality
The multi-omics technologies of systems biology can also be used in aspects of food quality and safety. High-throughput omics techniques, including genomics, proteomics, and metabolomics, offer valuable insights into the molecular composition of food products, facilitating the identification of critical elements that affect food quality and safety. For example, integrating omics data can enhance the understanding of the metabolic pathways and associated functional gene patterns that contribute to both the nutritional value and safety of food crops. This comprehensive approach supports the creation of food products that are both nutritious and safe, and that can help meet increasing global demand.
Environmental systems biology
Genomics examines all genes as an evolving system over time, aiming to understand their interactions and effects on biological pathways, networks, and physiology in a broader context compared to genetics. As a result, genomics holds significant potential for discovering clusters of genes associated with complex disorders, aiding in the comprehension and management of diseases induced by environmental factors.
When exploring the interactions between the environment and the genome as contributors to complex diseases, it is clear that the genome itself cannot be altered for the time being. However, once these interactions are recognized, it is feasible to minimize exposure or adjust lifestyle factors related to the environmental aspect of the disease. Gene-environment interactions can occur through direct associations with active metabolites at certain locations within the genome, potentially leading to mutations that could cause human diseases. Indirect interactions with the human genome can take place through intracellular receptors that function as ligand-activated transcription factors, which modulate gene expression and maintain cellular balance, or with an environmental factor that may produce detrimental effects. This type of environmental-gene interaction could be more straightforward to investigate than direct interactions since there are numerous markers of this kind of interaction that are readily measurable before the disease manifests. Examples of this include the expression of cytochrome P450 genes following exposure to environmental substances, such as the polycyclic aromatic hydrocarbon benzo[a]pyrene, which binds to the Ah receptor.
Technical challenges
One of the main challenges in systems biology is the connection between experimental descriptions, observations, data, models, and the assumptions that stem from them. In essence, systems biology must be understood within an information management framework that significantly encompasses experimental life sciences. Models are created using various languages or representation schemes, each suitable for conveying and reasoning about distinct sets of characteristics. There is no single universal language for systems biology that can adequately cover the diverse phenomena we aim to investigate. However, this intricate scenario overlooks two important aspects. Models can be developed in multiple versions over time and by different research teams. Conflicts can occur, and observations may be disputed. Various researchers might produce models in different versions and configurations. The unpredictable elements suggest that systems biology is not likely to yield a definitive collection of established models. Instead, we can expect a rich ecosystem of models to develop within a structure that fosters discussion and cooperation among participants. Challenges also exist in verifying the constraints and creating modeling frameworks with robust compositional strategies. This may create a need to handle models that may conflict with one another, whether between schemes or across different scales. In the end, the goal could involve the creation of personalized models that reflect differences in physiology, as opposed to universal models of biological processes.
Other challenges include the massive amount of data created by high-throughput omics technologies, which places considerable demands on computation and storage. Large omics studies can produce datasets on the scale of terabytes to petabytes, which require powerful computational systems and ample storage to manage and process effectively. The computational requirements are compounded by the need for advanced algorithms that can integrate and analyze diverse, high-dimensional data. Approaches such as deep learning and network-based methods have shown potential in tackling these issues, but they also demand significant computational power.
Artificial intelligence (AI) in systems biology
Utilizing AI in systems biology enables scientists to uncover novel insights into the intricate relationships present within biological systems, such as those among genes, proteins, and cells. A significant focus within systems biology is the application of AI to the analysis of large and complex datasets, including multi-omics data produced by high-throughput methods such as next-generation sequencing and proteomics. AI-based approaches can be used to detect patterns and correlations within these datasets and to anticipate the behavior of biological systems under varying conditions.
For instance, artificial intelligence can identify genes that are expressed differently across various cancer types or detect small molecules linked to particular disease states. A key difficulty in analyzing multi-omics data is the integration of information from multiple sources. AI can create integrative models that consider the intricate interactions between different types of molecular data. These models may be utilized to uncover new biomarkers or therapeutic targets for diseases, as well as to enhance our understanding of fundamental biological processes. By significantly speeding up our comprehension of complex biological systems, AI has the potential to lead to new treatments and therapies for a range of diseases.
Structural systems biology is a multidisciplinary field that merges systems biology with structural biology to investigate biological systems at the molecular scale. This domain strives for a thorough understanding of how biological molecules interact and function within cells, tissues, and organisms. The integration of AI in structural systems biology has become increasingly vital for examining extensive and complex datasets and modeling the behavior of biological systems. AI facilitates the analysis of protein–protein interaction networks within structural systems biology. These networks can be explored using graph theory and various mathematical methods, uncovering key characteristics such as hubs and modules. AI can also assist in the discovery of new drugs or therapies by predicting the effect of a drug on a particular biological component or pathway.
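The graph-theoretic analysis of protein-protein interaction networks mentioned above can be illustrated with a small sketch that ranks proteins by degree to find hubs and groups them into connected components. The interaction list and protein names are placeholders, and connected components are used here only as a crude stand-in for modules; real module detection typically relies on more elaborate clustering methods.

```python
from collections import defaultdict

def build_graph(interactions):
    """Undirected adjacency sets from a list of (protein, protein) pairs."""
    graph = defaultdict(set)
    for p, q in interactions:
        if p != q:               # ignore self-interactions in this sketch
            graph[p].add(q)
            graph[q].add(p)
    return graph

def hubs(graph, top=1):
    """Proteins ranked by degree (number of distinct partners)."""
    ranked = sorted(graph, key=lambda p: (-len(graph[p]), p))
    return ranked[:top]

def connected_components(graph):
    """Group proteins into connected components via depth-first search."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical interaction list; the protein names are placeholders.
interactions = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"),
    ("P5", "P6"), ("P6", "P7"),
]
graph = build_graph(interactions)
top_hub = hubs(graph)[0]
components = connected_components(graph)
```

In this toy network `P1` has the most interaction partners, and the proteins fall into two separate groups.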
See also
* Biochemical systems equation
* BioSystems (journal)
* Computational systems biology
* Interactome
* List of omics topics in biology
* List of systems biology modeling software
* Metabolic Control Analysis
* Metabolic network modelling
* Modelling biological systems
* Network biology
* Metabolic network
* SBML
External links
* Biological Systems in bio-physics-wiki