Folding@Home

Folding@home (FAH or F@h) is a volunteer computing project that aims to help scientists develop new therapeutics for a variety of diseases by simulating protein dynamics. This includes the process of protein folding and the movements of proteins, and relies on simulations run on volunteers' personal computers. Folding@home is currently based at the University of Pennsylvania and led by Greg Bowman, a former student of Vijay Pande.

The project utilizes graphics processing units (GPUs), central processing units (CPUs), and ARM processors like those on the Raspberry Pi for volunteer computing and scientific research. The project uses a statistical simulation methodology that is a paradigm shift from traditional computing methods. As part of the client–server model network architecture, the volunteered machines each receive pieces of a simulation (work units), complete them, and return them to the project's database servers, where the units are compiled into an overall simulation. Volunteers can track their contributions on the Folding@home website, which makes volunteers' participation competitive and encourages long-term involvement.

Folding@home is one of the world's fastest computing systems. With heightened interest in the project as a result of the COVID-19 pandemic, the system achieved a speed of approximately 1.22 exaflops by late March 2020 and reached 2.43 exaflops by April 12, 2020, making it the world's first exaflop computing system. This level of performance from its large-scale computing network has allowed researchers to run computationally costly atomic-level simulations of protein folding thousands of times longer than formerly achieved. Since its launch on October 1, 2000, Folding@home has been involved in the production of 226 scientific research papers. Results from the project's simulations agree well with experiments.
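
The work-unit cycle described in this introduction (the server hands out pieces, volunteer machines complete them, and the server compiles the returned pieces) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the "simulation" is a stand-in computation whose pieces are independent, and nothing below reflects Folding@home's actual software or protocol.

```python
from collections import deque


def make_work_units(n_frames, frames_per_unit):
    """Split frame indices 0..n_frames-1 into contiguous work units."""
    return [
        {"id": i, "frames": range(start, min(start + frames_per_unit, n_frames))}
        for i, start in enumerate(range(0, n_frames, frames_per_unit))
    ]


def client_compute(unit):
    """Stand-in for a volunteer machine: one toy result per frame."""
    return {"id": unit["id"], "result": [f * f for f in unit["frames"]]}


def run_server(n_frames, frames_per_unit):
    """Hand out every work unit, collect the replies, and compile them."""
    pending = deque(make_work_units(n_frames, frames_per_unit))
    returned = {}
    while pending:
        unit = pending.popleft()          # server hands out the next unit
        reply = client_compute(unit)      # a volunteer completes it
        returned[reply["id"]] = reply["result"]
    # Compile the units into one overall result, ordered by unit id.
    return [x for uid in sorted(returned) for x in returned[uid]]
```

Because each unit carries its own identifier, replies may arrive in any order and still be compiled correctly, which is the property that lets unreliable, heterogeneous volunteer machines contribute to a single result.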


Background

Proteins are an essential component of many biological functions and participate in virtually all processes within biological cells. They often act as enzymes, performing biochemical reactions including cell signaling, molecular transportation, and cellular regulation. Some proteins act as structural elements, forming a type of skeleton for cells, while others, such as antibodies, participate in the immune system. Before a protein can take on these roles, it must fold into a functional three-dimensional structure, a process that often occurs spontaneously and is dependent on interactions within its amino acid sequence and interactions of the amino acids with their surroundings. Protein folding is driven by the search to find the most energetically favorable conformation of the protein, i.e., its native state. Thus, understanding protein folding is critical to understanding what a protein does and how it works, and is considered a holy grail of computational biology. Despite folding occurring within a crowded cellular environment, it typically proceeds smoothly. However, due to a protein's chemical properties or other factors, proteins may misfold, that is, fold down the wrong pathway and end up misshapen. Unless cellular mechanisms can destroy or refold misfolded proteins, they can subsequently aggregate and cause a variety of debilitating diseases. Laboratory experiments studying these processes can be limited in scope and atomic detail, leading scientists to use physics-based computing models that, when complementing experiments, seek to provide a more complete picture of protein folding, misfolding, and aggregation.

Due to the complexity of proteins' conformation or configuration space (the set of possible shapes a protein can take), and limits in computing power, all-atom molecular dynamics simulations have been severely limited in the timescales that they can study. While most proteins typically fold on the order of milliseconds, before 2010, simulations could only reach nanosecond to microsecond timescales. General-purpose supercomputers have been used to simulate protein folding, but such systems are intrinsically costly and typically shared among many research groups. Further, because the computations in kinetic models occur serially, strong scaling of traditional molecular simulations to these architectures is exceptionally difficult. Moreover, as protein folding is a stochastic process (i.e., random) and can statistically vary over time, it is computationally challenging to use long simulations for comprehensive views of the folding process.

Protein folding does not occur in one step. Instead, proteins spend most of their folding time, nearly 96% in some cases, "waiting" in various intermediate conformational states, each a local thermodynamic free energy minimum in the protein's energy landscape. Through a process known as adaptive sampling, these conformations are used by Folding@home as starting points for a set of simulation trajectories. As the simulations discover more conformations, the trajectories are restarted from them, and a Markov state model (MSM) is gradually created from this cyclic process. MSMs are discrete-time master equation models which describe a biomolecule's conformational and energy landscape as a set of distinct structures and the short transitions between them. The adaptive sampling Markov state model method significantly increases the efficiency of simulation as it avoids computation inside the local energy minimum itself, and is amenable to volunteer computing (including on GPUGRID) as it allows for the statistical aggregation of short, independent simulation trajectories. The amount of time it takes to construct a Markov state model is inversely proportional to the number of parallel simulations run, i.e., the number of processors available. In other words, it achieves linear parallelization, leading to a reduction of approximately four orders of magnitude in overall serial calculation time. A completed MSM may contain tens of thousands of sample states from the protein's phase space (all the conformations a protein can take on) and the transitions between them. The model illustrates folding events and pathways (i.e., routes), and researchers can later use kinetic clustering to view a coarse-grained representation of the otherwise highly detailed model. They can use these MSMs to reveal how proteins misfold and to quantitatively compare simulations with experiments.

Between 2000 and 2010, the length of the proteins Folding@home has studied increased by a factor of four, while its timescales for protein folding simulations increased by six orders of magnitude. In 2002, Folding@home used Markov state models to complete approximately a million CPU days of simulations over the span of several months, and in 2011, MSMs parallelized another simulation that required an aggregate 10 million CPU hours of computing. In January 2010, Folding@home used MSMs to simulate the dynamics of the slow-folding 32-residue NTL9 protein out to 1.52 milliseconds, a timescale consistent with experimental folding rate predictions but a thousand times longer than formerly achieved. The model consisted of many individual trajectories, each two orders of magnitude shorter, and provided an unprecedented level of detail into the protein's energy landscape. In 2010, Folding@home researcher Gregory Bowman was awarded the Thomas Kuhn Paradigm Shift Award from the American Chemical Society for the development of the open-source MSMBuilder software and for attaining quantitative agreement between theory and experiment. For his work, Pande was awarded the 2012 Michael and Kate Bárány Award for Young Investigators for "developing field-defining and field-changing computational methods to produce leading theoretical models for protein and RNA folding", and the 2006 Irving Sigal Young Investigator Award for his simulation results which "have stimulated a re-examination of the meaning of both ensemble and single-molecule measurements, making Pande's efforts pioneering contributions to simulation methodology."
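
The Markov state model construction described in this section can be made concrete with a minimal sketch. It assumes the trajectories have already been discretized into integer state indices, and all function names are hypothetical: this is not the MSMBuilder API, and the restart rule shown (pick the least-visited states) is only one simple adaptive-sampling heuristic, not necessarily the one Folding@home uses.

```python
import numpy as np


def count_transitions(trajectories, n_states, lag=1):
    """Count observed transitions i -> j at the given lag over all trajectories."""
    counts = np.zeros((n_states, n_states))
    for traj in trajectories:
        traj = np.asarray(traj)
        # Pair each frame with the frame `lag` steps later.
        np.add.at(counts, (traj[:-lag], traj[lag:]), 1)
    return counts


def transition_matrix(counts, pseudocount=1e-8):
    """Row-normalize counts into transition probabilities.

    The small pseudocount (an assumption) avoids division by zero
    for states that were never left.
    """
    smoothed = counts + pseudocount
    return smoothed / smoothed.sum(axis=1, keepdims=True)


def stationary_distribution(T):
    """Left eigenvector of T for the largest eigenvalue, normalized to sum to 1."""
    eigvals, eigvecs = np.linalg.eig(T.T)
    idx = np.argmax(eigvals.real)
    pi = np.abs(eigvecs[:, idx].real)
    return pi / pi.sum()


def pick_restart_states(counts, n_seeds):
    """Illustrative heuristic: restart new trajectories from least-visited states."""
    visits = counts.sum(axis=1) + counts.sum(axis=0)
    return np.argsort(visits)[:n_seeds]


# Three short, independent toy trajectories over three states.
trajectories = [[0, 0, 1, 1, 2], [2, 2, 1, 0, 0], [1, 2, 2, 2, 1]]
counts = count_transitions(trajectories, n_states=3)
T = transition_matrix(counts)
pi = stationary_distribution(T)
seeds = pick_restart_states(counts, n_seeds=1)
```

Because the count matrix is a plain sum over trajectories, counts gathered from separate batches of short simulations can be added elementwise before normalization. That additivity is what makes the approach a natural fit for many independent volunteer machines.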


Examples of application in biomedical research

Protein misfolding can result in a variety of diseases including Alzheimer's disease, cancer, Creutzfeldt–Jakob disease, cystic fibrosis, Huntington's disease, sickle-cell anemia, and type II diabetes. Cellular infection by viruses such as HIV and influenza also involves folding events on cell membranes. Once protein misfolding is better understood, therapies can be developed that augment cells' natural ability to regulate protein folding. Such therapies include the use of engineered molecules to alter the production of a given protein, help destroy a misfolded protein, or assist in the folding process. The combination of computational molecular modeling and experimental analysis has the potential to fundamentally shape the future of molecular medicine and the rational design of therapeutics, for example by expediting drug discovery and lowering its costs. The goal of the first five years of Folding@home was to make advances in understanding folding, while the current goal is to understand misfolding and related disease, especially Alzheimer's.

The simulations run on Folding@home are used in conjunction with laboratory experiments, but researchers can also use them to study how folding in vitro differs from folding in native cellular environments. This is advantageous in studying aspects of folding, misfolding, and their relationships to disease that are difficult to observe experimentally. For example, in 2011, Folding@home simulated protein folding inside a ribosomal exit tunnel, to help scientists better understand how natural confinement and crowding might influence the folding process. Furthermore, scientists typically employ chemical denaturants to unfold proteins from their stable native state. It is not generally known how the denaturant affects the protein's refolding, and it is difficult to experimentally determine if these denatured states contain residual structures which may influence folding behavior. In 2010, Folding@home used GPUs to simulate the unfolded states of Protein L, and predicted its collapse rate in strong agreement with experimental results.

The large data sets from the project are freely available for other researchers to use upon request, and some can be accessed from the Folding@home website. The Pande lab has collaborated with other molecular dynamics systems such as the Blue Gene supercomputer, and they share Folding@home's key software with other researchers, so that the algorithms which benefited Folding@home may aid other scientific areas. In 2011, they released the open-source Copernicus software, which is based on Folding@home's MSM and other parallelizing methods and aims to improve the efficiency and scaling of molecular simulations on large computer clusters or supercomputers. Summaries of all scientific findings from Folding@home are posted on the Folding@home website after publication.


Alzheimer's disease

Alzheimer's disease is an incurable neurodegenerative disease which most often affects the elderly and accounts for more than half of all cases of dementia. Its exact cause remains unknown, but the disease is identified as a protein misfolding disease. Alzheimer's is associated with toxic aggregations of the amyloid beta (Aβ) peptide, caused by Aβ misfolding and clumping together with other Aβ peptides. These Aβ aggregates then grow into significantly larger senile plaques, a pathological marker of Alzheimer's disease. Due to the heterogeneous nature of these aggregates, experimental methods such as X-ray crystallography and nuclear magnetic resonance (NMR) have had difficulty characterizing their structures. Moreover, atomic simulations of Aβ aggregation are highly demanding computationally due to their size and complexity.

Preventing Aβ aggregation is a promising approach to developing therapeutic drugs for Alzheimer's disease, according to Naeem and Fazili in a literature review article. In 2008, Folding@home simulated the dynamics of Aβ aggregation in atomic detail over timescales on the order of tens of seconds. Prior studies were only able to simulate about 10 microseconds. Folding@home was able to simulate Aβ folding for six orders of magnitude longer than formerly possible. Researchers used the results of this study to identify a beta hairpin that was a major source of molecular interactions within the structure. The study helped prepare the Pande lab for future aggregation studies and for further research to find a small peptide which may stabilize the aggregation process.

In December 2008, Folding@home found several small drug candidates which appear to inhibit the toxicity of Aβ aggregates. In 2010, in close cooperation with the Center for Protein Folding Machinery, these drug leads began to be tested on biological tissue. In 2011, Folding@home completed simulations of several mutations of Aβ that appear to stabilize the aggregate formation, which could aid in the development of drug therapies for the disease and greatly assist with experimental nuclear magnetic resonance spectroscopy studies of Aβ oligomers. Later that year, Folding@home began simulations of various Aβ fragments to determine how natural enzymes affect the structure and folding of Aβ.


Huntington's disease

Huntington's disease is a neurodegenerative genetic disorder that is associated with protein misfolding and aggregation. Excessive repeats of the glutamine amino acid at the N-terminus of the huntingtin protein cause aggregation, and although the behavior of the repeats is not completely understood, it does lead to the cognitive decline associated with the disease. As with other aggregates, there is difficulty in experimentally determining its structure. Scientists are using Folding@home to study the structure of the huntingtin protein aggregate and to predict how it forms, assisting with rational drug design methods to stop the aggregate formation. The N17 fragment of the huntingtin protein accelerates this aggregation, and while there have been several mechanisms proposed, its exact role in this process remains largely unknown. Folding@home has simulated this and other fragments to clarify their roles in the disease. Since 2008, its drug design methods for Alzheimer's disease have been applied to Huntington's.


Cancer

More than half of all known cancers involve mutations of p53, a tumor suppressor protein present in every cell which regulates the cell cycle and signals for cell death in the event of damage to DNA. Specific mutations in p53 can disrupt these functions, allowing an abnormal cell to continue growing unchecked, resulting in the development of tumors. Analysis of these mutations helps explain the root causes of p53-related cancers. In 2004, Folding@home was used to perform the first molecular dynamics study of the refolding of p53's protein dimer in an all-atom simulation of water. The simulation's results agreed with experimental observations and gave insights into the refolding of the dimer that were formerly unobtainable. This was the first peer-reviewed publication on cancer from a volunteer computing project. The following year, Folding@home powered a new method to identify the amino acids crucial for the stability of a given protein, which was then used to study mutations of p53. The method was reasonably successful in identifying cancer-promoting mutations and determined the effects of specific mutations which could not otherwise be measured experimentally.

Folding@home is also used to study protein chaperones, heat shock proteins which play essential roles in cell survival by assisting with the folding of other proteins in the crowded and chemically stressful environment within a cell. Rapidly growing cancer cells rely on specific chaperones, and some chaperones play key roles in chemotherapy resistance. Inhibition of these specific chaperones is seen as a potential mode of action for efficient chemotherapy drugs or for reducing the spread of cancer. Using Folding@home and working closely with the Center for Protein Folding Machinery, the Pande lab hopes to find a drug which inhibits those chaperones involved in cancerous cells. Researchers are also using Folding@home to study other molecules related to cancer, such as the enzyme Src kinase, and some forms of the engrailed homeodomain, a large protein which may be involved in many diseases, including cancer. In 2011, Folding@home began simulations of the dynamics of the small knottin protein EETI, which can identify carcinomas in imaging scans by binding to surface receptors of cancer cells.

Interleukin 2 (IL-2) is a protein that helps T cells of the immune system attack pathogens and tumors. However, its use as a cancer treatment is restricted due to serious side effects such as pulmonary edema. IL-2 binds to these pulmonary cells differently than it does to T cells, so IL-2 research involves understanding the differences between these binding mechanisms. In 2012, Folding@home assisted with the discovery of a mutant form of IL-2 which is three hundred times more effective in its immune system role but carries fewer side effects. In experiments, this altered form significantly outperformed natural IL-2 in impeding tumor growth. Pharmaceutical companies have expressed interest in the mutant molecule, and the National Institutes of Health are testing it against a large variety of tumor models to try to accelerate its development as a therapeutic.


Osteogenesis imperfecta

Osteogenesis imperfecta, known as brittle bone disease, is an incurable genetic bone disorder which can be lethal. Those with the disease are unable to make functional connective bone tissue. This is most commonly due to a mutation in Type-I collagen, which fulfills a variety of structural roles and is the most abundant protein in mammals. The mutation causes a deformation in collagen's triple helix structure, which, if not naturally destroyed, leads to abnormal and weakened bone tissue. In 2005, Folding@home tested a new quantum mechanical method that improved upon prior simulation methods, and which may be useful for future computing studies of collagen. Although researchers have used Folding@home to study collagen folding and misfolding, this work remains a pilot project compared to the Alzheimer's and Huntington's research.


Viruses

Folding@home is assisting in research towards preventing some viruses, such as influenza and HIV, from recognizing and entering biological cells. In 2011, Folding@home began simulations of the dynamics of the enzyme RNase H, a key component of HIV, to try to design drugs to deactivate it.

Folding@home has also been used to study membrane fusion, an essential event for viral infection and a wide range of biological functions. This fusion involves conformational changes of viral fusion proteins and protein docking, but the exact molecular mechanisms behind fusion remain largely unknown. Fusion events may consist of over a half million atoms interacting for hundreds of microseconds. This complexity limits typical computer simulations to about ten thousand atoms over tens of nanoseconds: a difference of several orders of magnitude. The development of models to predict the mechanisms of membrane fusion will assist in the scientific understanding of how to target the process with antiviral drugs. In 2006, scientists applied Markov state models and the Folding@home network to discover two pathways for fusion and gain other mechanistic insights. Following detailed simulations from Folding@home of small cells known as vesicles, in 2007, the Pande lab introduced a new computing method to measure the topology of its structural changes during fusion.

In 2009, researchers used Folding@home to study mutations of influenza hemagglutinin, a protein that attaches a virus to its host cell and assists with viral entry. Mutations to hemagglutinin affect how well the protein binds to a host's cell surface receptor molecules, which determines how infective the virus strain is to the host organism. Knowledge of the effects of hemagglutinin mutations assists in the development of antiviral drugs. As of 2012, Folding@home continues to simulate the folding and interactions of hemagglutinin, complementing experimental studies at the University of Virginia.

In March 2020, Folding@home launched a program to assist researchers around the world who are working on finding a cure and learning more about the coronavirus pandemic. The initial wave of projects simulates potentially druggable protein targets from the SARS-CoV-2 virus and the related SARS-CoV virus, about which there is significantly more data available.


Drug design

Drugs function by binding to specific locations on target molecules and causing some desired change, such as disabling a target or causing a conformational change. Ideally, a drug should act very specifically, and bind only to its target without interfering with other biological functions. However, it is difficult to precisely determine where and how tightly two molecules will bind. Due to limits in computing power, current ''in silico'' methods usually must trade speed for accuracy; e.g., use rapid protein docking methods instead of computationally costly free energy calculations. Folding@home's computing performance allows researchers to use both methods, and evaluate their efficiency and reliability. Computer-assisted drug design has the potential to expedite and lower the costs of drug discovery. In 2010, Folding@home used MSMs and free energy calculations to predict the native state of the villin protein to within 1.8 angstrom (Å) root mean square deviation (RMSD) from the crystalline structure experimentally determined through X-ray crystallography. This accuracy has implications for future protein structure prediction methods, including for intrinsically unstructured proteins. Scientists have used Folding@home to research drug resistance by studying vancomycin, an antibiotic drug of last resort, and beta-lactamase, a protein that can break down antibiotics like penicillin.

Chemical activity occurs along a protein's active site. Traditional drug design methods involve tightly binding to this site and blocking its activity, under the assumption that the target protein exists in one rigid structure. However, this approach works for only approximately 15% of all proteins. Proteins contain allosteric sites which, when bound to by small molecules, can alter a protein's conformation and ultimately affect the protein's activity. These sites are attractive drug targets, but locating them is very computationally costly. In 2012, Folding@home and MSMs were used to identify allosteric sites in three medically relevant proteins: beta-lactamase, interleukin-2, and RNase H.

Approximately half of all known antibiotics interfere with the workings of a bacterium's ribosome, a large and complex biochemical machine that performs protein biosynthesis by translating messenger RNA into proteins. Macrolide antibiotics clog the ribosome's exit tunnel, preventing synthesis of essential bacterial proteins. In 2007, the Pande lab received a grant to study and design new antibiotics. In 2008, they used Folding@home to study the interior of this tunnel and how specific molecules may affect it. The full structure of the ribosome was determined only as of 2011, and Folding@home has also simulated ribosomal proteins, as many of their functions remain largely unknown.
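
The 1.8 Å RMSD figure quoted for villin above measures how far a predicted structure sits, on average, from the experimental one. The following is a minimal sketch of that calculation, not the method used in the study: it assumes the two structures list the same atoms in the same order and have already been superimposed (real comparisons first align the structures, for example with the Kabsch algorithm). The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of
    (x, y, z) coordinates, in the same units as the input (e.g. angstroms).

    Assumes the structures are already superimposed; no alignment is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between matching atoms.
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Toy data (made up): every atom of the "predicted" structure is displaced
# by the vector (0, 3, 4), i.e. by a distance of 5, from the "reference".
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(x, y + 3.0, z + 4.0) for x, y, z in reference]

print(rmsd(reference, predicted))
```

Because RMSD averages over all atoms, a low value can hide a few badly placed atoms, so it is usually read alongside other measures of structural agreement.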


Potential applications in biomedical research

Many more diseases promoted by protein misfolding could benefit from Folding@home, either to discern the misfolded protein structure or the misfolding kinetics, and to assist in drug design in the future. The often fatal prion diseases are among the most significant.


Prion diseases

A
prion Prions are misfolded proteins that have the ability to transmit their misfolded shape onto normal variants of the same protein. They characterize several fatal and transmissible neurodegenerative diseases in humans and many other animals. It ...
(PrP) is a
transmembrane A transmembrane protein (TP) is a type of integral membrane protein that spans the entirety of the cell membrane. Many transmembrane proteins function as gateways to permit the transport of specific substances across the membrane. They frequentl ...
cellular protein found widely in
eukaryotic cells Eukaryotes () are organisms whose Cell (biology), cells have a cell nucleus, nucleus. All animals, plants, fungi, and many unicellular organisms, are Eukaryotes. They belong to the group of organisms Eukaryota or Eukarya, which is one of the ...
. In mammals, it is more abundant in the
central nervous system The central nervous system (CNS) is the part of the nervous system consisting primarily of the brain and spinal cord. The CNS is so named because the brain integrates the received information and coordinates and influences the activity of all par ...
. Although its function is unknown, its high conservation among species indicates an important role in the cellular function. The conformational change from the normal prion protein (PrPc, stands for cellular) to the disease causing
isoform A protein isoform, or "protein variant", is a member of a set of highly similar proteins that originate from a single gene or gene family and are the result of genetic differences. While many perform the same or similar biological roles, some isof ...
PrPSc (stands for prototypical prion disease–
scrapie Scrapie () is a fatal, degenerative disease affecting the nervous systems of sheep and goats. It is one of several transmissible spongiform encephalopathies (TSEs), and as such it is thought to be caused by a prion. Scrapie has been known since ...
) causes a host of diseases collectly known as
transmissible spongiform encephalopathies Transmissible spongiform encephalopathies (TSEs) are a group of progressive and fatal conditions that are associated with prions and affect the brain and nervous system of many animals, including humans, cattle, and sheep. According to the most ...
(TSEs), including
Bovine spongiform encephalopathy (BSE) in cattle, Creutzfeldt-Jakob disease (CJD) and fatal insomnia in humans, and chronic wasting disease (CWD) in the deer family. The conformational change is widely accepted as the result of protein misfolding. What distinguishes TSEs from other protein misfolding diseases is their transmissible nature. The ‘seeding’ of the infectious PrPSc, whether it arises spontaneously, is inherited, or is acquired via exposure to contaminated tissues, can cause a chain reaction that transforms normal PrPc into fibril aggregates or amyloid-like plaques consisting of PrPSc. The molecular structure of PrPSc has not been fully characterized due to its aggregated nature. Little is known about either the mechanism of the protein misfolding or its kinetics. Using the known structure of PrPc and the results of the in vitro and in vivo studies described below, Folding@home could be valuable in elucidating how PrPSc is formed and how the infectious protein arranges itself to form fibrils and amyloid-like plaques, bypassing the requirement to purify PrPSc or dissolve the aggregates. PrPc has been enzymatically dissociated from the membrane and purified, and its structure has been studied using structure characterization techniques such as NMR spectroscopy and X-ray crystallography.
Post-translational murine PrPc has 231 amino acids (aa). The molecule consists of a long and unstructured amino terminal region spanning up to aa residue 121 and a structured carboxy terminal domain. This globular domain harbors two short sheet-forming anti-parallel β-strands (aa 128 to 130 and aa 160 to 162 in murine PrPc) and three α-helices (helix I: aa 143 to 153; helix II: aa 171 to 192; helix III: aa 199 to 226 in murine PrPc). Helices II and III are oriented anti-parallel and connected by a short loop. Their structural stability is supported by a disulfide bridge, which is parallel to both sheet-forming β-strands. These α-helices and the β-sheet form the rigid core of the globular domain of PrPc. The disease-causing PrPSc is
proteinase K resistant and insoluble. Attempts to purify it from the brains of infected animals invariably yield heterogeneous mixtures and aggregated states that are not amenable to characterization by NMR spectroscopy or X-ray crystallography. However, the general consensus is that PrPSc contains a higher percentage of tightly stacked β-sheets than normal PrPc, which renders the protein insoluble and resistant to proteinase. Using cryoelectron microscopy and structural modeling based on similar common protein structures, it has been discovered that PrPSc contains β-sheets in the region of aa 81–95 to aa 171, while the carboxy terminal structure is supposedly preserved, retaining the disulfide-linked α-helical conformation of normal PrPc. These β-sheets form a parallel left-handed beta-helix. Three PrPSc molecules are believed to form a primary unit and therefore build the basis for the so-called scrapie-associated fibrils. The catalytic activity depends on the size of the particle: PrPSc particles that consist of only 14-28 PrPc molecules exhibit the highest rate of infectivity and conversion. Despite the difficulty of purifying and characterizing PrPSc, the potential ‘hot spots’ of protein misfolding leading to the pathogenic PrPSc could be deduced from the known molecular structure of PrPc and from studies using transgenic mice and N-terminal deletion, and Folding@home could be of great value in confirming these. Studies found that both the primary and
secondary structure of the prion protein can be of significance to the conversion. There are more than twenty mutations of the prion protein gene (PRNP) that are known to be associated with, or directly linked to, the hereditary form of human TSEs, indicating that single amino acids at certain positions, likely within the carboxy domain, of PrPc can affect the susceptibility to TSEs. The post-translational amino terminal region of PrPc consists of residues 23-120, which make up nearly half of the amino acid sequence of full-length mature PrPc. There are two sections in the amino terminal region that may influence conversion. First, residues 52-90 contain an octapeptide repeat region (five repeats) that likely influences the initial binding (via the octapeptide repeats) and also the actual conversion via the second section, aa 108–124. The highly
hydrophobic AGAAAAGA sequence is located between aa residues 113 and 120 and is described as a putative aggregation site, although this sequence requires its flanking parts to form fibrillar aggregates. In the carboxy globular domain, among the three helices, studies show that helix II has a significantly higher propensity for β-strand conformation. Due to the high conformational flexibility seen between residues 114-125 (part of the unstructured N-terminal chain) and the high β-strand propensity of helix II, only moderate changes in the environmental conditions or interactions might be sufficient to induce misfolding of PrPc and subsequent fibril formation. Other studies of NMR structures of PrPc showed that these residues (~108–189) contain most of the folded domain including both β-strands, the first two α-helices, and the loop/turn regions connecting them, but not helix III. Small changes within the loop/turn structures of PrPc itself could be important in the conversion as well. In another study, Riek et al. showed that the two small regions of β-strand upstream of the loop regions act as a nucleation site for the conformational conversion of the loop/turn and α-helical structures in PrPc to β-sheet. The energy threshold for the conversion is not necessarily high. The folding stability, i.e. the free energy of a globular protein in its environment, is in the range of one or two hydrogen bonds, which allows the transition to an isoform without requiring a high transition energy. From the perspective of the interactions among PrPc molecules, hydrophobic interactions play a crucial role in the formation of β-sheets, a hallmark of PrPSc, as the sheets bring fragments of polypeptide chains into close proximity. Indeed, Kuznetsov and Rackovsky showed that disease-promoting mutations in human PrPc had a statistically significant tendency towards increasing local hydrophobicity. In vitro experiments showed that the kinetics of misfolding have an initial lag phase followed by a rapid growth phase of fibril formation. It is likely that PrPc goes through some intermediate states, such as being at least partially unfolded or degraded, before finally ending up as part of an amyloid fibril.


Patterns of participation

Like other volunteer computing projects, Folding@home is a type of distributed computing project. In these projects, non-specialists contribute computer processing power or help to analyze data produced by professional scientists. Participants receive little or no obvious reward. Research into the motivations of citizen scientists has mostly found that participants take part for altruistic reasons; that is, they want to help scientists and contribute to the advancement of their research. Many participants in citizen science have an underlying interest in the topic of the research and gravitate towards projects in disciplines that interest them. Folding@home is no different in that respect. Research on over 400 active participants revealed that they wanted to contribute to research and that many had friends or relatives affected by the diseases that the Folding@home scientists investigate. Folding@home also attracts participants who are computer hardware enthusiasts. These groups bring considerable expertise to the project and are able to build computers with advanced processing power. Other volunteer computing projects attract these types of participants as well, and such projects are often used to benchmark the performance of modified computers; this aspect of the hobby is accommodated through the competitive nature of the project. Individuals and teams can compete to see who can process the most work. The same research, which involved interviews and ethnographic observation of online groups, showed that teams of hardware enthusiasts can sometimes work together, sharing best practice with regard to maximizing processing output. Such teams can become communities of practice, with a shared language and online culture. This pattern of participation has been observed in other volunteer computing projects. Another key observation of Folding@home participants is that many are male. This has also been observed in other volunteer projects. Furthermore, many participants work in computer and technology-based jobs and careers. Not all Folding@home participants are hardware enthusiasts. Many participants run the project software on unmodified machines and do take part competitively. Over 100,000 participants are involved in Folding@home. It is difficult to ascertain what proportion of participants are hardware enthusiasts, although, according to the project managers, the contribution of the enthusiast community is substantially larger in terms of processing power.


Performance

Supercomputer FLOPS performance is assessed by running the legacy LINPACK benchmark. This short-term testing has difficulty in accurately reflecting sustained performance on real-world tasks because LINPACK maps more efficiently to supercomputer hardware. Computing systems vary in architecture and design, so direct comparison is difficult. Despite this, FLOPS remain the primary speed metric used in supercomputing. In contrast, Folding@home determines its FLOPS using wall-clock time by measuring how much time its work units take to complete. On September 16, 2007, due in large part to the participation of PlayStation 3 consoles, the Folding@home project officially attained a sustained performance level higher than one native petaFLOPS, becoming the first computing system of any kind to do so. Top500's fastest supercomputer at the time was BlueGene/L, at 0.280 petaFLOPS. The following year, on May 7, 2008, the project attained a sustained performance level higher than two native petaFLOPS, followed by the three and four native petaFLOPS milestones in August 2008 and on September 28, 2008 respectively. On February 18, 2009, Folding@home achieved five native petaFLOPS, and was the first computing project to meet these five levels. In comparison, November 2008's fastest supercomputer was IBM's Roadrunner at 1.105 petaFLOPS. On November 10, 2011, Folding@home's performance exceeded six native petaFLOPS with the equivalent of nearly eight x86 petaFLOPS. In mid-May 2013, Folding@home attained over seven native petaFLOPS, with the equivalent of 14.87 x86 petaFLOPS. It then reached eight native petaFLOPS on June 21, followed by nine on September 9 of that year, with 17.9 x86 petaFLOPS. On May 11, 2016, Folding@home announced that it was moving towards reaching the 100 x86 petaFLOPS mark. Use grew further with increased awareness of and participation in the project during the coronavirus pandemic in 2020. On March 20, 2020, Folding@home announced via Twitter that it was running with over 470 native petaFLOPS, the equivalent of 958 x86 petaFLOPS. By March 25 it reached 768 petaFLOPS, or 1.5 x86 exaFLOPS, making it the first exaFLOP computing system. As of November 20, 2020, Folding@home reported only 0.2 x86 exaFLOPS due to a calculation error.
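
The wall-clock approach to measuring throughput can be illustrated with a minimal sketch. It assumes that each completed work unit comes with an estimated floating-point operation count (the `float_ops` field is a hypothetical name, not something the project documents here) and that the hosts run concurrently, so that their individual rates add up. It is not the project's actual accounting code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CompletedWorkUnit:
    float_ops: float     # assumed: estimated floating-point operations in the unit
    wall_seconds: float  # elapsed real time from start to finish on the host


def host_flops(unit: CompletedWorkUnit) -> float:
    """Sustained rate of one host: operations divided by wall-clock time."""
    return unit.float_ops / unit.wall_seconds


def network_flops(units: Iterable[CompletedWorkUnit]) -> float:
    """Hosts are assumed to work in parallel, so their rates are summed."""
    return sum(host_flops(u) for u in units)


def to_peta(flops: float) -> float:
    """Convert FLOPS to petaFLOPS (1 petaFLOPS = 1e15 FLOPS)."""
    return flops / 1e15


units = [
    CompletedWorkUnit(float_ops=3.6e15, wall_seconds=3600.0),
    CompletedWorkUnit(float_ops=7.2e15, wall_seconds=3600.0),
]
total = network_flops(units)
```

Because the rate is derived from how long real work actually took, it reflects sustained performance rather than a short benchmark run.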


Points

Similarly to other volunteer computing projects, Folding@home quantitatively assesses user computing contributions to the project through a credit system. All units from a given protein project have uniform base credit, which is determined by benchmarking one or more work units from that project on an official reference machine before the project is released. Each user receives these base points for completing every work unit, though through the use of a passkey they can receive added bonus points for reliably and rapidly completing units which are more demanding computationally or have a greater scientific priority. Users may also receive credit for work done by their clients on multiple machines. This point system attempts to align awarded credit with the value of the scientific results. Users can register their contributions under a team, which combines the points of all its members. A user can start their own team, or they can join an existing team. In some cases, a team may have its own community-driven sources of help or recruitment such as an Internet forum. The points can foster friendly competition between individuals and teams to compute the most for the project, which can benefit the folding community and accelerate scientific research. Individual and team statistics are posted on the Folding@home website. If a user does not form a new team, or does not join an existing team, that user automatically becomes part of a "Default" team. This "Default" team has a team number of "0". Statistics are accumulated for this "Default" team as well as for specially named teams.
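
The credit rules above can be sketched in a few lines. The text does not state how the bonus is computed, so the square-root form and the constant `k` below are illustrative assumptions, as are all function and parameter names. Only the base credit, the passkey requirement, and the default team number 0 come from the description above.

```python
import math
from collections import defaultdict
from typing import Iterable, Optional, Tuple

DEFAULT_TEAM = 0  # from the text: users without a team are counted under team 0


def credit(base_points: float, has_passkey: bool,
           deadline_days: float, elapsed_days: float, k: float = 0.75) -> float:
    """Base credit plus a speed bonus (the bonus formula is an assumption)."""
    if not has_passkey or elapsed_days >= deadline_days:
        return base_points
    # Faster returns give a larger multiplier; never less than the base credit.
    return base_points * max(1.0, math.sqrt(k * deadline_days / elapsed_days))


def team_totals(results: Iterable[Tuple[str, Optional[int], float]]) -> dict:
    """Sum points per team; a team of None falls back to the default team."""
    totals = defaultdict(float)
    for _user, team, points in results:
        totals[DEFAULT_TEAM if team is None else team] += points
    return dict(totals)


totals = team_totals([("alice", 5, 100.0), ("bob", None, 50.0), ("carol", 5, 25.0)])
```

Keeping the bonus as a multiplier on the benchmarked base credit means that units from the same project stay comparable, while quick and reliable returns are still rewarded.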


Software

Folding@home software at the user's end involves three primary components: work units, cores, and a client.


Work units

A work unit is the protein data that the client is asked to process. Work units are a fraction of the simulation between the states in a Markov model. After the work unit has been downloaded and completely processed by a volunteer's computer, it is returned to Folding@home servers, which then award the volunteer the credit points. This cycle repeats automatically. All work units have associated deadlines, and if this deadline is exceeded, the user may not get credit and the unit will be automatically reissued to another participant. As protein folding occurs serially, and many work units are generated from their predecessors, this allows the overall simulation process to proceed normally if a work unit is not returned after a reasonable period of time. Due to these deadlines, the minimum system requirement for Folding@home is a Pentium 3 450 MHz CPU with Streaming SIMD Extensions (SSE). However, work units for high-performance clients have a much shorter deadline than those for the uniprocessor client, as a major part of the scientific benefit is dependent on rapidly completing simulations. Before public release, work units go through several quality assurance steps to keep problematic ones from becoming fully available. These testing stages include internal, beta, and advanced, before a final full release across Folding@home. Folding@home's work units are normally processed only once, except in the rare event that errors occur during processing. If this occurs for three different users, the unit is automatically pulled from distribution. The Folding@home support forum can be used to differentiate between issues arising from problematic hardware and bad work units.
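
The deadline and error rules described above can be sketched as a small piece of server-side bookkeeping. All class and function names here are hypothetical, and the sketch simplifies "may not get credit" to "no credit after the deadline". It is an illustration of the rules, not the project's server code.

```python
from dataclasses import dataclass, field
from typing import Set

MAX_FAILED_USERS = 3  # from the text: pulled after errors for three different users


@dataclass
class WorkUnit:
    unit_id: str
    deadline: float                 # time (in seconds) by which the result is due
    completed: bool = False
    pulled: bool = False
    failed_users: Set[str] = field(default_factory=set)


def report_error(unit: WorkUnit, user: str) -> None:
    """Record a processing error; repeated errors from one user count once."""
    unit.failed_users.add(user)
    if len(unit.failed_users) >= MAX_FAILED_USERS:
        unit.pulled = True


def report_result(unit: WorkUnit, now: float) -> bool:
    """Accept a result; return True if it arrived in time to earn credit."""
    unit.completed = True
    return now <= unit.deadline


def needs_reissue(unit: WorkUnit, now: float) -> bool:
    """An overdue, unfinished unit goes to another participant unless pulled."""
    return not unit.completed and not unit.pulled and now > unit.deadline
```

Tracking failures as a set of distinct users is what separates a bad work unit from a single machine with faulty hardware, which is the same distinction the support forum helps users make.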


Cores

Specialized molecular dynamics programs, referred to as "FahCores" and often abbreviated "cores", perform the calculations on the work unit as a background process. A large majority of Folding@home's cores are based on GROMACS, one of the fastest and most popular molecular dynamics software packages, which largely consists of manually optimized assembly language code and hardware optimizations. Although GROMACS is open-source software and there is a cooperative effort between the Pande lab and GROMACS developers, Folding@home uses a closed-source license to help ensure data validity. Less active cores include ProtoMol and SHARPEN. Folding@home has used AMBER, CPMD, Desmond, and TINKER, but these have since been retired and are no longer in active service. Some of these cores perform explicit solvation calculations in which the surrounding solvent (usually water) is modeled atom-by-atom, while others use implicit solvation methods, where the solvent is treated as a mathematical continuum. The core is separate from the client to enable the scientific methods to be updated automatically without requiring a client update. The cores periodically create calculation checkpoints so that if they are interrupted they can resume work from that point upon startup.
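
The checkpointing behavior can be illustrated with a minimal sketch. The file format, the function name, and the `stop_after` parameter (used only to simulate an interruption) are assumptions made for illustration, and the "simulation step" is a stand-in that just increments a number. Real cores store far richer simulation state.

```python
import json
import os
from typing import Optional, Tuple


def run(total_steps: int, path: str, checkpoint_every: int = 100,
        stop_after: Optional[int] = None) -> Tuple[int, float]:
    """Advance a toy simulation, resuming from the checkpoint file if present."""
    step, value = 0, 0.0
    if os.path.exists(path):
        with open(path) as f:
            saved = json.load(f)
        step, value = saved["step"], saved["value"]

    executed = 0
    while step < total_steps:
        if stop_after is not None and executed >= stop_after:
            return step, value  # simulated interruption; unsaved progress is lost
        value += 1.0            # stand-in for one molecular dynamics step
        step += 1
        executed += 1
        if step % checkpoint_every == 0:
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"step": step, "value": value}, f)
            os.replace(tmp, path)  # swap in the new checkpoint in one step
    return step, value
```

Writing to a temporary file and then renaming it means that an interruption during the write leaves the previous checkpoint intact, so at most the work since the last checkpoint has to be repeated.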


Client

A Folding@home participant installs a client program on their personal computer. The user interacts with the client, which manages the other software components in the background. Through the client, the user may pause the folding process, open an event log, check the work progress, or view personal statistics. The computer clients run continuously in the background at a very low priority, using idle processing power so that normal computer use is unaffected. The maximum CPU use can be adjusted via client settings. The client connects to a Folding@home server and retrieves a work unit and may also download the appropriate core for the client's settings, operating system, and the underlying hardware architecture. After processing, the work unit is returned to the Folding@home servers. Computer clients are tailored to uniprocessor and multi-core processor systems, and graphics processing units. The diversity and power of each hardware architecture provides Folding@home with the ability to efficiently complete many types of simulations in a timely manner (in a few weeks or months rather than years), which is of significant scientific value. Together, these clients allow researchers to study biomedical questions formerly considered impractical to tackle computationally. Professional software developers are responsible for most of Folding@home's code, both for the client and server-side. The development team includes programmers from
Nvidia, ATI, Sony, and Cauldron Development. Clients can be downloaded only from the official Folding@home website or its commercial partners, and will only interact with Folding@home computer files. They upload and download data with Folding@home's data servers (over port 8080, with 80 as an alternate), and the communication is verified using 2048-bit digital signatures. While the client's graphical user interface (GUI) is open-source, the client is proprietary software, with security and scientific integrity cited as the reasons. However, this rationale for using proprietary software is disputed: while the license could be enforced retrospectively in the legal domain, it does not practically prevent the modification (also known as patching) of the executable binary files. Likewise, binary-only distribution does not prevent the malicious modification of executable binary code, either through a man-in-the-middle attack while it is being downloaded via the internet, or through the redistribution by a third party of binaries that have previously been modified either in their binary state (i.e. patched), or by decompiling and recompiling them after modification. These modifications are possible unless the binary files and the transport channel are signed and the recipient person or system is able to verify the digital signature, in which case unwarranted modifications should be detectable, but not always. Either way, since in the case of Folding@home the input data and output result processed by the client software are both digitally signed, the integrity of work can be verified independently from the integrity of the client software itself. Folding@home uses the Cosm software libraries for networking. Folding@home was launched on October 1, 2000, and was the first volunteer computing project aimed at bio-molecular systems. Its first client was a screensaver, which would run while the computer was not otherwise in use. In 2004, the Pande lab collaborated with David P. Anderson to test a supplemental client on the open-source BOINC (Berkeley Open Infrastructure for Network Computing) framework. This client was released to closed beta in April 2005; however, the method became unworkable and was shelved in June 2006.


Graphics processing units

The specialized hardware of
graphics processing unit A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobi ...
s (GPU) is designed to accelerate rendering of 3-D graphics applications such as video games and can significantly outperform CPUs for some types of calculations. GPUs are one of the most powerful and rapidly growing computing platforms, and many scientists and researchers are pursuing general-purpose computing on graphics processing units (GPGPU). However, GPU hardware is difficult to use for non-graphics tasks and usually requires significant algorithm restructuring and an advanced understanding of the underlying architecture. Such customization is challenging, more so to researchers with limited software development resources. Folding@home uses the
open-source OpenMM library, which uses a bridge design pattern with two application programming interface (API) levels to interface molecular simulation software to an underlying hardware architecture. With the addition of hardware optimizations, OpenMM-based GPU simulations need no significant modification but achieve performance nearly equal to hand-tuned GPU code, and greatly outperform CPU implementations.
Before 2010, the computing reliability of GPGPU consumer-grade hardware was largely unknown, and circumstantial evidence related to the lack of built-in error detection and correction in GPU memory raised reliability concerns. In the first large-scale test of GPU scientific accuracy, a 2010 study of over 20,000 hosts on the Folding@home network detected soft errors in the memory subsystems of two-thirds of the tested GPUs. These errors correlated strongly with board architecture, though the study concluded that reliable GPU computing was very feasible as long as attention is paid to hardware traits and to safeguards such as software-side error detection.
The first generation of Folding@home's GPU client (GPU1) was released to the public on October 2, 2006, delivering a 20–30 times speedup for some calculations over its CPU-based GROMACS counterparts. It was the first time GPUs had been used for either volunteer computing or major molecular dynamics calculations. GPU1 gave researchers significant knowledge and experience with the development of GPGPU software, but in response to scientific inaccuracies with DirectX, it was succeeded on April 10, 2008 by GPU2, the second generation of the client. Following the introduction of GPU2, GPU1 was officially retired on June 6 of that year. Compared to GPU1, GPU2 was more scientifically reliable and productive, ran on ATI and CUDA-enabled
Nvidia GPUs, and supported more advanced algorithms, larger proteins, and real-time visualization of the protein simulation. Following this, the third generation of Folding@home's GPU client (GPU3) was released on May 25, 2010. While backward compatible with GPU2, GPU3 was more stable, efficient, and flexible in its scientific abilities, and used OpenMM on top of an OpenCL framework. Although these GPU3 clients did not natively support the operating systems Linux and macOS, Linux users with Nvidia graphics cards were able to run them through the Wine software application. GPUs remain Folding@home's most powerful platform in FLOPS. As of November 2012, GPU clients accounted for 87% of the entire project's x86 FLOPS throughput. Native support for Nvidia and AMD graphics cards under Linux was introduced with FahCore 17, which uses OpenCL rather than CUDA.
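
The bridge design pattern mentioned above separates a high-level API used by simulation code from a low-level API implemented once per hardware back end. The sketch below shows the general shape of that pattern only. All class and method names are hypothetical, the "GPU" back end is simulated in plain Python, and none of this is OpenMM's real API.

```python
from abc import ABC, abstractmethod


class Platform(ABC):
    """Low-level API: implemented once for each hardware back end."""

    @abstractmethod
    def kinetic_energy(self, masses, velocities):
        ...


class CpuPlatform(Platform):
    def kinetic_energy(self, masses, velocities):
        # Straightforward loop, one particle at a time.
        return sum(0.5 * m * v * v for m, v in zip(masses, velocities))


class SimulatedGpuPlatform(Platform):
    def kinetic_energy(self, masses, velocities):
        # Stand-in for a device kernel: elementwise products, then one reduction.
        products = [m * v * v for m, v in zip(masses, velocities)]
        return 0.5 * sum(products)


class Simulation:
    """High-level API: simulation code talks only to this class."""

    def __init__(self, masses, velocities, platform: Platform):
        self.masses = masses
        self.velocities = velocities
        self.platform = platform

    def kinetic_energy(self):
        # Delegation across the bridge; the caller never sees the back end.
        return self.platform.kinetic_energy(self.masses, self.velocities)


for backend in (CpuPlatform(), SimulatedGpuPlatform()):
    sim = Simulation([1.0, 2.0], [2.0, 3.0], backend)
    print(type(backend).__name__, sim.kinetic_energy())  # 11.0 on both back ends
```

Swapping the back end changes nothing in the calling code, which is why simulation software written against the upper level needs no significant modification to run on new hardware.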


PlayStation 3

From March 2007 until November 2012, Folding@home took advantage of the computing power of PlayStation 3s. At the time of the client's inception, the console's main streaming Cell processor delivered a 20 times speed increase over PCs for some calculations, processing power which could not be found on other systems such as the Xbox 360. The PS3's high speed and efficiency introduced other opportunities for worthwhile optimizations according to Amdahl's law, and significantly changed the tradeoff between computing efficiency and overall accuracy, allowing the use of more complex molecular models at little added computing cost. This allowed Folding@home to run biomedical calculations that would have been otherwise infeasible computationally. The PS3 client was developed in a collaborative effort between Sony and the Pande lab and was first released as a standalone client on March 23, 2007. Its release made Folding@home the first volunteer computing project to use PS3s. On September 18 of the following year, the PS3 client became a channel of Life with PlayStation on its launch. At the time of its introduction, the client fit between a CPU's flexibility and a GPU's speed in the types of calculations it could perform. However, unlike clients running on personal computers, users were unable to perform other activities on their PS3 while running Folding@home. The PS3's uniform console environment made technical support easier and made Folding@home more user friendly. The PS3 also had the ability to stream data quickly to its GPU, which was used for real-time atomic-level visualizing of the current protein dynamics. Following discussions with the Pande lab, Sony ended support for the Folding@home PS3 client and other services available under Life with PlayStation on November 6, 2012. Over its lifetime of five years and seven months, more than 15 million users contributed over 100 million hours of computing to Folding@home, greatly assisting the project with disease research. Pande considered the PlayStation 3 client a "game changer" for the project.
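
Amdahl's law, referred to above, bounds the overall speedup by the share of the runtime that is actually accelerated. A minimal sketch follows; the 90% fraction is an invented example value, not a measurement from the PS3 client, and only the factor of 20 comes from the text.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of the runtime runs `factor` times faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must lie in [0, 1]")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / factor)


# Hypothetical workload: 90% of the runtime benefits from a 20x faster processor.
print(round(amdahl_speedup(0.9, 20.0), 2))  # 6.9
```

Under these assumed numbers the remaining 10% of the runtime dominates, which is why speeding up one part of a calculation makes the other parts worthwhile targets for optimization.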


Multi-core processing client

Folding@home can use the parallel computing abilities of modern multi-core processors. Using several CPU cores simultaneously allows a full simulation to be completed far faster: working together, the cores complete single work units proportionately faster than the standard uniprocessor client. This method is scientifically valuable because it enables much longer simulation trajectories to be performed in the same amount of time, and reduces the traditional difficulties of scaling a large simulation to many separate processors. A 2007 publication in the ''Journal of Molecular Biology'' relied on multi-core processing to simulate the folding of part of the villin protein for approximately 10 times longer than was possible with a single-processor client, in agreement with experimental folding rates.
In November 2006, first-generation symmetric multiprocessing (SMP) clients, referred to as SMP1, were publicly released for open beta testing. These clients used Message Passing Interface (MPI) communication protocols for parallel processing, as at that time the GROMACS cores were not designed to be used with multiple threads. This was the first time a volunteer computing project had used MPI. Although the clients performed well in Unix-based operating systems such as Linux and macOS, they were troublesome under Windows. On January 24, 2010, SMP2, the second generation of the SMP clients and the successor to SMP1, was released as an open beta and replaced the complex MPI with a more reliable thread-based implementation.
SMP2 supported a trial of a special category of ''bigadv'' work units, designed to simulate proteins that are unusually large and computationally intensive and have a great scientific priority. These units originally required a minimum of eight CPU cores, which was raised to sixteen on February 7, 2012. Along with these added hardware requirements over standard SMP2 work units, they required more system resources such as random-access memory (RAM) and Internet bandwidth. In return, users who ran them were rewarded with a 20% increase over SMP2's bonus point system. The bigadv category allowed Folding@home to run especially demanding simulations for long times that had formerly required use of supercomputing clusters and could not be performed anywhere else on Folding@home. Many users with hardware able to run bigadv units later had their hardware deemed ineligible for bigadv work units when the CPU core minimums were increased, leaving them able to run only the normal SMP work units.
This frustrated many users who had invested significant amounts of money into the program only to have their hardware become obsolete for bigadv purposes shortly afterward. As a result, Pande announced in January 2014 that the bigadv program would end on January 31, 2015.
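
The thread-based approach described above, in which several cores cooperate on one work unit, can be sketched as splitting a single energy calculation into chunks handled by a thread pool. Everything here is a simplified assumption: the toy inverse-distance "energy", the one-dimensional positions, and the function names are invented, and this is not how the GROMACS cores are implemented. In CPython the global interpreter lock also means this layout shows the structure rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def chunk_energy(pairs):
    """Toy pair interaction: sum of inverse distances for one chunk of particle pairs."""
    return sum(1.0 / abs(a - b) for a, b in pairs)


def total_energy(positions, n_threads=4):
    """Split all particle pairs of ONE work unit across threads, then combine the partial sums."""
    pairs = list(combinations(positions, 2))
    chunks = [pairs[i::n_threads] for i in range(n_threads)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return sum(pool.map(chunk_energy, chunks))


positions = [0.0, 1.0, 3.0, 7.0]                      # hypothetical 1-D coordinates
serial = chunk_energy(list(combinations(positions, 2)))
threaded = total_energy(positions, n_threads=2)
print(abs(serial - threaded) < 1e-12)                 # True: same result either way
```

Because the threads share one address space, no explicit message passing is needed to combine the partial results, which is the simplification the move away from MPI made possible.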


V7

The V7 client is the seventh and latest generation of the Folding@home client software, and is a full rewrite and unification of the prior clients for Windows, macOS, and Linux operating systems. It was released on March 22, 2012. Like its predecessors, V7 can run Folding@home in the background at a very low priority, allowing other applications to use CPU resources as they need. It is designed to make installation, start-up, and operation more user-friendly for novices, and to offer greater scientific flexibility to researchers than prior clients. V7 uses Trac for managing its bug tickets so that users can see its development process and provide feedback. V7 consists of four integrated elements. The user typically interacts with V7's open-source GUI, named FAHControl. This has Novice, Advanced, and Expert user interface modes, and has the ability to monitor, configure, and control many remote folding clients from one computer. FAHControl directs FAHClient, a back-end application that in turn manages each FAHSlot (or ''slot''). Each slot acts as a replacement for the formerly distinct Folding@home v6 uniprocessor, SMP, or GPU computer clients, as it can download, process, and upload work units independently. The FAHViewer function, modeled after the PS3's viewer, displays a real-time 3-D rendering, if available, of the protein currently being processed.
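
The layering described above, a front end that monitors clients, a back end that manages slots, and slots that each handle work units on their own, can be modeled in a few lines. This is a structural sketch only: the class names, the work-unit strings, and the "desktop" label are hypothetical, and the download, process, and upload steps are collapsed into one call. It does not reflect the real FAHControl or FAHClient interfaces.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Slot:
    """One compute resource (for example a CPU or a GPU) handling work units on its own."""
    kind: str
    finished: list = field(default_factory=list)

    def run(self, work_unit: str) -> None:
        # Download, process, and upload are collapsed into a single step here.
        self.finished.append(f"{work_unit} processed on {self.kind}")


@dataclass
class Client:
    """Back-end process that owns the slots and hands each one its next work unit."""
    slots: list = field(default_factory=list)

    def dispatch(self, queue: deque) -> None:
        for slot in self.slots:
            if queue:
                slot.run(queue.popleft())


@dataclass
class Control:
    """Front end that can monitor several clients, local or remote."""
    clients: dict = field(default_factory=dict)

    def status(self) -> dict:
        return {name: sum(len(slot.finished) for slot in client.slots)
                for name, client in self.clients.items()}


client = Client(slots=[Slot("cpu"), Slot("gpu")])
control = Control(clients={"desktop": client})
client.dispatch(deque(["wu-1", "wu-2", "wu-3"]))
print(control.status())  # {'desktop': 2}
```

Keeping the front end separate from the back end is what lets one controller watch many machines, while the slot abstraction replaces the formerly separate uniprocessor, SMP, and GPU clients.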


Google Chrome

In 2014, a client for the Google Chrome and Chromium web browsers was released, allowing users to run Folding@home in their web browser. The client used Google's Native Client (NaCl) feature on Chromium-based web browsers to run the Folding@home code at near-native speed in a sandbox on the user's machine. Due to the phasing out of NaCl and changes at Folding@home, the web client was permanently shut down in June 2019.


Android

In July 2015, a client for Android mobile phones was released on Google Play for devices running Android 4.4 KitKat or newer. On February 16, 2018, the Android client, which was offered in cooperation with Sony, was removed from Google Play. Plans were announced to offer an open-source alternative in the future.


Comparison to other molecular simulators

Rosetta@home is a volunteer computing project aimed at protein structure prediction and is one of the most accurate tertiary structure predictors. The conformational states from Rosetta's software can be used to initialize a Markov state model as starting points for Folding@home simulations. Conversely, structure prediction algorithms can be improved from thermodynamic and kinetic models and the sampling aspects of protein folding simulations. As Rosetta only tries to predict the final folded state, and not how folding proceeds, Rosetta@home and Folding@home are complementary and address very different molecular questions.
Anton is a special-purpose supercomputer built for molecular dynamics simulations. In October 2011, Anton and Folding@home were the two most powerful molecular dynamics systems. Anton is unique in its ability to produce single ultra-long computationally costly molecular trajectories, such as one in 2010 which reached the millisecond range. These long trajectories may be especially helpful for some types of biochemical problems. However, Anton does not use Markov state models (MSMs) for analysis. In 2011, the Pande lab constructed an MSM from two 100-µs Anton simulations and found alternative folding pathways that were not visible through Anton's traditional analysis. They concluded that there was little difference between MSMs constructed from a limited number of long trajectories and those assembled from many shorter trajectories. In June 2011, Folding@home added sampling of an Anton simulation in an effort to better determine how its methods compare to Anton's. However, unlike Folding@home's shorter trajectories, which are more amenable to volunteer computing and other parallelizing methods, longer trajectories do not require adaptive sampling to sufficiently sample the protein's phase space. Due to this, it is possible that a combination of Anton's and Folding@home's simulation methods would provide a more thorough sampling of this space.
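
The idea of assembling a Markov state model from many short trajectories can be illustrated by counting observed transitions between discrete states and normalizing each row. The sketch below is a bare maximum-likelihood estimate under simplifying assumptions: the states are already assigned, the three short trajectories are invented, and none of the refinements used in practice (such as enforcing reversibility or choosing the lag time carefully) are included. It is not the Pande lab's analysis code.

```python
def estimate_transition_matrix(trajectories, n_states, lag=1):
    """Row-normalized transition counts pooled over many independent trajectories."""
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        # Pair each frame with the frame `lag` steps later in the SAME trajectory.
        for start, end in zip(traj, traj[lag:]):
            counts[start][end] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Three hypothetical short runs over two conformational states (0 and 1).
short_runs = [[0, 0, 1], [1, 1, 0], [0, 1, 1]]
T = estimate_transition_matrix(short_runs, n_states=2)
print(T)  # each row is approximately [0.333, 0.667] for this toy data
```

Because transitions are only counted within a trajectory and then pooled, short runs started from different conformations contribute to one model without ever being joined end to end, which is what makes the approach suit volunteer computing.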


See also

* AlphaFold
* BOINC
* DreamLab, for use on smartphones
* Foldit
* List of volunteer computing projects
* Comparison of software for molecular mechanics modeling
* Molecular modeling on GPUs
* SETI@home
* Storage@home
* Molecule editor
* Volunteer computing
* World Community Grid

