A domain of unknown function (DUF) is a protein domain that has no characterised function. These families have been collected together in the Pfam database using the prefix DUF followed by a number, with examples being DUF2992 and DUF1220. As of 2019, there were almost 4,000 DUF families within the Pfam database, representing over 22% of known families. Some DUFs are not named using this nomenclature due to popular usage but are nevertheless DUFs.
The DUF designation is tentative, and such families tend to be renamed to a more specific name (or merged into an existing domain) after a function is identified.
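The naming convention described above (the prefix DUF followed by a number) is simple enough to check mechanically. The following Python sketch is illustrative only: the function name `parse_duf_id` and the strictness of the pattern (uppercase prefix, digits only, nothing before or after) are assumptions made for this example, not rules published by Pfam.

```python
import re
from typing import Optional

# Assumed pattern: the literal prefix "DUF" followed by one or more digits.
_DUF_PATTERN = re.compile(r"DUF(\d+)")


def parse_duf_id(name: str) -> Optional[int]:
    """Return the numeric part of a DUF-style name, or None if it does not match."""
    match = _DUF_PATTERN.fullmatch(name)
    return int(match.group(1)) if match else None


# Names taken from the article, plus one prefix with no number.
for name in ["DUF2992", "DUF1220", "GGDEF", "DUF"]:
    print(name, parse_duf_id(name))
```

A name-based check like this one would miss families that are DUFs but are known by a popular name instead.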
History
The DUF naming scheme was introduced by Chris Ponting, through the addition of DUF1 and DUF2 to the SMART database. These two domains were found to be widely distributed in bacterial signaling proteins. Subsequently, the functions of these domains were identified and they have since been renamed as the GGDEF domain and EAL domain respectively.
Characterisation
Structural genomics programmes have attempted to understand the function of DUFs through structure determination. The structures of over 250 DUF families have been solved. This work (2009) showed that about two thirds of DUF families had a structure similar to a previously solved one and were therefore likely to be divergent members of existing protein superfamilies, whereas about one third possessed a novel protein fold.
Some DUF families share remote sequence homology with domains that have a characterised function. Computational methods can be used to detect these relationships. A 2015 study was able to assign 20% of the DUFs to characterised structural superfamilies.
Pfam also continuously performs this assignment, with manual verification, through its "clan" superfamily entries.
Frequency and conservation
More than 20% of all protein domains were annotated as DUFs in 2013. About 2,700 DUFs are found in bacteria compared with just over 1,500 in eukaryotes. Over 800 DUFs are shared between bacteria and eukaryotes, and about 300 of these are also present in archaea. A total of 2,786 bacterial Pfam domains even occur in animals, including 320 DUFs.
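As a rough worked example, the rounded 2013 figures above can be combined by inclusion-exclusion. The sketch below treats the approximate counts ("about 2,700", "just over 1,500", "over 800") as exact numbers, which is an assumption, so the results are order-of-magnitude estimates rather than reported values. The variable names are illustrative.

```python
# Rounded counts from the text, treated as exact for illustration only.
bacterial_dufs = 2700    # "about 2,700" DUFs found in bacteria
eukaryotic_dufs = 1500   # "just over 1,500" DUFs found in eukaryotes
shared_dufs = 800        # "over 800" DUFs shared by both

# Inclusion-exclusion: DUFs found in bacteria, eukaryotes, or both.
either = bacterial_dufs + eukaryotic_dufs - shared_dufs
bacteria_only = bacterial_dufs - shared_dufs
eukaryote_only = eukaryotic_dufs - shared_dufs

# Fraction of the 2,786 bacterial Pfam domains occurring in animals that are DUFs.
duf_share_in_animals = 320 / 2786

print(either, bacteria_only, eukaryote_only)
print(f"{duf_share_in_animals:.1%}")
```

Under these assumptions, most bacterial DUFs have no eukaryotic counterpart, while roughly half of the eukaryotic DUFs are shared with bacteria.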
Role in biology
Many DUFs are highly conserved, indicating an important role in biology. However, many such DUFs are not essential, hence their biological role often remains unknown. For instance, DUF143 is present in most bacterial and eukaryotic genomes. However, when it was deleted in ''Escherichia coli'', no obvious phenotype was detected. Later it was shown that the proteins that contain DUF143 are ribosomal silencing factors that block the assembly of the two ribosomal subunits. While this function is not essential, it helps the cells to adapt to low nutrient conditions by shutting down protein biosynthesis. As a result, these proteins and the DUF only become relevant when the cells starve. It is thus believed that many DUFs (or proteins of unknown function, PUFs) are only required under certain conditions.
Essential DUFs
Goodacre et al. identified 238 DUFs in 355 essential proteins (in 16 model bacterial species), most of which represent single-domain proteins, clearly establishing the biological essentiality of DUFs. These DUFs are called "essential DUFs" or eDUFs.
External links
List of Pfam families beginning with the letter D, including DUF families