Domain Generation Algorithm

Domain generation algorithms (DGAs) are algorithms seen in various families of malware that are used to periodically generate a large number of domain names that can be used as rendezvous points with their command and control servers. The large number of potential rendezvous points makes it difficult for law enforcement to effectively shut down botnets, since infected computers will attempt to contact some of these domain names every day to receive updates or commands. The use of public-key cryptography in malware code makes it infeasible for law enforcement and other actors to mimic commands from the malware controllers, as some worms will automatically reject any updates not signed by the malware controllers.

For example, an infected computer could create thousands of domain names such as ''www.<gibberish>.com'' and would attempt to contact a portion of these in order to receive an update or commands.

Embedding the DGA in the unobfuscated binary of the malware, instead of a list of domains previously generated by the command and control servers, protects against a strings dump that could be fed preemptively into a network blacklisting appliance to restrict outbound communication from infected hosts within an enterprise.

The technique was popularized by the worms Conficker.A and Conficker.B, which at first generated 250 domain names per day. Starting with Conficker.C, the malware would generate 50,000 domain names every day, of which it would attempt to contact 500, giving an infected machine a 1% chance of being updated every day if the malware controllers registered only one domain per day. To prevent infected computers from updating their malware, law enforcement would have needed to pre-register 50,000 new domain names every day. The botnet owner, by contrast, only has to register one or a few of the domains that each bot queries every day.

The technique has since been adopted by other malware authors. According to network security firm Damballa, the top five most prevalent DGA-based crimeware families as of 2011 were Conficker, Murofet, BankPatch, Bonnana and Bobax.

A DGA can also combine words from a dictionary to generate domains. These dictionaries can be hard-coded in malware or taken from a publicly accessible source. Domains generated by a dictionary DGA tend to be more difficult to detect because of their similarity to legitimate domains.
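
The Conficker.C figures quoted above can be checked with a short calculation. The sketch below assumes, as a simplification not stated in the text, that each bot picks its queried domains uniformly at random and without replacement from the daily pool. The function name `contact_probability` and its parameters are hypothetical.

```python
from math import comb


def contact_probability(pool: int, queried: int, registered: int) -> float:
    """Chance that a bot reaches at least one registered domain in a day.

    pool       -- number of domains the DGA generates per day
    queried    -- number of distinct domains one bot tries per day
    registered -- number of pool domains the controllers registered

    Assumes uniform random selection without replacement (an assumption).
    """
    if not (0 <= queried <= pool and 0 <= registered <= pool):
        raise ValueError("queried and registered must lie between 0 and pool")
    # P(miss every registered domain) = C(pool - registered, queried) / C(pool, queried)
    miss = comb(pool - registered, queried) / comb(pool, queried)
    return 1.0 - miss


if __name__ == "__main__":
    for registered in (1, 5, 50):
        p = contact_probability(50_000, 500, registered)
        print(f"{registered:>3} registered domain(s): {p:.4f}")
```

With a pool of 50,000 domains, 500 queries and a single registered domain, the function evaluates to approximately 0.01, which agrees with the 1% figure. Registering more domains raises the chance, while a defender still has to cover the whole pool to bring it to zero.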


Example

    def generate_domain(year: int, month: int, day: int) -> str:
        """Generate a domain name for the given date."""
        domain = ""

        for i in range(16):
            year = ((year ^ 8 * year) >> 11) ^ ((year & 0xFFFFFFF0) << 17)
            month = ((month ^ 4 * month) >> 25) ^ 16 * (month & 0xFFFFFFF8)
            day = ((day ^ (day << 13)) >> 19) ^ ((day & 0xFFFFFFFE) << 12)
            domain += chr(((year ^ month ^ day) % 25) + 97)

        return domain + ".com"

For example, on January 7, 2014, this method would generate the domain name intgmxdeadnxuyla.com, while the following day it would return axwscwsslmiagfah.com. This simple example was in fact used by malware like CryptoLocker, before it switched to a more sophisticated variant.
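
Because the output depends only on the date, anyone who has recovered the algorithm can compute upcoming domains in advance, for instance to prepare a blocklist. The sketch below repeats `generate_domain` so that it runs on its own. The helper `domains_for_window` is a hypothetical name introduced for illustration.

```python
from datetime import date, timedelta


def generate_domain(year: int, month: int, day: int) -> str:
    """Generate a domain name for the given date (same logic as the example)."""
    domain = ""
    for _ in range(16):
        year = ((year ^ 8 * year) >> 11) ^ ((year & 0xFFFFFFF0) << 17)
        month = ((month ^ 4 * month) >> 25) ^ 16 * (month & 0xFFFFFFF8)
        day = ((day ^ (day << 13)) >> 19) ^ ((day & 0xFFFFFFFE) << 12)
        domain += chr(((year ^ month ^ day) % 25) + 97)
    return domain + ".com"


def domains_for_window(start: date, days: int) -> dict[date, str]:
    """Map each date in [start, start + days) to the domain generated for it."""
    window = (start + timedelta(days=offset) for offset in range(days))
    return {d: generate_domain(d.year, d.month, d.day) for d in window}


if __name__ == "__main__":
    # Print the domains for three consecutive days starting 2014-01-07.
    for d, name in domains_for_window(date(2014, 1, 7), 3).items():
        print(d.isoformat(), name)
```

Since each character is derived with `% 25`, the labels only ever use the letters a to y. This particular generator yields one domain per day, so such a list stays small; a family that emits thousands of names per day makes the same precomputation far more costly to act on.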


Detection

DGA domain names can be blocked using blacklists, but the coverage of these blacklists is either poor (public blacklists) or wildly inconsistent (commercial vendor blacklists). Detection techniques fall into two main classes: reactionary and real-time. Reactionary detection relies on unsupervised clustering techniques and contextual information such as network NXDOMAIN responses, WHOIS information, and passive DNS to assess domain name legitimacy. Recent attempts at detecting DGA domain names with deep learning techniques have been extremely successful, with F1 scores of over 99%. These deep learning methods typically use long short-term memory (LSTM) and convolutional neural network (CNN) architectures, though deep word embeddings have shown great promise for detecting dictionary DGAs. However, these deep learning approaches can be vulnerable to adversarial techniques.
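
As a rough illustration of how NXDOMAIN responses can serve as contextual information, the sketch below flags clients whose lookups are dominated by failed resolutions of many distinct names. It is a simple threshold heuristic, not the clustering or deep learning methods described above. The `DnsRecord` layout, the function name and both thresholds are hypothetical and untuned.

```python
from collections import Counter, defaultdict
from typing import Iterable, NamedTuple

NXDOMAIN = 3  # DNS response code 3 ("Name Error") as defined in RFC 1035


class DnsRecord(NamedTuple):
    """Hypothetical, simplified log record; real DNS logs carry more fields."""
    client: str
    qname: str
    rcode: int


def suspicious_clients(
    records: Iterable[DnsRecord],
    min_distinct_failures: int = 20,  # illustrative threshold, not a tuned value
    min_failure_ratio: float = 0.5,   # illustrative threshold, not a tuned value
) -> list[str]:
    """Return clients with many distinct NXDOMAIN names and a high failure ratio."""
    total = Counter()               # all lookups per client
    failures = Counter()            # NXDOMAIN responses per client
    failed_names = defaultdict(set)  # distinct names that failed, per client

    for rec in records:
        total[rec.client] += 1
        if rec.rcode == NXDOMAIN:
            failures[rec.client] += 1
            failed_names[rec.client].add(rec.qname.lower())

    flagged = [
        client
        for client, names in failed_names.items()
        if len(names) >= min_distinct_failures
        and failures[client] / total[client] >= min_failure_ratio
    ]
    return sorted(flagged)
```

A host that works through a daily list of mostly unregistered DGA names would tend to stand out under such a rule, while an occasional mistyped name would not. In practice a signal like this would be combined with the other contextual sources mentioned above, such as WHOIS data and passive DNS.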


See also

* Zeus (Trojan horse)
* Srizbi botnet


Further reading

* Hongliang Liu, Yuriy Yuzifovich (2017-12-29). "A Death Match of Domain Generation Algorithms". Akamai Technologies. https://blogs.akamai.com/2018/01/a-death-match-of-domain-generation-algorithms.html. Retrieved 2019-03-15.
* DGAs in the Hands of Cyber-Criminals - Examining the state of the art in malware evasion techniques
* DGAs and Cyber-Criminals: A Case Study
* How Criminals Defend Their Rogue Networks, Abuse.ch