As detection mechanisms grow more sophisticated, malware developers try new methods to evade them. Recently, there has been a growing trend of exfiltrating data and issuing commands to malware via the DNS protocol.
DNS Command & Control and DNS exfiltration can be successful because DNS is an integral part of the internet’s infrastructure, so outbound traffic on DNS port 53 must be allowed from a large number of hosts and servers on the network. While most traffic analyzers look at the reputation of a DNS query’s name or the returned IP address, many still ignore misuse and abuse of the DNS protocol itself. Abusing the protocol gives a victim machine, typically the client, a way to communicate with the operator’s command and control, typically the server, often without ever creating a direct connection between the two. As you will see when we scrutinize the DNS traffic for anomalies, these operations are easy to identify, and the anomalies can serve as the basis for building detections.
- DNS can be used for the covert transfer of data and file exfiltration
- Abnormal DNS traffic should be scrutinized for possible misuse at the perimeter
In this post we will explore the use of DNS as a covert channel of data transfer. We will briefly look at the protocol itself to build the base on which to describe the abuse and some limitations an attacker may face. From there, we will delve into several techniques that can be used to transfer data using DNS. Next, we will dig into two malware families that use DNS for remote access operations. Finally, we will recap the detection methods and checks that can be implemented to protect networks from the abuses described.
How does DNS work?
There are many DNS servers handling DNS queries over the Internet. The Internet’s DNS servers are organized hierarchically and divided into zones.
This organizational structure gives us the ability to find the right DNS server efficiently. For example, if someone on the Internet is searching for an address that belongs to Microsoft, their DNS server works through the hierarchy, starting from the root DNS servers, until it is referred to Microsoft’s DNS server. That server then looks up the specific name and responds with its IP address. This is all done by the intermediate DNS servers, and the original client is not involved in the process.
How does it help or hurt the attacker?
The simplest form of this attack is for the attacker to register a domain and configure a DNS server as the authoritative server for it. Then, every time the victim (or anyone on the internet) queries a subdomain of the registered domain, the query is eventually delivered to the attacker’s DNS server. The data sent from the client (the infected machine) travels through the DNS hierarchy, and no direct connection is made between it and the C&C. In short, DNS is used as a proxy between the bot and its operator.
For example, if the attacker buys the domain “malicious.com” and configures a DNS server to handle that domain, and the victim then queries DNS for “secretmessage.malicious.com”, the query will eventually reach the attacker’s server. That is how “secretmessage” is exfiltrated.
One limitation of this method is that a hostname can only contain a restricted set of text characters (letters, digits, and hyphens). For example, if the attacker would like to exfiltrate a binary file, he will have to encode the file data from binary to alphanumeric text, for example with Base58 encoding. (The Base58 alphabet is similar to the Base64 alphabet, but without the six characters ‘I’ (capital i), ‘l’ (lowercase L), ‘O’, ‘0’, ‘/’, ‘+’ and without the equals sign for padding.) Once the data is encoded, the attacker sends it and decodes it on the receiving end.
Apart from the restriction to text characters, the queried hostname is limited in size: 255 bytes in total, with each dot-separated label limited to 63 bytes. This limitation requires the attacker to split the file data across multiple queries. Therefore, sending a large file in a reasonable time creates a burst of DNS queries to the same domain that stands out from normal DNS traffic.
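To get a feel for the numbers, the size limit can be turned into a rough estimate of how many queries a file needs. This is only a back-of-the-envelope sketch under our own assumptions: the function names are hypothetical, 253 characters is the usual text-form equivalent of the 255-byte name limit, each label is capped at the 63 bytes DNS allows, and per-message overhead such as counters is ignored.

```python
import math

MAX_NAME_CHARS = 253   # text form of the 255-byte wire limit
MAX_LABEL_CHARS = 63   # limit for each dot-separated label


def payload_chars_per_query(domain: str) -> int:
    """Encoded characters that fit in front of `domain` in one query name."""
    # Budget for the payload labels and the dots between them,
    # after reserving ".<domain>" at the end of the name.
    budget = MAX_NAME_CHARS - len(domain) - 1
    if budget <= 0:
        return 0
    # k labels need k - 1 separator dots; use the fewest labels that fit.
    labels = math.ceil((budget + 1) / (MAX_LABEL_CHARS + 1))
    return budget - (labels - 1)


def bytes_per_query(domain: str, alphabet_size: int = 58) -> int:
    """Approximate raw bytes carried per query for a given encoding alphabet."""
    bits_per_char = math.log2(alphabet_size)
    return int(payload_chars_per_query(domain) * bits_per_char // 8)


def queries_needed(file_size: int, domain: str, alphabet_size: int = 58) -> int:
    """Rough minimum number of queries needed to move `file_size` bytes."""
    return math.ceil(file_size / bytes_per_query(domain, alphabet_size))


if __name__ == "__main__":
    domain = "mymaldomain.com"
    print(payload_chars_per_query(domain))       # 234
    print(bytes_per_query(domain))               # 171
    print(queries_needed(1024 * 1024, domain))   # 6133
```

Under these assumptions, a 1 MiB file sent with a Base58-sized alphabet needs more than six thousand queries to a single domain, which is the kind of burst described above.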
DNS caching is another limitation that can interfere with DNS exfiltration. The data may contain duplicate blocks, in which case the encoder will unknowingly produce two identical queries that are sent within a short period of time. The second of these identical queries is answered from the cache of one of the intermediate DNS servers and never reaches the attacker. This can cause data to be out of order or missing in the transferred file.
From the attacker’s perspective, there are two complementary methods to deal with this limitation. The first is to set the TTL of the domain’s records to a minimum (how low depends on the attacker’s rate of exfiltration). The second is to insert an increasing counter or timestamp into each message before encoding, so that no two messages are encoded the same. This also allows the other side to order the messages properly and, in fancier implementations, even to request retransmission of lost packets.
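The counter idea can be illustrated with a small sketch, which is also useful for analysts who need to reassemble captured traffic. Everything here is an assumption made for illustration: the function names, the 4-byte big-endian counter, and the use of Base32 (chosen only because it is in the Python standard library and survives case-insensitive DNS handling; it is not the Base58 scheme mentioned earlier).

```python
import base64
import struct


def encode_chunk(seq: int, chunk: bytes) -> str:
    """Prefix a 32-bit counter, then Base32-encode into one DNS-safe label.

    Keep `chunk` at 35 bytes or less so the label stays within 63 characters.
    """
    framed = struct.pack(">I", seq) + chunk
    return base64.b32encode(framed).decode("ascii").rstrip("=").lower()


def decode_label(label: str):
    """Reverse encode_chunk: return (sequence number, original chunk)."""
    padded = label.upper() + "=" * (-len(label) % 8)
    framed = base64.b32decode(padded)
    (seq,) = struct.unpack(">I", framed[:4])
    return seq, framed[4:]


def reassemble(labels):
    """Order chunks by counter and report which sequence numbers are missing."""
    chunks = dict(decode_label(label) for label in labels)
    if not chunks:
        return b"", []
    missing = [i for i in range(max(chunks) + 1) if i not in chunks]
    data = b"".join(chunks[i] for i in sorted(chunks))
    return data, missing
```

Because the counter is part of the encoded bytes, two identical data blocks produce different labels, so neither is answered from a cache, and the gaps in the sequence tell the receiver which messages were lost.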
What does it look like?
To show what DNS exfiltration looks like in network traffic, and how easily a human can spot it, we will show a few examples using open source tools and one example using a script we created ourselves.
The first exfiltration example will be using the script we wrote ourselves, assuming the attacker’s registered domain is mymaldomain.com:
Notice that the number of requests to mymaldomain.com is quite high. Also, each request is to a different subdomain, and each subdomain is a long string with high entropy. Compared to an average DNS query, the overall length of the domain name is long.
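The “high entropy” observation can be made measurable with Shannon entropy over the characters of the subdomain. The following is a minimal sketch; the function names are ours, and any threshold you compare the result against is something you would need to tune on your own traffic.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of `text`, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def subdomain_of(qname: str, domain: str) -> str:
    """Part of `qname` in front of `domain`, or '' if it is not a subdomain."""
    qname = qname.lower().rstrip(".")
    suffix = "." + domain.lower().rstrip(".")
    return qname[: -len(suffix)] if qname.endswith(suffix) else ""


def subdomain_entropy(qname: str, domain: str) -> float:
    """Entropy of the subdomain characters, ignoring the separator dots."""
    return shannon_entropy(subdomain_of(qname, domain).replace(".", ""))


if __name__ == "__main__":
    print(subdomain_entropy("www.mymaldomain.com", "mymaldomain.com"))
    print(subdomain_entropy("0123456789abcdef.mymaldomain.com", "mymaldomain.com"))
```

A short, repetitive label such as “www” scores at or near zero, while a label in which every character is different scores much higher. Entropy should be combined with length and request volume, since short random-looking labels also occur in legitimate traffic.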
Next, let’s look at another example using an open source tool called dns-exfiltration.[i] In this example, the attacker controls the local DNS server or is able to change the victim’s DNS server to a malicious one. This allows the attacker to send even non-valid DNS queries.
Analyzing the client machine and its traffic, we immediately observed three anomalous indicators. The first was the presence of equals signs, which are not allowed in hostnames. The second was the high variability between subdomains, together with the high entropy mentioned previously. The third was that the DNS server’s IP address was different from the one used by all of the organization’s other workstations (in this traffic the DNS server setting was changed on the client, rather than the legitimate server being taken over by the attacker).
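The first indicator, characters that do not belong in a hostname, is simple to check mechanically. Below is a minimal sketch with a hypothetical function name. It applies the classic hostname rule (letters, digits, and hyphens, with no leading or trailing hyphen, 1 to 63 characters per label). Note that some legitimate non-hostname records use underscores, so a real deployment would likely need an allow-list.

```python
import re
from typing import List

# One hostname label: letters, digits, hyphens; no hyphen at either end.
LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def invalid_labels(qname: str) -> List[str]:
    """Return the labels of `qname` that break the hostname rule."""
    labels = qname.rstrip(".").split(".")
    return [label for label in labels if not LABEL_RE.fullmatch(label)]


if __name__ == "__main__":
    print(invalid_labels("www.example.com"))        # []
    print(invalid_labels("aGVsbG8=.example.com"))   # ['aGVsbG8=']
```

Base64 output containing ‘=’, ‘/’ or ‘+’ is flagged immediately by this check.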
How does the dns-exfiltration tool prepare a file for exfiltration? First it breaks the file into blocks, then it encodes each block in base64 and sends it to the domain defined in the DNS_ZONE variable.
Another interesting concept: if the attacker can sniff DNS traffic from the infected host, there is no need to send the data to a specific domain. The only thing the attacker needs to do is choose an encoding method and a way to pick the data out of the rest of the DNS traffic. This method is demonstrated using an open source tool called dnscat2:[v]
Notice the hard-coded “dnscat” prefix in the subdomain. It is there so that the sniffer can easily identify the exfiltrated data. The dnscat2 tool allows this prefix to be replaced with any other unique string of the attacker’s choice.
If we look at the number of DNS requests and keep only the ones with high entropy, an obvious spike appears. In contrast to the tools we showed before, the usual hostname conventions don’t have to be kept at all (in an internal LAN scenario, for example).
We can identify the above examples mainly due to the number of requests made in a short period of time. A careful attacker can quite easily work around this by pacing the request rate. Of course, for large data exfiltration this takes much longer and is hardly practical, but for C&C operations it’s quite feasible.
DNS Command & Control with Other DNS Type Records
Communicating with a C&C server over DNS requires two-way communication, meaning the server must find a way to transfer commands to the client, something DNS was not designed for. To do this, the attacker uses the query to send data from the victim to the server and the response to send data in the opposite direction. DNS TXT records are often a good solution for this, as they can contain free-form plaintext data, but any type of record can be used.
One example is the file-infecting malware that we call WTimeRAT, which was compiled around 2012 and which we named after the filename of its Remote Access Tool component, ‘wtime32.dll’. WTime starts off using a typical DNS A record query to discover the IP address currently assigned to its C&C. But rather than following up with a reverse-shell TCP connection for two-way communication with the C&C IP address, WTimeRAT sends a uniquely named DNS CNAME query. A CNAME query is a special type of DNS query in which the response to the queried name is another name. This unique CNAME query is sent directly to the resolved C&C IP address and not to the victim’s configured DNS server, thus bypassing all hierarchical registration authorities and any intermediary DNS servers that would not know how to answer this unique name request.
Within the DNS CNAME query the Name field contains a customized domain name. This customized domain is an obfuscated version of the victim’s machine name followed by the major and minor Windows version numbers, in the following format:
The victim’s machine name is obfuscated by three bitwise operations (ADD 0x03, XOR 0x03, ROR 0x03) before being encoded with Base64 using the URL-safe alphabet and finally prepended to the modified version numbers. All these actions result in a unique per-victim domain name request. For the victim shown, the unique domain request is “imnqa-mJ6MgqKmrIKipqLAYHBgbGJ6bGxugnxg.g.r”.
Because this particular DNS request is a CNAME query, the C&C server must respond with another domain name. The server’s response is another encoded domain, “LS4.g.s”. Using the reverse of the encoding scheme, WTime decodes the server’s responses and interprets them as remote commands telling it what to do next on the victim machine. When the ‘LS4’ component of the domain name is decoded, it results in the ASCII string ‘go’. In an effort to stay in accordance with the DNS protocol, the server attaches an ‘A’ record response for the “LS4.g.s” domain to the CNAME response.
Another great example that does not use TXT records was found in the recent Ismdoor trojan, described in a blog[vi] by Dennis Schwarz. These samples communicate with the C&C server using a rich mechanism with many different message formats. It uses AAAA DNS requests in this format: ..dr..c2.com.
The server responds with data packed into the IPv6 address of the AAAA response; for example, 4f6b:2020:2020:2020:2020:2020:2020:2020 decodes to “Ok” followed by space padding (0x20).
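Decoding such a response only requires treating the 16 bytes of the IPv6 address as text. Here is a minimal sketch using Python’s standard ipaddress module. The function name is ours, and treating trailing spaces as padding is an assumption based on the example above, where the bytes 0x4f and 0x6b are the ASCII characters ‘O’ and ‘k’.

```python
import ipaddress


def decode_aaaa(address: str) -> str:
    """Interpret the 16 raw bytes of an IPv6 address as space-padded ASCII."""
    raw = ipaddress.IPv6Address(address).packed
    return raw.decode("ascii", errors="replace").rstrip(" ")


if __name__ == "__main__":
    print(decode_aaaa("4f6b:2020:2020:2020:2020:2020:2020:2020"))   # Ok
```

Running every AAAA answer through a check like this, and flagging answers that decode to mostly printable ASCII, is one possible way to surface this style of channel.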
The malware implements an entire transport layer on top of DNS, which allows the author to maintain a continuous connection with the C&C and provides what is essentially full reverse-shell functionality. See the referenced blog for all implemented functions.
The use of IPv6 records is another blind spot for DNS monitoring, since they are rarely used. Still, with the techniques provided in this article, Ismdoor clearly stands out from regular DNS traffic and is detectable.
On February 8, a report was published detailing UDPOS, a point-of-sale malware that leverages these techniques for command and control activity. These current events further demonstrate that the techniques described here are in active use by malware authors and reinforce our expectation that they will continue to be leveraged in the future.
The most important takeaway from the examples above is how to detect these behaviors. DNS exfiltration can be detected by checking for the following:
- Count subdomains per domain and raise a flag when a domain’s count is well above the average
- Validate that the rate of DNS requests per domain fits the average rate
- Check the entropy of each subdomain and domain
- Check DNS requests for non-valid characters in the subdomain or the domain
- Verify that the DNS server in use is the one used by most of the organization
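Two of these checks, the subdomain count and the DNS server check, can be sketched over a list of logged queries. This is an illustration under stated assumptions, not a production detector: the function names, the `factor` and `min_share` thresholds, and the naive “last two labels” way of finding the registered domain are all ours (real code should use the Public Suffix List), and rate checks would additionally need timestamps.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Set, Tuple


def registered_domain(qname: str) -> str:
    """Naive guess at the registered domain: the last two labels."""
    labels = qname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def noisy_domains(qnames: Iterable[str], factor: float = 3.0) -> List[str]:
    """Domains whose unique-subdomain count exceeds `factor` times the average."""
    subdomains: Dict[str, Set[str]] = defaultdict(set)
    for qname in qnames:
        name = qname.lower().rstrip(".")
        domain = registered_domain(name)
        subdomains[domain].add(name[: -len(domain)].rstrip("."))
    if not subdomains:
        return []
    counts = {domain: len(subs) for domain, subs in subdomains.items()}
    average = mean(counts.values())
    return sorted(d for d, count in counts.items() if count > factor * average)


def unusual_resolvers(flows: Iterable[Tuple[str, str]], min_share: float = 0.1) -> List[str]:
    """Clients using a DNS server that fewer than `min_share` of clients use.

    `flows` is an iterable of (client_ip, dns_server_ip) pairs.
    """
    clients_by_server: Dict[str, Set[str]] = defaultdict(set)
    for client, server in flows:
        clients_by_server[server].add(client)
    all_clients: Set[str] = set().union(*clients_by_server.values())
    flagged: Set[str] = set()
    for clients in clients_by_server.values():
        if len(clients) / len(all_clients) < min_share:
            flagged |= clients
    return sorted(flagged)
```

For example, feeding `noisy_domains` fifty distinct subdomains of mymaldomain.com mixed with single lookups for ten ordinary domains flags only mymaldomain.com, and `unusual_resolvers` flags the one workstation whose DNS server differs from everyone else’s.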
– The Threat Research Team