Friday, 8 July 2016

Is Malware Changing How It Hides Its Comms?

It might sound a bit obvious, but for malware to capitalise on its ill-gotten gains it has to communicate with its criminal masters. That very act of phoning home can give away the presence of the malware, and once you know it's there you can trace where "home" is. Not surprisingly, then, malware developers began using encryption (often simple SSL) to hide their communications amongst the mass of other data that flows on and off any networked system.

Encrypting malware communications has never been entirely successful, as the data still presents patterns that can be singled out (more on this in some upcoming research we are due to publish). We have already seen attempts to determine how malware uses TLS, and organisations like SANS have for some time provided advice on how to spot subverted SSL/TLS traffic. However, encryption does make it a great deal more difficult to spot, and hence locate, malware based on network traffic analysis.

Research just published, entitled "Deciphering Malware’s use of TLS (without Decryption)", shows how you can now detect malware traffic that uses TLS in an attempt to hide. In essence, it relies upon extracting the following features:

  1. Flow metadata of the sort typically collected by network analysis packages
  2. Sequence of Packet Lengths and Times (SPLT)
  3. Byte distribution: the byte values and their probability distributions are used by the machine learning algorithms to spot abnormalities
  4. Unencrypted TLS header information, which is available because TLS is implemented on top of other protocols (e.g. port numbers)
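
As a rough illustration of features 2 and 3, the sketch below computes a byte distribution and an SPLT from packets that have already been captured. It is not the paper's code: the function names, the `(timestamp, length)` input format and the `max_packets` cut-off are all assumptions made for this example.

```python
from collections import Counter


def byte_distribution(payload):
    """Return a 256-entry list: the empirical probability of each byte value."""
    if not payload:
        return [0.0] * 256
    counts = Counter(payload)  # iterating over bytes yields ints in 0..255
    total = len(payload)
    return [counts.get(value, 0) / total for value in range(256)]


def splt(packets, max_packets=50):
    """Build a Sequence of Packet Lengths and Times.

    packets: list of (timestamp_seconds, payload_length) tuples in arrival order.
    max_packets: assumed cut-off, chosen here purely for illustration.
    Returns (lengths, times), where times[i] is the gap since the previous packet.
    """
    head = packets[:max_packets]
    lengths = [length for _, length in head]
    gaps = [cur[0] - prev[0] for prev, cur in zip(head, head[1:])]
    times = ([0.0] + gaps) if head else []
    return lengths, times


if __name__ == "__main__":
    flow = [(0.0, 100), (0.5, 200), (2.0, 50)]  # made-up packets
    print(splt(flow))
    print(byte_distribution(b"\x00\x00\x01\x02")[:3])
```

Neither feature needs the payload to be decrypted: packet sizes and timings are visible on the wire, and the byte distribution is taken over whatever bytes are observed.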

The research noted that the way malware uses TLS clients is noticeably different from typical enterprise clients.

Even the simple "Hello" dialogues were different enough to raise a warning. The paper includes a table of several very useful characteristics for various common pieces of malware.
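
To show why the "Hello" exchange is so revealing, here is a minimal sketch that pulls the offered cipher suites and extension types out of a ClientHello, which is sent in the clear before any encryption starts. It assumes the whole ClientHello sits in a single, unfragmented TLS record and does only light validation; `parse_client_hello` and `sample_client_hello` are hypothetical names, and the sample record is synthetic rather than taken from any real malware.

```python
import struct


def parse_client_hello(record):
    """Extract unencrypted fields from a ClientHello held in one TLS record."""
    if len(record) < 5 or record[0] != 0x16:
        raise ValueError("not a TLS handshake record")
    record_len = struct.unpack("!H", record[3:5])[0]
    body = record[5:5 + record_len]
    if len(body) < 4 or body[0] != 0x01:
        raise ValueError("not a ClientHello")
    hello = body[4:4 + int.from_bytes(body[1:4], "big")]

    version = struct.unpack("!H", hello[0:2])[0]
    pos = 2 + 32                      # skip the version and the 32-byte random
    pos += 1 + hello[pos]             # skip the session id
    suites_len = struct.unpack("!H", hello[pos:pos + 2])[0]
    pos += 2
    suites = [struct.unpack("!H", hello[i:i + 2])[0]
              for i in range(pos, pos + suites_len, 2)]
    pos += suites_len
    pos += 1 + hello[pos]             # skip the compression methods

    extensions = []
    if pos + 2 <= len(hello):         # extensions are optional
        end = pos + 2 + struct.unpack("!H", hello[pos:pos + 2])[0]
        pos += 2
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack("!HH", hello[pos:pos + 4])
            extensions.append(ext_type)
            pos += 4 + ext_len
    return {"version": version, "cipher_suites": suites, "extensions": extensions}


def sample_client_hello():
    """Assemble a small synthetic ClientHello record, for demonstration only."""
    suites = struct.pack("!HH", 0xC02F, 0x0035)
    extensions = (
        struct.pack("!HH", 0x000A, 4) + b"\x00\x02\x00\x17"  # supported_groups
        + struct.pack("!HH", 0x000B, 2) + b"\x01\x00"        # ec_point_formats
    )
    hello = (
        struct.pack("!H", 0x0303) + bytes(32)  # version and an all-zero random
        + b"\x00"                              # empty session id
        + struct.pack("!H", len(suites)) + suites
        + b"\x01\x00"                          # one compression method: null
        + struct.pack("!H", len(extensions)) + extensions
    )
    handshake = b"\x01" + len(hello).to_bytes(3, "big") + hello
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


if __name__ == "__main__":
    print(parse_client_hello(sample_client_hello()))
```

The list of offered cipher suites and the set of advertised extensions are the kind of client characteristics that can be compared between known-good enterprise software and suspect traffic.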

Similar findings for the TLS servers are reported in the same way. Using these characteristics, the team produced an algorithm that could not only spot the malicious communications but also assign them to a likely family of malware. Whilst the attribution was not foolproof, the results presented suggest it is a major step forward in unmasking these malicious communications.

Sadly, the criminals have known for some time that this was coming. It is part of the usual arms race. What we are beginning to see is the use of steganography to further obscure not just the malware but also its communications. Work is already under way to analyse and counter this. I would strongly recommend that anyone who wants to keep abreast of this work visits our website on Criminal Use of Information Hiding, an initiative involving academics such as me and Europol's European Cybercrime Centre (EC3).

The arms race will continue, but we have a fair idea of the direction in which it is headed.