Artifact analysis for suspicious or malicious content

As a security analyst, you can look for threats that evaded detection by analyzing reconstructed artifacts, such as files and images. To understand connections between collaborators and artifacts, you can also investigate the links to and from these files and images.

Example - Using artifact analysis to find the source of an attack (patient zero)

John is a security analyst at Replay Industries. Several systems are infected despite all of the security measures that are in place. After he identifies and quarantines these systems, John needs to find out how they became infected and whether other assets are similarly compromised.

Packet recovery from an IP address

Starting with the IP addresses and the approximate time frame that is involved, John is able to use QRadar® Incident Forensics to recover the relevant packet data.

Forensics Recovery Dialog box — Figure 1. Recovery from an IP address

File analysis

Looking for executable content, John starts by using the file analysis capabilities in QRadar Incident Forensics. He can now see a list of all of the files, how often they were sent, whether they contain embedded files or scripts, and their entropy scores. John quickly sees an image file that QRadar Incident Forensics flagged both as suspect content and as having an embedded script.

The file entropy score, which measures the randomness of data and is used to find encrypted malware, and the entropy distribution also clearly show that a portion of the file is not what it should be. Further analysis proves that this file contains a new form of malware that slipped by existing security measures undetected and was responsible for the infected systems.

In the following diagram, entropy is used as an indicator of the variability of bits per byte. Because each character in a data unit consists of 1 byte, the entropy value indicates the variation of the characters and the compressibility of the data unit. Variations in the entropy values across a file might indicate that suspect content is hidden in it. For example, high entropy values might indicate that the data is stored encrypted or compressed, and lower values might indicate that the payload is decrypted at run time and stored in different sections.

File entropy graph that shows embedded scripts. — Figure 3. Example of file entropy graph that shows embedded scripts
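The entropy score and entropy distribution that are described here can be approximated with the standard Shannon entropy formula. The following Python sketch is illustrative only and is not the QRadar Incident Forensics implementation; the function names and the 256-byte window size are assumptions that are chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def entropy_distribution(data: bytes, window: int = 256) -> list:
    """Return the entropy of each consecutive window of the data.

    The window size is an assumption; a sharp change between neighboring
    windows is the kind of variation that might merit a closer look.
    """
    return [
        shannon_entropy(data[i:i + window])
        for i in range(0, len(data), window)
    ]
```

A region of repeated bytes scores close to 0 and a region in which every byte value is equally likely scores 8, so an image file with an appended encrypted payload would typically show a visible step in the per-window values.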

John now needs to understand where this file came from and who else might have it. John uses QRadar Incident Forensics to quickly find the web server that supplied the infected image file. The web page in question, a popular source of current news about a favorite NFL team, is compromised. Even though the website contains many images, only the one image that John found earlier by using file analysis contains the embedded malware.

Link analysis to visualize website communication

To determine what other systems might be affected, John uses link analysis to quickly visualize all of the websites that were viewed. Despite the large amount of traffic to the websites of companies that Replay Industries does business with, a small subset of accesses to the infected web host can clearly be seen. John analyzes these links to see what other servers on his network were used to access this web host.

In the graph, the nodes represent web pages and the arrows between the nodes represent the relationships or transactions between the web pages. John uses them to quickly assess traffic patterns and to see how documents were traversed. The larger the node, the more links the document has in its path. The larger the link arrow, the more times that link was used.

Figure 4. Example of a link analysis graph that highlights the relationships between web hosts
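The sizing rules for nodes and arrows can be sketched as simple counting over recovered transactions. The following Python example is a minimal illustration, not the product's algorithm; the record format of (source, target) pairs, the host names, and the function names are all hypothetical.

```python
from collections import Counter


def build_link_graph(transactions):
    """Count link uses (arrow size) and links per node (node size)."""
    link_uses = Counter(transactions)  # times each source -> target link was used
    node_links = Counter()
    for source, target in link_uses:   # each distinct link counts once per endpoint
        node_links[source] += 1
        node_links[target] += 1
    return node_links, link_uses


def sources_linking_to(link_uses, target_host):
    """List the hosts that have at least one link to target_host."""
    return sorted({source for source, target in link_uses if target == target_host})


# Hypothetical sample data: (internal host, web host that was accessed)
sample = [
    ("10.0.0.5", "news.example"),
    ("10.0.0.5", "news.example"),
    ("10.0.0.7", "news.example"),
    ("10.0.0.7", "partner.example"),
]
node_links, link_uses = build_link_graph(sample)
```

With this sample, `sources_linking_to(link_uses, "news.example")` lists the two internal hosts that contacted the web host, which mirrors the step where John identifies the servers that accessed the infected site.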

Because the web host is a popular NFL news site, it is not surprising that a number of other servers were in contact with it and are potentially affected.

Image analysis

To narrow down which servers downloaded the malicious image file, John switches to image analysis and can quickly see all of the image files that were sent or received.

John quickly confirms that all of his infected servers, and 2 servers that he was unaware of, accessed the compromised image file.
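The comparison in this step amounts to set operations over host lists. The following Python sketch assumes that the lists of hosts were already exported from the investigation; the function name, the result keys, and the addresses are hypothetical.

```python
def triage_hosts(site_visitors, file_downloaders, known_infected):
    """Compare hosts that visited the site, fetched the file, or are known infected."""
    visitors = set(site_visitors)
    downloaders = set(file_downloaders)
    known = set(known_infected)
    return {
        # Fetched the malicious file but were not yet known to be infected
        "newly_identified": downloaders - known,
        # Visited the website but never fetched the malicious file
        "visited_only": visitors - downloaders,
        # Known infected with no recorded download; the infection path is unexplained
        "infected_without_download": known - downloaders,
    }


# Hypothetical example data
result = triage_hosts(
    site_visitors=["10.0.0.5", "10.0.0.7", "10.0.0.8", "10.0.0.9"],
    file_downloaders=["10.0.0.5", "10.0.0.7", "10.0.0.8"],
    known_infected=["10.0.0.5"],
)
```

In this sample, the `newly_identified` entry holds the two hosts that would need to be quarantined in addition to the one that was already known.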

John also determines that several of the other servers that accessed the same website did not download the infected file. John now has the information that he needs to quarantine these 2 extra servers and to create a new file hash of the infected file that Replay Industries can upload and share with others on IBM® X-Force® Exchange.
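A file hash of a recovered artifact can be computed with any standard cryptographic hash library. The following Python sketch uses SHA-256 from the standard `hashlib` module. The choice of SHA-256 is an assumption, because this topic does not state which hash type is used, and the function names are hypothetical.

```python
import hashlib


def sha256_of_stream(stream, chunk_size=65536):
    """Hash a binary file-like object in chunks so that large files fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path):
    """Return the SHA-256 hex digest of the file at path."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle)
```

The resulting hex digest identifies the exact file contents, so any change to the file, even a single byte, produces a different hash.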