New sample in from the NYC honeypot – 50/54 detections on VirusTotal, so let’s take a look.
UPX is a popular packer used more for compression than security. Packing the malware does obfuscate it, but typically UPX isn’t very hard to figure out compared with certain other packers. We can actually see some stuff “leaking” through the packing:
At the end we see a tiny number of function names, along with .dlls such as WS2_32.dll and others. Along the way I saw bits and pieces of a string that suggest that this malware might construct emails, so maybe this malware is spam-focused.
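A quick way to check for UPX before reaching for a dedicated tool is to look for its default markers in the raw file. This is only a rough sketch: it assumes a stock UPX build, which names its sections UPX0 and UPX1 and leaves a "UPX!" magic value in the file. A modified packer can rename or strip all of these, so an empty result proves nothing. The function names here are my own.

```python
# Default marker strings left behind by an unmodified UPX build.
UPX_MARKERS = (b"UPX0", b"UPX1", b"UPX!")


def find_upx_markers(data: bytes) -> list[str]:
    """Return the default UPX markers present in a raw file image."""
    return [marker.decode("ascii") for marker in UPX_MARKERS if marker in data]


def check_file(path: str) -> list[str]:
    """Read a file from disk and report which UPX markers it contains."""
    with open(path, "rb") as handle:
        return find_upx_markers(handle.read())
```

Calling check_file on the packed sample should list the markers if the packer was left at its defaults.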
PEiD confirms this, and also gives us some other info like the entrypoint:
There are various ways you can approach unpacking. Something like UPX can probably be unpacked with an automated tool (which I actually end up doing later on in this analysis), but I actually like unpacking things manually. One thing you can do is open the file in IDA and see if there are any “far” jumps within the code, as this is probably the start of the unpacked code. In this sample, the entry point of the malware was at 4098A0, but a little bit down from there at 409A2C I see this:
Sort of strange to see a jump to a place relatively far away from where we are now. I opened this in Olly and set a memory breakpoint (on access) at 404C50. After this breakpoint was hit the first time and I noticed that code started to appear where there was previously just meaningless data, I set a regular breakpoint and then came back to this area once that was hit:
I’m using a plugin called Ollydump to dump the unpacked process:
We still see some of the same UPX-related junk in the beginning, but looking through the strings, we start to see some really interesting (and unobfuscated) stuff:
Now we can really take a look through this thing.
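For anyone following along without the strings utility handy, here is a minimal stand-in. It only pulls runs of printable ASCII of at least four characters (the minimum length is my choice), so it will miss UTF-16 strings that the real tool can find with the right switches.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Feeding it the bytes of the dumped executable gives a list that can be searched or grepped like normal strings output.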
Running strings on the unpacked sample, one of the first things we come across are what appear to be really bad password choices:
There are some that on first glance appear like they might be not completely horrible choices like “baseball”, but those that look like they might be OK are actually just some silly keyboard patterns, such as “qazwsxedc” which is just the first three alphabetic columns on the left side of an English keyboard. Moving on through the strings, we start to see a few more interesting things such as what appears to be the construction of an IP address and what might be a name given to this process in case it’s run as a service:
A few more pages in, we see some SMTP commands and some strings that look like they are part of an email to make it seem more legitimate, as well as some specific IP addresses:
Going through function names revealed by strings and by PEview reveals some interesting info about how this sample likely operates (meaning, until I observe the sample I can’t just assume that this is definitely what it does regardless of what functions are mentioned or imported):
– I see several functions around opening/copying/deleting files as well as functions related to the temp file path. I always like to see these because it implies that there is going to be some file system change taking place that can be used to both construct a signature and also identify the malware objectives. One string recovered was “Isass.exe” which is supposed to mimic “lsass.exe” – perhaps this sample copies itself somewhere as “Isass.exe” as a stealth measure.
– WaitForSingleObject appears to be called, so there might be a mutex created by this malware that could also form a signature.
– I see multiple functions related to services such as OpenService, CreateService, StartService, DeleteService, that suggest that this might be how the malware achieves persistence and stealth. I saw a few strings earlier that might make for good fake service names to blend in with other, legitimate services running on the host.
– There is an import of WS2_32.dll and several functions such as connect, socket, listen, bind, send/recv, gethostbyname, WSAStartup, inet_addr and so on that suggest that 1) there is a networking component to this malware and 2) since this is a lower-level networking DLL, there will probably be some fabrication of traffic info (such as header info) to make the malware traffic blend in better with legitimate traffic, which also helps us identify network signatures (particularly in the case of poorly-written fake headers).
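The kind of triage in the list above can be roughed out in a few lines. This is a sketch, not a tool: the capability groups are my own, and the exact file and mutex API names (CopyFileA, DeleteFileA, CreateFileA, GetTempPathA, CreateMutexA) and the "A" suffixes on the service functions are assumptions, since I only noted the general families above. As with the manual version, a hit only says a function is referenced, not that it is ever called.

```python
# Hypothetical grouping of API names into the behaviours they hint at.
CAPABILITY_HINTS = {
    "file system": {"CopyFileA", "DeleteFileA", "CreateFileA", "GetTempPathA"},
    "synchronization": {"WaitForSingleObject", "CreateMutexA"},
    "services": {"OpenServiceA", "CreateServiceA", "StartServiceA", "DeleteService"},
    "networking": {"connect", "socket", "listen", "bind", "send", "recv",
                   "gethostbyname", "WSAStartup", "inet_addr"},
}


def triage_imports(imports):
    """Map a list of imported function names to the capabilities they suggest."""
    wanted = set(imports)
    found = {}
    for capability, names in CAPABILITY_HINTS.items():
        hits = sorted(names & wanted)
        if hits:
            found[capability] = hits
    return found
```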
PEview reveals nothing unusual in terms of possible anti-debugging steps (i.e., number of data directories looks fine, no TLS). KANAL detects no known crypto signatures. I saw strings that look like a “normal” set of sections (.text, .data, .rdata, .rsrc) but Resource Hacker wasn’t finding anything. I might try to mess around with this later to rename the sections in the unpacked header, but for now I’m just taking note of this in case it’s useful later.
I tried running this sample multiple times, as a regular user and as administrator, both with a simulated Internet connection and on a real one, but absolutely nothing appeared to happen. The packed sample ran and then exited, while the unpacked sample crashed shortly after execution.
Nothing interesting is coming up in Wireshark, Process Explorer, Autoruns, or anything else. The Process Monitor data basically shows the malware process being created, some registry lookups (nothing obviously interesting there either), some libraries being loaded, and then the malware terminates.
I’ve seen strings that suggest that this malware could run on various versions of Windows, including Windows 7 which is what I’m running in the analysis VM. Perhaps there is an issue with the Windows environment, but at this point I think that there must be some issue with the malware not liking VirtualBox. I’m going to have to look through it in the disassembly and the debugger to see what seems to be preventing this sample from fully executing.
Before doing this, I tried a couple of other things. One was I ran the unpacked sample through Import Reconstructor (ImpRec) to see if maybe there was just an issue with the way the import table was set following the unpacking.
This didn’t help; the sample still crashes. During a quick glance through the disassembly, and during debugging, I didn’t notice anything checking for artifacts left over from VirtualBox or VMware, but even so I tried terminating all VirtualBox-related processes and then rerunning the sample, which didn’t help either. It’s possible that this is just a poorly formed piece of malware that isn’t working right in my environment, but that seems too simple of an explanation, so I need to dig into the disassembly more.
Disassembly and Debugging
Basically what’s observed when debugging is that this sample almost immediately starts to encounter exceptions. My feeling is that this isn’t simply an anti-debugging / anti-analysis technique in place. There absolutely are some things that look suspicious when I run the sample in the debugger (for example, a call to NtRaiseException followed by INT 3, or there MIGHT be something going on with the PEB structure beginning at fs:), but the thing is that this sample doesn’t seem to function at all outside of the debugger either. The unpacked version crashes shortly after execution, and the packed version runs some innocuous functions and then terminates. I see many exceptions taking place, but nothing that appears to look for a VM (either VMware, VirtualBox, or another). I looked through all of the process monitor results to see if the malware checked:
– user/computer name (it checked the username, but this would not indicate a VM)
– registry keys/values (it did check some, but nothing checked would indicate a VM)
– Files/directories associated with VirtualBox or VMs
Nothing was apparent. I checked the unpacked disassembly for:
– CPU instructions (sidt, sgdt, sldt, smsw, str, in, cpuid)
– Timing instructions (rdtsc, GetTickCount, QueryPerformanceCounter)
– GetTickCount is actually seen many times but not in an anti-debugging context
– Checks on running processes, services, or mutexes
– Hardware info checks
– OS info checks (it checks for the Windows version but nothing that would indicate a VM)
– Checks for INT 3 or others such as 0xCD03 (there is a line where 0CCh is moved into AL, but this is part of a coding sub, and not related to anti-debugging)
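Part of the checklist above can be automated against a text export of the disassembly. The sketch below assumes each line of the listing starts with the mnemonic (no address or hex-byte columns), which is my simplification and not how every tool exports. It flags the instructions and timing APIs from the list, and a hit still needs a human to decide whether the context is really anti-analysis, as with the GetTickCount calls here.

```python
import re

# Instructions and timing APIs from the checklist above.
SUSPECT_MNEMONICS = {"sidt", "sgdt", "sldt", "smsw", "str", "in", "cpuid", "rdtsc"}
SUSPECT_APIS = {"GetTickCount", "QueryPerformanceCounter"}


def flag_lines(listing: str):
    """Return (line number, name) pairs for suspect mnemonics and API names."""
    hits = []
    for number, line in enumerate(listing.splitlines(), start=1):
        # Identifier-like tokens only; hex literals such as 0CCh are skipped.
        tokens = re.findall(r"\b[A-Za-z_]\w*\b", line)
        if not tokens:
            continue
        mnemonic = tokens[0].lower()
        if mnemonic in SUSPECT_MNEMONICS:
            hits.append((number, mnemonic))
        for api in SUSPECT_APIS.intersection(tokens):
            hits.append((number, api))
    return hits
```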
I’m just not seeing anything that indicates that this thing is checking for a VM or a debugger. One thing I did notice however, was that there seems to be an issue in the code between the packed and unpacked versions of the malware:
Packed location 404C50:
Unpacked location 404C50:
The OR DWORD PTR DS: [EBX+68FF6AEC], 004051C8 instruction doesn’t make sense and immediately starts causing exceptions, which then seems to send the unpacked version into a tailspin. I’ll use the unpacked version just for disassembly, but will continue trying to debug the packed version.
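One possible reading of that nonsensical OR instruction is that my dump is off by a single byte. The sketch below encodes the OR by hand (opcode 81 /1, ModRM 8B for [EBX+disp32]) and compares it with the bytes of a conventional prologue: PUSH EBP, MOV EBP, ESP, PUSH -1, PUSH 004051C8. The PUSH EBP / MOV EBP, ESP pair is an assumption on my part, although the sizes fit the addresses seen in the cleaner unpack (1 + 2 bytes puts PUSH -1 at 404C53 and the next PUSH at 404C5A). Why the first byte would differ in the dump is something I have not established.

```python
import struct


def le32(value: int) -> bytes:
    """Encode a 32-bit value as little-endian bytes, as x86 stores it."""
    return struct.pack("<I", value)


# OR DWORD PTR DS:[EBX+68FF6AEC], 004051C8
# 81 = group-1 opcode with imm32, 8B = ModRM (mod=10, reg=001 for OR, rm=EBX)
garbled = b"\x81\x8b" + le32(0x68FF6AEC) + le32(0x004051C8)

# PUSH EBP / MOV EBP, ESP / PUSH -1 / PUSH 004051C8 (assumed prologue)
prologue = b"\x55" + b"\x8b\xec" + b"\x6a\xff" + b"\x68" + le32(0x004051C8)


def differing_offsets(a: bytes, b: bytes) -> list[int]:
    """Return the offsets at which two equal-length byte strings differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
```

Comparing the two with differing_offsets shows that only the byte at 404C50 itself differs (81 in the dump, 55 in the assumed prologue), which would be enough to make the disassembler swallow the following nine bytes into one bogus instruction.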
I eventually got UPX and used it to automatically unpack the sample, and the unpacked version is much cleaner than what I had dumped manually. There is a section at 404C50 that pretty much matches exactly what we can see at the RCE Endeavors blog as something that sets a new exception handler which would allow for sneaky code execution. I feel like there has to be something there. Looking at the code that was unpacked by the malware again:
Following along with the article, at 404C53 we see 0FFFFFFFFh being PUSHed and then 004051C8 PUSHed (scope table and the try level). At 404C5A we see PUSH 00404DD0, which is a jump to __except_handler3 (4 being the topmost exception handler, IIRC). Then we see the value at fs: being put into EAX (fs:0 is the start of the TIB, and in my case is 0018FFC4). Then this is PUSHed, and the stack pointer is moved into fs: (which, in my case, is 0018FF78). This is what we see at 0018FF78:
This is something I’ll have to come back to someday, as I can recognize that something is going on here but I just can’t figure out what. On one hand, it makes no sense to me that someone would create and propagate a worm that didn’t work. On the other, there are some design decisions in this sample (like hard coding directories, plaintext domains and IP addresses) that call into question the quality of the construction. I’m going to concentrate on disassembly of the rest of the unpacked file to try to see how the rest of it works.
404C50 is where we see the shenanigans with the SEH:
Down below at 404D7F we see the call to WinMain.
Inside of WinMain, we see a call to __p____argc and then a comparison, then where the code wants to go is to the left side (i.e., not take the jump):
However, as soon as the call to StartServiceCtrlDispatcherA happens, a non-continuable exception triggers. Following this doesn’t get me anywhere, so I’m going to go back to where this JNZ occurs and make sure that we follow the jump:
Getting to the next block at 402F3B, the jump is taken again so I’ll modify the flags and avoid the jump again. Getting to the call to ds:strncpy, the sample again hits exceptions, so I’m just going to go through the disassembly and stop debugging completely for now.
The code on the right is a bit funny because I did get that to execute successfully in the debugger, and you’d think that a malware author would not want the malware to have error messages pop up. This is one of those examples I’m thinking of when I question the design of this sample.
Below all of this, we see a couple more branches but both ultimately call sub 402C30. This sub calls 402050, which reads a file called stm8.inf located in the Windows directory (or we see this being created if it doesn’t exist). After this takes place, we see a call to a sub at 402970 which involves moving lots of data into various registers and then several calls to sprintf to write this stuff to buffers. After those sprintf calls, there are several calls to LoadLibraryA and then several calls to GetProcAddress, so here we’re seeing the sample call several different libraries and functions before moving on to the rest of the code.
Again, if this sub fails, then the code exits. The pattern here is pretty much that if any of these parts fail to execute successfully, the code will terminate. This might help explain why the code appears to do nothing when being executed in the VM. Following the success of the previous sub, there’s a call to WSAStartup (version 2.2 requested) and then the code flows to a call to sub 403BF0.
403BF0 is a large sub that begins with moving the byte 88h into BL before storing a string (a total of 0x40 times, based on the value moved into ECX) and moving several more bytes into other offsets:
I opened this in the debugger and patched the code to call 403BF0. This initial block creates the following string in memory:
The next bit of code loops through this string (up to 16 bytes long) and then XORs each byte with 0xEF, which leaves us with the string gmail.com:
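The decoding loop is simple enough to reproduce outside the debugger. This is a minimal sketch of a single-byte XOR with 0xEF over at most 16 bytes. Stopping at the first zero byte is my assumption, based on the buffer looking zero-filled before the bytes are written. It does line up with the 88h that gets moved into BL at the top of the sub, since 'g' (67h) XOR EFh is 88h.

```python
def xor_decode(buf: bytes, key: int = 0xEF, max_len: int = 16) -> bytes:
    """Decode up to max_len bytes of buf by XORing each byte with key."""
    out = bytearray()
    for value in buf[:max_len]:
        if value == 0:  # assumption: zero padding marks the end of the string
            break
        out.append(value ^ key)
    return bytes(out)


# XOR is its own inverse, so encoding the plaintext recreates the stored bytes.
encoded = bytes(c ^ 0xEF for c in b"gmail.com")
```

Running xor_decode over the encoded buffer gives back the hostname, and the same function works for the other 0xEF-encoded strings later in the sample.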
This string then gets PUSHed and then sub 403170 is called to work with it. There are several subs nested here, one of the main ones called next is 4049F0. Within sub 4049F0, you see a call to DnsQueryA and then you see several strings being put on the heap:
The strings are:
If 4049F0 is successful, then you jump over the call to 404AF0 which appears to gather network info on the local host and then I presume tries to gather the info above in another way since following this branch would mean that 4049F0 failed. Back in sub 403170, there’s a call to GetTickCount and then a test between AL and 3, then a conditional jump to either exit and return 0 or continue with the function. In my debugger the code continued and took the gmail.com string and passed it to a call to 403070. We see more work being done with the gmail.com string, the heap, and calls to GetTickCount. We then pass the gmail.com string to 403030. More comparisons, more movement of the gmail.com string, and finally the entire 403170 sub returns and we end up at 403C75, where we come upon another interesting set of branches:
The debugger isn’t following the jump (which leads to another XOR decoding). For now I want to see what’s in the XOR branch so I’ll mess with the flags so we go there.
As you can see, it just ends up decoding gmail-smtp-in.l.google.com. Much later on in this branch, we see that string being passed to sub 4031D0. There we see a call to gethostbyname, and if that fails, then the function returns 0 and eventually this branch dies (I’m not connected to the Internet while I am running this instance), but I’ll change the flags and keep this going. Eventually this branch loops back up and runs through the same iterations but for the “alt%d” variations such as alt1, alt2, etc.
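The hostname iteration can be sketched roughly as follows. The number of alternates to try and the helper names are my own, and the "alt%d." prefix with a trailing dot is inferred from the alt1.gmail-smtp-in.l.google.com lookup that shows up in Wireshark. The resolver is passed in as a parameter so the logic can be exercised without touching the network, and trying each host in turn is a simplification of the looping branch in the sample.

```python
import socket


def candidate_hosts(base: str = "gmail-smtp-in.l.google.com", alternates: int = 2):
    """Return the base mail host followed by its alt1..altN variations."""
    return [base] + ["alt%d.%s" % (n, base) for n in range(1, alternates + 1)]


def resolve_first(hosts, resolver=socket.gethostbyname):
    """Return (host, address) for the first host that resolves, else None."""
    for host in hosts:
        try:
            return host, resolver(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
    return None
```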
All roads seem to lead to location 403E4E, which checks to see if any of the branches leading up to this point were successful or not – if not, we exit. If they were, then we continue on to the hard-coded IP address blocks.
You can see the string 18.104.22.168 being moved into EDI, and then shortly after there’s a comparison using that string, and then in my case it moves on to the next string which is 173.194.68.27, and so on until it finds the IP it wants or it runs through all of them. Then, the return is made to 402C30, which we left a long time ago, it feels.
At this point, something weird happens again involving exception handling, and we find ourselves blown out into 7- land (so to speak), but I patched something to get us back to 402C92 which is where I wanted to continue from. We see a call to GetModuleFileNameA (which fails, incidentally) and then a call to GetUserNameA (perhaps to form part of the data used to create the mass emails?) and then the username is passed to strupr to make it all upper case.
We work our way down to 402D79 without any further intervention in the debugger, and it appears we’re in the right place to begin constructing totally legit-looking emails:
But first, a call to 4019C0 where it looks like we have another one of those encoding subs. The first encoding loop here decodes this little string:
ows\\CurrentV. Next loop in this sub decodes:
E\\Micro. Next loop:
soft\\wind, then finally this entire mess gets passed as “SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run” under HKLM to RegOpenKeyExA:
If this operation is successful, then we skip the rest of this sub. If not, then we go on to decode more stuff and do more things, so I’m going to keep going on this branch.
Soon we see a call to LoadLibraryA for advapi32.dll, and then another XOR decoding:
The malware decodes the string RegSetValueExA:
kernel32.dll gets loaded, then the function RegSetValueExA is imported:
Then later we see what appears to be this malware being set up to run at boot under the guise of “Local Security Authority Process”:
We can see this change being made in Process Monitor:
It’s also apparent in Autoruns, though it appears the name of the file isn’t quite there just yet:
Then after all of this, a call to RegCloseKey and then back to the other sub where we can finally get into crafting some email (I hope)!
A few dozen lines in, I see a strange subject line being created:
I also saw the user name string that was converted into all upper case earlier. I accidentally stepped over one of the subs here at 401E90, but in there you see that and a nested function create and bind a TCP socket and then call listen, looping until the string “google.com” is seen, then back to the previous sub. We see the string “Subject: 0.0.0.0” put in a buffer with a call to sprintf, and some further manipulations of strings. There’s a call to sub 402350, where we see a call to gethostbyname (a deprecated function, according to MSDN, which probably speaks to the age of this sample). If this function fails (which it did in my debugger), then the sub returns, otherwise it builds a string out of an IP address from the gethostbyname call. Soon after this, we see the “Subject: 0.0.0.0\09h\30h\09h\30h” string in its current form being passed to a call to 402170.
Sub 402170 is the sub where the Windows version is obtained and named using one of the strings we observed earlier, e.g. “WinVista”, “WinNt”, “Win2000”, “Unknown”, “WinXp”, “Win2003”, “Win7”. This is done via a call to GetVersion and then parsing that info to identify the version. The sample correctly identified the version running in my test environment (Windows 7). Before returning, there is a call to a sub at 4022C0 which is to obtain the local and system time, and then this information is parsed into various pieces such as day, month, year, hour, etc. I imagine for the purpose of constructing the mass emails. Returning from these subs, we see that our subject line continues to change:
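The version parsing in 402170 presumably looks something like the sketch below. GetVersion packs the major version into the low byte and the minor version into the next byte of the DWORD it returns. The label strings come from the sample, but the table mapping numbers to labels is my reconstruction from the public Windows version numbers, and treating anything with a major version of 4 or lower as “WinNt” is a guess.

```python
# (major, minor) pairs mapped to the label strings seen in the sample.
VERSION_NAMES = {
    (5, 0): "Win2000",
    (5, 1): "WinXp",
    (5, 2): "Win2003",
    (6, 0): "WinVista",
    (6, 1): "Win7",
}


def name_windows_version(get_version_result: int) -> str:
    """Turn a GetVersion-style DWORD into one of the sample's version labels."""
    major = get_version_result & 0xFF
    minor = (get_version_result >> 8) & 0xFF
    if (major, minor) in VERSION_NAMES:
        return VERSION_NAMES[(major, minor)]
    if major <= 4:
        return "WinNt"  # assumption: older NT releases share one label
    return "Unknown"
```

On a Windows 7 SP1 guest the DWORD would carry major 6, minor 1 and build 7601 in the high word, which this maps to “Win7”.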
After all these changes to the string, including a few more minor ones, we return back to 402E92, and then a few lines later there’s a call to 4033C0 – THIS finally seems to be the construction sub. It begins with a whole lot of byte movements, reminiscent of the XOR encoding though I don’t see that happening here:
After ALL of these many things are MOVed around, we pass through a few conditional jumps that either take the whole mess and return (returning 0) or we do actually get to an XOR decoding loop – I didn’t have to intervene to get there, and the first string decoded is “nbweinf12160”:
So as not to make you sit through every loop like I did earlier, here’s what we get at the end of these three loops:
firstname.lastname@example.org [this email is no longer valid, by the way]
We get kicked over to a call to 403340, which is a sub that tries to set up some networking with one of the hard-coded IPs mentioned earlier, where we also see a call to ioctlsocket (nonblocking).
The debugger wants to follow the code over to where there’s another decoding sub called (402C70), which receives our subject line as an argument. Oddly, this branch re-encodes the subject line that was being constructed and then basically breaks everything down and exits. I’m reloading the VM snapshot and trying the other branch…
Unfortunately, that other branch also died, running into exceptions. Perhaps this is because of the patching I’ve done, maybe things are messed up now. I’m going to just try to get back to the SMTP part of the malware and see how that works.
I patched that last line of code to JMP 403690, so I can see how the SMTP commands are sent. Not too far into that sub we see the first SMTP command being constructed and then a large block of code related to a call to send:
Long story short, we see HELO (google.com) being sent:
Since I’m not really letting this thing connect to the Internet, it tries to exit but I’ll keep intervening to keep it sending info. Next up is another big block of code, all pertaining to a call to send:
We can continue to observe the SMTP commands being sent in this manner. There are numerous conditional jumps throughout the code that necessitate intervention in order to keep going on the path I want. We see a few calls to GetSystemTime and some random number generation before the construction of the command that specifies the destination email address (in this run, no email address was populated):
It seems that this is meant to be a “Microsoft News Letter”:
We eventually see QUIT being sent, and then the socket getting closed:
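Put together, the conversation the sample is trying to have looks roughly like the sketch below. Only HELO, the destination address command and QUIT were actually observed in the debugger. The MAIL FROM and DATA steps are filled in from the standard SMTP sequence, and the addresses, subject and body in the usage line are placeholders, not values taken from the sample.

```python
def build_smtp_session(helo_host, sender, recipient, subject, body):
    """Return the lines a client would send for one minimal SMTP delivery."""
    # Headers, a blank line, the body, then a lone dot to end the DATA phase.
    # No dot-stuffing is done here, so the body must not contain a "." line.
    message = "Subject: %s\r\n\r\n%s\r\n.\r\n" % (subject, body)
    return [
        "HELO %s\r\n" % helo_host,
        "MAIL FROM:<%s>\r\n" % sender,
        "RCPT TO:<%s>\r\n" % recipient,
        "DATA\r\n",
        message,
        "QUIT\r\n",
    ]


session = build_smtp_session(
    "google.com", "sender@example.org", "rcpt@example.org",
    "placeholder subject", "placeholder body",
)
```

Each entry corresponds to one of the big send blocks in 403690, with the server's replies read in between.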
And then the malware had a meltdown since I had been patching all kinds of stuff to jump around. This is about all I want to really check out in the malware in tandem with the debugger. Some other highlights include:
– Sub 401000 is where the malware appears to install itself by copying itself as the file Isass.exe (a string we saw earlier). It places this file in the %system% directory, though I’d point out that the sample tends to hard code these locations (e.g., one branch of its code refers to the c:\windows\ directory rather than using the environment variable).
– Sub 404010 shows that it attempts to run itself as a service called “Windows Genuine Updater”:
– Sub 404110 attempts to open this newly created service and start it
– 4015D0 looks like it might generate IP addresses using a combination of calls to GetTickCount and random numbers
Dynamic Analysis, Revisited
Now that I actually got the sample to do something, I’m going to revisit some of the things I would normally try under dynamic analysis which were unrevealing due to the sample not fully executing.
Taking a look at Wireshark, there actually was almost nothing that seemed like it could have been attributed to the malware except for a DNS resolution to alt1.gmail-smtp-in.l.google.com which was something that was observed when I was forcing execution through the debugger. I suspect that the reason why there isn’t any SMTP traffic being observed is that while I was forcing execution of the malware in Ollydbg, networking was never completely started via a successful call to WSAStartup.
RegShot does show the following change being made to the Registry, as we saw earlier in the debugger and in Autoruns:
HKLM\SOFTWARE\Wow6432Node\Microsoft\Windows\CurrentVersion\Run\Local Security Authority Process: “ %1”
Other than this, nothing else interesting came out of RegShot. Process Monitor also didn’t have anything interesting in it, apart from what was already discussed above.
This was a weird sample to work on because of the issues getting it to run. I don’t have a dedicated physical testing system, otherwise I would have tried it on there to see if I could get it to run properly. I’m torn between whether this was a very sophisticated sample that employed anti-VM techniques that I couldn’t detect (like the SEH shenanigans referenced in the post) or if it was just not well-written and this was causing the execution issues. Just because I didn’t find something doesn’t mean that it wasn’t there, but on the other hand my opinion is that there were several questionable design decisions throughout the sample, so… Not sure on this. This type of malware appears to be very old. I’ve seen references to this worm going back as far as 2002, so perhaps this would help explain some of the execution issues also.
I didn’t see anything obvious as far as how the malware gets its email addresses to mail to, however we did see many examples of what appear to be bad passwords hard-coded in the sample. My intuition is that this program tries to harvest email addresses from the host computer. I didn’t see any C2 functionality in here, so I suppose this is sort of a fire-and-forget piece of malware. This sample doesn’t appear to have any clear goal in mind, so maybe it was created for its own sake.
Persistence is attempted via the registry and creating a copy of itself (Isass.exe) that is meant to resemble the lsass.exe file. Persistence wasn’t fully achieved, probably due to the malware not functioning correctly as I forced it through the debugger. I also didn’t see any movement of this sample through a network (or really anything like this in the code), though again, the sample wasn’t functioning 100% so I’m not sure it’s wise to completely rule it out.
Finally, it seems that Pepex is a fairly consistent name for this sample, so no need to name it like I did with BEAR.
Findings and observations:
Mass-mailer worm with execution issues. Design flaws reveal functionality and signatures. This sample was first observed 14+ years ago, but doesn’t seem to have any obvious malicious function besides wasting resources.
Detection is very high for this sample, probably due to age, so up-to-date AV probably would help mitigate this sample. Not opening suspicious files received via email or other routes also stands for this sample. Use of strong passwords (and definitely NOT the very poor examples found in this malware) is advised. Removal can be done by deleting the registry entry for persistence (if it was created successfully in the first place) and also deleting the Isass.exe file.
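The host artifacts seen in this analysis can be turned into a quick indicator check. The sketch below takes already-collected lists of file names, Run value names and service names (how you gather those is up to you) and matches them against the three strings from the sample. The file name is compared case-insensitively, since Windows file names are, which still separates isass.exe from the real lsass.exe. The function name and the output format are my own.

```python
RUN_VALUE_NAME = "Local Security Authority Process"
SERVICE_NAME = "Windows Genuine Updater"
LOOKALIKE_FILE = "isass.exe"  # starts with the letter i, not a lowercase L


def match_indicators(file_names, run_value_names, service_names):
    """Return labelled hits for the host indicators seen in this sample."""
    hits = []
    for name in file_names:
        if name.lower() == LOOKALIKE_FILE:
            hits.append("file:" + name)
    for name in run_value_names:
        if name == RUN_VALUE_NAME:
            hits.append("run:" + name)
    for name in service_names:
        if name == SERVICE_NAME:
            hits.append("service:" + name)
    return hits
```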
Interesting to see this very old piece of malware, even if it didn’t fully run in the test environment. Not a terribly destructive sample, mostly just annoying.