HTB FFModule
FFModule uses malware techniques to obfuscate itself, injects its code into Firefox, and hooks one of its internal routines. While analyzing malware is a difficult task, there are dynamic analysis frameworks such as Sharem that can help a lot. But since I don't like Windows, I decided to solve this one without running the malware in a Windows environment, relying only on unreliable Linux emulators. This made the task harder but also definitely more interesting.
Difficulty : Medium
Category : Reversing
First look
At first glance, the program is a classic 64-bit PE executable.
~ $ file ffmodule.exe
ffmodule.exe: PE32+ executable for MS Windows 6.00 (console), x86-64, 6 sections
Opening the program in Ghidra reveals the first suspicious detail:

The main function uses XOR obfuscation to hide part of the binary. Let's recover it using Qiling:
from qiling import Qiling
from qiling.const import QL_VERBOSE

SHELLCODE_ADDR = 0x140017000
SHELLCODE_SIZE = 1444
# Address at which the payload has been decrypted
SHELLCODE_DECRYPTED_ADDR = 0x1400011aa

def hook_extractShellcode(ql: Qiling) -> None:
    # Dump the decrypted payload from emulated memory to disk
    shellcode = bytes(ql.mem.read(SHELLCODE_ADDR, SHELLCODE_SIZE))
    with open("./shellcode.bin", "wb") as f:
        f.write(shellcode)
    print("shellcode extracted")

if __name__ == "__main__":
    ql = Qiling(["./rootfs/x8664_windows/ffmodule.exe"], rootfs="./rootfs/x8664_windows/", verbose=QL_VERBOSE.DISABLED)
    ql.hook_address(hook_extractShellcode, SHELLCODE_DECRYPTED_ADDR)
    ql.run()
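For reference, the kind of XOR decoding loop the main function applies can be sketched in a few lines. This is only an illustration: the key and the sample bytes are made up, and the real key used by the binary is not reproduced here.

```python
from itertools import cycle

def xor_decode(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

# Hypothetical key and payload, for illustration only.
EXAMPLE_KEY = b"\x5a"
EXAMPLE_PLAIN = b"\x90\x90\xc3"

encoded = xor_decode(EXAMPLE_PLAIN, EXAMPLE_KEY)
# XOR is its own inverse, so applying it twice gives the input back.
decoded = xor_decode(encoded, EXAMPLE_KEY)
```

Since XOR is symmetric, the same routine both hides and reveals the payload, which is why letting the malware run its own loop in the emulator is enough to get the plain bytes.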
Process Injection
Now that we have "deobfuscated" this payload, let's dive into the second function of the malware :

The decompiled view is straightforward: the program walks through the Windows process list looking for "firefox.exe" and then injects the previously decrypted payload into it. From this point on, we'll call this payload the shellcode.
The Shellcode
The first step : PEB Walking and GetProcAddress
Ok, so this is where it gets more difficult. Let's look at the first function of the shellcode, the most important and the most useful to understand:

The preamble of the function reads GS at offset 0x30, and then performs a lot of dereferences on the result.
- GS[0x30]: in a 64-bit process, this is a pointer to the TEB (Thread Environment Block), which contains all the information about the current thread.
- RAX + 0x60: gets the PPEB, a pointer to the PEB (Process Environment Block), which contains all the information about the current process.
- RAX + 0x18: gets the PPEB_LDR_DATA (the loader data), which is populated by the loader. All information about the loaded modules is in there.
- RAX + 0x20: gets the field InMemoryOrderModuleList, the head of a doubly linked list of all loaded modules in memory order. Each node of this list is embedded in a bigger struct called LDR_DATA_TABLE_ENTRY, which describes one module.
- RAX, qword ptr [RAX]: this instruction reads the first field of the linked-list node (the Flink field), which yields the next module in the list.
- RAX, qword ptr [RAX]: the same instruction again. After this, RAX points to the list entry of the third module loaded in memory.
- RAX + 0x20: this one took me a while to understand. At this point RAX points inside an LDR_DATA_TABLE_ENTRY struct, at its InMemoryOrderLinks field, which sits at offset 0x10 of the struct. Adding 0x20 therefore lands on offset 0x30 of the LDR_DATA_TABLE_ENTRY, which is the DllBase field.
In summary, this first step of the function sets R8 to the DllBase address of the third module loaded in firefox's memory. Let's continue:
- R8 + 0x3C: the DllBase in R8 is the base address of the DLL in memory, which can be cast to an IMAGE_DOS_HEADER. Offset 0x3C is the e_lfanew field, the offset of the IMAGE_NT_HEADERS, which contain the headers of modern PE files.
- RAX + 0x88: this is basically RDX = RAX->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress. Offset 0x88 yields the RVA of the export directory, which describes all the functions exported by the DLL.
- RDX + 0x20: finally, this reads the AddressOfNames field, the RVA of an array of RVAs pointing to the ASCII names of the exported functions.
This method of locating libraries and functions is called PEB walking: we walk through the PEB and related loader structures to retrieve the information the malware needs.
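To convince myself that the offsets add up, the whole pointer chase can be replayed over a fake memory map. The sketch below is not the shellcode: the addresses in FAKE_MEMORY are invented, and memory is modelled as a plain dict from address to value. Only the offsets come from the walkthrough above.

```python
# x64 offsets from the walkthrough above.
TEB_PEB = 0x60                    # TEB -> PEB pointer
PEB_LDR = 0x18                    # PEB -> PEB_LDR_DATA pointer
LDR_IN_MEMORY_ORDER_LIST = 0x20   # PEB_LDR_DATA.InMemoryOrderModuleList.Flink
LINKS_TO_DLLBASE = 0x20           # DllBase (0x30) - InMemoryOrderLinks (0x10)
DOS_E_LFANEW = 0x3C               # IMAGE_DOS_HEADER.e_lfanew
NT_EXPORT_DIR_RVA = 0x88          # export directory RVA inside IMAGE_NT_HEADERS64
EXPORT_ADDRESS_OF_NAMES = 0x20    # IMAGE_EXPORT_DIRECTORY.AddressOfNames

def third_module_base(mem: dict, teb: int) -> int:
    """Follow TEB -> PEB -> Ldr -> list entries and return the third DllBase."""
    peb = mem[teb + TEB_PEB]
    ldr = mem[peb + PEB_LDR]
    link = mem[ldr + LDR_IN_MEMORY_ORDER_LIST]  # first entry
    link = mem[link]                            # Flink -> second entry
    link = mem[link]                            # Flink -> third entry
    return mem[link + LINKS_TO_DLLBASE]

def address_of_names(mem: dict, dll_base: int) -> int:
    """Return the virtual address of the export name table of a module."""
    nt_headers = dll_base + mem[dll_base + DOS_E_LFANEW]
    export_dir = dll_base + mem[nt_headers + NT_EXPORT_DIR_RVA]
    return dll_base + mem[export_dir + EXPORT_ADDRESS_OF_NAMES]

# Invented layout, for illustration only.
FAKE_TEB, FAKE_BASE = 0x1000, 0x180000000
FAKE_MEMORY = {
    0x1000 + 0x60: 0x2000,            # TEB.ProcessEnvironmentBlock
    0x2000 + 0x18: 0x3000,            # PEB.Ldr
    0x3000 + 0x20: 0x4010,            # list head -> first entry links
    0x4010: 0x5010,                   # first -> second
    0x5010: 0x6010,                   # second -> third
    0x6010 + 0x20: FAKE_BASE,         # third entry DllBase
    FAKE_BASE + 0x3C: 0xF0,           # e_lfanew
    FAKE_BASE + 0xF0 + 0x88: 0x9000,  # export directory RVA
    FAKE_BASE + 0x9000 + 0x20: 0xA000,  # AddressOfNames RVA
}
```

With this layout, third_module_base(FAKE_MEMORY, FAKE_TEB) lands on FAKE_BASE and address_of_names resolves to FAKE_BASE + 0xA000, which is exactly the chain of reads seen in the disassembly.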

Without going into too much detail, the next step of the function iterates over all the function names in the third DLL (as described above), hashes each name character-by-character, and checks whether the result matches a hardcoded CRC32 value in the malware.
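As a cross-check that needs no SSE4.2 hardware, here is a pure-Python sketch of what I understand the hashing loop to be: the x86 crc32 instruction implements CRC-32C (the Castagnoli polynomial, 0x82F63B78 in its reflected form). The 0xffffffff seed and the absence of a final XOR are assumptions that mirror the C reimplementation in this write-up, not something read from a specification of the malware.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78

def crc32c_update(crc: int, byte: int) -> int:
    """Software equivalent of one crc32 step on a single byte."""
    crc ^= byte
    for _ in range(8):
        crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc

def hash_name(name: bytes) -> int:
    """Hash a name byte by byte, seeded with 0xffffffff, no final XOR."""
    crc = 0xFFFFFFFF
    for b in name:
        crc = crc32c_update(crc, b)
    return crc

if __name__ == "__main__":
    # Print the hash of one candidate export name for comparison.
    print(hex(hash_name(b"GetProcAddress")))
```

The standard CRC-32C check value for "123456789" is 0xE3069283, which this routine reproduces once the final XOR with 0xffffffff is applied, so the bit-level logic can be validated independently of the malware's constants.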
To verify which function is retrieved, let's first check what's the third module loaded by Firefox.
~ $ WINEPATH=./firefox/x8664_windows/Windows/System32 WINEDEBUG=+loaddll wine ./firefox/firefox.exe
002c:trace:loaddll:build_module Loaded L"C:\\windows\\system32\\wineboot.exe" at 0000000140000000: builtin
002c:trace:loaddll:build_module Loaded L"C:\\windows\\system32\\kernelbase.dll" at 00006FFFFF3C0000: builtin
002c:trace:loaddll:build_module Loaded L"C:\\windows\\system32\\kernel32.dll" at 00006FFFFFA10000: builtin
Let's assume it's kernel32.dll. Next, I want to retrieve all the exports of the DLL, so let's do it in Python.
import sys

import pefile

if __name__ == "__main__":
    pe = pefile.PE(sys.argv[1])
    for exp in pe.DIRECTORY_ENTRY_EXPORT.symbols:
        # Exports available by ordinal only have no name, skip them
        if exp.name is not None:
            print(exp.name.decode())
~ $ python3 ./dump_pe_exports.py ./firefox/x8664_windows/Windows/System32/kernel32.dll > kernel32_exports
And finally, we need to reimplement the "hashing" algorithm of the malware. To stay as close as possible to the original implementation, I will use C intrinsics.
#include <dirent.h>
#include <errno.h>
#include <nmmintrin.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Hash a NUL-terminated name byte by byte with the SSE4.2 crc32 instruction */
uint32_t calculate_module_crc(const char *module) {
    uint32_t crc = 0xffffffffu;
    while (*module) {
        crc = _mm_crc32_u8(crc, (unsigned char)*module);
        module++;
    }
    return crc;
}

/* Parse a decimal or 0x-prefixed value; returns 0 on error */
uint32_t parse_u32(const char *s) {
    if (!s)
        return 0;
    errno = 0;
    char *end = NULL;
    unsigned long uv = strtoul(s, &end, 0);
    if (errno) {
        perror("strtoul");
        return 0;
    }
    return (uint32_t)uv;
}

static char *trim_newline_and_space(char *s) {
    if (!s)
        return s;
    // trim newline and trailing CR/LF and spaces/tabs
    size_t len = strlen(s);
    while (len > 0) {
        char c = s[len - 1];
        if (c == '\n' || c == '\r' || c == ' ' || c == '\t') {
            s[len - 1] = '\0';
            len--;
        } else
            break;
    }
    // also skip leading spaces/tabs
    char *start = s;
    while (*start == ' ' || *start == '\t')
        start++;
    if (start != s)
        memmove(s, start, strlen(start) + 1);
    return s;
}
int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "need 2 arguments\n");
        fprintf(stderr, "usage : %s <modules_file> <validators_file>\n", argv[0]);
        return EXIT_FAILURE;
    }
    const char *modules_file = argv[1];
    const char *validators_file = argv[2];

    /* --- Read validators file into dynamic array --- */
    FILE *vf = fopen(validators_file, "r");
    if (!vf) {
        fprintf(stderr, "Could not open validators file '%s': %s\n",
                validators_file, strerror(errno));
        return EXIT_FAILURE;
    }
    size_t validators_cap = 32;
    size_t validators_count = 0;
    uint32_t *validators = malloc(validators_cap * sizeof(uint32_t));
    if (!validators) {
        perror("malloc");
        fclose(vf);
        return EXIT_FAILURE;
    }
    char line[512];
    while (fgets(line, sizeof(line), vf)) {
        trim_newline_and_space(line);
        if (line[0] == '\0')
            continue; // skip empty lines
        uint32_t v = parse_u32(line);
        if (validators_count >= validators_cap) {
            size_t newcap = validators_cap * 2;
            uint32_t *tmp = realloc(validators, newcap * sizeof(uint32_t));
            if (!tmp) {
                perror("realloc");
                free(validators);
                fclose(vf);
                return EXIT_FAILURE;
            }
            validators = tmp;
            validators_cap = newcap;
        }
        validators[validators_count++] = v;
    }
    fclose(vf);
    if (validators_count == 0) {
        fprintf(stderr, "No validators loaded from %s\n", validators_file);
        free(validators);
        return EXIT_FAILURE;
    }

    /* --- Process modules file and check against validators --- */
    FILE *mf = fopen(modules_file, "r");
    if (!mf) {
        fprintf(stderr, "Could not open modules file '%s': %s\n", modules_file,
                strerror(errno));
        free(validators);
        return EXIT_FAILURE;
    }
    while (fgets(line, sizeof(line), mf)) {
        trim_newline_and_space(line);
        if (line[0] == '\0')
            continue;
        uint32_t crc = calculate_module_crc(line);
        for (size_t i = 0; i < validators_count; ++i) {
            if (crc == validators[i]) {
                printf("MATCH: %s -> 0x%08x matches validator 0x%08x (index %zu)\n",
                       line, crc, validators[i], i);
            }
        }
    }
    fclose(mf);
    free(validators);
    return EXIT_SUCCESS;
}
~ $ ./calculate_functions_hashes ./kernel32_exports ./validators
MATCH: GetProcAddress -> 0xbc553b82 matches validator 0xbc553b82 (index 1)
And we've got our function! The shellcode function then stores the address of GetProcAddress in the register R12 and returns.
The second step : PEB Walking and VirtualAlloc
The second step uses the same PEB walking technique to retrieve kernel32.DllBase into R14. It then calls a function located slightly further down in the .text section.

We can see in the disassembler that the first instruction of this function is POP RDX, which alters the stack layout and retrieves the return address. As shown in the disassembler, the malware has stored a string at that address. The malware therefore uses stack obfuscation to store strings; this pattern is repeated multiple times throughout the sample, so I won't mention it again.
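Since this call/pop pattern comes back so often, it can be handy to pull the strings out automatically. The sketch below assumes the string sits right after a 5-byte CALL rel32 (opcode 0xE8) and is NUL-terminated. The byte blob is hand-made for the example and is not copied from the sample.

```python
import struct

def string_after_call(code: bytes, call_off: int):
    """Return (string, call_target) for a CALL rel32 that jumps over inline data."""
    if code[call_off] != 0xE8:
        raise ValueError("no CALL rel32 at this offset")
    (rel,) = struct.unpack_from("<i", code, call_off + 1)
    ret_addr = call_off + 5           # what the CPU pushes, later popped by the callee
    target = ret_addr + rel           # where execution continues
    end = code.index(b"\x00", ret_addr)
    return code[ret_addr:end].decode("ascii"), target

# Hand-made example: CALL +13, then "VirtualAlloc\0", then POP RDX (0x5A).
EXAMPLE_BLOB = b"\xE8" + struct.pack("<i", 13) + b"VirtualAlloc\x00" + b"\x5A"
```

Here string_after_call(EXAMPLE_BLOB, 0) returns the inline string together with offset 18, where the POP RDX that picks up the "return address" lives.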

This function calls GetProcAddress(Kernel32.DllBase, "VirtualAlloc"), then VirtualAlloc(...) to allocate a custom buffer in memory. The pointer to this buffer is stored on the stack. It then calls the next function.
The third step : nss3.PR_Write and VirtualProtect

This function calls GetProcAddress(Kernel32.DllBase, "VirtualProtect") and puts its result on the stack. The rest of the function is, yet again, some PEB walking and "hashing" to check for hardcoded CRC32 values.

While the algorithm is the same, the offset applied to the InMemoryOrderModuleList entry is different. This time we have 0x50, which retrieves LDR_DATA_TABLE_ENTRY.BaseDllName.Buffer (simply the name of the current DLL). So the malware is looking for a specific DLL.

To find out which DLL the malware is looking for, I'll take the same approach as before. First, let's get all of Firefox's and Windows's DLLs:
import os
import sys

files = []
for dirpath in sys.argv[1:]:
    # Walk each directory recursively and keep only the DLL file names
    for _root, _dirs, filenames in os.walk(dirpath):
        for name in filenames:
            if name.lower().endswith(".dll"):
                files.append(name)

for f in files:
    print(f)
~ $ python3 ./get_all_basedllNames.py ./firefox/ ./firefox/x8664_windows/Windows/System32 |sort -u > firefox_all_dllBaseNames
Then, let's make a second C program to handle DLL names:
int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "need 2 arguments\n");
        fprintf(stderr, "usage : %s <modules_file> <validators_file>\n", argv[0]);
        return EXIT_FAILURE;
    }
    const char *modules_file = argv[1];
    const char *validators_file = argv[2];
    [...SNIP...]
    /* --- Process modules file and check against validators --- */
    FILE *mf = fopen(modules_file, "r");
    if (!mf) {
        fprintf(stderr, "Could not open modules file '%s': %s\n", modules_file,
                strerror(errno));
        free(validators);
        return EXIT_FAILURE;
    }
    while (fgets(line, sizeof(line), mf)) {
        trim_newline_and_space(line);
        if (line[0] == '\0')
            continue;
        size_t len = strlen(line);
        /* Length of the name without its 4-character ".dll" extension */
        size_t stem_len = len > 4 ? len - 4 : 0;

        /* Variant 1: lowercase name, extension left as-is */
        char lineLower[len + 1];
        memcpy(lineLower, line, len + 1);
        for (size_t i = 0; i < stem_len; i++)
            lineLower[i] = (char)tolower((unsigned char)line[i]);
        uint32_t crcLower = calculate_module_crc(lineLower);

        /* Variant 2: uppercase name, extension left as-is */
        char lineUpper[len + 1];
        memcpy(lineUpper, line, len + 1);
        for (size_t i = 0; i < stem_len; i++)
            lineUpper[i] = (char)toupper((unsigned char)line[i]);
        uint32_t crcUpper = calculate_module_crc(lineUpper);

        /* Variant 3: everything uppercase, extension included */
        char lineFullUpper[len + 1];
        memcpy(lineFullUpper, line, len + 1);
        for (size_t i = 0; i < len; i++)
            lineFullUpper[i] = (char)toupper((unsigned char)line[i]);
        uint32_t crcFullUpper = calculate_module_crc(lineFullUpper);

        for (size_t i = 0; i < validators_count; ++i) {
            uint32_t v = validators[i];
            if (v == crcLower) {
                printf("MATCH: %s -> 0x%08x matches validator 0x%08x (index %zu)\n",
                       lineLower, (unsigned)crcLower, (unsigned)v, i);
            }
            if (v == crcUpper) {
                printf("MATCH: %s -> 0x%08x matches validator 0x%08x (index %zu)\n",
                       lineUpper, (unsigned)crcUpper, (unsigned)v, i);
            }
            if (v == crcFullUpper) {
                printf("MATCH: %s -> 0x%08x matches validator 0x%08x (index %zu)\n",
                       lineFullUpper, (unsigned)crcFullUpper, (unsigned)v, i);
            }
        }
    }
    fclose(mf);
    free(validators);
    return EXIT_SUCCESS;
}
~ $ gcc -o calculate_module_hashes calculate_module_hashes.c -msse4.2
~ $ ./calculate_module_hashes ./firefox_all_dllBaseNames ./validators
MATCH: nss3.dll -> 0x779e27f4 matches validator 0x779e27f4 (index 0)
MATCH: NSS3.dll -> 0xc231d75d matches validator 0xc231d75d (index 1)
MATCH: NSS3.DLL -> 0x7c434195 matches validator 0x7c434195 (index 2)
And we've got our DLL! The function then stores nss3.DllBase in R14. Let's continue with the following function.

This function calls GetProcAddress(nss3.DllBase, "PR_Write") and copies the code of that function into the previously allocated buffer. PR_Write is an NSPR function used internally by Firefox to handle I/O on files and sockets. The function also calls VirtualProtect(AllocatedBuffer, PAGE_EXECUTE_READ) and VirtualProtect(pPR_Write, PAGE_EXECUTE_READWRITE).
The fourth step : Patching the nss3.PR_Write routine

The end of the payload patches the first bytes of the PR_Write function. To fully understand the result of this patching, I will emulate this part of the shellcode with Qiling, using lief to extract PR_Write from nss3.dll.
from qiling import Qiling
from qiling.const import QL_ARCH, QL_OS, QL_VERBOSE
from unicorn import UC_PROT_ALL
from capstone import Cs, CS_ARCH_X86, CS_MODE_64
import lief

def get_PR_Write():
    # Locate PR_Write in the export table and read 0x1000 bytes of its code
    pe: lief.PE.Binary = lief.PE.parse("./firefox/nss3.dll")
    pPR_Write_rva = pe.get_export().find_entry("PR_Write").address
    pPR_Write = pe.rva_to_offset(pPR_Write_rva)
    with open("./firefox/nss3.dll", "rb") as nss3:
        nss3.seek(pPR_Write)
        return nss3.read(0x1000)

with open("./shellcode.bin", "rb") as f:
    SHELLCODE = f.read()
MARKER = (0x3300730073006e).to_bytes(8, "big")
PR_WRITE_BUFFER = get_PR_Write()
PR_WRITE_ADDRESS = 0x40000000
DEST_BUFFER_ADDRESS = 0x30000000
EMU_START = 0x140000544
EMU_STOP = 0x140000592
cs = Cs(CS_ARCH_X86, CS_MODE_64)

def GetAllocatedBuffer_hook(ql: Qiling) -> None:
    print("Hooking rsp request to get malware allocated buffer")
    ql.arch.regs.rax = DEST_BUFFER_ADDRESS
    # patch rip to the next instruction
    code = ql.mem.read(ql.arch.regs.rip, 16)
    insns = list(cs.disasm(code, ql.arch.regs.rip, count=1))
    ql.arch.regs.rip += insns[0].size if insns else 1

def stop_emu_hook(ql: Qiling) -> None:
    print("Hooking Stop emulation")
    ql.emu_stop()

if __name__ == "__main__":
    ql = Qiling(code=SHELLCODE, ostype=QL_OS.WINDOWS, archtype=QL_ARCH.X8664, rootfs="./firefox/x8664_windows/", verbose=QL_VERBOSE.DISASM)
    # Emulates the pointer returned by GetProcAddr(nss3_DllBase, 'PR_Write');
    ql.mem.map(PR_WRITE_ADDRESS, len(PR_WRITE_BUFFER), UC_PROT_ALL, info="Emulates the pointer returned by GetProcAddr(nss3_DllBase, 'PR_Write');")
    ql.mem.write(PR_WRITE_ADDRESS, PR_WRITE_BUFFER)
    ql.arch.regs.r15 = PR_WRITE_ADDRESS
    # Emulates the pointer returned by VirtualAlloc()
    ql.mem.map(DEST_BUFFER_ADDRESS, len(PR_WRITE_BUFFER), UC_PROT_ALL, info="Emulates the pointer returned by VirtualAlloc")
    ql.mem.write(DEST_BUFFER_ADDRESS, PR_WRITE_BUFFER)
    # Hook the stack request
    ql.hook_address(GetAllocatedBuffer_hook, 0x140000568)
    ql.hook_address(stop_emu_hook, 0x14000058e)
    ql.run(begin=EMU_START, end=EMU_STOP)
    # Dump the patched PR_Write for inspection
    with open("patched_PR_Write.bin", "wb") as f:
        f.write(ql.mem.read(PR_WRITE_ADDRESS, 0x1000))
[=] 0000000140000544 [[shellcode] + 0x000544] 41 c7 07 41 55 41 54 mov dword ptr [r15], 0x54415541
[=] 000000014000054b [[shellcode] + 0x00054b] 66 41 c7 47 04 49 bc mov word ptr [r15 + 4], 0xbc49
[=] 0000000140000552 [[shellcode] + 0x000552] e8 05 00 00 00 call 0x14000055c
[=] 000000014000055c [[shellcode] + 0x00055c] 58 pop rax
[=] 000000014000055d [[shellcode] + 0x00055d] 49 89 47 06 mov qword ptr [r15 + 6], rax
[=] 0000000140000561 [[shellcode] + 0x000561] 66 41 c7 47 0e 49 bd mov word ptr [r15 + 0xe], 0xbd49
[=] 0000000140000568 [[shellcode] + 0x000568] 48 8b 04 24 mov rax, qword ptr [rsp]
Hooking rsp request to get malware allocated buffer
[=] 000000014000056c [[shellcode] + 0x00056c] 49 89 47 10 mov qword ptr [r15 + 0x10], rax
[=] 0000000140000570 [[shellcode] + 0x000570] 41 c7 47 18 41 ff e4 cc mov dword ptr [r15 + 0x18], 0xcce4ff41
[=] 0000000140000578 [[shellcode] + 0x000578] 4c 89 f9 mov rcx, r15
[=] 000000014000057b [[shellcode] + 0x00057b] 48 c7 c2 00 10 00 00 mov rdx, 0x1000
[=] 0000000140000582 [[shellcode] + 0x000582] 49 c7 c0 20 00 00 00 mov r8, 0x20
[=] 0000000140000589 [[shellcode] + 0x000589] 4c 8d 4c 24 10 lea r9, [rsp + 0x10]
[=] 000000014000058e [[shellcode] + 0x00058e] 48 81 ec 00 01 00 00 sub rsp, 0x100
Hooking Stop emulation
Let's now inspect the patched function in Ghidra:

The function preamble is simple: it saves some register context and jumps into a subroutine of the shellcode. Let's have a look at that one too.
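The 28 bytes written by the shellcode can also be decoded by hand from the emulation trace: 41 55 / 41 54 are push r13 / push r12, 49 BC and 49 BD are movabs r12 / movabs r13 with a 64-bit immediate, and 41 FF E4 CC is jmp r12 followed by an int3. The small parser below is my own helper (not part of the sample) that checks this layout and extracts the two pointers. The example bytes are rebuilt from the trace values, namely the return address of the call at 0x140000552 and the fake buffer address used in the emulation script.

```python
import struct

PUSHES = bytes.fromhex("41554154")      # push r13 ; push r12
MOVABS_R12 = bytes.fromhex("49bc")      # movabs r12, imm64
MOVABS_R13 = bytes.fromhex("49bd")      # movabs r13, imm64
JMP_R12_INT3 = bytes.fromhex("41ffe4cc")  # jmp r12 ; int3

def parse_prologue(patch: bytes):
    """Validate the 28-byte stub and return (r12_target, r13_value)."""
    if (patch[0:4] != PUSHES or patch[4:6] != MOVABS_R12
            or patch[14:16] != MOVABS_R13 or patch[24:28] != JMP_R12_INT3):
        raise ValueError("unexpected prologue layout")
    (r12,) = struct.unpack_from("<Q", patch, 6)    # where the stub jumps
    (r13,) = struct.unpack_from("<Q", patch, 16)   # copy of the original code
    return r12, r13

# Rebuilt from the trace: rax after "pop rax" is 0x140000552 + 5.
EXAMPLE_PATCH = (PUSHES + MOVABS_R12 + struct.pack("<Q", 0x140000557)
                 + MOVABS_R13 + struct.pack("<Q", 0x30000000) + JMP_R12_INT3)
```

Running parse_prologue on the first 28 bytes of patched_PR_Write.bin should give the same pair in an identical emulation setup, with r12 pointing into the shellcode and r13 pointing at the buffer holding the original PR_Write code.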
The fifth step : The Malware Hook

The hook first checks that the buffer is longer than 4 bytes and that it starts with POST. I guess this is a way to hook only POST requests and steal usernames, passwords, cookies, and such. It then uses the first function we analyzed to retrieve the GetProcAddress function.
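Expressed in Python, the filter boils down to the following sketch. The function name is mine, and the exact comparison the shellcode performs may differ slightly from this (for instance in how the length is compared).

```python
def should_intercept(buf: bytes) -> bool:
    """Mimic the hook's filter: more than 4 bytes, starting with 'POST'."""
    return len(buf) > 4 and buf.startswith(b"POST")

# A form submission is picked up, a GET request is ignored.
example_post = b"POST /login HTTP/1.1\r\n"
example_get = b"GET / HTTP/1.1\r\n"
```

Everything that does not pass this test is presumably handed to the original PR_Write code untouched.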
The rest of the code uses PEB walking and CRC32 "hashing" to locate a specific DLL, but I couldn't find which one using static analysis. In the meantime, the shellcode also builds strings on the stack.
1400170c2 158 MOV dword ptr [RSP + Stack[-0xe6]],"atSA"
...
1400170de 0 MOV word ptr [RSP + Stack[0x70]],"SW"
...
140017105 0 MOV word ptr [RSP + Stack[0x76]],"tr"
...
140017115 0 MOV dword ptr [RSP + Stack[0x78]],00h,00h,"pu"
This gives the string WSAStartup, a function of the Winsock API. It initializes Winsock for the calling process and fills in a WSADATA struct, which is required before using Windows sockets.
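Ghidra displays the immediates as characters in reading order of the value, so each chunk has to be flipped to little-endian before being laid out in memory. The sketch below reassembles the string that way. The offsets 0x70, 0x76 and 0x78 come from the listing, while 0x72 for the "atSA" store is my assumption, since the listing shows that store relative to a different stack depth.

```python
import struct

# (stack offset, bytes as they land in memory)
STORES = [
    (0x72, struct.pack("<I", 0x61745341)),  # shown as "atSA" -> b"ASta" (offset assumed)
    (0x70, struct.pack("<H", 0x5357)),      # shown as "SW"   -> b"WS"
    (0x76, struct.pack("<H", 0x7472)),      # shown as "tr"   -> b"rt"
    (0x78, struct.pack("<I", 0x00007075)),  # shown as 00,00,"pu" -> b"up\0\0"
]

def rebuild_stack_string(stores, base=0x70, size=12) -> str:
    """Replay the stores into a buffer and read the NUL-terminated string."""
    buf = bytearray(size)
    for offset, data in stores:
        start = offset - base
        buf[start:start + len(data)] = data
    return bytes(buf).split(b"\x00", 1)[0].decode("ascii")

if __name__ == "__main__":
    print(rebuild_stack_string(STORES))
```

The order of the stores does not matter, only their offsets do, which is what makes these strings annoying to spot in a plain strings dump.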

The rest of the function effectively creates an IPv4 UDP socket:
GetProcAddr(WinSock.DllBase, "WSAStartup")
Winsock.WSAStartup("2.2", &wsaData)
GetProcAddr(WinSock.DllBase, "socket")
WinSock.socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)
GetProcAddr(WinSock.DllBase, "sendto")
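For readers more at home with the BSD socket API, the same sequence looks roughly like this in Python. This is only an analogy: Python handles the Winsock initialization itself, and the helper names as well as any host or port passed to them are placeholders, since the destination used by the malware is not shown here.

```python
import socket

def open_udp_socket() -> socket.socket:
    """Equivalent of socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)."""
    return socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)

def send_datagram(sock: socket.socket, payload: bytes, host: str, port: int) -> int:
    """Equivalent of sendto(): fire one datagram, no connection needed."""
    return sock.sendto(payload, (host, port))
```

UDP needs no handshake, so a single sendto is enough to push the captured buffer out, which fits a hook that must return quickly to its caller.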
And here is the last function :)

The function retrieves some data on the stack (in place of RIP), and applies some substitutions to it (XOR, ROL, and ADD). Since all these substitutions are reversible, let's deobfuscate this data:
package main

import "fmt"

// rol8 rotates b left by n bits (0 < n < 8).
func rol8(b byte, n uint) byte {
	return (b << n) | (b >> (8 - n))
}

func main() {
	encBuf := []byte{
		0x54, 0xd5, 0x13, 0x3a, 0x41,
		0x99, 0xd4, 0xb6, 0x93, 0x74,
		0x19, 0x41, 0x97, 0x61, 0x5a,
		0xb6, 0x54, 0x61, 0x61, 0x38,
		0x81, 0x98, 0xb7, 0xb6, 0xf4,
		0xe1, 0xd8, 0xb9, 0x77, 0x19,
		0xf7, 0xfa,
	}
	for _, e := range encBuf {
		// Add 0xED (wrapping on 8 bits), rotate left by 3, then XOR with 0x42
		dec := rol8(e+0xED, 3) ^ 0x42
		fmt.Print(string(rune(dec)))
	}
	fmt.Println()
}
And we've got our flag ;)
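As a sanity check that the transformation really is a bijection on bytes, here is a small Python sketch of the decoding step together with its inverse. The encode_byte function is my own reconstruction of what an encoder would have to do, not code taken from the sample.

```python
def rol8(b: int, n: int) -> int:
    """Rotate an 8-bit value left by n bits."""
    return ((b << n) | (b >> (8 - n))) & 0xFF

def ror8(b: int, n: int) -> int:
    """Rotate an 8-bit value right by n bits."""
    return ((b >> n) | (b << (8 - n))) & 0xFF

def decode_byte(e: int) -> int:
    """ADD 0xED (mod 256), ROL 3, XOR 0x42."""
    return rol8((e + 0xED) & 0xFF, 3) ^ 0x42

def encode_byte(p: int) -> int:
    """Inverse of decode_byte: XOR 0x42, ROR 3, SUB 0xED (mod 256)."""
    return (ror8(p ^ 0x42, 3) - 0xED) & 0xFF

# Every byte value survives an encode/decode round trip.
ROUND_TRIP_OK = all(decode_byte(encode_byte(p)) == p for p in range(256))
```

Because each of the three steps is invertible on its own, chaining them in reverse order undoes the whole thing, and no information about the plaintext is lost.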
References
The Windows bible
- https://www.geoffchappell.com
Qiling
- https://docs.qiling.io/en/latest/
- https://n1ght-w0lf.github.io/tutorials/qiling-for-malware-analysis-part-2/
Process injection
- https://syrion.me/rustware-part-1-shellcode-process-injection-development/
PEB Walking
- https://www.m0n1x90.dev/blog/peb-teb-eat
- https://yelhamer.github.io/posts/Resolving-Windows-API-Functions-via-the-PEB_LDR_DATA-Structure/
- https://0xrick.github.io/win-internals/pe3/
- https://0xrick.github.io/win-internals/pe4/
- https://stackoverflow.com/questions/68185603/how-image-dos-header-works
- https://dev.to/wireless90/exploring-the-export-table-windows-pe-internals-4l47
- https://sachiel-archangel.medium.com/how-to-analyze-dll-address-acquisition-process-593bc8a54988
Firefox PR_Write function
- https://firefox-source-docs.mozilla.org/nspr/reference/pr_write.html