Reverse Engineering Core Principles Series No.18 -- TLS Callback Function

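This section relies on knowledge of the PE header; a PE structure diagram is attached:

Reference: the book 《逆向工程核心原理》 (Reverse Engineering Core Principles). The sample programs also come from the book. If you have the time and energy, it is worth buying and reading.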

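TLS (Thread Local Storage) is a storage area private to each thread. With TLS, ==a thread can use or modify the process's global or static data independently, as if it were one of its own local variables==.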

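Sample program: HelloTls.exe (32-bit)

It simply shows a message box.

Loading the program in OD (OllyDbg) reveals TLS code that runs before the EP code:

It checks whether the current process is being debugged; if so, it shows a "Debugger Detected!" message box and then terminates the program.

This shows that TLS can serve as an anti-debugging technique in reverse engineering.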



IMAGE_DATA_DIRECTORY[9]

If TLS is enabled when the program is built, a TLS table entry is set in the PE header (IMAGE_NT_HEADERS -> IMAGE_OPTIONAL_HEADER -> IMAGE_DATA_DIRECTORY[9]).

// Structure prototypes
typedef struct _IMAGE_TLS_DIRECTORY64 {
    ULONGLONG StartAddressOfRawData;
    ULONGLONG EndAddressOfRawData;
    PDWORD AddressOfIndex;
    PIMAGE_TLS_CALLBACK *AddressOfCallBacks;
    DWORD SizeOfZeroFill;
    DWORD Characteristics;
} IMAGE_TLS_DIRECTORY64;
typedef IMAGE_TLS_DIRECTORY64 * PIMAGE_TLS_DIRECTORY64;

typedef struct _IMAGE_TLS_DIRECTORY32 {
    DWORD StartAddressOfRawData;
    DWORD EndAddressOfRawData;
    PDWORD AddressOfIndex;
    PIMAGE_TLS_CALLBACK *AddressOfCallBacks;
    DWORD SizeOfZeroFill;
    DWORD Characteristics;
} IMAGE_TLS_DIRECTORY32;
typedef IMAGE_TLS_DIRECTORY32 * PIMAGE_TLS_DIRECTORY32;

#ifdef _WIN64
typedef IMAGE_TLS_DIRECTORY64 IMAGE_TLS_DIRECTORY;
typedef PIMAGE_TLS_DIRECTORY64 PIMAGE_TLS_DIRECTORY;
#else
typedef IMAGE_TLS_DIRECTORY32 IMAGE_TLS_DIRECTORY;
typedef PIMAGE_TLS_DIRECTORY32 PIMAGE_TLS_DIRECTORY;
#endif

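The sample program is 32-bit. The TLS structure as shown in PEview:

The key field is AddressOfCallBacks, which holds the address of an array of TLS callback function addresses. Here it is 0x00408114 (a VA; assuming the usual ImageBase of 0x00400000, the RVA is 0x8114), located in the .rdata section, which converts to RAW offset 0x6714. The sample program registers only one TLS function, at 0x00401000. More TLS functions can be added by modifying the PE header.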
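The VA -> RVA -> RAW conversion used above can be sketched in portable C. This is only an illustration: `section_t`, `va_to_rva`, and `rva_to_raw` are hypothetical names, and the `.rdata` header values in the usage note (VirtualAddress 0x8000, VirtualSize 0x2000, PointerToRawData 0x6600) are assumptions chosen to agree with the 0x8114 -> 0x6714 mapping, not values read from the sample file.

```c
#include <stdint.h>

/* Minimal stand-in for the IMAGE_SECTION_HEADER fields needed here
   (hypothetical type, not the winnt.h definition). */
typedef struct {
    uint32_t virtual_address;     /* RVA where the section starts in memory */
    uint32_t virtual_size;        /* size of the section in memory */
    uint32_t pointer_to_raw_data; /* file offset where the section starts */
} section_t;

/* VA -> RVA: subtract the image base. */
uint32_t va_to_rva(uint32_t va, uint32_t image_base)
{
    return va - image_base;
}

/* RVA -> RAW: returns the file offset, or UINT32_MAX when the RVA
   does not fall inside the given section. */
uint32_t rva_to_raw(uint32_t rva, const section_t *sec)
{
    if (rva < sec->virtual_address ||
        rva >= sec->virtual_address + sec->virtual_size) {
        return UINT32_MAX;
    }
    return rva - sec->virtual_address + sec->pointer_to_raw_data;
}
```

With the assumed section values, `rva_to_raw(va_to_rva(0x00408114, 0x00400000), &rdata)` evaluates to 0x6714, matching the offset quoted in the text.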


TLS callback function prototype

typedef VOID
(NTAPI *PIMAGE_TLS_CALLBACK) (
    PVOID DllHandle,  // module handle; identifies the module the callback is invoked for
    DWORD Reason,     // why the callback was invoked; one of four values
    PVOID Reserved
);
Value (code): Meaning
DLL_PROCESS_ATTACH (1): The DLL is being loaded into the virtual address space of the current process as a result of the process starting up or as a result of a call to LoadLibrary. DLLs can use this opportunity to initialize any instance data or to use the TlsAlloc function to allocate a thread local storage (TLS) index. The lpReserved parameter indicates whether the DLL is being loaded statically or dynamically.
DLL_PROCESS_DETACH (0): The DLL is being unloaded from the virtual address space of the calling process because it was loaded unsuccessfully or the reference count has reached zero (the process has either terminated or called FreeLibrary one time for each time it called LoadLibrary). The lpReserved parameter indicates whether the DLL is being unloaded as a result of a FreeLibrary call, a failure to load, or process termination. The DLL can use this opportunity to call the TlsFree function to free any TLS indices allocated by using TlsAlloc and to free any thread local data. Note that the thread that receives the DLL_PROCESS_DETACH notification is not necessarily the same thread that received the DLL_PROCESS_ATTACH notification.
DLL_THREAD_ATTACH (2): The current process is creating a new thread. When this occurs, the system calls the entry-point function of all DLLs currently attached to the process. The call is made in the context of the new thread. DLLs can use this opportunity to initialize a TLS slot for the thread. A thread calling the DLL entry-point function with DLL_PROCESS_ATTACH does not call the DLL entry-point function with DLL_THREAD_ATTACH. Note that a DLL's entry-point function is called with this value only by threads created after the DLL is loaded by the process. When a DLL is loaded using LoadLibrary, existing threads do not call the entry-point function of the newly loaded DLL.
DLL_THREAD_DETACH (3): A thread is exiting cleanly. If the DLL has stored a pointer to allocated memory in a TLS slot, it should use this opportunity to free the memory. The system calls the entry-point function of all currently loaded DLLs with this value. The call is made in the context of the exiting thread.
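The four reason codes listed above can be mapped to readable names with a small helper, which is handy when logging callback invocations. This is a minimal sketch: the `REASON_*` constants and `tls_reason_name` are hypothetical names introduced here, and the numeric values are the ones given in the table.

```c
#include <stdint.h>

/* Reason codes, using the numeric values from the table above. */
enum {
    REASON_PROCESS_DETACH = 0,
    REASON_PROCESS_ATTACH = 1,
    REASON_THREAD_ATTACH  = 2,
    REASON_THREAD_DETACH  = 3
};

/* Hypothetical helper: map a Reason value to its symbolic name. */
const char *tls_reason_name(uint32_t reason)
{
    switch (reason) {
    case REASON_PROCESS_DETACH: return "DLL_PROCESS_DETACH";
    case REASON_PROCESS_ATTACH: return "DLL_PROCESS_ATTACH";
    case REASON_THREAD_ATTACH:  return "DLL_THREAD_ATTACH";
    case REASON_THREAD_DETACH:  return "DLL_THREAD_DETACH";
    default:                    return "UNKNOWN";
    }
}
```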

A sample program, TlsTest.exe (from the book), shows the order in which TLS callbacks are invoked for the different reasons.

Source code

#include <windows.h>
#include <string.h>

#pragma comment(linker, "/INCLUDE:__tls_used")

void print_console(char* szMsg)
{
    HANDLE hStdout = GetStdHandle(STD_OUTPUT_HANDLE);
    // WriteConsoleA is used instead of printf because, with certain compiler options,
    // a TLS callback that runs before the threads may hit a Run-Time Error
    WriteConsoleA(hStdout, szMsg, (DWORD)strlen(szMsg), NULL, NULL);
}

void NTAPI TLS_CALLBACK1(PVOID DllHandle, DWORD Reason, PVOID Reserved)
{
    char szMsg[80] = {0,};
    wsprintfA(szMsg, "TLS_CALLBACK1() : DllHandle = %X, Reason = %d\n", DllHandle, Reason);
    print_console(szMsg);
}

void NTAPI TLS_CALLBACK2(PVOID DllHandle, DWORD Reason, PVOID Reserved)
{
    char szMsg[80] = {0,};
    wsprintfA(szMsg, "TLS_CALLBACK2() : DllHandle = %X, Reason = %d\n", DllHandle, Reason);
    print_console(szMsg);
}

// Register both callbacks; the array is terminated by a NULL entry
#pragma data_seg(".CRT$XLX")
PIMAGE_TLS_CALLBACK pTLS_CALLBACKs[] = { TLS_CALLBACK1, TLS_CALLBACK2, 0 };
#pragma data_seg()

DWORD WINAPI ThreadProc(LPVOID lParam)
{
    print_console("ThreadProc() start\n");

    print_console("ThreadProc() end\n");

    return 0;
}

int main(void)
{
    HANDLE hThread = NULL;

    print_console("main() start\n");

    hThread = CreateThread(NULL, 0, ThreadProc, NULL, 0, NULL);
    WaitForSingleObject(hThread, 60*1000);
    CloseHandle(hThread);

    print_console("main() end\n");

    return 0;
}

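The program therefore executes in this order:

  1. The main thread starts, which triggers DLL_PROCESS_ATTACH, so callbacks 1 and 2 are invoked and print their output.
  2. main then begins executing and prints main() start.
  3. main calls CreateThread, which triggers DLL_THREAD_ATTACH, so callbacks 1 and 2 are invoked again and print their output (the reason differs from the first invocation: the code was 1 there and is 2 here).
  4. The ThreadProc child thread then runs its body and prints ThreadProc() start\n and ThreadProc() end\n.
  5. When the child thread returns it terminates, triggering DLL_THREAD_DETACH (code 3); callbacks 1 and 2 are invoked again.
  6. Finally main prints main() end\n and returns, triggering DLL_PROCESS_DETACH (code 0); callbacks 1 and 2 are invoked once more.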

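Debugging callback functions

Using HelloTls.exe as the example: press Alt + O to open the debugging options and set the point where OD pauses after loading to the system breakpoint.

Load the program:

Set a breakpoint at the TLS callback address found earlier:

Run the program. OD stops at the breakpoint and the TLS function can be debugged from there. With several TLS functions, inspect each one and set a breakpoint on each: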


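Adding a TLS callback function by modifying the PE header

  • Prepare space for the code and data to be written (the original TLS contents have to be relocated)
    • Option 1: use the blank area at the end of a section (because of section alignment, there may be enough unused space once the section is mapped into memory)
    • Option 2: enlarge the last section (affects the last section's header)
    • Option 3: append a new section at the end (affects PE fields such as NumberOfSections)

Option 2 is used here. First look at the alignment rules of the PE file:

At the very end of the PE file, insert 0x200 bytes (the FileAlignment size, which is the minimum alignment unit).

The section header of the last section in the PE file: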

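Increase that section's SizeOfRawData by 0x200 bytes. PointerToRawData is 0x00009000 and SizeOfRawData becomes 0x400, both of which satisfy the file alignment rule. The .rsrc section's VirtualSize is 0x1B4; even after the expansion it would be 0x3B4, which is below SectionAlignment (0x1000), so VirtualSize does not need to be changed.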
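The alignment reasoning above can be checked with two small helpers. This is a sketch under the values stated in the text (FileAlignment 0x200, SectionAlignment 0x1000); `align_up` and `is_aligned` are hypothetical names, and both assume the alignment is a power of two.

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment
   (alignment must be a power of two). */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1u) & ~(alignment - 1u);
}

/* Nonzero when value is a multiple of alignment
   (alignment must be a power of two). */
int is_aligned(uint32_t value, uint32_t alignment)
{
    return (value & (alignment - 1u)) == 0u;
}
```

For this file, `is_aligned(0x9000, 0x200)` and `is_aligned(0x400, 0x200)` both hold, and `align_up(0x1B4, 0x1000)` equals `align_up(0x3B4, 0x1000)`, so the section occupies the same amount of memory before and after the expansion.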

Besides SizeOfRawData, the Characteristics of the last section (.rsrc) must also be changed to add write and execute permissions:

IMAGE_SCN_MEM_WRITE (0x80000000): The section can be written to.
IMAGE_SCN_MEM_EXECUTE (0x20000000): The section can be executed as code.

The modified Characteristics value is 0xE0000040.
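The new Characteristics value is a bitwise OR of the existing flags with the two added ones. The sketch below assumes the original .rsrc value was 0x40000040 (initialized data plus read), which is inferred from the result 0xE0000040 rather than stated in the text; the `SCN_*` macro names and `add_write_execute` are hypothetical.

```c
#include <stdint.h>

/* Section characteristic flags used in this step. */
#define SCN_CNT_INITIALIZED_DATA 0x00000040u
#define SCN_MEM_EXECUTE          0x20000000u
#define SCN_MEM_READ             0x40000000u
#define SCN_MEM_WRITE            0x80000000u

/* Add write and execute permission to an existing Characteristics value. */
uint32_t add_write_execute(uint32_t characteristics)
{
    return characteristics | SCN_MEM_WRITE | SCN_MEM_EXECUTE;
}
```

Under that assumption, `add_write_execute(SCN_CNT_INITIALIZED_DATA | SCN_MEM_READ)` gives 0xE0000040.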

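Verify the changes above:

The extended space is at RAW = 0x9200, Size = 0x200, RVA = 0xC200.

The original program did not use TLS, so the entry is empty (if it had contained data, that data would be copied into the new space). To add TLS, the entry must be filled in: the TLS table is at RAW = 0x9200, RVA = 0xC200, and Size is the size of the IMAGE_TLS_DIRECTORY structure written there, which is 0x18 for the 32-bit structure.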

// This structure also has an x64 version; the sample is 32-bit, where it occupies 0x18 (24) bytes
typedef struct _IMAGE_TLS_DIRECTORY32 {
    DWORD StartAddressOfRawData;
    DWORD EndAddressOfRawData;
    PDWORD AddressOfIndex;
    PIMAGE_TLS_CALLBACK *AddressOfCallBacks;
    DWORD SizeOfZeroFill;
    DWORD Characteristics;
} IMAGE_TLS_DIRECTORY32;
typedef IMAGE_TLS_DIRECTORY32 * PIMAGE_TLS_DIRECTORY32;
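The 0x18 size can be confirmed on any host by modelling the structure with fixed-width fields. In this sketch `tls_directory32_t` and the two helper functions are hypothetical names; every field is declared as `uint32_t` because pointers in a 32-bit image are 4 bytes wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for IMAGE_TLS_DIRECTORY32 with every field fixed at 32 bits,
   so the layout is the same regardless of the host's pointer size. */
typedef struct {
    uint32_t StartAddressOfRawData;
    uint32_t EndAddressOfRawData;
    uint32_t AddressOfIndex;
    uint32_t AddressOfCallBacks;
    uint32_t SizeOfZeroFill;
    uint32_t Characteristics;
} tls_directory32_t;

/* Size of the 32-bit TLS directory: six 4-byte fields. */
size_t tls_directory32_size(void)
{
    return sizeof(tls_directory32_t);
}

/* Offset of AddressOfCallBacks inside the structure. */
size_t tls_callbacks_field_offset(void)
{
    return offsetof(tls_directory32_t, AddressOfCallBacks);
}
```

Six 4-byte fields give 24 bytes (0x18), and AddressOfCallBacks sits at offset 0x0C, so with the directory at RAW 0x9200 that field is at RAW 0x920C.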

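Modify IMAGE_OPTIONAL_HEADER -> DATA_DIRECTORY[9]

Verify the change:

The last step is to write the TLS table contents into the newly created space.

AddressOfCallBacks = 0x0040C224 (VA), i.e. RAW 0x9224, points to the array of TLS function addresses, which ends with a NULL DWORD. Its first entry is 0x0040C230, so the first TLS function is at RAW 0x9230, where the function's machine code lives. The bytes C2 0C00 are the instruction RETN 0C, which does nothing and returns immediately. (PS: RETN 0C is used rather than a plain RETN so that the stack is cleaned up, because the function takes three parameters; see the prototype.)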
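The bytes described above can be laid out in a buffer that stands for the extended area (RAW 0x9200 to 0x93FF). This is a sketch only: `put_u32_le` and `build_tls_patch` are hypothetical helpers, the TLS directory is assumed to start at offset 0 of the area, and the remaining IMAGE_TLS_DIRECTORY32 fields are left at zero here even though they still need valid values for the program to load.

```c
#include <stdint.h>
#include <string.h>

#define PATCH_SIZE 0x200u /* extended area: RAW 0x9200.., VA 0x0040C200.. */

/* Store a 32-bit value in little-endian byte order. */
void put_u32_le(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v);
    dst[1] = (uint8_t)(v >> 8);
    dst[2] = (uint8_t)(v >> 16);
    dst[3] = (uint8_t)(v >> 24);
}

/* Fill only the parts described in the text; all other bytes stay zero. */
void build_tls_patch(uint8_t patch[PATCH_SIZE])
{
    static const uint8_t retn_0c[] = { 0xC2, 0x0C, 0x00 }; /* RETN 0xC */

    memset(patch, 0, PATCH_SIZE);
    put_u32_le(patch + 0x0C, 0x0040C224u); /* AddressOfCallBacks field */
    put_u32_le(patch + 0x24, 0x0040C230u); /* callback array, entry 0 */
    put_u32_le(patch + 0x28, 0x00000000u); /* NULL DWORD terminator */
    memcpy(patch + 0x30, retn_0c, sizeof retn_0c); /* callback body */
}
```

The operand 0xC is 3 parameters times 4 bytes, which is why the stub uses RETN 0C to pop the stdcall arguments.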

typedef VOID
(NTAPI *PIMAGE_TLS_CALLBACK) (
    PVOID DllHandle,
    DWORD Reason,
    PVOID Reserved
);

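The edit shown in the HxD screenshot above did not let the program run properly, probably because some fields of _IMAGE_TLS_DIRECTORY32 were not set correctly. Searching for the meaning of the other fields and the values they should hold turned up the following (the image source is given below the image):

After modifying accordingly:

The program now runs normally. Load it in OD and set a breakpoint at the TLS function that was written:

Run the program in OD; the TLS function is executed:

With that verified, all that remains is to change the body of the TLS function. For now it executes RETN 0xC and does nothing useful.

Examining the arguments on the stack just before the TLS function is called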

typedef VOID
(NTAPI *PIMAGE_TLS_CALLBACK) (
    PVOID DllHandle,
    DWORD Reason,
    PVOID Reserved
);

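The second argument tells why the TLS function was called. The author reads it here as ss:[ebp+8]; note that at function entry, before any stack-frame prologue has run, the stdcall layout places the return address at [esp], DllHandle at [esp+4], and Reason at [esp+8]. Readers unfamiliar with this are advised to read up on the call stack layout.

Writing the TLS code: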

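0040C230    837D 08 01        CMP DWORD PTR SS:[EBP+0x8], 0x1          ; check the Reason argument: is this the process-attach call? (as entered by the author; no prologue has set up EBP here, and at entry Reason is at [ESP+8])
0040C234    75 28             JNZ SHORT Hello_Ex.0040C25E              ; not process attach: return without doing anything
0040C236    64:A1 30000000    MOV EAX, DWORD PTR FS:[0x30]             ; get the PEB address
0040C23C    8078 02 00        CMP BYTE PTR DS:[EAX+0x2], 0x0           ; PEB.BeingDebugged: is the process being debugged?
0040C240    74 1C             JE SHORT Hello_Ex.0040C25E               ; not being debugged: return without doing anything
0040C242    6A 00             PUSH 0x0                                 ; reaching here means a debugger is present; set up the MessageBoxA call (uType = 0)
0040C244    68 70C24000       PUSH Hello_Ex.0040C270                   ; address of the message box title string
0040C249    68 80C24000       PUSH Hello_Ex.0040C280                   ; address of the message box text string
0040C24E    6A 00             PUSH 0x0                                 ; hWnd = NULL
0040C250    FF15 E8804000     CALL DWORD PTR DS:[<&USER32.MessageB>    ; user32.MessageBoxA
0040C256    6A 01             PUSH 0x1                                 ; exit code 1
0040C258    FF15 28804000     CALL DWORD PTR DS:[<&KERNEL32.ExitPr>    ; kernel32.ExitProcess
0040C25E    C2 0C00           RETN 0xC                                 ; return and pop the three arguments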

// Function prototype:
int MessageBoxA(
    [in, optional] HWND hWnd,
    [in, optional] LPCSTR lpText,
    [in, optional] LPCSTR lpCaption,
    [in] UINT uType
);
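The decision logic of the TLS code above can be restated in portable C to make the two early exits explicit. This is a sketch, not the callback itself: `tls_should_alert` is a hypothetical function, and `peb` is any readable buffer standing in for the PEB, where offset 2 is the BeingDebugged byte tested by the CMP BYTE PTR [EAX+0x2] instruction.

```c
#include <stdint.h>

#define REASON_PROCESS_ATTACH     1u
#define PEB_BEING_DEBUGGED_OFFSET 2u /* byte tested by CMP BYTE PTR [EAX+0x2], 0 */

/* Returns 1 when the callback would show the message box and exit,
   0 when it would simply return. */
int tls_should_alert(uint32_t reason, const uint8_t *peb)
{
    if (reason != REASON_PROCESS_ATTACH) {
        return 0; /* mirrors the JNZ to RETN */
    }
    if (peb[PEB_BEING_DEBUGGED_OFFSET] == 0) {
        return 0; /* mirrors the JE to RETN */
    }
    return 1;     /* falls through to MessageBoxA and ExitProcess */
}
```

Only the combination of Reason = 1 and a nonzero BeingDebugged byte reaches the MessageBoxA and ExitProcess calls; every other case returns immediately.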

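Save the assembled instructions to an executable: Hello_Ex_modify.exe

With the assembly written, the data it refers to must now be written in the hex view (note that each string ends with a 00 byte):

VA 0x0040C270 -> RAW 0x9270: write the title string: Victory say:

VA 0x0040C280 -> RAW 0x9280: write the message text string: I found Debugger!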
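Because the two strings sit only 0x10 bytes apart, it is worth checking that the title plus its 00 terminator fits before the text begins. The sketch below treats the extended area as a buffer whose offset 0 corresponds to RAW 0x9200; `put_cstring` is a hypothetical helper.

```c
#include <stdint.h>
#include <string.h>

/* Copy a NUL-terminated string into the patch area at `offset`.
   Returns the offset just past the terminator, or 0 when the string
   (including its 00 byte) would not fit before `limit`. */
uint32_t put_cstring(uint8_t *patch, uint32_t offset, uint32_t limit, const char *s)
{
    size_t n = strlen(s) + 1; /* include the 00 terminator */

    if (limit <= offset || n > (size_t)(limit - offset)) {
        return 0;
    }
    memcpy(patch + offset, s, n);
    return offset + (uint32_t)n;
}
```

For example, `put_cstring(patch, 0x70, 0x80, "Victory say:")` writes the title at RAW 0x9270 and confirms it ends before 0x9280, and `put_cstring(patch, 0x80, 0x200, "I found Debugger!")` writes the text.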

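The API calls involved are resolved through the IAT. If the IAT lacked the needed imports, this would be more troublesome; in this example the IAT already contains the functions, so they can be used directly.

Check the strings in OD:

Check the assembly instructions:

Set a breakpoint and run. It did not work as hoped:

The location the argument is read from is wrong, and PEB.BeingDebugged is 0 (the screenshot marks the wrong byte; it should be one byte further along). A debugger was attached, so the value should have been 1, which also looks wrong.

Several different OD builds were tried with the same result; the experiment could not be reproduced as described.

Debugging the book author's version showed the same thing: PEB.BeingDebugged was still wrong, indicating that the program did not detect the attached debugger.

The cause turned out to be third-party OD plugins that hide the debugger on their own; with the original, unmodified OD the field shows the expected value.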

Author: Victory+
Link: https://cvjark.github.io/2022/05/24/逆核系列No-18-TLS-CallBackFunction/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.