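Reverse Engineering Core Principles Series No.1 -- PE Format

In this section

Anyone learning reverse engineering will inevitably run into PE files. Many PE analyzers are available today, but it is still important to learn the PE format by hand, both to understand PE files in depth and because obfuscation techniques that target the format have appeared. This article studies the PE format using notepad.exe (the Notepad program that ships with Windows) as the sample program. Tools used: HxD and notepad.exe.

PS: The PE format differs slightly with bitness: 32-bit executables use PE32 and 64-bit executables use PE32+. Keep this in mind.

Reference: the book Reverse Engineering Core Principles (《逆向工程核心原理》), which I personally consider excellent. Parts of this article are excerpted from it.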


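Main text

The figure below shows notepad.exe loaded in the hex editor HxD. The goal of learning the PE format is to read these hexadecimal byte sequences and understand what each of them means within a PE file.

The PE file format diagram is attached here as well; it will be referred to very often in what follows:

IMAGE_DOS_HEADER

The PE header consists of the DOS header, the DOS stub (optional), the NT header, and the section headers (each describes the attributes of a section, such as its permissions; common sections are .text, .data, and .rsrc).

The DOS header occupies 64 bytes in total. Its members are: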

typedef struct _IMAGE_DOS_HEADER {      // DOS .EXE header
WORD e_magic; // Magic number: the signature of IMAGE_DOS_HEADER, worth memorizing
WORD e_cblp; // Bytes on last page of file
WORD e_cp; // Pages in file
WORD e_crlc; // Relocations
WORD e_cparhdr; // Size of header in paragraphs
WORD e_minalloc; // Minimum extra paragraphs needed
WORD e_maxalloc; // Maximum extra paragraphs needed
WORD e_ss; // Initial (relative) SS value
WORD e_sp; // Initial SP value
WORD e_csum; // Checksum
WORD e_ip; // Initial IP value
WORD e_cs; // Initial (relative) CS value
WORD e_lfarlc; // File address of relocation table
WORD e_ovno; // Overlay number
WORD e_res[4]; // Reserved words
WORD e_oemid; // OEM identifier (for e_oeminfo)
WORD e_oeminfo; // OEM information; e_oemid specific
WORD e_res2[10]; // Reserved words
LONG e_lfanew; // File address of new exe header: another key field, the offset of the NT header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

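_IMAGE_DOS_HEADER.e_magic is the signature that identifies the DOS header. The structure was designed by Mark Zbikowski, so the signature is his initials, MZ (bytes 4D 5A).

_IMAGE_DOS_HEADER.e_lfanew records the offset of the NT header, the next mandatory part of the PE header, measured from the start of the DOS header (that is, from the MZ signature). The offset has to be stored because the optional DOS stub sits between the two headers and its size varies.

In the notepad.exe example, e_lfanew = 0x000000E0, which means the NT header starts at offset 0xE0 from the DOS header: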
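
To make the two key fields concrete, here is a minimal sketch in C of checking e_magic and reading e_lfanew from a file that has already been read into memory. The helper names (read_u16le, read_u32le, dos_header_lfanew) are hypothetical and not part of any Windows API; the offset 0x3C of e_lfanew follows from the member sizes in the structure above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read little-endian integers byte by byte, so the code does not depend on
   the host's endianness or alignment rules. */
static uint16_t read_u16le(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

enum {
    DOS_HEADER_SIZE = 64,     /* sizeof(IMAGE_DOS_HEADER) */
    E_LFANEW_OFFSET = 0x3C,   /* 30 WORDs precede e_lfanew */
    DOS_MAGIC       = 0x5A4D  /* "MZ" read as a little-endian WORD */
};

/* Hypothetical helper: returns 1 and stores e_lfanew in *out when the buffer
   starts with a DOS header carrying the MZ signature, otherwise returns 0. */
int dos_header_lfanew(const uint8_t *file, size_t len, uint32_t *out) {
    assert(file != NULL && out != NULL);
    if (len < DOS_HEADER_SIZE) return 0;
    if (read_u16le(file) != DOS_MAGIC) return 0;
    *out = read_u32le(file + E_LFANEW_OFFSET);
    return 1;
}
```

For the notepad.exe used in this article, such a function would be expected to yield 0xE0.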


DOS stub (optional)

IMAGE_NT_HEADERS
typedef struct _IMAGE_NT_HEADERS64 {		//64bit OS version
DWORD Signature;
IMAGE_FILE_HEADER FileHeader;
IMAGE_OPTIONAL_HEADER64 OptionalHeader;
} IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;

typedef struct _IMAGE_NT_HEADERS { //32bit OS version
DWORD Signature;
IMAGE_FILE_HEADER FileHeader;
IMAGE_OPTIONAL_HEADER32 OptionalHeader;
} IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;

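As the members of IMAGE_NT_HEADERS show, the structure differs with bitness. We start with the 32-bit version, the PE32 format (the 64-bit version corresponds to PE32+).

_IMAGE_NT_HEADERS.Signature

This is the signature of the NT header. It is stored in the file as the bytes 50 45 00 00 ("PE" followed by two NUL bytes), i.e. the little-endian DWORD 0x00004550.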
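
As a small illustration, the signature check can be sketched in C as below. The function name has_pe_signature is hypothetical; it assumes the file is already in a memory buffer and that e_lfanew was read from the DOS header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 when the four bytes at offset e_lfanew are 50 45 00 00
   ("PE\0\0"), and 0 when they differ or the offset is out of range. */
int has_pe_signature(const uint8_t *file, size_t len, uint32_t e_lfanew) {
    static const uint8_t signature[4] = { 0x50, 0x45, 0x00, 0x00 };
    assert(file != NULL);
    if (len < sizeof signature || e_lfanew > len - sizeof signature) return 0;
    return memcmp(file + e_lfanew, signature, sizeof signature) == 0;
}
```

Comparing raw bytes instead of a DWORD sidesteps the byte-order question entirely.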

_IMAGE_NT_HEADERS.FileHeader

It corresponds to the structure _IMAGE_FILE_HEADER:

typedef struct _IMAGE_FILE_HEADER {
WORD Machine;
WORD NumberOfSections;
DWORD TimeDateStamp;
DWORD PointerToSymbolTable;
DWORD NumberOfSymbols;
WORD SizeOfOptionalHeader;
WORD Characteristics;
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;

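Four of these members are important; if any of them holds a wrong value, the file ==will not run properly==.

_IMAGE_FILE_HEADER.Machine: a value identifying the target CPU; each CPU type has its own number. In the example above, Machine = 0x014C, which corresponds to Intel i386.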

#define IMAGE_FILE_MACHINE_UNKNOWN           0
#define IMAGE_FILE_MACHINE_I386 0x014c // Intel 386.
#define IMAGE_FILE_MACHINE_R3000 0x0162 // MIPS little-endian, 0x160 big-endian
#define IMAGE_FILE_MACHINE_R4000 0x0166 // MIPS little-endian
#define IMAGE_FILE_MACHINE_R10000 0x0168 // MIPS little-endian
#define IMAGE_FILE_MACHINE_WCEMIPSV2 0x0169 // MIPS little-endian WCE v2
#define IMAGE_FILE_MACHINE_ALPHA 0x0184 // Alpha_AXP
#define IMAGE_FILE_MACHINE_POWERPC 0x01F0 // IBM PowerPC Little-Endian
#define IMAGE_FILE_MACHINE_SH3 0x01a2 // SH3 little-endian
#define IMAGE_FILE_MACHINE_SH3E 0x01a4 // SH3E little-endian
#define IMAGE_FILE_MACHINE_SH4 0x01a6 // SH4 little-endian
#define IMAGE_FILE_MACHINE_ARM 0x01c0 // ARM Little-Endian
#define IMAGE_FILE_MACHINE_THUMB 0x01c2
#define IMAGE_FILE_MACHINE_IA64 0x0200 // Intel 64
#define IMAGE_FILE_MACHINE_MIPS16 0x0266 // MIPS
#define IMAGE_FILE_MACHINE_MIPSFPU 0x0366 // MIPS
#define IMAGE_FILE_MACHINE_MIPSFPU16 0x0466 // MIPS
#define IMAGE_FILE_MACHINE_ALPHA64 0x0284 // ALPHA64
#define IMAGE_FILE_MACHINE_AXP64 IMAGE_FILE_MACHINE_ALPHA64

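_IMAGE_FILE_HEADER.NumberOfSections: the number of sections. It must be greater than 0 and must match the actual number of sections. In the example above, NumberOfSections = 0x0004, i.e. there are 4 sections.

_IMAGE_FILE_HEADER.SizeOfOptionalHeader: the size of OptionalHeader, the last member of the NT header. OptionalHeader is an IMAGE_OPTIONAL_HEADER structure whose size is already fixed by the bitness of the file, but the loader still reads this field to determine the size of OptionalHeader. For PE32 the value is 0x00E0.

_IMAGE_FILE_HEADER.Characteristics: identifies attributes of the file, for example whether it is executable or whether it is a DLL. Each attribute has a defined value that must not be changed arbitrarily. The most commonly seen ones are 0x0002 (executable file) and 0x2000 (DLL file). In the example above the value is 0x210E, a bitwise OR of several of the flags below.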

#define IMAGE_FILE_RELOCS_STRIPPED           0x0001  // Relocation info stripped from file.
#define IMAGE_FILE_EXECUTABLE_IMAGE 0x0002 // File is executable (i.e. no unresolved externel references).
#define IMAGE_FILE_LINE_NUMS_STRIPPED 0x0004 // Line nunbers stripped from file.
#define IMAGE_FILE_LOCAL_SYMS_STRIPPED 0x0008 // Local symbols stripped from file.
#define IMAGE_FILE_AGGRESIVE_WS_TRIM 0x0010 // Agressively trim working set
#define IMAGE_FILE_LARGE_ADDRESS_AWARE 0x0020 // App can handle >2gb addresses
#define IMAGE_FILE_BYTES_REVERSED_LO 0x0080 // Bytes of machine word are reversed.
#define IMAGE_FILE_32BIT_MACHINE 0x0100 // 32 bit word machine.
#define IMAGE_FILE_DEBUG_STRIPPED 0x0200 // Debugging info stripped from file in .DBG file
#define IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP 0x0400 // If Image is on removable media, copy and run from the swap file.
#define IMAGE_FILE_NET_RUN_FROM_SWAP 0x0800 // If Image is on Net, copy and run from the swap file.
#define IMAGE_FILE_SYSTEM 0x1000 // System File.
#define IMAGE_FILE_DLL 0x2000 // File is a DLL.
#define IMAGE_FILE_UP_SYSTEM_ONLY 0x4000 // File should only be run on a UP machine
#define IMAGE_FILE_BYTES_REVERSED_HI 0x8000 // Bytes of machine word are reversed.
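
Since Characteristics is a bit mask, a value can be taken apart flag by flag. The following C sketch does that for a subset of the flags listed above; the table FILE_FLAGS and the function describe_file_characteristics are hypothetical names chosen for this illustration. Applied to the example value 0x210E it reports 0x0002, 0x0004, 0x0008, 0x0100 and 0x2000; the last one is the DLL bit, so the value is worth re-checking against your own copy of the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct file_flag { uint16_t bit; const char *name; };

/* A subset of the IMAGE_FILE_* flags, without the common prefix. */
static const struct file_flag FILE_FLAGS[] = {
    { 0x0001, "RELOCS_STRIPPED" },
    { 0x0002, "EXECUTABLE_IMAGE" },
    { 0x0004, "LINE_NUMS_STRIPPED" },
    { 0x0008, "LOCAL_SYMS_STRIPPED" },
    { 0x0020, "LARGE_ADDRESS_AWARE" },
    { 0x0100, "32BIT_MACHINE" },
    { 0x0200, "DEBUG_STRIPPED" },
    { 0x1000, "SYSTEM" },
    { 0x2000, "DLL" },
};

/* Writes the names of the set flags into out, separated by '|', and
   returns how many flags from the table were recognised. */
int describe_file_characteristics(uint16_t value, char *out, size_t out_size) {
    int count = 0;
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < sizeof FILE_FLAGS / sizeof FILE_FLAGS[0]; i++) {
        if ((value & FILE_FLAGS[i].bit) == 0) continue;
        size_t used = strlen(out);
        size_t need = strlen(FILE_FLAGS[i].name) + (count ? 1 : 0);
        if (used + need + 1 > out_size) break;  /* output buffer is full */
        if (count) strcat(out, "|");
        strcat(out, FILE_FLAGS[i].name);
        count++;
    }
    return count;
}
```
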
_IMAGE_NT_HEADERS.OptionalHeader

This member is of type _IMAGE_OPTIONAL_HEADER. The structure has two versions, PE32 and PE32+; we start with the PE32 version.

typedef struct _IMAGE_OPTIONAL_HEADER {			//32bit OS version
//
// Standard fields.
//

WORD Magic;
BYTE MajorLinkerVersion;
BYTE MinorLinkerVersion;
DWORD SizeOfCode;
DWORD SizeOfInitializedData;
DWORD SizeOfUninitializedData;
DWORD AddressOfEntryPoint;
DWORD BaseOfCode;
DWORD BaseOfData;

//
// NT additional fields.
//

DWORD ImageBase;
DWORD SectionAlignment;
DWORD FileAlignment;
WORD MajorOperatingSystemVersion;
WORD MinorOperatingSystemVersion;
WORD MajorImageVersion;
WORD MinorImageVersion;
WORD MajorSubsystemVersion;
WORD MinorSubsystemVersion;
DWORD Win32VersionValue;
DWORD SizeOfImage;
DWORD SizeOfHeaders;
DWORD CheckSum;
WORD Subsystem;
WORD DllCharacteristics;
DWORD SizeOfStackReserve;
DWORD SizeOfStackCommit;
DWORD SizeOfHeapReserve;
DWORD SizeOfHeapCommit;
DWORD LoaderFlags;
DWORD NumberOfRvaAndSizes;
IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;

typedef struct _IMAGE_OPTIONAL_HEADER64 { //64bit OS version
WORD Magic;
BYTE MajorLinkerVersion;
BYTE MinorLinkerVersion;
DWORD SizeOfCode;
DWORD SizeOfInitializedData;
DWORD SizeOfUninitializedData;
DWORD AddressOfEntryPoint;
DWORD BaseOfCode;
ULONGLONG ImageBase;
DWORD SectionAlignment;
DWORD FileAlignment;
WORD MajorOperatingSystemVersion;
WORD MinorOperatingSystemVersion;
WORD MajorImageVersion;
WORD MinorImageVersion;
WORD MajorSubsystemVersion;
WORD MinorSubsystemVersion;
DWORD Win32VersionValue;
DWORD SizeOfImage;
DWORD SizeOfHeaders;
DWORD CheckSum;
WORD Subsystem;
WORD DllCharacteristics;
ULONGLONG SizeOfStackReserve;
ULONGLONG SizeOfStackCommit;
ULONGLONG SizeOfHeapReserve;
ULONGLONG SizeOfHeapCommit;
DWORD LoaderFlags;
DWORD NumberOfRvaAndSizes;
IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER64, *PIMAGE_OPTIONAL_HEADER64;

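IMAGE_OPTIONAL_HEADER.Magic: 0x10B for IMAGE_OPTIONAL_HEADER32 and 0x20B for IMAGE_OPTIONAL_HEADER64.

IMAGE_OPTIONAL_HEADER.AddressOfEntryPoint is very important: it holds the RVA of the first code to be executed.

IMAGE_OPTIONAL_HEADER.ImageBase: the preferred address at which the file is loaded (the address may already be occupied, in which case the file is loaded elsewhere).

… some less important members are omitted here

The last member of IMAGE_OPTIONAL_HEADER is especially important: DataDirectory, an array of IMAGE_DATA_DIRECTORY structures.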

typedef struct _IMAGE_DATA_DIRECTORY {
DWORD VirtualAddress;
DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

The array entries are listed below; locating each of these tables relies on the information stored here:

DataDirectory[0] = EXPORT Directory
DataDirectory[1] = IMPORT Directory
DataDirectory[2] = RESOURCE Directory
DataDirectory[3] = EXCEPTION Directory
DataDirectory[4] = SECURITY Directory
DataDirectory[5] = BASERELOC Directory
DataDirectory[6] = DEBUG Directory
DataDirectory[7] = COPYRIGHT Directory
DataDirectory[8] = GLOBALPTR Directory
DataDirectory[9] = TLS Directory
DataDirectory[A] = LOAD CONFIG Directory
DataDirectory[B] = BOUND IMPORT Directory
DataDirectory[C] = IAT Directory
DataDirectory[D] = DELAY IMPORT Directory
DataDirectory[E] = COM DESCRIPTOR Directory
DataDirectory[F] = Reserved Directory
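
As a rough sketch of how an entry of this array can be read from raw bytes, the C function below takes a buffer that starts at the optional header (its Magic field). The names data_dir and read_data_directory are hypothetical. The offsets 92 and 108 for NumberOfRvaAndSizes are derived by summing the member sizes of the PE32 and PE32+ structures shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t virtual_address;  /* RVA of the table */
    uint32_t size;             /* size of the table in bytes */
} data_dir;

static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* opt points at the first byte of the optional header.
   Returns 1 and fills *out on success, 0 on any inconsistency. */
int read_data_directory(const uint8_t *opt, size_t opt_len,
                        unsigned index, data_dir *out) {
    size_t count_off, entry_off;
    uint32_t count;
    assert(opt != NULL && out != NULL);
    if (opt_len < 2) return 0;
    switch (rd16(opt)) {
    case 0x10B: count_off = 92;  break;  /* PE32:  NumberOfRvaAndSizes */
    case 0x20B: count_off = 108; break;  /* PE32+: NumberOfRvaAndSizes */
    default:    return 0;
    }
    if (opt_len < count_off + 4) return 0;
    count = rd32(opt + count_off);
    if (index >= count) return 0;        /* entry not present */
    entry_off = count_off + 4 + (size_t)index * 8;
    if (opt_len < entry_off + 8) return 0;
    out->virtual_address = rd32(opt + entry_off);
    out->size = rd32(opt + entry_off + 4);
    return 1;
}
```

Index 1 would then give the IMPORT entry and index 0 the EXPORT entry.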

IMAGE_SECTION_HEADER

The corresponding structure is IMAGE_SECTION_HEADER:

typedef struct _IMAGE_SECTION_HEADER {
BYTE Name[IMAGE_SIZEOF_SHORT_NAME];
union {
DWORD PhysicalAddress;
DWORD VirtualSize; // size of the section in memory
} Misc;
DWORD VirtualAddress; // address of the section in memory (RVA)
DWORD SizeOfRawData; // size of the section in the file
DWORD PointerToRawData; // offset of the section in the file
DWORD PointerToRelocations;
DWORD PointerToLinenumbers;
WORD NumberOfRelocations;
WORD NumberOfLinenumbers;
DWORD Characteristics; // section attributes
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;

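A file on disk is laid out differently from its image in memory, but the relative position of data inside the same section does not change.

Viewing notepad.exe in HxD on my machine and taking the .text section as an example, the selected part is the .text section header:

| Member | Value | Meaning |
| --- | --- | --- |
| Name (NULL-terminated) | 2E 74 65 78 74 00 00 00 | Section name |
| VirtualSize | 0x00007748 (affected by the alignment fields later) | Size of the section in memory |
| VirtualAddress (RVA) | 0x00001000 | Offset of the section in memory |
| SizeOfRawData | 0x00007800 | Size of the section in the file on disk |
| PointerToRawData | 0x00000400 | Offset of the section in the file on disk |
| Characteristics | 0x60000020 | Section attributes |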

Characteristics is formed by OR-ing together the values of the attributes below:

// Section characteristics.
//
// IMAGE_SCN_TYPE_REG 0x00000000 // Reserved.
// IMAGE_SCN_TYPE_DSECT 0x00000001 // Reserved.
// IMAGE_SCN_TYPE_NOLOAD 0x00000002 // Reserved.
// IMAGE_SCN_TYPE_GROUP 0x00000004 // Reserved.
#define IMAGE_SCN_TYPE_NO_PAD 0x00000008 // Reserved.
// IMAGE_SCN_TYPE_COPY 0x00000010 // Reserved.

#define IMAGE_SCN_CNT_CODE 0x00000020 // Section contains code.
#define IMAGE_SCN_CNT_INITIALIZED_DATA 0x00000040 // Section contains initialized data.
#define IMAGE_SCN_CNT_UNINITIALIZED_DATA 0x00000080 // Section contains uninitialized data.

#define IMAGE_SCN_LNK_OTHER 0x00000100 // Reserved.
#define IMAGE_SCN_LNK_INFO 0x00000200 // Section contains comments or some other type of information.
// IMAGE_SCN_TYPE_OVER 0x00000400 // Reserved.
#define IMAGE_SCN_LNK_REMOVE 0x00000800 // Section contents will not become part of image.
#define IMAGE_SCN_LNK_COMDAT 0x00001000 // Section contents comdat.
// 0x00002000 // Reserved.
// IMAGE_SCN_MEM_PROTECTED - Obsolete 0x00004000
#define IMAGE_SCN_NO_DEFER_SPEC_EXC 0x00004000 // Reset speculative exceptions handling bits in the TLB entries for this section.
#define IMAGE_SCN_GPREL 0x00008000 // Section content can be accessed relative to GP
#define IMAGE_SCN_MEM_FARDATA 0x00008000
// IMAGE_SCN_MEM_SYSHEAP - Obsolete 0x00010000
#define IMAGE_SCN_MEM_PURGEABLE 0x00020000
#define IMAGE_SCN_MEM_16BIT 0x00020000
#define IMAGE_SCN_MEM_LOCKED 0x00040000
#define IMAGE_SCN_MEM_PRELOAD 0x00080000

#define IMAGE_SCN_ALIGN_1BYTES 0x00100000 //
#define IMAGE_SCN_ALIGN_2BYTES 0x00200000 //
#define IMAGE_SCN_ALIGN_4BYTES 0x00300000 //
#define IMAGE_SCN_ALIGN_8BYTES 0x00400000 //
#define IMAGE_SCN_ALIGN_16BYTES 0x00500000 // Default alignment if no others are specified.
#define IMAGE_SCN_ALIGN_32BYTES 0x00600000 //
#define IMAGE_SCN_ALIGN_64BYTES 0x00700000 //
#define IMAGE_SCN_ALIGN_128BYTES 0x00800000 //
#define IMAGE_SCN_ALIGN_256BYTES 0x00900000 //
#define IMAGE_SCN_ALIGN_512BYTES 0x00A00000 //
#define IMAGE_SCN_ALIGN_1024BYTES 0x00B00000 //
#define IMAGE_SCN_ALIGN_2048BYTES 0x00C00000 //
#define IMAGE_SCN_ALIGN_4096BYTES 0x00D00000 //
#define IMAGE_SCN_ALIGN_8192BYTES 0x00E00000 //
// Unused 0x00F00000

#define IMAGE_SCN_LNK_NRELOC_OVFL 0x01000000 // Section contains extended relocations.
#define IMAGE_SCN_MEM_DISCARDABLE 0x02000000 // Section can be discarded.
#define IMAGE_SCN_MEM_NOT_CACHED 0x04000000 // Section is not cachable.
#define IMAGE_SCN_MEM_NOT_PAGED 0x08000000 // Section is not pageable.
#define IMAGE_SCN_MEM_SHARED 0x10000000 // Section is shareable.
#define IMAGE_SCN_MEM_EXECUTE 0x20000000 // Section is executable.
#define IMAGE_SCN_MEM_READ 0x40000000 // Section is readable.
#define IMAGE_SCN_MEM_WRITE 0x80000000 // Section is writeable.

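In the example, the Characteristics value of the .text section is 0x60000020, i.e. IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ | IMAGE_SCN_CNT_CODE: the section is executable, readable, and contains code.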
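
The decomposition can be checked with a few lines of C. This is only a sketch: the shortened macro names and the functions section_permissions and section_contains_code are made up for the example, while the bit values are the ones from the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCN_CNT_CODE    0x00000020u  /* IMAGE_SCN_CNT_CODE */
#define SCN_MEM_EXECUTE 0x20000000u  /* IMAGE_SCN_MEM_EXECUTE */
#define SCN_MEM_READ    0x40000000u  /* IMAGE_SCN_MEM_READ */
#define SCN_MEM_WRITE   0x80000000u  /* IMAGE_SCN_MEM_WRITE */

/* Fills out with an "rwx"-style string, using '-' for a missing right. */
void section_permissions(uint32_t characteristics, char out[4]) {
    assert(out != NULL);
    out[0] = (characteristics & SCN_MEM_READ)    ? 'r' : '-';
    out[1] = (characteristics & SCN_MEM_WRITE)   ? 'w' : '-';
    out[2] = (characteristics & SCN_MEM_EXECUTE) ? 'x' : '-';
    out[3] = '\0';
}

/* Returns 1 when the section is marked as containing code. */
int section_contains_code(uint32_t characteristics) {
    return (characteristics & SCN_CNT_CODE) != 0;
}
```

For 0x60000020 the permission string comes out as "r-x", which matches a typical code section.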


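Miscellaneous
RAW <-> RVA conversion

Each field of an executable has its own offset within the PE file on disk; once the file is loaded into memory, the positions are different.

Formula: ==RAW - PointerToRawData = RVA - VirtualAddress==

The book has a good example (notepad.exe) that uses some of the fields covered above.

Q1: When RVA = 0x5000, what is the file offset?

RVA 0x5000 belongs to the .text section, because it falls inside the range defined by VirtualAddress (0x1000) and VirtualSize (0x7748) in the .text section header, i.e. 0x1000 <= 0x5000 < 0x8748. Using PointerToRawData (0x400) in the formula: RAW = 0x5000 - 0x1000 + 0x400 = 0x4400

So RVA 0x5000 corresponds to RAW 0x4400.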
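
The conversion can be written down as a short C function. This is a minimal sketch under the assumption that the section headers have already been parsed into a simplified structure; section_info and rva_to_raw are hypothetical names, not Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The four IMAGE_SECTION_HEADER fields needed for the conversion. */
typedef struct {
    uint32_t virtual_size;
    uint32_t virtual_address;
    uint32_t size_of_raw_data;
    uint32_t pointer_to_raw_data;
} section_info;

/* Finds the section whose [VirtualAddress, VirtualAddress + VirtualSize)
   range contains rva and applies
   RAW = RVA - VirtualAddress + PointerToRawData.
   Returns 1 on success, 0 when no section contains the RVA. */
int rva_to_raw(const section_info *sections, size_t count,
               uint32_t rva, uint32_t *raw) {
    assert(sections != NULL && raw != NULL);
    for (size_t i = 0; i < count; i++) {
        const section_info *s = &sections[i];
        if (rva >= s->virtual_address &&
            rva - s->virtual_address < s->virtual_size) {
            *raw = rva - s->virtual_address + s->pointer_to_raw_data;
            return 1;
        }
    }
    return 0;
}
```

With the .text values from this article (VirtualSize 0x7748, VirtualAddress 0x1000, PointerToRawData 0x400), RVA 0x5000 maps to RAW 0x4400. A stricter version might also reject offsets beyond SizeOfRawData, since such bytes exist only in memory.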


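IAT (Import Address Table)

About the IAT:

When a program calls one of its own functions, the RVA of that function is fixed. When it calls a function in a DLL, however, the DLL may be relocated when it is loaded, so the address of the DLL function can differ from run to run.

To call DLL functions reliably, a table is built that stores, for each run, the addresses of the DLL functions after the DLL has been loaded and relocated.

The program's calls to DLL functions then go through this table, much like dereferencing a pointer: the code reads the function's current address from the table entry and calls it.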
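
The indirection can be imitated in plain C with a table of function pointers. This is only an analogy, not how the Windows loader is implemented: import_table, dll_double, resolve_imports and call_imported are all invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*api_fn)(int);

/* One slot per imported function; empty until the "loader" fills it. */
static api_fn import_table[1];

/* Stands in for a function exported by a DLL. */
static int dll_double(int x) { return 2 * x; }

/* Stands in for the loader: once the DLL is mapped, store the function's
   actual address in the table. */
void resolve_imports(void) {
    import_table[0] = dll_double;
}

/* Program code never hard-codes the DLL function's address; it always
   calls through the table slot. */
int call_imported(int x) {
    assert(import_table[0] != NULL);
    return import_table[0](x);
}
```

Because every call goes through the slot, only the table has to be updated when the DLL ends up at a different address.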

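The related structure is IMAGE_IMPORT_DESCRIPTOR. It is located through DataDirectory, the last member of IMAGE_OPTIONAL_HEADER (the third and largest member of the NT header), which was introduced in the NT header part above: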

DataDirectory[0] = EXPORT Directory
DataDirectory[1] = IMPORT Directory // import descriptors, used to reach the IAT
DataDirectory[2] = RESOURCE Directory
DataDirectory[3] = EXCEPTION Directory
DataDirectory[4] = SECURITY Directory
DataDirectory[5] = BASERELOC Directory
DataDirectory[6] = DEBUG Directory
DataDirectory[7] = COPYRIGHT Directory
DataDirectory[8] = GLOBALPTR Directory
DataDirectory[9] = TLS Directory
DataDirectory[A] = LOAD CONFIG Directory
DataDirectory[B] = BOUND IMPORT Directory
DataDirectory[C] = IAT Directory
DataDirectory[D] = DELAY IMPORT Directory
DataDirectory[E] = COM DESCRIPTOR Directory
DataDirectory[F] = Reserved Directory

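The import information is found through ==DataDirectory[1]==. Its VirtualAddress is an RVA that points to an array in which each element is an IMAGE_IMPORT_DESCRIPTOR structure. The number of descriptors equals the number of libraries the file needs to import, and the array ends with a NULL (all-zero) structure.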

typedef struct _IMAGE_DATA_DIRECTORY {
DWORD VirtualAddress; // points to the _IMAGE_IMPORT_DESCRIPTOR array
DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
typedef struct _IMAGE_IMPORT_DESCRIPTOR {
union {
DWORD Characteristics; // 0 for terminating null import descriptor
DWORD OriginalFirstThunk; // RVA to original unbound IAT (PIMAGE_THUNK_DATA)
};
DWORD TimeDateStamp; // 0 if not bound,
// -1 if bound, and real date\time stamp
// in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
// O.W. date/time stamp of DLL bound to (Old BIND)

DWORD ForwarderChain; // -1 if no forwarders
DWORD Name; // RVA of the imported DLL's name string
DWORD FirstThunk; // RVA to IAT (if bound this IAT has actual addresses)
} IMAGE_IMPORT_DESCRIPTOR;
typedef IMAGE_IMPORT_DESCRIPTOR UNALIGNED *PIMAGE_IMPORT_DESCRIPTOR;

Taking notepad.exe as an example, the selected part is the OptionalHeader member of the NT header:

For comparison, here is the IMAGE_OPTIONAL_HEADER structure prototype again:

typedef struct _IMAGE_OPTIONAL_HEADER {
//
// Standard fields.
//

WORD Magic;
BYTE MajorLinkerVersion;
BYTE MinorLinkerVersion;
DWORD SizeOfCode;
DWORD SizeOfInitializedData;
DWORD SizeOfUninitializedData;
DWORD AddressOfEntryPoint;
DWORD BaseOfCode;
DWORD BaseOfData;

//
// NT additional fields.
//

DWORD ImageBase;
DWORD SectionAlignment;
DWORD FileAlignment;
WORD MajorOperatingSystemVersion;
WORD MinorOperatingSystemVersion;
WORD MajorImageVersion;
WORD MinorImageVersion;
WORD MajorSubsystemVersion;
WORD MinorSubsystemVersion;
DWORD Win32VersionValue;
DWORD SizeOfImage;
DWORD SizeOfHeaders;
DWORD CheckSum;
WORD Subsystem;
WORD DllCharacteristics;
DWORD SizeOfStackReserve;
DWORD SizeOfStackCommit;
DWORD SizeOfHeapReserve;
DWORD SizeOfHeapCommit;
DWORD LoaderFlags;
DWORD NumberOfRvaAndSizes;
IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;

The selected part is the IMAGE_DATA_DIRECTORY DataDirectory member:

DataDirectory can contain:

DataDirectory[0] = EXPORT Directory
DataDirectory[1] = IMPORT Directory // import descriptors, used to reach the IAT
DataDirectory[2] = RESOURCE Directory
DataDirectory[3] = EXCEPTION Directory
DataDirectory[4] = SECURITY Directory
DataDirectory[5] = BASERELOC Directory
DataDirectory[6] = DEBUG Directory
DataDirectory[7] = COPYRIGHT Directory
DataDirectory[8] = GLOBALPTR Directory
DataDirectory[9] = TLS Directory
DataDirectory[A] = LOAD CONFIG Directory
DataDirectory[B] = BOUND IMPORT Directory
DataDirectory[C] = IAT Directory
DataDirectory[D] = DELAY IMPORT Directory
DataDirectory[E] = COM DESCRIPTOR Directory
DataDirectory[F] = Reserved Directory

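The import information uses the second element (index 1).

That is, in the DataDirectory array the import entry has RVA = 0x00007604 and Size = 0x000000C8.

This is an address in memory (an RVA). To inspect the data in the file, first find out which section the import RVA falls in:

The .text section starts at 0x00001000 in memory with size 0x00007748, so the range from 0x00001000 up to 0x00008748 belongs to .text.

The import RVA 0x00007604 falls inside this range, so it belongs to the .text section.

To look at the import data, convert the import RVA to RAW using the VirtualAddress and PointerToRawData of the .text section:

RAW = 0x00007604 - 0x00001000 + 0x00000400 = 0x00006A04. Looking at that offset:

Each _IMAGE_IMPORT_DESCRIPTOR structure occupies 20 bytes, and the array ends with a NULL structure, which is how all IMAGE_IMPORT_DESCRIPTOR elements are delimited. The figure below shows the NULL IMAGE_IMPORT_DESCRIPTOR: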
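
Walking the descriptor array until the NULL entry can be sketched in C as follows. The function names are hypothetical, and the sketch assumes the terminator is a fully zeroed 20-byte structure, as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IMPORT_DESCRIPTOR_SIZE 20u  /* five DWORD members */

/* Returns 1 when all 20 bytes of the descriptor are zero. */
static int is_null_descriptor(const uint8_t *p) {
    for (size_t i = 0; i < IMPORT_DESCRIPTOR_SIZE; i++) {
        if (p[i] != 0) return 0;
    }
    return 1;
}

/* raw is the file offset of the first descriptor (the RAW value computed
   from DataDirectory[1].VirtualAddress). Returns the number of descriptors
   before the NULL terminator, or -1 when the buffer ends first. */
int count_import_descriptors(const uint8_t *file, size_t len, size_t raw) {
    int count = 0;
    assert(file != NULL);
    while (raw <= len && len - raw >= IMPORT_DESCRIPTOR_SIZE) {
        if (is_null_descriptor(file + raw)) return count;
        count++;
        raw += IMPORT_DESCRIPTOR_SIZE;
    }
    return -1;
}
```

In the notepad.exe example the starting offset would be 0x6A04, and the result would be the number of imported DLLs.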

Using the members of IMAGE_IMPORT_DESCRIPTOR, let's see what the first descriptor contains:

typedef struct _IMAGE_IMPORT_DESCRIPTOR {
union {
DWORD Characteristics; // 0 for terminating null import descriptor
DWORD OriginalFirstThunk; // RVA to original unbound IAT (PIMAGE_THUNK_DATA)
};
DWORD TimeDateStamp; // 0 if not bound,
// -1 if bound, and real date\time stamp
// in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
// O.W. date/time stamp of DLL bound to (Old BIND)

DWORD ForwarderChain; // -1 if no forwarders
DWORD Name;
DWORD FirstThunk; // RVA to IAT (if bound this IAT has actual addresses)
} IMAGE_IMPORT_DESCRIPTOR;
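typedef IMAGE_IMPORT_DESCRIPTOR UNALIGNED *PIMAGE_IMPORT_DESCRIPTOR;

| Member | Value | Note |
| --- | --- | --- |
| OriginalFirstThunk | 0x00007990 | RVA of the INT (Import Name Table) |
| TimeDateStamp | 0xFFFFFFFF | |
| ForwarderChain | 0xFFFFFFFF | |
| Name | 0x00007AAC | RVA of the library name string |
| FirstThunk | 0x000012C4 | RVA of the IAT |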

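First, convert the RVA in Name to RAW to locate the name of the imported DLL. Then convert the RVA in OriginalFirstThunk (the INT) to RAW to reach an array of DWORDs; each DWORD is itself an RVA, and converting it to RAW leads to the name of an imported function.

OriginalFirstThunk: 0x00007990 - 0x00001000 + 0x00000400 = 0x00006D90:

Following the RVA 0x00007A7A gives the name of the imported API:

==API name==: 0x00007A7A - 0x00001000 + 0x00000400 = 0x00006E7A

The leading WORD 0x000F is the Hint field (an ordinal hint) that precedes the name, and can be skipped here. The imported API name is PageSetupDlgW.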
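
The layout at that offset is a 2-byte Hint followed by a NUL-terminated ASCII name. A minimal C sketch for reading it from a file buffer is shown below; import_by_name is a hypothetical helper name, and the offset passed in is assumed to be a RAW value computed as above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads the Hint (little-endian WORD) at file offset raw and returns a
   pointer to the NUL-terminated name that follows it. Returns NULL when
   the entry does not fit in the buffer or the name is not terminated. */
const char *import_by_name(const uint8_t *file, size_t len, size_t raw,
                           uint16_t *hint) {
    const uint8_t *name;
    assert(file != NULL && hint != NULL);
    if (raw > len || len - raw < 3) return NULL;  /* hint + at least a NUL */
    name = file + raw + 2;
    if (memchr(name, 0, len - raw - 2) == NULL) return NULL;
    *hint = (uint16_t)(file[raw] | (file[raw + 1] << 8));
    return (const char *)name;
}
```

At RAW 0x6E7A in the example this would be expected to return the hint 0x000F and the string PageSetupDlgW.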

Name: 0x00007AAC - 0x00001000 + 0x00000400 = 0x00006EAC:

==Imported DLL==: comdlg32.dll


EAT (Export Address Table)

**Structure:** IMAGE_EXPORT_DIRECTORY

typedef struct _IMAGE_EXPORT_DIRECTORY {
DWORD Characteristics;
DWORD TimeDateStamp;
WORD MajorVersion;
WORD MinorVersion;
DWORD Name;
DWORD Base;
DWORD NumberOfFunctions;
DWORD NumberOfNames;
DWORD AddressOfFunctions; // RVA from base of image
DWORD AddressOfNames; // RVA from base of image
DWORD AddressOfNameOrdinals; // RVA from base of image
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;

In the PE file it is again located through the DataDirectory array, the last member of the NT header's OptionalHeader:

DataDirectory[0] = EXPORT Directory // EAT
DataDirectory[1] = IMPORT Directory
DataDirectory[2] = RESOURCE Directory
DataDirectory[3] = EXCEPTION Directory
DataDirectory[4] = SECURITY Directory
DataDirectory[5] = BASERELOC Directory
DataDirectory[6] = DEBUG Directory
DataDirectory[7] = COPYRIGHT Directory
DataDirectory[8] = GLOBALPTR Directory
DataDirectory[9] = TLS Directory
DataDirectory[A] = LOAD CONFIG Directory
DataDirectory[B] = BOUND IMPORT Directory
DataDirectory[C] = IAT Directory
DataDirectory[D] = DELAY IMPORT Directory
DataDirectory[E] = COM DESCRIPTOR Directory
DataDirectory[F] = Reserved Directory

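It is the entry at index 0, and it is looked up in much the same way as the import entry.

Export information of kernel32.dll:

That is, in the DataDirectory array the export entry has RVA = 0x0000262C and Size = 0x00006CFD.

So RAW = 0x0000262C - 0x00001000 + 0x00000400 = 0x00001A2C.

The IMAGE_EXPORT_DIRECTORY structure (a PE file has only one):

Its most important members: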

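| Member | Value (RVA) | RAW | Note |
| --- | --- | --- | --- |
| Name | 0x00004B8E | 0x00003F8E | Library name: KERNEL32.dll |
| NumberOfFunctions | 0x000003B9 | - | Number of exported APIs |
| NumberOfNames | 0x000003B9 | - | |
| AddressOfFunctions | 0x00002654 | 0x00001A54 | Array of exported function addresses |
| AddressOfNames | 0x00003538 | 0x00002938 | Array of API name addresses; same number of elements as exported APIs |

Author: Victory+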
Link: https://cvjark.github.io/2022/04/26/逆核系列No.1--PE格式/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.