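April 28, 2009
Another look at executing ring0 code on NT without a driver
(Rambling on without worrying about whether it is right or wrong... a bit long-winded... can't be helped.)
Author:icelord@ustb@05/11/19
(This is only a diary entry...)
In my second year of university I read the "executing ring0 code without a driver" articles online and was completely lost. Today I picked them up again, read them a few times, and got something out of it, so I am writing my thoughts down...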
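On the x86 platform the CPU can run at four privilege levels, ring0 through ring3. Ring0 has the highest privilege and can execute any instruction without restriction; ring3 is constrained by the CPU's protection mechanism and can only execute non-privileged instructions. Executing a privileged instruction such as lgdt or lidt in ring3 raises a general protection exception (that is, an error occurs and control jumps to the corresponding fault handler).
On NT, ordinary applications run in ring3 and the operating system runs in ring0. If a program needs to execute privileged instructions, it has to switch to ring0. Because privileged instructions executed by a user program could damage system resources, the operating system, for protection and stability, offers the necessary services to user-mode programs through the "gate" mechanism. x86 has four kinds of gates: interrupt gates, trap gates, call gates, and task gates.
First, the concept of a gate: as I see it, a gate is a channel between two different states, in other words a channel for switching between rings (privilege levels).
Typically, during operating system initialization, once the CPU has moved from real mode into protected mode (pmode) it is running in ring0. After the various initialization steps finish, the system drops to user mode and creates user processes. When a user program needs a privileged service, it enters ring0 through a system call, and the system carries out the request, which must fall within the set of services it offers. This keeps malicious programs from damaging the system. But what if we want to perform some "special" operation?
There are a few ways to solve this. The first is to install a driver. A driver serves user programs as an auxiliary kernel module; like the operating system it runs in ring0, and it can be written to suit your own needs. So ring0 code can be executed by writing a "custom" driver; this should work on both NT and Linux (sorry, I have not tried it myself, being busy "studying hard"; maybe when I have time). The second approach is specific to NT: executing ring0 code through a "custom gate".
Because of how NT is designed, a user can manipulate physical memory through the Section object \Device\PhysicalMemory. This makes it possible to locate the Global Descriptor Table (GDT) or the Interrupt Descriptor Table (IDT) and modify its contents to construct your own "gate", or even to overwrite kernel code with your own code... all as ways to get ring0 code executed. That was a bit muddled... some background follows as a basis for understanding the rest.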


There are also a few registers, which I won't ramble about here... The corresponding structures are given below:
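/* Interrupt descriptor (IDT entry) */
typedef struct IDT_ITEM__
{
unsigned short offset_low ; // bits 0~15 of the offset
unsigned short seg_selector ; // segment selector
unsigned char reserved1 ; // unused, must be all zeros
unsigned char saved_1_1_0 : 3 ; // low three type bits: 110 = interrupt gate, 111 = trap gate
unsigned char d : 1 ; // D bit: 1 = 32-bit gate (full type 14 = interrupt gate, 15 = trap gate)
unsigned char reserved2 : 1 ; // reserved, must be 0
unsigned char dpl : 2 ; // privilege level; 3 so that ring3 can use the gate
unsigned char p : 1 ; // P bit: present/valid
unsigned short offset_high ; // bits 16~31 of the offset
}IDT_ITEM ;
/* Call gate */
typedef struct InvokeGate__
{
unsigned short Offset_0_15 ;/* bits 0~15 of the offset */
unsigned short SegSelector ;/* segment selector */
unsigned char reserved1; /* low 5 bits are the parameter count; 0 here */
unsigned char Type : 4 ; /* type field; must be 1100 (0xC) for a call gate */
unsigned char DT_0 : 1 ; /* must be 0 to mark this as a system descriptor */
unsigned char DPL : 2 ; /* privilege level; 3 so that ring3 can use the gate */
unsigned char P : 1 ; /* present bit */
unsigned short Offset_16_31 ; /* bits 16~31 of the offset */
}INT_GATE;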
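Bit-field layout is compiler-dependent, so a descriptor can also be packed with plain shifts and masks. The following is a minimal sketch, not code from the original post: the names `gate_encode`, `GATE_CALL32`, `GATE_INT32` and `GATE_TRAP32` are made up for illustration, and the bit positions follow the field order of the two structures above.

```c
#include <assert.h>
#include <stdint.h>

/* Gate type codes for 32-bit gates, as listed in the structure comments. */
enum { GATE_CALL32 = 0xC, GATE_INT32 = 0xE, GATE_TRAP32 = 0xF };

/* Pack a gate descriptor into its 8-byte little-endian value.
 * bits  0..15  offset 15..0
 * bits 16..31  segment selector
 * bits 32..39  reserved / parameter count (left as 0 here)
 * bits 40..43  type, bit 44 = 0 (system descriptor)
 * bits 45..46  DPL, bit 47 = P
 * bits 48..63  offset 31..16 */
uint64_t gate_encode(uint32_t offset, uint16_t selector,
                     unsigned type, unsigned dpl, unsigned present)
{
    assert(type <= 0xF && dpl <= 3 && present <= 1);
    uint64_t d = 0;
    d |= (uint64_t)(offset & 0xFFFFu);
    d |= (uint64_t)selector << 16;
    d |= (uint64_t)type << 40;
    d |= (uint64_t)dpl << 45;
    d |= (uint64_t)present << 47;
    d |= (uint64_t)(offset >> 16) << 48;
    return d;
}
```

Writing the resulting value to memory in little-endian byte order gives the same 8 bytes that the bit-field structures describe, provided the compiler packs the bit-fields from the low bit upward.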
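At this point the problem still doesn't look solved...
Background:
1) Windows NT maps the first 512 MB of physical memory contiguously to the virtual range 0x80000000 to 0xA0000000 (ignoring for now machines with more or less than 512 MB). // For the moment this applies only to winnt 5.0 and 5.1
2) ZwMapViewOfSection() can map physical memory into the address space of the current process and returns the mapped virtual address
3) sidt and sgdt are non-privileged instructions and can be executed in user mode
4) Windows NT uses a flat memory model, i.e. segment bases are 0
With that, the problem is essentially solved; this is the basis of the "executing ring0 code without a driver" technique that circulates online.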
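Point 1) amounts to a fixed offset between a kernel virtual address and its physical address. A minimal sketch, assuming exactly the mapping stated above (it only holds for the NT 5.0/5.1 case the text mentions); the names `kernel_va_to_pa`, `KSEG0_BASE` and `KSEG0_END` are illustrative, not taken from the original:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KSEG0_BASE 0x80000000u  /* start of the linear window (from the text) */
#define KSEG0_END  0xA0000000u  /* end of the 512 MB window (from the text)   */

/* Translate a kernel virtual address inside the window to a physical address.
 * Returns false when the address lies outside the linearly mapped range. */
bool kernel_va_to_pa(uint32_t va, uint32_t *pa)
{
    assert(pa != NULL);
    if (va < KSEG0_BASE || va >= KSEG0_END)
        return false;
    *pa = va - KSEG0_BASE;
    return true;
}
```

With the base address returned by sidt or sgdt as input, the result is the offset one would pass when mapping a view of the physical memory section.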
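Background:
To "use" a gate from ring3, the gate's descriptor privilege level must be 3; only then can a user-mode program use the gate.
That is, the DPL field in the structures above must be 3.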
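The DPL requirement can be written as a small predicate. This is a sketch of the x86 privilege check as I understand it, with made-up function names: for a software `int n` the CPU compares CPL with the gate's DPL, and for a call gate it compares the weaker of CPL and the selector's RPL.

```c
#include <assert.h>
#include <stdbool.h>

/* Software interrupt (int n): allowed when CPL <= gate DPL. */
bool int_gate_accessible(unsigned cpl, unsigned gate_dpl)
{
    assert(cpl <= 3 && gate_dpl <= 3);
    return cpl <= gate_dpl;
}

/* Call gate: allowed when max(CPL, RPL) <= gate DPL. */
bool call_gate_accessible(unsigned cpl, unsigned rpl, unsigned gate_dpl)
{
    assert(cpl <= 3 && rpl <= 3 && gate_dpl <= 3);
    unsigned effective = cpl > rpl ? cpl : rpl;
    return effective <= gate_dpl;
}
```

For ring3 code (CPL = 3) both predicates are true only when the gate's DPL is 3, which is the condition stated above.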
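Constructing an interrupt gate:
Point the interrupt descriptor at your interrupt handler, which contains your ring0 code. Parameters can be passed in the registers eax/ebx/ecx.../esi/edi, much like a Linux system call.
To invoke it:
mov eax, param1
mov ebx, param2
...
int <interrupt vector number>
Constructing a trap gate is similar to constructing an interrupt gate.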
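Constructing a call gate:
Point the descriptor at the function you want to run in ring0
word sel[3];
sel[0] = sel[1] = 0;
sel[2] = <call gate selector>;
_asm call fword ptr [sel] // call through the call gate and enter ring0
// I have not tried this
// presumably ushort[3] corresponds to
//struct
//{
// ULONG offset;
// USHORT selector;
//};
I'm confused here... this needs more study...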
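The guess above matches the x86 operand format: the 6-byte operand of a far call holds a 32-bit offset followed by a 16-bit selector, and for a call gate the offset part is ignored because the target comes from the gate itself. The sketch below only builds those bytes; the function names and the GDT index 0x4B in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* selector = index << 3 | TI << 2 | RPL  (TI = 0 selects the GDT) */
uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    assert(index < 8192 && ti <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Fill the 6-byte far pointer used by "call fword ptr":
 * bytes 0..3 = offset (little-endian), bytes 4..5 = selector. */
void make_far_pointer(uint8_t out[6], uint32_t offset, uint16_t selector)
{
    out[0] = (uint8_t)(offset);
    out[1] = (uint8_t)(offset >> 8);
    out[2] = (uint8_t)(offset >> 16);
    out[3] = (uint8_t)(offset >> 24);
    out[4] = (uint8_t)(selector);
    out[5] = (uint8_t)(selector >> 8);
}
```

For example, a gate placed in GDT slot 0x4B and called from ring3 would use `make_selector(0x4B, 0, 3)`, which is 0x25B; this lands in the same position as `sel[2]` in the three-word array above.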
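By now you should have a basic idea of how "executing ring0 code without a driver" works. The flowchart is as follows:
[Figure: flowchart]
At this point the problem should be quite simple. I won't paste the code; it is very old stuff...
This is only an update of my own understanding of these things. There are many mistakes above; I'll leave it here for now and revise it when I have time. So lazy...
If you need code, search Baidu for "无驱执行ring0代码" (executing ring0 code without a driver) and you will find it...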
1. Basics of the IDT
In WinDbg, we can dump the IDT using
lkd> !idt -a
Dumping IDT:
00: 80542550 nt!KiTrap00
01: 805426cc nt!KiTrap01
02: Task Selector = 0x0058
03: 80542ae0 nt!KiTrap03
04: 80542c60 nt!KiTrap04
05: 80542dc0 nt!KiTrap05
06: 80542f34 nt!KiTrap06
07: 805435ac nt!KiTrap07
08: Task Selector = 0x0050
09: 805439b0 nt!KiTrap09
0a: 80543ad0 nt!KiTrap0A
0b: 80543c10 nt!KiTrap0B
0c: 80543e70 nt!KiTrap0C
0d: 8054415c nt!KiTrap0D
0e: 80544858 nt!KiTrap0E
0f: 80544b90 nt!KiTrap0F
10: 80544cb0 nt!KiTrap10
11: 80544dec nt!KiTrap11
12: Task Selector = 0x00A0
13: 80544f54 nt!KiTrap13
14: 80544b90 nt!KiTrap0F
15: 80544b90 nt!KiTrap0F
16: 80544b90 nt!KiTrap0F
17: 80544b90 nt!KiTrap0F
18: 80544b90 nt!KiTrap0F
19: 80544b90 nt!KiTrap0F
1a: 80544b90 nt!KiTrap0F
1b: 80544b90 nt!KiTrap0F
1c: 80544b90 nt!KiTrap0F
1d: 80544b90 nt!KiTrap0F
1e: 80544b90 nt!KiTrap0F
1f: 806e510c
20: 00000000
21: 00000000
22: 00000000
23: 00000000
24: 00000000
25: 00000000
26: 00000000
27: 00000000
28: 00000000
29: 00000000
2a: 80541d7e nt!KiGetTickCount
2b: 80541e80 nt!KiCallbackReturn
2c: 80542030 nt!KiSetLowWaitHighThread
2d: bacbcdc4
2e: 80541801 nt!KiSystemService
2f: 80544b90 nt!KiTrap0F
30: 80540ec0 nt!KiUnexpectedInterrupt0
31: 80540eca nt!KiUnexpectedInterrupt1
32: 80540ed4 nt!KiUnexpectedInterrupt2
33: 80540ede nt!KiUnexpectedInterrupt3
34: 80540ee8 nt!KiUnexpectedInterrupt4
35: 80540ef2 nt!KiUnexpectedInterrupt5
36: 80540efc nt!KiUnexpectedInterrupt6
37: 806e4864
38: 80540f10 nt!KiUnexpectedInterrupt8
39: 80540f1a nt!KiUnexpectedInterrupt9
3a: 80540f24 nt!KiUnexpectedInterrupt10
3b: 80540f2e nt!KiUnexpectedInterrupt11
3c: 80540f38 nt!KiUnexpectedInterrupt12
3d: 806e5e2c
3e: 80540f4c nt!KiUnexpectedInterrupt14
3f: 80540f56 nt!KiUnexpectedInterrupt15
40: 80540f60 nt!KiUnexpectedInterrupt16
41: 806e5c88
42: 80540f74 nt!KiUnexpectedInterrupt18
43: 80540f7e nt!KiUnexpectedInterrupt19
44: 80540f88 nt!KiUnexpectedInterrupt20
45: 80540f92 nt!KiUnexpectedInterrupt21
46: 80540f9c nt!KiUnexpectedInterrupt22
47: 80540fa6 nt!KiUnexpectedInterrupt23
48: 80540fb0 nt!KiUnexpectedInterrupt24
49: 80540fba nt!KiUnexpectedInterrupt25
4a: 80540fc4 nt!KiUnexpectedInterrupt26
4b: 80540fce nt!KiUnexpectedInterrupt27
4c: 80540fd8 nt!KiUnexpectedInterrupt28
4d: 80540fe2 nt!KiUnexpectedInterrupt29
4e: 80540fec nt!KiUnexpectedInterrupt30
4f: 80540ff6 nt!KiUnexpectedInterrupt31
50: 806e493c
51: 8054100a nt!KiUnexpectedInterrupt33
52: 80541014 nt!KiUnexpectedInterrupt34
53: 8054101e nt!KiUnexpectedInterrupt35
54: 80541028 nt!KiUnexpectedInterrupt36
55: 80541032 nt!KiUnexpectedInterrupt37
56: 8054103c nt!KiUnexpectedInterrupt38
57: 80541046 nt!KiUnexpectedInterrupt39
58: 80541050 nt!KiUnexpectedInterrupt40
59: 8054105a nt!KiUnexpectedInterrupt41
5a: 80541064 nt!KiUnexpectedInterrupt42
5b: 8054106e nt!KiUnexpectedInterrupt43
5c: 80541078 nt!KiUnexpectedInterrupt44
5d: 80541082 nt!KiUnexpectedInterrupt45
5e: 8054108c nt!KiUnexpectedInterrupt46
5f: 80541096 nt!KiUnexpectedInterrupt47
60: 805410a0 nt!KiUnexpectedInterrupt48
61: 805410aa nt!KiUnexpectedInterrupt49
62: 805410b4 nt!KiUnexpectedInterrupt50
63: 8a0004ec b9f3dbca (KINTERRUPT 8a0004b0)
b9f00bd8 (KINTERRUPT 89f84bb0)
64: 805410c8 nt!KiUnexpectedInterrupt52
65: 805410d2 nt!KiUnexpectedInterrupt53
66: 805410dc nt!KiUnexpectedInterrupt54
67: 805410e6 nt!KiUnexpectedInterrupt55
68: 805410f0 nt!KiUnexpectedInterrupt56
69: 805410fa nt!KiUnexpectedInterrupt57
6a: 80541104 nt!KiUnexpectedInterrupt58
6b: 8054110e nt!KiUnexpectedInterrupt59
6c: 80541118 nt!KiUnexpectedInterrupt60
6d: 80541122 nt!KiUnexpectedInterrupt61
6e: 8054112c nt!KiUnexpectedInterrupt62
6f: 80541136 nt!KiUnexpectedInterrupt63
70: 80541140 nt!KiUnexpectedInterrupt64
71: 8054114a nt!KiUnexpectedInterrupt65
72: 80541154 nt!KiUnexpectedInterrupt66
73: 8a157b1c ba517e80 (KINTERRUPT 8a157ae0)
b9f00bd8 (KINTERRUPT 8a007bb0)
b9c75b78 (KINTERRUPT 89f86bb0)
74: 80541168 nt!KiUnexpectedInterrupt68
75: 80541172 nt!KiUnexpectedInterrupt69
76: 8054117c nt!KiUnexpectedInterrupt70
77: 80541186 nt!KiUnexpectedInterrupt71
78: 80541190 nt!KiUnexpectedInterrupt72
79: 8054119a nt!KiUnexpectedInterrupt73
7a: 805411a4 nt!KiUnexpectedInterrupt74
7b: 805411ae nt!KiUnexpectedInterrupt75
7c: 805411b8 nt!KiUnexpectedInterrupt76
7d: 805411c2 nt!KiUnexpectedInterrupt77
7e: 805411cc nt!KiUnexpectedInterrupt78
7f: 805411d6 nt!KiUnexpectedInterrupt79
80: 805411e0 nt!KiUnexpectedInterrupt80
81: 805411ea nt!KiUnexpectedInterrupt81
82: 805411f4 nt!KiUnexpectedInterrupt82
83: 8a5c79ec ba6d8da8 (KINTERRUPT 8a5c79b0)
b9f3dbca (KINTERRUPT 89f87bb0)
84: 80541208 nt!KiUnexpectedInterrupt84
85: 80541212 nt!KiUnexpectedInterrupt85
86: 8054121c nt!KiUnexpectedInterrupt86
87: 80541226 nt!KiUnexpectedInterrupt87
88: 80541230 nt!KiUnexpectedInterrupt88
89: 8054123a nt!KiUnexpectedInterrupt89
8a: 80541244 nt!KiUnexpectedInterrupt90
8b: 8054124e nt!KiUnexpectedInterrupt91
8c: 80541258 nt!KiUnexpectedInterrupt92
8d: 80541262 nt!KiUnexpectedInterrupt93
8e: 8054126c nt!KiUnexpectedInterrupt94
8f: 80541276 nt!KiUnexpectedInterrupt95
90: 80541280 nt!KiUnexpectedInterrupt96
91: 8054128a nt!KiUnexpectedInterrupt97
92: 89d42bec baa88a30 (KINTERRUPT 89d42bb0)
93: 89fffbec baa98495 (KINTERRUPT 89fffbb0)
94: 805412a8 nt!KiUnexpectedInterrupt100
95: 805412b2 nt!KiUnexpectedInterrupt101
96: 805412bc nt!KiUnexpectedInterrupt102
97: 805412c6 nt!KiUnexpectedInterrupt103
98: 805412d0 nt!KiUnexpectedInterrupt104
99: 805412da nt!KiUnexpectedInterrupt105
9a: 805412e4 nt!KiUnexpectedInterrupt106
9b: 805412ee nt!KiUnexpectedInterrupt107
9c: 805412f8 nt!KiUnexpectedInterrupt108
9d: 80541302 nt!KiUnexpectedInterrupt109
9e: 8054130c nt!KiUnexpectedInterrupt110
9f: 80541316 nt!KiUnexpectedInterrupt111
a0: 80541320 nt!KiUnexpectedInterrupt112
a1: 8054132a nt!KiUnexpectedInterrupt113
a2: 80541334 nt!KiUnexpectedInterrupt114
a3: 89f9684c baa9fd80 (KINTERRUPT 89f96810)
a4: 80541348 nt!KiUnexpectedInterrupt116
a5: 80541352 nt!KiUnexpectedInterrupt117
a6: 8054135c nt!KiUnexpectedInterrupt118
a7: 80541366 nt!KiUnexpectedInterrupt119
a8: 80541370 nt!KiUnexpectedInterrupt120
a9: 8054137a nt!KiUnexpectedInterrupt121
aa: 80541384 nt!KiUnexpectedInterrupt122
ab: 8054138e nt!KiUnexpectedInterrupt123
ac: 80541398 nt!KiUnexpectedInterrupt124
ad: 805413a2 nt!KiUnexpectedInterrupt125
ae: 805413ac nt!KiUnexpectedInterrupt126
af: 805413b6 nt!KiUnexpectedInterrupt127
b0: 805413c0 nt!KiUnexpectedInterrupt128
b1: 8a54b3e4 ba78431e (KINTERRUPT 8a54b3a8)
b2: 89d423fc baa88a30 (KINTERRUPT 89d423c0)
b3: 805413de nt!KiUnexpectedInterrupt131
b4: 805413e8 nt!KiUnexpectedInterrupt132
b5: 805413f2 nt!KiUnexpectedInterrupt133
b6: 805413fc nt!KiUnexpectedInterrupt134
b7: 80541406 nt!KiUnexpectedInterrupt135
b8: 80541410 nt!KiUnexpectedInterrupt136
b9: 8054141a nt!KiUnexpectedInterrupt137
ba: 80541424 nt!KiUnexpectedInterrupt138
bb: 8054142e nt!KiUnexpectedInterrupt139
bc: 80541438 nt!KiUnexpectedInterrupt140
bd: 80541442 nt!KiUnexpectedInterrupt141
be: 8054144c nt!KiUnexpectedInterrupt142
bf: 80541456 nt!KiUnexpectedInterrupt143
c0: 80541460 nt!KiUnexpectedInterrupt144
c1: 806e4ac0
c2: 80541474 nt!KiUnexpectedInterrupt146
c3: 8054147e nt!KiUnexpectedInterrupt147
c4: 80541488 nt!KiUnexpectedInterrupt148
c5: 80541492 nt!KiUnexpectedInterrupt149
c6: 8054149c nt!KiUnexpectedInterrupt150
c7: 805414a6 nt!KiUnexpectedInterrupt151
c8: 805414b0 nt!KiUnexpectedInterrupt152
c9: 805414ba nt!KiUnexpectedInterrupt153
ca: 805414c4 nt!KiUnexpectedInterrupt154
cb: 805414ce nt!KiUnexpectedInterrupt155
cc: 805414d8 nt!KiUnexpectedInterrupt156
cd: 805414e2 nt!KiUnexpectedInterrupt157
ce: 805414ec nt!KiUnexpectedInterrupt158
cf: 805414f6 nt!KiUnexpectedInterrupt159
d0: 80541500 nt!KiUnexpectedInterrupt160
d1: 806e3e54
d2: 80541514 nt!KiUnexpectedInterrupt162
d3: 8054151e nt!KiUnexpectedInterrupt163
d4: 80541528 nt!KiUnexpectedInterrupt164
d5: 80541532 nt!KiUnexpectedInterrupt165
d6: 8054153c nt!KiUnexpectedInterrupt166
d7: 80541546 nt!KiUnexpectedInterrupt167
d8: 80541550 nt!KiUnexpectedInterrupt168
d9: 8054155a nt!KiUnexpectedInterrupt169
da: 80541564 nt!KiUnexpectedInterrupt170
db: 8054156e nt!KiUnexpectedInterrupt171
dc: 80541578 nt!KiUnexpectedInterrupt172
dd: 80541582 nt!KiUnexpectedInterrupt173
de: 8054158c nt!KiUnexpectedInterrupt174
df: 80541596 nt!KiUnexpectedInterrupt175
e0: 805415a0 nt!KiUnexpectedInterrupt176
e1: 806e5048
e2: 805415b4 nt!KiUnexpectedInterrupt178
e3: 806e4dac
e4: 805415c8 nt!KiUnexpectedInterrupt180
e5: 805415d2 nt!KiUnexpectedInterrupt181
e6: 805415dc nt!KiUnexpectedInterrupt182
e7: 805415e6 nt!KiUnexpectedInterrupt183
e8: 805415f0 nt!KiUnexpectedInterrupt184
e9: 805415fa nt!KiUnexpectedInterrupt185
ea: 80541604 nt!KiUnexpectedInterrupt186
eb: 8054160e nt!KiUnexpectedInterrupt187
ec: 80541618 nt!KiUnexpectedInterrupt188
ed: 80541622 nt!KiUnexpectedInterrupt189
ee: 80541629 nt!KiUnexpectedInterrupt190
ef: 80541630 nt!KiUnexpectedInterrupt191
f0: 80541637 nt!KiUnexpectedInterrupt192
f1: 8054163e nt!KiUnexpectedInterrupt193
f2: 80541645 nt!KiUnexpectedInterrupt194
f3: 8054164c nt!KiUnexpectedInterrupt195
f4: 80541653 nt!KiUnexpectedInterrupt196
f5: 8054165a nt!KiUnexpectedInterrupt197
f6: 80541661 nt!KiUnexpectedInterrupt198
f7: 80541668 nt!KiUnexpectedInterrupt199
f8: 8054166f nt!KiUnexpectedInterrupt200
f9: 80541676 nt!KiUnexpectedInterrupt201
fa: 8054167d nt!KiUnexpectedInterrupt202
fb: 80541684 nt!KiUnexpectedInterrupt203
fc: 8054168b nt!KiUnexpectedInterrupt204
fd: 806e55a8
fe: 806e5748
ff: 805416a0 nt!KiUnexpectedInterrupt207
This listing is produced by WinDbg's own processing. How can we examine the memory that backs the IDT ourselves?
Before analyzing the memory, we need to know about the IDTR register. It is 48 bits wide and stores the IDT base address and the IDT limit.
Here it is:
IDTR register structure
------------------------------------------------------------
47             32 31             16 | 15                   0
------------------------------------|-----------------------
          IDT Base Address          |      IDT Limit
   HiIDTBase     |   LowIDTBase     |
------------------------------------|-----------------------
We can read the IDTR with the SIDT instruction. The IDT starts at the IDTR base address and here has 0x100 entries. These entries are called descriptors, and we can draw the layout of an IDT descriptor as well.
There are 3 kinds of IDT descriptors: interrupt gates, trap gates and task gates.
Here it is:
Interrupt Gate
-----------------------------------------------------------------------
31                  16 | 15 | 14 13 | 12 | 11    8 | 7               0
-----------------------|----|-------|----|---------|-------------------
Offset 31...16         | P  |  DPL  | 0  |  type   | reserved
-----------------------------------------------------------------------
31                  16 | 15                                          0
-----------------------|-----------------------------------------------
Segment Selector       | Offset 15...0
-----------------------------------------------------------------------
On my computer, reading the IDTR with the SIDT instruction gives:
IDT Base Address: 0x8003f400
IDT Limit       : 0x7ff
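SIDT stores 6 bytes: the 16-bit limit first, then the 32-bit base. The following sketch decodes such a buffer and derives the entry count and the address of a given vector's descriptor. The type and function names are illustrative; the input bytes in the usage note are simply the base and limit quoted above written out in little-endian order.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t limit;  /* size of the table in bytes, minus 1 */
    uint32_t base;   /* linear address of the first descriptor */
} idtr_t;

/* Decode the 6 bytes written by SIDT (limit, then base, little-endian). */
idtr_t idtr_decode(const uint8_t raw[6])
{
    idtr_t r;
    r.limit = (uint16_t)(raw[0] | (raw[1] << 8));
    r.base  = (uint32_t)raw[2] | ((uint32_t)raw[3] << 8) |
              ((uint32_t)raw[4] << 16) | ((uint32_t)raw[5] << 24);
    return r;
}

/* Each descriptor is 8 bytes, so the table holds (limit + 1) / 8 entries. */
unsigned idt_entry_count(idtr_t r)
{
    return ((unsigned)r.limit + 1u) / 8u;
}

/* Linear address of the descriptor for one interrupt vector. */
uint32_t idt_entry_address(idtr_t r, unsigned vector)
{
    assert(vector < idt_entry_count(r));
    return r.base + 8u * vector;
}
```

With the bytes FF 07 00 F4 03 80 this yields a limit of 0x7ff, a base of 0x8003f400 and 0x100 entries, and vector 2 lands at 0x8003f410.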
Memory at 0x8003f400:
8003f400 50 25 08 00 00 8e 54 80 cc 26 08 00 00 8e 54 80 P%....T..&....T.
8003f410 2e 11 58 00 00 85 00 00 e0 2a 08 00 00 ee 54 80 ..X......*....T.
8003f420 60 2c 08 00 00 ee 54 80 c0 2d 08 00 00 8e 54 80 `,....T..-....T.
8003f430 34 2f 08 00 00 8e 54 80 ac 35 08 00 00 8e 54 80 4/....T..5....T.
8003f440 88 11 50 00 00 85 00 00 b0 39 08 00 00 8e 54 80 ..P......9....T.
8003f450 d0 3a 08 00 00 8e 54 80 10 3c 08 00 00 8e 54 80 .:....T..<....T.
8003f460 70 3e 08 00 00 8e 54 80 5c 41 08 00 00 8e 54 80 p>....T.\A....T.
8003f470 58 48 08 00 00 8e 54 80 90 4b 08 00 00 8e 54 80 XH....T..K....T.
[...]
8003f480 b0 4c 08 00 00 8e 54 80 ec 4d 08 00 00 8e 54 80 .L....T..M....T.
8003f490 90 4b a0 00 00 85 54 80 54 4f 08 00 00 8e 54 80 .K....T.TO....T.
8003f4a0 90 4b 08 00 00 8e 54 80 90 4b 08 00 00 8e 54 80 .K....T..K....T.
8003f4b0 90 4b 08 00 00 8e 54 80 90 4b 08 00 00 8e 54 80 .K....T..K....T.
8003f4c0 90 4b 08 00 00 8e 54 80 90 4b 08 00 00 8e 54 80 .K....T..K....T.
8003f4d0 90 4b 08 00 00 8e 54 80 90 4b 08 00 00 8e 54 80 .K....T..K....T.
8003f4e0 90 4b 08 00 00 8e 54 80 90 4b 08 00 00 8e 54 80 .K....T..K....T.
8003f4f0 90 4b 08 00 00 8e 54 80 0c 51 08 00 00 8e 6e 80 .K....T..Q....n.
[...]
8003f500 00 00 08 00 00 00 00 00 00 00 08 00 00 00 00 00 ................
8003f510 00 00 08 00 00 00 00 00 00 00 08 00 00 00 00 00 ................
8003f520 00 00 08 00 00 00 00 00 00 00 08 00 00 00 00 00 ................
8003f530 00 00 08 00 00 00 00 00 00 00 08 00 00 00 00 00 ................
8003f540 00 00 08 00 00 00 00 00 00 00 08 00 00 00 00 00 ................
8003f550 7e 1d 08 00 00 ee 54 80 80 1e 08 00 00 ee 54 80 ~.....T.......T.
8003f560 30 20 08 00 00 ee 54 80 c4 cd 08 00 00 ee cb ba 0 ....T.........
8003f570 01 18 08 00 00 ee 54 80 90 4b 08 00 00 8e 54 80 ......T..K....T.
[...]
For example, for int 0 we get
8003f400 : 50 25 08 00 00 8e 54 80

-----------------------------------------------------------------------
31                  16 | 15 | 14 13 | 12 | 11    8 | 7               0
-----------------------|----|-------|----|---------|-------------------
                       | 1  |  00   | 0  |  1110   | 0000 0000
8    0    5    4       |      8            E       | 0    0
-----------------------------------------------------------------------
31                  16 | 15                                          0
-----------------------|-----------------------------------------------
0    0    0    8       | 2    5    5    0
-----------------------------------------------------------------------
Bits 8~12 give the descriptor type. For int 0 the value is 0xE, which stands for an interrupt gate.
Indeed, in WinDbg we get
int 00: 80542550 nt!KiTrap00, so the KiTrapXX family of functions are the interrupt handlers.
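The manual decoding above can be reproduced in a few lines. This is a sketch with made-up names (`gate_t`, `gate_decode`); it takes the 8 raw bytes of one IDT entry exactly as they appear in the memory dump.

```c
#include <stdint.h>

typedef struct {
    uint32_t offset;    /* handler offset, bits 31..0      */
    uint16_t selector;  /* segment selector                */
    unsigned type;      /* 0x5 task, 0xE interrupt, 0xF trap */
    unsigned dpl;       /* descriptor privilege level      */
    unsigned present;   /* P bit                           */
} gate_t;

/* Decode one 8-byte IDT entry given in memory (little-endian) order. */
gate_t gate_decode(const uint8_t b[8])
{
    gate_t g;
    g.offset   = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
                 ((uint32_t)b[6] << 16) | ((uint32_t)b[7] << 24);
    g.selector = (uint16_t)(b[2] | (b[3] << 8));
    g.type     = b[5] & 0x0Fu;         /* bits 8..11 of the high dword  */
    g.dpl      = (b[5] >> 5) & 0x3u;   /* bits 13..14                   */
    g.present  = (b[5] >> 7) & 0x1u;   /* bit 15                        */
    return g;
}
```

Fed the bytes 50 25 08 00 00 8e 54 80 it gives offset 0x80542550, selector 0x0008, type 0xE, DPL 0 and P = 1. The int 3 entry (e0 2a 08 00 00 ee 54 80) decodes with DPL 3, which is why user-mode code may execute int 3.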
There are other kinds as well. If we check int 0x2,
8003f410: 2e 11 58 00 00 85 00 00
the descriptor type is 0x5, and the offset field reads 0x0000112e, which is not a usable handler address.
How do we find the ISR for int 0x2? The gate descriptor has a field called the segment selector.
To get the ISR we use the selector to look up the GDT, so the real ISR address is the base of the selected GDT descriptor plus the offset from the IDT entry.
(A GDT descriptor is 8 bytes like an IDT descriptor, but its fields are laid out differently.)
int 0: Segment Selector is 0x8,
In GDT:
Sel Type Base Limit DPL Attributes
0008 Code32 00000000 FFFFFFFF 0 RE
So int 0's ISR = 0 (segment base) + 80542550 = 80542550
int 2: Segment Selector is 0x58,
In GDT:
Sel Type Base Limit DPL Attributes
00058 TSS32 80872368 00000068 0 P
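This segment's type is TSS32, so int 2 is a task gate. For a task gate the offset field in the IDT entry is not used: the CPU switches to the task described by that TSS, and execution continues at the EIP saved in the TSS.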
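Looking up a selector in the GDT means taking its index and then reassembling the base and limit, which are scattered across a segment descriptor. A minimal sketch follows; the function names are illustrative, and the 8 bytes in the usage note are constructed by hand from the Base and Limit columns of the listing above (the access byte 0x89, an available 32-bit TSS, is an assumption, not a value read from the machine).

```c
#include <stdint.h>

/* Index of the descriptor a selector refers to (bits 3..15). */
unsigned selector_index(uint16_t selector)
{
    return selector >> 3;
}

/* Base address: bytes 2..4 hold bits 0..23, byte 7 holds bits 24..31. */
uint32_t segdesc_base(const uint8_t d[8])
{
    return (uint32_t)d[2] | ((uint32_t)d[3] << 8) |
           ((uint32_t)d[4] << 16) | ((uint32_t)d[7] << 24);
}

/* Limit: bytes 0..1 hold bits 0..15, low nibble of byte 6 holds bits 16..19. */
uint32_t segdesc_limit(const uint8_t d[8])
{
    return (uint32_t)d[0] | ((uint32_t)d[1] << 8) |
           ((uint32_t)(d[6] & 0x0Fu) << 16);
}
```

Selector 0x58 has index 11, so its descriptor sits at GDT base + 11 * 8. A descriptor with the bytes 68 00 68 23 87 89 00 80 decodes to base 0x80872368 and limit 0x68, matching the TSS32 line.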
Most descriptors in the IDT are interrupt gates. Next, we trace how an interrupt handler gets hooked up.
In Windows, the actual interrupt handlers are registered through IoConnectInterrupt. The listing below is the ReactOS implementation:
NTSTATUS
NTAPI
IoConnectInterrupt(OUT PKINTERRUPT *InterruptObject,
IN PKSERVICE_ROUTINE ServiceRoutine,
IN PVOID ServiceContext,
IN PKSPIN_LOCK SpinLock,
IN ULONG Vector,
IN KIRQL Irql,
IN KIRQL SynchronizeIrql,
IN KINTERRUPT_MODE InterruptMode,
IN BOOLEAN ShareVector,
IN KAFFINITY ProcessorEnableMask,
IN BOOLEAN FloatingSave)
{
PKINTERRUPT Interrupt;
PKINTERRUPT InterruptUsed;
PIO_INTERRUPT IoInterrupt;
PKSPIN_LOCK SpinLockUsed;
BOOLEAN FirstRun;
CCHAR Count = 0;
KAFFINITY Affinity;
PAGED_CODE();
/* Assume failure */
*InterruptObject = NULL;
/* Get the affinity */
Affinity = ProcessorEnableMask & KeActiveProcessors;/* get the CPU affinity */
while (Affinity)/* on a multiprocessor platform the ISR has to be connected on every CPU in the affinity mask */
{
/* Increase count */
if (Affinity & 1) Count++;/* count how many CPUs have to be handled */
Affinity >>= 1;
}
/* Make sure we have a valid CPU count */
if (!Count) return STATUS_INVALID_PARAMETER;
/* Allocate the array of I/O Interrupts */
IoInterrupt = ExAllocatePoolWithTag(NonPagedPool,
(Count - 1) * sizeof(KINTERRUPT) +
sizeof(IO_INTERRUPT),
TAG_KINTERRUPT);/* allocate the KINTERRUPT objects from non-paged pool */
if (!IoInterrupt) return STATUS_INSUFFICIENT_RESOURCES;
/* Select which Spinlock to use */
SpinLockUsed = SpinLock ? SpinLock : &IoInterrupt->SpinLock; /* if a SpinLock was passed in, use it; otherwise use the SpinLock inside the block just allocated */
/* We first start with a built-in Interrupt inside the I/O Structure */
*InterruptObject = &IoInterrupt->FirstInterrupt;
Interrupt = (PKINTERRUPT)(IoInterrupt + 1);
FirstRun = TRUE;
/* Start with a fresh structure */
RtlZeroMemory(IoInterrupt, sizeof(IO_INTERRUPT));
/* Now create all the interrupts */
Affinity = ProcessorEnableMask & KeActiveProcessors;
for (Count = 0; Affinity; Count++, Affinity >>= 1)/* loop over the processors this interrupt must be connected on and connect each one */
{
/* Check if it's enabled for this CPU */
if (Affinity & 1)
{
/* Check which one we will use */
InterruptUsed = FirstRun ? &IoInterrupt->FirstInterrupt : Interrupt;
/* Initialize it */
KeInitializeInterrupt(InterruptUsed,
ServiceRoutine,
ServiceContext,
SpinLockUsed,
Vector,
Irql,
SynchronizeIrql,
InterruptMode,
ShareVector,
Count,
FloatingSave); /* initialize the KINTERRUPT */
/* Connect it */
if (!KeConnectInterrupt(InterruptUsed))/* after initialization, this makes the actual connection */
{
/* Check how far we got */
if (FirstRun)
{
/* We failed early so just free this */
ExFreePool(IoInterrupt);
}
else
{
/* Far enough, so disconnect everything */
IoDisconnectInterrupt(&IoInterrupt->FirstInterrupt);
}
/* And fail */
return STATUS_INVALID_PARAMETER;
}
/* Now we've used up our First Run */
if (FirstRun)
{
FirstRun = FALSE;
}
else
{
/* Move on to the next one */
IoInterrupt->Interrupt[(UCHAR)Count] = Interrupt++;
}
}
}
/* Return Success */
return STATUS_SUCCESS;
}
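The first part of the routine is a population count of the affinity mask followed by a size calculation: one header plus one extra interrupt object per additional CPU. A standalone sketch of that arithmetic, with hypothetical function names and caller-supplied sizes standing in for sizeof(IO_INTERRUPT) and sizeof(KINTERRUPT):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the CPUs selected by an affinity mask (one bit per processor). */
unsigned count_processors(uint32_t affinity)
{
    unsigned count = 0;
    while (affinity) {
        if (affinity & 1u)
            count++;
        affinity >>= 1;
    }
    return count;
}

/* Bytes needed for the header (which embeds the first interrupt object)
 * plus one more interrupt object for every CPU after the first. */
size_t interrupt_block_size(unsigned count, size_t header_size,
                            size_t per_cpu_size)
{
    assert(count >= 1);
    return header_size + (size_t)(count - 1) * per_cpu_size;
}
```

A mask of 0x5 selects CPUs 0 and 2, so the count is 2 and the block needs room for the header plus one additional object; a mask of 0 gives a count of 0, the case the listing rejects with STATUS_INVALID_PARAMETER.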
So the functions that really matter are KeInitializeInterrupt and KeConnectInterrupt.
VOID
NTAPI
KeInitializeInterrupt(IN PKINTERRUPT Interrupt,
IN PKSERVICE_ROUTINE ServiceRoutine,
IN PVOID ServiceContext,
IN PKSPIN_LOCK SpinLock,
IN ULONG Vector,
IN KIRQL Irql,
IN KIRQL SynchronizeIrql,
IN KINTERRUPT_MODE InterruptMode,
IN BOOLEAN ShareVector,
IN CHAR ProcessorNumber,
IN BOOLEAN FloatingSave)
{
ULONG i;
PULONG DispatchCode = &Interrupt->DispatchCode[0], Patch = DispatchCode;
/* the real entry point of each KINTERRUPT is this DispatchCode array; it holds the machine code assembled from the interrupt entry stub */
/* Set the Interrupt Header */
Interrupt->Type = InterruptObject;
Interrupt->Size = sizeof(KINTERRUPT);
/* Check if we got a spinlock */
if (SpinLock)/* set the spinlock for this interrupt object */
{
Interrupt->ActualLock = SpinLock;
}
else
{
/* This means we'll be using the built-in one */
KeInitializeSpinLock(&Interrupt->SpinLock);
Interrupt->ActualLock = &Interrupt->SpinLock;
}
/* Set the other settings */
Interrupt->ServiceRoutine = ServiceRoutine;/* initialize the individual fields here */
Interrupt->ServiceContext = ServiceContext;
Interrupt->Vector = Vector;
Interrupt->Irql = Irql;
Interrupt->SynchronizeIrql = SynchronizeIrql;
Interrupt->Mode = InterruptMode;
Interrupt->ShareVector = ShareVector;
Interrupt->Number = ProcessorNumber;
Interrupt->FloatingSave = FloatingSave;
Interrupt->TickCount = (ULONG)-1;
Interrupt->DispatchCount = (ULONG)-1;
/* Loop the template in memory */
for (i = 0; i < KINTERRUPT_DISPATCH_CODES; i++)/* copy the machine instructions of the assembly routine KiInterruptTemplate into DispatchCode; this step is important */
{
/* Copy the dispatch code */
*DispatchCode++ = KiInterruptTemplate[i];
}
/* Sanity check */
ASSERT((ULONG_PTR)&KiChainedDispatch2ndLvl -
(ULONG_PTR)KiInterruptTemplate <= (KINTERRUPT_DISPATCH_CODES * 4));
/* Jump to the last 4 bytes */
Patch = (PULONG)((ULONG_PTR)Patch +
((ULONG_PTR)&KiInterruptTemplateObject -
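(ULONG_PTR)KiInterruptTemplate) - 4); /* note: KiInterruptTemplate is only a template, a skeleton. Move to the 4 bytes just before KiInterruptTemplateObject so that the address of this interrupt object can be written there; this is what ties the generic stub to one specific interrupt */
/* Apply the patch */
*Patch = PtrToUlong(Interrupt); /* write the address of the KINTERRUPT into those 4 bytes of the copied template */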
/* Disconnect it at first */
Interrupt->Connected = FALSE;
}
Now look at the code of KiInterruptTemplate and it becomes clear. KiInterruptTemplate lives in ntoskrnl/ke/i386/Traps.s.
.func KiInterruptTemplate
_KiInterruptTemplate:
/* Enter interrupt trap */
INT_PROLOG kit_a, kit_t, DoPushFakeErrorCode
_KiInterruptTemplate2ndDispatch:
/* Dummy code, will be replaced by the address of the KINTERRUPT */
mov edi, 0
_KiInterruptTemplateObject:/* the target of the jmp below is replaced during KeConnectInterrupt with the address of the function that actually dispatches the interrupt */
/* Dummy jump, will be replaced by the actual jump */
jmp _KeSynchronizeExecution@12
_KiInterruptTemplateDispatch:
/* Marks the end of the template so that the jump above can be edited */
TRAP_FIXUPS kit_a, kit_t, DoFixupV86, DoFixupAbios
.endfunc
/* Because
Patch = (PULONG)((ULONG_PTR)Patch +
((ULONG_PTR)&KiInterruptTemplateObject -
(ULONG_PTR)KiInterruptTemplate) - 4);
Patch ends up pointing at the immediate "0" of the mov edi, 0 instruction. After the patch, the instruction reads mov edi, <address of the KINTERRUPT>. */
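The same patching step can be tried on a plain byte array. The sketch below is a user-mode simulation, not ReactOS code: the 10-byte stub stands in for the tail of the template (the real one begins with the INT_PROLOG code), `OBJECT_LABEL` plays the role of KiInterruptTemplateObject, and all names are invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the tail of the template:
 *   BF imm32   mov edi, imm32
 *   E9 rel32   jmp rel32
 * OBJECT_LABEL is the offset of the byte right after the mov. */
enum { OBJECT_LABEL = 5, STUB_SIZE = 10 };

static const uint8_t stub_template[STUB_SIZE] = {
    0xBF, 0x00, 0x00, 0x00, 0x00,   /* mov edi, 0 (dummy immediate) */
    0xE9, 0x00, 0x00, 0x00, 0x00    /* jmp <dummy displacement>     */
};

/* Store a 32-bit value in little-endian byte order. */
void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Load a 32-bit little-endian value. */
uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Copy the template, then overwrite the 4 bytes that end at OBJECT_LABEL
 * (the immediate of the mov) with the address of the interrupt object. */
void stub_init(uint8_t stub[STUB_SIZE], uint32_t object_address)
{
    memcpy(stub, stub_template, STUB_SIZE);
    store_le32(stub + OBJECT_LABEL - 4, object_address);
}
```

After `stub_init(stub, 0x8A0004B0)` the first five bytes read BF B0 04 00 8A, that is mov edi, 0x8A0004B0, while the jmp is still untouched. The address is only a sample value borrowed from the KINTERRUPT shown for vector 63 in the WinDbg listing.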
Next, the other important function, KeConnectInterrupt.
BOOLEAN
NTAPI
KeConnectInterrupt(IN PKINTERRUPT Interrupt)
{
BOOLEAN Connected, Error, Status;
KIRQL Irql, OldIrql;
UCHAR Number;
ULONG Vector;
DISPATCH_INFO Dispatch;
/* Get data from interrupt */
Number = Interrupt->Number;
Vector = Interrupt->Vector;
Irql = Interrupt->Irql;
/* Validate the settings */
if ((Irql > HIGH_LEVEL) ||
(Number >= KeNumberProcessors) ||
(Interrupt->SynchronizeIrql < Irql) ||
(Interrupt->FloatingSave))
{
return FALSE;
}
/* Set defaults */
Connected = FALSE;
Error = FALSE;
/* Set the system affinity and acquire the dispatcher lock */
KeSetSystemAffinityThread(1 << Number);
OldIrql = KiAcquireDispatcherLock();
/* Check if it's already been connected */
if (!Interrupt->Connected)
{
/* Get vector dispatching information */
KiGetVectorDispatch(Vector, &Dispatch); /* fetch the dispatch information for this vector: for example whether an interrupt service is already connected to it, and whether it is shared or exclusive */
/* Check if the vector is already connected */
if (Dispatch.Type == NoConnect)/* no ISR is connected to this vector yet */
{
/* Do the connection */
Interrupt->Connected = Connected = TRUE;
/* Initialize the list */
InitializeListHead(&Interrupt->InterruptListEntry);
/* Connect and enable the interrupt */
KiConnectVectorToInterrupt(Interrupt, NormalConnect);/* connect the interrupt service here */
Status = HalEnableSystemInterrupt(Vector, Irql, Interrupt->Mode);/* ask the HAL to enable the interrupt for this vector, since unused interrupts are masked */
if (!Status) Error = TRUE;
}
else if ((Dispatch.Type != UnknownConnect) &&
(Interrupt->ShareVector) &&
(Dispatch.Interrupt->ShareVector) &&
(Dispatch.Interrupt->Mode == Interrupt->Mode))
{
/* The vector is shared and the interrupts are compatible */
ASSERT(FALSE); // FIXME: NOT YET SUPPORTED/TESTED
Interrupt->Connected = Connected = TRUE;
ASSERT(Irql <= SYNCH_LEVEL);
/* Check if this is the first chain */
if (Dispatch.Type != ChainConnect)/* the interrupt is shared and the entry point needed for shared dispatch has not been set up yet, so set it up */
{
/* Setup the chained handler */
KiConnectVectorToInterrupt(Dispatch.Interrupt, ChainConnect);
}
/* Insert into the interrupt list */
InsertTailList(&Dispatch.Interrupt->InterruptListEntry,
&Interrupt->InterruptListEntry);/* append the newly created interrupt to the list for this vector */
}
}
/* Unlock the dispatcher and revert affinity */
KiReleaseDispatcherLock(OldIrql);
KeRevertToUserAffinityThread();
/* Check if we failed while trying to connect */
if ((Connected) && (Error))/* something went wrong */
{
DPRINT1("HalEnableSystemInterrupt failed\n");
KeDisconnectInterrupt(Interrupt);
Connected = FALSE;
}
/* Return to caller */
return Connected;
}
The function that actually does the connecting is KiConnectVectorToInterrupt.
VOID
NTAPI
KiConnectVectorToInterrupt(IN PKINTERRUPT Interrupt,
IN CONNECT_TYPE Type)
{
DISPATCH_INFO Dispatch;
PKINTERRUPT_ROUTINE Handler;
PULONG Patch = &Interrupt->DispatchCode[0];
/* Get vector data */
KiGetVectorDispatch(Interrupt->Vector, &Dispatch); /* fetch the dispatch information for this vector */
/* Check if we're only disconnecting */
if (Type == NoConnect)/* to remove the interrupt service, set Handler to Dispatch.NoDispatch, the original default handler, which simply prints some debug information */
{
/* Set the handler to NoDispatch */
Handler = Dispatch.NoDispatch;
}
else
{
/* Get the right handler */
Handler = (Type == NormalConnect) ?
Dispatch.InterruptDispatch:
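Dispatch.ChainedDispatch;/* there are two kinds of interrupts, shared and exclusive. A shared interrupt has to loop over the registered service routines; an exclusive one just calls the registered routine directly. So the Handler differs */
ASSERT(Interrupt->FloatingSave == FALSE);
/* Set the handler */
Interrupt->DispatchAddress = Handler; /* store the Handler address in DispatchAddress */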
/* Jump to the last 4 bytes */
Patch = (PULONG)((ULONG_PTR)Patch +
((ULONG_PTR)&KiInterruptTemplateDispatch -
(ULONG_PTR)KiInterruptTemplate) - 4);
/* Apply the patch */
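*Patch = (ULONG)((ULONG_PTR)Handler - ((ULONG_PTR)Patch + 4)); /* rewrite the target of the jmp that ends at KiInterruptTemplateDispatch so that it goes to Handler. Handler minus (Patch + 4) is used because this is a relative jump: the displacement is measured from the end of the jmp instruction */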
/* Now set the final handler address */
ASSERT(Dispatch.FlatDispatch == NULL);
Handler = (PVOID)&Interrupt->DispatchCode;
}
/* Set the pointer in the IDT */
((PKIPCR)KeGetPcr())->IDT[Interrupt->Vector].ExtendedOffset =
(USHORT)(((ULONG_PTR)Handler >> 16) & 0xFFFF);
((PKIPCR)KeGetPcr())->IDT[Interrupt->Vector].Offset =
(USHORT)PtrToUlong(Handler);/* rewrite the corresponding IDT entry; once that is done the real interrupt handler is connected */
}
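Two pieces of arithmetic in this function are easy to check in isolation: the rel32 displacement of the patched jmp, and the split of the stub address into the two 16-bit halves of the IDT entry. A sketch with invented names, using 32-bit unsigned arithmetic so that backward jumps wrap the same way they do on the CPU:

```c
#include <stdint.h>

/* Displacement to store in a jmp rel32 whose 4-byte field starts at
 * field_address: the CPU adds it to the address of the next instruction. */
uint32_t jmp_rel32(uint32_t field_address, uint32_t target)
{
    return target - (field_address + 4u);
}

/* Inverse operation: where a jmp with this displacement lands. */
uint32_t jmp_target(uint32_t field_address, uint32_t rel32)
{
    return field_address + 4u + rel32;
}

typedef struct {
    uint16_t offset;           /* low 16 bits, the Offset field          */
    uint16_t extended_offset;  /* high 16 bits, the ExtendedOffset field */
} idt_offset_halves;

/* Split a 32-bit handler address the way the IDT entry stores it. */
idt_offset_halves split_handler(uint32_t handler)
{
    idt_offset_halves h;
    h.offset          = (uint16_t)(handler & 0xFFFFu);
    h.extended_offset = (uint16_t)(handler >> 16);
    return h;
}
```

For instance, the value 8a0004ec shown for vector 63 in the WinDbg listing would be stored as Offset 0x04EC and ExtendedOffset 0x8A00.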
Now look at how shared and non-shared interrupts are handled. First, the non-shared case.
.func KiInterruptDispatch@0
_KiInterruptDispatch@0:
/* Increase interrupt count */
inc dword ptr PCR[KPCR_PRCB_INTERRUPT_COUNT] /* increment the interrupt counter in the PCR */
/* Save trap frame */
mov ebp, esp /* save the trap frame pointer */
/* Save vector and IRQL */
mov eax, [edi+KINTERRUPT_VECTOR]/* before jumping here, edi already points at the KINTERRUPT; load the interrupt vector into eax */
mov ecx, [edi+KINTERRUPT_SYNCHRONIZE_IRQL]/* load the synchronize IRQL into ecx */
/* Save old irql */
push eax
sub esp, 4
/* Begin interrupt */
push esp
push eax
push ecx
call _HalBeginSystemInterrupt@12 /* this function was analyzed earlier */
/* Check if it was handled */
or al, al
jz SpuriousInt
/* Acquire the lock */
GetIntLock:
mov esi, [edi+KINTERRUPT_ACTUAL_LOCK]
ACQUIRE_SPINLOCK(esi, IntSpin) /* acquire the spinlock */
/* Make sure that this interrupt isn't storming */
VERIFY_INT kid
/* Save the tick count */
mov ebx, _KeTickCount
/* Call the ISR */
mov eax, [edi+KINTERRUPT_SERVICE_CONTEXT]
push eax
push edi
call [edi+KINTERRUPT_SERVICE_ROUTINE] /* call the actual interrupt service routine */
/* Check if the ISR timed out */
add ebx, _KiISRTimeout/* check whether the ISR took too long. ReactOS sets the maximum ISR time to 55 ticks; exceeding 55 ticks means something is wrong with the ISR */
cmp _KeTickCount, ebx
jnc IsrTimeout
ReleaseLock:
/* Release the lock */
RELEASE_SPINLOCK(esi)
/* Exit the interrupt */
INT_EPILOG 0
SpuriousInt:
/* Exit the interrupt */
add esp, 8
INT_EPILOG 1
#ifdef CONFIG_SMP
IntSpin:
SPIN_ON_LOCK(esi, GetIntLock)
#endif
IsrTimeout:
/* Print warning message */
push [edi+KINTERRUPT_SERVICE_ROUTINE]
push offset _IsrTimeoutMsg
call _DbgPrint
add esp,8
/* Break into debugger, then continue */
int 3
jmp ReleaseLock
/* Cleanup verification */
VERIFY_INT_END kid, 0
.endfunc
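The timeout test in the middle of the routine (save the tick count, add the timeout, compare, jnc) is an unsigned comparison against a deadline. A C rendering of just that check, with an invented function name and the tick values passed in as plain parameters:

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors:  mov ebx, _KeTickCount      (before the ISR call)
 *           add ebx, _KiISRTimeout
 *           cmp _KeTickCount, ebx
 *           jnc IsrTimeout             (taken when now >= deadline) */
bool isr_timed_out(uint32_t start_tick, uint32_t now_tick,
                   uint32_t timeout_ticks)
{
    uint32_t deadline = start_tick + timeout_ticks;
    return now_tick >= deadline;
}
```

With the 55-tick limit mentioned in the comment, an ISR that starts at tick 100 is reported once the tick count reaches 155.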
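That completes the analysis of how an interrupt handler is registered. To summarize:
The interface Windows offers driver developers is IoConnectInterrupt. The ReactOS implementation first calls KeInitializeInterrupt to initialize the interrupt object (KINTERRUPT); this step rewrites the immediate operand of the mov edi, 0 instruction in the copied KiInterruptTemplate to the address of the interrupt object. It then calls KeConnectInterrupt to make the actual connection; depending on whether the interrupt is shared or not, the target of the jmp that follows in the copied template is rewritten so that it jumps to the actual dispatch function. Finally, if this is the first connection on the vector, the IDT entry is rewritten as well.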
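The difference between the two dispatch styles can be sketched in portable C. This is an illustration of the idea only: the singly linked list, the names, and the policy of stopping at the first routine that claims the interrupt are assumptions made for the example, whereas the kernel links interrupt objects through InterruptListEntry and has its own rules.

```c
#include <stdbool.h>
#include <stddef.h>

/* A service routine returns true when the interrupt was its own. */
typedef bool (*service_routine)(void *context);

typedef struct isr_node {
    service_routine  routine;
    void            *context;
    struct isr_node *next;      /* NULL terminates the chain */
} isr_node;

/* Non-shared vector: exactly one routine is called. */
bool dispatch_single(const isr_node *node)
{
    return node->routine(node->context);
}

/* Shared vector: try each routine in turn until one claims the interrupt. */
bool dispatch_chained(const isr_node *head)
{
    for (const isr_node *n = head; n != NULL; n = n->next) {
        if (n->routine(n->context))
            return true;
    }
    return false;
}

/* Two sample routines; each counts its calls through the context pointer. */
bool isr_declines(void *context) { ++*(int *)context; return false; }
bool isr_claims(void *context)   { ++*(int *)context; return true; }
```

With a chain of one declining routine followed by two claiming ones, `dispatch_chained` calls the first two and stops.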
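1. What is the Checked Build?
Windows 2000 Professional, Windows XP Professional and Windows Server 2003 each have a special debug version called the checked build. It is the Windows operating system code recompiled with the "DBG" flag set, so it contains debugging information and is compiled without any code optimization.
The checked build is provided mainly for device driver developers, because it performs stricter error checking on the kernel-mode functions that device drivers and other system code call.
The checked build can also be used to trace particular components in more detail. (See Microsoft Knowledge Base article 314743, "HOWTO: Enable Verbose Debug Tracing in Various Drivers and Subsystems".)
2. How do you get the Checked Build?
The Checked Build is available through an MSDN subscription. The Windows XP Professional Service Pack 2 Checked Build can be downloaded free of charge (http://www.microsoft.com/downloads/details.aspx?familyid=7a4d8d12-9f5d-42bb-b31c-7b31657c869c&displaylang=en).
3. How do you install a partial Checked Build?
The checked build contains debugging information for the operating system components and is compiled without any code optimization, so it is larger and slower than the release build. Fortunately, we do not have to install the entire Checked Build. It is enough to copy the checked versions of the kernel image (ntoskrnl.exe) and the correct HAL (hal.dll) into an ordinary retail installation. The benefit is that device drivers and other kernel code get the strict checking of the checked build without having to run the slower debug versions of every other component.
A partial checked build is installed with the following steps:
1) Identify the system files to install.
Before installing a partial checked build, we must know which versions of the relevant system files (ntoskrnl.exe and so on) and of the HAL are used by the retail system already installed.
The Windows NT distribution media (such as the installation CD) generally carry several versions of the system files and HAL images, in order to suit different processor types and hardware platforms. During installation the operating system examines the hardware and copies the appropriate system files and HAL image into the system directory (%systemroot%\system32\).
The operating system picks the appropriate system file according to whether the platform is multiprocessor and whether PAE (Physical Address Extension) is supported.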
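ntoskrnl.exe
Uniprocessor x86, using no more than 4 GB of physical memory.
ntkrnlpa.exe
Uniprocessor x86, with PAE support.
ntkrnlmp.exe
Multiprocessor, using no more than 4 GB of physical memory.
ntkrpamp.exe
Multiprocessor, with PAE support.
Likewise, the HAL has correspondingly different files.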
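The four kernel images form a two-by-two table keyed on "multiprocessor" and "PAE". As a small sketch (the function name is made up; the file names are the ones listed above):

```c
#include <stdbool.h>

/* Pick the kernel image name for a platform, following the list above. */
const char *kernel_image_name(bool multiprocessor, bool pae)
{
    if (multiprocessor)
        return pae ? "ntkrpamp.exe" : "ntkrnlmp.exe";
    return pae ? "ntkrnlpa.exe" : "ntoskrnl.exe";
}
```

This only restates the table; the file actually used on a given machine is still best read from the setup log.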
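The files the current retail system used at install time can be found in the system's setup log (%systemroot%\repair\setup.log). For example, on my system the relevant part of the setup log reads:
…
\WINDOWS\system32\hal.dll = "halacpi.dll","181f2"
\WINDOWS\system32\ntkrnlpa.exe = "ntkrnlpa.exe","1f6612"
\WINDOWS\system32\ntoskrnl.exe = "ntoskrnl.exe","220d8c"
…
From the above, the system files my system uses are ntkrnlpa.exe and ntoskrnl.exe, and the HAL image is halacpi.dll. On a multiprocessor platform the log might instead contain:
…
\WINNT\system32\hal.dll = "halmacpi.dll","2bedf"
\WINNT\system32\ntkrnlpa.exe = "ntkrpamp.exe","1d66a6"
\WINNT\system32\ntoskrnl.exe = "ntkrnlmp.exe","1ce5c5"
...
In my case (the first excerpt), this shows that during installation the operating system copied halacpi.dll, ntkrnlpa.exe and ntoskrnl.exe from the installation media into the system directory. Choosing the correct version of each file matters; otherwise the system will not boot properly.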
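Each log line names the installed file on the left and the source file from the media in the first quoted field on the right. A minimal sketch of pulling out that source name, assuming lines shaped like the excerpts above; the function name is hypothetical and nothing here is part of a Windows API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy the first quoted field after '=' into out.
 * Returns false if the line has no such field or out is too small. */
bool setup_log_source(const char *line, char *out, size_t out_size)
{
    const char *eq = strchr(line, '=');
    if (eq == NULL)
        return false;
    const char *open = strchr(eq, '"');
    if (open == NULL)
        return false;
    const char *close = strchr(open + 1, '"');
    if (close == NULL)
        return false;
    size_t len = (size_t)(close - (open + 1));
    if (len == 0 || len >= out_size)
        return false;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return true;
}
```

Applied to the hal.dll line of the first excerpt it yields halacpi.dll, which is the HAL whose checked version has to be copied.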
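2) Copy the checked versions of the system files and the HAL image
Once the versions of the system files and the HAL image are known, we can copy the corresponding checked-build files into the system directory.
Extract the Windows XP Checked Build (I used WindowsXP-KB835935-SP2-DEBUG-ENU). In the .\i386\ directory find the three files halacpi.dl_, ntkrnlpa.ex_ and ntoskrnl.ex_ and copy them to %systemroot%\system32\. Their .dl_ and .ex_ suffixes indicate that they are compressed; decompress them with expand.exe. In cmd, change the current directory to %systemroot%\system32\ and run:
expand halacpi.dl_ halacpi.chk
expand ntkrnlpa.ex_ ntkrnlpa.chk
expand ntoskrnl.ex_ ntoskrnl.chk
The three resulting files, halacpi.chk, ntkrnlpa.chk and ntoskrnl.chk, are what a partial Checked Build installation needs.
3) Edit Boot.ini and add a boot entry for the Checked Build.
Installing a partial Checked Build means loading the checked versions of the system files (*.chk) at boot in place of the retail system files and HAL image. This only takes an extra boot entry in boot.ini that uses the /kernel and /hal options. Add the following line to boot.ini:
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows XP Checked Build" /fastdetect /kernel=ntoskrnl.chk /hal=halacpi.chk
Reboot and choose the Windows XP Checked Build entry to start the system with the partial Checked Build.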
From:http://blog.csdn.net/redutopia/archive/2008/08/31/2855443.aspx
http://www.osronline.com/article.cfm?id=282
THE NT INSIDER
http://www.osronline.com/section.cfm?section=17
If you asked a random collection of techno-dweebs to name the system service API that is native to Windows NT chances are the vast majority would say something that eventually translated to "The Win32 API". Those of you properly schooled in NT systems internals will know that this just is not the case. The Win32 API is implemented by a "Client-Side DLL" that is specific to the Win32 Subsystem. The Win32 Subsystem is just one of Windows NT’s Operating System (OS) Emulation Subsystems. All of NT’s OS Environment Subsystems (Win32, POSIX, OS/2, and DOS/WoW) utilize services provided by the Windows NT Executive. These services are accessed by the OS Environment Subsystems via Windows NT’s actual native system service API, which is called "the NT API".
A Special Purpose
From the ground up, Windows NT was designed to be an operating system that facilitates the emulation of other operating systems and their APIs. The NT OS itself works hard at providing a robust infrastructure, while not imposing undue constraints on its OS Environment Subsystem clients and their applications. For example, there is no uniform set of wildcards imposed on NT file systems. This allows one to build a file system that includes "*" or "?" in the names of files.
The native NT API was designed for the use of OS Environment Subsystems to provide services to their clients. Thus, when a program running under control of the Win32 OS Environment Subsystem wants to create a new process it uses the Win32 function call CreateProcess(...). The parameters to this function are designed to be easy to use and make sense within the Win32 OS Environment Subsystem’s framework. When the Win32 OS Environment Subsystem receives the call, it can check its own information on the calling process to see if (for example) the caller has the resources and privileges necessary to have the request granted. If all its internal requirements are met, the Win32 OS Environment Subsystem then issues a function call to the native NT API function NtCreateProcess(...) to request Windows NT to create a process on behalf of the user.
Not all client function calls are processed or even seen directly by the OS Environment Subsystem however. If NT’s native handling of a function is close enough to meet the OS Environment Subsystem’s requirements, and no additional protection or resource checks are required by the OS Environment Subsystem, the mapping between the subsystem’s function call and the native NT function call can be performed right in the client-side DLL. This is the case, for example, with file I/O requests to the Win32 subsystem. Win32 appears to rely for the most part on Windows NT’s native protection, quota, synchronization, and handle management schemes. Thus, a Win32 application’s ReadFile(...) call is translated within Win32’s client-side DLL to a call to the native NT function NtReadFile(...). The overhead of a call to the Win32 OS Environment Subsystem is thus saved for this very common, and performance critical, operation.
One of the most interesting things about the NT API is that it has never been comprehensively documented by Microsoft. This must make NT the world’s only commercially available operating system with an undocumented set of native system services! In one way, the existence of the NT API doesn’t really matter to users: Programs can do anything they legitimately need to by using the interface provided by their OS Environment Subsystem. On the other hand, it is through the native interfaces provided by an operating system that we get a feel for how the operating system actually works. Not to mention that in order to build your own reasonably efficient OS Environment Subsystem, you would have to know the native NT API.
Native NT File I/O
So, what does the NT API look like? Well, let’s take a look at a few of its most basic function calls.
First, NT’s native function to either access an existing file on disk or create a new such file is the NtCreateFile() function. The prototype for this function appears in Figure 1.
NTSYSAPI NTSTATUS NTAPI NtCreateFile(PHANDLE FileHandle,
ACCESS_MASK DesiredAccess,
POBJECT_ATTRIBUTES ObjectAttributes,
PIO_STATUS_BLOCK IoStatusBlock,
PLARGE_INTEGER AllocationSize,
ULONG FileAttributes,
ULONG ShareAccess,
ULONG CreateDisposition,
ULONG CreateOptions,
PVOID EaBuffer,
ULONG EaLength
);
Figure 1 -- NtCreateFile(...)
Notice that the parameters for NtCreateFile(...) are identical to those for the ZwCreateFile(...) function, which is extensively and clearly documented in the Windows NT DDK. This is more than mere coincidence. Each of the native NT System Services comes in a ZwXxxx and NtXxxx variant.
According to NT’s Kernel-Mode Glossary (which contains a wealth of fascinating trivia, by the way) the NtXxxx functions check the supplied parameters and access modes for validity and explicitly set the previous mode to USER mode. The ZwXxxx function variants do not. Hence, NT drivers call ZwCreateFile(...) when they are opening a file on their own behalf. OS Environment Subsystems (or, indeed, native NT applications) would call NtXxxx since they are calling from user mode.
Since the NT DDK is pretty clear on the meaning of the ZwCreateFile(...) function parameters, we’ll avoid describing each of them here. However, recall that this function is used to access lots of things other than files on disk. In addition to disk files, NtCreateFile() may be used to access a device, partition, directory, or even a socket, a pipe, or a mailslot.
One thing that might at first appear unusual about the NtCreateFile(...) function is that the name of the file to open is not one of the immediate parameters. Rather, the UNICODE_STRING specification for the file appears in the OBJECT_ATTRIBUTES structure supplied in the ObjectAttributes argument. This OBJECT_ATTRIBUTES structure is initialized with the function call InitializeObjectAttributes(...), which is also documented in the NT DDK. The prototype for InitializeObjectAttributes(...) is shown in Figure 2.
VOID InitializeObjectAttributes(POBJECT_ATTRIBUTES InitializedAttributes,
PUNICODE_STRING ObjectName,
ULONG Attributes,
HANDLE RootDirectory,
PSECURITY_DESCRIPTOR SecurityDescriptor
);
Figure 2 -- InitializeObjectAttributes(...)
The ObjectName parameter takes a pointer to a UNICODE_STRING which contains either a fully qualified path specification for the file to be opened, or a partial file specification relative to a previously opened directory. If the latter, the RootDirectory contains the handle to that previously opened directory. Note that there is no default directory for the file being opened as there is in Win32.
Using the OBJECT_ATTRIBUTES structure, there are two possible ways to open the file "fred.txt" in the directory C:\temp. Either you build a UNICODE_STRING for ObjectName that contains the fully qualified file path specification, such as "\DosDevices\C:\Temp\fred.txt", or else you first open the directory "\DosDevices\C:\Temp\" via a CreateFile(...) request, obtaining a handle to this open (directory) file. You then build a UNICODE_STRING for ObjectName that contains just the file name portion "fred.txt" and pass it, along with the handle to the open directory, to InitializeObjectAttributes(...).
Here, once again, NT shows its true colors as an operating system designed to facilitate the emulation of other operating systems. By not providing a default path specification, Windows NT allows its OS Environment Subsystems to define the path for a file without intrusion by the underlying system. In addition, whether or not the file specification is case sensitive is determined by the Attributes parameter to the InitializeObjectAttributes(...) function. This allows the OS Environment Subsystem to substitute its default behavior for that of Windows NT.
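To make the two naming rules concrete, here is a small user-space toy model in portable C. It is only a sketch: toy_effective_path and toy_names_match are hypothetical helpers invented for illustration, not NT functions, and the model assumes the root directory string has no trailing backslash. The only real constant is OBJ_CASE_INSENSITIVE, whose value is taken from the DDK headers.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Real attribute flag value from the DDK headers. */
#define OBJ_CASE_INSENSITIVE 0x00000040UL

/* Hypothetical helper: compare two name components, honoring the
 * OBJ_CASE_INSENSITIVE attribute. Returns 1 on match, 0 otherwise. */
int toy_names_match(const char *a, const char *b, unsigned long attributes)
{
    if (!(attributes & OBJ_CASE_INSENSITIVE))
        return strcmp(a, b) == 0;
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b; /* both strings must end together */
}

/* Hypothetical helper: build the path a request would resolve to.
 * With no root directory the name must be fully qualified, because
 * there is no default directory to fall back on.
 * Returns 0 on success, -1 on failure. */
int toy_effective_path(const char *root_dir, const char *object_name,
                       char *out, size_t out_size)
{
    int n;
    if (root_dir == NULL) {
        if (object_name[0] != '\\')
            return -1; /* relative name without a root directory */
        n = snprintf(out, out_size, "%s", object_name);
    } else {
        n = snprintf(out, out_size, "%s\\%s", root_dir, object_name);
    }
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

In this model, passing NULL as the root with "\DosDevices\C:\Temp\fred.txt", or passing "\DosDevices\C:\Temp" as the root with "fred.txt", yields the same effective path, while a bare "fred.txt" with no root is rejected.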
By the way, there is also a "short cut" API that can be used to access an already existing file, and hence dispenses with some of the lesser-used parameters. This is the NtOpenFile() function, which appears in Figure 3.
NTSYSAPI NTSTATUS NTAPI NtOpenFile(PHANDLE FileHandle,
ACCESS_MASK DesiredAccess,
POBJECT_ATTRIBUTES ObjectAttributes,
PIO_STATUS_BLOCK IoStatusBlock,
ULONG ShareAccess,
ULONG OpenOptions
);
Figure 3 -- NtOpenFile(...)
Note that things like a buffer for the ExtendedAttributes, and the CreateDisposition are absent from this call, making it quick and easy to code.
With the file opened, and the FileHandle returned, we can now proceed to issue read and write requests to the file.
NT’s native API to read from a file is the NtReadFile(...) function. The function to write to a file is the NtWriteFile(...) function. The prototypes for these two functions are identical, differing only in name. The prototype for NtReadFile(...) is shown in Figure 4.
NTSYSAPI NTSTATUS NTAPI NtReadFile(HANDLE FileHandle,
HANDLE Event,
PIO_APC_ROUTINE ApcRoutine,
PVOID ApcContext,
PIO_STATUS_BLOCK IoStatusBlock,
PVOID Buffer,
ULONG Length,
PLARGE_INTEGER ByteOffset,
PULONG Key
);
Figure 4 -- NtReadFile(...)
This function is also documented in the NT DDK in its ZwReadFile(...) variant. However, the documentation is not quite as extensive as it might be. Some notes on each of the parameters are shown in List 1.
Arguments to NtWriteFile(...)
FileHandle -- The file handle returned from the CreateFile(...) call. If you have to ask, you’re reading the wrong publication;
Event -- Handle to a previously created event, to use for synchronization. NTOS sets the state of this event to Signalled when the request is complete.
ApcRoutine -- An optional pointer to a user Asynchronous Procedure Call (APC) function to be called when the request completes. This is what Win32 refers to as a FileIoCompletionRoutine(...). This function will only be called if the calling thread is in an alertable wait state. A wait is "alertable" if the Alertable argument to the wait function (such as KeWaitForSingleObject(...)) is set to TRUE.
ApcContext -- If ApcRoutine was supplied, above, this is a context argument passed to that function when it is called.
IoStatusBlock -- A pointer to an IO_STATUS_BLOCK to receive the result for the operation. The returned data is not valid until the request completes.
Buffer -- Pointer to the user buffer for the operation;
Length -- Length in bytes of Buffer;
ByteOffset -- An optional pointer to a LARGE_INTEGER containing the byte offset into the file at which this operation is to begin.
Key -- An optional value for a key, corresponding to a previously granted lock taken out on the file.
List 1 -- Parameters
Beyond obviously writing to or reading from the file handle provided, the precise behavior of the NtReadFile(...) and NtWriteFile(...) calls depends on the values supplied during the NtCreateFile(...). For example, if during the NtCreateFile(...) operation the "Synchronize" flag is set in the DesiredAccess parameter, and one of the FILE_SYNCHRONOUS_IO_xxxx flags has been specified in the CreateOptions, calls to NtReadFile(...) and NtWriteFile(...) will complete synchronously and the I/O system will track the current read/write offset in the file.
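The offset tracking described here can be pictured with a small toy model in portable C. This is a sketch under stated assumptions, not the I/O manager's code: ToyFile and toy_transfer are hypothetical names, and the model assumes that a file opened without a synchronous option requires an explicit ByteOffset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the state the I/O system keeps per open file. */
typedef struct {
    int      synchronous;    /* opened with a FILE_SYNCHRONOUS_IO_xxxx option */
    uint64_t current_offset; /* position tracked on the caller's behalf */
} ToyFile;

/* Decide where a read or write of `length` bytes starts.
 * byte_offset models the optional ByteOffset argument (NULL = not given).
 * Returns 0 and stores the starting offset in *start, or -1 when no
 * offset is available. For synchronous files the tracked position is
 * advanced past the transferred bytes. */
int toy_transfer(ToyFile *f, const uint64_t *byte_offset,
                 uint32_t length, uint64_t *start)
{
    if (byte_offset != NULL)
        *start = *byte_offset;       /* explicit offset wins */
    else if (f->synchronous)
        *start = f->current_offset;  /* fall back to tracked position */
    else
        return -1;                   /* nothing tracked, nothing supplied */

    if (f->synchronous)
        f->current_offset = *start + length;
    return 0;
}
```

Two back-to-back calls with a NULL offset on a synchronous file therefore read consecutive regions, which is the behavior Win32 callers usually expect from ReadFile(...).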
There are two small differences between the NtReadFile(...) function and Win32’s ReadFile(...) and ReadFileEx(...) functions. While these differences are far from earth shattering, they still serve as an interesting example of how an OS Environment Subsystem customizes its interface to meet its users’ expectations or needs.
In NtReadFile(...) there is an explicit Key parameter. This Key is a value supplied by the user during a previously successful NtLockFile(...) function call. Supplying this Key allows the user to bypass a lock previously taken out on a specific region of the file. In the Win32 ReadFile(...) function, no Key value can be supplied. If a Win32 process has taken out a lock on a file, the Key for that lock is apparently automatically supplied by the Win32 Subsystem’s Client-Side DLL. Similarly, Win32 does not allow the user to specify the ApcContext, choosing to uniformly supply a pointer to its OVERLAPPED structure automatically instead.
A neat facility available through the native NT API but not through Win32 is the ability to cancel all I/O requests that you have outstanding on a particular file handle. The function to accomplish this is in Figure 5.
NTSYSAPI
NTSTATUS NTAPI NtCancelIoFile(HANDLE FileHandle,PIO_STATUS_BLOCK IoStatusBlock);
Figure 5 -- NtCancelIoFile()
If you’ve ever wondered how you accomplish an I/O cancel operation, now you know!
Finally, to close a previously opened file handle, the NtClose(...) function is used, shown in Figure 6.
NTSYSAPI NTSTATUS NTAPI NtClose(HANDLE HandleToClose);
Figure 6 -- NtClose(...)
The operation of this call is similar to that of the Win32 CloseHandle(...) function. Again, the differences between the native NT call and the Win32 function can be seen: The Win32 function returns a BOOLEAN and raises an exception (ugh!) if an invalid handle value is provided. The native NT function simply returns an NTSTATUS value that indicates the outcome of the call.
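Since every native call reports its outcome as an NTSTATUS, callers typically test the value rather than a boolean. The sketch below reproduces the NT_SUCCESS test as it is defined in the DDK headers, using a fixed-width typedef so that it compiles anywhere; toy_severity is a hypothetical helper added only to show that the top two bits of the status carry the severity.

```c
#include <assert.h>
#include <stdint.h>

/* NTSTATUS is a signed 32-bit value; int32_t stands in for LONG here. */
typedef int32_t NTSTATUS;

/* Same test the DDK headers use: non-negative values mean success
 * (or an informational status), negative values mean warning or error. */
#define NT_SUCCESS(status) (((NTSTATUS)(status)) >= 0)

#define STATUS_SUCCESS        ((NTSTATUS)0x00000000)
#define STATUS_INVALID_HANDLE ((NTSTATUS)0xC0000008)

/* Hypothetical helper: extract the 2-bit severity field
 * (0 = success, 1 = informational, 2 = warning, 3 = error). */
unsigned toy_severity(NTSTATUS status)
{
    return (unsigned)(((uint32_t)status) >> 30);
}
```

A caller of NtClose(...) would then write something like `if (!NT_SUCCESS(status)) { ... }` instead of checking for FALSE.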
Building Native NT Programs
While all this new-found knowledge might be interesting, it would all be totally academic if you couldn’t actually USE it. So how do you actually build a native NT program? Well, the list below shows how we do it at OSR:
- You’ll need to define the function prototypes. Each of the NT API functions needs to be defined as shown above.
- When you compile, you’ll need to include ntddk.h, or else redefine the many types and structures that are found only there and are required for these functions. For example, the structure OBJECT_ATTRIBUTES appears to be defined in ntddk.h and nowhere else.
- You’ll need to link against ntdll.lib, which resolves your function references to calls into ntdll.dll.
- Want to free yourself completely of subsystem control? When you link your program, define the subsystem under which it runs to be "native" (using the /Subsystem:native linker option).
And that’s all there is to it! If you’d like to grab a very simple sample application that writes "hello world" to a file using the native NT API, you can download it from our web site.
Just Because You Can
So, is this the new way we should all write our user-mode code from now on? What does one achieve by "going native" and directly using the Windows NT system service API?
Well, you could certainly achieve a lot of hassles. Since the NT API is neither documented nor supported, there’s nobody at Microsoft you can complain to if the interface changes arbitrarily or doesn’t work as (not) advertised. Plus, with the absence of a default directory for your file opens you get to manually specify the path name to each of the files you want to open. Remember, the NT API really wasn’t designed to be an end-user interface.
Is there any good reason at all, then, to use the NT API directly? Well, of course. For one, there’s the ability to cancel I/O requests, which you can’t even get at from Win32. Playing with the NT API also gives you a window into the handiwork of the NT developers. So much of NT is buried underneath other stuff, much of which might be considered pretty ugly by comparison. Using the NT API helps you get a better "feeling" for what NT itself does as an operating system, and how it works. Also, we have in our travels found at least one real application that required the use of the NT API.
Sidebar Discussion -- Nt vs. Zw Continued
The NT Insider, Vol 10, Issue 5, September - October 2003 | Published: 10-Nov-03| Modified: 10-Nov-03
One of the most fun things about writing for The NT Insider is the ability to generate comment and controversy around what you write. We exchanged a few emails as a result of last issue’s article Nt vs. Zw - Clearing Confusion On The Native API.
In that article we recommended that driver writers use the Zw variants of the routines so that their kernel mode credentials and buffers are properly handled. This of course assumes that you are servicing a request on behalf of a kernel mode component and are passing in kernel mode buffers. We wanted to emphasize that there are in fact many times when you might need to service a request on behalf of a user mode component, using the supplied user mode credentials and buffers. In these cases, it is much more desirable to use the Nt variant of the system service call because doing so places the onus of checking the user’s parameters on the Windows system service code. Remember, it’s your responsibility to validate buffers passed into a Zw call!
Some folks have gone so far as to suggest that, even though the Nt form of the system service calls aren’t specifically documented (and the prototypes aren’t supplied), they should be the version you use by default. In fact, we tend to agree that this is the most conservative course. This is at least slightly controversial, however, and even our friends up at Microsoft don’t necessarily agree on which approach is best.
Another point we’d like to emphasize has to do with the use of kernel handles. Not only does using kernel handles allow for access to the handles from all process contexts, it also adds increased security. If you do not specify OBJ_KERNEL_HANDLE when creating a handle, the handle is valid in the current process context and accessible from that process. A malicious or buggy application could delete the handle or replace it to point off to something that it should not be able to access. However, as was pointed out in the article and sample code, user mode components have no access to kernel handles so using OBJ_KERNEL_HANDLE closes this security hole.
Nt vs. Zw - Clearing Confusion On The Native API
The NT Insider, Vol 10, Issue 4, July-August 2003 | Published: 15-Aug-03| Modified: 27-Aug-03
The NT native API is nothing new. It’s been discussed ad nauseam, it’s been exploited by umpteen different utilities, and portions of it have even migrated into the realm of the fully documented and supported in the DDK. Come to think of it, why am I even writing about this then? I think I will just pop in my Kung Faux DVD and watch the Ill Master take care of business…Oh yeah, now I remember why I started writing this: Believe it or not, people are still confused about certain aspects of the native API. Common questions include:
- Why are there two flavors, NtXxx and ZwXxx?
- Why do my calls to ZwXxx sometimes fail, but sometimes work?
- What does the Zw stand for? Was the name of the original NT developer really Zimbanza Woobie!?
OK, well, maybe people don’t really ask that last one very often. But for those that do, Zw is entirely random and the developers chose it specifically because it could never mean anything. The other questions do often come up on the NTDEV and NTFSD mailing lists though (see http://www.osronline.com/lists for more info on the NTDEV and NTFSD peer help lists), so it is about time the record was set straight. In order to do this we are going to do a bit of disassembly. All listings will be from an XP SP1 Free build. Also, note that we’ve got sample driver code to accompany this article that shows you how to use the native system services from Kernel Mode. See the description (and URL) of the sample code provided at the end of this article.
This article assumes that the reader already understands that there is a native API and understands how the native API relates to the other subsystems in Windows. Enough information already exists on this out there that it is not worth repeating in this article.
Vanilla or Chocolate
First, let’s do a bit of math that even I can handle. We have two sets of APIs, NtXxx and ZwXxx, and two modes to call them from, User and Kernel. This means that we have four different scenarios under which we can call these routines. Using XxReadFile as the example, we have:
- User Mode application calls NtReadFile
- User Mode application calls ZwReadFile
- Kernel Mode driver calls NtReadFile
- Kernel Mode driver calls ZwReadFile
What exactly are the differences in these scenarios? Let us start by talking about the land where no driver writer feels safe, User Mode.
Calling From User Mode
As you (probably) know, User Mode applications link with NTDLL.LIB. Sticking with our example of XxReadFile, let’s compare the disassembly of the NtReadFile and ZwReadFile routines within NTDLL:
0: kd> u ntdll!NtReadFile
ntdll!NtReadFile:
77f761e8 b8b7000000 mov eax,0xb7
77f761ed ba0003fe7f mov edx,0x7ffe0300
77f761f2 ffd2 call edx
77f761f4 c22400 ret 0x24
That looks to me to be a stub that calls another routine and returns. Further inspection will definitely be necessary, but let’s just check out ZwReadFile before we move on.
0: kd> u ntdll!ZwReadFile
ntdll!NtReadFile:
77f761e8 b8b7000000 mov eax,0xb7
77f761ed ba0003fe7f mov edx,0x7ffe0300
77f761f2 ffd2 call edx
77f761f4 c22400 ret 0x24
Well look at that! They both point to the exact same place, which means that from a User Mode program it does not matter which routine you call because you are going to end up in the same place anyway. If you pick any other system service call you will notice that they all have this exact format, so our example will apply to any API you choose. The good news is that this article just got a bit shorter and I’ll be chillin’ with the Ill Master in less time than I thought.
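One detail of the stub is easy to cross-check: the final `ret 0x24`. Native API routines use the stdcall convention (NTAPI), so the callee pops its own arguments, and on 32-bit x86 each argument occupies one 4-byte stack slot. The following sketch computes the immediate such a routine would use; stdcall_ret_immediate is a hypothetical helper name, and it assumes every argument fits in a single slot, which holds for the handles, pointers and ULONGs of the NtReadFile prototype.

```c
#include <assert.h>
#include <stdint.h>

/* Each stdcall argument takes one 4-byte slot on 32-bit x86. */
enum { X86_STACK_SLOT_BYTES = 4 };

/* Hypothetical helper: number of bytes a stdcall routine with
 * `arg_count` single-slot arguments pops with its `ret imm16`. */
uint16_t stdcall_ret_immediate(unsigned arg_count)
{
    return (uint16_t)(arg_count * X86_STACK_SLOT_BYTES);
}
```

With the nine parameters of NtReadFile the result is 36, i.e. 0x24, which matches the disassembly.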
Now let’s see what exactly is at address 0x7ffe0300, which is where we jump when we make these calls (and, as mentioned previously, where we jump when we make any native API call from User Mode).
0: kd> ln 0x7ffe0300
(7ffe0300) SharedUserData!SystemCallStub
Exact matches:
SharedUserData!SystemCallStub
0: kd> u SharedUserData!SystemCallStub
SharedUserData!SystemCallStub:
7ffe0300 8bd4 mov edx,esp
7ffe0302 0f34 sysenter
7ffe0304 c3 ret
Now how’s that for a straight-forward routine: Something (turns out it’s the code that represents which system service was called) gets put into EAX by the caller, then this routine puts a pointer to the top of the User Mode stack into EDX. Ohh, SYSENTER, uhm, ah, of course…Must be a new instruction. Let me see, that was added in…1997?? Forging ever onward, let us check the Intel documentation for SYSENTER. It says here that the SYSENTER instruction switches the current thread into Kernel Mode and executes the routine pointed to by the SYSENTER_EIP_MSR, which is MSR 0x176.
This is a good time to point out why hooking INT 2E is a bad idea and why old INT 2E hooks will not work. On systems that support SYSENTER, INT 2E is simply not used anymore. Your hook is useless if no one is ever going to call it!
Going back to WinDBG, let us execute the rdmsr command and see what is in the SYSENTER_EIP_MSR:
0: kd> rdmsr 176
msr[176] = 00000000:8053a270
Very interesting. Let’s see what that address is:
0: kd> ln 8053a270
(8053a270) nt!KiFastCallEntry | (8053a2fb) nt!KiSystemService
Exact matches:
nt!KiFastCallEntry
I will spare all of you the disassembly of KiFastCallEntry. It is an interesting read so I suggest it if you are curious, but all the code is going to do is build a trap frame so that when we exit Kernel Mode we can continue executing from where we left off. I will show the very last line of the function though:
8053a2f9 eb5c jmp nt!KiSystemService+0x5c (8053a357)
We can see here that KiFastCallEntry does not actually return, it just does an unconditional jump to some offset into KiSystemService. Again sparing the reader large amounts of disassembly, the code in KiSystemService eventually takes the service number that was put into EAX on the first line of the call to XxReadFile and looks up its entry in the system service table, KiServiceTable. Each entry in this table is a pointer to a native API, also known as "system service", routine. Before calling the "system service" routine, the system service dispatch code copies the parameters that are being passed to the system service from the top of the User stack to the top of the Kernel stack. Ah! Guess that’s why a pointer to the top of the stack is saved into EDX before executing the SYSENTER.
Using the debugger extension DLL accompanying this article, we can see that index 0xb7 points to the kernel version of NtReadFile:
0: kd> !osrexts.sst
0: 0x805912c2 (nt!NtAcceptConnectPort)
1: 0x805d87b0 (nt!NtAccessCheck)
2: 0x805dc3e4 (nt!NtAccessCheckAndAuditAlarm)
...
b7: 0x8056b2ec (nt!NtReadFile)
...
And, if we look at the address of nt!NtReadFile (address 0x8056b2ec), we see:
0: kd> u nt!NtReadFile
nt!NtReadFile:
8056b2ec 6a58 push 0x58
8056b2ee 6858044e80 push 0x804e0458
8056b2f3 e8e09ffcff call nt!_SEH_prolog (805352d8)
8056b2f8 33ff xor edi,edi
8056b2fa 897de4 mov [ebp-0x1c],edi
8056b2fd 897de0 mov [ebp-0x20],edi
8056b300 897dd8 mov [ebp-0x28],edi
8056b303 897ddc mov [ebp-0x24],edi
8056b306 64a124010000 mov eax,fs:[00000124]
8056b30c 8945d4 mov [ebp-0x2c],eax
8056b30f 8a8040010000 mov al,[eax+0x140]
8056b315 8845d0 mov [ebp-0x30],al
8056b318 57 push edi
8056b319 8d45cc lea eax,[ebp-0x34]
8056b31c 50 push eax
8056b31d ff75d0 push dword ptr [ebp-0x30]
Ah, Finally! It looks like the function that actually implements the read file system service.
So, to summarize the flow of a native API call from User Mode:
- User Mode program calls either NtXxx or ZwXxx, both of which point to the same location
- All native API calls from User Mode have a body that simply loads an index into EAX, executes SystemCallStub, and returns
- SystemCallStub saves a pointer to the top of the User Mode stack into EDX and executes a SYSENTER instruction
- SYSENTER disables interrupts, switches the thread into Kernel Mode and executes the instruction located in the SYSENTER_EIP_MSR (which on XP SP1 is KiFastCallEntry)
- KiFastCallEntry builds a trap frame so it knows where to go when returning back to User Mode, enables interrupts, and jumps into KiSystemService
- KiSystemService, amongst doing other things, copies the parameters from the User stack (pointed to by EDX) and takes the value previously stored in EAX and executes the function located at KiServiceTable[EAX]
The native API now executes in Kernel Mode with the previous mode of the thread set to User Mode. This indicates the caller came from User Mode. If you are going to remember one thing about this exercise, remember this! We’ll talk about it much more later in this article.
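The last step of that flow, indexing a table with the value from EAX and copying the arguments found via EDX, can be sketched as a user-space toy in portable C. Everything here is hypothetical and simplified: the table, the two toy services, and the choice to store the argument count next to each handler are assumptions made for illustration, not the layout of KiServiceTable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_ARGS 16
#define TOY_STATUS_INVALID_SERVICE (-1)

/* A toy "system service": receives its arguments from the kernel-side copy. */
typedef int32_t (*ToyService)(const uint32_t *args);

typedef struct {
    ToyService handler;   /* routine implementing the service */
    unsigned   arg_count; /* how many 32-bit arguments to copy */
} ToyServiceEntry;

static int32_t toy_first(const uint32_t *args) { return (int32_t)args[0]; }
static int32_t toy_add(const uint32_t *args)   { return (int32_t)(args[0] + args[1]); }

/* Hypothetical service table, indexed by the service number. */
static const ToyServiceEntry toy_service_table[] = {
    { toy_first, 1 }, /* service 0 */
    { toy_add,   2 }, /* service 1 */
};

/* eax: service index loaded by the stub; edx: pointer to the caller's
 * arguments. The arguments are copied to a private buffer before the
 * handler runs, mirroring the user-stack to kernel-stack copy. */
int32_t toy_dispatch(uint32_t eax, const uint32_t *edx)
{
    uint32_t kernel_stack[TOY_MAX_ARGS];
    const size_t count = sizeof toy_service_table / sizeof toy_service_table[0];
    const ToyServiceEntry *entry;

    if (eax >= count)
        return TOY_STATUS_INVALID_SERVICE; /* index outside the table */

    entry = &toy_service_table[eax];
    memcpy(kernel_stack, edx, entry->arg_count * sizeof kernel_stack[0]);
    return entry->handler(kernel_stack);
}
```

Copying the arguments before use means the handler works on a snapshot that the caller can no longer change, which is one reason the real dispatcher needs the stack pointer in EDX.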
Now that we have gone through a gross amount of detail for the User Mode portion, we should be able to zip right through the Kernel Mode variants.
Calling From Kernel Mode
As you (should) know, Kernel Mode components link with NTOSKRNL.LIB. Let’s continue to use XxReadFile and see what the two variants look like from the kernel side of things. First, let’s try NtReadFile:
0: kd> u nt!NtReadFile
nt!NtReadFile:
8056b2ec 6a58 push 0x58
8056b2ee 6858044e80 push 0x804e0458
8056b2f3 e8e09ffcff call nt!_SEH_prolog (805352d8)
8056b2f8 33ff xor edi,edi
...
Well, this looks familiar! It’s the function that implements NtReadFile that was eventually called from User Mode (because it is where the system service table points to). Therefore, notice that if we call NtReadFile from a driver, we just execute the function, bypassing any common system service dispatcher type of entry point.
Going on what I have seen before in User Mode, where NtXxxx and ZwXxxx were identical, when I disassemble nt!ZwReadFile I’d probably expect to see exactly what I saw in nt!NtReadFile. Let’s check:
0: kd> u nt!ZwReadFile
nt!ZwReadFile:
80504d4c b8b7000000 mov eax,0xb7
80504d51 8d542404 lea edx,[esp+0x4]
80504d55 9c pushfd
80504d56 6a08 push 0x8
80504d58 e89e550300 call nt!KiSystemService (8053a2fb)
80504d5d c22400 ret 0x24
Blast! I guess I have got a bit longer before I can lounge.
We see a familiar instruction in the beginning, move 0xb7 into EAX. Then we put a pointer to the parameters that appear on the Kernel stack into EDX, push the EFLAGS and a constant value onto the stack, and finally call KiSystemService!? That was the function that we wound-up calling from KiFastCallEntry when we did the SYSENTER from User Mode.
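The targets WinDbg prints next to the call and jmp instructions in these listings can be checked by hand: for relative branches, the target is the address of the next instruction plus the signed displacement encoded in the instruction bytes. The sketch below does that arithmetic in portable C; rel_branch_target and read_rel32 are hypothetical helper names, and the short jmp at the end of KiFastCallEntry is taken to sit at 0x8053a2f9, two bytes before nt!KiSystemService.

```c
#include <assert.h>
#include <stdint.h>

/* Target of a relative branch: address of the following instruction
 * plus the signed displacement, wrapping modulo 2^32. */
uint32_t rel_branch_target(uint32_t insn_addr, uint32_t insn_len, int32_t disp)
{
    return insn_addr + insn_len + (uint32_t)disp;
}

/* Read the little-endian 32-bit displacement that follows an
 * E8 (call rel32) opcode. */
int32_t read_rel32(const uint8_t *p)
{
    uint32_t v = (uint32_t)p[0]
               | ((uint32_t)p[1] << 8)
               | ((uint32_t)p[2] << 16)
               | ((uint32_t)p[3] << 24);
    return (int32_t)v;
}
```

Feeding it the bytes e8 9e 55 03 00 at 0x80504d58 gives 0x8053a2fb (nt!KiSystemService), the bytes e8 e0 9f fc ff at 0x8056b2f3 give 0x805352d8 (nt!_SEH_prolog), and the two-byte eb 5c at 0x8053a2f9 gives 0x8053a357.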
So why aren’t we executing a SYSENTER here? Duh! Because we are already in Kernel Mode, so what is the point of entering it again? The most important thing that is going to happen when we go this route is that we are going to call the native API from Kernel Mode, execute in Kernel Mode, and in the course of going through KiSystemService our previous mode will be set to Kernel Mode. Note that this is definitely not the case if we just call the NtXxx version from Kernel Mode. In that case, our previous mode stays untouched and we go right to the function and start executing.
So, to summarize the flow of a native API call from Kernel Mode:
Case A:
Case B:
So, it’s clear that calling NtXxx directly has less overhead, but calling ZwXxxx changes previous mode. So, what’s up with that? It seems like previous mode must be something pretty important.
Previous Mode
Time to step back and figure out what all of this means. An important fact to know is that Kernel Mode components by default trust all other Kernel Mode components. Because system services are always processed in Kernel Mode, Windows keeps track of whether the request originated from User Mode or Kernel Mode to determine if the caller is to be implicitly trusted. The system uses the previous mode indicator to determine the mode from which a system service call came. When a call comes from User Mode, previous mode is set to User. When a system service processing routine needs to determine whether or not to implicitly trust its caller, it checks the value of previous mode. If previous mode is set to User, the system service processing routine knows the call came from User Mode and thus any parameters passed in to the function need to be validated before they can be used.
This is why the previous mode being set is really the most important part about what we have talked about so far. No matter what a User Mode application does, the system treats its system service request as a User request, coming from User Mode, and goes out of its way to validate the request. All buffers are subject to validation, all access checks are performed, and absolutely no part of the request is implicitly trusted. However, a Kernel Mode request is not as scrutinized and it is assumed that the passed in parameters are valid.
If a Kernel component calls the ZwXxx version of a native API, all is well. The previous mode is set to Kernel and the credentials of the Kernel are used. The system service processing routine that is called assumes that any parameters that are passed are valid, because the request came from a Kernel Mode component (and Kernel Mode components implicitly trust each other).
The NtXxxx version of the native system service is the name of the function itself. Thus, when a Kernel Mode component calls the NtXxxx version of the system service, whatever is presently set into previous mode is unchanged. Thus, it is quite possible that the Kernel component could be running on an arbitrary User stack, with the requestor mode set to User. The system service will not know any better, attempt to validate the request parameters, possibly using the credentials of the arbitrary User Mode thread, and thus possibly fail the request. Another problem here is that one step in the validation process for a User Mode request is that all passed in buffers have either ProbeForRead or ProbeForWrite executed on them, depending on the buffer’s usage. These routines raise exceptions if executed on Kernel Mode addresses. Therefore, if you pass in Kernel Mode buffers with your request mode set to User, your calls into the native API return STATUS_ACCESS_VIOLATION.
The moral of this bedtime story is that if you are in User Mode, use whatever variant you think makes your code look pretty. In Kernel Mode, use the ZwXxx routines and get your previous mode set properly, to Kernel Mode.
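The difference between the two kernel-mode paths can be captured in a toy model. This is a user-space sketch in portable C, not kernel code: ToyThread, toy_NtReadFile and toy_ZwReadFile are hypothetical names, the 0x80000000 boundary between user and kernel addresses is an assumption of the model, and restoring the saved previous mode on the way out is how the sketch chooses to represent the return from the dispatcher.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t NTSTATUS;
#define STATUS_SUCCESS          ((NTSTATUS)0x00000000)
#define STATUS_ACCESS_VIOLATION ((NTSTATUS)0xC0000005)

typedef enum { KernelMode = 0, UserMode = 1 } ToyMode;

/* Assumed boundary for this sketch: addresses at or above it are kernel addresses. */
#define TOY_KERNEL_BASE 0x80000000u

typedef struct {
    ToyMode previous_mode; /* mode the current request is attributed to */
} ToyThread;

/* Stand-in for ProbeForRead/ProbeForWrite: only user addresses pass. */
static int toy_probe_ok(uint32_t buffer_addr)
{
    return buffer_addr < TOY_KERNEL_BASE;
}

/* NtXxx: the service routine itself. It does not touch previous mode;
 * it only consults it to decide whether the buffer must be probed. */
NTSTATUS toy_NtReadFile(ToyThread *t, uint32_t buffer_addr)
{
    if (t->previous_mode == UserMode && !toy_probe_ok(buffer_addr))
        return STATUS_ACCESS_VIOLATION;
    return STATUS_SUCCESS;
}

/* ZwXxx called from kernel mode: goes through the dispatcher, which
 * marks the request as coming from kernel mode for its duration. */
NTSTATUS toy_ZwReadFile(ToyThread *t, uint32_t buffer_addr)
{
    ToyMode saved = t->previous_mode;
    NTSTATUS status;

    t->previous_mode = KernelMode;
    status = toy_NtReadFile(t, buffer_addr);
    t->previous_mode = saved;
    return status;
}
```

A driver running in the context of a user thread (previous mode User) that passes a kernel buffer gets STATUS_ACCESS_VIOLATION from toy_NtReadFile but success from toy_ZwReadFile, which is the failure pattern described above.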
If I keep this up I am going to be seriously late for my date with Queenie, but she is just going to have to wait because there is still more to cover.
I’ll Handle This
All of the native API calls work with handle values, which index into one of two types of handle tables. A Handle either describes an entry in a table that is effectively a part of the EPROCESS structure (which means it describes an object that is specific to a particular process context) or it describes an entry in a global handle table (which means it describes an object that is visible to all process contexts). This makes for some interesting scenarios.
Say you have an existing driver and you decide that being able to optionally log to a file would be a nice feature. First thing you do is set up two IOCTLs, one to enable the logging and the other to disable the logging. In the handler for the IOCTL, to enable logging, you have the driver call ZwCreateFile (remember to use the Zw versions!), which returns you a handle to use to write to the file. So far so good.
InitializeObjectAttributes(&oa, &logFileName, OBJ_CASE_INSENSITIVE,
                           NULL, NULL);
code = ZwCreateFile(&devExt->LogFileHandle, GENERIC_WRITE,
                    &oa, &iosb, NULL, FILE_ATTRIBUTE_NORMAL,
                    0, FILE_OVERWRITE_IF,
                    FILE_NON_DIRECTORY_FILE | FILE_SYNCHRONOUS_IO_NONALERT,
                    NULL, 0);
From here, you set up a flag in your device extension that indicates that you are logging to a file, and start to add calls to ZwWriteFile to all of your dispatch entry points.
if (devExt->LoggingEnabled) {
    /* No event, APC routine, or APC context; no byte offset or key. */
    code = ZwWriteFile(devExt->LogFileHandle, NULL, NULL, NULL,
                       &iosb, (PVOID)logMessage,
                       logMessageLen,
                       NULL, NULL);
}
You note that a restriction of ZwWriteFile is that you must call it at PASSIVE_LEVEL, so you set up work items to do the logging for your timer DPC and DpcForIsr. Then you enable logging on your device and something weird happens. All of your calls to ZwWriteFile in your dispatch entry points succeed, but the ones in your work items return STATUS_INVALID_HANDLE! How can a handle switch back and forth between being valid and invalid when you have done nothing but open it and write to it?
Remember that you created that handle in your dispatch entry point. Therefore, you could have been running in the process context of the calling application when you created that handle. In this case, your handle references an object in your User Mode application’s handle table, which is located via its EPROCESS. Your work items are running in the SYSTEM process context, so your call to ZwWriteFile is correctly failed with STATUS_INVALID_HANDLE. This is because the handle that you’re passing in is meaningless in the SYSTEM process’ context.
So what is the answer? Give up on Windows and start a revolution to bring back MULTICS? Luckily, it doesn’t have to come to that. There is already a built-in solution to this problem. All you need to do is specify OBJ_KERNEL_HANDLE as one of your object’s attributes and that handle will be good in any context you might end up calling it in. This flag is the cue to the Object Manager that you want the handle to go into the global handle table, making it visible in all process contexts.
InitializeObjectAttributes(&oa, &logFileName,
                           OBJ_CASE_INSENSITIVE | OBJ_KERNEL_HANDLE,
                           NULL, NULL);
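The difference between the two kinds of handle can be sketched with another User Mode toy model. Again, this is an illustration under stated assumptions and not how the Object Manager is actually implemented: ModelProcess, ModelHandleTable, model_create_handle, model_lookup, and the MODEL_KERNEL_HANDLE_FLAG tag bit are all hypothetical names invented for the example. A handle created without the kernel flag lands in the creating process' table. One created with it lands in a single shared table.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_TABLE_SLOTS 8
/* Assumed tag bit marking a handle as living in the shared kernel table. */
#define MODEL_KERNEL_HANDLE_FLAG 0x80000000u

typedef struct { const char *objects[MODEL_TABLE_SLOTS]; } ModelHandleTable;
typedef struct { ModelHandleTable table; } ModelProcess;

/* One table visible from every process context in this model. */
static ModelHandleTable model_kernel_table;

/* Store obj in the first free slot; slot 0 is reserved to mean "no handle". */
static unsigned model_insert(ModelHandleTable *t, const char *obj)
{
    unsigned i;
    for (i = 1; i < MODEL_TABLE_SLOTS; i++) {
        if (t->objects[i] == NULL) {
            t->objects[i] = obj;
            return i;
        }
    }
    return 0;
}

/* Create a handle while running in 'current'; kernel_handle models OBJ_KERNEL_HANDLE. */
static unsigned model_create_handle(ModelProcess *current, const char *obj,
                                    int kernel_handle)
{
    assert(current != NULL);
    if (kernel_handle) {
        unsigned i = model_insert(&model_kernel_table, obj);
        return i ? (i | MODEL_KERNEL_HANDLE_FLAG) : 0;
    }
    return model_insert(&current->table, obj);
}

/* Resolve a handle while running in 'current'; NULL models STATUS_INVALID_HANDLE. */
static const char *model_lookup(const ModelProcess *current, unsigned h)
{
    const ModelHandleTable *t = &current->table;
    if (h & MODEL_KERNEL_HANDLE_FLAG) {
        t = &model_kernel_table;
        h &= ~MODEL_KERNEL_HANDLE_FLAG;
    }
    if (h == 0 || h >= MODEL_TABLE_SLOTS)
        return NULL;
    return t->objects[h];
}
```

In this model, a handle created in the application's context without the kernel flag resolves only when looked up from that same ModelProcess, which mirrors the work item failure described above. A handle created with the flag resolves from any ModelProcess.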
Accompanying Samples
To see some of what we’ve talked about in action, this article has an accompanying sample for you to experiment with. The sample driver creates a log file in the root directory of the C: drive in response to an IOCTL. At the beginning of the main C file is a compile-time flag, USER_HANDLE. If this flag is not set, the driver creates the handle as a Kernel Mode handle by using OBJ_KERNEL_HANDLE. Otherwise, the driver creates a User Mode handle that is valid only in the application’s context. The file is then written to using both NtWriteFile and ZwWriteFile from various parts of the driver. Each call has a full explanation of which NTSTATUS values we expect to be returned, and why, for both the User and Kernel handle cases. The driver portion of the sample is a legacy driver (non-WDM compliant) and must be installed with a utility such as OSR’s Driver Loader.
Also included in the samples download is a WinDBG extension DLL that locates the system service table and displays the system services located within it. To use it, simply put osrexts.dll into WinDBG’s extension DLL directory and execute !osrexts.sst in the command window.
In Summary
I hope that this article has finally put to rest the most common problems that people experience with the native API and cleared up the NtXxx versus ZwXxx question once and for all.
Now, where’s that DVD…