Windows CE 6 Kernel Architecture

Written by Henrik Viklund

Excerpted from: http://www.addlogic.se/articles/articles/windows-ce-6-kernel-architecture.html

With CE 6's new kernel layout, the file system, the bulk of the drivers and GWES have been moved into the kernel. The kernel has also been restructured to allow better encapsulation from the OAL, and a number of actions have been taken to make the kernel more secure and robust. According to MS, kernel performance compared to CE 5 can be summarized as follows:

Size has increased by less than 5%
Some performance gain in process switching is expected
Inter-process calls will have some overhead attached to them
Thread switching, memory allocations and system calls will have about the same performance

Moving PSLs into the kernel

In CE 5, part of the core functionality is implemented as separate processes, so-called PSLs or Protected Server Libraries. The file system, the graphical window manager (GWES) and the device driver manager all run as separate processes in user space and register their APIs with the kernel. In CE 6, these servers have been integrated into the kernel and are instead loaded as DLLs by "oal.exe":

[Figure: CE 6 architecture]

Before continuing it is important to sort out a few definitions. The concept of "kernel mode" as known from CE5 does not really exist in CE6. In Windows CE 5.0 and earlier, "kernel mode" is an access level attached to a thread. If a thread is "in kernel mode" it can access kernel address space (0x80000000 and above). In CE5 you’d call SetKMode to put your thread into (or out of) kernel mode at will.

In Windows CE 6.0 the implementation is pretty much the same, except that the SetKMode API is no longer supported. What this means is that you can no longer choose to put a thread into or out of kernel mode yourself. In CE 6.0, the basic rule is: your thread is in kernel mode while executing code inside the kernel process, as when calling most system APIs, and not in kernel mode when executing code inside a user process.

The term "kernel mode" is often used loosely, as when talking about "kernel mode drivers", "kernel mode servers" and "kernel mode addresses". Whenever people use these phrases they are referring to code and addresses that are only accessible to the kernel process (address 0x80000000 and above).
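The rule above can be written down as a tiny Python model. This is only a sketch of the access rule as described in this article, not anything taken from the CE sources: the function and constant names are made up for the example, and only the 0x80000000 boundary comes from the text.

```python
KERNEL_SPACE_BASE = 0x80000000  # boundary quoted in the article


def can_access(addr: int, executing_in_kernel_process: bool) -> bool:
    """Toy model: may a thread touch this 32-bit virtual address?

    Hypothetical helper for illustration only. Addresses at or above
    KERNEL_SPACE_BASE are reachable only while the thread is executing
    code inside the kernel process.
    """
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    if addr >= KERNEL_SPACE_BASE:
        return executing_in_kernel_process
    return True
```

Note that the thread has no way to flip the second argument itself, which is the point of dropping SetKMode.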

Now, converting the PSLs to kernel mode servers will greatly reduce the overhead of system calls between these components. By using a special kernel version of "coredll.dll", called "kcoredll.dll", calls between the components don't have to go through the "thunk and trap" code that the user space version had to. It will also reduce the overhead of API calls to these servers, since many of the API calls don't have to switch process as many times as in CE 5, and it opens the door to increased code sharing between the kernel servers.

The downside is that moving these components into the magic land of kernel mode opens the door to potential security and stability problems. As drivers will now run in kernel mode and thus have access privileges to all memory, a seemingly harmless driver may become a powerful tool in the hands of a hacker. So, in CE 6 it is more important than ever to know how to design well-behaved, secure drivers.

To give some protection against the security vulnerabilities and system instability that untrusted or poorly written drivers may cause, an option to load a driver in user mode instead of kernel mode has been implemented. This of course adds system call overhead to that driver, since the kernel needs to switch processes rather than just make a direct call into the driver.

Security and robustness

It is becoming increasingly popular for corporations to incorporate mobile or distributed solutions as part of their day-to-day business. With more and more sensitive information being hosted on, or accessed through, a mobile device, the security of the device has become just as important as the security of a desktop system. So, making the new kernel as secure and robust as possible has been a top priority for Microsoft. Working with input from Microsoft's "Secure Windows Team", a number of areas have been addressed to improve the general robustness and to harden the kernel against potential security attacks:

The parameter validation for system calls has been improved.
Per-process page and handle tables - By not sharing handles across all processes, the risk of one process manipulating another process's handles is minimized.
Secure stacks - To guard against stack manipulation, system calls now have their own kernel space stacks.
Robust Heaps - The control structures now reside in a separate space from the heap data, rather than being interleaved with the heap data.
Safe Remote Heaps for OS Servers - The new kernel allows OS servers to allocate heaps that can be made read-only for client processes, which makes them harder to tamper with and also increases the overall robustness of the OS.
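To see why the "Robust Heaps" item matters, here is a toy Python sketch of a one-byte buffer overrun in two layouts. It is not how the CE heap is actually laid out; the block sizes, the one-byte "size header" and the function names are all assumptions made for the illustration.

```python
def overflow_interleaved() -> int:
    """Control data sits right next to payload: an overrun corrupts it."""
    # Layout: [size_a][data_a x4][size_b][data_b x4]
    heap = bytearray([4, 0, 0, 0, 0, 4, 0, 0, 0, 0])
    payload = b"AAAAA"                    # 5 bytes written into a 4-byte block
    heap[1:1 + len(payload)] = payload    # overruns into size_b at index 5
    return heap[5]                        # size header of block b, now garbage


def overflow_separated() -> int:
    """Control data lives elsewhere: the same overrun cannot reach it."""
    sizes = {"a": 4, "b": 4}              # bookkeeping kept apart from payload
    data = bytearray(8)                   # block a at 0..3, block b at 4..7
    payload = b"AAAAA"
    data[0:len(payload)] = payload        # still clobbers b's first data byte
    return sizes["b"]                     # but the bookkeeping is intact
```

In the separated layout the neighbouring block's data is still damaged, but the allocator's own bookkeeping survives, so the heap as a whole stays consistent.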

A lot of work has been done to enhance security, but in one area, the trust model, CE 6 almost feels like a step backwards rather than an improvement. The trust model has been reduced to a two-level (1-tier) model: either a module is trusted, or it is not trusted (and not allowed to run at all). The reason for this seemingly reduced functionality is that Microsoft is moving away from the old trust model towards a privilege-based ACL model, and the current implementation in CE 6 can be seen as an intermediate step towards the full implementation.

Per-process page tables

This is really a consequence of the new memory architecture. Since each process has its own page table, pointers are unique to each process, and you cannot access memory in another process without some cooperation from the other process.
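A small Python model makes the consequence concrete: the same virtual address resolves to different physical memory depending on whose page table is consulted. The class name, page size and frame numbers below are invented for the example and say nothing about the real CE 6 page table format.

```python
PAGE_SIZE = 4096  # assumed page size for the illustration


class PageFault(Exception):
    """Raised when an address is not mapped in the given address space."""


class AddressSpace:
    """Toy per-process page table: virtual page number -> physical frame."""

    def __init__(self, name: str):
        self.name = name
        self.page_table = {}

    def map_page(self, vaddr: int, frame: int) -> None:
        self.page_table[vaddr // PAGE_SIZE] = frame

    def translate(self, vaddr: int) -> int:
        frame = self.page_table.get(vaddr // PAGE_SIZE)
        if frame is None:
            raise PageFault(f"{self.name}: 0x{vaddr:08X} is not mapped")
        return frame * PAGE_SIZE + vaddr % PAGE_SIZE


proc_a = AddressSpace("a")
proc_b = AddressSpace("b")
proc_a.map_page(0x00010000, frame=7)
proc_b.map_page(0x00010000, frame=9)

# Same pointer value, different physical memory in each process.
phys_a = proc_a.translate(0x00010004)
phys_b = proc_b.translate(0x00010004)
```

A pointer handed from one process to another is therefore just a number; without a mapping set up on the receiving side it either faults or points at unrelated memory.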

Per-process handles

Each process has its own handle table in CE 6. This improves security, since it is no longer possible for one process to tamper directly with another process's handles, as it was in CE 5 and before. It also improves robustness in much the same way: in earlier versions a "sloppy application" might actually release handles belonging to other processes by mistake. Say, for example, that a process contains a chunk of code that releases a handle even if it has already been released. This might not seem like a big deal at first glance, but since all handles are shared among processes in CE 5, the handle value, once released by the "sloppy" process, may be allocated again by another process. When the sloppy process releases the handle a second time, it is actually messing up the other process. By giving each process its own handle table, this problem is eliminated. In addition, handles are now reference counted and their usage is tracked, so that an application or driver can check that a handle is no longer in use by anyone before closing it.
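The double-release scenario can be replayed with a toy handle table in Python. This is a sketch under my own assumptions (handle values are small integers and freed values are reused immediately); the class and function names are invented and do not correspond to any CE API.

```python
class HandleTable:
    """Toy handle table that reuses freed handle values."""

    def __init__(self):
        self._objects = {}   # handle value -> object description
        self._free = []      # handle values available for reuse
        self._next = 1

    def open(self, obj: str) -> int:
        if self._free:
            handle = self._free.pop()
        else:
            handle = self._next
            self._next += 1
        self._objects[handle] = obj
        return handle

    def close(self, handle: int) -> bool:
        if handle not in self._objects:
            return False             # stale handle: nothing to close
        del self._objects[handle]
        self._free.append(handle)
        return True

    def is_open(self, handle: int) -> bool:
        return handle in self._objects


def shared_table_scenario() -> bool:
    """CE 5 style: one table for all processes. Returns victim's handle state."""
    table = HandleTable()
    h = table.open("sloppy: file")
    table.close(h)                          # first, legitimate close
    victim = table.open("victim: event")    # reuses the same handle value
    table.close(h)                          # sloppy second close hits the victim
    return table.is_open(victim)


def per_process_scenario() -> bool:
    """CE 6 style: one table per process. Returns victim's handle state."""
    sloppy, victim_proc = HandleTable(), HandleTable()
    h = sloppy.open("sloppy: file")
    sloppy.close(h)
    victim = victim_proc.open("victim: event")
    sloppy.close(h)                         # fails harmlessly in its own table
    return victim_proc.is_open(victim)
```

With the shared table the victim's handle ends up closed behind its back; with per-process tables the stray close never leaves the sloppy process.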

Introducing per-process handles also complicates things a bit. If you need to pass a handle between processes, you now need to create a process-specific duplicate of the handle before it is valid to use in the receiving process's context.
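Continuing in the same toy style, duplication can be pictured as adding a new entry for the same underlying object to the receiving process's table. Everything below (the classes, the helper name duplicate_handle, the reference counter) is a hypothetical model for illustration, not the CE 6 API.

```python
class KernelObject:
    """Toy kernel object with a count of handles that refer to it."""

    def __init__(self, name: str):
        self.name = name
        self.refcount = 0


class Process:
    """Toy process with its own private handle table."""

    def __init__(self, name: str):
        self.name = name
        self.handles = {}    # handle value -> KernelObject
        self._next = 1

    def add_handle(self, obj: KernelObject) -> int:
        handle = self._next
        self._next += 1
        self.handles[handle] = obj
        obj.refcount += 1
        return handle

    def close(self, handle: int) -> None:
        obj = self.handles.pop(handle)
        obj.refcount -= 1


def duplicate_handle(source: Process, handle: int, target: Process) -> int:
    """Give `target` its own handle to the object behind `handle` in `source`."""
    return target.add_handle(source.handles[handle])
```

The value returned for the target process will generally differ from the source's handle value, which is exactly why the raw number cannot simply be passed across.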

Restricted physical memory mapping

In CE 5 a trusted user process could map physical memory. In CE 6, VirtualCopy and its related APIs have been restricted to kernel components. In other words, accessing physical memory from a user process must now be done by using a kernel mode driver as a proxy. However, to allow user mode drivers to map physical memory, the udevice.exe application architecture allows for mapping kernel space memory with some restrictions.

Loader

The loader is responsible for loading executables and DLLs. The loader now uses an enhanced security model that uses cryptographic signatures to validate files before they are loaded and executed. With this new secure loader, Microsoft has laid the foundation for a future code-based security architecture, i.e. security is enforced based on what code is running, rather than on who is running it (user-based security).

Monotonic clock

The kernel now implements a monotonic clock: a clock that only ticks forward and is independent of the user clock. This provides a secure and reliable way of calculating elapsed time, since changing the user clock cannot affect the result.
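The same idea exists on most platforms. As an analogy only (this is Python's standard library, not the CE 6 API), here is a sketch of measuring elapsed time with a monotonic clock; the helper name timed is made up for the example.

```python
import time


def timed(func, *args):
    """Run func(*args) and return (result, elapsed_seconds).

    time.monotonic() never goes backwards, so the elapsed time stays
    valid even if someone changes the wall clock in the meantime.
    time.time(), by contrast, follows the adjustable wall clock.
    """
    start = time.monotonic()
    result = func(*args)
    elapsed = time.monotonic() - start
    return result, elapsed
```

A call such as timed(sum, [1, 2, 3]) returns the result together with a duration that is never negative, which is the property you want for timeouts and licence or session timers.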

Huge file support

Another feature that is perhaps not directly connected to the kernel layout, but nevertheless worth mentioning, is that with CE 6 it is now possible to map "huge" files. The inability to manage really big memory mapped files efficiently has traditionally made it a real hassle to develop applications that rely on huge data stores (such as geodetic devices with their digital maps, or set top boxes that store and replay video streams). With CE 6, this limitation is finally removed.
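The usual way to work with a file that is larger than you want to map in one go is to map a window at an offset. As a platform-neutral illustration of that technique (Python's mmap module, not the CE 6 file mapping API), here is a sketch; the helper names and the file size are assumptions for the example.

```python
import mmap
import os
import tempfile


def read_window(path: str, offset: int, length: int) -> bytes:
    """Map only the part of the file around [offset, offset + length)."""
    gran = mmap.ALLOCATIONGRANULARITY
    base = (offset // gran) * gran       # mapping offsets must be aligned
    delta = offset - base                # where our data starts in the view
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), delta + length,
                       access=mmap.ACCESS_READ, offset=base) as view:
            return view[delta:delta + length]


def make_test_file(size: int, marker: bytes, marker_offset: int) -> str:
    """Create a temporary file of `size` bytes with `marker` written inside."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.truncate(size)                 # extend the file with zero bytes
        f.seek(marker_offset)
        f.write(marker)
    return path
```

Only the requested window is mapped, so the size of the file is not limited by the address space you can spare for a single view.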

CE 6 PQOAL

As I mentioned earlier, one of the design goals with the new kernel was to increase its robustness. As you probably know, the kernel relies on various hardware-specific services to operate, such as the memory cache and hardware interrupts. CE 5 introduced the "Production Quality OEM Adaptation Layer" (PQOAL), a common framework for providing these functions, in an effort to formalize how the kernel interfaces with them. The OAL is linked with the generic kernel code to form the full kernel module. While Microsoft encouraged developers to adhere to PQOAL, it was not a requirement. In CE 5 PQOAL the kernel is basically monolithic, so to make the different levels of debug and profiling support available, the kernel comes in three flavours:

"kern.exe": OAL + Kernel
"kernkitl.exe": OAL + Kernel + KITL
"kernkitlprof.exe": OAL + Kernel + KITL + Profiler

In CE6 PQOAL the kernel has instead been split into three main components:

"oal.exe"
"kernel.dll"
"kitl.dll"

[Figure: CE 6 PQOAL]

This split effectively separates the OAL specifics and the development/debug specifics from the rest of the kernel. Also, with CE 6 the kernel profiling feature no longer needs a specially instrumented version of the kernel; you can profile with the regular kernel. Using the PQOAL in CE 6 is still just a recommendation by Microsoft, but I think most device manufacturers will find it a good idea to take the step and adopt the PQOAL when porting their old designs to CE 6.

By splitting the OAL from the generic parts, two interesting things happen. First, the foundation is laid for a well-defined interface between the OAL and the kernel module. It is no longer possible to just "extern" some kernel-specific function in the OAL, as in the old monolithic architecture, where OEMs sometimes used undocumented kernel functions to "backdoor" their way around some OAL "quirks", something that may affect the stability and compatibility of the kernel. In CE 6, the kernel functions are exported to "oal.exe" as a function table (NKGLOBAL) as part of an initial handshake procedure. In the same way, the OAL functions needed to support the core kernel are exported to "kernel.dll" in a similar function table (OEMGLOBAL).
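The shape of that handshake can be sketched in a few lines of Python. Only the two table names, NKGLOBAL and OEMGLOBAL, come from the text above; the fields, the entry method and the boot flow are hypothetical and are not the real structure members or call sequence.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class NKGlobal:
    """Functions the kernel exposes to the OAL (field is hypothetical)."""
    debug_print: Callable[[str], None]


@dataclass(frozen=True)
class OEMGlobal:
    """Functions the OAL exposes to the kernel (field is hypothetical)."""
    init_hardware: Callable[[], str]


class Kernel:
    """Stands in for the generic kernel module."""

    def __init__(self):
        self.log = []
        self.oem = None

    def entry(self, oem: OEMGlobal) -> NKGlobal:
        # Handshake: keep the OAL's table, hand back the kernel's table.
        self.oem = oem
        return NKGlobal(debug_print=self.log.append)

    def boot(self) -> str:
        # The kernel reaches OAL code only through the table it was given.
        return self.oem.init_hardware()


class Oal:
    """Stands in for the OEM-specific layer."""

    def __init__(self):
        self.nk = None

    def start(self, kernel: Kernel) -> None:
        self.nk = kernel.entry(OEMGlobal(init_hardware=self._init_hardware))

    def _init_hardware(self) -> str:
        # The OAL reaches kernel code only through the table it got back.
        self.nk.debug_print("OAL: hardware initialized")
        return "ok"
```

Because each side only sees the other's table, there is nothing left to "extern": a function that is not in the table simply cannot be called.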

The second benefit of splitting the kernel into separate modules is that keeping the generic kernel parts in one module paves the way for a slightly better kernel update path. Updates and fixes to the core kernel can now be distributed directly to OEMs as pre-built libs, thoroughly tested by MS.

Separating KITL from the kernel also has a number of advantages, such as improved debug zone support for the OAL and KITL. However, what is probably the most exciting advantage, being able to load KITL support dynamically, is not yet implemented. The basic structure is there, but "on-demand KITL" will probably not be available in CE 6 when it retails. Hopefully it will be wedged into a future update of the OS. Fingers crossed!

Posted on 2007-12-26 00:01, WindowsCE