DirectX 10 Preview Translation
The Direct3D 10 December 2005 Technology Preview
The Direct3D 10 Technology Preview showcases the newest set of graphics APIs for games and other high-performance multimedia applications on next-generation graphics hardware. This technology preview provides reference material, conceptual content, developer libraries, tutorials and samples that demonstrate how to use Direct3D 10. Additional content will be provided in upcoming SDK releases.
Samples and applications built with the Direct3D 10 December 2005 Technology Preview require the Windows Vista December 2005 CTP to run. The Windows Vista December 2005 CTP is available to MSDN subscribers.
This documentation set is intended for developers using the C/C++ programming language.
D3D10技术预览揭示了最新的图形API集,这些API被用于开发基于下一代图形硬件的游戏或者其他高性能多媒体程序。本技术预览提供了参考材料、概念定义、开发库、入门教程和例子程序,教你如何使用D3D10。其他内容会在即将发布的SDK中提供。
D3D 2005年12月版中的例子和程序只能在Win Vista 2005 CTP 12月版上运行。MSDN用户可以获得Win Vista 2005 CTP 12月版。
这个文档是面向使用C/C++的开发人员。
Legal Information
Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
除非特别申明,本文档中例举的公司、组织、产品、域名、邮箱地址、标志、人物、地点和事件都是虚构的,没有任何真实的公司、组织、产品、域名、邮箱地址、标志、人物、地点和事件与此相关。遵守版权法是使用者的责任。没有版权许可,无论为什么目的,本文档的任何部分都不能在任何检索系统中被复制,存储或者引用,或者使用任何手段把本文档转换为任何形式(如电子形式的,机械形式的,胶片,录影带等等),除非拥有微软公司的书面允许。
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.
微软公司拥有和本文档主题相关的各种专利,专利程序,商标,版权或其他知识产权。除非明确拥有微软提供的书面许可证书,本文档并不提供使用这些专利,商标,版权和其他知识产权的权利。
© 1995-2005 Microsoft Corporation. All rights reserved.
1995-2005 微软公司。保留所有权利。(现在是2006年^_^)
Microsoft, MS-DOS, Windows, Windows NT, Direct3D, DirectAnimation, DirectDraw, DirectInput, DirectMusic, DirectPlay, DirectShow, DirectSound, DirectX, Visual C++, Visual Studio, Win32, Xbox, Xbox 360 and XNA are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
Microsoft, MS-DOS, Windows, Windows NT, Direct3D, DirectAnimation, DirectDraw, DirectInput, DirectMusic, DirectPlay, DirectShow, DirectSound, DirectX, Visual C++, Visual Studio, Win32, Xbox, Xbox 360 and XNA是注册商标或者是微软公司在美国或者其他公司的商标。
The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
本文档提及的实际的公司或者产品是它们各自拥有者的商标。
Direct3D 10 Graphics
Discover the Direct3D 10 graphics features in one of these sections:
你可以从以下章节中查看D3D10的细节
Programming Guide - This section contains architecture descriptions, functional block diagrams, descriptions of the building blocks in the pipeline, code snippets, and sample applications.
编程向导-这一节包括D3D架构描述,功能图,流水线功能模块的描述,代码片断和例子程序。
Reference - This section contains the reference pages for the Direct3D 10, D3DX10, and DXGI APIs. This includes the syntax for the API methods, functions, instructions, and data structures. It includes an explanation of how each API method works and often includes code snippets.
参考文档-这一节包括D3D10,D3DX10和DXGI接口的参考文档。包括这些API函数的语法,功能,指令和数据结构。它还包括对这些API的解释,以及一些代码片断。
Programming Guide
The programming guide contains information about how to use the Direct3D 10 programmable pipeline to create real-time 3D graphics for games.
编程向导告诉你如何使用D3D10编程流水线来在游戏中创建实时3D图形。
Reference
The Direct3D 10 API is described in the following sections:
D3D10 API在如下章节中描述:
Direct3D reference
D3DX reference
DXGI reference
API Features
The Direct3D 10 graphics pipeline represents a fundamental architecture change, rebuilt from the ground up in hardware and software to power the next generation of games and 3D multimedia applications. It is built upon the Windows Vista Display Driver Model (WDDM) infrastructure, enabling new performance and behavioral enhancements and guaranteeing full virtualization of GPU memory.
D3D10图形流水线给出了绘制架构的根本性变化,为了增强了下一代游戏和3D多媒体程序的绘制能力,它的软件和硬件都由底层重新构建。它基于Windows Vista显示驱动模型(WDDM)构建,增强了功能和性能,保证了GPU显存的完全虚拟化(应该指从编程角度可以无视内存和显存差别)。
Developers familiar with Direct3D 9 will discover a series of functional enhancements and performance improvements in Direct3D 10, including:
熟悉D3D9的开发人员能够发现D3D10中一系列的功能增强和性能改进,包括:
- The ability to process entire primitives (with adjacency), amplify and de-amplify data in the new geometry shader stage.
处理整个图元(包括邻接信息)的能力,在新的GS阶段改变物体几何
- The ability to write vertex data generated in the pipeline out to memory using the stream output (SO) stage.
- New objects and paradigms provided to minimize CPU overhead spent on validation and processing in the runtime and driver.
提供新的对象和输入模板来减小运行时使驱动生效处理数据的CPU消耗。
- Organization of pipeline state into 5 immutable state objects, enabling fast configuration of the pipeline.
把绘制流水线状态组织成5个不变的绘制状态对象,使流水线的描述更为简洁。
- Organization of shader constant variables into constant buffers, minimizing bandwidth overhead for supplying shader constant data.
把Shader常数变量组织到常数缓存中,减少了给Shader提供常数的带宽。
- The ability to perform per-primitive material swapping and setup using a geometry shader.
使用GS能够执行逐图元的材质变换和构建。
- New resource types (including shader-indexable arrays of textures) and resource formats.
新的资源类型(包括Shader可读取的带索引的纹理数组)和资源格式。
- Increased generalization of resources in memory and ubiquity of resource access - resource views enable interpretation of resources in memory as different types or representations.
增加内存资源的通用性,资源使用统一格式(资源视图)来读取在内存中不同类型和格式的资源。
- A full set of required functionality: legacy hardware capability bits (caps) have been removed in favor of a rich set of guaranteed functionality. To enable this and other design improvements, the Direct3D 10 API only targets Direct3D 10-class hardware and later.
一系列完整的功能需求:legacy hardware capability bits (caps)被移除并引入了更为丰富的性能验证功能;为了使以上设计生效,D3D10 API只对将来D3D10类的硬件生效。
- Layered Runtime - The Direct3D 10 API is constructed with layers, starting with the basic functionality at the core and building optional and developer-assist functionality (debug, etc.) in outer layers.
层次运行。D3D10 API由几个层次构成,从核心层次使用D3D10最基本的功能,在其他层次构建可选择的和辅助开发的功能(比如调试D3D10等)。
- Full HLSL integration - All Direct3D 10 shaders are written in HLSL and implemented with the shader common core.
完全的HLSL集成。所有D3D10 Shader使用HLSL编写,使用通用Shader核心实现。
- An increase in the number of render targets, textures, and samplers. There is also no shader length limit.
增加了绘制对象(窗口),纹理和纹理采样的数量,Shader没有指令数限制。
- Full support for:
- Integer and bitwise shader operations
- Readback of depth/stencil and multisampled resources in the shader
- Multisample Alpha-to-Coverage
完全支持
Shader整数和位操作;在Shader中读回Depth和Stencil值和多采样资源;多采样Alpha覆盖。
There are additional behavioral differences that Direct3D 9 developers should also be aware of. For a more complete list, refer to Direct3D 9 to Direct3D 10 Considerations.
D3D9的开发人员应该清楚其他的接口变化。改动完全列表请参照Direct3D 9 to Direct3D 10 Considerations。
State Objects
In Direct3D 10, device state is grouped into five immutable State Objects:
在D3D10中,设备状态被分为5种不变的状态对象:
- Input Layout Object - This group of state (see D3D10_INPUT_LAYOUT_DESC) affects the input assembler state. This includes state such as the number of elements in the input buffer and the signature of the input data. The input assembler is a new stage in the pipeline whose job is to stream primitives from memory into the pipeline. To find out more about the input assembler stage and the input layout object, see Create the Input Layout.
输入层对象(IO):这组状态(请看D3D10_INPUT_LAYOUT_DESC)作用于输入集成器的状态。这些状态包括输入缓存的元素个数,输入数据的描述。输入集成器使在绘制流水线中的新阶段,它的功能使把显存中的图元流数据读入到绘制流水线中。关于输入集成器和输入层次对象的更多功能,请看Create the Input Layout。
- Rasterizer Object - This group of state (see D3D10_RASTERIZER_DESC) affects the rasterizer stage. This object includes state such as fill or cull modes, enabling a scissor rectangle for clipping, and setting multisample parameters. This stage rasterizes primitives into pixels, performing operations like clipping and mapping primitives to the viewport. To find out more about the rasterizer stage and the rasterizer state object, see Set Rasterizer State.
光栅化对象:这组状态(参见D3D10_RASTERIZER_DESC)影响光栅化阶段。这个对象包括填充模式,剔除模式,裁减模式和多采样参数等绘制状态。这个阶段把图元光栅化成像素,执行裁减和匹配图元到视口上的操作。更多关于光栅化阶段和光栅化状态对象的信息,请看Set Rasterizer State。
- DepthStencil Object - This group of state (see D3D10_DEPTH_STENCIL_DESC) configures interactions with the depth buffer and sets up stencil testing. To find out more about the output merger stage and the depth-stencil state object, see Set Depth Stencil State.
DepthStencil对象:这组状态(看D3D10_DEPTH_STENCIL_DESC)描述了对深度缓存进行交互并设置模板测试状态的操作。关于输出合并阶段和深度模板对象的更多信息,参见Set Depth Stencil State
- Blend Object - This group of state (see D3D10_BLEND_DESC) defines how Pixel Shader outputs are blended with the current render target value in the Output Merger. See Set Blend State.
Blend对象:这组状态定义了PS输出和当前RT中的值进行混合的操作,看Set Blend State
- Sampler Object - This group of state (see D3D10_SAMPLER_DESC) describes a texture Sampler. Samplers are used by the shader stages to filter textures in memory. Set sampler state in a shader by calling VSSetSamplers, PSSetSamplers or GSSetSamplers.
采样对象:这组状态(看。。。。)描述了纹理采样操作。纹理采样在Shader阶段使用,把内存中的纹理经过滤波读入。设置纹理采样状态需要调用VSSetSampler,PSSetSampler或者GSSetSamplers。
Differences between Direct3D 9 and Direct3D 10:
In Direct3D 10, the sampler object is no longer bound to a specific texture - it just describes how to do filtering given any attached resource.
D3D9和D3D10的区别:
D3D10中采样对象并不绑定在一张特定纹理上,它只是描述了对指定资源的滤波方式。
A state object is immutable, that is, once it is created it cannot be changed without being destroyed and recreated. This facilitates create-time validation and mapping, allows state-setting to be pipelined, and makes caching of these objects in hardware possible, minimizing state-setting overhead at runtime.
状态对象是不变的,这就是说,一旦它被创建就不能被消出或者重建。这使得创建时的生效和匹配更为方便,允许绘制状态成为流水线,并且可能在硬件中暂时存储这些状态,减少运行时的状态设置消耗。
For example, you could create several sampler objects with various sampler-state combinations. Changing the pipeline sampler state is then done by calling the appropriate Set API which takes a handle to the object. No state is passed, just an object handle so changing state becomes as fast as sending an object handle to the device. By encapsulating the state in these state objects, the number of calls is significantly reduced and the time for each call is also reduced since no state is passed.
举个例子,你可以创建几个采样对象使用不同的采样状态组合。改变流水线采样状态只需要调用合适的设置函数装入对应的对象。传入的不是绘制状态,而是一个状态对象,所以改变状态的速度和对设备传入一个操作对象的速度一样快。通过把状态封装到这些对象里,调用API的数量明显减少,调用API时间也因为没有状态传入而减少。
You can create up to 4096 of each type of state objects on a device. The Direct3D 10 Effects system will automatically manage efficient creation and destruction of state objects for your application.
你能够在一个设备上为每种状态对象创建4096个实例。D3D10 Effect系统会有效管理应用程序里状态对象的创建和析构。
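The handle-passing idea described above can be modelled without the Direct3D headers. The following is a minimal sketch in plain C++, not the Direct3D 10 API: `SamplerDesc`, `SamplerState`, `MockDevice`, and all of their members are hypothetical names invented for illustration. It only shows the shape of the design: validate once at creation, keep the object immutable, and switch state by passing a single pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a sampler description (not D3D10_SAMPLER_DESC).
struct SamplerDesc {
    int filter;       // illustrative: 0 = point, 1 = linear
    int addressMode;  // illustrative: 0 = wrap, 1 = clamp
};

// Immutable after construction: the description is const and has no setters.
class SamplerState {
public:
    explicit SamplerState(const SamplerDesc& d) : desc_(d) {}
    const SamplerDesc& desc() const { return desc_; }
private:
    const SamplerDesc desc_;
};

class MockDevice {
public:
    // All validation happens once, when the object is created.
    std::unique_ptr<const SamplerState> CreateSamplerState(const SamplerDesc& d) const {
        assert(d.filter >= 0 && d.addressMode >= 0);
        return std::unique_ptr<const SamplerState>(new SamplerState(d));
    }
    // Changing state stores one pointer; no description fields cross the call.
    void SetSampler(std::size_t slot, const SamplerState* state) {
        if (slot >= slots_.size()) slots_.resize(slot + 1, nullptr);
        slots_[slot] = state;
    }
    // Returns the object bound to a slot, or nullptr if nothing is bound.
    const SamplerState* Bound(std::size_t slot) const {
        return slot < slots_.size() ? slots_[slot] : nullptr;
    }
private:
    std::vector<const SamplerState*> slots_;
};
```

An application would create its point and linear sampler objects once at load time and then call `SetSampler` with whichever pointer it needs per draw. The device slot stores a raw pointer on purpose, so the application keeps ownership of the object.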
API Layers
The Direct3D 10 runtime is constructed with layers, starting with the basic functionality at the core and building optional and developer-assist functionality in outer layers.
D3D运行库由多个层次构成,从底层核心的基本功能开始,直到在外部层次构建可选择和辅助开发的功能为止。
Core Layer
The core layer exists by default, providing a very thin mapping between the API and the device driver and minimizing overhead for high-frequency calls. As the core layer is essential for performance, it only performs critical validation.
核心层默认存在,提供很少一部分API和设备驱动的匹配,减少高频率函数调用的消耗。核心层关乎性能,因此只执行一些关键操作。
The remaining layers are optional. As a general rule, layers add functionality, but do not modify existing behavior. For example, core functions will have the same return values independent of the debug layer being instantiated, although additional debug output may be provided if the debug layer is instantiated.
剩下的层次是可选的。作为一个通用规则,层次增加新的功能,但不改变已有功能。举个例子说,核心层函数返回值不会因为调试层的加入而改变,即便调试层会提供附加的调试输出信息。
Create layers when a device is created by calling D3D10CreateDevice and supplying one or more D3D10_CREATE_DEVICE_FLAG values.
调用D3D10CreateDevice生成D3D设备时使用多个D3D10_CREATE_DEVICE_FLAG标记来创建多个层次
Debug Layer
This layer provides extensive additional parameter and consistency validation (such as validating shader linkage and pipeline binding, validating parameter consistency, and reporting error descriptions). Its debug output is also provided as a queue of report strings, accessible via the ID3D10InfoQueue interface. Any errors produced by the core runtime layer will also be highlighted with warnings by the debug layer.
这一层提供附加的参数和一致性检查(比如Shader链接检查,参数一致性检查并报告错误)。调试输出提供一个报告字符串队列,可以通过ID3DinfoQueue接口获得。任何核心层运行时错误会在调试层被显式警告。
This layer is implemented in D3D10SDKLayers.DLL, which is only available with the SDK installed.
这个层次在D3D10SDKLayer.DLL中实现,这个动态链接库在SDK安装时提供。
The debug layer performs extensive additional parameter and consistency validation, and returns error reports in a queue of report strings.
(Translator's note: this sentence repeats the paragraph above, so no separate translation is given.)
Shader-Reflection Layer
This layer enables an application to use Get methods to retrieve shader bytecode from the device.
这个层次让应用程序通过Get函数从设备上获得Shader的二进制代码。
Switch-to-Ref Layer
This layer provides the ability to transition between hardware and reference rasterizer implementations of the graphics pipeline for debugging purposes. All device state, resources and objects are maintained through this transition.
这个层次提供了为了调试而从硬件和参考光栅化实现之间的绘制流水线转换的功能。所有设备状态、资源和对象转换后都保留。
This layer is implemented in D3D10SDKLayers.DLL, which is only available if you have the SDK installed.
(Translator's note: this was already stated above for the debug layer.)
Thread-Safe Layer
This layer is designed to allow multi-threaded applications to access the device from multiple threads.
这一层允许多线程程序读取设备。
Direct3D 10 enables an application to exercise explicit control over the device synchronization primitive with device functions that can be invoked at any time over the lifetime of the device, including enabling and disabling the use of the critical section (temporarily enabling/disabling multithread protection), and a means to take and release the critical section lock and thereby hold the lock over multiple Direct3D 10 API entrypoints.
D3D10允许程序通过在设备的生命期内任何时刻都能调用的设备函数从外部同步控制设备图元,包括对临界区域(临时打开/关闭多线程保护)的生效和失效的控制,对邻接区域的开锁和关锁,并在多个D3D10函数入口点控制锁状态。
This layer is enabled by default, but if not present has no performance impact on single-thread accessed devices. Use D3D10_CREATE_DEVICE_SINGLETHREADED (in D3D10CreateDevice) to turn this layer off.
这个层次默认生效,但是如果没有的话对单线程性能也不会产生影响。使用D3D10_CREATE_DEVICE_SINGLETHREADED关闭这个层。
Differences between Direct3D 9 and Direct3D 10:
Unlike Direct3D 9, the Direct3D 10 API defaults to fully thread-safe.
D3D9和D3D10的区别:
和D3D9不一样,D3D10接口默认是线程间安全的。
Reference Counting
Direct3D 10 pipeline Set functions do not hold a reference to the DeviceChild objects. This means that each application must hold a reference to the DeviceChild object for as long as the object needs to be bound to the pipeline. When the reference count of an object drops to zero, the object will be unbound from the pipeline and destroyed. This style of reference holding is also known as weak-reference holding because each pipeline binding location holds a weak reference to the interface/object that is bound to it.
D3D10流水线的设置函数并不保存设备子对象(比如绘制状态对象)的引用。这意味着应用程序必须在该子对象绑定在绘制流水线期间保存它的引用。当引用数量为0时,对象会从流水线上卸下并被删除。这种引用保持被成为弱引用保持,因为每个流水线保持一个绑定在它上的弱引用。
For example:
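pDevice->CreateRasterizerState( ..., &pRasterizerState ); // create the rasterizer state object; the application holds one reference
pDevice->RSSetState( pRasterizerState ); // bind it; the pipeline keeps only a weak reference
pDevice->RSGetState( &pCurRasterizerState ); // query the bound object; this hands back a reference of its own
// pCurRasterizerState will be equal to pRasterizerState.
pCurRasterizerState->Release(); // release the reference returned by RSGetState
pRasterizerState->Release(); // release the application's original (and final) reference
// Translator's note: Release only drops one reference, so the second call is not a double destruction and is not an error.
// A reference that is never released (forgotten, or its Release call accidentally deleted) leaks the Direct3D object.
// Since app released the final ref on this object, it is unbound.
pDevice->RSGetState( &pCurRasterizerState );
// pCurRasterizerState will be equal to NULL.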
Differences between Direct3D 9 and Direct3D 10:
In Direct3D 9, pipeline Set functions hold a reference to the device's objects; in Direct3D 10, pipeline Set functions do not hold a reference to the DeviceChild objects.
D3D9和D3D10的区别:
D3D9会保存一个引用,而D3D10不会。
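The weak-reference behavior can be reproduced with a small stand-alone model. This is a sketch under stated assumptions, not Direct3D code: `MockDevice`, `RasterizerState`, and `Unbind` are hypothetical, and the model assumes (as the example above implies) that the Get call returns an extra reference while the Set call does not take one.

```cpp
#include <cassert>

class MockDevice;

// Hypothetical COM-style reference-counted object; not a real Direct3D interface.
class RasterizerState {
public:
    explicit RasterizerState(MockDevice* dev) : device_(dev), refs_(1) {}
    unsigned long AddRef() { return ++refs_; }
    unsigned long Release();  // defined after MockDevice
private:
    MockDevice* device_;
    unsigned long refs_;
};

class MockDevice {
public:
    // The caller receives the object with a reference count of 1.
    RasterizerState* CreateRasterizerState() { return new RasterizerState(this); }
    // Set stores the pointer but does not AddRef: a weak reference.
    void RSSetState(RasterizerState* s) { bound_ = s; }
    // Get hands out a new reference that the caller must Release.
    void RSGetState(RasterizerState** out) {
        assert(out != nullptr);
        if (bound_) bound_->AddRef();
        *out = bound_;
    }
    // Called by the object itself when its last reference goes away.
    void Unbind(RasterizerState* s) { if (bound_ == s) bound_ = nullptr; }
private:
    RasterizerState* bound_ = nullptr;
};

inline unsigned long RasterizerState::Release() {
    const unsigned long remaining = --refs_;
    if (remaining == 0) {
        device_->Unbind(this);  // final release: unbind from the pipeline
        delete this;            // then destroy the object
    }
    return remaining;
}
```

Running the same sequence as the example (create, set, get, release twice, get again) leaves the binding slot empty, which is why an application that wants an object to stay bound has to keep its own reference alive.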
Pipeline Stages
The Direct3D 10 programmable pipeline is designed for generating graphics for realtime gaming applications. The conceptual diagram below illustrates the data flow from input to output through each of the programmable stages.
D3D19编程流水线为实时游戏程序生成图形设计。下面的设计图描述了从输入到输出的每步编程阶段的数据流程。
Figure 1. Direct3D 10 Pipeline Stages
All of the stages are configurable via the Direct3D 10 API. Stages featuring common shader cores (the rounded rectangular blocks) are programmable using the HLSL programming language. As you will see, this makes the pipeline extremely flexible and adaptable. The purpose of each of the stages is listed below.
上述所有阶段通过D3D10 API描述。通用Shader核心(椭圆框部分)使用HLSL编程实现。这样你会发现绘制流水线非常灵活机动。每个阶段的设计目标在下面列出。
- Input Assembler Stage - The input assembler stage is responsible for supplying data (triangles, lines and points) to the pipeline.
输入集成阶段- 输入集成界软负责为绘制流水线提供数据(三角形,线,点等)。
- Vertex Shader Stage - The vertex shader stage processes vertices, typically performing operations such as transformations, skinning, and lighting. A vertex shader always takes a single input vertex and produces a single output vertex.
VS阶段-VS阶段处理顶点,做坐标变换,表面细节和逐顶点光照计算。VS输入为一个顶点,输出也是一个顶点。
- Geometry Shader Stage - The geometry shader processes entire primitives. Its input is a full primitive (which is three vertices for a triangle, two vertices for a line, or a single vertex for a point). In addition, each primitive can also include the vertex data for any edge-adjacent primitives. This could include at most an additional three vertices for a triangle or an additional two vertices for a line. The Geometry Shader also supports limited geometry amplification and de-amplification. Given an input primitive, the Geometry Shader can discard the primitive, or emit one or more new primitives.
GS阶段-GS处理完整的图元。它的输入是图元(包括三角形三个顶点,或者直线的两个顶点,或者点绘制的一个顶点)。同时,每个图元还可以包含它邻接边的顶点信息。对于三角形可以附加三个顶点,而对于直线,可以附加两个顶点。GS同样提供简单的几何膨胀和收缩操作。给定一个输入图元,GS能够取消绘制这个图元,或者产生更多的新图元。
- Stream Output Stage - The stream output stage is designed for streaming primitive data from the pipeline to memory on its way to the rasterizer. Data can be streamed out and/or passed into the rasterizer. Data streamed out to memory can be recirculated back into the pipeline as input data or read-back from the CPU.
流输出阶段:流输出阶段为了让图元数据在光栅化过程中可以从流水线输出到显存中而设计。数据能北输出或者输入光栅化。输出到显存的流数据可以作为流水线输入数据循环使用,或者读回到CPU。
- Rasterizer Stage - The rasterizer is responsible for clipping primitives, preparing primitives for the pixel shader and determining how to invoke pixel shaders.
光栅化阶段-光栅化对裁减后的图元起作用,它把图元输出到PS,并确定使用什么PS绘制。
- Pixel Shader Stage - The pixel shader stage receives interpolated data for a primitive and generates per-pixel data such as color.
PS阶段:PS阶段接受图元光栅化时的插值数据并生成逐像素的数据(比如颜色)。
- Output Merger Stage - The output merger stage is responsible for combining various types of output data (pixel shader values, depth and stencil information) with the contents of the render target and depth/stencil buffers to generate the final pipeline result.
输出合并阶段-输出合并阶段把不同的数据数据(PS颜色,深度和模板缓存信息)和绘制对象及Depth/Stencil缓存中的数据合并,并输出最终结果。
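To make the geometry shader's "one primitive in, zero or more primitives out" contract concrete, here is a CPU-side model in plain C++. It is only an illustration: real geometry shaders are written in HLSL, and `Vertex`, `Triangle`, `GeometryStage`, and the `maxOutput` limit are hypothetical names chosen for this sketch.

```cpp
#include <cassert>
#include <vector>

struct Vertex { float x, y, z; };
struct Triangle { Vertex v[3]; };

// Signed area of the triangle projected onto the xy-plane.
inline float SignedAreaXY(const Triangle& t) {
    return 0.5f * ((t.v[1].x - t.v[0].x) * (t.v[2].y - t.v[0].y) -
                   (t.v[2].x - t.v[0].x) * (t.v[1].y - t.v[0].y));
}

// CPU model of a geometry-shader-like stage: one primitive in, zero or more out.
// maxOutput mirrors the idea that amplification is limited.
inline std::vector<Triangle> GeometryStage(const Triangle& in, unsigned maxOutput) {
    assert(maxOutput >= 1);
    std::vector<Triangle> out;
    if (SignedAreaXY(in) == 0.0f) return out;  // de-amplify: discard a degenerate primitive
    out.push_back(in);                         // pass the original through
    if (out.size() < maxOutput) {              // amplify: emit a copy shifted along +x
        Triangle shifted = in;
        for (Vertex& v : shifted.v) v.x += 1.0f;
        out.push_back(shifted);
    }
    return out;
}
```

A vertex-shader model, by contrast, would always map exactly one vertex to one vertex, which is the distinction the stage list above draws.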
Input Assembler Stage
At the front of the pipeline, there is a new stage that streams primitives from memory into the pipeline. The input assembler stage (IA) reads vertex data from user filled buffers and assembles the data into vertices and primitives (points, lines or triangles).
在流水线开始之前,有一个把图元数据流从内存输入到流水线的新阶段。IA阶段从用户填充的顶点缓存中读取顶点数据,并把数据集成为顶点和图元(点,线和三角形)。
The data is streamed out to the pipeline for processing by the shader stages. The input assembler can also generate new primitive types (a line list with adjacency or a triangle list with adjacency) to feed the geometry shader adjacency data. This is done by executing a Draw API call.
输入数据在流水线中用Shader阶段操作。IA同样能够生成新图元(比如带邻接信息的线链表或者三角形链表)让GS能够读取邻接信息。这些操作通过执行一些绘制API完成。
While the IA is generating primitives, it may attach some useful information that is consumed by the shader cores. These system-generated values include information that identifies each primitive, different instances of a primitive, and different vertices. This data is then provided to the shader cores to minimize processing time by processing only those primitives, instances, or vertices that have not already been processed.
当IA生成图元时,它可能会附加一些Shader核心使用的信息在顶点上。这些系统生成的信息包括图元的编号,图元的实例编号和不同的顶点编号。这些数据被Shader核心使用,使它们单独处理未处理的图元,实例和顶点时减少处理时间。
IA Stage API
These are the steps required to initialize and execute the input assembler stage:
这些是初始化和执行输入集成阶段的步骤:
- Create Input Buffers - Create and initialize the input buffers that supply the input data.
- Create the Input Layout - Create an input-layout object that does two things: define how the input data is organized as it is streamed into the IA stage, and compare the input data to the vertex shader inputs to make sure they are compatible.
创建输入层次:创建输入层次对象做两件事,定义输入数据作为数据流输入IA阶段的组织格式,并比较输入数据和VS是否匹配。
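- Binding Objects To The IA Stage - Bind the created objects (the input buffers and the input-layout object) to the IA stage.
- Specify The Primitive Topology - Tell the IA stage how to assemble the input data into primitives.
- Example of the Input Assembler Stage Generating System Values - Shows how system values are attached to a triangle strip.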
Create Input Buffers
Step 1: Declare the vertex data
// Create vertex buffer
SimpleVertex vertices[] =
{
D3DXVECTOR3( 0.0f, 0.5f, 0.5f ),
D3DXVECTOR3( 0.5f, -0.5f, 0.5f ),
D3DXVECTOR3( -0.5f, -0.5f, 0.5f ),
};
This is a simple structure of vertices. One triangle contains 3 vertices.
Step 2: Create a vertex buffer
D3D10_BUFFER_DESC bd;
bd.Usage = D3D10_USAGE_DEFAULT;
bd.ByteWidth = sizeof( SimpleVertex ) * 3;
bd.BindFlags = D3D10_BIND_VERTEX_BUFFER;
bd.CPUAccessFlags = 0;
bd.MiscFlags = 0;
D3D10_SUBRESOURCE_UP InitData;
InitData.pSysMem = vertices;
InitData.SysMemPitch = sizeof( vertices );
InitData.SysMemSlicePitch = sizeof( vertices );
if( FAILED( g_pd3dDevice->CreateBuffer( &bd, &InitData, &g_pVertexBuffer ) ) )
return FALSE;
A vertex buffer is organized as an array of elements; each element contains the data associated with one vertex. There are two parts to creating the buffer resource.
定点缓存被组织为一系列元素的数组,每个元素包含顶点所需的数据。创建顶点缓存由两部分构成。
First, the buffer description is initialized, with settings that define how the application expects to use the buffer. These settings are important for speed and type checking. For example, the usage D3D10_USAGE_DEFAULT is for a resource that is not expected to be updated by the CPU very often. This determines what type of memory the runtime will create the resource in. Video RAM, for example, would be used for a resource that is constantly changing, so that the GPU is never interrupted by the need to get data; the D3D10_USAGE_DEFAULT flag ensures that the resource cannot be mapped and can only be modified with UpdateSubresource.
首先,初始化缓存的描述结构,设置应用程序如何使用缓存的定义。这些设定对于类型检查和绘制速度来说非常重要。举个例子,对于D3D_USAGE_DEFAULT是对那些不频繁被CPU更新的数据使用的,这决定了在运行时哪种类型的缓存被用来放置定点缓存。再举个例子说,视频缓存是可能经常改变的,GPU在访问它的数据时不需要通过中断。D3D_USAGE_DEFAULT标记定义这个缓存不能被(GPU)mapping,并且只能通过UpdateSubresource来更新。
Mapping in Direct3D 10 is analogous to locking in Direct3D 9. The default usage specifies the resource as one that cannot be locked; in other words, the application does not expect the resource to get locked and updated by the CPU. This means that the GPU can use the resource without fear of the resource stalling the GPU pipeline because the CPU wants access to it. For more about resource usages, see Resources.
D3D10中mapping的含义和D3D9中的Lock类似。D3D_USAGE_DEFAULT定义了不能被锁定的资源。换句话说,应用程序不能锁定资源并通过CPU来更新资源。这意味着GPU能够使用该资源而不必担心在GPU流水线中的等待CPU访问资源结束。更多的资源usage,参见Resources
In this example, the other important setting is the binding flag. This flag (D3D10_BIND_VERTEX_BUFFER) means that any resource (a buffer in this case) created using this description, can only be bound to the pipeline as a vertex buffer resource. This restricts the class of operations that can be done to this resource and once again enables the GPU to schedule its use for maximum performance.
在这个例子中,另一个重要的设置是绑定标记。这个标记表示任何使用D3D10_BIND_VERTEX_BUFFER创建的资源(这里是顶点缓存)只能被绑定到绘制流水线中的顶点缓存资源。这限制了能够对该资源的操作方式,并使GPU能够合理调度它以达到最大绘制效率。
The CPU access flag determines whether or not the CPU can access the buffer; a buffer that is declared as read only can reside anyplace where it can be read quickly.
CPU读取标记定义CPU能否读取这个缓存。一个被定义为只读的缓存可以被放置在任何可以快速读取它的地方。
Second, create the vertex buffer by calling CreateBuffer with the buffer description and a second description of the subresource. Each resource is made up of an array of subresources, and each subresource is made up of an array of elements. Each different resource has a specific hierarchy of resource and subresources and elements (according to figure 1 on the resources page). The subresource description not only points to the actual resource data, but also contains information about the size and layout of the data.
第二步,调用CreateBuffer使用缓存描述符和子资源描述符来创建顶点缓存。每种缓存由一组子资源构成,而每个子资源由一组元素构成。每个不同的资源拥有一系列堆结构的吱吱员和元素(资源页的图1)。子资源描述符不仅只想实际的资源数据,也包括数据的大小和层次。
Using this figure for a buffer, you can see the memory pitch and the memory slice for a buffer. This description gives the pipeline a clear idea of how to walk the resource and read/write to it. Since a buffer is a bag-of-bits, there is a 1D structure to its layout. As a result, the system memory pitch and system memory slice pitch are both the same: the size of the vertex data declaration. A buffer has the easiest memory pitch and memory slice pitch layout since it is a 1D layout.
使用这种描述方式来描述缓存,你能够看到内存的pitch和slice。这些描述给流水线一个清晰的概念,告诉它怎么遍历资源和读取资源。因为缓存是二进制数据位的集合,因此它的层次只有一维结构。因此,内存pitch和slice含义相同,表示顶点数据的大小。因为它的一维结构,所以它简单拥有pitch和slice pitch。
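The size arithmetic for a one-dimensional buffer is small enough to check by hand. The sketch below is plain C++ with hypothetical names (`BufferSizes`, `DescribeBuffer`); it assumes the vertex type is a single three-float position, which is what the vertex data in Step 1 suggests, and it does not call any Direct3D function.

```cpp
#include <cassert>
#include <cstddef>

// Assumed layout of the tutorial's vertex: one position made of three floats.
struct SimpleVertex { float x, y, z; };

// Hypothetical mirror of the three size-related values discussed above.
struct BufferSizes {
    std::size_t byteWidth;
    std::size_t sysMemPitch;
    std::size_t sysMemSlicePitch;
};

inline BufferSizes DescribeBuffer(std::size_t elementSize, std::size_t elementCount) {
    assert(elementSize > 0 && elementCount > 0);
    const std::size_t total = elementSize * elementCount;
    // A buffer is one-dimensional, so the row pitch and the slice pitch
    // both collapse to the total size of the data.
    return BufferSizes{ total, total, total };
}
```

For the three-vertex triangle this gives `DescribeBuffer(sizeof(SimpleVertex), 3)`, so all three values equal `3 * sizeof(SimpleVertex)`, matching the way the example sets `ByteWidth`, `SysMemPitch`, and `SysMemSlicePitch` to the same size.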
Create the Input Layout
The input layout describes how the data will get interpreted by the IA stage as it is streamed in from user memory. This layout is described by D3D10_INPUT_ELEMENT_DESC, which includes information like: the format of the data, what semantics are specified, and how to interpret instancing data. Tutorial 2 creates the input layout in 2 steps:
输入层次描述了数据如何被IA阶段从用户内存中读取作为输入数据流。这里的层次被描述为D3D10_INPUT_ELEMENT_DESC,包含这样的信息:数据格式,语法定义(用途),怎样转换为实际图形数据。练习2分两步创建输入层次。
First, declare the input layout as shown here:
首先,申明输入层次。
// Define the input layout
D3D10_INPUT_ELEMENT_DESC layout[] =
{
{ L"POSITION", 0,
DXGI_FORMAT_R32G32B32_FLOAT,
0, 0,
D3D10_INPUT_PER_VERTEX_DATA, 0 },
};
This example uses the input layout description to interpret the data stored in a single vertex buffer. The members in the description include:
这个例子使用输入层次描述符描述了存在一个单个数据缓存中的数据。这些描述包括:
- Semantic Name and semantic index - Identifies how to interpret the data. You can use any number of arbitrary semantics, the semantics from Direct3D 9, or the additional semantics required by the hardware in Direct3D 10. The new semantics required by the hardware are called system values; one such example is the position semantic: SV_POSITION. All system value semantics begin with the SV_ prefix.
语义名和语义索引:定义了如何解释这些数据。你能够使用任何语义定义,比如D3D9的语义或者D3D10硬件需要的定义。硬件需要的新的语义叫做系统值,比如位置信息SV_POSITION。所有系统值语义使用SV_开头。
- Format - This is the format of the data stored in the buffer. Direct3D 10 has many predefined format types including: 16, 32, 64 and 128 bit formats, signed and unsigned formats, typed and typeless formats, and integer and floating-point formats. Typeless formats are available when you want to allocate the proper amount of space for the data, but do not yet know what type the data will be at the time the input layout object is created.
格式:这是在缓存中存储数据的格式。D3D10由许多预定义的格式类型,包括16,32,64,128位数据,有符号和无符号数据,有类型和无类型数据,以及整数和浮点数。无类型数据当你需要给数据分配一定空间而在输入层次对象创建时不知道数据类型的时候使用。
- InputSlot and AlignedByteOffset- These two parameters define the input entry point of the stage and any offset from the beginning of the stream to the data (this means you can use a header in your data buffers if you like). Every pipeline stage uses input slots and output slots to identify the input and output ports for streaming data. Each slot is a zero-based integer, and every stage has limitations on how many slots are supported (see d3d10.h).
输入槽和对齐偏移:这两个参数定义了输入数据在这个阶段的的入口点和从数据开始到数据的偏移(这表示如果你愿意的话,你可以使用相同的数据入口)。每个流水线阶段使用输入槽和输出槽定义数据流数据的输入和输出口。每个槽是一个基于0的整数,每个阶段限定了多少数目的数据槽被支持(参见d3d10.h)。
- The input slot class - Tells the input assembler how to apply the data read from the input buffer. Data is identified as vertex data (read the data directly) and non-instanced or instanced. As the input assembler reads the vertex buffer, these options determine how much data to stream onto the next pipeline stage.
输入槽类型:表示IA如何应用从输入缓存读取的数据。数据被定义为顶点数据(直接读取数据),非实例数据和实例数据(实例数据表示可以被重用的集合模型,比如相同树可以在不同的位置画许多次)。当IA读取顶点数据时,这些设置决定了多少数据被送到下一个流水线阶段中。
- InstanceDataStepRate - If the input buffer data is instanced, the input assembler needs to know how many instances to draw before incrementing the pointer.
实例数据阶段率:如果输入缓存数据可以实例化,IA需要知道在跳到另一个数据前需要画多少次。
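The interaction between the input slot class and the instance step rate can be sketched as a single index calculation. This is a simplified model with hypothetical names (`SlotClass`, `ElementIndex`), and it assumes a step rate of at least 1 for per-instance data; it is not taken from the Direct3D headers.

```cpp
#include <cassert>

// Hypothetical stand-in for the input slot class.
enum class SlotClass { PerVertex, PerInstance };

// Returns which buffer element feeds the current vertex.
// Per-vertex data advances with every vertex; per-instance data advances
// only after `stepRate` instances have been drawn with the same element.
inline unsigned ElementIndex(SlotClass slotClass, unsigned vertexId,
                             unsigned instanceId, unsigned stepRate) {
    if (slotClass == SlotClass::PerVertex) return vertexId;
    assert(stepRate >= 1);  // assumption of this sketch
    return instanceId / stepRate;
}
```

With a step rate of 2, instances 0 and 1 read element 0 and instances 2 and 3 read element 1, so one per-instance record (a color or a world matrix, say) can be shared by pairs of instances.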
Second, use the input layout declaration to generate the input layout object by calling ID3D10Device::CreateInputLayout, as shown here:
第二,使用输入层次申明创建输入层次对象,通过调用ID3D10Device::CreateInputLayout:
ID3D10InputLayout* g_pVertexLayout = NULL;
g_pd3dDevice->CreateInputLayout( layout, 1, pShaderBytecode, &g_pVertexLayout );
To get the pointer to the shader byte code, use D3D10CompileShader to compile the shader and return an ID3D10Blob interface. Then use ID3D10Blob::GetBufferPointer() to get a pointer to the shader byte code.
为了得到Shader编码的指针,使用D3D10CompileShader编译Shader并返回ID3D10Blob接口。然后使用ID3D10Blob::GetBufferPointer()得到Shader编码的指针。
This function takes the following parameters:
这个函数包含一下几个参数:
- The input layout from step 1. As said before, this describes how the IA will interpret the data in the vertex buffer.
从步骤1中创建的输入层次。前面说过,这描述了IA如何读取顶点缓存中的数据。
- The number of element declarations in the input buffer. Each input buffer is laid out as an array of elements (possibly with a header and therefore an offset - see step 1). Since this example uses a single vertex buffer, each element is the data stored for a single vertex.
数据缓存中定义的数据元素个数。每个输入缓存由一组元素构成(可能由头或者偏移-见步骤1)。因为这个例子使用一个简单的顶点缓存,每个元素都被存储为一个顶点。
- The shader input signature. To type check the data coming from the input stream with the data that will be generated for the next pipeline stage (which is a vertex shader), the input element description is compared against the shader input declaration (using the shader signature). The shader signature is part of the compiled shader. You cannot get this directly; instead, you create a shader reflection object, which can get a pointer to the compiled shader. This pointer is then supplied to CreateInputLayout.
Shader输入符号:为了检查从输入数据流到下一步绘制流水线(VS)的数据类型是否匹配,输入元素描述符会和Shader数据申明做比较。Shader输入符号是编译好的Shader的一部分,你不能直接获取,但是你可以创建一个能够得到编译好的Shader指针的Shader映射对象,这个指针由CreateInputLayout提供。
- If the function is successful, it returns a pointer to the Input Layout object. This interface will be used in a moment to set this object in the device.
如果函数成功,则返回输入层次对象的指针。这个指针会被设置到设备中。
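The compatibility check between an input layout and a shader signature can be pictured as a lookup by semantic name and index. The sketch below is a deliberately simplified model in plain C++: `InputElement`, `ShaderInput`, and `LayoutMatchesSignature` are hypothetical, and the real runtime validation presumably checks more than this (formats, for example).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced versions of a layout element and a shader input.
struct InputElement { std::string semantic; unsigned index; };
struct ShaderInput  { std::string semantic; unsigned index; };

// Every input the shader declares must be supplied by some layout element
// with the same semantic name and semantic index.
inline bool LayoutMatchesSignature(const std::vector<InputElement>& layout,
                                   const std::vector<ShaderInput>& signature) {
    for (const ShaderInput& in : signature) {
        assert(!in.semantic.empty());
        bool found = false;
        for (const InputElement& e : layout) {
            if (e.semantic == in.semantic && e.index == in.index) {
                found = true;
                break;
            }
        }
        if (!found) return false;  // the shader expects data the layout cannot provide
    }
    return true;
}
```

A layout that only declares `POSITION` would pass against a shader that reads `POSITION`, and fail against one that also reads `TEXCOORD`, which is the kind of mismatch create-time validation is meant to catch before any draw call.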
Binding Objects To The IA Stage
With the input buffer resources and the input layout object created, you just need to set these objects to the device, which binds these objects to the IA stage. This is done by calling ID3D10Device::IASetInputLayout and ID3D10Device::IASetVertexBuffers.
有了创建好的输入缓存资源和输入层次对象,你可以把这些对象设置到设备中,也就是把他们绑定到IA阶段上。使用ID3D10Device::IASetInputLayout完成这些工作。
// Set the input layout
g_pd3dDevice->IASetInputLayout( g_pVertexLayout );
// Set vertex buffer
UINT stride = sizeof( SimpleVertex );
UINT offset = 0;
g_pd3dDevice->IASetVertexBuffers( 0, 1, &g_pVertexBuffer, &stride, &offset );
Setting the input layout object to the device only required a pointer to the object. Setting the vertex buffer is a little more complicated.
设置输入层次对象只需要它的指针,设置顶点缓存稍微复杂一点。
This example only requires a single vertex buffer, but as the name of the API method suggests, IASetVertexBuffers takes an array of vertex buffers. Setting one (or more) vertex buffers requires that you bind each buffer to a unique input slot on the IA, and specify anything special about the way the data is stored in the buffer (like any offset to the start of the data and the size of each element in the declaration). With these two pieces of information, the IA stage knows how to step through the VB one vertex at a time.
这个例子只需要一个简单的顶点缓存。但是你可以从API函数的名字中看见,SetVertexBuffers函数需要输入一组顶点缓存。设置一个(或者多个)顶点缓存需要你把每个缓存绑定到IA上某个数据槽上,并且定义不同数据缓存存储的方式(比如申明偏移,起点,大小等)。使用这两个信息,IA阶段知道怎么遍历顶点缓存读取顶点数据。
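The role of the stride and offset arguments can be illustrated without any Direct3D calls. The following is a minimal CPU-side sketch, assuming a hypothetical `SketchVertex` type and helper names that are not part of the API; it only shows how a fetch unit could locate vertex `index` from an offset and a stride.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical vertex type for this sketch; the document's SimpleVertex is not shown.
struct SketchVertex
{
    float x, y, z;
};

// Read vertex number `index` from a raw byte buffer: the data starts at
// `offset`, and consecutive vertices are `stride` bytes apart.
SketchVertex FetchVertex( const std::vector<unsigned char>& buffer,
                          std::size_t stride, std::size_t offset, std::size_t index )
{
    const std::size_t byteOffset = offset + index * stride;
    assert( byteOffset + sizeof( SketchVertex ) <= buffer.size() );

    SketchVertex v;
    std::memcpy( &v, buffer.data() + byteOffset, sizeof( v ) );
    return v;
}

// Pack vertices into a byte buffer after `offset` bytes of unrelated data.
std::vector<unsigned char> PackVertices( const std::vector<SketchVertex>& vertices,
                                         std::size_t offset )
{
    std::vector<unsigned char> buffer( offset + vertices.size() * sizeof( SketchVertex ) );
    if( !vertices.empty() )
        std::memcpy( buffer.data() + offset, vertices.data(),
                     vertices.size() * sizeof( SketchVertex ) );
    return buffer;
}
```

With a stride of `sizeof( SketchVertex )` and an offset of 16, `FetchVertex( buffer, stride, 16, 2 )` reads the third vertex that `PackVertices` stored.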
In addition, if your application uses an index buffer, follow a similar set of steps for the vertex buffer: create the index buffer by calling ID3D10Device::CreateIndexBuffer (only one of these is allowed) and set it to the device with ID3D10Device::SetIndexBuffer.
同时,如果你的程序使用索引缓存,使用和添加顶点缓存类似的步骤:使用ID3D10Device::CreateIndexBuffer创建(只可以创建一个),使用ID3D10Device::SetIndexBuffer设置到设备上。
Specify The Primitive Topology
With the size of the input data fully specified in the input layout and the input buffer declarations, the IA still needs to know how a primitive is described by the data. This is called the primitive topology because it tells you how to assemble a primitive from vertices.
除了通过输入层次和输入缓存申明知道输入数据的大小外,IA还需要知道图元是如何描述的。这被叫做图元拓扑,因为它告诉你如何把顶点组织为图元。
Direct3D 10 supports several primitive topologies including primitives built from points, lines or triangles, connected in strips (continuously connected primitives) or lists (a list of unconnected primitives), with or without adjacent primitives. See illustrations for each of these in primitive topologies.
D3D10支持几种图元拓扑,包括由点,线和三角形构成的,组织为条带(连续的图元)或链表(不连续的图元),具备或者不具备邻接信息的图元格式。这些拓扑详见primitive topologies。
Set the primitive topology by calling IASetPrimitiveTopology:
通过调用IASetPrimitiveTopology来设置图元拓扑。
g_pd3dDevice->IASetPrimitiveTopology( D3D10_PRIMITIVE_TOPOLOGY_TRIANGLELIST );
This example defines the data as a triangle list without adjacency. The rest of the choices are listed in D3D10_PRIMITIVE_TOPOLOGY.
这个例子定义了数据作为没有邻接信息的三角形链表保存。其他的选择在D3D10_PRIMITIVE_TOPOLOGY里列出。
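How the topology turns a vertex count into a primitive count can be sketched as follows. The `Topology` enum and `PrimitiveCount` function are hypothetical names for this illustration (they do not mirror the values of D3D10_PRIMITIVE_TOPOLOGY), and the adjacency variants are left out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical enum for this sketch; it echoes the topology names in the text,
// not the values of D3D10_PRIMITIVE_TOPOLOGY.
enum class Topology { PointList, LineList, LineStrip, TriangleList, TriangleStrip };

// Number of complete primitives assembled from `vertexCount` vertices.
// Leftover vertices that do not complete a primitive are not counted.
std::size_t PrimitiveCount( Topology topology, std::size_t vertexCount )
{
    switch( topology )
    {
    case Topology::PointList:     return vertexCount;
    case Topology::LineList:      return vertexCount / 2;
    case Topology::LineStrip:     return vertexCount >= 2 ? vertexCount - 1 : 0;
    case Topology::TriangleList:  return vertexCount / 3;
    case Topology::TriangleStrip: return vertexCount >= 3 ? vertexCount - 2 : 0;
    }
    return 0;
}
```

Six vertices give two triangles as a list but four as a strip, which is why the same buffer contents mean different geometry under different topologies.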
Example of the Input Assembler Stage Generating System Values
As stated earlier, system values are generated by the IA to allow certain efficiencies in shader operations by attaching data such as:
前面提到过,系统值由IA生成来提高Shader操作效率。系统值由如下几个:
实例编号(VS可见),顶点编号(VS可见),图元编号(GS/PS可见)
- InstanceID (visible to VS)
- VertexID (visible to VS)
- PrimitiveID (visible to GS/PS)
A subsequent shader stage may look for these system values to optimize processing in that stage. For instance, the VS stage may look for the InstanceID to grab additional per-vertex data for the shader or to perform other operations; the GS and PS stages may use the PrimitiveID to grab per-primitive data in the same way.
一些绘制阶段会读取这些值来优化绘制过程。例如,VS阶段会读取实例编号来获取附加的逐顶点信息或执行其他操作;GS和PS阶段会以同样的方式使用图元编号来获取逐图元的信息。
Here's an example of the IA stage showing how system values may be attached to an instanced triangle strip:
这是IA阶段的一个例子,显示了系统值如何被添加到一个三角形条带上。
Figure 1. IA Example
This example shows two instances of geometry that share vertices. The figure at the top left shows the first instance (U) of the geometry - the first two tables show the data that the IA generates to describe instance U. The input assembler generates the VertexID, PrimitiveID, and the InstanceID to label this primitive. The data ends with the strip cut, which separates this triangle strip from the next one.
这个例子描述了两个共享顶点的几何模型实例。左上角的图形表示第一个实例物体U,前两个表显示了IA如何描述实例U。IA生成VertexID,PrimitiveID和InstanceID来标记这些图元。数据在条带切断的时候结束,把这个三角形条带和另外一个划分开来。
The rest of the figure pertains to the second instance (V) of geometry that shares vertices E, H, G, and I. Notice the corresponding InstanceID, VertexID and PrimitiveIDs that are generated.
剩下的图像描述了第二个图形V的集合,使用了顶点E,H,G和I。注意对应的InstanceID,VertexID和PrimitiveID是系统生成的。
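The labelling described for instances U and V can be mimicked on the CPU. This is a rough sketch with hypothetical type and function names. It assumes that VertexID and PrimitiveID restart at 0 for each instance and that InstanceID counts instances from 0, and it ignores strip cuts.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct VertexTag    { std::uint32_t instanceID; std::uint32_t vertexID; };
struct PrimitiveTag { std::uint32_t instanceID; std::uint32_t primitiveID; };

struct SystemValues
{
    std::vector<VertexTag>    vertices;
    std::vector<PrimitiveTag> primitives;
};

// Label every vertex and triangle of an instanced triangle strip.
// Assumption for this sketch: VertexID and PrimitiveID restart at 0
// for each instance, and InstanceID counts instances from 0.
SystemValues TagInstancedStrip( std::uint32_t verticesPerInstance, std::uint32_t instanceCount )
{
    SystemValues result;
    const std::uint32_t trianglesPerInstance =
        verticesPerInstance >= 3 ? verticesPerInstance - 2 : 0;

    for( std::uint32_t instance = 0; instance < instanceCount; ++instance )
    {
        for( std::uint32_t v = 0; v < verticesPerInstance; ++v )
            result.vertices.push_back( { instance, v } );
        for( std::uint32_t p = 0; p < trianglesPerInstance; ++p )
            result.primitives.push_back( { instance, p } );
    }
    return result;
}
```

Calling `TagInstancedStrip( 5, 2 )` yields ten vertex tags and six primitive tags, with the second instance's IDs starting again from zero under the stated assumption.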
Primitive Topologies
Primitive topologies describe how data is organized into primitives. Direct3D supports the following primitive types:
图元拓扑描述了输入数据如何组织成图元,D3D支持这些图元类型:
The winding direction of a triangle indicates the direction in which the vertices are ordered. It can either be clockwise or counter clockwise.
转动的方向表示三角形顶点的排序方向。可以是顺时针也可以是逆时针。
A leading vertex is the first vertex in a sequence of three vertices.
引导顶点就是顶点序列中三个顶点的第一个。
Point List
A point list is a collection of vertices that are rendered as isolated points. Use them in 3D scenes for star fields, or dotted lines on the surface of a polygon. Your application can apply materials and textures to a point list.
点链表是一组独立绘制的顶点的集合。它们在3D场景中绘制星形物或者多边形表面的点划线。在程序中可以对点应用材质和纹理。
Line List
A line list is a list of isolated, straight line segments. Line lists are useful for such tasks as adding sleet or heavy rain to a 3D scene. Applications create a line list by filling an array of vertices. Note that the number of vertices in a line list must be an even number greater than or equal to two. You can apply materials and textures to a line list.
线链表是一组独立线段的集合。线链表对于3D场景中添加条状物或者绘制大雨非常有效。程序通过填充一组顶点到顶点缓存中来创建线链表。注意顶点缓存中的顶点数必须是一个大于或者等于2的偶数。你可以对线链表使用材质或者纹理。
Line List with Adjacency
A line list that also contains adjacency information.
线链表可以拥有邻接信息。
The adjacency information specifies the neighboring vertices around a primitive and is used by a geometry shader to calculate things like edges that require knowledge of a primitive and any geometry that shares edges or vertices with it.
邻接信息定义了图元的邻接顶点,在GS中使用邻接信息来计算边缘需要知道图元和任何与之共享边缘的顶点。
Line Strip
A line strip is a primitive that is composed of connected line segments. Use line strips for creating polygons that are not closed.
线条带是有连接的线段组成的图元。使用线条带创建的多边形不是闭合的。
Line Strip with Adjacency
A line strip that also contains adjacency information.
线条带也可以拥有邻接信息。
Triangle List
A triangle list is a list of unconnected triangles. A triangle list must have at least three vertices and the total number of vertices must be divisible by three.
三角形链表是不连接的三角形的集合。三角形链表使用至少三个顶点,并且总顶点数必须可以被3整除。
Triangle lists are also useful for creating primitives that have sharp edges.
三角形链表可以用来创建有尖锐边缘的物体。
Triangle List with Adjacency
A triangle list that also contains adjacency information.
Triangle Strip
A triangle strip is a series of connected triangles. Because the triangles are connected, the application does not need to repeatedly specify all three vertices for each triangle.
三角形条带是一系列连接的三角形。因为三角形是相互连接的,所以程序不需要对每个三角形独立定义所有三个顶点。
Most objects are composed of triangle strips. This is because triangle strips can be used to specify complex objects in a way that makes efficient use of memory and processing time.
许多物体由三角形条带构成。这是因为三角形条带能够定义复杂的物体,使内存使用和处理时间更有效。
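The memory saving comes from each new vertex forming a triangle with the two vertices before it. A small sketch of expanding a strip into individual triangles is shown below; `ExpandStrip` is a hypothetical helper, and swapping the last two indices of every second triangle is one common way to keep the winding direction consistent, assumed here rather than taken from the text.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Triangle = std::array<std::size_t, 3>;

// Expand a triangle strip of `vertexCount` vertices into individual triangles,
// expressed as indices into the strip. Every second triangle swaps its last two
// indices so all triangles keep the same winding direction.
// The exact swap is an assumption of this sketch (a common convention).
std::vector<Triangle> ExpandStrip( std::size_t vertexCount )
{
    std::vector<Triangle> triangles;
    for( std::size_t i = 0; i + 2 < vertexCount; ++i )
    {
        if( i % 2 == 0 )
            triangles.push_back( { i, i + 1, i + 2 } );
        else
            triangles.push_back( { i, i + 2, i + 1 } );
    }
    return triangles;
}
```

A five-vertex strip expands to three triangles, which would take nine vertices if stored as a triangle list.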
Triangle Strip with Adjacency
A triangle strip that also contains adjacency information.
Shader Stages
All Direct3D 10 shader stages expose all features of the Shader Model 4.0 common shader core; the Direct3D 10 pipeline contains three programmable shader stages:
所有的D3D10 Shader阶段体现了SM4.0通用Shader内核的细节。D3D10Shader流水线包含3个可编程Shader阶段。
Vertex Shader Stage
The vertex shader stage processes vertices from the input assembler, performing per-vertex operations such as transformations, skinning, morphing, and per-vertex lighting. Vertex shaders always operate on a single input vertex and produce a single output vertex. The vertex shader stage must always be active for the pipeline to execute. If no vertex modification or transformation is required, a pass-through vertex shader must be created and set to the pipeline.
VS阶段处理从IA阶段输入的顶点,执行对每个顶点的操作,诸如几何变换,表面变换,变形和逐顶点光照。VS只对一个顶点执行操作,输出一个顶点。如果没有顶点变换需求,则需要创建一个空的VS设置到流水线中。
Each vertex shader input vertex can be comprised of up to 16 32-bit vectors (up to 4 components each) and each output vertex can be comprised of as many as 16 32-bit 4-component vectors. All vertex shaders must have a minimum of one input and one output, which can be as little as one scalar value.
每个VS输入顶点最多能够有16个32位的向量(4个通道)构成;每个输出顶点可以由16个32位4通道的向量构成。所有的VS至少拥有一个只具有一个通道的输入向量和输出向量。
The vertex shader stage can consume two system generated values from the input assembler: VertexID and InstanceID (see System Values and Semantics). Since VertexID and InstanceID are both meaningful at a vertex level, and IDs generated by hardware can only be fed into the first stage that understands them, these ID values can only be fed into the vertex shader stage.
VS能够识别两个系统生成值:VertexID和InstanceID(参见System Value and Semantics)。因为VertexID和InstanceID在顶点处理阶段都是有意义的,而且硬件生成的ID只能输入到第一个能够识别它们的阶段中,因此这些ID只能在VS阶段输入。
Vertex shaders are always run on all vertices, including adjacent vertices in input primitive topologies with adjacency. The number of times that the vertex shader has been executed can be queried from the CPU using the VSInvocations pipeline statistic.
VS对所有顶点执行,包括具有邻接信息的输入图元的邻接顶点。VS执行的次数能够通过在CPU上调用VSInvocations统计数据来获得。
The vertex shader can perform load and texture sampling operations where screen-space derivatives are not required (using HLSL intrinsic functions samplelevel, samplecmplevelzero, samplegrad).
Vs能够在不需要屏幕位置导数的情况下执行采样纹理操作(使用HLSL内置函数samplelevel, samplecmplevelzero, samplegrad)。
Geometry Shader Stage
The geometry shader runs application-specified shader code with vertices as input and the ability to generate vertices on output. Unlike vertex shaders, which operate on a single vertex, the geometry shader's inputs are the vertices for a full primitive (two vertices for a line, three vertices for a triangle, or a single vertex for a point). Geometry shaders can also bring in the vertex data for the edge-adjacent primitives as input (an additional two vertices for a line, an additional three for a triangle).
GS对一组顶点执行操作,并输出一组顶点。和VS不一样的是,它不是仅对一个顶点操作,而是对整个图元的所有顶点操作(线条有两个顶点,三角形有三个顶点,点有一个顶点)。GS同样能够得到邻接顶点的信息(线条可以多得到两个顶点,三角形可以多得到三个顶点)。
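The input sizes listed above can be summarized in a small lookup. The enum and function names here are hypothetical and exist only for this sketch.

```cpp
#include <cassert>

// Hypothetical names for the kinds of primitive a geometry shader can receive.
enum class GsInputPrimitive { Point, Line, Triangle, LineWithAdjacency, TriangleWithAdjacency };

// Number of vertices visible to one geometry shader invocation.
unsigned GsInputVertexCount( GsInputPrimitive primitive )
{
    switch( primitive )
    {
    case GsInputPrimitive::Point:                 return 1;
    case GsInputPrimitive::Line:                  return 2;
    case GsInputPrimitive::Triangle:              return 3;
    case GsInputPrimitive::LineWithAdjacency:     return 2 + 2; // two extra adjacent vertices
    case GsInputPrimitive::TriangleWithAdjacency: return 3 + 3; // three extra adjacent vertices
    }
    return 0;
}
```

The returned value corresponds to the size of the input array a geometry shader would declare for that primitive type.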
The geometry shader stage can consume the SV_PrimitiveID System Value that is auto-generated by the IA. This allows per-primitive data to be fetched or computed if desired.
GS阶段能够读取IA阶段自动生成的SV_PrimitiveID系统值。这使得逐图元的数据能够在需要的情况下被读取和计算。
The geometry shader stage is capable of outputting multiple vertices forming a single selected topology (GS output topologies available are: tristrip, linestrip, and pointlist). The number of primitives emitted can vary freely within any invocation of the geometry shader, though the maximum number of vertices that could be emitted must be declared statically. Strip lengths emitted from a GS invocation can be arbitrary, and new strips can be created via the RestartStrip HLSL intrinsic function.
GS阶段能够输出多个顶点组成一个选定的图元拓扑(GS输出的图元拓扑包括:三角形条带,线条带和点链表)。图元输出的个数能够自由变化,但是最大顶点数必须静态的申明。GS输出的条带长度是任意的,新的条带可以通过HLSL内置函数RestartStrip来创建。
Geometry shader output may be fed to the rasterizer stage and/or to a vertex buffer in memory via the stream output stage. Output fed to memory is expanded to individual point/line/triangle lists (exactly as they would be passed to the rasterizer).
GS输出到光栅化阶段或者通过流输出阶段输出到显存中的顶点缓存中。输出到内存的是独立的顶点、直线或者三角形链表(当然也可能被送到光栅化过程中)。
When a geometry shader is active, it is invoked once for every primitive passed down or generated earlier in the pipeline. Each invocation of the geometry shader sees as input the data for the invoking primitive, whether that is a single point, a single line, or a single triangle. A triangle strip from earlier in the pipeline would result in an invocation of the geometry shader for each individual triangle in the strip (as if the strip were expanded out into a triangle list). All the input data for each vertex in the individual primitive is available (i.e. 3 vertices for triangle), plus adjacent vertex data if applicable/available.
当一个GS被激活时,它对每个传过来的或者被流水线预先生成的图元执行一次操作。每次GS操作把输入数据当作图元处理,也就是当作一个点,一条直线或者一个三角形。从前端流水传来的三角条带会引起GS对条带中每个独立的三角形做一次操作(就好像把条带扩展成为了三角链表)。每个独立图元的每个顶点的所有输入数据都可以取到(比如三角形的三个顶点),并且在预先申明的情况下还附加了邻接顶点信息。
A geometry shader outputs data one vertex at a time by appending vertices to an output stream object. Three types of stream objects are available: PointStream, LineStream and TriangleStream, all of which are templated objects. The topology of the output is fixed by the declared object type, while the format of the vertices appended to the stream is determined by the template type. Each invocation of a geometry shader executes independently of other invocations, except that data added to the streams is serialized, so ordering is respected. A geometry shader generating triangle strips will start a new strip on every invocation.
GS通过一次输出一个顶点,把顶点附加到输出流对象的最后来输出数据。输出流的拓扑通过设置绘制状态来确定,从PointStream, LineStream和TriangleStream中选择一个。这里有三种输出流对象,PointStream, LineStream和TriangleStream,它们都是模板对象。输出的拓扑通过它们对应的对象类型确定,添加到流上的顶点格式由模板类型决定。GS实例的执行操作相对同步其他的GS操作来说是独立的,但是添加流数据是按顺序进行的。GS的输出和其他GS的调用是相互独立的。每个GS每次会重新创建一个三角条带。
When a geometry shader output is identified as a System Interpreted Value (e.g. SV_RenderTargetArrayIndex or SV_Position), hardware looks at this data and performs some behavior dependent on the value, in addition to being able to pass the data itself to the next shader stage for input. When such data output from the geometry shader has meaning to the hardware on a per-primitive basis (such as SV_RenderTargetArrayIndex or SV_ViewportArrayIndex), rather than on a per-vertex basis (such as SV_ClipDistance[n] or SV_Position), the per-primitive data is taken from the leading vertex emitted for the primitive.
因为GS的输出是一个系统识别值(比如SV_RenderTargetArrayIndex or SV_Position),硬件会找到这些数据并且执行和系统值相关的操作,同时也为了能够把数据传送到下一个Shader阶段作为输入。因为GS输出的数据是基于图元的(比如SV_RenderTargetArrayIndex 或SV_ViewportArrayIndex)而不是基于顶点的(比如SV_ClipDistance[n]或SV_Position),逐图元数据就要从输出的引导索引开始读取。
Partially completed primitives could be generated by the geometry shader if the geometry shader ends and the primitive is incomplete. Incomplete primitives are silently discarded. This is similar to the way the IA treats partially completed primitives.
部分完整的图元会当GS在图元没有生成完全结束时被GS生成。不完整的图元会被默认忽略,就和IA阶段对待部分完整的图元一样。
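The interaction between the declared maximum vertex count, strip restarts, and discarded incomplete primitives can be modelled with a small counter. The class below is a hypothetical stand-in, not an HLSL or Direct3D type. Treating Append calls beyond the declared maximum as ignored is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal stand-in for a triangle output stream: it records only how many
// vertices each strip received. Names are hypothetical, chosen to echo the
// Append / RestartStrip intrinsics described in the text.
class TriangleStreamSketch
{
public:
    explicit TriangleStreamSketch( std::size_t maxVertexCount )
        : m_maxVertexCount( maxVertexCount ), m_strips( 1, 0 ) {}

    // Append one vertex to the current strip; returns false once the
    // statically declared maximum has been reached (assumed behaviour).
    bool Append()
    {
        if( m_emitted == m_maxVertexCount )
            return false;
        ++m_emitted;
        ++m_strips.back();
        return true;
    }

    // End the current strip and begin a new one.
    void RestartStrip() { m_strips.push_back( 0 ); }

    // Count complete triangles; strips with fewer than three vertices
    // contribute nothing, mirroring the discard of incomplete primitives.
    std::size_t TriangleCount() const
    {
        std::size_t count = 0;
        for( std::size_t vertices : m_strips )
            if( vertices >= 3 )
                count += vertices - 2;
        return count;
    }

private:
    std::size_t m_maxVertexCount;
    std::size_t m_emitted = 0;
    std::vector<std::size_t> m_strips;
};
```

Emitting strips of four, two and three vertices produces three triangles in total, because the two-vertex strip never completes a primitive.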
The geometry shader can perform load and texture sampling operations where screen-space derivatives are not required (samplelevel, samplecmplevelzero, samplegrad).
GS能够在不需要屏幕坐标位置导数的情况下进行采样纹理的操作(samplelevel, samplecmplevelzero, samplegrad)
Algorithms that can be implemented in the geometry shader include:
使用GS可以实现的算法有:
- Point Sprite Expansion 点精灵扩展
- Dynamic Particle Systems 动态粒子系统
- Fur/Fin Generation 皮毛生成
- Shadow Volume Generation Shadow Volume生成Shadow Volume
- Single Pass Render-to-Cubemap 单Pass绘制CubeMap
- Per-Primitive Material Swapping 逐图元材质变换
- Per-Primitive Material Setup - Including generation of barycentric coordinates as primitive data so that a pixel shader can perform custom attribute interpolation. 逐图元材质重建,包括生成图元的重心坐标数据,使PS能够执行其他属性的插值。
Pixel Shader Stage
A pixel shader is invoked by the rasterizer stage to calculate a per-pixel value for each pixel in a primitive that gets rendered. The pixel shader enables rich shading techniques such as per-pixel lighting and post-processing. A pixel shader is a program that combines constant variables, texture values, interpolated per-vertex values, and other data to produce per-pixel outputs. The stage preceding the rasterizer stage (the GS stage, or the VS stage if the geometry shader is NULL) must output vertex positions in homogeneous clip space.
PS在光栅化阶段后调用,用于计算一些被绘制图元的逐象素信息。PS包括丰富的光照技术,包括逐象素光照和后期处理。PS是包含常数变量,纹理值,顶点插值数据和其他逐象素数据的程序。PS前的光栅化阶段(GS或者当GS是空时为VS)必须向象素空间输出顶点位置。
A pixel shader can input up to 32 32-bit, 4-component data elements for the current pixel location. It is only when the geometry shader is active that all 32 inputs can be fed with data from above in the pipeline. In the absence of the geometry shader, only up to 16 4-component elements of data can be input from upstream in the pipeline.
PS能够对每个象素输入32个32位4通道的数据。只有当GS激活时所有的32位输入才能被完全填充。没有GS的话,只有最多16个4通道的数据能够从流水线前端输入。
Input data available to the pixel shader includes vertex attributes that can be chosen, on a per-element basis, to be interpolated with or without perspective correction, or be treated as per-primitive constants. In addition, declarations in a pixel shader can indicate which attributes to apply centroid evaluation rules to. Centroid evaluation is relevant only when multisampling is enabled, since cases arise where the pixel center may not be covered by the primitive (though subpixel center(s) are covered, hence causing the pixel shader to run once for the pixel). Attributes declared with centroid mode must be evaluated at a location covered by the primitive, preferably at a location as close as possible to the (non-covered) pixel center.
对PS有效的输入数据包括被选中的逐元素的顶点数据,它们可以被透视校正或者不校正,然后再插值得到,或者作为逐图元的常量。同时,PS的申明还表示了哪个属性会被应用重心赋值规则得到。重心赋值规则只在多采样的时候生效,因为存在象素中心可能不被图元覆盖的情况(虽然子象素中心是被覆盖的,但这样会导致PS为这个象素再运行一次)。被申明为重心模式的属性必须在被图元覆盖的区域赋值,而且距离(没有被覆盖的)象素中心越近越好。
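The difference between interpolating with and without perspective correction can be shown with the standard formula for two endpoints. This is a sketch of the general technique, not of any particular hardware. The function names are hypothetical, and `w0` and `w1` stand for the clip-space w values of the endpoints.

```cpp
#include <cassert>
#include <cmath>

// Linear (no perspective correction) interpolation at parameter t in [0, 1].
float InterpolateLinear( float a0, float a1, float t )
{
    return a0 * ( 1.0f - t ) + a1 * t;
}

// Perspective-correct interpolation: interpolate a/w and 1/w linearly in
// screen space, then divide. w0 and w1 are the clip-space w values of the
// two endpoints and must be non-zero.
float InterpolatePerspective( float a0, float w0, float a1, float w1, float t )
{
    const float numerator   = InterpolateLinear( a0 / w0, a1 / w1, t );
    const float denominator = InterpolateLinear( 1.0f / w0, 1.0f / w1, t );
    return numerator / denominator;
}
```

When both endpoints share the same w, the two functions agree; when the far endpoint has a larger w, the perspective-correct result at the screen-space midpoint is pulled toward the near endpoint's value.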
A pixel shader can output up to 8 32-bit, 4-component data elements for the current pixel location to be combined with the render target(s), or no color (if the pixel is discarded). A pixel shader can also output an optional 32-bit float scalar depth value for the depth test (SV_Depth).
PS对于RT的当前象素位置能够输出8个32位4通道的数据,或者不输出颜色(如果象素取消绘制的话)。PS同样能够输出可选的32位浮点深度值,用作深度测试。
For each primitive entering the rasterizer, the pixel shader is invoked once for each pixel covered by the primitive. When multisampling, the pixel shader is invoked once per covered pixel, though depth/stencil tests occur for each covered multisample, and multisamples that pass the tests are updated with the pixel shader output color(s).
对于进入光栅化后每个图元,PS对于图元覆盖的每个象素只调用一次。当多采样时,PS对每个覆盖的象素调用一次,虽然D/S测试对每个覆盖的采样点调用一次,通过D/S测试的采样点使用PS的输出颜色更新。
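The split between per-pixel shading and per-sample testing can be sketched for a single pixel. The types and function below are hypothetical. A less-than depth comparison is assumed, and the stencil test is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Sample
{
    float    depth;
    unsigned color;
};

// Process one pixel of a multisampled render target for one primitive.
// `shaderColor` is computed once for the pixel; the depth test (less-than,
// an assumption of this sketch) runs for each covered sample.
// Returns the number of samples that were updated.
std::size_t ShadePixel( std::vector<Sample>& samples,
                        const std::vector<bool>& covered,
                        const std::vector<float>& primitiveDepth,
                        unsigned shaderColor )
{
    assert( samples.size() == covered.size() && samples.size() == primitiveDepth.size() );
    std::size_t updated = 0;
    for( std::size_t i = 0; i < samples.size(); ++i )
    {
        if( covered[i] && primitiveDepth[i] < samples[i].depth )
        {
            samples[i].depth = primitiveDepth[i];
            samples[i].color = shaderColor;
            ++updated;
        }
    }
    return updated;
}
```

Only samples that are both covered and pass the depth test receive the single shader color; the rest keep their previous contents.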
If there is no geometry shader, the IA is capable of producing one scalar per-primitive system-generated value to the pixel shader, the SV_PrimitiveID, which can be read as input to the pixel shader. The pixel shader can also retrieve the SV_IsFrontFace value, generated by the rasterizer stage.
如果没有GS的话,IA能够创建一个逐图元生成的值给PS,那就是SV_PrimitiveID,能够作为PS的输入被PS读取。PS同样也能够得到在光栅化过程中生成的SV_IsFrontFace值。
One of the inputs to the pixel shader can be declared with the name SV_Position, which means it will be initialized with the pixel's float32 xyzw position. Note that w is the reciprocal of the linearly interpolated 1/w value. When the rendertarget is a multisample buffer or a standard rendertarget, the xy components of position contain pixel center coordinates (which have a fraction of 0.5f).
PS的输入之一可以被申明为SV_Position,意味着它可以被初始化为象素的32位浮点数xyzw位置。注意w时线形插值得到的1/w的倒数。当RT时一个多采样缓存或者标准RT时,xy通道包含象素重心点的坐标(小数位偏移0.5f)
The pixel shader instruction set includes several instructions that produce or use derivatives of quantities with respect to screen space x and y. The most common use for derivatives is in level-of-detail calculations for texture sampling and, in the case of anisotropic filtering, in selecting samples along the axis of anisotropy. Typically, hardware implementations run a pixel shader on multiple pixels (for example a 2x2 grid) simultaneously, so that derivatives of quantities computed in the pixel shader can be reasonably approximated as deltas of the values at the same point of execution in adjacent pixels.
PS的指令集包括几条生成和使用屏幕位置的x,y的导数的指令。使用最多的导数是纹理采样和多异向性滤波的LOD计算,选择沿轴方向的异向。典型的多采样硬件实现就是对多个象素(比如说2x2网格)运行类似的PS,所以PS里计算的导数理论上逼近对临近象素执行PS的实际值。
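The idea of approximating derivatives as deltas inside a 2x2 pixel quad can be sketched as follows. All names are hypothetical. The choice of which neighbours to subtract and the simplified one-dimensional level-of-detail formula are assumptions made for illustration, not a description of any specific hardware.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Values of one quantity at the four pixels of a 2x2 quad:
// [0] top-left, [1] top-right, [2] bottom-left, [3] bottom-right.
struct Quad { float v[4]; };

// Approximate d/dx and d/dy as deltas between neighbouring pixels.
// Using the top row and left column is one possible choice for this sketch.
float DdxCoarse( const Quad& q ) { return q.v[1] - q.v[0]; }
float DdyCoarse( const Quad& q ) { return q.v[2] - q.v[0]; }

// Simplified (isotropic, one coordinate) level of detail for a texture
// coordinate u in [0, 1] on a texture `textureSize` texels wide:
// log2 of the number of texels stepped per pixel, clamped at level 0.
float MipLevel( const Quad& u, float textureSize )
{
    const float step = std::max( std::fabs( DdxCoarse( u ) ), std::fabs( DdyCoarse( u ) ) );
    return std::max( 0.0f, std::log2( step * textureSize ) );
}
```

If u advances by 1/64 per pixel on a 256-texel texture, each pixel spans four texels, so the sketch selects mip level 2.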
Stream Output Stage
The stream output stage (SO) is located in the pipeline right after the geometry shader stage and just before the rasterization stage.
流输出阶段在流水线后紧随GS之后在光栅化之前的一个阶段。
Figure 1. Pipeline Block Diagram - the crosshatched stage is the Stream Output stage
The purpose of the SO stage is to write vertex data streamed out of the GS stage (or the VS stage if the GS stage is inactive) to one or more buffer resources in memory. Data streamed out to memory can be read back into the pipeline in a subsequent rendering pass, or can be copied to a staging resource for readback to the CPU. Since variable amounts of data can be generated by a geometry shader, the amount of data streamed out can vary. The DrawAuto API allows this variable amount of data to be processed in a subsequent pass without the need to query (from the CPU) the amount of data written to stream output.
SO阶段的目的是把GS阶段输出的顶点数据(或者在GS没有激活的情况下是VS阶段)写到显存中的一个或多个缓存资源中。输出到显存中的流数据能够被下一个绘制pass读回到流水线中,或者能够拷贝到某些资源中被CPU读回。因为GS生成数据的数量是变化的,流输出的大小也是可变的。DrawAuto API允许可变数量的数据在下一个pass中处理而不需要(向CPU查询)流数据的的数量。
SO Stage API
These are the steps required to initialize and execute the stream output stage:
这些是初始化和执行SO阶段的步骤
Compile a Geometry Shader
Given the following geometry shader (from Tutorial13):
有如下的GS
struct GSPS_INPUT
{
    float4 Pos : SV_POSITION;
    float3 Norm : TEXCOORD0;
    float2 Tex : TEXCOORD1;
};
[maxvertexcount(3)]
void GS( triangle GSPS_INPUT input[3], inout TriangleStream<GSPS_INPUT> TriStream )
{
    GSPS_INPUT output;
    //
    // Calculate the face normal
    //
    float3 faceEdgeA = input[1].Pos - input[0].Pos;
    float3 faceEdgeB = input[2].Pos - input[0].Pos;
    float3 faceNormal = normalize( cross(faceEdgeA, faceEdgeB) );
    for( int v=0; v<3; v++ )
    {
        output.Pos = input[v].Pos + float4(faceNormal*Explode,0);
        output.Pos = mul( output.Pos, View );
        output.Pos = mul( output.Pos, Projection );
        output.Norm = input[v].Norm;
        output.Tex = input[v].Tex;
        TriStream.Append( output );
    }
    TriStream.RestartStrip();
}
This shader calculates a face normal for each triangle, and outputs position, normal and texture coordinate data. A geometry shader looks just like a vertex or pixel shader, with the following exceptions:
这个Shader对每个三角形计算表面法向,并输出位置,法向和纹理坐标数据。GS和VS,PS类似,只是有如下区别:
- GS function attribute - the attribute that precedes the function declaration does one thing: it declares the maximum number of vertices that can be output by the shader. In this case,
GS函数属性 - 函数声明前的属性只做一件事:申明Shader能够输出的最大顶点数。在这个例子中
[maxvertexcount(3)]
defines the output to be a maximum of 3 vertices.
定义了最大输出三个顶点。
- GS input parameter declarations - This function takes two input parameters:
GS输出参数申明-这个函数带两个参数:
triangle GSPS_INPUT input[3], inout TriangleStream<GSPS_INPUT> TriStream
The first parameter is an array of vertices (3 in this case) defined by a GSPS_INPUT struct (which defines per-vertex data as a position, a normal and a texture coordinate). The first parameter also uses the triangle keyword which means the input assembler stage must output data to the geometry shader as one of the triangle primitive types (triangle list or triangle strip).
第一个参数是一个使用GSPS_INPUT结构(定义了逐顶点的数据,比如一个位置,一个法向和一个纹理坐标)定义的顶点数组(在这个例子中是3个)。第一个参数同样使用了triangle关键字,意味着IA阶段必须给GS输入一种三角形图元类型(三角链表或者三角条带)。
The second parameter is a triangle stream defined by the type TriangleStream<GSPS_INPUT>. This means the parameter is an array of triangles, each of which is made up of three vertices (that contain the data from the members of GSPS_INPUT).
第二个参数是一个三角形数据流,类型为TriangleStream<GSPS_INPUT>。
这表示参数是一组三角形,每个由3个顶点构成(包含GSPS_INPUT中定义的数据)。
Use the triangle and TriangleStream keywords to identify individual triangles or a stream of triangles in a GS.
使用triangle和TriangleStream关键字在GS中定义独立的三角形和三角形流数据。
- GS intrinsic function - The lines of code in the shader function use common-shader-core HLSL intrinsic functions, except for the calls to Append and RestartStrip. These functions are only available to a geometry shader. Append adds the output vertex to the current strip; RestartStrip starts a new primitive strip. A new strip is implicitly created in every invocation of the GS stage.
GS内置函数-Shader函数使用的代码都是使用通用shader核心的HLSL内置函数,除了最后两行,调用了Append和RestartStrip。这些函数只对GS有效。Append通知GS把输出顶点附加到当前条带最后。RestartStrip创建一个新的图元条带。一个新的条带在每次调用GS时被隐式生成。
The rest of the shader looks very similar to a vertex or pixel shader. The geometry shader uses a struct to declare input parameters and marks the position member with the SV_POSITION semantic to tell the hardware that this is position data. The input structure identifies the other two input parameters as texture coordinates (even though one of them will contain a face normal). You could use your own custom semantic for the face normal if you prefer.
剩下的Shader和VS及PS看起来很相似。GS使用结构来申明输入参数,使用SV_POSITION来标记位置成员,告诉硬件这是个位置数据。输入结构定义了其他两个输入参数作为纹理坐标(其中一个包含表面法向)。如果你喜欢的话,你能够使用你自己的语义来表示法向。
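For readers who want to check the shader's arithmetic on the CPU, the face-normal and displacement steps can be reproduced in plain C++. The `Vec3` type and helper functions are hypothetical stand-ins for HLSL's built-in vector operations, and the view and projection transforms are left out.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 Add( Vec3 a, Vec3 b )    { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
Vec3 Sub( Vec3 a, Vec3 b )    { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 Scale( Vec3 a, float s ) { return { a.x * s, a.y * s, a.z * s }; }

Vec3 Cross( Vec3 a, Vec3 b )
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Scale a vector to unit length; the caller must not pass a zero vector.
Vec3 Normalize( Vec3 a )
{
    const float length = std::sqrt( a.x * a.x + a.y * a.y + a.z * a.z );
    return Scale( a, 1.0f / length );
}

// Face normal from the two edges leaving vertex 0, as in the shader.
Vec3 FaceNormal( Vec3 p0, Vec3 p1, Vec3 p2 )
{
    return Normalize( Cross( Sub( p1, p0 ), Sub( p2, p0 ) ) );
}

// Move a vertex along the face normal by `explode` units.
Vec3 ExplodeVertex( Vec3 position, Vec3 faceNormal, float explode )
{
    return Add( position, Scale( faceNormal, explode ) );
}
```

For the triangle (0,0,0), (1,0,0), (0,1,0) the face normal points along +z, so an explode amount of 0.5 lifts every vertex by 0.5 in z.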
Having designed the geometry shader, call D3D10CompileShader to compile it like this:
设计完GS之后,调用D3D10CompileShader编译Shader,如下
DWORD dwShaderFlags = D3D10_SHADER_ENABLE_STRICTNESS;
ID3D10Blob* pShader = NULL;
// pSrcData is assumed to be a null-terminated string holding the shader source
D3D10CompileShader( pSrcData, strlen( pSrcData ),
    "Tutorial13.fx", NULL, NULL, "GS", "gs_4_0",
    dwShaderFlags, &pShader, NULL );
Just like vertex and pixel shaders, you will need a shader flag to tell the compiler how you want the shader compiled (for debug, optimized for speed etc...), the entry point function, and the shader model to validate against. This example creates a geometry shader built from the Tutorial13.fx file, using the GS function. The shader is compiled for shader model 4.0.
就像VS和PS一样,你需要一个Shader标记来告诉编译器你希望如何编译Shader(为了调试,优化等等),还有入口函数和Shader模型等信息。这个例子从Tutorial13.fx文件中创建了一个GS,使用GS函数。Shader被编译为Shader Model4.0。
Create a Geometry Shader Object with Stream Output
Once you know that you will be streaming the data from the geometry shader, and you have successfully compiled the shader, the next step is to call CreateGeometryShaderWithStreamOutput to create the geometry shader object.
一旦你知道你会从GS中得到流数据,并且你已经编译好了Shader,下一步就是调用CreateGeometryShaderWithStreamOutput来创建GS对象。
But first, you need to declare the SO stage input signature. This signature matches or validates the GS outputs and the SO inputs at object creation time. Here's an example of the SO declaration:
首先你必须申明SO阶段输入符号,这个符号在对象创建的时候和GS的输出及SO的输入匹配或者使之生效。下面是一个SO申明的例子:
D3D10_SO_DECLARATION_ENTRY pDecl[] =
{
    // semantic name, semantic index, start component, component count, output slot
    { L"SV_POSITION", 0, 0, 4, 0 }, // output all components of position
    { L"TEXCOORD0", 0, 0, 3, 0 }, // output the first 3 of the normal
    { L"TEXCOORD1", 0, 0, 2, 0 }, // output the first 2 texture coordinates
};
D3D10Device->CreateGeometryShaderWithStreamOutput( pShaderBytecode, pDecl, 3,
    sizeof(pDecl), &pGS );
This function takes several parameters including:
这个函数包含以下几个参数:
- A pointer to the compiled geometry shader (or vertex shader if no geometry shader will be present and data will be streamed out directly from the VS). To get this pointer call D3D10CompileShader to compile a shader (and return an ID3D10Blob interface); then call ID3D10Blob::GetBufferPointer() to get a pointer to the shader byte code.
一个指向编译好的GS(或者VS,如果没有GS的话,那么数据会从VS中输入)的指针。调用D3D10CompileShader编译Shader(返回ID3D10Blob接口)然后调用ID3D10Blob::GetBufferPointer()得到这个指向Shader二进制代码的指针。
- A pointer to an array of declarations that describe the input data for the stream output stage. See D3D10_SO_DECLARATION_ENTRY. You can supply up to 64 declarations, one for each different type of element to be output from the SO stage.
指向描述输入到SO阶段的输入数据格式的一组申明的指针。参见D3D10_SO_DECLARATION_ENTRY。你能够最多提供64个申明,每个表示输入到SO阶段的不同类型的数据元素。
- The number of elements that are written out by the SO stage.
输入到SO阶段的元素个数
- A pointer to the geometry shader object created (see ID3D10GeometryShader Interface).
一个指向创建好的GS对象的指针(参见ID3D10GeometryShader接口)。
The stream output declaration defines the way data is written to a buffer resource. You can add as many components as you want to the output declaration. The SO stage supports writing out to a single buffer resource, or many buffer resources. When writing to a single buffer, the SO stage supports writing many different elements per-vertex. When writing to more than one buffer, the SO stage only supports writing one element to each buffer.
SO申明定义了数据写入缓存资源的方式。你能够添加足够多的申明项到输出申明中。SO阶段支持输出到单个(或多个)缓存资源中。当写单个缓存时,SO阶段支持逐顶点写多个不同的元素。当写入多个缓存时,SO只支持对每个缓存写一个元素。
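When several elements are written to a single buffer, the declaration determines how many bytes each streamed-out vertex occupies. The sketch below uses a hypothetical `SoEntrySketch` struct (a reduced mirror of a declaration entry, not the Direct3D structure) and assumes every component is 32 bits wide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mirror of one stream output declaration entry, reduced to the
// fields this sketch needs. It is not the Direct3D structure itself.
struct SoEntrySketch
{
    const char*   semanticName;
    unsigned      semanticIndex;
    unsigned char startComponent;
    unsigned char componentCount;
    unsigned char outputSlot;
};

// Bytes written per vertex to `slot`, assuming each component is 32 bits wide.
std::size_t VertexStride( const std::vector<SoEntrySketch>& decl, unsigned char slot )
{
    std::size_t components = 0;
    for( const SoEntrySketch& entry : decl )
        if( entry.outputSlot == slot )
            components += entry.componentCount;
    return components * 4;
}

// How many whole vertices fit in a buffer of `bufferSize` bytes.
std::size_t VertexCapacity( std::size_t bufferSize, std::size_t stride )
{
    return stride == 0 ? 0 : bufferSize / stride;
}
```

For a declaration with 4, 3 and 2 components in slot 0, the stride is 36 bytes, so a 1,000,000-byte buffer holds 27,777 whole vertices under these assumptions.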
Set The Output Targets
The last step is to set the SO buffers. Data can be streamed out into one or more buffers in memory for use later. This example shows how to create a single buffer that can be used for vertex data as well as for the SO stage to stream data into:
最后一步是设置SO缓存。数据能够输出到一个或多个缓存,被将来的程序使用。这个例子显示了如何创建一个能够用作让SO阶段输出流数据并保存顶点数据的缓存:
ID3D10Buffer *m_pBuffer;
int m_nBufferSize = 1000000;
D3D10_BUFFER_DESC bufferDesc =
{
m_nBufferSize,
D3D10_USAGE_DEFAULT,
D3D10_BIND_VERTEX_BUFFER | D3D10_BIND_STREAM_OUTPUT, // usable as vertex data and as an SO target
0,
0
};
D3D10Device->CreateBuffer( &bufferDesc, NULL, &m_pBuffer );
Create a buffer by calling CreateBuffer. The buffer size is specified as 1,000,000 bytes (about one megabyte) with default usage. Default usage is intended for resources that the GPU reads and writes, which suits a buffer that the SO stage fills. The binding flag identifies the pipeline stage that the resource can be bound to. Any resource used by the SO stage must also be created with the D3D10_BIND_STREAM_OUTPUT bind flag.
调用CreateBuffer创建缓存。默认用途(D3D10_USAGE_DEFAULT)的缓存大小以MB为单位。这种用途(D3D10_USAGE_DEFAULT)适用于希望很少被CPU更新的缓存资源。绑定标记定义了缓存绑定的流水线阶段。所以被SO阶段数用的资源必须使用D3D10_BIND_STREAM_OUTPUT绑定标记创建。
Once the buffer is successfully created, set it to the current device by calling SOSetTargets:
一旦缓存被成功创建,调用SOSetTargets把它设置到当前设备上。
UINT offset[1] = { 0 };
D3D10Device->SOSetTargets( 1, &m_pBuffer, offset );
This call takes the number of buffers, a pointer to the buffers, and an array of offsets (one offset into each of the buffers to the start of the buffer data).
函数调用参数有缓存的输入,缓存指针和一个偏移量的数组(一个偏移量表示对应缓存相对缓存头数据偏移多少)。
Rasterizer Stage
The rasterizer stage transforms primitives to their pixel locations. To do so, the stage clips vertices to the view frustum, sets up the primitives for mapping to the 2D viewport, and determines how to invoke pixel shaders, if any are present. Some of these features are optional (like pixel shaders), however, the rasterizer always performs clipping, a perspective divide to transform the po