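Summary: I. Basic technique. Treat the light source as a camera and render the scene, writing lightWorldViewProjectPos (the z/w value in the light camera's projection space) into a depth map. Note that the light camera's projection matrix should be: D3DXMatrixPerspectiveFovLH(&mLightProjection, D3DX_PI / 2, 1, 0.01, 4000); Then render the scene again with the normal camera, using the depth map. ...
posted @ 2012-05-17 22:21 ActionFG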

 

from cywater:

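The traditional Z-Test actually happens after the pixel shader (PS), so relying on the Z-Test alone does little to speed up rendering. Early-Z Culling (EZC), by contrast, happens after rasterization and before the PS is invoked. EZC compares depth ahead of time: if the test (Z-Func) passes, the PS runs; otherwise the fragment/pixel is skipped. Note that the PS must not modify the depth value, otherwise EZC is disabled.

Depth is therefore compared twice over the whole pipeline: once by EZC and once by the traditional Z-Test. (Note: the difference may be that EZC cannot write depth.)

Besides using EZC to speed up the traditional Z-Test (the hardware does this automatically; I call it the implicit usage), there are currently two explicit, extended usages. One resembles Deferred Shading. For example, when rendering hair, many opaque parts occlude the transparent parts, and EZC can be used to avoid computing the occluded transparent parts (alpha blending is expensive). The other typical application is iteratively solving systems of linear equations in GPGPU: when some matrix elements already meet the requirement (for example, they have converged), their computation can be skipped.

The typical framework for explicit use of EZC is as follows:

Pass1: Prepare the z-buffer. (Note: this costs one more pre-Z pass than ordinary rendering; with Deferred Shading, of course, that extra pass does not arise.)

There are usually two approaches: one is to render the model directly and "simply" (writing only the z-buffer, not the frame buffer); the other is to specify depth/z manually (i.e. as a threshold, as in iteratively solving linear systems).

Pass2: Normal rendering. (Z-Write disabled)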
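The two-pass framework above can be illustrated with a small CPU-side simulation. This is only a sketch under stated assumptions: `count_shader_runs` and its `(pixel, depth)` fragment list are hypothetical names made up for illustration, a smaller depth means nearer, and real hardware performs the early test in the rasterizer rather than in a loop like this.

```python
def count_shader_runs(fragments, use_prepass):
    """Count pixel-shader invocations for a list of (pixel, depth) fragments.

    fragments are processed in submission order; smaller depth = nearer.
    Hypothetical model: the early depth test decides whether the shader runs.
    """
    depth_buffer = {}

    if use_prepass:
        # Pass 1: write depth only, no pixel-shader work is counted.
        for pixel, depth in fragments:
            if depth < depth_buffer.get(pixel, float("inf")):
                depth_buffer[pixel] = depth

    runs = 0
    for pixel, depth in fragments:
        stored = depth_buffer.get(pixel, float("inf"))
        if use_prepass:
            # Pass 2: Z-write disabled, test is "less or equal".
            if depth <= stored:
                runs += 1
        else:
            # Single pass: test against what has been drawn so far, then write.
            if depth < stored:
                runs += 1
                depth_buffer[pixel] = depth
    return runs


# Three fragments on the same pixel, submitted back to front.
back_to_front = [(0, 0.9), (0, 0.5), (0, 0.1)]
without_prepass = count_shader_runs(back_to_front, use_prepass=False)
with_prepass = count_shader_runs(back_to_front, use_prepass=True)
```

For fragments submitted back to front, every fragment passes the depth test without the pre-pass, so the shader runs for all of them; with the pre-pass only the nearest fragment is shaded.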

 

Related material:

Applications of Explicit Early-Z Culling
Explicit Early-Z Culling and Dynamic Flow Control on Graphics Hardware

and Chapter 30 of GPU Gems 2

 

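From the OpenGPU forum:

      In the traditional pipeline, the z test cannot be used to cull the execution of pixel shader fragments. However, many current graphics cards (note: supported since the GeForce 6 series) run the z test once ahead of the pixel shader, an optimization known as Early-Z Culling.
      From the pipeline order, however, it is clear that alpha test constrains the z test: once alpha test is enabled, Early-Z cannot judge correctly for pixels that fail the alpha test. So on some graphics hardware, Early-Z Culling is turned on automatically as soon as alpha test is disabled.
      To get the full benefit of Early-Z Culling, alpha test should generally be performed once during the pre-Z-write pass, with writes to the color target disabled and the pixel shader returning only the alpha value for the alpha test to use.
But a problem follows: for an unknown 3D adapter, the caps give no way to tell whether it supports the Early-Z Culling optimization, so using this technique on hardware that lacks the feature actually reduces efficiency.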

This article was written in springnote.

posted @ 2012-02-06 11:41 ActionFG

Depth Buffer
1、Learning to Love your Z-buffer

Introduction

One of the more common OpenGL programming problems that I see concerns the poor precision of the Z buffer.

Many of the early 3D adaptors for the PC have a 16 bit Z buffer, some others have 24 bits - and the very best have 32 bits. If you are lucky enough to have a 32 bit Z buffer, then Z-precision may not seem to be an issue for you. However, if you expect your program to be portable, you'd better give it some thought.

The precision of Z matters because the Z buffer determines which objects are hidden behind which others - and if you don't have enough precision to resolve the distance between two nearby objects, they will randomly show through each other - sometimes in large zig-zags, sometimes in stripes.

This is commonly called 'flimmering' or 'Z-fighting' and it's very disturbing to the user.

The Near Clip Plane

The near clip plane (zNear for short) is typically set using gluPerspective() or glFrustum() - although it's also possible to set it by setting the GL_PROJECTION matrix directly.

Some graphics programmers call zNear 'hither' and zFar 'yonder'.

Beginners frequently place zNear at a very short distance because they don't want polygons close to the eye to be clipped against the near plane - and because it isn't obvious why you'd want to do anything else.

Positioning of zNear too close to the eye is the cause of flimmering (in almost every case) - and the remainder of this document explains why that is.

The Resolution of Z.

What people often fail to realise is that in nearly all machines, the Z buffer is non-linear. The actual number stored in the Z buffer memory is related to the Z coordinate of the object in this manner:
(PS. When a pixel is written, i.e. written to the AGP register, the value is scaled and biased, so values in the depthTexture lie in the range [0, 1].)

 z_buffer_value = (1<<N) * ( a + b / z )
  Where:
     N = number of bits of Z precision
     a = zFar / ( zFar - zNear )
     b = zFar * zNear / ( zNear - zFar )
     z = distance from the eye to the object
  ...and z_buffer_value is an integer.
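The formula above can be tried out numerically. The following is a minimal sketch, assuming the function name `z_buffer_value` and the example values zNear = 1, zFar = 1000 and N = 16, none of which come from the article.

```python
def z_buffer_value(z, z_near, z_far, bits):
    """Integer depth-buffer value for an object at eye distance z.

    Direct transcription of: (1 << N) * (a + b / z).
    """
    a = z_far / (z_far - z_near)
    b = z_far * z_near / (z_near - z_far)
    return int((1 << bits) * (a + b / z))


# Example values (assumptions for illustration only).
at_near = z_buffer_value(1.0, 1.0, 1000.0, 16)
at_two = z_buffer_value(2.0, 1.0, 1000.0, 16)
at_ten = z_buffer_value(10.0, 1.0, 1000.0, 16)
```

With these example values, an object at z = 2 already maps to slightly more than half of the 65536 available steps, which shows how much of the range is spent right next to the near plane.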

This means that Z (and hence the precision of Z) is proportional to the reciprocal of the z_buffer_value - and hence there is a LOT of precision close to the eye and very little precision off in the distance.

This reciprocal behaviour is somewhat useful because you need objects that are close to the eye to be rendered in great detail - and you need better Z precision for detailed objects.

However, an undesirable consequence of this is that many of your Z buffer's bits are wasted - storing insanely fine detail close to the near clip plane. If you pull the near clip closer to your eye, then ever more bits are dedicated to the task of rendering things that are that close to you, at considerable cost to the precision a bit further out.

It follows that in most cases, flimmering can be greatly reduced - or even eliminated by moving the near clip plane further from your eye.

How Bad Is It?

Talking about how large the error is for absolute values of zNear (in feet, meters, lightyears or angstroms) is meaningless. Given the math used to convert Z into the number in the Z buffer, it's also fairly meaningless to talk about absolute values in "OpenGL units" either.

Instead we have to think about the ratio of distances to objects in your scene to the value of zNear.

This equation applies:

delta = z * z / ( zNear * (1<<N) - z )
  Where:
     N     = number of bits of Z precision
     zNear = distance from eye to near clip plane
     z     = distance from the eye to the object
     delta = the smallest resolvable Z separation at this range.

This equation is approximate - it only applies if zNear is much smaller than zFar - which is true for nearly all applications.
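As a rough check of this equation, the sketch below evaluates it for a 16-bit buffer with zNear = 1. The function name `z_resolution` is a hypothetical helper, and the formula is only meaningful while z is smaller than zNear * (1 << N).

```python
def z_resolution(z, z_near, bits):
    """Smallest resolvable Z separation at distance z (approximate formula)."""
    return z * z / (z_near * (1 << bits) - z)


# Relative error (delta / z) for a 16-bit buffer with zNear = 1.
rel_error_at_3500 = z_resolution(3500.0, 1.0, 16) / 3500.0
rel_error_at_666 = z_resolution(666.0, 1.0, 16) / 666.0
```

The relative error comes out between 5% and 6% at a range of 3500 times zNear, and just over 1% at 666 times zNear.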

For another way to think about this, suppose we choose to think about the range at which there is a n% error in Z due to the precision of the Z buffer. For ease of discussion, I'll call the ratio of that range to the value of zNear 'Zn%'.

Hence, Z5% is the "range at which there is a 5% error in Z" divided by zNear.

For a 16 bit Z buffer, the value of Z5% is about 3500. It varies *slightly* depending on the value of zFar, and for very small values of zFar, it does get a little bigger - but for practical applications, 3500 is a good rule-of-thumb.

What this means in practice, is that if you place zNear at 1 meter (in whatever units your database uses), then when an object is at 3,500 meters, there will be a 5% error in its Z value.

For 16 bit Z:
     Z10%   = ~8000 *
     Z5%    = 3500
     Z1%    = 666
     Z0.1%  = 66
     Z0.01% = 6
  * NB: The larger the range, the more the zFar distance starts to affect the precision - but for most practical applications, it doesn't make enough of a difference to matter.

The table tells us that the value for Z1% is 666 and Z10% is ~8000. So with our 1 meter zNear, we can expect better than 1% precision below 666 meters, better than 5% precision below 3500 meters and better than 10% at under 8000 meters.

Now, if your zNear is at 10cm, you'll see a 5% error at just 350 meters (3500*0.1m) - and out at 3.5km, the error will be around 33% - over a kilometer in error. An airplane flying behind a huge mountain could suddenly pop into view in front of it when we are only a couple of miles away!

(Conclusion: the smaller zNear is, the more likely Z-fighting becomes.)

You can see that the placement of zNear is really critical in a 16 bit Z system.

For a 24 bit Z buffer, the ratios are 256 times larger so Z1% is ~170,000, and Z5% is about a million.

For a 32 bit Z buffer, even Z1% is about 45 million and we are unlikely to care about Z5% and Z10% metrics!

 

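2、gl_FragDepth


gl_FragCoord and gl_FragDepth are, respectively, an input variable and an output variable of the fragment shader. (PS. The DepthTexture can basically be assumed to hold the device-space value (Zc + 1)/2, in [0, 1].)

gl_FragCoord is a vec4 whose four components are x, y, z and 1/w. x and y are the window-relative coordinates of the current fragment; they are not integers, and their fractional part is always 0.5. x - 0.5 and y - 0.5 lie in [0, windowWidth - 1] and [0, windowHeight - 1] respectively. windowWidth and windowHeight are in pixels, i.e. the width and height given to glViewport. w is the w of the point after multiplication by the projection matrix, the value used for the perspective divide. gl_FragCoord.z / gl_FragCoord.w gives approximately the distance between the current fragment and the camera. See Fog in GLSL, page 4.

gl_FragCoord.z is the depth of the current fragment as computed by the fixed-function pipeline. It already accounts for polygon offset and has gone through the projection transform. It lies in [0.0, 1.0]. Visualizing it with gl_FragColor = vec4(vec3(gl_FragCoord.z), 1.0) will most likely show a nearly all-white image: because the transform is non-linear, the depth of most points is very close to 1. Using gl_FragColor = vec4(vec3(pow(gl_FragCoord.z, exp)), 1.0) with a suitable exp reveals the depth gradient from black to white. Points near the viewer are dark, close to 0.0; points far from the viewer are light, close to 1.0. This shows that the right-handed coordinate system used up to that point becomes left-handed after the projection transform. For the depth transform and its precision, see OpenGL FAQ - 12 The Depth Buffer.

According to GLSLangSpec.Full.1.30.08 (p61), gl_FragCoord.z is the result of the fixed-function computation. If the fragment shader does not write gl_FragDepth, this value is used in subsequent processing. OpenGL Shading Language notes (p104) that even assigning gl_FragCoord.z to gl_FragDepth is not guaranteed to produce exactly the same value as the fixed functionality; relative correctness, however, is guaranteed. In addition, once a fragment shader writes gl_FragDepth, it must write it on every branch. So if a shader needs to compute depth itself under some conditions, the correct thing to do under the other conditions is gl_FragDepth = gl_FragCoord.z.

One way to compute gl_FragDepth yourself (an approximation), with parameters near clip plane n, far clip plane f, and eyeDepth, the distance from the camera to the object in camera space (greater than 0):

gl_FragDepth = pow ( f / ( f - n ) - f * n / (( f - n ) * eyeDepth), 15.0 ); (the inner value is the a + b / z from section 1; the exponent 15 is an estimated value)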
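The inner expression (without the pow) and its inverse can be checked on the CPU. This is a minimal sketch: the function names `window_depth` and `eye_depth_from_window` and the values n = 0.1, f = 1000 are assumptions chosen for illustration, and the inverse is obtained by solving the same expression for eyeDepth.

```python
def window_depth(eye_depth, n, f):
    """Non-linear depth in [0, 1]: f / (f - n) - f * n / ((f - n) * eye_depth)."""
    return f / (f - n) - f * n / ((f - n) * eye_depth)


def eye_depth_from_window(d, n, f):
    """Inverse of window_depth: recover the positive eye-space distance."""
    return f * n / (f - d * (f - n))


n, f = 0.1, 1000.0  # example clip planes (assumed values)
d_at_10 = window_depth(10.0, n, f)
recovered = eye_depth_from_window(d_at_10, n, f)
```

With these clip planes, a point only 10 units away (1% of the far distance) already has a depth above 0.99, which is why the unmodified visualization looks almost entirely white.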

 

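3、Recovering world coordinates from the DepthTexture


    // Recover the world-space position from the depth stored in the DepthTexture.
    // uv is the current screen coordinate, in (0, 1).
    // The return type is vec4 because the function returns the world position
    // (the original declaration returned float, which does not match).
    vec4 getWorldPosition(vec2 uv) {
        float depth = texture2D(m_DepthTexture, uv).x;
        // Viewport size in pixels.
        vec2 wh = vec2(g_ViewPort.z - g_ViewPort.x, g_ViewPort.w - g_ViewPort.y);
        // Map screen position and depth from [0, 1] to [-1, 1]; w stays 1.0.
        vec4 screenPos = vec4(gl_FragCoord.x/wh.x, gl_FragCoord.y/wh.y, depth, 1.0) * 2.0 - 1.0;
        vec4 viewPosition = g_ProjectionMatrixInverse * screenPos;

        // Camera-space position (after the perspective divide).
        viewPosition = viewPosition / viewPosition.w;
        // If the distance from the camera to the point is needed instead,
        // it is -viewPosition.z (a positive value).
        // World-space position.
        vec4 worldCoord = g_ViewMatrixInverse * viewPosition;
        return worldCoord;
    }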
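The same unprojection can be verified with a round trip on the CPU. The sketch below is not the shader's code: it assumes an OpenGL-style perspective matrix, uses its analytic inverse, and stands in for the inverse view matrix with a plain camera translation. All function names and numeric values are hypothetical.

```python
import math


def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style perspective matrix as row-major nested lists."""
    t = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [t / aspect, 0.0, 0.0, 0.0],
        [0.0, t, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def inverse_perspective(p):
    """Analytic inverse, valid only for matrices built by perspective()."""
    a, b, c, d = p[0][0], p[1][1], p[2][2], p[2][3]
    return [
        [1.0 / a, 0.0, 0.0, 0.0],
        [0.0, 1.0 / b, 0.0, 0.0],
        [0.0, 0.0, 0.0, -1.0],
        [0.0, 0.0, 1.0 / d, c / d],
    ]


def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component vector."""
    return [sum(m[r][k] * v[k] for k in range(4)) for r in range(4)]


def to_screen(view_pos, proj):
    """Project a camera-space point to (u, v, depth), each in [0, 1]."""
    clip = mat_vec(proj, list(view_pos) + [1.0])
    return [0.5 * clip[i] / clip[3] + 0.5 for i in range(3)]


def to_world(uv_depth, proj_inv, camera_pos):
    """Invert to_screen, then apply a translation-only inverse view transform."""
    ndc = [2.0 * c - 1.0 for c in uv_depth] + [1.0]
    view = mat_vec(proj_inv, ndc)
    view_pos = [view[i] / view[3] for i in range(3)]  # perspective divide
    return [view_pos[i] + camera_pos[i] for i in range(3)]


proj = perspective(60.0, 16.0 / 9.0, 0.1, 1000.0)
screen = to_screen([1.0, -2.0, -10.0], proj)
world = to_world(screen, inverse_perspective(proj), camera_pos=[5.0, 0.0, 0.0])
```

Projecting a camera-space point and unprojecting it again should return the original point shifted by the camera position, which is a quick way to catch sign or range mistakes in the [0, 1] to [-1, 1] mapping.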


posted @ 2012-02-06 11:39 ActionFG
  • It is expected that graphics hardware will have a small number of fixed vector locations for passing vertex inputs. Therefore, the OpenGL Shading language defines each non-matrix input variable as taking up one such vector location. There is an implementation dependent limit on the number of locations that can be used, and if this is exceeded it will cause a link error. (Declared input variables that are not statically used do not count against this limit.) A scalar input counts the same amount against this limit as a vec4, so applications may want to consider packing groups of four unrelated float inputs together into a vector to better utilize the capabilities of the underlying hardware. A matrix input will use up multiple locations. The number of locations used will equal the number of columns in the matrix.

    Rough meaning: the number of (attribute) inputs that can be passed to a vertex shader is limited, and each non-matrix input occupies one vector location. Exceeding the limit causes a link error, so it is best to pack float variables into vectors before passing them in.
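    The counting rule quoted above can be sketched as follows. The table and the function `locations_used` are hypothetical helpers covering only a few float-based types; the actual limit is implementation dependent and is not modelled here.

```python
# Locations per input type, following the quoted rule: one location per
# non-matrix input, one location per column for a matrix.
LOCATIONS_PER_TYPE = {
    "float": 1,
    "vec2": 1,
    "vec3": 1,
    "vec4": 1,
    "mat2": 2,
    "mat3": 3,
    "mat4": 4,
}


def locations_used(input_types):
    """Total vector locations consumed by a list of vertex input types."""
    return sum(LOCATIONS_PER_TYPE[t] for t in input_types)


unpacked = locations_used(["float", "float", "float", "float"])
packed = locations_used(["vec4"])
```

    Four separate float inputs consume four locations, while the same data packed into one vec4 consumes a single location.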

 

  • There is an implementation dependent limit on the amount of storage for uniforms that can be used for each type of shader and if this is exceeded it will cause a compile-time or link-time error. Uniform variables that are declared but not used do not count against this limit. The number of user-defined uniform variables and the number of built-in uniform variables that are used within a shader are added together to determine whether available uniform storage has been exceeded.  (number of uniform variables)

     

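  • If a shader interface block is declared without an instance name, the variables declared inside it are placed in the global namespace. Outside the shader (e.g. in OpenGL API code), a member of a block that has an instance name is identified by the block name plus ".", not by the instance name. For example:
    out Vertex {
        vec4 Position;  // API transform/feedback will use "Vertex.Position"
        vec2 Texture;
    } Coords;           // shader will use "Coords.Position"
    out Vertex2 {
        vec4 Color;     // API will use "Color"
    };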
  • Each of these qualifiers may appear at most once.  If index is specified, location must also be specified.  If index is not specified, the value 0 is used.  For example, in a fragment shader,

    layout(location = 3) out vec4 color;


will establish that the fragment shader output color is assigned to fragment color 3 as the first (index zero) input to the blend equation.  

What does this mean?

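  • Passing variables between vertShader and geomShader: same name, except that the geomShader declares it as an array (with "[ ]"); same type; same qualifiers (smooth, flat, noperspective); the geomShader side adds the "in" qualifier.

    Passing variables between geomShader and fragShader: same name, same type, same qualifiers (the geomShader side adds the "out" qualifier, and inout cannot be used).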

 

  • Input and output types of the geomShader:
         /**
          * Geometry Shader Input Type:
          * GL_POINTS, GL_LINES, GL_LINE_STRIP, GL_LINE_LOOP, GL_TRIANGLES, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN,
          * ARBGeometryShader4.GL_LINES_ADJACENCY_ARB, ARBGeometryShader4.GL_LINE_STRIP_ADJACENCY_ARB,
          * ARBGeometryShader4.GL_TRIANGLES_ADJACENCY_ARB, ARBGeometryShader4.GL_TRIANGLE_STRIP_ADJACENCY_ARB
          */

         /**
          * Geometry Shader Output Type:
          * GL_POINTS, GL_LINE_STRIP, GL_TRIANGLE_STRIP
          */
  • Transform Feedback.
    (If a geometry shader is active, its output primitive type is used instead of the <mode> parameter passed to Begin for the purposes of this error check.)
    Meaning: if a geometry shader is active, then to avoid an INVALID_OPERATION error, the primitive type checked for Transform Feedback is the geometry shader's output primitive type rather than the primitive type passed to Begin (?)
          CapturePrimitiveMode        allowed render primitive modes
          ----------------------      ---------------------------------
          POINTS                      POINTS
          LINES                       LINES, LINE_LOOP, and LINE_STRIP
          TRIANGLES                   TRIANGLES, TRIANGLE_STRIP,
                                      TRIANGLE_FAN, QUADS, QUAD_STRIP,
                                      and POLYGON
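The table and the geometry shader rule quoted above can be expressed as a small lookup. This is a sketch of the check as read from the quoted text, not driver code; `ALLOWED_MODES` and `draw_mode_allowed` are hypothetical names.

```python
# Capture mode -> render primitive modes allowed, copied from the table above.
ALLOWED_MODES = {
    "POINTS": {"POINTS"},
    "LINES": {"LINES", "LINE_LOOP", "LINE_STRIP"},
    "TRIANGLES": {
        "TRIANGLES", "TRIANGLE_STRIP", "TRIANGLE_FAN",
        "QUADS", "QUAD_STRIP", "POLYGON",
    },
}


def draw_mode_allowed(capture_mode, draw_mode, gs_output=None):
    """Return True if the draw would pass the mode check.

    If a geometry shader is active (gs_output is given), its output
    primitive type is checked instead of the mode passed to Begin.
    """
    effective_mode = gs_output if gs_output is not None else draw_mode
    return effective_mode in ALLOWED_MODES[capture_mode]


# Drawing points while capturing triangles fails the check ...
no_gs = draw_mode_allowed("TRIANGLES", "POINTS")
# ... unless a geometry shader turns the points into triangle strips.
with_gs = draw_mode_allowed("TRIANGLES", "POINTS", gs_output="TRIANGLE_STRIP")
```

Under this reading, the mode passed to Begin matters only when no geometry shader is active.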
    


posted @ 2012-02-06 11:38 ActionFG