Guys,
As this issue comes up fairly often I'll try to summarize depth texture support for both ATI and nVidia chipsets. If I get any of nVidia's depth implementation wrong then please let me know :)
Both ATI and nVidia HW support depth textures, although in different ways. The creation of the depth textures themselves is very similar:
* Exposed formats
- ATI exposes two FOURCC codes to create 16- or 24-bit depth textures:
#define FOURCC_DF16 ((D3DFORMAT) MAKEFOURCC('D','F','1','6')) 
#define FOURCC_DF24 ((D3DFORMAT) MAKEFOURCC('D','F','2','4')) 
DF16 is supported on R300 chipsets and up (9500+) while DF24 is supported on RV530 chipsets and up (X1600 and X1900).
- nVidia uses the predefined D3DFMT_D16 and D3DFMT_D24S8 formats.
GeForce3 chipsets and up support those.
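For anyone curious what the two ATI FOURCC codes above actually evaluate to, here is a minimal sketch that reproduces the byte packing of the Windows MAKEFOURCC macro (first character in the low byte) without needing the D3D headers. The names make_fourcc, kFourccDF16 and kFourccDF24 are made up for this illustration and are not part of any SDK.

```cpp
#include <cstdint>

// Same packing as the Windows MAKEFOURCC macro: c0 lands in the lowest byte,
// c3 in the highest. (make_fourcc is a hypothetical helper for illustration.)
constexpr std::uint32_t make_fourcc(char c0, char c1, char c2, char c3) {
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(c0))
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c1)) << 8)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c2)) << 16)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c3)) << 24);
}

// Illustrative constants; in real code these values are cast to D3DFORMAT.
constexpr std::uint32_t kFourccDF16 = make_fourcc('D', 'F', '1', '6');  // 0x36314644
constexpr std::uint32_t kFourccDF24 = make_fourcc('D', 'F', '2', '4');  // 0x34324644
```

These are the raw values that end up in the format argument of CheckDeviceFormat() and CreateTexture() when the FOURCC_DF16/FOURCC_DF24 defines are used.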
In most cases a 16-bit format is enough. There should be sufficient precision as long as your projection matrix is chosen carefully (using a front clip plane value as large as possible) and your Z-range is distributed sensibly. It is strongly recommended to prefer 16-bit shadow maps whenever possible, as they perform better and are more widely supported.
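To put a rough number on the front clip plane advice, the sketch below estimates how much view-space distance a single depth step covers at a given distance from the light. It assumes a standard D3D-style perspective projection where stored depth = zf/(zf-zn) * (1 - zn/z), which is my assumption for a spot-light style shadow map; an orthographic light projection distributes depth linearly and behaves differently. The function name depth_step_at is made up for this example.

```cpp
#include <cmath>

// Approximate view-space size of one depth-buffer step at distance z,
// for a perspective projection with depth = zf/(zf-zn) * (1 - zn/z).
// (depth_step_at is a hypothetical helper; the projection form is an assumption.)
inline double depth_step_at(double z, double z_near, double z_far, int bits) {
    // Number of representable steps across the [0,1] depth range: 2^bits - 1.
    const double levels = std::ldexp(1.0, bits) - 1.0;
    // Derivative of stored depth with respect to view-space z.
    const double slope = (z_far * z_near) / ((z_far - z_near) * z * z);
    // One step of 1/levels in depth corresponds to this much view-space distance.
    return 1.0 / (levels * slope);
}
```

With a far plane at 100 and a sample point at distance 50, moving the front clip plane from 0.1 to 1.0 shrinks the 16-bit step size by roughly a factor of ten, which is why pushing the front plane out is usually the cheapest way to make a 16-bit shadow map workable.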
* To check availability of those formats the CheckDeviceFormat() API should be used. 
- Thus for a 16-bit depth surface you would call for ATI:
hres = d3d->CheckDeviceFormat(Adapter, DeviceType, AdapterFormat, D3DUSAGE_DEPTHSTENCIL, D3DRTYPE_TEXTURE, FOURCC_DF16);
- And for nVidia:
hres = d3d->CheckDeviceFormat(Adapter, DeviceType, AdapterFormat, D3DUSAGE_DEPTHSTENCIL, D3DRTYPE_TEXTURE, D3DFMT_D16);
Note that it is safer to check for nVidia device IDs as well as doing the above check, since nVidia's depth texture functionality relies on "overloading" the meaning of an existing format (one key difference is that sampling from an nVidia depth texture will actually *not* return depth values).
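One way to combine the format checks with the vendor check is sketched below. It is deliberately free of D3D calls so the decision logic stands on its own: the two booleans are meant to be the results of the CheckDeviceFormat() calls above (SUCCEEDED(hres)), and the vendor ID would come from the VendorId field of D3DADAPTER_IDENTIFIER9 as filled in by IDirect3D9::GetAdapterIdentifier(). The enum, the function name and the policy itself are assumptions for illustration, not an official recommendation from either vendor.

```cpp
#include <cstdint>

// Which shadow-map shader path to use (hypothetical names).
enum class DepthTexturePath { None, AtiFetchDepth, NvidiaHardwarePcf };

// PCI vendor ID for nVidia.
constexpr std::uint32_t kVendorNvidia = 0x10DE;

// df16_ok:        CheckDeviceFormat() succeeded for FOURCC_DF16 as a depth texture.
// d16_texture_ok: CheckDeviceFormat() succeeded for D3DFMT_D16 as a depth texture.
inline DepthTexturePath choose_depth_texture_path(std::uint32_t vendor_id,
                                                  bool df16_ok,
                                                  bool d16_texture_ok) {
    // The DF16 FOURCC is vendor-specific, so this sketch treats the format
    // check alone as sufficient for the ATI path (an assumption).
    if (df16_ok) {
        return DepthTexturePath::AtiFetchDepth;
    }
    // D3DFMT_D16 is a standard format whose sampling behavior is "overloaded",
    // so only trust it when the vendor ID also matches.
    if (d16_texture_ok && vendor_id == kVendorNvidia) {
        return DepthTexturePath::NvidiaHardwarePcf;
    }
    return DepthTexturePath::None;
}
```

The result can then drive both the format passed to CreateTexture() and the shader variant that gets compiled.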
* Texture surface creation
Again the only difference between the ATI and nVidia implementations is which format to pass:
- For ATI:
hres = d3ddevice->CreateTexture(ShadowMapWidth, ShadowMapHeight, 1, D3DUSAGE_DEPTHSTENCIL, FOURCC_DF16, D3DPOOL_DEFAULT, &pShadowMap);
- For nVidia:
hres = d3ddevice->CreateTexture(ShadowMapWidth, ShadowMapHeight, 1, D3DUSAGE_DEPTHSTENCIL, D3DFMT_D16,  D3DPOOL_DEFAULT, &pShadowMap);
* The intermediate setup (surface binding, viewport, etc.) should be the same between the two. 
* Once rendering has taken place the depth texture can be used as a normal texture using the SetTexture() API.
* The main difference between ATI and nVidia's depth textures implementations is in the shader to use.
- Sampling from ATI depth textures will return depth values. It is up to the shader to fetch depth samples and to perform comparisons with an incoming z value. This allows more flexibility when choosing the filter kernel to use and the weights to apply to each sample. The X1600 and X1900 support an additional feature called Fetch4 that returns four adjacent depth samples into the RGBA channels of the destination register with a single texture instruction. This enables high-performance shadow maps and/or larger kernels to be used.
- Sampling from nVidia depth textures will return Percentage-Closer-Filtered results as the comparison with an incoming Z value is automatically performed when sampling from depth textures.
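To make the difference concrete, here is a CPU-side sketch of the work an ATI-style shader has to do by hand: take four neighbouring depth samples (for example the four values one Fetch4 instruction returns), compare each against the incoming depth, and blend the pass/fail results with bilinear weights. On the nVidia path the hardware performs this compare-and-filter step during the texture fetch. The function name, the sample ordering and the bias parameter are assumptions for illustration; the real Fetch4 channel ordering should be taken from ATI's documentation, and real shader code would be HLSL rather than C++.

```cpp
#include <array>

// Manual 2x2 percentage-closer filter.
// d:              four stored depths, assumed order: top-left, top-right,
//                 bottom-left, bottom-right (illustrative, not the HW order).
// fx, fy:         fractional position of the lookup inside the 2x2 footprint.
// receiver_depth: depth of the pixel being shaded, in shadow-map space.
// bias:           small offset to avoid self-shadowing.
// Returns the lit fraction in [0,1]: 1 = fully lit, 0 = fully shadowed.
inline float pcf_2x2(const std::array<float, 4>& d, float fx, float fy,
                     float receiver_depth, float bias) {
    // A sample counts as lit when the receiver is not behind the stored depth.
    auto lit = [&](float stored) {
        return (receiver_depth - bias <= stored) ? 1.0f : 0.0f;
    };
    // Filter the comparison results, not the depths themselves.
    const float top    = lit(d[0]) * (1.0f - fx) + lit(d[1]) * fx;
    const float bottom = lit(d[2]) * (1.0f - fx) + lit(d[3]) * fx;
    return top * (1.0f - fy) + bottom * fy;
}
```

Because the comparison is done in shader code on the ATI path, the same structure extends to larger kernels or custom weights simply by fetching more samples.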
It should be fairly straightforward to automate the creation process to cater for ATI's or nVidia's version of depth textures, as this part of the process is very similar in code. The bulk of the work consists of adding #ifdefs to your HLSL shader code in order to support the ATI and nVidia styles of calculating shadow contributions for each pixel. Both vendors have code and shader examples for their respective implementations (along with documentation) on their developer websites.
Two items of note to ensure high performance (based on real-life examples :)):
- Remember to disable color writes entirely when rendering shadow casters into your depth texture. In most cases you're only interested in the contents of the depth texture, even though the runtime still requires a valid binding to a color buffer of the same dimensions as the depth buffer/texture.
"Forgetting" to disable color writes will cause unnecessary color buffer bandwidth to be consumed (it happens).
- About rendering transparent (alpha-tested) shadow casters into your depth textures: make sure to only enable alpha-testing (or texkill if the destination surface cannot be used with D3DUSAGE_QUERY_POSTPIXELSHADER_BLENDING) for primitives that are supposed to be transparent. Leaving alpha-testing on (or using a texkill shader) for all shadow caster objects will defeat early Z advantages, as the pixel shader may get executed before the depth compare takes place.
It can be common to want to use the same flexible shader for all your shadow rendering but it pays to take that extra step :)
Nick
European Developer Relations, ATI Technologies MrT@ati.com 
 
                     
                    
                 
                    
                 
         
                
            