
Notes on a crash caused by D3DX9 in a 32-bit program (using 3 GB of user virtual memory)

2015-12-10 01:14  风恋残雪

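To give our 32-bit program more user-mode virtual address space, we link it with the /LARGEADDRESSAWARE option so that it can use up to 3 GB. Whether the full 3 GB is actually available also depends on the platform, the operating system, and its configuration. The relevant part of Microsoft's documentation is excerpted below for reference; see the official Microsoft site for details.[i]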

 

Limits on memory and address space vary by platform, operating system, and by whether the IMAGE_FILE_LARGE_ADDRESS_AWARE value of the LOADED_IMAGE structure and 4-gigabyte tuning (4GT) are in use. IMAGE_FILE_LARGE_ADDRESS_AWARE is set or cleared by using the /LARGEADDRESSAWARE linker option.

4-gigabyte tuning (4GT), also known as application memory tuning, or the /3GB switch, is a technology (only applicable to 32 bit systems) that alters the amount of virtual address space available to user mode applications. Enabling this technology reduces the overall size of the system virtual address space and therefore system resource maximums. For more information, see What is 4GT.

Limits on physical memory for 32-bit platforms also depend on the Physical Address Extension (PAE), which allows 32-bit Windows systems to use more than 4 GB of physical memory.

Memory and Address Space Limits

The following table specifies the limits on memory and address space for supported releases of Windows. Unless otherwise noted, the limits in this table apply to all supported releases.

Memory type: User-mode virtual address space for each 32-bit process
Limit on X86: 2 GB; up to 3 GB with IMAGE_FILE_LARGE_ADDRESS_AWARE and 4GT
Limit in 64-bit Windows: 2 GB with IMAGE_FILE_LARGE_ADDRESS_AWARE cleared (default); 4 GB with IMAGE_FILE_LARGE_ADDRESS_AWARE set
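
Since everything in the excerpt hinges on the IMAGE_FILE_LARGE_ADDRESS_AWARE bit, it can be useful to confirm that the linker option really ended up in the executable. The following is a minimal sketch, not part of the original investigation: it inspects the Characteristics field of the PE/COFF file header in a byte buffer. The helper names readLE and isLargeAddressAware are hypothetical, and the code assumes the buffer holds the beginning of a PE image.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Value of IMAGE_FILE_LARGE_ADDRESS_AWARE in the PE/COFF Characteristics field.
constexpr std::uint16_t kLargeAddressAware = 0x0020;

// Hypothetical helper: reads an n-byte little-endian integer at offset `off`.
// Returns 0 when the requested range lies outside the buffer.
static std::uint32_t readLE(const std::vector<std::uint8_t>& b,
                            std::size_t off, std::size_t n) {
    if (off > b.size() || n > b.size() - off) return 0;
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i) {
        v |= static_cast<std::uint32_t>(b[off + i]) << (8 * i);
    }
    return v;
}

// Hypothetical helper: true if the image bytes carry the large-address-aware bit.
bool isLargeAddressAware(const std::vector<std::uint8_t>& image) {
    if (readLE(image, 0, 2) != 0x5A4D) return false;       // DOS header magic "MZ"
    const std::size_t pe = readLE(image, 0x3C, 4);          // e_lfanew: offset of the PE header
    if (readLE(image, pe, 4) != 0x00004550) return false;   // signature "PE\0\0"
    // The COFF file header follows the 4-byte signature;
    // Characteristics is its last field, 18 bytes in.
    const std::uint32_t characteristics = readLE(image, pe + 4 + 18, 2);
    return (characteristics & kLargeAddressAware) != 0;
}
```

In practice the buffer would be filled from the .exe on disk. On a machine with the Visual Studio tools, dumpbin /headers reports the same bit as "Application can handle large (>2GB) addresses".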

 

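I won't go over the benefits of having more user virtual address space. Instead, here is a crash we ran into that was fairly hard to track down: while parsing shaders, the engine would crash some of the time. Sample code: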

  

LPD3DXINCLUDE pInclude = NULL;
DWORD dwFlag = 0;
dwFlag |= D3DCOMPILE_PACK_MATRIX_ROW_MAJOR;
LPD3DXBUFFER pErrorBuffer = NULL;
LPD3DXCONSTANTTABLE pConstantTable = NULL;
HRESULT hr = D3DXCompileShader (m_strSource.c_str(), m_strSource.size(), &macroVec[0],
    pInclude, m_strEntryFunc.c_str(), m_strProfile.c_str(), dwFlag,
    &m_pShaderBuffer, &pErrorBuffer, &pConstantTable);

if (FAILED(hr))
{
    // Error handling (pConstantTable is not valid on failure)
} // if

D3DXCONSTANTTABLE_DESC desc;
pConstantTable->GetDesc(&desc);
for (UINT i = 0; i < desc.Constants; ++i)
{
    D3DXCONSTANT_DESC constantDesc;
    D3DXHANDLE handle = pConstantTable->GetConstant(NULL, i);
    // pCount is an in/out parameter: on input it is the capacity of the array
    UINT numCount = 1;
    hr = pConstantTable->GetConstantDesc(handle, &constantDesc, &numCount);
    VERIFY_D3D_RESULT (hr);
    if (constantDesc.RegisterSet == D3DXRS_SAMPLER)
    {
        // Handle the sampler constant
    }
} // for

 

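The crash is on the line hr = pConstantTable->GetConstantDesc(handle, &constantDesc, &numCount);. When I first looked at it, the call stack ended inside D3DX9, where the cause is hard to pin down, so I did not dig any further. The crash turned out to be fairly frequent, though. At first I suspected that the compiled shader was bad, but the compiled binary taken from memory was byte-for-byte identical to the binary generated offline, so the problem was shelved for a while.

Later I caught a crash under the debugger and stepped through the D3DX9 assembly. After a good deal of tracing and comparison, one piece of code looked suspicious: the GetConstantDesc function, declared as STDMETHOD(GetConstantDesc)(THIS_ D3DXHANDLE hConstant, D3DXCONSTANT_DESC *pConstantDesc, UINT *pCount) PURE. Its disassembly is: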

D3DXShader::CConstantTable::GetConstantDesc:
0F52AFF0  mov         edi,edi
0F52AFF2  push        ebp
0F52AFF3  mov         ebp,esp
0F52AFF5  push        esi
0F52AFF6  mov         esi,dword ptr [ebp+10h]
0F52AFF9  push        edi
0F52AFFA  mov         edi,dword ptr [ebp+14h]
0F52AFFD  test        esi,esi
0F52AFFF  jne         D3DXShader::CConstantTable::GetConstantDesc+1Ch (0F52B00Ch)
0F52B001  test        edi,edi
0F52B003  jne         D3DXShader::CConstantTable::GetConstantDesc+1Ch (0F52B00Ch)
0F52B005  mov         eax,8876086Ch
0F52B00A  jmp         D3DXShader::CConstantTable::GetConstantDesc+8Dh (0F52B07Dh)
0F52B00C  mov         ecx,dword ptr [ebp+0Ch]  ; ecx now holds hConstant
0F52B00F  test        ecx,ecx
0F52B011  je          D3DXShader::CConstantTable::GetConstantDesc+15h (0F52B005h)
0F52B013  mov         edx,dword ptr [ebp+8]
0F52B016  mov         eax,dword ptr [edx+4]
0F52B019  or          eax,ecx  ; note this instruction: together with the jge below it tests
; whether the highest bit of the handle in ecx (OR-ed with the value from [edx+4]) is 1.
; If it is 1, the jump is not taken. Otherwise execution jumps to 0F52B021h, which is
; where the crash begins; the crash finally happens in a function call on that path.
0F52B01B  jge         D3DXShader::CConstantTable::GetConstantDesc+31h (0F52B021h)
0F52B01D  neg         ecx  ; why the value is negated here is explained below
0F52B01F  jmp         D3DXShader::CConstantTable::GetConstantDesc+44h (0F52B034h)
0F52B021  lea         eax,[ebp+10h]
0F52B024  push        eax
0F52B025  push        ecx
0F52B026  mov         ecx,edx
0F52B028  call        D3DXShader::CConstantTable::FindConstantByName (0F52AE0Dh)
0F52B02D  test        eax,eax
0F52B02F  js          D3DXShader::CConstantTable::GetConstantDesc+8Dh (0F52B07Dh)
0F52B031  mov         ecx,dword ptr [ebp+10h]
0F52B034  xor         edx,edx
0F52B036  xor         eax,eax
0F52B038  inc         edx
0F52B039  push        ebx
0F52B03A  mov         ebx,ecx
0F52B03C  test        ecx,ecx
0F52B03E  je          D3DXShader::CConstantTable::GetConstantDesc+58h (0F52B048h)
0F52B040  mov         ebx,dword ptr [ebx+24h]
0F52B043  inc         eax
0F52B044  test        ebx,ebx
0F52B046  jne         D3DXShader::CConstantTable::GetConstantDesc+50h (0F52B040h)
0F52B048  pop         ebx
0F52B049  test        edi,edi
0F52B04B  je          D3DXShader::CConstantTable::GetConstantDesc+6Ch (0F52B05Ch)
0F52B04D  cmp         dword ptr [edi],0
0F52B050  je          D3DXShader::CConstantTable::GetConstantDesc+64h (0F52B054h)
0F52B052  mov         edx,dword ptr [edi]
0F52B054  cmp         edx,eax
0F52B056  jbe         D3DXShader::CConstantTable::GetConstantDesc+6Ah (0F52B05Ah)
0F52B058  mov         edx,eax
0F52B05A  mov         dword ptr [edi],eax
0F52B05C  test        esi,esi
0F52B05E  je          D3DXShader::CConstantTable::GetConstantDesc+8Bh (0F52B07Bh)
0F52B060  jmp         D3DXShader::CConstantTable::GetConstantDesc+87h (0F52B077h)
0F52B062  test        edx,edx
0F52B064  je          D3DXShader::CConstantTable::GetConstantDesc+8Bh (0F52B07Bh)
0F52B066  push        esi
0F52B067  call        D3DXShader::CConstant::GetDesc (0F5275B6h)
0F52B06C  test        eax,eax
0F52B06E  js          D3DXShader::CConstantTable::GetConstantDesc+8Dh (0F52B07Dh)
0F52B070  mov         ecx,dword ptr [ecx+24h]
0F52B073  add         esi,30h
0F52B076  dec         edx
0F52B077  test        ecx,ecx
0F52B079  jne         D3DXShader::CConstantTable::GetConstantDesc+72h (0F52B062h)
0F52B07B  xor         eax,eax
0F52B07D  pop         edi
0F52B07E  pop         esi
0F52B07F  pop         ebp
0F52B080  ret         10h

 

Why is the value negated at the neg ecx instruction above? The disassembly of GetConstant makes it clear:

D3DXShader::CConstantTable::GetConstant:
0F52B0D0  mov         edi,edi
0F52B0D2  push        ebp
0F52B0D3  mov         ebp,esp
0F52B0D5  mov         ecx,dword ptr [ebp+0Ch]
0F52B0D8  test        ecx,ecx
0F52B0DA  jne         D3DXShader::CConstantTable::GetConstant+23h (0F52B0F3h)
0F52B0DC  mov         ecx,dword ptr [ebp+10h]
0F52B0DF  mov         eax,dword ptr [ebp+8]
0F52B0E2  cmp         ecx,dword ptr [eax+1Ch]
0F52B0E5  jb          D3DXShader::CConstantTable::GetConstant+1Bh (0F52B0EBh)
0F52B0E7  xor         eax,eax
0F52B0E9  jmp         D3DXShader::CConstantTable::GetConstant+52h (0F52B122h)
0F52B0EB  mov         eax,dword ptr [eax+18h]
0F52B0EE  mov         eax,dword ptr [eax+ecx*4]
0F52B0F1  jmp         D3DXShader::CConstantTable::GetConstant+50h (0F52B120h)
0F52B0F3  mov         edx,dword ptr [ebp+8]
0F52B0F6  mov         eax,dword ptr [edx+4]
0F52B0F9  or          eax,ecx
0F52B0FB  jge         D3DXShader::CConstantTable::GetConstant+31h (0F52B101h)
0F52B0FD  neg         ecx
0F52B0FF  jmp         D3DXShader::CConstantTable::GetConstant+44h (0F52B114h)
0F52B101  lea         eax,[ebp+8]
0F52B104  push        eax
0F52B105  push        ecx
0F52B106  mov         ecx,edx
0F52B108  call        D3DXShader::CConstantTable::FindConstantByName (0F52AE0Dh)
0F52B10D  test        eax,eax
0F52B10F  js          D3DXShader::CConstantTable::GetConstant+17h (0F52B0E7h)
0F52B111  mov         ecx,dword ptr [ebp+8]
0F52B114  push        dword ptr [ebp+10h]
0F52B117  call        D3DXShader::CConstant::GetConstantMember (0F52765Fh)
0F52B11C  test        eax,eax
0F52B11E  je          D3DXShader::CConstantTable::GetConstant+17h (0F52B0E7h)
0F52B120  neg         eax  ; the value about to be returned is negated here, so
; GetConstantDesc has to negate it again to recover the real value. I am not sure
; exactly why it is done this way; if you know, please tell me.
0F52B122  pop         ebp
0F52B123  ret         0Ch

For reference, this is how D3DXHANDLE is declared in the D3DX9 headers:

//----------------------------------------------------------------------------
// D3DXHANDLE:
// -----------
// Handle values used to efficiently reference shader and effect parameters.
// Strings can be used as handles.  However, handles are not always strings.
//----------------------------------------------------------------------------

#ifndef D3DXFX_LARGEADDRESS_HANDLE
typedef LPCSTR D3DXHANDLE;
#else
typedef UINT_PTR D3DXHANDLE;
#endif
typedef D3DXHANDLE *LPD3DXHANDLE;
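
A possible explanation for the negation, consistent with the header comment that strings can be used as handles: the sign bit may be what tells a real handle apart from a name string. Below 2 GB every pointer has the top bit clear, so a negated pointer always has it set, while a plain string pointer never does. Once pointers can lie above 2 GB that assumption no longer holds. The sketch below models this in portable 32-bit arithmetic. It is my reading of the two listings, not documented behavior; the function names are hypothetical, and tableFlags stands in for the undocumented value loaded from [edx+4].

```cpp
#include <cstdint>

// Mirrors `neg eax` at the end of GetConstant: the handle is the negated pointer.
constexpr std::uint32_t encodeHandle(std::uint32_t pointer) {
    return static_cast<std::uint32_t>(0u - pointer);
}

// Mirrors `neg ecx` in GetConstantDesc: negating again recovers the pointer.
constexpr std::uint32_t decodeHandle(std::uint32_t handle) {
    return static_cast<std::uint32_t>(0u - handle);
}

// Mirrors `or eax, ecx` followed by `jge`: when the sign bit of (flags | handle)
// is clear, the value is passed to FindConstantByName as if it were a string.
// `tableFlags` is an assumption standing in for the value at [edx+4].
constexpr bool treatedAsName(std::uint32_t handle, std::uint32_t tableFlags) {
    return ((tableFlags | handle) & 0x80000000u) == 0;
}
```

With a pointer below 2 GB such as 0x10000000, the handle is 0xF0000000 and is recognized as a handle. With a pointer above 2 GB such as 0x90000000, the handle becomes 0x70000000, the sign bit is clear, and the value would be handed to FindConstantByName as a string pointer, which matches the crash path seen in the debugger.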

 

 

The cause is described in the MSDN documentation for D3DXCompileShader:

HRESULT D3DXCompileShader(
_In_ LPCSTR pSrcData,
_In_ UINT srcDataLen,
_In_ const D3DXMACRO *pDefines,
_In_ LPD3DXINCLUDE pInclude,
_In_ LPCSTR pFunctionName,
_In_ LPCSTR pProfile,
_In_ DWORD Flags,
_Out_ LPD3DXBUFFER *ppShader,
_Out_ LPD3DXBUFFER *ppErrorMsgs,
_Out_ LPD3DXCONSTANTTABLE *ppConstantTable
);

 

Parameters

 

ppConstantTable [out]
Type: LPD3DXCONSTANTTABLE*
Returns an ID3DXConstantTable interface, which can be used to access shader constants. This value can be NULL. If you compile your application as large address aware (that is, you use the /LARGEADDRESSAWARE linker option to handle addresses larger than 2 GB), you cannot use this parameter and must set it to NULL. Instead, you must use the D3DXGetShaderConstantTableEx function to retrieve the shader-constant table that is embedded inside the shader. In this D3DXGetShaderConstantTableEx call, you must pass the D3DXCONSTTABLE_LARGEADDRESSAWARE flag to the Flags parameter to specify to access up to 4 GB of virtual address space.[ii]

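In other words, once the 3 GB user virtual address space is enabled, the shader constant table must be obtained with D3DXGetShaderConstantTableEx, passing the D3DXCONSTTABLE_LARGEADDRESSAWARE flag. With that change the crash is fully resolved. I hope this was worth your time and that you got something out of it. If anything in this post is described incorrectly, corrections are welcome.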
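
The change can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's actual code: the function name CompileAndReflect and its parameters are hypothetical, it uses D3DX9's own D3DXSHADER_PACKMATRIX_ROWMAJOR flag for row-major packing, and it only builds on Windows with the DirectX SDK headers and the D3DX9 library available.

```cpp
#include <d3dx9.h>
#include <string>

// Hypothetical helper: compiles a shader in a large-address-aware process and
// walks its constant table the way the documentation quoted above requires.
HRESULT CompileAndReflect(const std::string& source,
                          const std::string& entryFunc,
                          const std::string& profile,
                          const D3DXMACRO* pDefines,
                          LPD3DXBUFFER* ppShader)
{
    LPD3DXBUFFER pErrorBuffer = NULL;

    // ppConstantTable must be NULL when the process is large address aware.
    HRESULT hr = D3DXCompileShader(source.c_str(), static_cast<UINT>(source.size()),
        pDefines, NULL, entryFunc.c_str(), profile.c_str(),
        D3DXSHADER_PACKMATRIX_ROWMAJOR, ppShader, &pErrorBuffer, NULL);
    if (pErrorBuffer != NULL)
    {
        // A real engine would log the compiler messages before releasing them.
        pErrorBuffer->Release();
    }
    if (FAILED(hr))
    {
        return hr;
    }

    // Retrieve the constant table embedded in the compiled shader instead.
    LPD3DXCONSTANTTABLE pConstantTable = NULL;
    hr = D3DXGetShaderConstantTableEx(
        static_cast<const DWORD*>((*ppShader)->GetBufferPointer()),
        D3DXCONSTTABLE_LARGEADDRESSAWARE, &pConstantTable);
    if (FAILED(hr))
    {
        return hr;
    }

    D3DXCONSTANTTABLE_DESC desc;
    hr = pConstantTable->GetDesc(&desc);
    for (UINT i = 0; SUCCEEDED(hr) && i < desc.Constants; ++i)
    {
        D3DXHANDLE handle = pConstantTable->GetConstant(NULL, i);
        D3DXCONSTANT_DESC constantDesc;
        UINT numCount = 1; // capacity of constantDesc on input
        hr = pConstantTable->GetConstantDesc(handle, &constantDesc, &numCount);
        if (SUCCEEDED(hr) && constantDesc.RegisterSet == D3DXRS_SAMPLER)
        {
            // Handle the sampler constant here.
        }
    }

    pConstantTable->Release();
    return hr;
}
```

The caller owns the buffer returned through ppShader and releases it when the shader bytecode is no longer needed.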