
C Performance Tuning: The GCC Compile Option -fomit-frame-pointer

2013-11-26 21:02 islandscape

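  While reading the book 《C程序性能优化》, I came across the author's remark that the gcc compiler option -fomit-frame-pointer can improve program performance. I did not understand why, so I decided to look into it.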

  Suppose we have the following simple program:

#include <stdio.h>

int add(int a, int b)
{
        return a + b;
}

int main()
{
        int sum = 0;
        sum = add(1,2);
        printf("%d\n",sum);
        return 0;
} 

  Compiled without the -fomit-frame-pointer option, the binary disassembles to the following code:

00000000 <add>:
   0:	55                   	push    %ebp
   1:	89 e5                	mov     %esp,%ebp
   3:	8b 45 0c              	mov     0xc(%ebp),%eax
   6:	8b 55 08             	mov     0x8(%ebp),%edx
   9:	01 d0                	add     %edx,%eax
   b:	5d                   	pop     %ebp
   c:	c3                   	ret      

0000000d <main>:
   d:	55                   	push   %ebp
   e:	89 e5                	mov    %esp,%ebp
  10:	83 e4 f0             	and    $0xfffffff0,%esp
  13:	83 ec 20             	sub    $0x20,%esp
  16:	c7 44 24 1c 00 00 00 	movl   $0x0,0x1c(%esp)
  1d:	00 
  1e:	c7 44 24 04 02 00 00 	movl   $0x2,0x4(%esp)
  25:	00 
  26:	c7 04 24 01 00 00 00 	movl   $0x1,(%esp)
  2d:	e8 fc ff ff ff       	call   2e <main+0x21>
  32:	89 44 24 1c          	mov    %eax,0x1c(%esp)
  36:	b8 00 00 00 00       	mov    $0x0,%eax
  3b:	8b 54 24 1c          	mov    0x1c(%esp),%edx
  3f:	89 54 24 04          	mov    %edx,0x4(%esp)
  43:	89 04 24             	mov    %eax,(%esp)
  46:	e8 fc ff ff ff       	call   47 <main+0x3a>
  4b:	b8 00 00 00 00       	mov    $0x0,%eax
  50:	c9                   	leave  
  51:	c3                   	ret

  With the -fomit-frame-pointer option added, the disassembly becomes:

00000000 <add>:
   0:	8b 44 24 08          	mov    0x8(%esp),%eax
   4:	8b 54 24 04          	mov    0x4(%esp),%edx
   8:	01 d0               	add    %edx,%eax
   a:	c3                  	ret    

0000000b <main>:
   b:	55                   	push   %ebp
   c:	89 e5                	mov    %esp,%ebp
   e:	83 e4 f0             	and    $0xfffffff0,%esp
  11:	83 ec 20             	sub    $0x20,%esp
  14:	c7 44 24 1c 00 00 00 	movl   $0x0,0x1c(%esp)
  1b:	00 
  1c:	c7 44 24 04 02 00 00 	movl   $0x2,0x4(%esp)
  23:	00 
  24:	c7 04 24 01 00 00 00 	movl   $0x1,(%esp)
  2b:	e8 fc ff ff ff       	call   2c <main+0x21>
  30:	89 44 24 1c          	mov    %eax,0x1c(%esp)
  34:	b8 00 00 00 00       	mov    $0x0,%eax
  39:	8b 54 24 1c          	mov    0x1c(%esp),%edx
  3d:	89 54 24 04          	mov    %edx,0x4(%esp)
  41:	89 04 24             	mov    %eax,(%esp)
  44:	e8 fc ff ff ff       	call   45 <main+0x3a>
  49:	b8 00 00 00 00       	mov    $0x0,%eax
  4e:	c9                   	leave  
  4f:	c3                   	ret   

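  The code compiled with -fomit-frame-pointer is slightly shorter. The main difference is in add, which no longer sets up and tears down a stack frame (push %ebp and mov %esp,%ebp on entry, pop %ebp on exit), so the caller's frame address is never saved. Skipping those instructions, and leaving ebp available as an ordinary register in such a function, is where the performance gain comes from. main is identical in both listings; it still sets up ebp, presumably because it realigns esp with and $0xfffffff0,%esp and relies on leave to restore it.

  Some background: the stack grows from high addresses toward low addresses, while the heap grows from low addresses toward high addresses. On x86, esp is the stack pointer (the top of the stack) and ebp is the frame base pointer, so within a frame the value of esp is no greater than that of ebp. For a call, the caller first places the arguments on the stack, and the call instruction then pushes the return address. Without a frame pointer, the callee reads its arguments relative to esp, as in 0x4(%esp) and 0x8(%esp); with one, it reads them relative to ebp, as in 0x8(%ebp) and 0xc(%ebp), the extra 4 bytes being the saved ebp. Because -fomit-frame-pointer leaves no chain of saved ebp values on the stack, the sequence of function calls can no longer be traced by walking that chain. My guess is that gcc, VS, and other toolchains record the call sequence in this way.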