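Uneven CPU affinity binding
I recently ran into a problem where I/O threads were bound unevenly across CPU cores. The symptom looked like this: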
top - 10:14:24 up 2 days, 13:42, 13 users, load average: 53.83, 50.37, 48.42
Tasks: 1217 total,   8 running, 1209 sleeping,   0 stopped,   0 zombie
%Cpu0  : 14.7 us, 16.4 sy,  0.0 ni,  2.7 id, 65.6 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu1  : 12.5 us, 12.5 sy,  0.0 ni,  3.4 id, 70.8 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu2  : 11.2 us, 14.2 sy,  0.0 ni,  6.4 id, 67.8 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu3  : 11.0 us, 15.1 sy,  0.0 ni,  4.0 id, 69.2 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu4  : 16.1 us, 79.0 sy,  0.0 ni,  0.7 id,  3.0 wa,  0.0 hi,  1.3 si,  0.0 st
%Cpu5  : 12.1 us, 13.1 sy,  0.0 ni,  0.0 id, 74.2 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu6  : 11.3 us, 12.3 sy,  0.0 ni,  4.3 id, 71.3 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu7  : 12.3 us, 12.6 sy,  0.0 ni,  9.3 id, 65.6 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu8  : 11.8 us, 10.5 sy,  0.0 ni, 24.3 id, 53.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu9  : 12.0 us, 14.0 sy,  0.0 ni, 30.0 id, 43.3 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu10 : 10.7 us, 13.4 sy,  0.0 ni, 27.5 id, 47.3 wa,  0.0 hi,  1.0 si,  0.0 st
%Cpu11 : 10.3 us, 13.6 sy,  0.0 ni, 11.0 id, 64.8 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu12 : 10.3 us, 13.6 sy,  0.0 ni,  8.0 id, 67.8 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu13 :  9.3 us, 13.9 sy,  0.0 ni,  9.6 id, 66.9 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu14 :  9.7 us, 13.8 sy,  0.0 ni,  6.0 id, 69.8 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu15 :  9.1 us, 14.4 sy,  0.0 ni,  8.4 id, 67.4 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu16 :  9.1 us, 12.8 sy,  0.0 ni, 11.7 id, 65.4 wa,  0.0 hi,  1.0 si,  0.0 st
%Cpu17 : 10.7 us, 12.7 sy,  0.0 ni,  7.0 id, 69.2 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu18 : 10.1 us, 13.8 sy,  0.0 ni,  5.7 id, 70.1 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu19 :  9.1 us, 13.8 sy,  0.0 ni,  4.7 id, 71.8 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu20 :  3.1 us,  4.1 sy,  0.0 ni,  6.8 id, 85.8 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu21 :  4.7 us,  6.8 sy,  0.0 ni, 12.2 id, 76.4 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu22 :  4.1 us,  5.1 sy,  0.0 ni, 14.7 id, 76.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu23 :  7.4 us,  9.8 sy,  0.0 ni,  3.0 id, 79.1 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu24 :  7.4 us, 11.8 sy,  0.0 ni,  6.8 id, 73.6 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu25 :  2.4 us,  3.4 sy,  0.0 ni,  7.1 id, 86.9 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu26 :  7.4 us, 10.1 sy,  0.0 ni,  4.7 id, 77.2 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu27 :  3.4 us,  4.0 sy,  0.0 ni, 11.1 id, 81.5 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu28 : 10.9 us, 15.2 sy,  0.0 ni, 21.5 id, 51.3 wa,  0.0 hi,  1.0 si,  0.0 st
%Cpu29 :  5.7 us,  8.4 sy,  0.0 ni, 19.3 id, 66.2 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu30 :  4.1 us,  5.1 sy,  0.0 ni, 23.6 id, 66.9 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu31 :  3.7 us,  8.4 sy,  0.0 ni, 31.4 id, 56.1 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu32 :  6.4 us, 10.0 sy,  0.0 ni, 82.9 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu33 :  2.0 us,  3.7 sy,  0.0 ni, 94.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu34 :  6.0 us,  9.3 sy,  0.0 ni, 84.1 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu35 :  4.4 us,  5.1 sy,  0.0 ni, 89.9 id,  0.7 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu36 : 13.2 us, 12.8 sy,  0.0 ni, 73.6 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
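Oddly, only the lower-numbered CPUs show I/O wait; the last few (Cpu32 to Cpu36) show almost none.
Looking at the thread-binding code, I found: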
cpunum = (unsigned int)GetCpuNum();
if (0 != cpunum) {
    /* Build a one-bit mask for the target core and apply it. */
    cpu_affi = 1 << cpunum;
    set_ret = SetAffinity(cpu_affi);
}
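In principle this looks fine. I had no gdb available to debug it, and it was a colleague who later noticed that cpu_affi was declared as
unsigned int cpu_affi=0;
An unsigned int is 32 bits wide here, so the mask only has bits 0 to 31. At best a thread can be bound to one of the first 32 cores; the cores beyond that can never be selected.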
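To make the width problem concrete, here is a minimal sketch in C. It assumes, as the snippet above suggests, that the affinity value is a plain bitmask in which bit N selects core N. The helper names mask_for_cpu and mask_truncated_to_32 are hypothetical and exist only for illustration; they are not part of the original code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a one-bit affinity mask for a core index.
 * A 64-bit type is used so that cores 32..63 are representable.
 * Shifting by 64 or more would be undefined behavior, hence the assert. */
uint64_t mask_for_cpu(unsigned int cpu)
{
    assert(cpu < 64);
    return UINT64_C(1) << cpu;
}

/* Hypothetical helper: what is left of that mask once it is stored
 * in a 32-bit variable, as happened with cpu_affi. */
uint32_t mask_truncated_to_32(unsigned int cpu)
{
    return (uint32_t)mask_for_cpu(cpu);
}
```

For core 4 both helpers return 16, so the narrow variable is harmless. For core 36 the 64-bit mask has bit 36 set, while the truncated value is 0, meaning no core is selected at all. Note also that in the original expression 1<<cpunum the literal 1 is an int, so a shift count of 32 or more is undefined behavior in C before the assignment even happens. Widening cpu_affi and using a 64-bit constant for the shift covers up to 64 cores, provided SetAffinity accepts a mask of that width, which the original post does not show.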