
【Oracle】High CPU (%sys) Usage On Oracle Linux 6 UEK3 RAC Node

2022-08-02 22:33  abce

High CPU (%sys) Usage On Oracle Linux 6 UEK3 RAC Node (Doc ID 2241615.1)

Symptoms

On an Oracle Linux 6 server running as an Oracle RAC node, high %sys CPU usage can be observed in the top output (75.5%sy):


zzz ***Wed Feb 8 16:25:14 AEDT 2017
top - 16:26:25 up 4 days, 17:34,  6 users,  load average: 120.68, 117.72, 93.18
Tasks: 2251 total,  92 running, 2158 sleeping,   0 stopped,   1 zombie
Cpu(s): 17.1%us, 75.5%sy,  0.0%ni,  4.4%id,  0.7%wa,  0.0%hi,  2.3%si,  0.0%st
Mem:  1056507280k total, 1052865872k used,  3641408k free,  1297960k buffers
Swap: 134217724k total,        0k used, 134217724k free, 760806128k cached

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND          
8671 root      RT   0  825m  95m  57m S 91.9  0.0 827:48.71 osysmond.bin      
1808 oracle    20   0 20.3g 109m  23m R 85.5  0.0   2:06.39 oracle            
32912 oracle    20   0 3407m 115m  28m R 84.8  0.0   5:39.90 oracle            
56256 oracle    20   0 20.2g  28m  22m R 74.5  0.0 225:14.39 oracle            
88043 oracle    20   0 20.3g  82m  29m R 73.8  0.0   4:42.66 oracle            
13444 root      20   0 3349m  48m  18m S 50.4  0.0 161:49.10 klzagent          
33686 oracle    -2   0 3310m 3604 1384 S 48.7  0.0 170:45.79 oracle            
10906 oracle    -2   0 1306m  14m  12m S 47.8  0.0 170:04.91 oracle            
34583 oracle    -2   0 20.2g 3604 1384 S 46.9  0.0 169:49.54 oracle            
34597 oracle    -2   0 20.2g 3588 1384 S 46.2  0.0 170:16.02 oracle            
34227 oracle    -2   0 3310m 3584 1364 S 45.6  0.0 170:34.68 oracle            
34573 oracle    -2   0 20.2g 3608 1388 S 45.1  0.0 169:43.12 oracle            
34577 oracle    -2   0 20.2g 3600 1384 S 45.1  0.0 170:45.48 oracle
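
To track the %sy figure over time instead of reading it off a screen capture, the value can be pulled out of the Cpu(s) summary line. This is a minimal sketch, assuming the older procps layout seen in the capture above (newer top versions print "75.5 sy" without the percent sign and would need a different pattern). The name sys_pct is a hypothetical helper, not part of top.

```shell
# Hypothetical helper: extract the %sy value from a top "Cpu(s):" summary line.
# Assumes the old procps format, e.g. "Cpu(s): 17.1%us, 75.5%sy, ...".
sys_pct() {
    # Capture the number immediately before "%sy" and print only that.
    sed -n 's/.*[ ,]\([0-9.]*\)%sy.*/\1/p'
}

# Sample line copied from the capture above.
line='Cpu(s): 17.1%us, 75.5%sy,  0.0%ni,  4.4%id,  0.7%wa,  0.0%hi,  2.3%si,  0.0%st'
pct=$(printf '%s\n' "$line" | sys_pct)
echo "$pct"   # prints 75.5
```

On a live system the input could come from `top -b -n 1 | sys_pct` instead of a saved line.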

  

Collecting information:

# perf record -a -g

This command keeps running until it is interrupted, so let it collect samples for a while and then press Ctrl+C to stop it.
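
The collection can also be time-boxed so that nobody has to interrupt it by hand: perf record stops when the command it launches exits, so launching `sleep` bounds the sampling window. This is a sketch under assumptions. The function name collect_perf, the 30-second default, and the PERF override variable are illustrative and do not come from the original note.

```shell
# Hypothetical wrapper: sample all CPUs (-a) with call graphs (-g) for a fixed
# number of seconds, writing to the given output file (default: perf.data).
# PERF may be set to an alternative perf binary path; it defaults to "perf".
collect_perf() {
    duration=${1:-30}
    outfile=${2:-perf.data}
    # perf record ends when the child command ("sleep") exits.
    "${PERF:-perf}" record -a -g -o "$outfile" -- sleep "$duration"
}

# Example (run as root on the affected node):
#   collect_perf 60
```

With the default output file, a plain `perf report` afterwards reads the resulting perf.data.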

# perf report


The perf output shows that the CPU is spinning on a lock reached through the symev_write call of a third-party module:

# Samples: 43K of event 'cycles'
# Event count (approx.): 31601940393
#
# Overhead          Command                           Shared Object                                                                        Symbol
# ........  ...............  ......................................  ............................................................................
#
   50.00%           oracle  [kernel.kallsyms]                       [k] __ticket_spin_lock                                                      
                    |
                    --- __ticket_spin_lock
                       |          
                       |--99.67%-- _raw_spin_lock
                       |          |          
                       |          |--99.62%-- symev_deliver
                       |          |          |          
                       |          |          |--97.50%-- symev_fd_event
                       |          |          |          |          
                       |          |          |          |--95.06%-- symev_write
                       |          |          |          |          system_call_fastpath
                       |          |          |          |          __write_nocancel
                       |          |          |          |          |          

The CPU is spinning in the call chain symev_write -> symev_fd_event -> symev_deliver -> _raw_spin_lock.
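
When the report is long, it can help to first list only the top-level overhead rows and then drill into the call graph of the heaviest one. This is a minimal sketch, assuming the text layout shown in the report above. The name top_symbols is a hypothetical helper.

```shell
# Hypothetical helper: keep only the top-level rows of a text-mode perf report,
# i.e. lines that begin with a percentage such as "   50.00%  oracle ...".
# Comment lines ("# ...") and call-graph branches ("|--99.67%-- ...") are dropped.
top_symbols() {
    grep -E '^[[:space:]]*[0-9]+\.[0-9]+%[[:space:]]'
}

# Example:
#   perf report --stdio | top_symbols
```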
List the third-party modules (the Symantec Endpoint Protection client is installed on the server):

# lsmod
...
symap_or_uek_6_3_8_13_16_2_1_x86_64 39003 26
symev_or_uek_6_3_8_13_16_2_1_x86_64 78327 2 symap_or_uek_6_3_8_13_16_2_1_x86_64
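
On a node with many loaded modules, the listing can be narrowed to the modules of interest by name prefix. This is a sketch only. The name mods_with_prefix is a hypothetical helper, and the "sym" prefix is taken from the module names in the listing above.

```shell
# Hypothetical helper: print the names of modules in lsmod-style input whose
# name starts with the given prefix. The lsmod header line ("Module ...") is
# skipped naturally unless the prefix happens to match it.
mods_with_prefix() {
    awk -v p="$1" 'index($1, p) == 1 { print $1 }'
}

# Example:
#   lsmod | mods_with_prefix sym
```

Running the same filter after the fix gives a quick check: it should print nothing once the modules are gone.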

 

Solution:
Remove the third-party module.