
  1. Kernel support
CentOS 6.4 x86_64: supported
# grep -i group /boot/config-2.6.32-358.el6.x86_64
# ==========================================
CONFIG_CGROUP_SCHED=y
CONFIG_CGROUPS=y
# CONFIG_CGROUP_DEBUG is not set
CONFIG_CGROUP_NS=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_MEM_RES_CTLR=y
CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
CONFIG_BLK_CGROUP=y
# CONFIG_DEBUG_BLK_CGROUP is not set
CONFIG_CGROUP_PERF=y
CONFIG_SCHED_AUTOGROUP=y
CONFIG_CFQ_GROUP_IOSCHED=y
CONFIG_NET_CLS_CGROUP=y
CONFIG_NETPRIO_CGROUP=y
# ===========================================
 
In-house compiled 2.6.30 (SINA build): supported
# grep -i cgroup /boot/config-2.6.30-SINA
# ===========================================
CONFIG_CGROUP_SCHED=y
CONFIG_CGROUPS=y
# CONFIG_CGROUP_DEBUG is not set
# CONFIG_CGROUP_NS is not set
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_MEM_RES_CTLR=y
CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
# ===========================================
  
CentOS 5.4 x86_64: not supported
grep -i cgroup /boot/config-2.6.18-164.el5
# (no output: kernel 2.6.18 predates cgroups, which were merged in 2.6.24)
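
On a running kernel, the available subsystems can also be queried directly, without reading the build config:

# lists each compiled-in subsystem with its hierarchy ID, group count and enabled flag
cat /proc/cgroups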
 

Chapter 1. Introduction to Control Groups (Cgroups)

 
Red Hat Enterprise Linux 6 provides a new kernel feature: control groups, which are called by their shorter name cgroups in this guide. Cgroups allow you to allocate resources—such as CPU time, system memory, network bandwidth, or combinations of these resources—among user-defined groups of tasks (processes) running on a system. You can monitor the cgroups you configure, deny cgroups access to certain resources, and even reconfigure your cgroups dynamically on a running system. The cgconfig (control group config) service can be configured to start up at boot time and reestablish your predefined cgroups, thus making them persistent across reboots.
By using cgroups, system administrators gain fine-grained control over allocating, prioritizing, denying, managing, and monitoring system resources. Hardware resources can be smartly divided up among tasks and users, increasing overall efficiency.

1.1. How Control Groups Are Organized

Cgroups are organized hierarchically, like processes, and child cgroups inherit some of the attributes of their parent cgroups. However, the two models also differ.

The Linux Process Model

All processes on a Linux system are child processes of a common parent: the init process, which is executed by the kernel at boot time and starts other processes (which may in turn start child processes of their own). Because all processes descend from a single parent, the Linux process model is a single hierarchy, or tree.

Additionally, every Linux process except init inherits the environment (such as the PATH variable)[1] and certain other attributes (such as open file descriptors) of its parent process.

The Cgroup Model

Cgroups are similar to processes in that:

  • they are hierarchical, and
  • child cgroups inherit certain attributes of their parent cgroup.

The fundamental difference is that many separate hierarchies of cgroups can exist simultaneously on a system. If the Linux process model is a single tree of processes, then the cgroup model is one or more separate, unconnected trees of tasks (i.e. processes).

Multiple separate hierarchies of cgroups are necessary because each hierarchy is attached to one or more subsystems. A subsystem[2] represents a single resource, such as CPU time or memory. Red Hat Enterprise Linux 6 provides ten cgroup subsystems, listed below by name and function.
 

Available Subsystems in Red Hat Enterprise Linux

  • blkio — this subsystem sets limits on input/output access to and from block devices such as physical drives (disk, solid state, USB, etc.).
  • cpu — this subsystem uses the scheduler to provide cgroup tasks access to the CPU.
  • cpuacct — this subsystem generates automatic reports on CPU resources used by tasks in a cgroup.
  • cpuset — this subsystem assigns individual CPUs (on a multicore system) and memory nodes to tasks in a cgroup.
  • devices — this subsystem allows or denies access to devices by tasks in a cgroup.
  • freezer — this subsystem suspends or resumes tasks in a cgroup.
  • memory — this subsystem sets limits on memory use by tasks in a cgroup, and generates automatic reports on memory resources used by those tasks.
  • net_cls — this subsystem tags network packets with a class identifier (classid) that allows the Linux traffic controller (tc) to identify packets originating from a particular cgroup task.
  • net_prio — this subsystem provides a way to dynamically set the priority of network traffic per network interface.
  • ns — the namespace subsystem.


References:

https://access.redhat.com/site/documentation/zh-CN/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/

http://blog.csdn.net/t0nsha/article/details/8511433

http://yukarinpapas.blogspot.com/2011/05/limiting-kvm-guest-network-bandwidth-tc.html

http://doc.opensuse.org/documentation/html/openSUSE/opensuse-kvm/cha.libvirt.connect.html

http://www.linux-magazine.com.br/images/uploads/pdf_aberto/LM_92_69_71_03_redes-cgroups.pdf

http://www.admin-magazin.de/layout/set/print/Das-Heft/2011/06/Cgroups-zur-Ressourcenkontrolle-in-Linux/(offset)/4

http://www.oracle.com/technetwork/cn/articles/servers-storage-admin/resource-controllers-linux-1506602-zhs.html

 

Packages:

[root@localhost ~]# cat /etc/redhat-release
CentOS release 6.4 (Final)
 
# from the default installation repositories
yum install libcgroup libcgroup-pam libcgroup-devel libvirt virt-install virt-manager
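
To make the predefined groups survive reboots (as the introduction above describes), the cgconfig service can be enabled at boot:

# re-create the groups from /etc/cgconfig.conf on every boot
chkconfig cgconfig on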

Configuration file:

[root@localhost ~]# cat /etc/cgconfig.conf
 
mount {
 cpuset = /cgroup/cpuset;
 cpu = /cgroup/cpu;
 cpuacct = /cgroup/cpuacct;
 memory = /cgroup/memory;
 devices = /cgroup/devices;
 freezer = /cgroup/freezer;
 net_cls = /cgroup/net_cls;
 blkio = /cgroup/blkio;
 ns = /cgroup/ns;
 perf_event = /cgroup/perf_event;
 net_prio = /cgroup/net_prio;
}
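
Each line in the mount block is equivalent to mounting that subsystem by hand. For example, the blkio entry corresponds to (a sketch, assuming the target directory exists):

mkdir -p /cgroup/blkio
mount -t cgroup -o blkio blkio /cgroup/blkio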
 
 

Restart and verify:

[root@localhost ~]# /etc/init.d/cgconfig restart
Stopping cgconfig service: [  OK  ]
Starting cgconfig service: [  OK  ]
 
[root@localhost ~]# lssubsys -am
cpuset /cgroup/cpuset
ns /cgroup/ns
cpu /cgroup/cpu
cpuacct /cgroup/cpuacct
memory /cgroup/memory
devices /cgroup/devices
freezer /cgroup/freezer
net_cls /cgroup/net_cls
blkio /cgroup/blkio
perf_event /cgroup/perf_event
net_prio /cgroup/net_prio
 

2. blkio in practice

Use a cgroup to limit dd's write speed to 1 MB/s.

 

# Environment:
# root partition: major/minor 253:0 (/dev/dm-0)
  
# Create:
# Create the group. The name "dd" is unrelated to the process name; it is just easy to remember.
 cgcreate -g blkio:/dd
  
# Set the limit:
# throttle writes to 1 MB/s (1048576 bytes per second)
 cgset -r blkio.throttle.write_bps_device="253:0 1048576" dd
  
# Test:
 
 
# Normal run:
# cgexec automatically adds the PID to /cgroup/blkio/dd/tasks at launch.
 cgexec -g blkio:dd --sticky dd if=/dev/zero of=test bs=1M count=10 oflag=direct
  
# Normal run, variant 2:
# Unlike the first method, this adds the parent shell's PID; because cgroup
# membership is inherited by child processes, this works as well.
 bash -c "echo \$$ > /cgroup/blkio/dd/tasks && dd if=/dev/zero of=test bs=1M count=10 oflag=direct"
   
# Debug run:
# prints additional diagnostic information.
 export CGROUP_LOGLEVEL=DEBUG ; cgexec -g blkio:dd --sticky dd if=/dev/zero of=test bs=1M count=10 oflag=direct
  
# Result
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 10.1189 s, 1.0 MB/s
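
The same throttle can also be set through the cgroup filesystem directly (a sketch, assuming the mount points from /etc/cgconfig.conf above):

echo "253:0 1048576" > /cgroup/blkio/dd/blkio.throttle.write_bps_device
cat /cgroup/blkio/dd/blkio.throttle.write_bps_device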
 
  
   
# Delete:
 cgdelete -r blkio:/dd
 

3. cpu in practice

Use a cgroup to limit the CPU share of a bash script.

 

# Create:
# Create the group. The name is unrelated to the process name; it is just easy to remember.
cgcreate -g cpu:/bash
  
# Set the limit:
# The parent's default share is 1024; giving the child 256 yields about 20% of
# the CPU when both groups contend for the same core.
cgset -r cpu.shares="256" bash
 
  
# Test script:
[root@localhost ~]# cat while.sh
 
#!/bin/bash
while true
do
 true;
done
 
# Copy:
cp  while.sh while2.sh
 
# Run:
# while2.sh has share 1024, so about 80%
taskset -c 0 ./while2.sh
# while.sh has share 256, so about 20%
cgexec -g cpu:bash --sticky taskset -c 0 ./while.sh
   
Formula: 256 / (1024 + 256) = 20%
 
top output:
53824 root 20 0 103m 1144 996 R 79.6 0.1 4:58.21 while2.sh
53822 root 20 0 103m 1152 996 R 19.9 0.1 3:55.88 while.sh
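
cpu.shares is a relative weight that only matters while groups contend for the CPU. On kernels with CFS bandwidth control (backported to later RHEL 6 kernels; availability on a given kernel is an assumption here), an absolute cap can be set instead, a sketch:

# hard-cap the group to 20 ms of CPU time per 100 ms period, i.e. 20% of one core
cgset -r cpu.cfs_period_us="100000" bash
cgset -r cpu.cfs_quota_us="20000" bash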
   
# Delete:
cgdelete -r cpu:/bash
  
 

4. cpuset in practice

Use a cgroup to restrict which CPU cores the sleep processes run on.

 

# Create:
# Create the group. The name is unrelated to the process name; it is just easy to remember.
cgcreate -g cpuset:/bash
  
# Set the limits:
# check how many memory nodes the machine has
numactl --hardware
# a node range must be specified; it controls which memory nodes the tasks may use
cgset -r cpuset.mems="0" bash
# restrict the tasks to cores 0 and 1
cgset -r cpuset.cpus="0,1" bash
 
  
# Test script:
[root@localhost ~]# cat sleeps.sh
#!/bin/bash
sleep 10 &
sleep 11 &
sleep 12 &
sleep 13 &
ps -eo "pid,comm,psr" | grep sleep
 
 
# Run:
# the processes will now land only on the specified cores.
cgexec -g cpuset:bash ./sleeps.sh
   
 
Result:
Before the limit:
[root@localhost ~]# ./sleeps.sh
56044 sleeps.sh 1
56045 sleep 3
56046 sleep 0
56047 sleep 3
56048 sleep 2
 
After the limit:
[root@localhost ~]# ./sleeps.sh
55990 sleeps.sh 1
55991 sleep 0
55992 sleep 1
55993 sleep 0
55994 sleep 1
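
The effective settings can be read back with cgget (a sketch; the expected values follow from the settings above):

cgget -n -g cpuset /bash
# expected output (the kernel displays the cpu list as a range):
# cpuset.cpus: 0-1
# cpuset.mems: 0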
 
     
# Delete:
cgdelete -r cpuset:/bash
   
 

5. cpuacct in practice

Use a cgroup to account for a bash script's CPU usage.

 

# Create:
# Create the group. The name is unrelated to the process name; it is just easy to remember.
cgcreate -g cpuacct:/bash
  
# Setup:
# reset the counters
cgset -r cpuacct.usage="0" bash
 
   
# Test script:
while.sh from section 3
 
# Run:
cgexec -g cpuacct:bash ./while.sh
   
 
Result:
Unit is USER_HZ, normally 100 ticks per second.
[root@localhost bash]# cat /cgroup/cpuacct/bash/cpuacct.stat
user 4591
system 0
  
Unit: nanoseconds.
[root@localhost bash]# cat /cgroup/cpuacct/bash/cpuacct.usage
200925455685
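
A per-CPU breakdown of the same counter is also available:

# nanoseconds of CPU time consumed on each core
cat /cgroup/cpuacct/bash/cpuacct.usage_percpu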
 
     
# Delete:
cgdelete -r cpuacct:/bash
    
 

6. devices in practice

Use a cgroup to deny fdisk access to a block device.

 

Environment:
[root@localhost ~]# ls -l /dev/sda
brw-rw---- 1 root disk 8, 0 Jul 10 17:23 /dev/sda
 
The device under test has major/minor numbers 8:0.
  
 
# Create:
# Create the group. The name is unrelated to the process name; it is just easy to remember.
cgcreate -g devices:/fdisk
  
 
# Initial output:
[root@localhost fdisk]# cgget -n -g devices /fdisk
devices.list: a *:* rwm
devices.deny:
devices.allow:
 
# Set:
cgset -r devices.deny='b 8:0 mrw' fdisk
 
# Output after setting:
[root@localhost fdisk]# cgget -n -g devices /fdisk
devices.list:
devices.deny:
devices.allow:
 
# A note on these three files: additions and removals go through the two API
# files devices.allow and devices.deny; devices.list shows the resulting device
# access list for the current group.
 
  
# Run:
cgexec -g devices:fdisk fdisk -l /dev/sda
   
 
Result:
fdisk runs but produces no output; testing with dd against the device instead fails with an explicit error message.
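
Access can be restored without deleting the group by writing the matching rule back through the allow file (a sketch):

# re-allow read, write and mknod on 8:0 for this group
cgset -r devices.allow='b 8:0 mrw' fdisk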
     
# Delete:
cgdelete -r devices:/fdisk
    
 

7. freezer in practice

Use a cgroup to suspend and resume a bash script.

 

Test script:
[root@localhost ~]# cat while.sh
#!/bin/bash
while true
do
 true;
done
 
 
 
# Create:
# Create the group. The name is unrelated to the process name; it is just easy to remember.
cgcreate -g freezer:/bash
  
# Run: cgexec -g freezer:bash ./while.sh
# Start the process first; you freeze tasks that are already running.
  
# Initial output:
[root@localhost ~]# cgget -n -g freezer /bash
freezer.state: THAWED
 
# Set:
cgset -r freezer.state="FROZEN" bash
  
# Output after setting:
[root@localhost ~]# cgget -n -g freezer /bash
freezer.state: FROZEN
 
  
Result:
Toggling between THAWED (running) and FROZEN (suspended) is visible in the process state of while.sh:
FROZEN (suspended):
root 20123 89.3 0.0 106092 1152 pts/3 D+ 12:56 4:05 /bin/bash ./while.sh
THAWED (running):
root 20123 87.5 0.0 106092 1152 pts/3 R+ 12:56 4:07 /bin/bash ./while.sh
freezer.state has a third value, FREEZING, meaning the transition to frozen is still in progress; it is read-only and cannot be written.
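
To resume the script, write the running state back:

cgset -r freezer.state="THAWED" bash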
 
      
# Delete:
cgdelete -r freezer:/bash
    
 

8. ns overview

A brief note on the ns (namespace) cgroup subsystem:

Provides a simple namespace cgroup subsystem to provide hierarchical naming of sets of namespaces, for instance virtual servers and checkpoint/restart jobs.

 

9. memory in practice

Use a cgroup to limit a program's memory usage.

 

# Create:
# Create the group. The name is unrelated to the process name; it is just easy to remember.
cgcreate -g memory:/test
  
# Set the limit:
# limit memory usage to 10 MB
cgset -r memory.limit_in_bytes="10m" test
  
# Test program:
test_memory.c (listing below)
  
# Run:
# the process will be killed for exceeding the limit.
cgexec -g memory:test valgrind ./test_memory
   
Result:
shown below.
 
# Run, variant 2:
# disable the OOM killer first, then run again.
cgset -r memory.oom_control="1" test
cgexec -g memory:test valgrind ./test_memory
  
# The process now blocks instead of being killed, and signals are ignored while it
# waits. Remedy: re-enable the OOM killer with: cgset -r memory.oom_control="0" test
root 28026 0.8 5.4 432356 103804 pts/3 D+ 17:20 0:02 valgrind ./test_memory
 
 
# This command is also a good way to watch the group's memory activity:
cgget -n -g memory test
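
Note in the OOM log below that memory+swap is effectively unlimited (limit 9007199254740991kB). Since the kernels in section 1 are built with CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y, the combined memory+swap usage can be capped as well, a sketch:

# must be >= memory.limit_in_bytes; caps RAM and swap together
cgset -r memory.memsw.limit_in_bytes="10m" test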
 
# Delete:
cgdelete -r memory:/test
   
test_memory.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>                     /* sleep() */

#define MALLOC_SIZE (1024 * 1024 * 300) /* 300 MB, far above the 10 MB limit */

int main(void)
{
    char *i = NULL;
    long int j;

    i = malloc(MALLOC_SIZE);
    if (i != NULL) {
        /* touch every byte so the pages are actually committed */
        for (j = 0; j < MALLOC_SIZE; j++)
            *(i + j) = 'a';
    }
    sleep(5);

    return 0;
}
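
The binary used in the runs below can be built with gcc:

gcc -o test_memory test_memory.c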
Outside the cgroup:
[root@localhost ~]# valgrind ./test_memory
==27353== Memcheck, a memory error detector
==27353== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==27353== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==27353== Command: ./test_memory
==27353==
==27353== Warning: set address range perms: large range [0x4c26040, 0x17826040) (undefined)
==27353==
==27353== HEAP SUMMARY:
==27353== in use at exit: 314,572,800 bytes in 1 blocks
==27353== total heap usage: 1 allocs, 0 frees, 314,572,800 bytes allocated
==27353==
==27353== LEAK SUMMARY:
==27353== definitely lost: 0 bytes in 0 blocks
==27353== indirectly lost: 0 bytes in 0 blocks
==27353== possibly lost: 314,572,800 bytes in 1 blocks
==27353== still reachable: 0 bytes in 0 blocks
==27353== suppressed: 0 bytes in 0 blocks
==27353== Rerun with --leak-check=full to see details of leaked memory
==27353==
==27353== For counts of detected and suppressed errors, rerun with: -v
==27353== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 6)
 
Inside the cgroup:
[root@localhost ~]# cgexec -g memory:test valgrind ./test_memory
==27548== Memcheck, a memory error detector
==27548== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==27548== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==27548== Command: ./test_memory
==27548==
==27548== Warning: set address range perms: large range [0x4c26040, 0x17826040) (undefined)
Killed
 
 
tailf /var/log/messages
Jul 12 17:05:33 localhost kernel: memcheck-amd64- invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
Jul 12 17:05:33 localhost kernel: memcheck-amd64- cpuset=/ mems_allowed=0
Jul 12 17:05:33 localhost kernel: Pid: 27548, comm: memcheck-amd64- Tainted: G W --------------- 2.6.32-358.11.1.el6.x86_64 #1
Jul 12 17:05:33 localhost kernel: Call Trace:
Jul 12 17:05:33 localhost kernel: [<ffffffff810cb541>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Jul 12 17:05:33 localhost kernel: [<ffffffff8111cd40>] ? dump_header+0x90/0x1b0
Jul 12 17:05:33 localhost kernel: [<ffffffff8121d24c>] ? security_real_capable_noaudit+0x3c/0x70
Jul 12 17:05:33 localhost kernel: [<ffffffff8111d1c2>] ? oom_kill_process+0x82/0x2a0
Jul 12 17:05:33 localhost kernel: [<ffffffff8111d101>] ? select_bad_process+0xe1/0x120
Jul 12 17:05:33 localhost kernel: [<ffffffff8111d942>] ? mem_cgroup_out_of_memory+0x92/0xb0
Jul 12 17:05:33 localhost kernel: [<ffffffff81173594>] ? mem_cgroup_handle_oom+0x274/0x2a0
Jul 12 17:05:33 localhost kernel: [<ffffffff81170fd0>] ? memcg_oom_wake_function+0x0/0xa0
Jul 12 17:05:33 localhost kernel: [<ffffffff81173b79>] ? __mem_cgroup_try_charge+0x5b9/0x5d0
Jul 12 17:05:33 localhost kernel: [<ffffffff81174ef7>] ? mem_cgroup_charge_common+0x87/0xd0
Jul 12 17:05:33 localhost kernel: [<ffffffff81174f88>] ? mem_cgroup_newpage_charge+0x48/0x50
Jul 12 17:05:33 localhost kernel: [<ffffffff81143d5c>] ? handle_pte_fault+0x79c/0xb50
Jul 12 17:05:33 localhost kernel: [<ffffffff811483d6>] ? vma_adjust+0x556/0x5e0
Jul 12 17:05:33 localhost kernel: [<ffffffff8114434a>] ? handle_mm_fault+0x23a/0x310
Jul 12 17:05:33 localhost kernel: [<ffffffff810474c9>] ? __do_page_fault+0x139/0x480
Jul 12 17:05:33 localhost kernel: [<ffffffff8114a8da>] ? do_mmap_pgoff+0x33a/0x380
Jul 12 17:05:33 localhost kernel: [<ffffffff8151364e>] ? do_page_fault+0x3e/0xa0
Jul 12 17:05:33 localhost kernel: [<ffffffff81510a05>] ? page_fault+0x25/0x30
Jul 12 17:05:33 localhost kernel: Task in /test killed as a result of limit of /test
Jul 12 17:05:33 localhost kernel: memory: usage 9768kB, limit 9768kB, failcnt 107723
Jul 12 17:05:33 localhost kernel: memory+swap: usage 202500kB, limit 9007199254740991kB, failcnt 0
Jul 12 17:05:33 localhost kernel: Mem-Info:
Jul 12 17:05:33 localhost kernel: Node 0 DMA per-cpu:
Jul 12 17:05:33 localhost kernel: CPU 0: hi: 0, btch: 1 usd: 0
Jul 12 17:05:33 localhost kernel: CPU 1: hi: 0, btch: 1 usd: 0
Jul 12 17:05:33 localhost kernel: CPU 2: hi: 0, btch: 1 usd: 0
Jul 12 17:05:33 localhost kernel: CPU 3: hi: 0, btch: 1 usd: 0
Jul 12 17:05:33 localhost kernel: Node 0 DMA32 per-cpu:
Jul 12 17:05:33 localhost kernel: CPU 0: hi: 186, btch: 31 usd: 23
Jul 12 17:05:33 localhost kernel: CPU 1: hi: 186, btch: 31 usd: 80
Jul 12 17:05:33 localhost kernel: CPU 2: hi: 186, btch: 31 usd: 129
Jul 12 17:05:33 localhost kernel: CPU 3: hi: 186, btch: 31 usd: 177
Jul 12 17:05:33 localhost kernel: active_anon:40290 inactive_anon:51290 isolated_anon:0
Jul 12 17:05:33 localhost kernel: active_file:79471 inactive_file:22660 isolated_file:0
Jul 12 17:05:33 localhost kernel: unevictable:10973 dirty:0 writeback:1243 unstable:0
Jul 12 17:05:33 localhost kernel: free:234782 slab_reclaimable:18445 slab_unreclaimable:8032
Jul 12 17:05:33 localhost kernel: mapped:7088 shmem:55 pagetables:2308 bounce:0
Jul 12 17:05:33 localhost kernel: Node 0 DMA free:15696kB min:332kB low:412kB high:496kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15300kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jul 12 17:05:33 localhost kernel: lowmem_reserve[]: 0 2004 2004 2004
Jul 12 17:05:33 localhost kernel: Node 0 DMA32 free:923432kB min:44720kB low:55900kB high:67080kB active_anon:161160kB inactive_anon:205160kB active_file:317884kB inactive_file:90640kB unevictable:43892kB isolated(anon):0kB isolated(file):0kB present:2052192kB mlocked:17320kB dirty:0kB writeback:4972kB mapped:28352kB shmem:220kB slab_reclaimable:73780kB slab_unreclaimable:32128kB kernel_stack:2016kB pagetables:9232kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jul 12 17:05:33 localhost kernel: lowmem_reserve[]: 0 0 0 0
Jul 12 17:05:33 localhost kernel: Node 0 DMA: 4*4kB 2*8kB 3*16kB 2*32kB 1*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15696kB
Jul 12 17:05:33 localhost kernel: Node 0 DMA32: 132*4kB 490*8kB 4055*16kB 2295*32kB 1112*64kB 603*128kB 320*256kB 227*512kB 124*1024kB 4*2048kB 73*4096kB = 923440kB
Jul 12 17:05:33 localhost kernel: 107889 total pagecache pages
Jul 12 17:05:33 localhost kernel: 4594 pages in swap cache
Jul 12 17:05:33 localhost kernel: Swap cache stats: add 172777, delete 168183, find 63726/74121
Jul 12 17:05:33 localhost kernel: Free swap = 2268392kB
Jul 12 17:05:33 localhost kernel: Total swap = 2506744kB
Jul 12 17:05:33 localhost kernel: 524272 pages RAM
Jul 12 17:05:33 localhost kernel: 45664 pages reserved
Jul 12 17:05:33 localhost kernel: 87685 pages shared
Jul 12 17:05:33 localhost kernel: 170356 pages non-shared
Jul 12 17:05:33 localhost kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Jul 12 17:05:33 localhost kernel: [27548] 0 27548 113277 1553 0 0 0 memcheck-amd64-
Jul 12 17:05:33 localhost kernel: Memory cgroup out of memory: Kill process 27548 (memcheck-amd64-) score 1000 or sacrifice child
Jul 12 17:05:33 localhost kernel: Killed process 27548, UID 0, (memcheck-amd64-) total-vm:453108kB, anon-rss:4792kB, file-rss:1420kB
 
 

10. perf_event overview

How cgroups are used with perf_event.

From the Red Hat documentation:

When the perf_event subsystem is attached to a hierarchy, all cgroups in that hierarchy can be used to group processes and threads which can then be monitored with the perf tool, as opposed to monitoring each process or thread separately or per-CPU. Cgroups which use the perf_event subsystem do not contain any special tunable parameters other than the common parameters listed in Section 3.12, “Common Tunable Parameters”.
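
A minimal sketch of cgroup-scoped monitoring (it assumes the perf_event mount from /etc/cgconfig.conf above and a perf build with cgroup support; the group name "bash" and the use of while.sh are illustrative):

cgcreate -g perf_event:/bash
cgexec -g perf_event:bash ./while.sh &
# count cycles system-wide, but attribute only those from tasks in the bash group
perf stat -e cycles -a -G bash sleep 5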


For further material on perf, developers who want to dig deeper can look at perf_events_status_update.pdf (a Google presentation) and the links below.

http://web.eece.maine.edu/~vweaver/projects/perf_events/perf_event_open.html

GPU performance monitoring.pdf

perf_events_status_update.pdf

 

11. net_prio in practice (known issues)

Use a cgroup to set the priority of a program's network traffic.

 

# Create:
# Create the groups. The names are unrelated to the process names; they are just easy to remember.
cgcreate -g net_prio:/scp1
cgcreate -g net_prio:/scp2
   
# Set:
# On eth1, give scp1 priority 1 (the lowest here) and scp2 a higher priority of 100.
cgset -r net_prio.ifpriomap="eth1 1" scp1
cgset -r net_prio.ifpriomap="eth1 100" scp2
 
# Output after setting:
[root@localhost scp]# cgget -n -g net_prio /scp1
net_prio.ifpriomap: lo 0
 eth1 1
 ;vdsmdummy; 0
 bond0 0
 bond4 0
 bond1 0
 bond2 0
 bond3 0
 virbr0 0
 virbr0-nic 0
net_prio.prioidx: 2
[root@localhost scp]# cgget -n -g net_prio /scp2
net_prio.ifpriomap: lo 0
 eth1 100
 ;vdsmdummy; 0
 bond0 0
 bond4 0
 bond1 0
 bond2 0
 bond3 0
 virbr0 0
 virbr0-nic 0
net_prio.prioidx: 3
 
 
# Run:
# start iperf -s on 10.217.13.101 first
cgexec -g net_prio:scp1  iperf -t 30 -c 10.217.13.101
cgexec -g net_prio:scp2  iperf -t 30 -c 10.217.13.101
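
Run one after the other like this, the two transfers never compete, so no priority effect can appear. A variation worth trying (an assumption about the intended experiment, not a verified fix) is to run both flows concurrently so they contend for eth1:

cgexec -g net_prio:scp1 iperf -t 30 -c 10.217.13.101 &
cgexec -g net_prio:scp2 iperf -t 30 -c 10.217.13.101 &
wait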
 
Result: none recorded; see the caveat in the section title.
 
     
# Delete:
cgdelete -r net_prio:/scp1
cgdelete -r net_prio:/scp2  
 

12. net_cls in practice

Use a cgroup to tag a program's traffic so tc can shape it.

 

# Create:
# Create the group. The name is unrelated to the process name; it is just easy to remember.
cgcreate -g net_cls:/htb
 
   
# Set:
# net_cls.classid values take the form 0xAAAABBBB, where AAAA is the major handle
# number and BBBB is the minor handle number; 0x100001 therefore means class 10:1.
cgset -r net_cls.classid="0x100001" htb
 
# Output before setting:
[root@virtual scp2]# cgget -n -g net_cls:/htb
net_cls.classid: 0
 
# Output after setting (1048577 is the decimal form of 0x100001):
[root@virtual scp2]# cgget -n -g net_cls:/htb
net_cls.classid: 1048577
 
# Set up the qdisc, class and cgroup filter:
[root@virtual ~]# tc qdisc show
qdisc mq 0: dev em1 root
qdisc mq 0: dev em2 root
 
[root@virtual ~]# tc qdisc add dev em1 root handle 10: htb
  
[root@virtual ~]# tc qdisc show
qdisc htb 10: dev em1 root refcnt 9 r2q 10 default 0 direct_packets_stat 6
qdisc mq 0: dev em2 root
  
[root@virtual ~]# tc class add dev em1 parent 10: classid 10:1 htb rate 40mbit 
 
 
[root@virtual ~]# tc filter add dev em1 parent 10: protocol ip prio 10 handle 1: cgroup
 
# Run:
# start iperf -s on 10.217.13.101 first
cgexec -g net_cls:htb iperf -t 30 -c 10.217.13.101
 
Result (throughput matches the 40mbit class rate):
[root@virtual ~]# cgexec -g net_cls:htb iperf -t 30 -c 10.217.13.101
------------------------------------------------------------
Client connecting to 10.217.13.101, TCP port 5001
TCP window size: 22.9 KByte (default)
------------------------------------------------------------
[ 3] local 10.13.32.71 port 45454 connected with 10.217.13.101 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-30.0 sec 144 MBytes 40.2 Mbits/sec
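
To confirm that the cgroup filter actually matched, the class counters can be checked after the run (a sketch):

# bytes/packets accounted to class 10:1 should match the transfer above
tc -s class show dev em1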
 
     
# Delete:
cgdelete -r net_cls:/htb
  
 