pt-ioprofile

pt-ioprofile是用来观察特定进程的IO信息的。

该脚本是用shell写的,有两方面的作用:

pt-ioprofile does two things: 1) get lsof+strace for -s seconds, 2) aggregate the result. If you specify a FILE, then step 1) is not performed.

因实际依赖的是strace命令,该命令会对进程的性能造成较大的影响。所以,在负载较重的系统中,慎用。

 

使用方法:

直接执行pt-ioprofile即可,默认的是mysqld进程。

# pt-ioprofile

Sun Oct 30 11:08:25 CST 2016
Tracing process ID 1683
     total      pread       read     pwrite      write      fsync       open      close      lseek      fcntl filename
  4.591996   0.000000   0.000000   0.088374   0.000000   4.503148   0.000219   0.000070   0.000000   0.000185 /test/sbtest5.ibd
  2.720128   0.000000   0.000000   0.096999   0.000000   2.622662   0.000254   0.000053   0.000000   0.000160 /test/sbtest6.ibd
  0.009918   0.000193   0.000515   0.000235   0.002752   0.004530   0.000491   0.000182   0.001020   0.000000 /test/#sql-693_153.frm
  0.007725   0.000917   0.000000   0.002182   0.000000   0.003766   0.000285   0.000575   0.000000   0.000000 /test/
  0.004819   0.000162   0.000587   0.000097   0.001044   0.001735   0.000349   0.000157   0.000688   0.000000 /test/sbtest6.frm
  0.004201   0.000104   0.000309   0.000090   0.001048   0.001900   0.000265   0.000152   0.000333   0.000000 /test/sbtest5.frm
  0.000661   0.000092   0.000209   0.000000   0.000000   0.000000   0.000105   0.000091   0.000164   0.000000 /tmp/ibUARemw
  0.000643   0.000075   0.000254   0.000000   0.000000   0.000000   0.000128   0.000044   0.000142   0.000000 /test/sbtest4.frm
  0.000228   0.000000   0.000000   0.000000   0.000000   0.000000   0.000143   0.000085   0.000000   0.000000 /tmp/ib2bbcaJ
  0.000221   0.000000   0.000000   0.000000   0.000000   0.000000   0.000170   0.000051   0.000000   0.000000 /tmp/ibp3wGXR
  0.000219   0.000000   0.000000   0.000000   0.000000   0.000000   0.000168   0.000051   0.000000   0.000000 /tmp/ibEDs8K7

其中,

read:从文件中读出数据。要读取的文件用文件描述符标识,数据读入一个事先定义好的缓冲区。

write:把缓冲区的数据写入文件中。

pread:由于lseek和read调用之间,内核可能会临时挂起进程,所以对同步问题造成了问题,调用pread相当于顺序调用了lseek和read,这两个操作相当于一个捆绑的原子操作。

pwrite:由于lseek和write调用之间,内核可能会临时挂起进程,所以对同步问题造成了问题,调用pwrite相当于顺序调用了lseek 和write,这两个操作相当于一个捆绑的原子操作。

fsync:确保文件所有已修改的内容已经正确同步到硬盘上,该调用会阻塞等待直到设备报告IO完成。

open:打开一个文件,并返回这个文件的描述符。

close:close系统调用用于“关闭”一个文件,close调用终止一个文件描述符以其文件之间的关联。文件描述符被释放,并能够重新使用。

lseek:对文件描述符指定文件的读写指针进行设置,也就是说,它可以设置文件的下一个读写位置。

fcntl:针对(文件)描述符提供控制。

 

看看各参数的作用:

--aggregate

统计的方式,默认是sum,也可指定为avg。

个人感觉avg没太大意义。

short form: -a; type: string; default: sum

The aggregate function, either sum or avg.

If sum, then each cell will contain the sum of the values in it. If avg, then each cell will contain the average of the values in it.

 

--cell

显示的单位,默认是times,即IO操作的时间,也可指定为count(O操作的次数),size(IO操作的大小)

如下所示:

--cell=times

# pt-ioprofile 
Sun Oct 30 11:08:25 CST 2016
Tracing process ID 1683
     total      pread       read     pwrite      write      fsync       open      close      lseek      fcntl filename
  4.591996   0.000000   0.000000   0.088374   0.000000   4.503148   0.000219   0.000070   0.000000   0.000185 /test/sbtest5.ibd
  2.720128   0.000000   0.000000   0.096999   0.000000   2.622662   0.000254   0.000053   0.000000   0.000160 /test/sbtest6.ibd
  0.009918   0.000193   0.000515   0.000235   0.002752   0.004530   0.000491   0.000182   0.001020   0.000000 /test/#sql-693_153.frm
  0.007725   0.000917   0.000000   0.002182   0.000000   0.003766   0.000285   0.000575   0.000000   0.000000 /test/
  0.004819   0.000162   0.000587   0.000097   0.001044   0.001735   0.000349   0.000157   0.000688   0.000000 /test/sbtest6.frm
  0.004201   0.000104   0.000309   0.000090   0.001048   0.001900   0.000265   0.000152   0.000333   0.000000 /test/sbtest5.frm
  0.000661   0.000092   0.000209   0.000000   0.000000   0.000000   0.000105   0.000091   0.000164   0.000000 /tmp/ibUARemw
  0.000643   0.000075   0.000254   0.000000   0.000000   0.000000   0.000128   0.000044   0.000142   0.000000 /test/sbtest4.frm
  0.000228   0.000000   0.000000   0.000000   0.000000   0.000000   0.000143   0.000085   0.000000   0.000000 /tmp/ib2bbcaJ
  0.000221   0.000000   0.000000   0.000000   0.000000   0.000000   0.000170   0.000051   0.000000   0.000000 /tmp/ibp3wGXR
  0.000219   0.000000   0.000000   0.000000   0.000000   0.000000   0.000168   0.000051   0.000000   0.000000 /tmp/ibEDs8K7

--cell=sizes

# pt-ioprofile --cell=sizes
Sun Oct 30 11:27:10 CST 2016
Tracing process ID 1683
     total      pread       read     pwrite      write      fsync       open      close      lseek      fcntl filename
  30425088          0          0   30425088          0          0          0          0          0          0 /test1/sbtest3.ibd
  30425088          0          0   30425088          0          0          0          0          0          0 /test1/sbtest2.ibd
  15745024          0          0   15745024          0          0          0          0          0          0 /test1/sbtest4.ibd
   4194304    2097152          0    2097152          0          0          0          0          0          0 /test1/
     96928       1156       1130        236      18434          0          0          0      75972          0 /test1/#sql-693_164.frm
     92006       1734       1653         97       9217          0          0          0      79305          0 /test1/sbtest3.frm
     92006       1734       1653         97       9217          0          0          0      79305          0 /test1/sbtest2.frm
     70031       1156       1088         97       9217          0          0          0      58473          0 /test1/sbtest4.frm
     21975        578        565          0          0          0          0          0      20832          0 /test1/sbtest1.frm
         0          0          0          0          0          0          0          0          0          0 /tmp/ibyD2qvV
         0          0          0          0          0          0          0          0          0          0 /tmp/ibyc3Nz5
         0          0          0          0          0          0          0          0          0          0 /tmp/ibJI3Ky8
         0          0          0          0          0          0          0          0          0          0 /tmp/ibbLjhsI
         0          0          0          0          0          0          0          0          0          0 /tmp/ib8QdnAm
         0          0          0          0          0          0          0          0          0          0 /tmp/ib7oksHs

--cell=count

# pt-ioprofile --cell=count
Sun Oct 30 11:27:50 CST 2016
Tracing process ID 1683
     total      pread       read     pwrite      write      fsync       open      close      lseek      fcntl filename
       147          0          0         61          0         81          2          1          0          2 /test1/sbtest5.ibd
       146          0          0         61          0         80          2          1          0          2 /test1/sbtest6.ibd
       123          0          0         57          0         61          2          1          0          2 /test1/sbtest7.ibd
        86          4         10          4         42          2          4          4         16          0 /test1/#sql-693_164.frm
        57          4         10          2         21          1          4          3         12          0 /test1/sbtest6.frm
        57          4         10          2         21          1          4          3         12          0 /test1/sbtest5.frm
        56          4         10          2         21          1          3          3         12          0 /test1/sbtest7.frm
        21          2          0          2          0          5          5          7          0          0 /test1/
        14          2          5          0          0          0          1          2          4          0 /tmp/ibtLXAL6
        14          2          5          0          0          0          1          2          4          0 /tmp/ibDei5Io
         2          0          0          0          0          0          1          1          0          0 /tmp/ib2WdhKK
         2          0          0          0          0          0          1          1          0          0 /tmp/ib027CfV
         1          0          0          0          0          0          1          0          0          0 /test1/sbtest4.frm

 

short form: -c; type: string; default: times
The cell contents.
Valid values are:
VALUE CELLS CONTAIN
===== =======================
count Count of I/O operations
sizes Sizes of I/O operations
times I/O operation timing

 

--group-by

分组的单位,默认是filename,即对文件名进行统计,也可指定为all,即对所有操作进行统计,pid,对进程进行统计(不太确定这里的进程指的是什么?怀疑是线程,但线程不可能这么少,具体可参考下面的输出)

--group-by=filename

# pt-ioprofile 
Sun Oct 30 11:08:25 CST 2016
Tracing process ID 1683
     total      pread       read     pwrite      write      fsync       open      close      lseek      fcntl filename
  4.591996   0.000000   0.000000   0.088374   0.000000   4.503148   0.000219   0.000070   0.000000   0.000185 /test/sbtest5.ibd
  2.720128   0.000000   0.000000   0.096999   0.000000   2.622662   0.000254   0.000053   0.000000   0.000160 /test/sbtest6.ibd
  0.009918   0.000193   0.000515   0.000235   0.002752   0.004530   0.000491   0.000182   0.001020   0.000000 /test/#sql-693_153.frm
  0.007725   0.000917   0.000000   0.002182   0.000000   0.003766   0.000285   0.000575   0.000000   0.000000 /test/
  0.004819   0.000162   0.000587   0.000097   0.001044   0.001735   0.000349   0.000157   0.000688   0.000000 /test/sbtest6.frm
  0.004201   0.000104   0.000309   0.000090   0.001048   0.001900   0.000265   0.000152   0.000333   0.000000 /test/sbtest5.frm
  0.000661   0.000092   0.000209   0.000000   0.000000   0.000000   0.000105   0.000091   0.000164   0.000000 /tmp/ibUARemw
  0.000643   0.000075   0.000254   0.000000   0.000000   0.000000   0.000128   0.000044   0.000142   0.000000 /test/sbtest4.frm
  0.000228   0.000000   0.000000   0.000000   0.000000   0.000000   0.000143   0.000085   0.000000   0.000000 /tmp/ib2bbcaJ
  0.000221   0.000000   0.000000   0.000000   0.000000   0.000000   0.000170   0.000051   0.000000   0.000000 /tmp/ibp3wGXR
  0.000219   0.000000   0.000000   0.000000   0.000000   0.000000   0.000168   0.000051   0.000000   0.000000 /tmp/ibEDs8K7

--group-by=all

# pt-ioprofile --group-by=all
Sun Oct 30 11:28:48 CST 2016
Tracing process ID 1683
  8.044078 TOTAL
  7.781329 fsync
  0.245823 pwrite
  0.004445 write
  0.003389 read
  0.002998 open
  0.002324 lseek
  0.001886 close
  0.001679 pread
  0.000205 fcntl

--group-by=pid

# pt-ioprofile --group-by=pid
Sun Oct 30 11:32:29 CST 2016
Tracing process ID 1683
     total      pread       read     pwrite      write      fsync       open      close      lseek      fcntl pid
  5.468350   0.002442   0.004444   0.111051   0.007580   5.320658   0.002614   0.003022   0.016232   0.000307 2055
  1.093664   0.000000   0.000000   0.000000   0.000000   1.093664   0.000000   0.000000   0.000000   0.000000 1735
  0.997042   0.000000   0.000000   0.000000   0.000000   0.997042   0.000000   0.000000   0.000000   0.000000 1738
  0.940387   0.000000   0.000000   0.000000   0.000000   0.940387   0.000000   0.000000   0.000000   0.000000 1736
  0.507274   0.000000   0.000000   0.000000   0.000000   0.507274   0.000000   0.000000   0.000000   0.000000 1737

 

short form: -g; type: string; default: filename
The group-by item.
Valid values are:
VALUE GROUPING
===== ======================================
all Summarize into a single line of output
filename One line of output per filename
pid One line of output per process ID

 

--help

Print help and exit.

 

--profile-pid

指定线程的ID

short form: -p; type: int

The PID to profile, overrides --profile-process.

 

--profile-process

指定线程名

short form: -b; type: string; default: mysqld

The process name to profile.

 

--run-time

执行strace命令的时间,OPT_RUN_TIME就是--run-time指定的值。

源码如下:

 if [ "$proc_pid" ]; then
            echo "Tracing process ID $proc_pid"

            _lsof "$proc_pid" > "$samples" 2>&1
            if [ "$?" -ne "0" ]; then
               echo "Error: could not execute lsof, error code $?"
               EXIT_STATUS=1
               return 1
            fi

            strace -T -s 0 -f -p $proc_pid >> "$samples" 2>&1 &
            if [ "$?" -ne "0" ]; then
               echo "Error: could not execute strace, error code $?"
               EXIT_STATUS=1
               return 1
            fi
            strace_pid=$!
            # sleep one second then check to make sure the strace is
            # actually running
            sleep 1
            ps -p $strace_pid > /dev/null 2>&1
            if [ "$?" -ne "0" ]; then
               echo "Cannot find strace process" >&2
               tail "$samples" >&2
               EXIT_STATUS=1
               return 1
            fi
            # sleep for interval -1, since we did a one second sleep
            # before checking for the PID of strace
            if [ $((${OPT_RUN_TIME}-1)) -gt 0 ]; then
                sleep $((${OPT_RUN_TIME}-1))
            fi

            kill -s 2 $strace_pid

            sleep 1
            kill -s 15 $strace_pid 2>/dev/null

            # Sometimes strace leaves threads/processes in T status.
            kill -s 18 $proc_pid

            # Summarize the output we just generated.
            tabulate_strace "$samples" > $PT_TMPDIR/tabulated_samples

 

type: int; default: 30

How long to profile.

 

--save-samples

将strace和lsof获取的结果保存到指定的文件中

type: string

Filename to save samples in; these can be used for later analysis.

 

--version

Print the tool’s version and exit.

 

参考

1. http://blog.csdn.net/zongcai249/article/details/17598411

2. http://www.cnblogs.com/yanlingyin/archive/2012/08/04/2617209.html

3. http://blog.csdn.net/cywosp/article/details/8767327

posted @ 2016-10-31 18:51  iVictor  阅读(3223)  评论(0编辑  收藏  举报