Ceph Performance Testing Tools and Methods

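Preface

    This article covers performance and stress testing of block devices on a Ceph cluster. The scenario is the virtual machines on the compute nodes of a cloud platform (OpenStack) reading from and writing to their cloud disks (Ceph RBD storage). The tests are driven by Ansible together with a self-written test tool built on fio. The overall structure and scenario are shown in the figure below: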

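Testing RBD block device performance with fio

    First, a look at the performance testing tools that ship with Ceph, such as rados bench, rados load-gen, and rbd bench-write. A brief trial of each suggested that they are not complete enough for this purpose, so they are not covered in detail here; fio was chosen for the tests instead.

    When testing with fio, either libaio or rbd can be selected as the I/O engine. libaio is Linux native asynchronous I/O; it requires the Ceph block device to be mapped on the compute node so that fio can use it (kernel RBD mode). With rbd as the engine, fio accesses the storage through librbd, which is the approach generally used for OpenStack virtual machines, as shown in the figure below:

    This article uses rbd as fio's I/O engine, because it is closer to the real usage scenario and makes testing more convenient. However, fio is not built with the rbd engine by default, so it has to be recompiled. On CentOS the steps are as follows: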

$ yum install boost-devel
$ yum install librbd1-devel

$ git clone git://git.kernel.dk/fio.git
$ cd fio
$ ./configure
[...]
Rados Block Device engine     yes
rbd_poll                      yes
[...]
$ make

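    As the configure output shows, this fio build supports the rbd engine. In this mode fio connects to the Ceph cluster using the ceph.conf configuration file.

    Next, fio needs a job file to execute. An example follows: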
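    Besides reading the configure output, the built binary can be asked which engines it contains. The following is a minimal sketch, assuming the binary is ./fio in the current directory and that `fio --enghelp` prints one engine name per line; the helper name hasEngine is hypothetical and not part of the original tool.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// hasEngine reports whether engine appears as a whole line (ignoring
// surrounding whitespace) in the text printed by `fio --enghelp`.
// Assumption: fio lists one engine name per line.
func hasEngine(enghelp, engine string) bool {
	for _, line := range strings.Split(enghelp, "\n") {
		if strings.TrimSpace(line) == engine {
			return true
		}
	}
	return false
}

func main() {
	// Assumption: the freshly built binary is ./fio in the current directory.
	out, err := exec.Command("./fio", "--enghelp").CombinedOutput()
	if err != nil {
		fmt.Println("could not run fio:", err)
		return
	}
	fmt.Println("rbd engine available:", hasEngine(string(out), "rbd"))
}
```

    Matching whole lines rather than substrings keeps an engine with a longer name from being mistaken for rbd.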

$ cat write_4k.fio
[write-4K]
description="write test with block size of 4K"
ioengine=rbd
clientname=admin
pool=rbd
rbdname=test
iodepth=32
runtime=120
rw=write #write means sequential write
bs=4K
$ rbd -p rbd create --size 20480 test

$ ./fio write_4k.fio
write-4K: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=32
fio-3.7-65-gb8c7
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=20.4MiB/s][r=0,w=5217 IOPS][eta 00m:00s]
write-4K: (groupid=0, jobs=1): err= 0: pid=18306: Thu Jul 26 17:01:28 2018
  Description  : ["write test with block size of 4K"]
  write: IOPS=5443, BW=21.3MiB/s (22.3MB/s)(2552MiB/120008msec)
    slat (nsec): min=1184, max=327682, avg=6526.40, stdev=4958.16
    clat (usec): min=1295, max=129758, avg=5868.96, stdev=3117.13
     lat (usec): min=1310, max=129759, avg=5875.49, stdev=3117.19
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    4], 10.00th=[    5], 20.00th=[    5],
     | 30.00th=[    5], 40.00th=[    6], 50.00th=[    6], 60.00th=[    6],
     | 70.00th=[    7], 80.00th=[    7], 90.00th=[    8], 95.00th=[    9],
     | 99.00th=[   11], 99.50th=[   12], 99.90th=[   64], 99.95th=[   71],
     | 99.99th=[  111]
   bw (  KiB/s): min=14416, max=27680, per=99.99%, avg=21769.76, stdev=2257.58, samples=240
   iops        : min= 3604, max= 6920, avg=5442.40, stdev=564.40, samples=240
  lat (msec)   : 2=0.11%, 4=9.90%, 10=88.57%, 20=1.19%, 50=0.10%
  lat (msec)   : 100=0.10%, 250=0.02%
  cpu          : usr=4.62%, sys=4.62%, ctx=493577, majf=0, minf=27
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,653235,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=21.3MiB/s (22.3MB/s), 21.3MiB/s-21.3MiB/s (22.3MB/s-22.3MB/s), io=2552MiB (2676MB), run=120008-120008msec

Disk stats (read/write):
    dm-0: ios=0/5, merge=0/0, ticks=0/8, in_queue=8, util=0.01%, aggrios=0/8, aggrmerge=0/0, aggrticks=0/8, aggrin_queue=8, aggrutil=0.01%
  vda: ios=0/8, merge=0/0, ticks=0/8, in_queue=8, util=0.01%

 

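Writing a performance testing tool based on fio

    As mentioned at the start, this test simulates the virtual machines on the compute nodes of a cloud platform (OpenStack) reading from and writing to their cloud disks (Ceph RBD storage). Each fio process is assumed to stand for one virtual machine, and accessing a Ceph block device through the rbd engine is treated as analogous to a virtual machine accessing its cloud disk. The tool implements the following functions:

    1) Invoke fio to run the performance test (the fio binary compiled with rbd engine support must be in the same directory as the tool's executable)

    2) Run in parallel, with a configurable number of parallel processes and a limit on the maximum number of host CPU cores used; record the start and end time of the test

    3) Define the fio job file as a template and generate a job file for each fio process so that each one operates on its own RBD volume (each compute node must have a distinct hostname; volumes are named $hostname-$n, where n runs from 0 to the number of parallel processes minus one; the required volumes must be created before the test starts)

    4) Save the result files of each fio process in separate, categorized locations

    5) Parse, aggregate, and display the test results (currently only IOPS and bandwidth are considered)

    Finally, the finished tool is distributed and executed through Ansible, which runs it concurrently on multiple compute nodes and collects the results.

    Go makes it easy to control parallel execution, so the tool is written in Go. The code is as follows: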
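    Item 3 requires the volumes to exist before the test starts. The naming rule can be sketched as follows; the helper volumeNames and the parallel count of 4 are illustrative assumptions, while the pool name rbd and the size 20480 are taken from the earlier rbd create example. The program only prints the commands; it does not create anything.

```go
package main

import (
	"fmt"
	"os"
)

// volumeNames returns the image names host-0 ... host-(jobs-1),
// matching the $hostname-$n rule described in item 3.
func volumeNames(host string, jobs int) []string {
	names := make([]string, 0, jobs)
	for n := 0; n < jobs; n++ {
		names = append(names, fmt.Sprintf("%s-%d", host, n))
	}
	return names
}

func main() {
	host, err := os.Hostname()
	if err != nil {
		fmt.Println("cannot read hostname:", err)
		return
	}
	// Assumption: 4 parallel fio processes on this node.
	for _, name := range volumeNames(host, 4) {
		fmt.Printf("rbd -p rbd create --size 20480 %s\n", name)
	}
}
```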

package main

import (
    "fmt"
    "github.com/bitly/go-simplejson"
    "github.com/urfave/cli"
    "io"
    "io/ioutil"
    "os"
    "os/exec"
    "runtime"
    "text/template"
    "time"
)

var ResultsDir string = "results"

type Options struct {
    FioJobFile string
    GoMaxProcs int
    GoJobs     int
}

type rbd struct {
    Name string
}

func main() {

    var options Options

    app := cli.NewApp()
    app.Version = "dev"
    app.Flags = []cli.Flag{
        cli.StringFlag{
            Name:        "job-file",
            Usage:       "Location of fio job file",
            Destination: &options.FioJobFile,
        },
        cli.IntFlag{
            Name:        "max-cpus",
            Usage:       "Max cpus for goroutines",
            Destination: &options.GoMaxProcs,
        },
        cli.IntFlag{
            Name:        "go-jobs",
            Usage:       "Numbers of goroutine",
            Destination: &options.GoJobs,
        },
    }

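    app.Action = func(c *cli.Context) error {
        return run(options)
    }

    // Report any error returned by run and exit with a non-zero status.
    if err := app.Run(os.Args); err != nil {
        fmt.Println(err)
        os.Exit(1)
    }

}

func run(options Options) error {

    // The job file, the CPU limit and the number of jobs are all required.
    if options.FioJobFile == "" || options.GoMaxProcs == 0 || options.GoJobs == 0 {
        return fmt.Errorf("options job-file, max-cpus and go-jobs must all be set")
    }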

    runtime.GOMAXPROCS(options.GoMaxProcs)

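    // Create results/<job-file>_proc_<n>; MkdirAll also creates the parent directory.
    resultDir := ResultsDir + "/" + options.FioJobFile + "_proc_" + fmt.Sprintf("%d", options.GoJobs)
    if err := os.MkdirAll(resultDir, os.ModePerm); err != nil {
        return err
    }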

    chs := make([]chan int, options.GoJobs)

    fmt.Println("ceph test started at ", time.Now().Format("2006-01-02 15:04:05"))

    for i := 0; i < options.GoJobs; i++ {
        chs[i] = make(chan int)
        go test(chs[i], i, options)
    }

    for _, ch := range chs {
        <-ch
    }

    fmt.Println("ceph test ended at ", time.Now().Format("2006-01-02 15:04:05"))

    analyse(options)

    return nil
}

func analyse(options Options) {
    // Bandwidth is printed in MiB/s because bw_bytes is divided by 1024*1024 below.
    fmt.Println("Process\t JobName\t Read_IOPS\t Read_BW(MiB/s)\t Write_IOPS\t Write_BW(MiB/s)")
    fmt.Println("------------------------------------------------------------------------------------------")
    hostName, _ := os.Hostname()
    for i := 0; i < options.GoJobs; i++ {
        b, err := ioutil.ReadFile(ResultsDir + "/" + options.FioJobFile + "_proc_" +
            fmt.Sprintf("%d", options.GoJobs) + "/" + "test_result_" +
            hostName + "_proc_" + fmt.Sprintf("%d", i))

        if err != nil {
            fmt.Println(err)
            break
        }
        js, js_err := simplejson.NewJson(b)
        if js_err != nil {
            fmt.Println(js_err)
            break
        }
        jobsArr, _ := js.Get("jobs").Array()
        for j := range jobsArr {
            job := js.Get("jobs").GetIndex(j)
            jobname := job.Get("jobname").MustString()
            read_bw_bytes := job.Get("read").Get("bw_bytes").MustFloat64()
            read_iops := job.Get("read").Get("iops").MustFloat64()
            write_bw_bytes := job.Get("write").Get("bw_bytes").MustFloat64()
            write_iops := job.Get("write").Get("iops").MustFloat64()
            fmt.Printf("proc_%d\t %s\t %8.2f\t %8.2f\t %8.2f\t %8.2f\n",
                i, jobname, read_iops, read_bw_bytes/1024/1024,
                write_iops, write_bw_bytes/1024/1024)
        }

    }
}

func test(ch chan int, i int, options Options) {

    // Signal completion even when an error causes an early return,
    // so that run() does not block forever on this channel.
    defer func() { ch <- 1 }()

    hostName, _ := os.Hostname()
    newFile := ResultsDir + "/" + options.FioJobFile + "_proc_" + fmt.Sprintf("%d", options.GoJobs) +
        "/" + "job_file_" + hostName + "_proc_" + fmt.Sprintf("%d", i)

    // Copy the job template next to the results, then parse the copy.
    _, err := copyFile(options.FioJobFile, newFile)
    if err != nil {
        fmt.Println(err.Error())
        return
    }

    tmpl, parse_err := template.ParseFiles(newFile)
    if parse_err != nil {
        fmt.Println("error is ", parse_err)
        return
    }

    // Render the template over the copy. O_TRUNC discards the old content, so
    // no template bytes are left behind when the rendered job file is shorter.
    rbdName := hostName + "-" + fmt.Sprintf("%d", i)
    r := rbd{Name: rbdName}
    f, err := os.OpenFile(newFile, os.O_WRONLY|os.O_TRUNC, 0755)
    if err != nil {
        fmt.Println("error is ", err)
        return
    }
    err = tmpl.Execute(f, r)
    f.Close() // close the job file before fio reads it
    if err != nil {
        fmt.Println("error is ", err)
        return
    }

    // fio creates the result file itself through --output.
    resFile := ResultsDir + "/" + options.FioJobFile + "_proc_" + fmt.Sprintf("%d", options.GoJobs) +
        "/" + "test_result_" + hostName + "_proc_" + fmt.Sprintf("%d", i)
    cmdName := "./fio"
    cmdArgs := []string{newFile, "--output", resFile, "--output-format", "json"}
    err = executeCommand(cmdName, cmdArgs)
    if err != nil {
        fmt.Println("error is ", err)
    }

}

func executeCommand(cmdName string, cmdArgs []string) error {
    cmd := exec.Command(cmdName, cmdArgs...)
    // fio writes its results to the --output file; only stderr is forwarded here.
    stderr, err := cmd.StderrPipe()
    if err != nil {
        return fmt.Errorf("error getting stderr from cmd '%v' %v", cmd, err)
    }
    if err = cmd.Start(); err != nil {
        return fmt.Errorf("error starting cmd '%v' %v", cmd, err)
    }
    // Drain the pipe before calling Wait, which closes it.
    printLogs(stderr)
    return cmd.Wait()
}

func printLogs(r io.Reader) {
    buf := make([]byte, 80)
    for {
        n, err := r.Read(buf)
        if n > 0 {
            fmt.Print(string(buf[0:n]))
        }
        if err != nil {
            break
        }
    }
}

func copyFile(src, des string) (w int64, err error) {
    srcFile, err := os.Open(src)
    if err != nil {
        // Return instead of continuing with a nil file handle.
        return 0, err
    }
    defer srcFile.Close()

    desFile, err := os.Create(des)
    if err != nil {
        return 0, err
    }
    defer desFile.Close()

    return io.Copy(desFile, srcFile)
}
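    The analyse function prints one line per job. When a single figure per node is wanted, the per-process results can also be summed. The following is a minimal sketch using the standard encoding/json package instead of go-simplejson; the struct and function names are hypothetical, the sample JSON is made up for illustration, and only the jobs, jobname, write, iops and bw_bytes fields already read by the tool above are assumed to exist.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ioStats holds the two metrics the tool reports for one direction.
type ioStats struct {
	IOPS    float64 `json:"iops"`
	BWBytes float64 `json:"bw_bytes"`
}

// fioResult mirrors only the parts of fio's JSON output used here.
type fioResult struct {
	Jobs []struct {
		JobName string  `json:"jobname"`
		Read    ioStats `json:"read"`
		Write   ioStats `json:"write"`
	} `json:"jobs"`
}

// sumWrite adds up write IOPS and write bandwidth (bytes/s)
// over every job in every result document.
func sumWrite(results [][]byte) (iops, bwBytes float64, err error) {
	for _, raw := range results {
		var r fioResult
		if err = json.Unmarshal(raw, &r); err != nil {
			return 0, 0, err
		}
		for _, j := range r.Jobs {
			iops += j.Write.IOPS
			bwBytes += j.Write.BWBytes
		}
	}
	return iops, bwBytes, nil
}

func main() {
	// Made-up sample data standing in for two test_result_* files.
	samples := [][]byte{
		[]byte(`{"jobs":[{"jobname":"write-4K","write":{"iops":5000,"bw_bytes":20480000}}]}`),
		[]byte(`{"jobs":[{"jobname":"write-4K","write":{"iops":4000,"bw_bytes":16384000}}]}`),
	}
	iops, bw, err := sumWrite(samples)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("total write IOPS=%.0f, BW=%.2f MiB/s\n", iops, bw/1024/1024)
}
```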

    The required job template is shown below. At run time, rbdname is replaced with the actual volume name (in the form $hostname-$n):

$ cat write_4k.fio
[write-4K]
description="write test with block size of 4K"
ioengine=rbd
clientname=admin
pool=rbd
rbdname={{.Name}}
iodepth=32
runtime=120
rw=write #write: sequential write, randwrite: random write, read: sequential read, randread: random read
bs=4K
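    The substitution of {{.Name}} can be tried in isolation. The following is a minimal sketch that renders a shortened copy of the template into memory with text/template; the function renderJob and the volume name node1-0 are illustrative assumptions, and the rbd struct matches the one used by the tool.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// rbd carries the value substituted for {{.Name}} in the job template.
type rbd struct {
	Name string
}

// jobTemplate is a shortened copy of the job template, without comments.
const jobTemplate = `[write-4K]
ioengine=rbd
clientname=admin
pool=rbd
rbdname={{.Name}}
iodepth=32
runtime=120
rw=write
bs=4K
`

// renderJob parses tmplText and fills in the volume name.
func renderJob(tmplText, volume string) (string, error) {
	tmpl, err := template.New("job").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, rbd{Name: volume}); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// node1-0 is a hypothetical volume name following the $hostname-$n rule.
	out, err := renderJob(jobTemplate, "node1-0")
	if err != nil {
		fmt.Println("render error:", err)
		return
	}
	fmt.Print(out)
}
```

    Rendering into a buffer first means a broken template is detected before any job file on disk is touched.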

    go build produces an executable, named cephtest here. Running it gives the following result:

    Using Ansible is not covered here. An example run is shown in the figure below:

 

 

posted on 2018-07-27 16:17 by Hindsight_Tan
