[Done] Linux System Performance Benchmarking

[ CPU Info ]

* cat /proc/cpuinfo

[ Ali ECS - 1 CPU ]
model name      : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
cpu MHz         : 2600.060
siblings        : 1
cpu cores       : 1
bogomips        : 5200.12

[ Lenovo Desktop PC ]
model name	: Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
cpu MHz		: 799.875
siblings	: 4
cpu cores	: 2
bogomips	: 7183.37
[ Ali ECS - 1 CPU ]

root@db:~# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping        : 4
microcode       : 0x428
cpu MHz         : 2600.060
cache size      : 20480 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl pni ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm
bogomips        : 5200.12
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:
[ Lenovo Desktop ]

eric@eric-pc:~$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
stepping	: 3
microcode	: 0x1c
cpu MHz		: 799.875
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm arat pln pts
bugs		:
bogomips	: 7183.37
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:
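The short summaries at the top of this section (model name, siblings, cpu cores) can be extracted from the full dump with a few lines of code. This is a minimal sketch, assuming the `key : value` layout shown above; the helper name `summarize_cpuinfo` is made up for illustration and is not part of any tool.

```python
def summarize_cpuinfo(text):
    """Return the first processor's key/value pairs from /proc/cpuinfo text."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Excerpt copied from the Lenovo desktop dump above.
sample = """\
model name\t: Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
cpu MHz\t\t: 799.875
siblings\t: 4
cpu cores\t: 2
"""

cpu = summarize_cpuinfo(sample)
threads_per_core = int(cpu["siblings"]) // int(cpu["cpu cores"])
print(cpu["model name"], "-", threads_per_core, "thread(s) per core")
```

When siblings is larger than cpu cores, Hyper-Threading is active: the i3-4160 exposes 4 logical CPUs on 2 physical cores, while the Ali ECS instance has a single logical CPU.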

[ Memory Info ]

* free -h

[ Storage Info ]

* fdisk -l

* df -h, df -ih

[ Network Info ]

* ifconfig -a

* sar -n DEV 1

[ OS Info ]

* uname -a

* /etc/lsb-release, /etc/os-release
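Both files hold shell-style `KEY=VALUE` lines, so they are easy to read from a script when recording which OS a benchmark ran on. A minimal sketch; the function name `parse_os_release` and the sample text are illustrative assumptions, not output captured from the machines above.

```python
import shlex


def parse_os_release(text):
    """Parse KEY=VALUE lines (shell-style quoting) into a dict."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)  # strips surrounding quotes
        result[key] = parts[0] if parts else ""
    return result


# Hypothetical sample; on a real system read /etc/os-release instead.
sample = 'NAME="Ubuntu"\nVERSION_ID="14.04"\nID=ubuntu\n'
release = parse_os_release(sample)
print(release["NAME"], release["VERSION_ID"])
```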

 

[ Benchmarking - unixbench ]

Reference: http://www.361way.com/unixbench-benchmark/3437.html

[ Ali Cloud ECS - 1 CPU, 1G Memory, Normal HD, Ubuntu 14.04 ]
System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   28329189.2   2427.5
Execl Throughput                                 43.0       4004.4    931.3
Pipe Throughput                               12440.0    1827138.7   1468.8
Pipe-based Context Switching                   4000.0     294148.0    735.4
Process Creation                                126.0      13069.4   1037.3
Shell Scripts (1 concurrent)                     42.4       7224.2   1703.8
Shell Scripts (8 concurrent)                      6.0        939.2   1565.3
System Call Overhead                          15000.0    3811289.9   2540.9

[ Lenovo Desktop - 4 parallel copies ]
Dhrystone 2 using register variables         116700.0   96415601.1   8261.8
Execl Throughput                                 43.0      18056.0   4199.1
Pipe Throughput                               12440.0    6448892.8   5184.0
Pipe-based Context Switching                   4000.0    1042165.6   2605.4
Process Creation                                126.0      49775.5   3950.4
Shell Scripts (1 concurrent)                     42.4      30791.5   7262.1
Shell Scripts (8 concurrent)                      6.0       3925.3   6542.2
System Call Overhead                          15000.0   10868299.5   7245.5

[ Dell Inspiron 15 - 3521, 4 Core CPU, 12G Memory - 4 parallel copies ]
Dhrystone 2 using register variables         116700.0   44303325.7   3796.3
Execl Throughput                                 43.0       8588.9   1997.4
Pipe Throughput                               12440.0    3123568.2   2510.9
Pipe-based Context Switching                   4000.0     516216.8   1290.5
Process Creation                                126.0      21271.5   1688.2
Shell Scripts (1 concurrent)                     42.4      14398.9   3396.0
Shell Scripts (8 concurrent)                      6.0       1859.7   3099.6
System Call Overhead                          15000.0    5520127.6   3680.1

[ RGW - 4 parallel copies ]
Dhrystone 2 using register variables         116700.0   13637261.1   1168.6
Execl Throughput                                 43.0       3420.9    795.6
Pipe Throughput                               12440.0     830597.2    667.7
Pipe-based Context Switching                   4000.0     201339.0    503.3
Process Creation                                126.0       8674.4    688.4
Shell Scripts (1 concurrent)                     42.4       3083.5    727.2
Shell Scripts (8 concurrent)                      6.0        419.1    698.5
System Call Overhead                          15000.0    1775096.5   1183.4
[ Ali Cloud ECS - 1 CPU, 1G Memory, Normal HD, Ubuntu 14.04 ]
System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   28329189.2   2427.5
Double-Precision Whetstone                       55.0       3690.8    671.1
Execl Throughput                                 43.0       4004.4    931.3
File Copy 1024 bufsize 2000 maxblocks          3960.0     946994.0   2391.4
File Copy 256 bufsize 500 maxblocks            1655.0     264533.9   1598.4
File Copy 4096 bufsize 8000 maxblocks          5800.0    1775740.9   3061.6
Pipe Throughput                               12440.0    1827138.7   1468.8
Pipe-based Context Switching                   4000.0     294148.0    735.4
Process Creation                                126.0      13069.4   1037.3
Shell Scripts (1 concurrent)                     42.4       7224.2   1703.8
Shell Scripts (8 concurrent)                      6.0        939.2   1565.3
System Call Overhead                          15000.0    3811289.9   2540.9
                                                                   ========
System Benchmarks Index Score                                        1504.8
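How the table fits together: each INDEX is RESULT / BASELINE x 10, and the final System Benchmarks Index Score is the geometric mean of the per-test indexes. The sketch below rechecks the Ali ECS table; the values are copied from it, and the helper names are illustrative.

```python
import math

# (BASELINE, RESULT, INDEX) rows copied from the Ali ECS table above.
rows = [
    (116700.0, 28329189.2, 2427.5),
    (55.0, 3690.8, 671.1),
    (43.0, 4004.4, 931.3),
    (3960.0, 946994.0, 2391.4),
    (1655.0, 264533.9, 1598.4),
    (5800.0, 1775740.9, 3061.6),
    (12440.0, 1827138.7, 1468.8),
    (4000.0, 294148.0, 735.4),
    (126.0, 13069.4, 1037.3),
    (42.4, 7224.2, 1703.8),
    (6.0, 939.2, 1565.3),
    (15000.0, 3811289.9, 2540.9),
]


def index(baseline, result):
    """Per-test index: the result relative to the baseline, scaled by 10."""
    return result / baseline * 10.0


def geometric_mean(values):
    """Geometric mean computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


indexes = [index(b, r) for b, r, _ in rows]
score = geometric_mean(indexes)
print(round(score, 1))
```

Because the score is a geometric mean, one very slow test (here Double-Precision Whetstone and Pipe-based Context Switching) pulls the total down more than an arithmetic average would.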
[ Dell Inspiron 15 - 3521, 4 Core CPU, 12G Memory ]
4 CPUs in system; running 1 parallel copy of tests
System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   20599643.3   1765.2
Double-Precision Whetstone                       55.0       1450.5    263.7
Execl Throughput                                 43.0       2981.4    693.3
File Copy 1024 bufsize 2000 maxblocks          3960.0     679470.2   1715.8
File Copy 256 bufsize 500 maxblocks            1655.0     194211.0   1173.5
File Copy 4096 bufsize 8000 maxblocks          5800.0    1625217.2   2802.1
Pipe Throughput                               12440.0    1436014.8   1154.4
Pipe-based Context Switching                   4000.0     116314.2    290.8
Process Creation                                126.0       8135.1    645.6
Shell Scripts (1 concurrent)                     42.4       7122.8   1679.9
Shell Scripts (8 concurrent)                      6.0       1751.3   2918.8
System Call Overhead                          15000.0    2395612.9   1597.1
                                                                   ========
System Benchmarks Index Score                                        1098.6

------------------------------------------------------------------------
4 CPUs in system; running 4 parallel copies of tests
System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   44303325.7   3796.3
Double-Precision Whetstone                       55.0       4998.7    908.8
Execl Throughput                                 43.0       8588.9   1997.4
File Copy 1024 bufsize 2000 maxblocks          3960.0     667076.0   1684.5
File Copy 256 bufsize 500 maxblocks            1655.0     177812.7   1074.4
File Copy 4096 bufsize 8000 maxblocks          5800.0    1878630.4   3239.0
Pipe Throughput                               12440.0    3123568.2   2510.9
Pipe-based Context Switching                   4000.0     516216.8   1290.5
Process Creation                                126.0      21271.5   1688.2
Shell Scripts (1 concurrent)                     42.4      14398.9   3396.0
Shell Scripts (8 concurrent)                      6.0       1859.7   3099.6
System Call Overhead                          15000.0    5520127.6   3680.1
                                                                   ========
System Benchmarks Index Score                                        2126.7
[ RGW ]

4 CPUs in system; running 1 parallel copy of tests

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0    5541889.9    474.9
Double-Precision Whetstone                       55.0        628.5    114.3
Execl Throughput                                 43.0       1232.3    286.6
File Copy 1024 bufsize 2000 maxblocks          3960.0     163085.0    411.8
File Copy 256 bufsize 500 maxblocks            1655.0      48618.4    293.8
File Copy 4096 bufsize 8000 maxblocks          5800.0     421322.5    726.4
Pipe Throughput                               12440.0     397309.5    319.4
Pipe-based Context Switching                   4000.0      60917.0    152.3
Process Creation                                126.0       3636.4    288.6
Shell Scripts (1 concurrent)                     42.4       1477.5    348.5
Shell Scripts (8 concurrent)                      6.0        399.4    665.6
System Call Overhead                          15000.0     813859.7    542.6
                                                                   ========
System Benchmarks Index Score                                         340.3

------------------------------------------------------------------------

4 CPUs in system; running 4 parallel copies of tests

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   13637261.1   1168.6
Double-Precision Whetstone                       55.0       2168.3    394.2
Execl Throughput                                 43.0       3420.9    795.6
File Copy 1024 bufsize 2000 maxblocks          3960.0     158540.9    400.4
File Copy 256 bufsize 500 maxblocks            1655.0      45874.5    277.2
File Copy 4096 bufsize 8000 maxblocks          5800.0     411274.8    709.1
Pipe Throughput                               12440.0     830597.2    667.7
Pipe-based Context Switching                   4000.0     201339.0    503.3
Process Creation                                126.0       8674.4    688.4
Shell Scripts (1 concurrent)                     42.4       3083.5    727.2
Shell Scripts (8 concurrent)                      6.0        419.1    698.5
System Call Overhead                          15000.0    1775096.5   1183.4
                                                                   ========
System Benchmarks Index Score                                         631.4
[ EZM ]

4 CPUs in system; running 1 parallel copy of tests

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0    6441179.5    551.9
Double-Precision Whetstone                       55.0        714.8    130.0
Execl Throughput                                 43.0       1523.1    354.2
File Copy 1024 bufsize 2000 maxblocks          3960.0     187264.9    472.9
File Copy 256 bufsize 500 maxblocks            1655.0      55540.5    335.6
File Copy 4096 bufsize 8000 maxblocks          5800.0     481207.8    829.7
Pipe Throughput                               12440.0     428346.1    344.3
Pipe-based Context Switching                   4000.0      62923.4    157.3
Process Creation                                126.0       4136.8    328.3
Shell Scripts (1 concurrent)                     42.4       1708.7    403.0
Shell Scripts (8 concurrent)                      6.0        474.7    791.2
System Call Overhead                          15000.0     937748.6    625.2
                                                                   ========
System Benchmarks Index Score                                         388.6

-----------------------------

4 CPUs in system; running 4 parallel copies of tests

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   16325756.4   1399.0
Double-Precision Whetstone                       55.0       2524.7    459.0
Execl Throughput                                 43.0       4251.4    988.7
File Copy 1024 bufsize 2000 maxblocks          3960.0     200632.4    506.6
File Copy 256 bufsize 500 maxblocks            1655.0      59859.6    361.7
File Copy 4096 bufsize 8000 maxblocks          5800.0     651897.8   1124.0
Pipe Throughput                               12440.0     998240.6    802.4
Pipe-based Context Switching                   4000.0     235934.9    589.8
Process Creation                                126.0      10930.8    867.5
Shell Scripts (1 concurrent)                     42.4       3721.9    877.8
Shell Scripts (8 concurrent)                      6.0        504.3    840.6
System Call Overhead                          15000.0    2397620.1   1598.4
                                                                   ========
System Benchmarks Index Score                                         794.6
[ Lenovo Desktop ]

4 CPUs in system; running 1 parallel copy of tests

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   41153180.3   3526.4
Double-Precision Whetstone                       55.0       2997.6    545.0
Execl Throughput                                 43.0       5835.5   1357.1
File Copy 1024 bufsize 2000 maxblocks          3960.0    1329201.3   3356.6
File Copy 256 bufsize 500 maxblocks            1655.0     395956.0   2392.5
File Copy 4096 bufsize 8000 maxblocks          5800.0    2728246.5   4703.9
Pipe Throughput                               12440.0    3077583.0   2473.9
Pipe-based Context Switching                   4000.0     240658.5    601.6
Process Creation                                126.0      19070.8   1513.6
Shell Scripts (1 concurrent)                     42.4      15200.1   3584.9
Shell Scripts (8 concurrent)                      6.0       3473.3   5788.8
System Call Overhead                          15000.0    4761999.4   3174.7
                                                                   ========
System Benchmarks Index Score                                        2223.8

----------------------------
4 CPUs in system; running 4 parallel copies of tests

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   96415601.1   8261.8
Double-Precision Whetstone                       55.0      10667.0   1939.4
Execl Throughput                                 43.0      18056.0   4199.1
File Copy 1024 bufsize 2000 maxblocks          3960.0    1309286.0   3306.3
File Copy 256 bufsize 500 maxblocks            1655.0     362216.2   2188.6
File Copy 4096 bufsize 8000 maxblocks          5800.0    3369122.3   5808.8
Pipe Throughput                               12440.0    6448892.8   5184.0
Pipe-based Context Switching                   4000.0    1042165.6   2605.4
Process Creation                                126.0      49775.5   3950.4
Shell Scripts (1 concurrent)                     42.4      30791.5   7262.1
Shell Scripts (8 concurrent)                      6.0       3925.3   6542.2
System Call Overhead                          15000.0   10868299.5   7245.5
                                                                   ========
System Benchmarks Index Score                                        4395.0
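To compare how the 4-CPU machines scale, the single-copy and four-copy scores can be put side by side. A small sketch using only the scores listed above; the dictionary layout is an illustrative choice.

```python
# (single-copy score, four-copy score) from the full results above.
scores = {
    "Dell Inspiron 15": (1098.6, 2126.7),
    "RGW": (340.3, 631.4),
    "EZM": (388.6, 794.6),
    "Lenovo Desktop": (2223.8, 4395.0),
}

speedups = {host: parallel / single
            for host, (single, parallel) in scores.items()}

for host, speedup in speedups.items():
    print(f"{host}: {speedup:.2f}x with 4 copies")
```

All four machines roughly double their score rather than quadruple it. For the Lenovo desktop this is consistent with its 2 physical cores plus Hyper-Threading; note also that the File Copy rows barely change between the two runs on any machine.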

[ Benchmarking - sysbench ]

TBD

[ Benchmarking - hdparm ]

[ Ali ECS - 1 CPU, 1G Memory, Normal HardDisk ]
root@db:~# hdparm -t /dev/xvda1
/dev/xvda1:
 Timing buffered disk reads: 122 MB in  3.04 seconds =  40.14 MB/sec

[ Lenovo Desktop ]
root@eric-pc:/etc/ssh# hdparm -t /dev/sda9
/dev/sda9:
Timing buffered disk reads: 342 MB in 3.01 seconds = 113.74 MB/sec

[ Decitone RGW ]
root@dcu ~# hdparm -t /dev/sda2
/dev/sda2:
Timing buffered disk reads: 180 MB in 3.03 seconds = 59.43 MB/sec

[ Decitone EZM ]
root@meet ~# hdparm -t /dev/sda2
/dev/sda2:
Timing buffered disk reads: 302 MB in 3.07 seconds = 98.27 MB/sec
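The MB/sec figure is simply the amount read divided by the elapsed time. When collecting results from several hosts, the line can be parsed and cross-checked as in this sketch; the regular expression and the function name `parse_hdparm` are assumptions based on the output format shown above.

```python
import re

LINE = re.compile(
    r"Timing buffered disk reads:\s*(\d+) MB in\s*([\d.]+) seconds"
    r"\s*=\s*([\d.]+) MB/sec"
)


def parse_hdparm(line):
    """Return (megabytes, seconds, reported MB/sec) from an `hdparm -t` line."""
    match = LINE.search(line)
    if match is None:
        raise ValueError("not an hdparm -t timing line")
    mb, seconds, rate = match.groups()
    return int(mb), float(seconds), float(rate)


# Line copied from the Ali ECS run above.
sample = " Timing buffered disk reads: 122 MB in  3.04 seconds =  40.14 MB/sec"
mb, seconds, reported = parse_hdparm(sample)
print(mb / seconds, reported)
```

The recomputed rate can differ slightly from the reported one because hdparm prints the MB count and the seconds already rounded.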

[ Benchmarking - dd ]

[ Ali Cloud ECS ]
root@db:~# dd bs=64k count=4k if=/dev/zero of=test oflag=dsync
4096+0 records in
4096+0 records out
268435456 bytes (268 MB) copied, 24.812 s, 10.8 MB/s
root@db:~# dd bs=64k count=4k if=/dev/zero of=test conv=fdatasync
4096+0 records in
4096+0 records out
268435456 bytes (268 MB) copied, 6.50773 s, 41.2 MB/s
root@db:~# echo 3 > /proc/sys/vm/drop_caches
root@db:~# dd bs=64k count=4k if=test of=/dev/null
4096+0 records in
4096+0 records out
268435456 bytes (268 MB) copied, 6.10765 s, 44.0 MB/s
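The three runs move the same 256 MiB (64k x 4k blocks) and differ only in how the data is synced: `oflag=dsync` syncs after every block, `conv=fdatasync` syncs once at the end, and the read follows a cache drop. dd reports decimal megabytes, so the rates can be rechecked as below. A small sketch; the labels are illustrative.

```python
BLOCK_SIZE = 64 * 1024   # bs=64k
BLOCK_COUNT = 4 * 1024   # count=4k
total_bytes = BLOCK_SIZE * BLOCK_COUNT

# Elapsed seconds copied from the three dd runs above.
runs = {
    "write, oflag=dsync": 24.812,
    "write, conv=fdatasync": 6.50773,
    "read, after drop_caches": 6.10765,
}

# dd prints MB as 10**6 bytes.
rates = {name: total_bytes / secs / 1e6 for name, secs in runs.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} MB/s")
```

The per-block sync run is roughly four times slower than the single final sync, which is the cost of forcing each 64k write to disk before the next one starts.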

[ Benchmarking - fio ]

[ Ali Cloud ECS -  1 CPU, 1G Memory, Normal HardDisk ]

root@db:/dev# fio -filename=/dev/xvda -direct=1 -iodepth 1 -thread -rw=randread -ioengine=psync -bs=64k -size=10G -numjobs=10 -runtime=60 -group_reporting -name=mytes

Starting 10 threads
read : io=1868.9MB, bw=31887KB/s, iops=498, runt= 60016msec
Disk stats (read/write):
  xvda: ios=59754/23, merge=0/12, ticks=1197944/548, in_queue=1198632, util=99.94%
[ Ali Cloud ECS -  1 CPU, 1G Memory, Normal HardDisk ]

root@db:/dev# fio -filename=/dev/xvda -direct=1 -iodepth 1 -thread -rw=randread -ioengine=psync -bs=64k -size=10G -numjobs=10 -runtime=60 -group_reporting -name=mytes
mytes: (g=0): rw=randread, bs=64K-64K/64K-64K/64K-64K, ioengine=psync, iodepth=1
...
mytes: (g=0): rw=randread, bs=64K-64K/64K-64K/64K-64K, ioengine=psync, iodepth=1
fio-2.1.3
Starting 10 threads
Jobs: 10 (f=10): [rrrrrrrrrr] [100.0% done] [32095KB/0KB/0KB /s] [501/0/0 iops] [eta 00m:00s]
mytes: (groupid=0, jobs=10): err= 0: pid=22759: Sat Sep 17 07:53:16 2016
  read : io=1868.9MB, bw=31887KB/s, iops=498, runt= 60016msec
    clat (usec): min=598, max=503402, avg=20063.96, stdev=14124.53
     lat (usec): min=599, max=503403, avg=20064.29, stdev=14124.53
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    9], 10.00th=[   11], 20.00th=[   13],
     | 30.00th=[   15], 40.00th=[   17], 50.00th=[   18], 60.00th=[   21],
     | 70.00th=[   23], 80.00th=[   26], 90.00th=[   31], 95.00th=[   36],
     | 99.00th=[   57], 99.50th=[   88], 99.90th=[  190], 99.95th=[  245],
     | 99.99th=[  457]
    bw (KB  /s): min=  992, max= 7552, per=10.03%, avg=3198.62, stdev=606.67
    lat (usec) : 750=0.02%, 1000=0.06%
    lat (msec) : 2=0.68%, 4=0.93%, 10=6.40%, 20=51.84%, 50=38.79%
    lat (msec) : 100=0.85%, 250=0.38%, 500=0.04%, 750=0.01%
  cpu          : usr=0.03%, sys=0.12%, ctx=30005, majf=0, minf=167
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=29902/w=0/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
   READ: io=1868.9MB, aggrb=31886KB/s, minb=31886KB/s, maxb=31886KB/s, mint=60016msec, maxt=60016msec

Disk stats (read/write):
  xvda: ios=59754/23, merge=0/12, ticks=1197944/548, in_queue=1198632, util=99.94%
root@db:/dev#
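The headline numbers in this report are tied together: IOPS is bandwidth divided by block size, and with a fixed number of requests in flight the mean latency follows from Little's law. The sketch below rechecks this using the parameters of the run above; treating `numjobs x iodepth` as the number of requests in flight is an assumption that suits the synchronous psync engine.

```python
bw_kb_per_s = 31887      # bw= from the fio summary
block_size_kb = 64       # -bs=64k
num_jobs = 10            # -numjobs=10
io_depth = 1             # -iodepth 1

iops = bw_kb_per_s / block_size_kb

# Little's law: mean latency = requests in flight / throughput.
in_flight = num_jobs * io_depth
mean_latency_ms = in_flight / iops * 1000

print(f"iops ~ {iops:.0f}, mean latency ~ {mean_latency_ms:.1f} ms")
```

The estimate lands close to the reported average latency (avg=20064.29 usec), which suggests the disk itself, at util=99.94%, is what bounds throughput in this run.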

[ ping ]

* To test accessibility (from various locations)

http://ping.chinaz.com/ 

[ Web Performance Benchmarking ]

* ApacheBench

Ali ECS 1 CPU, 1G Memory, 4M WAN

ab -n 100000 -c 50 -r http://www.lecomm.net:82/index.html

Server: CPU load is low; bandwidth is the bottleneck.

Server Software:        nginx/1.10.0
Server Hostname:        www.lecomm.net
Server Port:            82

Document Path:          /index.html
Document Length:        775 bytes

Concurrency Level:      50
Time taken for tests:   249.561 seconds
Complete requests:      100000
Failed requests:        0
Total transferred:      101700000 bytes
HTML transferred:       77500000 bytes
Requests per second:    400.70 [#/sec] (mean)
Time per request:       124.780 [ms] (mean)
Time per request:       2.496 [ms] (mean, across all concurrent requests)
Transfer rate:          397.97 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0  109 332.7      0    7010
Processing:     0   16  66.4      1    3267
Waiting:        0   16  66.4      1    3267
Total:          0  124 339.8      1    7010

Percentage of the requests served within a certain time (ms)
  50%      1
  66%      2
  75%      2
  80%      3
  90%    998
  95%   1000
  98%   1002
  99%   1006
 100%   7010 (longest request)
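ab's derived figures follow directly from its raw counters, and converting the transfer rate to Mbit/s makes the bandwidth claim easy to check. A minimal sketch; the function name `ab_metrics` is illustrative, and reading "4M WAN" as a 4 Mbit/s link is an assumption.

```python
def ab_metrics(requests, seconds, concurrency, total_bytes):
    """Recompute ab's derived figures from its raw counters."""
    return {
        "requests_per_sec": requests / seconds,
        "ms_per_request": concurrency * seconds * 1000 / requests,
        "kbytes_per_sec": total_bytes / 1024 / seconds,
        "mbit_per_sec": total_bytes * 8 / seconds / 1e6,
    }


# Counters from the WAN run above.
wan = ab_metrics(requests=100000, seconds=249.561, concurrency=50,
                 total_bytes=101700000)

LINK_MBIT = 4.0  # assumption: "4M WAN" means a 4 Mbit/s link
print(f"{wan['mbit_per_sec']:.2f} Mbit/s, "
      f"{wan['mbit_per_sec'] / LINK_MBIT:.0%} of the link")
```

Under that assumption the test already uses most of the link before protocol overhead is counted, which fits the observation that bandwidth, not CPU, limits this run.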
Ali ECS 1 CPU, 1G Memory, Ali LAN, Nginx web server (not a reverse proxy)

# ab -n 200000 -c 200 -r http://www.ali.lecomm.net:82/index.html

Server load: CPU > 95% (%si, softirq time, > 60%)

Server Software:        nginx/1.10.0
Server Hostname:        www.ali.lecomm.net
Server Port:            82

Document Path:          /index.html
Document Length:        775 bytes

Concurrency Level:      200
Time taken for tests:   34.034 seconds
Complete requests:      200000
Failed requests:        0
Total transferred:      203400000 bytes
HTML transferred:       155000000 bytes
Requests per second:    5876.41 [#/sec] (mean)
Time per request:       34.034 [ms] (mean)
Time per request:       0.170 [ms] (mean, across all concurrent requests)
Transfer rate:          5836.24 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   11 105.3      0    3007
Processing:     5   22  21.6     22    1653
Waiting:        2   22  21.6     22    1653
Total:          9   34 110.2     22    3026

Percentage of the requests served within a certain time (ms)
  50%     22
  66%     22
  75%     23
  80%     23
  90%     24
  95%     25
  98%     29
  99%   1018
 100%   3026 (longest request)

* Webbench

wget http://www.ha97.com/code/webbench-1.5.tar.gz
tar -xzvf webbench-1.5.tar.gz
apt-get install ctags
make && make install

webbench -c 200 -t 10 http://www.lecomm.net:82/index.html
* Speed=25962 pages/min, 440055 bytes/sec.
* Requests: 25962 susceed, 0 failed.
* All requests succeeded.
* Download bandwidth is the bottleneck.

webbench -c 1000 -t 360 http://www.ali.lecomm.net:82/index.html
* Some requests failed.
* CPU is the bottleneck (%si > 60%, %cpu > 90%).

[ HTTP PUT ]

* To test upload bandwidth

posted @ 2016-09-17 07:57  Eric.YAO