[Done] Linux System Performance Benchmarking
[ CPU Info ]
* cat /proc/cpuinfo
[ Ali ECS - 1 CPU ]
model name : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
cpu MHz : 2600.060
siblings : 1
cpu cores : 1
bogomips : 5200.12
[ Lenovo Desktop PC ]
model name : Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
cpu MHz : 799.875
siblings : 4
cpu cores : 2
bogomips : 7183.37
[ Ali ECS - 1 CPU ]
root@db:~# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping : 4
microcode : 0x428
cpu MHz : 2600.060
cache size : 20480 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl pni ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm
bogomips : 5200.12
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
[ Lenovo Desktop ]
eric@eric-pc:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
stepping : 3
microcode : 0x1c
cpu MHz : 799.875
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm arat pln pts
bugs :
bogomips : 7183.37
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
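The short per-machine summaries above can be pulled out of /proc/cpuinfo mechanically. The following is a minimal sketch, not the method used for this post: `cpu_summary` and `ht_enabled` are hypothetical helper names, and the hyper-threading check assumes the usual reading of the fields, where `siblings` counts logical CPUs per physical package and `cpu cores` counts physical cores. Under that reading the Lenovo desktop (siblings 4, cpu cores 2) runs two threads per core, while the ECS instance exposes a single logical CPU.

```shell
# Hypothetical helpers; both read a cpuinfo-style file (default: /proc/cpuinfo).

# Print the summary fields of the first logical CPU only.
cpu_summary() {
  LC_ALL=C awk -F':' '
    /^processor/ { if (seen++) exit }   # stop at the second CPU entry
    /^(model name|cpu MHz|siblings|cpu cores|bogomips)[ \t]*:/ {
      key = $1; val = $0
      sub(/[ \t]+$/, "", key)            # trim padding after the field name
      sub(/^[^:]*:[ \t]*/, "", val)      # keep everything after the first colon
      print key " : " val
    }
  ' "${1:-/proc/cpuinfo}"
}

# Print "yes" when siblings > cpu cores (more logical CPUs than physical cores).
ht_enabled() {
  LC_ALL=C awk -F':' '
    /^siblings/  { s = $2 + 0 }
    /^cpu cores/ { c = $2 + 0 }
    END { if (s > c) print "yes"; else print "no" }
  ' "${1:-/proc/cpuinfo}"
}

# Usage on a live Linux system:
#   cpu_summary
#   ht_enabled
```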
[ Memory Info ]
* free -h
[ Storage Info ]
* fdisk -l
* df -h, df -ih
[ Network Info ]
* ifconfig -a
* sar -n DEV 1
[ OS Info ]
* uname -a
* /etc/lsb-release, /etc/os-release
[ Benchmarking - unixbench ]
Reference: http://www.361way.com/unixbench-benchmark/3437.html
[ Ali Cloud ECS - 1 CPU, 1G Memory, Normal HD, Ubuntu 14.04 ]
Dhrystone 2 using register variables 116700.0 28329189.2 2427.5
Execl Throughput 43.0 4004.4 931.3
Pipe Throughput 12440.0 1827138.7 1468.8
Pipe-based Context Switching 4000.0 294148.0 735.4
Process Creation 126.0 13069.4 1037.3
Shell Scripts (1 concurrent) 42.4 7224.2 1703.8
Shell Scripts (8 concurrent) 6.0 939.2 1565.3
System Call Overhead 15000.0 3811289.9 2540.9
[ Lenovo Desktop ]
Dhrystone 2 using register variables 116700.0 96415601.1 8261.8
Execl Throughput 43.0 18056.0 4199.1
Pipe Throughput 12440.0 6448892.8 5184.0
Pipe-based Context Switching 4000.0 1042165.6 2605.4
Process Creation 126.0 49775.5 3950.4
Shell Scripts (1 concurrent) 42.4 30791.5 7262.1
Shell Scripts (8 concurrent) 6.0 3925.3 6542.2
System Call Overhead 15000.0 10868299.5 7245.5
[ Dell Inspiron 15 - 3521, 4 Core CPU, 12G Memory ]
Dhrystone 2 using register variables 116700.0 44303325.7 3796.3
Execl Throughput 43.0 8588.9 1997.4
Pipe Throughput 12440.0 3123568.2 2510.9
Pipe-based Context Switching 4000.0 516216.8 1290.5
Process Creation 126.0 21271.5 1688.2
Shell Scripts (1 concurrent) 42.4 14398.9 3396.0
Shell Scripts (8 concurrent) 6.0 1859.7 3099.6
System Call Overhead 15000.0 5520127.6 3680.1
[ RGW ]
Dhrystone 2 using register variables 116700.0 13637261.1 1168.6
Execl Throughput 43.0 3420.9 795.6
Pipe Throughput 12440.0 830597.2 667.7
Pipe-based Context Switching 4000.0 201339.0 503.3
Process Creation 126.0 8674.4 688.4
Shell Scripts (1 concurrent) 42.4 3083.5 727.2
Shell Scripts (8 concurrent) 6.0 419.1 698.5
System Call Overhead 15000.0 1775096.5 1183.4
[ Ali Cloud ECS - 1 CPU, 1G Memory, Normal HD, Ubuntu 14.04 ]
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 28329189.2 2427.5
Double-Precision Whetstone 55.0 3690.8 671.1
Execl Throughput 43.0 4004.4 931.3
File Copy 1024 bufsize 2000 maxblocks 3960.0 946994.0 2391.4
File Copy 256 bufsize 500 maxblocks 1655.0 264533.9 1598.4
File Copy 4096 bufsize 8000 maxblocks 5800.0 1775740.9 3061.6
Pipe Throughput 12440.0 1827138.7 1468.8
Pipe-based Context Switching 4000.0 294148.0 735.4
Process Creation 126.0 13069.4 1037.3
Shell Scripts (1 concurrent) 42.4 7224.2 1703.8
Shell Scripts (8 concurrent) 6.0 939.2 1565.3
System Call Overhead 15000.0 3811289.9 2540.9
========
System Benchmarks Index Score 1504.8
[ Dell Inspiron 15 - 3521, 4 Core CPU, 12G Memory ]
4 CPUs in system; running 1 parallel copy of tests
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 20599643.3 1765.2
Double-Precision Whetstone 55.0 1450.5 263.7
Execl Throughput 43.0 2981.4 693.3
File Copy 1024 bufsize 2000 maxblocks 3960.0 679470.2 1715.8
File Copy 256 bufsize 500 maxblocks 1655.0 194211.0 1173.5
File Copy 4096 bufsize 8000 maxblocks 5800.0 1625217.2 2802.1
Pipe Throughput 12440.0 1436014.8 1154.4
Pipe-based Context Switching 4000.0 116314.2 290.8
Process Creation 126.0 8135.1 645.6
Shell Scripts (1 concurrent) 42.4 7122.8 1679.9
Shell Scripts (8 concurrent) 6.0 1751.3 2918.8
System Call Overhead 15000.0 2395612.9 1597.1
========
System Benchmarks Index Score 1098.6
------------------------------------------------------------------------
4 CPUs in system; running 4 parallel copies of tests
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 44303325.7 3796.3
Double-Precision Whetstone 55.0 4998.7 908.8
Execl Throughput 43.0 8588.9 1997.4
File Copy 1024 bufsize 2000 maxblocks 3960.0 667076.0 1684.5
File Copy 256 bufsize 500 maxblocks 1655.0 177812.7 1074.4
File Copy 4096 bufsize 8000 maxblocks 5800.0 1878630.4 3239.0
Pipe Throughput 12440.0 3123568.2 2510.9
Pipe-based Context Switching 4000.0 516216.8 1290.5
Process Creation 126.0 21271.5 1688.2
Shell Scripts (1 concurrent) 42.4 14398.9 3396.0
Shell Scripts (8 concurrent) 6.0 1859.7 3099.6
System Call Overhead 15000.0 5520127.6 3680.1
========
System Benchmarks Index Score 2126.7
[ RGW ]
4 CPUs in system; running 1 parallel copy of tests
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 5541889.9 474.9
Double-Precision Whetstone 55.0 628.5 114.3
Execl Throughput 43.0 1232.3 286.6
File Copy 1024 bufsize 2000 maxblocks 3960.0 163085.0 411.8
File Copy 256 bufsize 500 maxblocks 1655.0 48618.4 293.8
File Copy 4096 bufsize 8000 maxblocks 5800.0 421322.5 726.4
Pipe Throughput 12440.0 397309.5 319.4
Pipe-based Context Switching 4000.0 60917.0 152.3
Process Creation 126.0 3636.4 288.6
Shell Scripts (1 concurrent) 42.4 1477.5 348.5
Shell Scripts (8 concurrent) 6.0 399.4 665.6
System Call Overhead 15000.0 813859.7 542.6
========
System Benchmarks Index Score 340.3
------------------------------------------------------------------------
4 CPUs in system; running 4 parallel copies of tests
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 13637261.1 1168.6
Double-Precision Whetstone 55.0 2168.3 394.2
Execl Throughput 43.0 3420.9 795.6
File Copy 1024 bufsize 2000 maxblocks 3960.0 158540.9 400.4
File Copy 256 bufsize 500 maxblocks 1655.0 45874.5 277.2
File Copy 4096 bufsize 8000 maxblocks 5800.0 411274.8 709.1
Pipe Throughput 12440.0 830597.2 667.7
Pipe-based Context Switching 4000.0 201339.0 503.3
Process Creation 126.0 8674.4 688.4
Shell Scripts (1 concurrent) 42.4 3083.5 727.2
Shell Scripts (8 concurrent) 6.0 419.1 698.5
System Call Overhead 15000.0 1775096.5 1183.4
========
System Benchmarks Index Score 631.4
[ EZM ]
4 CPUs in system; running 1 parallel copy of tests
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 6441179.5 551.9
Double-Precision Whetstone 55.0 714.8 130.0
Execl Throughput 43.0 1523.1 354.2
File Copy 1024 bufsize 2000 maxblocks 3960.0 187264.9 472.9
File Copy 256 bufsize 500 maxblocks 1655.0 55540.5 335.6
File Copy 4096 bufsize 8000 maxblocks 5800.0 481207.8 829.7
Pipe Throughput 12440.0 428346.1 344.3
Pipe-based Context Switching 4000.0 62923.4 157.3
Process Creation 126.0 4136.8 328.3
Shell Scripts (1 concurrent) 42.4 1708.7 403.0
Shell Scripts (8 concurrent) 6.0 474.7 791.2
System Call Overhead 15000.0 937748.6 625.2
========
System Benchmarks Index Score 388.6
-----------------------------
4 CPUs in system; running 4 parallel copies of tests
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 16325756.4 1399.0
Double-Precision Whetstone 55.0 2524.7 459.0
Execl Throughput 43.0 4251.4 988.7
File Copy 1024 bufsize 2000 maxblocks 3960.0 200632.4 506.6
File Copy 256 bufsize 500 maxblocks 1655.0 59859.6 361.7
File Copy 4096 bufsize 8000 maxblocks 5800.0 651897.8 1124.0
Pipe Throughput 12440.0 998240.6 802.4
Pipe-based Context Switching 4000.0 235934.9 589.8
Process Creation 126.0 10930.8 867.5
Shell Scripts (1 concurrent) 42.4 3721.9 877.8
Shell Scripts (8 concurrent) 6.0 504.3 840.6
System Call Overhead 15000.0 2397620.1 1598.4
========
System Benchmarks Index Score 794.6
[ Lenovo Desktop ]
4 CPUs in system; running 1 parallel copy of tests
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 41153180.3 3526.4
Double-Precision Whetstone 55.0 2997.6 545.0
Execl Throughput 43.0 5835.5 1357.1
File Copy 1024 bufsize 2000 maxblocks 3960.0 1329201.3 3356.6
File Copy 256 bufsize 500 maxblocks 1655.0 395956.0 2392.5
File Copy 4096 bufsize 8000 maxblocks 5800.0 2728246.5 4703.9
Pipe Throughput 12440.0 3077583.0 2473.9
Pipe-based Context Switching 4000.0 240658.5 601.6
Process Creation 126.0 19070.8 1513.6
Shell Scripts (1 concurrent) 42.4 15200.1 3584.9
Shell Scripts (8 concurrent) 6.0 3473.3 5788.8
System Call Overhead 15000.0 4761999.4 3174.7
========
System Benchmarks Index Score 2223.8
----------------------------
4 CPUs in system; running 4 parallel copies of tests
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 96415601.1 8261.8
Double-Precision Whetstone 55.0 10667.0 1939.4
Execl Throughput 43.0 18056.0 4199.1
File Copy 1024 bufsize 2000 maxblocks 3960.0 1309286.0 3306.3
File Copy 256 bufsize 500 maxblocks 1655.0 362216.2 2188.6
File Copy 4096 bufsize 8000 maxblocks 5800.0 3369122.3 5808.8
Pipe Throughput 12440.0 6448892.8 5184.0
Pipe-based Context Switching 4000.0 1042165.6 2605.4
Process Creation 126.0 49775.5 3950.4
Shell Scripts (1 concurrent) 42.4 30791.5 7262.1
Shell Scripts (8 concurrent) 6.0 3925.3 6542.2
System Call Overhead 15000.0 10868299.5 7245.5
========
System Benchmarks Index Score 4395.0
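The three numeric columns in the tables above are related by simple arithmetic: each INDEX is RESULT divided by BASELINE, times 10, and the final System Benchmarks Index Score is the geometric mean of the INDEX column. The sketch below reproduces both steps with awk. The function names `ub_index` and `ub_score` are made up for illustration and are not part of UnixBench.

```shell
# ub_index BASELINE RESULT: index of a single test, RESULT / BASELINE * 10.
ub_index() {
  LC_ALL=C awk -v b="$1" -v r="$2" 'BEGIN { printf "%.1f\n", r / b * 10 }'
}

# ub_score: geometric mean of the index values read from stdin, one per line.
ub_score() {
  LC_ALL=C awk '$1 > 0 { sum += log($1); n++ }
                END { if (n) printf "%.1f\n", exp(sum / n) }'
}

# Dhrystone row of the Ali Cloud ECS run.
ub_index 116700.0 28329189.2    # prints 2427.5
```

Piping the twelve INDEX values of one run into `ub_score`, one per line, should give a figure close to the score printed under that table. Small differences are possible because the table shows the indices rounded to one decimal place.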
[ Benchmarking - sysbench ]
TBD
[ Benchmarking - hdparm ]
[ Ali ECS - 1 CPU, 1G Memory, Normal HardDisk ]
root@db:~# hdparm -t /dev/xvda1
/dev/xvda1:
Timing buffered disk reads: 122 MB in 3.04 seconds = 40.14 MB/sec
[ Lenovo Desktop ]
root@eric-pc:/etc/ssh# hdparm -t /dev/sda9
/dev/sda9:
Timing buffered disk reads: 342 MB in 3.01 seconds = 113.74 MB/sec
[ Decitone RGW ]
root@dcu ~# hdparm -t /dev/sda2
/dev/sda2:
Timing buffered disk reads: 180 MB in 3.03 seconds = 59.43 MB/sec
[ Decitone EZM ]
root@meet ~# hdparm -t /dev/sda2
/dev/sda2:
Timing buffered disk reads: 302 MB in 3.07 seconds = 98.27 MB/sec
[ Benchmarking - dd ]
[ Ali Cloud ECS ]
root@db:~# dd bs=64k count=4k if=/dev/zero of=test oflag=dsync
4096+0 records in
4096+0 records out
268435456 bytes (268 MB) copied, 24.812 s, 10.8 MB/s
root@db:~# dd bs=64k count=4k if=/dev/zero of=test conv=fdatasync
4096+0 records in
4096+0 records out
268435456 bytes (268 MB) copied, 6.50773 s, 41.2 MB/s
root@db:~# echo 3 > /proc/sys/vm/drop_caches
root@db:~# dd bs=64k count=4k if=test of=/dev/null
4096+0 records in
4096+0 records out
268435456 bytes (268 MB) copied, 6.10765 s, 44.0 MB/s
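The gap between the two write runs comes from how often dd waits for the disk. With oflag=dsync every 64k block is written synchronously, so the run pays for 4096 separate waits. With conv=fdatasync the data is flushed once, at the end. The sketch below redoes the arithmetic behind the reported figures. `dd_rate` and `dd_ms_per_write` are hypothetical helper names, and the rate uses 1 MB = 1,000,000 bytes to match the "268 MB" dd prints for 268435456 bytes.

```shell
# dd_rate BYTES SECONDS: throughput in MB/s (1 MB = 1,000,000 bytes).
dd_rate() {
  LC_ALL=C awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# dd_ms_per_write SECONDS COUNT: average time per block in milliseconds.
dd_ms_per_write() {
  LC_ALL=C awk -v s="$1" -v n="$2" 'BEGIN { printf "%.2f\n", s / n * 1000 }'
}

dd_rate 268435456 24.812      # oflag=dsync write, prints 10.8
dd_rate 268435456 6.50773     # conv=fdatasync write, prints 41.2
dd_ms_per_write 24.812 4096   # prints 6.06 (ms per synchronous 64k write)
```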
[ Benchmarking - fio ]
[ Ali Cloud ECS - 1 CPU, 1G Memory, Normal HardDisk ]
root@db:/dev# fio -filename=/dev/xvda -direct=1 -iodepth 1 -thread -rw=randread -ioengine=psync -bs=64k -size=10G -numjobs=10 -runtime=60 -group_reporting -name=mytes
Starting 10 threads
read : io=1868.9MB, bw=31887KB/s, iops=498, runt= 60016msec
Disk stats (read/write):
xvda: ios=59754/23, merge=0/12, ticks=1197944/548, in_queue=1198632, util=99.94%
[ Ali Cloud ECS - 1 CPU, 1G Memory, Normal HardDisk ]
root@db:/dev# fio -filename=/dev/xvda -direct=1 -iodepth 1 -thread -rw=randread -ioengine=psync -bs=64k -size=10G -numjobs=10 -runtime=60 -group_reporting -name=mytes
mytes: (g=0): rw=randread, bs=64K-64K/64K-64K/64K-64K, ioengine=psync, iodepth=1
...
mytes: (g=0): rw=randread, bs=64K-64K/64K-64K/64K-64K, ioengine=psync, iodepth=1
fio-2.1.3
Starting 10 threads
Jobs: 10 (f=10): [rrrrrrrrrr] [100.0% done] [32095KB/0KB/0KB /s] [501/0/0 iops] [eta 00m:00s]
mytes: (groupid=0, jobs=10): err= 0: pid=22759: Sat Sep 17 07:53:16 2016
read : io=1868.9MB, bw=31887KB/s, iops=498, runt= 60016msec
clat (usec): min=598, max=503402, avg=20063.96, stdev=14124.53
lat (usec): min=599, max=503403, avg=20064.29, stdev=14124.53
clat percentiles (msec):
| 1.00th=[ 3], 5.00th=[ 9], 10.00th=[ 11], 20.00th=[ 13],
| 30.00th=[ 15], 40.00th=[ 17], 50.00th=[ 18], 60.00th=[ 21],
| 70.00th=[ 23], 80.00th=[ 26], 90.00th=[ 31], 95.00th=[ 36],
| 99.00th=[ 57], 99.50th=[ 88], 99.90th=[ 190], 99.95th=[ 245],
| 99.99th=[ 457]
bw (KB /s): min= 992, max= 7552, per=10.03%, avg=3198.62, stdev=606.67
lat (usec) : 750=0.02%, 1000=0.06%
lat (msec) : 2=0.68%, 4=0.93%, 10=6.40%, 20=51.84%, 50=38.79%
lat (msec) : 100=0.85%, 250=0.38%, 500=0.04%, 750=0.01%
cpu : usr=0.03%, sys=0.12%, ctx=30005, majf=0, minf=167
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=29902/w=0/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=1868.9MB, aggrb=31886KB/s, minb=31886KB/s, maxb=31886KB/s, mint=60016msec, maxt=60016msec
Disk stats (read/write):
xvda: ios=59754/23, merge=0/12, ticks=1197944/548, in_queue=1198632, util=99.94%
root@db:/dev#
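The headline numbers in the fio report can be cross-checked against each other. IOPS is the issued read count divided by the runtime, bandwidth is that count times the 64 KB block size over the same runtime, and because ten threads each keep exactly one request in flight (psync, iodepth=1), Little's law predicts IOPS as roughly the thread count divided by the average latency. The helper names below are invented for this sketch and are not fio options.

```shell
# fio_iops IO_COUNT RUNTIME_MS: completed I/Os per second (truncated).
fio_iops() {
  LC_ALL=C awk -v n="$1" -v ms="$2" 'BEGIN { printf "%d\n", n / (ms / 1000) }'
}

# fio_bw_kb IO_COUNT BLOCK_KB RUNTIME_MS: bandwidth in KB/s (rounded).
fio_bw_kb() {
  LC_ALL=C awk -v n="$1" -v bs="$2" -v ms="$3" \
    'BEGIN { printf "%.0f\n", n * bs / (ms / 1000) }'
}

# littles_iops THREADS AVG_LAT_USEC: IOPS predicted as concurrency / latency.
littles_iops() {
  LC_ALL=C awk -v c="$1" -v us="$2" 'BEGIN { printf "%d\n", c / (us / 1000000) }'
}

fio_iops 29902 60016        # issued r=29902 over runt=60016msec, prints 498
fio_bw_kb 29902 64 60016    # prints 31887
littles_iops 10 20064.29    # 10 threads, avg lat 20064.29 usec, prints 498
```

All three agree with the report (iops=498, bw=31887KB/s). With the device at 99.94% utilisation, this suggests the disk itself is the limit, so adding threads would mostly raise latency.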
[ ping ]
* To test accessibility (from various locations)
[ Web Performance Benchmarking ]
* ApacheBench
Ali ECS 1 CPU, 1G Memory, 4M WAN
ab -n 100000 -c 50 -r http://www.lecomm.net:82/index.html
Server: CPU load is low; bandwidth is the bottleneck.
Server Software: nginx/1.10.0
Server Hostname: www.lecomm.net
Server Port: 82
Document Path: /index.html
Document Length: 775 bytes
Concurrency Level: 50
Time taken for tests: 249.561 seconds
Complete requests: 100000
Failed requests: 0
Total transferred: 101700000 bytes
HTML transferred: 77500000 bytes
Requests per second: 400.70 [#/sec] (mean)
Time per request: 124.780 [ms] (mean)
Time per request: 2.496 [ms] (mean, across all concurrent requests)
Transfer rate: 397.97 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0  109  332.7      0   7010
Processing:     0   16   66.4      1   3267
Waiting:        0   16   66.4      1   3267
Total:          0  124  339.8      1   7010
Percentage of the requests served within a certain time (ms)
  50%      1
  66%      2
  75%      2
  80%      3
  90%    998
  95%   1000
  98%   1002
  99%   1006
 100%   7010 (longest request)
Ali ECS 1 CPU, 1G Memory, Ali LAN, Nginx as web server (not reverse proxy)
# ab -n 200000 -c 200 -r http://www.ali.lecomm.net:82/index.html
Server Load: CPU>95% (si>60%)
Server Software: nginx/1.10.0
Server Hostname: www.ali.lecomm.net
Server Port: 82
Document Path: /index.html
Document Length: 775 bytes
Concurrency Level: 200
Time taken for tests: 34.034 seconds
Complete requests: 200000
Failed requests: 0
Total transferred: 203400000 bytes
HTML transferred: 155000000 bytes
Requests per second: 5876.41 [#/sec] (mean)
Time per request: 34.034 [ms] (mean)
Time per request: 0.170 [ms] (mean, across all concurrent requests)
Transfer rate: 5836.24 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   11  105.3      0   3007
Processing:     5   22   21.6     22   1653
Waiting:        2   22   21.6     22   1653
Total:          9   34  110.2     22   3026
Percentage of the requests served within a certain time (ms)
  50%     22
  66%     22
  75%     23
  80%     23
  90%     24
  95%     25
  98%     29
  99%   1018
 100%   3026 (longest request)
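The "bandwidth is the bottleneck" reading of the WAN run can be checked from ab's own totals. Requests per second is complete requests over elapsed time, and the link usage is total bytes transferred times 8 over the same time. The sketch below assumes that "4M WAN" means a 4 Mbit/s link and uses hypothetical helper names. ab counts only application-level bytes, so TCP/IP overhead comes on top of the figure it yields.

```shell
# ab_rps REQUESTS SECONDS: mean requests per second.
ab_rps() {
  LC_ALL=C awk -v n="$1" -v s="$2" 'BEGIN { printf "%.2f\n", n / s }'
}

# ab_mbps BYTES SECONDS: transfer rate in Mbit/s (1 Mbit = 1,000,000 bits).
ab_mbps() {
  LC_ALL=C awk -v b="$1" -v s="$2" 'BEGIN { printf "%.2f\n", b * 8 / s / 1000000 }'
}

ab_rps 100000 249.561       # WAN run, prints 400.70
ab_mbps 101700000 249.561   # WAN run, prints 3.26
```

At about 3.26 Mbit/s of payload and headers, the WAN run sits near the assumed 4 Mbit/s ceiling. That fits the low server CPU load and the connect times clustering around one second in the slowest 10% of requests.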
* Webbench
wget http://www.ha97.com/code/webbench-1.5.tar.gz
tar -xzvf webbench-1.5.tar.gz
cd webbench-1.5
apt-get install ctags make; make install
webbench -c 200 -t 10 http://www.lecomm.net:82/index.html
* Speed=25962 pages/min, 440055 bytes/sec.
* Requests: 25962 susceed, 0 failed.
* all the requests are ok.
* download bandwidth is the bottleneck.
webbench -c 1000 -t 360 http://www.ali.lecomm.net:82/index.html
* some requests failed.
* CPU is the bottleneck (%si>60%! %cpu>90%).
[ HTTP PUT ]
* To test upload bandwidth
