HPL Test

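The HPL test has to be run on a Linux system. For various reasons I have no supercomputing environment available, so I ran it on Ubuntu installed on my own computer, which inevitably makes the benchmark score very low. Be sure to state your hardware configuration in the proposal.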

System information

Environment setup

Before installing HPL, the following must be set up:

1. An MPICH parallel environment
2. A BLAS/CBLAS library

For the MPICH installation, see the tutorial on my blog. After a successful install, you can run mpirun --version to check the MPICH version.
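The version check above can also be scripted, for example to record the MPI launcher details alongside the hardware configuration in the proposal. This is only a minimal sketch: the helper name `launcher_version` is hypothetical, and it assumes nothing beyond `mpirun` accepting the `--version` flag mentioned above.

```python
import shutil
import subprocess


def launcher_version(command="mpirun"):
    """Return the text printed by `<command> --version`, or None if the
    command is not on PATH. (Hypothetical helper, for illustration only.)"""
    path = shutil.which(command)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some launchers print to stderr instead of stdout, so fall back to it.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    info = launcher_version()
    print(info if info is not None else "mpirun not found on PATH")
```

If the function returns None, the MPICH bin directory is probably missing from PATH.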
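As for installing the BLAS/CBLAS library and HPL itself, there are already plenty of good tutorials online, so I will not go into the details here. I recommend this one: https://www.cnblogs.com/zhyantao/p/10614238.html
However, do not follow that tutorial for the MPICH installation, otherwise you will have to reinstall MPICH for the later HPCG test. I recommend following my MPICH tutorial instead: once it is installed for the HPL test, it can be used directly for HPCG.
If you followed the steps carefully, you will end up with HPL.dat and xhpl.
Modify the parameters in HPL.dat as follows: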

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
4            # of problems sizes (N)
200 500 1000 2000         Ns
3            # of NBs
64 128 192      NBs
0            PMAP process mapping (0=Row-,1=Column-major)
3            # of process grids (P x Q)
2 1 2        Ps
2 2 1        Qs
16.0         threshold
1            # of panel fact
1            PFACTs (0=left, 1=Crout, 2=Right)
2            # of recursive stopping criterium
4 8          NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
2            RFACTs (0=left, 1=Crout, 2=Right)
2            # of broadcast
1 3          BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
2            # of lookahead depth
0 1          DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
60           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
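HPL runs one test for every combination of the values listed in this file, which is why the output gets long. The sketch below copies the values from the HPL.dat above into a hypothetical `params` dictionary, counts the combinations, and works out how many MPI processes the largest P x Q grid needs. The dictionary layout is an illustration only, not something HPL reads.

```python
from itertools import product

# Values copied by hand from the HPL.dat above (hypothetical layout).
params = {
    "N": [200, 500, 1000, 2000],
    "NB": [64, 128, 192],
    "grid": [(2, 2), (1, 2), (2, 1)],  # (P, Q) pairs, read column-wise from Ps and Qs
    "PFACT": [1],
    "NBMIN": [4, 8],
    "NDIV": [2],
    "RFACT": [2],
    "BCAST": [1, 3],
    "DEPTH": [0, 1],
}

# One HPL run per element of the Cartesian product of all value lists.
runs = list(product(*params.values()))

# HPL needs at least P * Q processes for the largest grid it is asked to use.
min_procs = max(p * q for p, q in params["grid"])

print(len(runs))   # 288
print(min_procs)   # 4
```

The count of 288 matches the "Finished 288 tests" line in the output, and the 2 x 2 grid means the job must be launched with at least 4 processes.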

Test results

Running the HPL test produces HPL-Benchmark.txt. The results are as follows:

================================================================================
HPLinpack 2.3  --  High-Performance Linpack benchmark  --   December 2, 2018
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :     200      500     1000     2000 
NB     :      64      128      192 
PMAP   : Row-major process mapping
P      :       2        1        2 
Q      :       2        2        1 
PFACT  :   Crout 
NBMIN  :       4        8 
NDIV   :       2 
RFACT  :   Right 
BCAST  :  1ringM   2ringM 
DEPTH  :       0        1 
SWAP   : Mix (threshold = 60)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR01R2C4         200    64     2     2               0.00             3.0405e+00
HPL_pdgesv() start time Wed Jan  6 18:39:44 2021

HPL_pdgesv() end time   Wed Jan  6 18:39:44 2021

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=   1.02421835e-02 ...... PASSED
#............................ (intermediate results omitted)
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR13R2C4        2000   192     2     1               0.74             7.1749e+00
HPL_pdgesv() start time Wed Jan  6 18:40:32 2021

HPL_pdgesv() end time   Wed Jan  6 18:40:33 2021

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=   5.68748871e-03 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR13R2C8        2000   192     2     1               0.70             7.6763e+00
HPL_pdgesv() start time Wed Jan  6 18:40:33 2021

HPL_pdgesv() end time   Wed Jan  6 18:40:34 2021

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=   5.68748871e-03 ...... PASSED
================================================================================

Finished    288 tests with the following results:
            288 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================
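With this many runs, it is easier to pick out the result rows with a short script than to read the file by eye. The following sketch assumes the row layout shown in the output above (T/V, N, NB, P, Q, Time, Gflops). The regular expression and the `parse_results` helper are my own illustration, and the sample text contains only the three rows visible in this excerpt, not the full file.

```python
import re

# One result row: T/V code, then N, NB, P, Q, Time, Gflops.
ROW = re.compile(
    r"^(W[RC]\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+([\d.]+)\s+([\d.eE+-]+)$"
)


def parse_results(text):
    """Return one dict per HPL result row found in `text`."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            tv, n, nb, p, q, seconds, gflops = match.groups()
            rows.append({
                "tv": tv, "N": int(n), "NB": int(nb), "P": int(p), "Q": int(q),
                "time": float(seconds), "gflops": float(gflops),
            })
    return rows


# Only the three rows shown in the excerpt above.
sample = """\
WR01R2C4         200    64     2     2               0.00             3.0405e+00
WR13R2C4        2000   192     2     1               0.74             7.1749e+00
WR13R2C8        2000   192     2     1               0.70             7.6763e+00
"""

rows = parse_results(sample)
best = max(rows, key=lambda r: r["gflops"])
print(best["tv"], best["N"], best["NB"], best["gflops"])  # WR13R2C8 2000 192 7.6763
```

To scan a real run, pass the contents of HPL-Benchmark.txt to `parse_results` instead of `sample`. The best row over all 288 tests may differ from the best of these three.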

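The output file is very long, but most of it is the same test repeated over the different parameter combinations. The final run (N=2000, NB=192, P=2, Q=1) reports 7.6763 Gflops.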
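As a sanity check on that figure, the Gflops column can be approximately reproduced from N and the time. The sketch below assumes the usual LU operation count of 2/3 N^3 + 3/2 N^2 floating-point operations, which to my knowledge is what HPL uses. The function name `hpl_gflops` is hypothetical.

```python
def hpl_gflops(n, seconds):
    """Estimate the HPL rate in Gflops for an order-n system solved in `seconds`.

    Assumes the standard operation count 2/3*n^3 + 3/2*n^2.
    """
    flops = (2.0 / 3.0) * n**3 + (3.0 / 2.0) * n**2
    return flops / seconds / 1e9


print(round(hpl_gflops(2000, 0.70), 2))  # 7.63
```

The small gap from the reported 7.6763 is consistent with the Time column being rounded to two decimals: a time of about 0.6956 s gives 7.6763 Gflops and still prints as 0.70.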

posted @ 2021-01-15 23:14  Treasure_lee