Linpack on Linux: parameter configuration and run commands for testing CPU performance

千与千与千

Category: Problem Be Solved    Tags: linux, linpack

Article link: https://blog.csdn.net/liu_feng_zi_/article/details/107416291

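I. Parameter explanation

HPL only runs correctly, and only reaches good performance, when the parameters in HPL.dat are set appropriately.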

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
8            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N)
80000        Ns
1            # of NBs
1024         NBs
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
1            Ps
1            Qs
16.0         threshold
1            # of panel fact
1            PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
4            NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
1            RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
0            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
2            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)

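1. Lines 1 and 2 are comment lines and do not need to be changed.

2. Line 3 gives the name of the output file, if results are written to a file.

3. Line 4 selects where the results go: 6 sends them to standard output (stdout), 7 sends them to standard error (stderr), and any other value writes them to the file named on line 3. The sample above uses 8, so the results go to HPL.out.

4. Line 5 gives the number of problem sizes (matrices) to solve, that is, how many values appear on line 6.

5. Line 6 sets the order N of each matrix; the number of values on this line must equal the count given on line 5. Most online sources recommend choosing N so that N × N × 8 bytes is about 80% of total system memory.

6. Line 7 gives the number of block sizes to try, that is, how many values appear on line 8.

7. Line 8 gives each block size NB. HPL uses a block-partitioned algorithm to improve data locality and therefore overall performance. The best NB is found mainly by actual testing.

8. Line 9 selects whether processes are mapped onto the process grid in row-major (0) or column-major (1) order.

9. Lines 10-12 describe the two-dimensional process grid (P × Q). The grid must satisfy P × Q = number of processes. This is a hard requirement: HPL refuses to run if fewer than P × Q processes are started.

10. The remaining values can be left at their defaults.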
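The sizing rule in item 5 can be turned into a small calculation. The following is a minimal Python sketch, not taken from the original article: the helper name `suggest_n` is hypothetical, and rounding N down to a multiple of NB is an assumption based on common practice rather than a rule stated above.

```python
import math


def suggest_n(total_mem_bytes: int, nb: int, fraction: float = 0.8) -> int:
    """Suggest a matrix order N with N*N*8 bytes <= fraction * total memory.

    Hypothetical helper for illustration. Rounding down to a multiple of
    NB is an assumption (a common convention), not an HPL requirement.
    """
    # Each matrix element is a double (8 bytes).
    max_elements = int(total_mem_bytes * fraction) // 8
    n = math.isqrt(max_elements)   # largest N with N*N <= max_elements
    return (n // nb) * nb          # round down to a multiple of NB


if __name__ == "__main__":
    mem = 64 * 1024**3  # example value: a node with 64 GiB of RAM
    print(suggest_n(mem, nb=1024))
```

For comparison, the N = 80000 used in the sample HPL.dat needs 80000 × 80000 × 8 bytes = 51.2 GB for the matrix, which is what the 80% rule would give for 64 GB of memory.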

II. Running on a single node

Command:

./xhpl

Result:

================================================================================
HPLinpack 2.3  --  High-Performance Linpack benchmark  --   December 2, 2018
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :   80000 
NB     :    1024 
PMAP   : Row-major process mapping
P      :       1 
Q      :       1 
PFACT  :   Crout 
NBMIN  :       4 
NDIV   :       2 
RFACT  :   Crout 
BCAST  :   1ring 
DEPTH  :       2 
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR20C2C4       80000  1024     1     1             729.77             4.6774e+02
HPL_pdgesv() start time Fri Jul 17 09:24:43 2020

HPL_pdgesv() end time   Fri Jul 17 09:36:53 2020

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=   2.16389188e-03 ...... PASSED
================================================================================

Finished      1 tests with the following results:
              1 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================
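As a sanity check on the result line, the Gflops figure can be reproduced from N and Time. The sketch below assumes the usual operation count for LU factorization plus solve, 2/3·N³ + 3/2·N²; the function name `hpl_gflops` is made up for this example.

```python
def hpl_gflops(n: int, seconds: float) -> float:
    """Return Gflops for solving an order-n linear system in `seconds`.

    Assumes the operation count 2/3*n^3 + 3/2*n^2 (LU factorization
    plus solve). `hpl_gflops` is a hypothetical name for illustration.
    """
    flops = (2.0 / 3.0) * n**3 + (3.0 / 2.0) * n**2
    return flops / seconds / 1e9


if __name__ == "__main__":
    # N and Time are taken from the WR20C2C4 result line above.
    print(f"{hpl_gflops(80000, 729.77):.2f} Gflops")
```

With N = 80000 and Time = 729.77 s this gives about 467.74 Gflops, which agrees with the 4.6774e+02 reported in the output. The start and end timestamps (09:24:43 to 09:36:53) are also consistent with a run of roughly 730 seconds.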

III. Running on multiple nodes

Command:

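Method 1:
mpirun -np N xhpl    (here N is the number of MPI processes, not the matrix order; it has to match P × Q in HPL.dat)

Method 2:
mpirun -p4pg <p4file> xhpl    (you write the configuration file yourself; p4file specifies which node each process runs on. The -p4pg option belongs to older MPICH releases built with the ch_p4 device.)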
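Because P × Q in HPL.dat has to match the process count passed to mpirun, it can help to derive both from a single number. The following Python sketch is illustrative only: the helper names `choose_grid` and `mpirun_command` are hypothetical, and preferring a near-square grid with P ≤ Q is a common tuning guideline assumed here, not something stated in this article.

```python
import math
import shlex


def choose_grid(nprocs: int) -> tuple[int, int]:
    """Return (P, Q) with P * Q == nprocs, P <= Q, and P as large as possible.

    Hypothetical helper; the near-square preference is an assumption.
    """
    if nprocs < 1:
        raise ValueError("nprocs must be positive")
    p = math.isqrt(nprocs)
    # Walk down from floor(sqrt(nprocs)) until p divides nprocs exactly.
    while nprocs % p != 0:
        p -= 1
    return p, nprocs // p


def mpirun_command(nprocs: int, binary: str = "./xhpl") -> str:
    """Build the `mpirun -np <nprocs> <binary>` command line as a string."""
    return shlex.join(["mpirun", "-np", str(nprocs), binary])


if __name__ == "__main__":
    nprocs = 16  # example value
    p, q = choose_grid(nprocs)
    print(f"Set Ps = {p} and Qs = {q} in HPL.dat, then run:")
    print(mpirun_command(nprocs))
```

The Ps and Qs values printed by the sketch go on lines 11 and 12 of HPL.dat, replacing the 1 × 1 grid used for the single-node run.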
