Design of HPC system

1. Design of HPC system

ASC Student Supercomputer Challenge 2020

1.1 Hardware resources

| Item | Name | Configuration | Quantity |
| --- | --- | --- | --- |
| Server | Inspur NF5280M5 | CPU: Intel Xeon Gold 6230 x 2, 2.1 GHz, 20 cores; Memory: 32 GB x 12, DDR4, 2933 MHz; Hard disk: 480 GB SSD SATA x 1; Power consumption estimation: 6230 TDP 125 W, memory 7.5 W, hard disk 10 W | 4 |
| HCA card | EDR | InfiniBand Mellanox ConnectX®-5 HCA card, single port QSFP, EDR IB; Power consumption estimation: 9 W | 4 |
| Switch | GbE switch | 10/100/1000 Mb/s, 24-port Ethernet switch; Power consumption estimation: 30 W | 1 |
| Switch | EDR-IB switch | Switch-IB™ EDR InfiniBand switch, 36 QSFP ports; Power consumption estimation: 130 W | 1 |
| Cable | Gigabit CAT6 cables | CAT6 copper cable, blue, 3 m | 1 |
| Cable | InfiniBand cable | InfiniBand EDR copper cable, QSFP port, for use with the InfiniBand switch | 1 |
| GPU | NVIDIA Tesla V100-PCI-E | | 8 |
| Hard disk | Samsung 970 EVO NVMe M.2 (250 GB) | | 4 |
| Memory | Kingston DDR4 2933 MHz Server Premier (KSM29RD4/32ME) | | 48 |

1.2 Software resources


| Item | Name |
| --- | --- |
| Operating system | Debian-9.4 |
| Job scheduling system | Slurm-19.05.6 |
| File system | zfs-0.6.5.9-1 |
| Hardware monitoring software | Zabbix 4.0 + Grafana 6.0 |
| Package manager | Spack-3.7 |
| Cluster management software | Clustershell-1.7.3 |
| Parallel environment | MPICH-3.3.2; Intel MPI-5.1.3; Open MPI-3.1.2 |
| Application development environment | Python-3.6.6; GNU compiler-4.4.7; PGI compiler-17.4; Intel compiler-12.10; BLAS-3.8, LAPACK-3.1.1, FFTW-3.2.2, Intel MKL-2018.0.3; Anaconda2-4.3.0, cuda9.1, pytorch-gpu-1.0.1, hpl-2.3, hpcg-3.0 |

1.3 Cluster analysis

1.3.1 Architecture diagram


[Figure: cluster architecture diagram (image not available)]

1.3.2 Cluster power

| Item | Power |
| --- | --- |
| CPU | 100 W * 8 |
| GPU | 250 W * 8 |
| HCA card | 9 W * 4 |
| Switch | 30 W + 130 W |
| Hard disk + Memory | 7.5 W + 10 W |
| Sum | 2933.5 W |
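
The power budget above is simple arithmetic, so it can be re-checked with a short script. The following is a minimal sketch, assuming each row of the table is read as "watts per unit times unit count" and that memory and hard disk are each counted once, as the table writes them. The names `POWER_ROWS` and `total_power` are illustrative and not from the original design. Summing the rows exactly as listed gives 3013.5 W rather than the 2933.5 W in the sum row; the source does not explain the difference, so the per-item figures may need to be adjusted to whatever the authors actually measured.

```python
# Per-item power figures copied from the cluster power table:
# item name -> (watts per unit, number of units).
# Assumption: memory and hard disk are counted once each, as in the table.
POWER_ROWS = {
    "CPU": (100.0, 8),
    "GPU": (250.0, 8),
    "HCA card": (9.0, 4),
    "GbE switch": (30.0, 1),
    "EDR-IB switch": (130.0, 1),
    "Memory": (7.5, 1),
    "Hard disk": (10.0, 1),
}


def total_power(rows):
    """Return the total power in watts for a dict of (watts, count) rows."""
    return sum(watts * count for watts, count in rows.values())


if __name__ == "__main__":
    for name, (watts, count) in POWER_ROWS.items():
        print(f"{name:<14} {watts:>6.1f} W x {count} = {watts * count:>7.1f} W")
    print(f"{'Total':<14} {total_power(POWER_ROWS):.1f} W")
```

Keeping the figures in one table-like structure makes it easy to try alternative estimates, for example the 125 W TDP listed for the Xeon Gold 6230 in the hardware table instead of the 100 W used here.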

1.3.3 Floating-point performance


CPU theoretical floating-point peak:

$$
1.512\ \text{GHz} \times 2 \times 2 \times \frac{512}{64} \times 8 \times 20 = 7741.44\ \text{GFLOPS} \approx 7.74\ \text{TFLOPS}
$$

GPU theoretical floating-point peak:

$$
7.0\ \text{TFLOPS} \times 8 = 56.0\ \text{TFLOPS}
$$
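
The two peak figures can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' own calculation: the CPU product is taken as 1.512 GHz, 2, 2, 512/64, 8 and 20, and the meaning attached to each factor below (two FMA units, two floating-point operations per FMA, 512-bit vectors of 64-bit doubles, 8 CPU sockets, 20 cores per socket) is an assumed reading that the source does not spell out. The function and parameter names are illustrative.

```python
def cpu_peak_gflops(freq_ghz, fma_units, flops_per_fma,
                    vector_bits, word_bits, sockets, cores_per_socket):
    """Theoretical double-precision peak in GFLOPS.

    The factor labels are an assumed interpretation of the source formula.
    """
    lanes = vector_bits // word_bits  # 512 / 64 = 8 doubles per vector
    return (freq_ghz * fma_units * flops_per_fma * lanes
            * sockets * cores_per_socket)


def gpu_peak_tflops(tflops_per_gpu, gpu_count):
    """Theoretical peak in TFLOPS for identical GPUs."""
    return tflops_per_gpu * gpu_count


if __name__ == "__main__":
    cpu = cpu_peak_gflops(1.512, 2, 2, 512, 64, 8, 20)
    gpu = gpu_peak_tflops(7.0, 8)
    print(f"CPU peak: {cpu:.2f} GFLOPS ({cpu / 1000:.2f} TFLOPS)")
    print(f"GPU peak: {gpu:.1f} TFLOPS")
    print(f"Combined: {cpu / 1000 + gpu:.2f} TFLOPS")
```

Writing the factors as named parameters makes it straightforward to substitute a different clock frequency or node count when the configuration changes.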

1.3.4 Pros and cons

1.3.4.1 Advantages:


1. High scalability. Nodes can be added easily for system expansion and upgrades, and the cluster software also reduces the hardware requirements.

2. Simple management and installation. The simple architecture maximizes performance and can be installed quickly for practical applications.

3. Rich application software. Middleware handles coordination and communication between nodes, so that all nodes in the system can genuinely cooperate and balance load.

4. Advanced design. A high-speed InfiniBand interconnect forms the computing environment, and the parallel computing support software and job scheduling system make the nodes work together.

5. Strong computing power. Parallel computing and powerful GPU processing capabilities meet stringent requirements for running speed.

6. Visual hardware management. The visual hardware management system controls cluster power more accurately and efficiently within the required range.

1.3.4.2 Disadvantages:

1. Limited stability. The system is positioned as a small HPC cluster in which the nodes are not set up independently of one another, so there is still room to improve its stability.

2. Single management node. Only one management node manages metadata. Once the cluster reaches a certain scale, the management node will be overloaded and become the system bottleneck.

3. Management overhead. Cluster management software may reduce the computing speed of the cluster to a certain extent.
