Hi3559AV100 SVP NNIE Functional Simulation and CUDA Acceleration

HiSilicon's Hi3559AV100 NNIE can be simulated on a PC from within Ruyi Studio.

CUDA acceleration for simulation

Functional simulation can be configured to use CUDA acceleration. The setup is described below.

Requirements

Windows 7 or 10, CUDA >= 8.0, GPU memory >= 6 GB
Download CUDA from the NVIDIA website:
https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64

Supported architectures

SM_30, SM_35
SM_50, SM_52, SM_53
SM_60, SM_61, SM_62
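
The SM names above correspond to CUDA compute capabilities (for example, SM_52 is compute capability 5.2). A minimal sketch of checking a GPU against this list follows; the helper names `sm_name` and `is_supported` are illustrative and not part of the HiSilicon tooling, and how the simulator itself behaves on an unlisted architecture is not stated in the source.

```python
# Supported SM targets, copied from the list above.
SUPPORTED_SM = {
    "SM_30", "SM_35",
    "SM_50", "SM_52", "SM_53",
    "SM_60", "SM_61", "SM_62",
}


def sm_name(major: int, minor: int) -> str:
    """Format a CUDA compute capability (major, minor) as an SM_xy name."""
    return f"SM_{major}{minor}"


def is_supported(major: int, minor: int) -> bool:
    """Return True if the compute capability appears in the list above."""
    return sm_name(major, minor) in SUPPORTED_SM


print(is_supported(5, 2))  # compute capability 5.2 is listed
print(is_supported(7, 5))  # compute capability 7.5 is not listed
```

The compute capability of a given card can be looked up in NVIDIA's published tables or read from the CUDA `deviceQuery` sample.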

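Running the simulation

Default simulation settings

The simulator reads its settings from a fixed absolute path:
Windows: C:\hisilicon\nnie_sim.ini
Linux: /usr/hisilicon/nnie_sim.ini
The default configuration is: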

#===== CONFIG =====
[LAYER_LINEAR_PRINT_EN]
0
[CUDA_CALC_EN]
0
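
The file is not a standard key=value ini: each `[NAME]` line is followed by its value on the next line. A minimal sketch of reading that layout is shown below. The function name `parse_nnie_sim_ini` is hypothetical, and treating lines that start with `#` as comments is an assumption based only on the `#===== CONFIG =====` line above.

```python
def parse_nnie_sim_ini(text: str) -> dict:
    """Parse '[KEY]' lines, each followed by a value line, into a dict of strings."""
    settings = {}
    key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        if line.startswith("[") and line.endswith("]"):
            key = line[1:-1]       # remember the name until its value arrives
        elif key is not None:
            settings[key] = line   # the line after a [KEY] header is its value
            key = None
    return settings


DEFAULT_INI = """#===== CONFIG =====
[LAYER_LINEAR_PRINT_EN]
0
[CUDA_CALC_EN]
0
"""

print(parse_nnie_sim_ini(DEFAULT_INI))
```

Values are kept as strings so that non-numeric settings such as `DBG` for `[LOG_LEVEL]` pass through unchanged.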

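Running the default sample.exe produces the console output below. In this run the ini file could not be opened, so the simulator fell back to its built-in defaults: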

Sun Sep  8 22:18:21 2019 E config openfile failure(path: C:/hisilicon/nnie_sim.ini)
Sun Sep  8 22:18:21 2019 W C:/hisilicon/nnie_sim.ini load failure, using default setting
Sun Sep  8 22:18:21 2019 I ============= nnie_sim.ini setting ==============
Sun Sep  8 22:18:21 2019 I [LOG_LEVEL]                      DBG
Sun Sep  8 22:18:21 2019 I =================================================
SvpSampleCnnClfLenet start ...
SVP Version: HiSVP_PC_V1.1.1.0_B030
build sha1: 86265e2c
      date: Jun 20 2018
      time: 15:41:12
Read version info from WK header
      sdk_arch_ver: 	1.1
      nnie_arch_ver: 	1.1
      nnie_mapper_ver: 	1.1.1.0
      test_ver: 	B030
      patch_ver: 	P000
Sun Sep  8 22:18:30 2019 I >>>>>>>>> createNet >>>>>>>>> virAddr = 0x1743fd0
Sun Sep  8 22:18:30 2019 D MAIN HI_MPI_SVP_NNIE_LoadModel():152 net ptr addr: 000000000031F740
Sun Sep  8 22:18:30 2019 D NET Net::setDumpPath():385 [DUMP_INDIVIDUAL_EN] = 0
Sun Sep  8 22:18:30 2019 D NET Net::setDumpPath():401 set net DumpPath: ./
Sun Sep  8 22:18:30 2019 D NET Net::setUp():255 model net base: 0000000001743FD0, segment num: 1
Sun Sep  8 22:18:30 2019 D NET Net::setUp():263 model seg(0) base: 0000000001744010
Sun Sep  8 22:18:30 2019 I create net seg(0), type: 205
Sun Sep  8 22:18:30 2019 D SEG Segment::DumpParamHdr():470 netType: 0 srcNodeNum: 1, dstNodeNum: 1
Sun Sep  8 22:18:30 2019 D SEG Segment::DumpParamHdr():471 tmpBufSize: 4734976, netSegSize: 96, netSegLayerNum: 9
Sun Sep  8 22:18:30 2019 D SEG Segment::DumpParamHdr():472 maxT: 0
Sun Sep  8 22:18:30 2019 D SEG Segment::DumpSrcNodeInfo():483 srcNode width: 28, height: 28, channel: 1, fmt: 1, isvec: 0, nodeid: 0
Sun Sep  8 22:18:30 2019 D SEG Segment::DumpDstNodeInfo():495 dstNode width: 1, height: 1, channel: 10, fmt: 0,  isvec: 1, nodeid: 8
Sun Sep  8 22:18:30 2019 D SEG Segment::setUp():435 model seg base: 0000000001744090, layers num: 8
Sun Sep  8 22:18:30 2019 D SEG Segment::setUp():442 model layer base: 0x1744090
Sun Sep  8 22:18:30 2019 I create layer(0): preprocess
Sun Sep  8 22:18:30 2019 D DATA DataLayer::setUp():139 param size: 64
Sun Sep  8 22:18:30 2019 D SEG Segment::setUp():442 model layer base: 0x17440d0
Sun Sep  8 22:18:30 2019 I create layer(1): convolution
Sun Sep  8 22:18:30 2019 D CONV ConvolutionLayer::setUp():126 param size: 128
Sun Sep  8 22:18:30 2019 D SEG Segment::setUp():442 model layer base: 0x1744150
Sun Sep  8 22:18:30 2019 I create layer(2): poolingave
Sun Sep  8 22:18:30 2019 D MAXP PoolingLayer::setUp():27 param size: 96
Sun Sep  8 22:18:30 2019 D SEG Segment::setUp():442 model layer base: 0x17441b0
Sun Sep  8 22:18:30 2019 I create layer(3): convolution
Sun Sep  8 22:18:30 2019 D CONV ConvolutionLayer::setUp():126 param size: 128

Observing GPU utilization
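
One way to sample GPU utilization while sample.exe runs is the `nvidia-smi` tool that ships with the NVIDIA driver. The sketch below assumes `nvidia-smi` is on the PATH; the function names are illustrative. It only defines helpers and does not run anything until `sample_gpu()` is called.

```python
import subprocess

# Query utilization (%) and used memory (MiB) as bare CSV, one row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def _to_int(field: str):
    """Convert a CSV field to int, or None for values such as '[N/A]'."""
    try:
        return int(field.strip())
    except ValueError:
        return None


def parse_smi_csv(output: str):
    """Turn 'util, mem' rows into (util_percent, mem_mib) tuples."""
    rows = []
    for line in output.strip().splitlines():
        util, mem = line.split(",")
        rows.append((_to_int(util), _to_int(mem)))
    return rows


def sample_gpu():
    """Run nvidia-smi once; return None if the tool is missing or fails."""
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return parse_smi_csv(result.stdout)
```

Calling `sample_gpu()` in a loop with a short sleep, once with `[CUDA_CALC_EN]` set to 0 and once with it set to 1, gives a rough before/after comparison.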

CUDA simulation settings

The settings file lives at a fixed absolute path:
Windows: C:\hisilicon\nnie_sim.ini
Linux: /usr/hisilicon/nnie_sim.ini
To enable CUDA, set [CUDA_CALC_EN] to 1:

#===== CONFIG =====
[LAYER_LINEAR_PRINT_EN]
0
[CUDA_CALC_EN]
1
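
Switching between the two configurations means changing the single line after `[CUDA_CALC_EN]`. A minimal sketch of doing that on the file's text follows; `set_ini_value` is a hypothetical helper, and appending a missing key at the end of the file is an assumption about what the simulator accepts.

```python
def set_ini_value(text: str, key: str, value: str) -> str:
    """Return ini text with the line after '[key]' set to value.

    If the key is absent, the header and value are appended at the end.
    """
    lines = text.splitlines()
    header = f"[{key}]"
    for i, line in enumerate(lines):
        if line.strip() == header:
            has_value = i + 1 < len(lines) and not lines[i + 1].strip().startswith("[")
            if has_value:
                lines[i + 1] = value      # overwrite the existing value line
            else:
                lines.insert(i + 1, value)  # header had no value line yet
            break
    else:
        lines += [header, value]
    return "\n".join(lines) + "\n"


CPU_INI = "#===== CONFIG =====\n[LAYER_LINEAR_PRINT_EN]\n0\n[CUDA_CALC_EN]\n0\n"

print(set_ini_value(CPU_INI, "CUDA_CALC_EN", "1"))
```

The result can be written back to `C:\hisilicon\nnie_sim.ini` (or `/usr/hisilicon/nnie_sim.ini` on Linux) before launching the sample.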

Running the default sample.exe now produces the following console output:

Sun Sep  8 22:46:50 2019 I ============= nnie_sim.ini setting ==============
Sun Sep  8 22:46:50 2019 I [CUDA_CALC_EN]                   1
Sun Sep  8 22:46:50 2019 I [LOG_LEVEL]                      DBG
Sun Sep  8 22:46:50 2019 I =================================================
SvpSampleCnnClfLenet start ...
SVP Version: HiSVP_PC_V1.1.1.0_B030
build sha1: 86265e2c
      date: Jun 20 2018
      time: 15:41:12
Read version info from WK header
      sdk_arch_ver: 	1.1
      nnie_arch_ver: 	1.1
      nnie_mapper_ver: 	1.1.1.0
      test_ver: 	B030
      patch_ver: 	P000
Sun Sep  8 22:46:55 2019 I >>>>>>>>> createNet >>>>>>>>> virAddr = 0x2ed3fd0
Sun Sep  8 22:46:55 2019 D CUDA cudaHostManager::loadLibray():88 LoadLibrary nnieCUDA1.1d.dll finished
Sun Sep  8 22:46:55 2019 D CUDA cudaHostManager::versionCheck():136 lib version check: nnieCUDA = HiSVP_PC_V1.1.1.0_B030, nnie Simulator = HiSVP_PC_V1.1.1.0_B030
                          CUDA Device(0) Properties
Identify: Quadro M2000
Clock Rate: 1162500 kHz
Max Grid Size: 2147483647 * 65535 * 65535
Max Threads Dim: 1024 * 1024 * 64
Max Threads per Block: 1024
Number of Multiprocessors: 6
32bit Registers Available per Block: 65536
Warp Size: 32 threads
Sun Sep  8 22:46:56 2019 D CUDA NetManager::createNet():45 nnieCUDA create success
Sun Sep  8 22:46:56 2019 D MAIN HI_MPI_SVP_NNIE_LoadModel():152 net ptr addr: 000000000031F870
Sun Sep  8 22:46:56 2019 D NET Net::setDumpPath():385 [DUMP_INDIVIDUAL_EN] = 0
Sun Sep  8 22:46:56 2019 D NET Net::setDumpPath():401 set net DumpPath: ./
Sun Sep  8 22:46:56 2019 D NET Net::setUp():255 model net base: 0000000002ED3FD0, segment num: 1
Sun Sep  8 22:46:56 2019 D NET Net::setUp():263 model seg(0) base: 0000000002ED4010
Sun Sep  8 22:46:56 2019 I create net seg(0), type: 205
Sun Sep  8 22:46:56 2019 D SEG Segment::DumpParamHdr():470 netType: 0 srcNodeNum: 1, dstNodeNum: 1
Sun Sep  8 22:46:56 2019 D SEG Segment::DumpParamHdr():471 tmpBufSize: 4734976, netSegSize: 96, netSegLayerNum: 9
Sun Sep  8 22:46:56 2019 D SEG Segment::DumpParamHdr():472 maxT: 0
Sun Sep  8 22:46:56 2019 D SEG Segment::DumpSrcNodeInfo():483 srcNode width: 28, height: 28, channel: 1, fmt: 1, isvec: 0, nodeid: 0
Sun Sep  8 22:46:56 2019 D SEG Segment::DumpDstNodeInfo():495 dstNode width: 1, height: 1, channel: 10, fmt: 0,  isvec: 1, nodeid: 8
Sun Sep  8 22:46:56 2019 D SEG Segment::setUp():435 model seg base: 0000000002ED4090, layers num: 8
Sun Sep  8 22:46:56 2019 D SEG Segment::setUp():442 model layer base: 0x2ed4090
Sun Sep  8 22:46:56 2019 I create layer(0): preprocess
Sun Sep  8 22:46:56 2019 D DATA DataLayer::setUp():139 param size: 64
Sun Sep  8 22:46:56 2019 D SEG Segment::setUp():442 model layer base: 0x2ed40d0
Sun Sep  8 22:46:56 2019 I create layer(1): convolution
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setUp():126 param size: 128
Sun Sep  8 22:46:56 2019 D SEG Segment::setUp():442 model layer base: 0x2ed4150
Sun Sep  8 22:46:56 2019 I create layer(2): poolingave
Sun Sep  8 22:46:56 2019 D MAXP PoolingLayer::setUp():27 param size: 96
Sun Sep  8 22:46:56 2019 D SEG Segment::setUp():442 model layer base: 0x2ed41b0
Sun Sep  8 22:46:56 2019 I create layer(3): convolution
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setUp():126 param size: 128
Sun Sep  8 22:46:56 2019 D SEG Segment::setUp():442 model layer base: 0x2ed4230
Sun Sep  8 22:46:56 2019 I create layer(4): poolingave
Sun Sep  8 22:46:56 2019 D MAXP PoolingLayer::setUp():27 param size: 96
Sun Sep  8 22:46:56 2019 D SEG Segment::setUp():442 model layer base: 0x2ed4290
Sun Sep  8 22:46:56 2019 I create layer(5): innerproduct
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setUp():24 param size: 112
Sun Sep  8 22:46:56 2019 D SEG Segment::setUp():442 model layer base: 0x2ed4300
Sun Sep  8 22:46:56 2019 I create layer(6): innerproduct
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setUp():24 param size: 112
Sun Sep  8 22:46:56 2019 D SEG Segment::setUp():442 model layer base: 0x2ed4370
Sun Sep  8 22:46:56 2019 I create layer(7): softmax
Sun Sep  8 22:46:56 2019 D SOFTMAX SoftmaxLayer::setUp():55 param size: 64
Sun Sep  8 22:46:56 2019 D NET Net::setUp():274 seg(0) all param size: 0x3a0
Sun Sep  8 22:46:56 2019 D MAIN HI_MPI_SVP_NNIE_LoadModel():162 paramBuf addr: 0000000002ED43B0, size: 433824
Sun Sep  8 22:46:56 2019 D NET Net::paramWKInit():110 +
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000002ED43B0, size: 0x69ea0
Sun Sep  8 22:46:56 2019 D NET Net::paramWKInit():110 -
Sun Sep  8 22:46:56 2019 D MAIN HI_MPI_SVP_NNIE_Forward():902 net ptr addr: 000000000031F870
Sun Sep  8 22:46:56 2019 D NET Net::tmpWKInit():109 +
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004660040, size: 0x484000
Sun Sep  8 22:46:56 2019 D NET Net::tmpWKInit():109 -
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000192A430, size: 0x40
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000192A470, size: 0x20
Sun Sep  8 22:46:56 2019 D NET Net::inputWKsInit():61 +
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000192C5C0, size: 0x380
Sun Sep  8 22:46:56 2019 D NET Net::inputWKsInit():114 -
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000192C970, size: 0x30
Sun Sep  8 22:46:56 2019 D SEG CNNSegment::forward():10 compute begin
Sun Sep  8 22:46:56 2019 I start [seg(0) layer(0)]: preprocess
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000192C5C0, size: 0x380
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004660040, size: 0x310
Sun Sep  8 22:46:56 2019 D DATA DataLayer::setAddr():167 [0] inUsrOffset: 0, input base: 0x192c5c0
Sun Sep  8 22:46:56 2019 D DATA DataLayer::setAddr():168 [0] outOffset: 0, output base: 0x4660040
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer0_output0_func
Sun Sep  8 22:46:56 2019 I end   [seg(0) layer(0)]: preprocess
Sun Sep  8 22:46:56 2019 I start [seg(0) layer(1)]: convolution
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004660040, size: 0x310
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004698040, size: 0x2d00
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002ED43B0, size: 0
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002ED43B0, size: 0
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000002ED43B0, size: 0x1f4
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002ED45A4, size: 0x50
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002ED45F4, size: 0
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():279 [1] inOffset: 0, input base: 0
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():280 [1] outOffset: 0, output base: 0x192c970
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():284 [1] input range(784): [0x4660040, 0x4660040]
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():294 [1] output range(11520): [0x4698040, 0x469ad40]
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():302 [1] bnDeltaA param range(0): [0x2ed43b0, 0x2ed43b0], offset: 0
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():304 [1] bnDeltB param range(0): [0x2ed43b0, 0x2ed43b0], offset: 0
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():306 [1] weight param range(500): [0x2ed43b0, 0x2ed45a4], offset: 0
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():308 [1] bias param range(20): [0x2ed45a4, 0x2ed45f4], offset: 0x1f4
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():311 [1] preluNegtiveSlope param range(0): [0x2ed45f4, 0x2ed45f4], offset: 0x244
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::compute():393 GPU conv
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer1_output0_func
Sun Sep  8 22:46:56 2019 I end   [seg(0) layer(1)]: convolution
Sun Sep  8 22:46:56 2019 I start [seg(0) layer(2)]: poolingave
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004698040, size: 0x2d00
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004968040, size: 0xb40
Sun Sep  8 22:46:56 2019 D MAXP PoolingLayer::setAddr():92 [2] outOffset: 3178496, output base: 0x192c970
Sun Sep  8 22:46:56 2019 D MAXP PoolingLayer::setAddr():93 [2] inOffset: 229376, input base: 0
Sun Sep  8 22:46:56 2019 D MAXP PoolingLayer::setAddr():97 [2] input range(11520): [0x4698040, 0x469ad40]
Sun Sep  8 22:46:56 2019 D MAXP PoolingLayer::setAddr():107 [2] output range(2880): [0x4968040, 0x4968b80]
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer2_output0_func
Sun Sep  8 22:46:56 2019 I end   [seg(0) layer(2)]: poolingave
Sun Sep  8 22:46:56 2019 I start [seg(0) layer(3)]: convolution
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004968040, size: 0xb40
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004660040, size: 0xc80
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002ED4600, size: 0
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002ED4600, size: 0
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000002ED4600, size: 0x61a8
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002EDA7A8, size: 0xc8
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002EDA870, size: 0
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():279 [3] inOffset: 3178496, input base: 0
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():280 [3] outOffset: 0, output base: 0x192c970
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():284 [3] input range(2880): [0x4968040, 0x4968040]
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():294 [3] output range(3200): [0x4660040, 0x4660cc0]
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():302 [3] bnDeltaA param range(0): [0x2ed4600, 0x2ed4600], offset: 0x250
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():304 [3] bnDeltB param range(0): [0x2ed4600, 0x2ed4600], offset: 0x250
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():306 [3] weight param range(25000): [0x2ed4600, 0x2eda7a8], offset: 0x250
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():308 [3] bias param range(50): [0x2eda7a8, 0x2eda870], offset: 0x63f8
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::setAddr():311 [3] preluNegtiveSlope param range(0): [0x2eda870, 0x2eda870], offset: 0x64c0
Sun Sep  8 22:46:56 2019 D CONV ConvolutionLayer::compute():393 GPU conv
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer3_output0_func
Sun Sep  8 22:46:56 2019 I end   [seg(0) layer(3)]: convolution
Sun Sep  8 22:46:56 2019 I start [seg(0) layer(4)]: poolingave
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004660040, size: 0xc80
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004A1C040, size: 0x320
Sun Sep  8 22:46:56 2019 D MAXP PoolingLayer::setAddr():92 [4] outOffset: 3915776, output base: 0x192c970
Sun Sep  8 22:46:56 2019 D MAXP PoolingLayer::setAddr():93 [4] inOffset: 0, input base: 0
Sun Sep  8 22:46:56 2019 D MAXP PoolingLayer::setAddr():97 [4] input range(3200): [0x4660040, 0x4660cc0]
Sun Sep  8 22:46:56 2019 D MAXP PoolingLayer::setAddr():107 [4] output range(800): [0x4a1c040, 0x4a1c360]
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer4_output0_func
Sun Sep  8 22:46:56 2019 I end   [seg(0) layer(4)]: poolingave
Sun Sep  8 22:46:56 2019 I start [seg(0) layer(5)]: innerproduct
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004A1C040, size: 0x320
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004660040, size: 0x1f4
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000002EDA870, size: 0x61a80
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002F3C2F0, size: 0x7d0
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002F3CAC0, size: 0
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():106 [5] inOffset: 0x3bc000, input base: 0
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():107 [5] outOffset: 0, output base: 0x192c970
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():109 [5] weight param range(400000): [0x2eda870, 0x2f3c2f0], offset: 0x64c0
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():111 [5] bias param range(2000): [0x2f3c2f0, 0x2f3cac0], offset: 0x67f40
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():113 [5] BNDeltaA param range(0): [0, 0], offset: 0x68710
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():115 [5] BNDeltaA param range(?): [0, ?], offset: 0x68710
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():119 [5] input range(800): [0x4a1c040, 0x4a1c360]
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():129 [5] output range(500): [0x4660040, 0x4660234]
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer5_output0_func
Sun Sep  8 22:46:56 2019 I end   [seg(0) layer(5)]: innerproduct
Sun Sep  8 22:46:56 2019 I start [seg(0) layer(6)]: innerproduct
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000004660040, size: 0x1f4
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000477A040, size: 0x28
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000002F3CAC0, size: 0x1388
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002F3DE48, size: 0x28
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<int>::init():175 base: 0000000002F3DE70, size: 0
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():106 [6] inOffset: 0, input base: 0
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():107 [6] outOffset: 0x11a000, output base: 0x192c970
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():109 [6] weight param range(5000): [0x2f3cac0, 0x2f3de48], offset: 0x68710
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():111 [6] bias param range(40): [0x2f3de48, 0x2f3de70], offset: 0x69a98
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():113 [6] BNDeltaA param range(0): [0, 0], offset: 0x69ac0
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():115 [6] BNDeltaA param range(?): [0, ?], offset: 0x69ac0
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():119 [6] input range(500): [0x4660040, 0x4660234]
Sun Sep  8 22:46:56 2019 D IP InnerProductLayer::setAddr():129 [6] output range(10): [0x477a040, 0x477a068]
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer6_output0_func
Sun Sep  8 22:46:56 2019 I end   [seg(0) layer(6)]: innerproduct
Sun Sep  8 22:46:56 2019 I start [seg(0) layer(7)]: softmax
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000477A040, size: 0x28
Sun Sep  8 22:46:56 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000192C970, size: 0x30
Sun Sep  8 22:46:56 2019 D SOFTMAX SoftmaxLayer::setAddr():141 [7] inOffset: 1155072, input base: 0
Sun Sep  8 22:46:56 2019 D SOFTMAX SoftmaxLayer::setAddr():142 [7] outOffset: 0, output base: 0x192c970
Sun Sep  8 22:46:56 2019 D SOFTMAX SoftmaxLayer::setAddr():146 [7] input range(10): [0x477a040, 0x477a068]
Sun Sep  8 22:46:56 2019 D SOFTMAX SoftmaxLayer::setAddr():161 [7] output range(10): [0x192c970, 0x192c9a0]
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:56 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer7_output0_func
Sun Sep  8 22:46:56 2019 I end   [seg(0) layer(7)]: softmax
fseek return value is 0
fseek return value is 0
Leaf0, Pic0 --> expected label: 0
Top0: index --    0, confidence -- 1.0000000
Top1: index --    1, confidence -- 0.0000000
Top2: index --    2, confidence -- 0.0000000
Top3: index --    3, confidence -- 0.0000000
Top4: index --    4, confidence -- 0.0000000
cudaSetDevice(0)
Sun Sep  8 22:46:56 2019 D CUDA cudaHostManager::freeLibray():111 FreeLibrary nnieCUDA1.1d.dll finished
Sun Sep  8 22:46:56 2019 I <<<<<<<<< deleteNet <<<<<<<<< virAddr = 0x2ed3fd0
SvpSampleCnnClfLenet end ...
SvpSampleCnnClfAlexnet start ...
SVP Version: HiSVP_PC_V1.1.1.0_B030
build sha1: 86265e2c
      date: Jun 20 2018
      time: 15:41:12
Read version info from WK header
      sdk_arch_ver: 	1.1
      nnie_arch_ver: 	1.1
      nnie_mapper_ver: 	1.1.1.0
      test_ver: 	B030
      patch_ver: 	P000
Sun Sep  8 22:46:56 2019 I >>>>>>>>> createNet >>>>>>>>> virAddr = 0x75f0040
Sun Sep  8 22:46:56 2019 D CUDA cudaHostManager::loadLibray():88 LoadLibrary nnieCUDA1.1d.dll finished
                          CUDA Device(0) Properties
Identify: Quadro M2000
Clock Rate: 1162500 kHz
Max Grid Size: 2147483647 * 65535 * 65535
Max Threads Dim: 1024 * 1024 * 64
Max Threads per Block: 1024
Number of Multiprocessors: 6
32bit Registers Available per Block: 65536
Warp Size: 32 threads
Sun Sep  8 22:46:56 2019 D CUDA NetManager::createNet():45 nnieCUDA create success
Sun Sep  8 22:46:56 2019 D MAIN HI_MPI_SVP_NNIE_LoadModel():152 net ptr addr: 0000000007222010
Sun Sep  8 22:46:56 2019 D NET Net::setDumpPath():385 [DUMP_INDIVIDUAL_EN] = 0
Sun Sep  8 22:46:56 2019 D NET Net::setDumpPath():401 set net DumpPath: ./
Sun Sep  8 22:46:57 2019 D NET Net::setUp():255 model net base: 00000000075F0040, segment num: 1
Sun Sep  8 22:46:57 2019 D NET Net::setUp():263 model seg(0) base: 00000000075F0080
Sun Sep  8 22:46:57 2019 I create net seg(0), type: 205
Sun Sep  8 22:46:57 2019 D SEG Segment::DumpParamHdr():470 netType: 0 srcNodeNum: 1, dstNodeNum: 1
Sun Sep  8 22:46:57 2019 D SEG Segment::DumpParamHdr():471 tmpBufSize: 221147136, netSegSize: 96, netSegLayerNum: 16
Sun Sep  8 22:46:57 2019 D SEG Segment::DumpParamHdr():472 maxT: 0
Sun Sep  8 22:46:57 2019 D SEG Segment::DumpSrcNodeInfo():483 srcNode width: 227, height: 227, channel: 3, fmt: 1, isvec: 0, nodeid: 0
Sun Sep  8 22:46:57 2019 D SEG Segment::DumpDstNodeInfo():495 dstNode width: 1, height: 1, channel: 1000, fmt: 0,  isvec: 1, nodeid: 15
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():435 model seg base: 00000000075F0100, layers num: 15
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f0100
Sun Sep  8 22:46:57 2019 I create layer(0): preprocess
Sun Sep  8 22:46:57 2019 D DATA DataLayer::setUp():139 param size: 64
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f0140
Sun Sep  8 22:46:57 2019 I create layer(1): convolution
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setUp():126 param size: 128
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f01c0
Sun Sep  8 22:46:57 2019 I create layer(2): lrn
Sun Sep  8 22:46:57 2019 D LRN LrnLayer::setUp():77 param size: 80
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f0210
Sun Sep  8 22:46:57 2019 I create layer(3): poolingave
Sun Sep  8 22:46:57 2019 D MAXP PoolingLayer::setUp():27 param size: 96
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f0270
Sun Sep  8 22:46:57 2019 I create layer(4): convolution
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setUp():126 param size: 128
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f02f0
Sun Sep  8 22:46:57 2019 I create layer(5): lrn
Sun Sep  8 22:46:57 2019 D LRN LrnLayer::setUp():77 param size: 80
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f0340
Sun Sep  8 22:46:57 2019 I create layer(6): poolingave
Sun Sep  8 22:46:57 2019 D MAXP PoolingLayer::setUp():27 param size: 96
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f03a0
Sun Sep  8 22:46:57 2019 I create layer(7): convolution
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setUp():126 param size: 128
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f0420
Sun Sep  8 22:46:57 2019 I create layer(8): convolution
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setUp():126 param size: 128
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f04a0
Sun Sep  8 22:46:57 2019 I create layer(9): convolution
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setUp():126 param size: 128
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f0520
Sun Sep  8 22:46:57 2019 I create layer(10): poolingave
Sun Sep  8 22:46:57 2019 D MAXP PoolingLayer::setUp():27 param size: 96
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f0580
Sun Sep  8 22:46:57 2019 I create layer(11): innerproduct
Sun Sep  8 22:46:57 2019 D IP InnerProductLayer::setUp():24 param size: 112
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f05f0
Sun Sep  8 22:46:57 2019 I create layer(12): innerproduct
Sun Sep  8 22:46:57 2019 D IP InnerProductLayer::setUp():24 param size: 112
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f0660
Sun Sep  8 22:46:57 2019 I create layer(13): innerproduct
Sun Sep  8 22:46:57 2019 D IP InnerProductLayer::setUp():24 param size: 112
Sun Sep  8 22:46:57 2019 D SEG Segment::setUp():442 model layer base: 0x75f06d0
Sun Sep  8 22:46:57 2019 I create layer(14): softmax
Sun Sep  8 22:46:57 2019 D SOFTMAX SoftmaxLayer::setUp():55 param size: 64
Sun Sep  8 22:46:57 2019 D NET Net::setUp():274 seg(0) all param size: 0x690
Sun Sep  8 22:46:57 2019 D MAIN HI_MPI_SVP_NNIE_LoadModel():162 paramBuf addr: 00000000075F0710, size: 62411792
Sun Sep  8 22:46:57 2019 D NET Net::paramWKInit():110 +
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 00000000075F0710, size: 0x3b85410
Sun Sep  8 22:46:57 2019 D NET Net::paramWKInit():110 -
Sun Sep  8 22:46:57 2019 D MAIN HI_MPI_SVP_NNIE_Forward():902 net ptr addr: 0000000007222010
Sun Sep  8 22:46:57 2019 D NET Net::tmpWKInit():109 +
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000B180040, size: 0xd2e7000
Sun Sep  8 22:46:57 2019 D NET Net::tmpWKInit():109 -
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000192CBA0, size: 0x40
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000192CBE0, size: 0x30
Sun Sep  8 22:46:57 2019 D NET Net::inputWKsInit():61 +
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000002ED3FD0, size: 0x27e70
Sun Sep  8 22:46:57 2019 D NET Net::inputWKsInit():114 -
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000002EFBE70, size: 0xfa0
Sun Sep  8 22:46:57 2019 D SEG CNNSegment::forward():10 compute begin
Sun Sep  8 22:46:57 2019 I start [seg(0) layer(0)]: preprocess
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000002ED3FD0, size: 0x27e70
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000B180040, size: 0x25bdb
Sun Sep  8 22:46:57 2019 D DATA DataLayer::setAddr():167 [0] inUsrOffset: 0, input base: 0x2ed3fd0
Sun Sep  8 22:46:57 2019 D DATA DataLayer::setAddr():168 [0] outOffset: 0, output base: 0xb180040
Sun Sep  8 22:46:57 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:57 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer0_output0_func
Sun Sep  8 22:46:57 2019 I end   [seg(0) layer(0)]: preprocess
Sun Sep  8 22:46:57 2019 I start [seg(0) layer(1)]: convolution
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000B180040, size: 0x25bdb
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000D967040, size: 0x46e60
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<int>::init():175 base: 00000000075F0710, size: 0
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<int>::init():175 base: 00000000075F0710, size: 0
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 00000000075F0710, size: 0x8820
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<int>::init():175 base: 00000000075F8F30, size: 0x180
Sun Sep  8 22:46:57 2019 D MAIN MemoryWK<int>::init():175 base: 00000000075F90B0, size: 0
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setAddr():279 [1] inOffset: 0, input base: 0
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setAddr():280 [1] outOffset: 0, output base: 0x2efbe70
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setAddr():284 [1] input range(154587): [0xb180040, 0xb180040]
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setAddr():294 [1] output range(290400): [0xd967040, 0xd9adea0]
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setAddr():302 [1] bnDeltaA param range(0): [0x75f0710, 0x75f0710], offset: 0
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setAddr():304 [1] bnDeltB param range(0): [0x75f0710, 0x75f0710], offset: 0
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setAddr():306 [1] weight param range(34848): [0x75f0710, 0x75f8f30], offset: 0
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setAddr():308 [1] bias param range(96): [0x75f8f30, 0x75f90b0], offset: 0x8820
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::setAddr():311 [1] preluNegtiveSlope param range(0): [0x75f90b0, 0x75f90b0], offset: 0x89a0
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::compute():393 GPU conv
Sun Sep  8 22:46:58 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:58 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer1_output0_func
Sun Sep  8 22:46:58 2019 I end   [seg(0) layer(1)]: convolution
Sun Sep  8 22:46:58 2019 I start [seg(0) layer(2)]: lrn
Sun Sep  8 22:46:58 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000D967040, size: 0x46e60
Sun Sep  8 22:46:58 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000012BE7040, size: 0x46e60
Sun Sep  8 22:46:58 2019 D LRN LrnLayer::setAddr():158 [2] inOffset: 41840640, input base: 0
Sun Sep  8 22:46:58 2019 D LRN LrnLayer::setAddr():159 [2] outOffset: 128348160, output base: 0x2efbe70
Sun Sep  8 22:46:58 2019 D LRN LrnLayer::setAddr():163 [2] input range(290400): [0xd967040, 0xd967040]
Sun Sep  8 22:46:58 2019 D LRN LrnLayer::setAddr():173 [2] output range(290400): [0x12be7040, 0x12be7040]
Sun Sep  8 22:46:59 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:59 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer2_output0_func
Sun Sep  8 22:46:59 2019 I end   [seg(0) layer(2)]: lrn
Sun Sep  8 22:46:59 2019 I start [seg(0) layer(3)]: poolingave
Sun Sep  8 22:46:59 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 0000000012BE7040, size: 0x46e60
Sun Sep  8 22:46:59 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000B180040, size: 0x11160
Sun Sep  8 22:46:59 2019 D MAXP PoolingLayer::setAddr():92 [3] outOffset: 0, output base: 0x2efbe70
Sun Sep  8 22:46:59 2019 D MAXP PoolingLayer::setAddr():93 [3] inOffset: 128348160, input base: 0
Sun Sep  8 22:46:59 2019 D MAXP PoolingLayer::setAddr():97 [3] input range(290400): [0x12be7040, 0x12c2dea0]
Sun Sep  8 22:46:59 2019 D MAXP PoolingLayer::setAddr():107 [3] output range(69984): [0xb180040, 0xb1911a0]
Sun Sep  8 22:46:59 2019 D NET tools::dumpLayerBase():447 dumpLayerBase Path: ./
Sun Sep  8 22:46:59 2019 D NET tools::dumpLayerBase():448 dumpLayerBase filename: seg0_layer3_output0_func
Sun Sep  8 22:46:59 2019 I end   [seg(0) layer(3)]: poolingave
Sun Sep  8 22:46:59 2019 I start [seg(0) layer(4)]: convolution
Sun Sep  8 22:46:59 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000B180040, size: 0x11160
Sun Sep  8 22:46:59 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 000000000C338040, size: 0x2d900
Sun Sep  8 22:46:59 2019 D MAIN MemoryWK<int>::init():175 base: 00000000075F90B0, size: 0
Sun Sep  8 22:46:59 2019 D MAIN MemoryWK<int>::init():175 base: 00000000075F90B0, size: 0
Sun Sep  8 22:46:59 2019 D MAIN MemoryWK<unsigned char>::init():175 base: 00000000075F90B0, size: 0x96000
Sun Sep  8 22:46:59 2019 D MAIN MemoryWK<int>::init():175 base: 000000000768F0B0, size: 0x400
Sun Sep  8 22:46:59 2019 D MAIN MemoryWK<int>::init():175 base: 000000000768F4B0, size: 0
Sun Sep  8 22:46:59 2019 D CONV ConvolutionLayer::setAddr():279 [4] inOffset: 0, input base: 0
Sun Sep  8 22:46:59 2019 D CONV ConvolutionLayer::setAddr():280 [4] outOffset: 0, output base: 0x2efbe70
Sun Sep  8 22:46:59 2019 D CONV ConvolutionLayer::setAddr():284 [4] input range(69984): [0xb180040, 0xb180040]
Sun Sep  8 22:46:59 2019 D CONV ConvolutionLayer::setAddr():294 [4] output range(186624): [0xc338040, 0xc365940]
Sun Sep  8 22:46:59 2019 D CONV ConvolutionLayer::setAddr():302 [4] bnDeltaA param range(0): [0x75f90b0, 0x75f90b0], offset: 0x89a0
Sun Sep  8 22:46:59 2019 D CONV ConvolutionLayer::setAddr():304 [4] bnDeltB param range(0): [0x75f90b0, 0x75f90b0], offset: 0x89a0
Sun Sep  8 22:46:59 2019 D CONV ConvolutionLayer::setAddr():306 [4] weight param range(614400): [0x75f90b0, 0x768f0b0], offset: 0x89a0
Sun Sep  8 22:46:59 2019 D CONV ConvolutionLayer::setAddr():308 [4] bias param range(256): [0x768f0b0, 0x768f4b0], offset: 0x9e9a0
Sun Sep  8 22:46:59 2019 D CONV ConvolutionLayer::setAddr():311 [4] preluNegtiveSlope param range(0): [0x768f4b0, 0x768f4b0], offset: 0x9eda0
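
Two things in the log above are worth checking programmatically: whether `GPU conv` lines appear (they do once CUDA is enabled) and how long each layer took between its `start` and `end` lines. The sketch below is an assumption-laden helper, not part of the SDK: the regular expression is written against the log format shown here, timestamps have one-second resolution so durations are coarse, and `%a`/`%b` parsing assumes an English locale.

```python
import re
from datetime import datetime

# Matches e.g. "Sun Sep  8 22:46:57 2019 I start [seg(0) layer(1)]: convolution"
LINE_RE = re.compile(
    r"^(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) I (start|end)\s+"
    r"\[seg\((\d+)\) layer\((\d+)\)\]: (\w+)"
)


def layer_durations(log_text: str):
    """Return [(seg, layer, name, seconds)] from paired start/end lines."""
    started = {}
    result = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        # Collapse the double space in "Sep  8" before parsing.
        stamp = datetime.strptime(" ".join(m.group(1).split()), "%a %b %d %H:%M:%S %Y")
        key = (int(m.group(3)), int(m.group(4)))
        if m.group(2) == "start":
            started[key] = stamp
        elif key in started:
            seconds = (stamp - started.pop(key)).total_seconds()
            result.append((key[0], key[1], m.group(5), seconds))
    return result


def count_gpu_conv(log_text: str) -> int:
    """Count convolution layers that reported running on the GPU."""
    return sum(1 for line in log_text.splitlines() if line.rstrip().endswith("GPU conv"))


SAMPLE = """Sun Sep  8 22:46:57 2019 I start [seg(0) layer(1)]: convolution
Sun Sep  8 22:46:57 2019 D CONV ConvolutionLayer::compute():393 GPU conv
Sun Sep  8 22:46:58 2019 I end   [seg(0) layer(1)]: convolution
"""

print(layer_durations(SAMPLE))
print(count_gpu_conv(SAMPLE))
```

Because the simulator logs whole seconds only, this is useful for spotting slow layers in larger networks rather than for precise benchmarking.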

After enabling CUDA
