Installing and Configuring CUDA on Ubuntu 14.04

5. Verifying the Installation

5.1. Driver Verification

First, verify that the NVIDIA driver was installed successfully.

~$ cat /proc/driver/nvidia/version

NVRM version: NVIDIA UNIX x86_64 Kernel Module  340.29  Thu Jul 31 20:23:19 PDT 2014

GCC version:  gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)
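
If you want to run this check from a script rather than by eye, the version number can be pulled out of the same text. The following is only a minimal sketch: the function names parse_driver_version and read_driver_version are my own, and the regular expression assumes the "Kernel Module  <version>" layout shown in the output above, which other driver releases may not follow.

```python
import re

# Sample text copied from the output above.
DRIVER_SAMPLE = (
    "NVRM version: NVIDIA UNIX x86_64 Kernel Module  340.29  "
    "Thu Jul 31 20:23:19 PDT 2014\n"
    "GCC version:  gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)\n"
)


def parse_driver_version(text):
    """Return the driver version string found in the text, or None."""
    # Assumption: the version follows the words "Kernel Module".
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)*)", text)
    return match.group(1) if match else None


def read_driver_version(path="/proc/driver/nvidia/version"):
    """Read the version file; return None if the driver is not loaded."""
    try:
        with open(path) as handle:
            return parse_driver_version(handle.read())
    except OSError:
        # The file is missing when the NVIDIA kernel module is not loaded.
        return None


print(parse_driver_version(DRIVER_SAMPLE))  # 340.29
```

On a machine without the driver, read_driver_version() returns None instead of raising, which is convenient in a setup script.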

5.2. Toolkit Verification

Next, verify that the CUDA Toolkit was installed successfully.

~$ nvcc -V

nvcc: NVIDIA (R) Cuda compiler driver

Copyright (c) 2005-2014 NVIDIA Corporation

Built on Thu_Jul_17_21:41:27_CDT_2014

Cuda compilation tools, release 6.5, V6.5.12
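
The release number can be checked the same way. This is a sketch under the assumption that the last line keeps the "release X.Y, VX.Y.Z" form shown above; parse_nvcc_release is a name I made up for illustration.

```python
import re

# Sample text copied from the `nvcc -V` output above.
NVCC_SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2014 NVIDIA Corporation
Built on Thu_Jul_17_21:41:27_CDT_2014
Cuda compilation tools, release 6.5, V6.5.12
"""


def parse_nvcc_release(text):
    """Return (release, full_version) from `nvcc -V` text, or None."""
    # Assumption: the line looks like "release 6.5, V6.5.12".
    match = re.search(r"release\s+(\d+\.\d+),\s+V(\S+)", text)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_nvcc_release(NVCC_SAMPLE))  # ('6.5', '6.5.12')
```

To use it on a real system, pass in the captured standard output of nvcc -V, for example from subprocess.run.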

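5.3. Device Detection

Use deviceQuery, one of the precompiled CUDA samples, to verify that the GPUs are recognized. deviceQuery is in the bin/x86_64/linux/release directory under the samples directory. My output is shown below; two GPUs were detected.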

~/install/NVIDIA_CUDA-6.5_Samples/bin/x86_64/linux/release$ ./deviceQuery

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 2 CUDA Capable device(s)

Device 0: "Tesla K20c"

CUDA Driver Version / Runtime Version          6.5 / 6.5

CUDA Capability Major/Minor version number:    3.5

Total amount of global memory:                4800 MBytes (5032706048 bytes)

(13) Multiprocessors, (192) CUDA Cores/MP:    2496 CUDA Cores

GPU Clock rate:                                706 MHz (0.71 GHz)

Memory Clock rate:                            2600 Mhz

Memory Bus Width:                              320-bit

L2 Cache Size:                                1310720 bytes

Maximum Texture Dimension Size (x,y,z)        1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)

Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers

Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers

Total amount of constant memory:              65536 bytes

Total amount of shared memory per block:      49152 bytes

Total number of registers available per block: 65536

Warp size:                                    32

Maximum number of threads per multiprocessor:  2048

Maximum number of threads per block:          1024

Max dimension size of a thread block (x,y,z): (1024, 1024, 64)

Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)

Maximum memory pitch:                          2147483647 bytes

Texture alignment:                            512 bytes

Concurrent copy and kernel execution:          Yes with 2 copy engine(s)

Run time limit on kernels:                    No

Integrated GPU sharing Host Memory:            No

Support host page-locked memory mapping:      Yes

Alignment requirement for Surfaces:            Yes

Device has ECC support:                        Enabled

Device supports Unified Addressing (UVA):      Yes

Device PCI Bus ID / PCI location ID:          3 / 0

Compute Mode:

< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: "Quadro K4000"

CUDA Driver Version / Runtime Version          6.5 / 6.5

CUDA Capability Major/Minor version number:    3.0

Total amount of global memory:                3071 MBytes (3220504576 bytes)

( 4) Multiprocessors, (192) CUDA Cores/MP:    768 CUDA Cores

GPU Clock rate:                                811 MHz (0.81 GHz)

Memory Clock rate:                            2808 Mhz

Memory Bus Width:                              192-bit

L2 Cache Size:                                393216 bytes

Maximum Texture Dimension Size (x,y,z)        1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)

Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers

Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers

Total amount of constant memory:              65536 bytes

Total amount of shared memory per block:      49152 bytes

Total number of registers available per block: 65536

Warp size:                                    32

Maximum number of threads per multiprocessor:  2048

Maximum number of threads per block:          1024

Max dimension size of a thread block (x,y,z): (1024, 1024, 64)

Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)

Maximum memory pitch:                          2147483647 bytes

Texture alignment:                            512 bytes

Concurrent copy and kernel execution:          Yes with 1 copy engine(s)

Run time limit on kernels:                    Yes

Integrated GPU sharing Host Memory:            No

Support host page-locked memory mapping:      Yes

Alignment requirement for Surfaces:            Yes

Device has ECC support:                        Disabled

Device supports Unified Addressing (UVA):      Yes

Device PCI Bus ID / PCI location ID:          4 / 0

Compute Mode:

< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

> Peer access from Tesla K20c (GPU0) -> Quadro K4000 (GPU1) : No

> Peer access from Quadro K4000 (GPU1) -> Tesla K20c (GPU0) : No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 2, Device0 = Tesla K20c, Device1 = Quadro K4000

Result = PASS
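
The lines that matter most in this long output are the device count, the device names, and the final result. The sketch below extracts just those three. It assumes the line formats shown above, uses an abridged copy of my output as sample input, and the name summarize_device_query is hypothetical.

```python
import re

# Abridged copy of the deviceQuery output above.
DEVICE_QUERY_SAMPLE = """Detected 2 CUDA Capable device(s)
Device 0: "Tesla K20c"
CUDA Capability Major/Minor version number:    3.5
Device 1: "Quadro K4000"
CUDA Capability Major/Minor version number:    3.0
Result = PASS
"""


def summarize_device_query(text):
    """Collect the device count, device names, and pass/fail result."""
    count = re.search(r"Detected (\d+) CUDA Capable device", text)
    # Each device section starts with a line like: Device 0: "Tesla K20c"
    names = re.findall(r'^\s*Device \d+: "([^"]+)"', text, flags=re.MULTILINE)
    passed = re.search(r"^\s*Result = PASS\s*$", text, flags=re.MULTILINE)
    return {
        "count": int(count.group(1)) if count else 0,
        "devices": names,
        "passed": passed is not None,
    }


summary = summarize_device_query(DEVICE_QUERY_SAMPLE)
print(summary["count"], summary["devices"], summary["passed"])
```

A simple sanity check is that summary["count"] equals len(summary["devices"]) and that summary["passed"] is True.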

With that, CUDA has been installed successfully.
