Kaldi Study Notes: The CUDA Matrix Library

I have been reading through the material on the Kaldi website lately, adding annotations as I go to help my own understanding.

These study notes are mostly taken from the official Kaldi site, http://kaldi.sourceforge.net, with my own annotations added; I state this here for the record.

The CUDA Matrix library

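Note: CUDA is NVIDIA's parallel computing architecture. By using the processing power of the GPU, it can substantially improve computing performance.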

The CUDA matrix library is a seamless wrapper of CUDA computation.

Note: the CUDA Matrix library wraps CUDA computation seamlessly.

Its purpose is to separate the low-level, CUDA-dependent routines from the high-level C++ code.

Note: the goal is to keep the low-level routines that depend on CUDA apart from the high-level C++ code.

The library can be compiled either with or without the CUDA libraries, depending on the HAVE_CUDA==1 macro. Without CUDA, the library falls back to computation on the host processor. The host processor is also used when the toolkit is compiled with CUDA but no suitable GPU is detected. This is particularly useful in heterogeneous "grid-like" environments.

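Note: the CUDA Matrix library compiles whether or not the CUDA libraries are present, depending on whether HAVE_CUDA==1 is set. Without CUDA, the computation runs on the host processor.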
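To make the fallback concrete for myself, here is a minimal sketch of the idea, not Kaldi's actual code. The HAVE_CUDA macro is the one named above; `Backend`, `GpuAvailable`, `SelectBackend` and `Scale` are hypothetical names invented for this note, and the GPU probe is a stub that always reports "no GPU".

```cpp
#include <cstddef>
#include <vector>

enum class Backend { kCpu, kGpu };

// Hypothetical runtime probe (not a Kaldi function). A real build would ask
// the CUDA runtime; this stub always reports that no suitable GPU was found.
inline bool GpuAvailable() { return false; }

// The GPU is chosen only if the code was compiled with HAVE_CUDA==1 AND a
// suitable GPU is detected at run time. Otherwise the host is used.
inline Backend SelectBackend() {
#if HAVE_CUDA == 1
  if (GpuAvailable()) return Backend::kGpu;
#endif
  return Backend::kCpu;
}

// Multiplies every element of *v by alpha.
inline void Scale(std::vector<float>* v, float alpha) {
  if (SelectBackend() == Backend::kGpu) {
    // Placeholder: a GPU build would hand the work to CUBLAS or a kernel here.
    // This sketch contains no device code, so it continues with the host loop.
  }
  for (std::size_t i = 0; i < v->size(); ++i) (*v)[i] *= alpha;
}
```

The caller only ever sees `Scale`; which processor does the work is decided once, in `SelectBackend`.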

Computationally, the library is based on CUBLAS linear algebra operations and on manually implemented grid-like kernels for the non-linear operations, which conform to the "Map" pattern. Most of the "Reduce" kernels, by contrast, use a tree-like computational pattern together with extensive use of shared memory.

Note: CUBLAS is NVIDIA's BLAS library for the GPU; the computational routines it provides all execute on the GPU.
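The two patterns can be illustrated on the host. The sketch below is plain C++, not CUDA and not taken from Kaldi: `MapRelu` and `TreeSum` are hypothetical names, and the inner loop of `TreeSum` stands in for what a block of GPU threads would do in parallel on a shared-memory buffer.

```cpp
#include <cstddef>
#include <vector>

// "Map": every output element depends only on the matching input element,
// so on a GPU each element could be handled by its own thread.
inline std::vector<float> MapRelu(const std::vector<float>& in) {
  std::vector<float> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = in[i] > 0.0f ? in[i] : 0.0f;
  }
  return out;
}

// "Reduce": tree-like sum. In each round the upper part of the active range
// is added onto the lower part, so the active range roughly halves per round.
inline float TreeSum(std::vector<float> buf) {
  if (buf.empty()) return 0.0f;
  std::size_t n = buf.size();  // number of active elements
  while (n > 1) {
    const std::size_t half = (n + 1) / 2;
    // The iterations of this loop are independent of each other; on a GPU
    // they would run concurrently, followed by a synchronization.
    for (std::size_t i = 0; i + half < n; ++i) buf[i] += buf[i + half];
    n = half;
  }
  return buf[0];
}
```

For n elements the tree needs about log2(n) rounds instead of n sequential additions, which is why the pattern suits a GPU.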

classes

The most important classes are CuDevice, CuMatrix, CuVector and CuStlVector.

Note: the main classes are CuDevice, CuMatrix, CuVector and CuStlVector.

CuDevice: an abstraction of the GPU board. It is a singleton object which initializes the CUBLAS library at application startup and releases it at the end. It is also used to collect profiling statistics.
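As a reminder of what "singleton that also collects profiling statistics" means in C++, here is a small sketch. It is not the real CuDevice: `DeviceSketch`, `Instance`, `AccuProfile` and `Total` are names chosen for this note, no CUBLAS call is made, and the object is created on first use rather than strictly at startup.

```cpp
#include <map>
#include <string>

class DeviceSketch {
 public:
  // Returns the single shared instance. It is constructed on the first call
  // and destroyed automatically when the program exits.
  static DeviceSketch& Instance() {
    static DeviceSketch device;
    return device;
  }

  // Adds `seconds` to the running total stored under `key`.
  void AccuProfile(const std::string& key, double seconds) {
    profile_[key] += seconds;
  }

  // Returns the accumulated time for `key`, or 0.0 if nothing was recorded.
  double Total(const std::string& key) const {
    std::map<std::string, double>::const_iterator it = profile_.find(key);
    return it == profile_.end() ? 0.0 : it->second;
  }

  DeviceSketch(const DeviceSketch&) = delete;
  DeviceSketch& operator=(const DeviceSketch&) = delete;

 private:
  DeviceSketch() {}   // a real device class would initialize CUBLAS here
  ~DeviceSketch() {}  // and release it here

  std::map<std::string, double> profile_;
};
```

Because the constructor is private and copying is deleted, `DeviceSketch::Instance()` is the only way to reach the object, so every caller shares the same statistics.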

CuMatrix: a GPU analogue of the Matrix class. It holds a buffer in GPU global memory as well as a backup CPU buffer, and it implements a subset of the Matrix interface. Host-GPU transfers are done by the CopyFromMat and CopyToMat methods, which may internally reallocate the buffers.

CuVector: a GPU analogue of the Vector class. It holds a buffer in GPU global memory as well as a backup CPU buffer, and it implements a subset of the Vector interface. Host-GPU transfers are done by the CopyFromVec and CopyToVec methods, which may internally reallocate the buffers.
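The copy-in, compute, copy-out workflow described for CuMatrix and CuVector can be sketched on the host as follows. This is an illustration only, under loud assumptions: `DeviceVectorSketch` is a made-up class, a `std::vector<float>` stands in for the GPU buffer, the backup CPU buffer is not modelled, and the signatures are mine rather than Kaldi's. Only the method names CopyFromVec and CopyToVec come from the text above.

```cpp
#include <cstddef>
#include <vector>

class DeviceVectorSketch {
 public:
  // Host -> "device". Reallocates the device-side buffer when the sizes differ.
  void CopyFromVec(const std::vector<float>& src) {
    if (device_buf_.size() != src.size()) device_buf_.resize(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) device_buf_[i] = src[i];
  }

  // "Device" -> host. Resizes the destination so that it can hold the data.
  void CopyToVec(std::vector<float>* dst) const {
    dst->resize(device_buf_.size());
    for (std::size_t i = 0; i < device_buf_.size(); ++i) (*dst)[i] = device_buf_[i];
  }

  // Example of an operation that works on the device-side data in place.
  void Scale(float alpha) {
    for (std::size_t i = 0; i < device_buf_.size(); ++i) device_buf_[i] *= alpha;
  }

  std::size_t Dim() const { return device_buf_.size(); }

 private:
  std::vector<float> device_buf_;  // stands in for GPU global memory
};
```

The point of the pattern is that the data crosses the host-GPU boundary only in the two copy methods; everything in between operates on the device-side buffer.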

CuStlVector: particularly useful for creating vectors of indices (int32).

mathematical operations

cu-math.h contains math functions which cannot be associated solely with a vector or a matrix. They are collected in the cu:: namespace in order to keep them out of the global namespace.

kernels

The CUDA kernels are collected in the cu-kernels.cu file. Since the CUDA code is compiled by NVCC and the rest of the code is compiled by a different compiler, the only possible way for the two to interact was to employ an ANSI C interface, cu-kernels.h, which represents a low-level interface to CUDA. The high-level interface is via CuMatrix, CuVector and the functions in the cu:: namespace.
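The layering can be sketched in a single file. Everything here is hypothetical and written for this note: `sketch_add_scalar` and the `cu_sketch` namespace are not Kaldi names, and the "kernel" is ordinary host code standing in for what NVCC would compile. The part worth noticing is the `extern "C"` declaration, which gives the function C linkage so that object files from two different compilers can agree on its symbol name.

```cpp
#include <cstddef>
#include <vector>

// What a C-style interface header would declare: plain pointers and ints
// only, no C++ classes or templates crossing the compiler boundary.
extern "C" void sketch_add_scalar(float* data, int n, float value);

// What the separately compiled kernel file would define. In this sketch it
// is plain host code rather than a CUDA kernel launch.
extern "C" void sketch_add_scalar(float* data, int n, float value) {
  for (int i = 0; i < n; ++i) data[i] += value;
}

// High-level wrapper, kept in its own namespace, which is what the rest of
// the C++ code would call.
namespace cu_sketch {
inline void AddScalar(std::vector<float>* v, float value) {
  if (v->empty()) return;
  sketch_add_scalar(v->data(), static_cast<int>(v->size()), value);
}
}  // namespace cu_sketch
```

Application code would call `cu_sketch::AddScalar(&v, 0.5f)` and never touch the C-level function directly.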

