Dissecting the MXNet codebase: the mshadow part

### -------------------------------------------------------------------------
Basic classes:

1 Shape: kDimension, kSubdim, shape_[]
         Size(), Slice(), SubShape()
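A rough, self-contained sketch of what `Shape<dim>` looks like (an assumption-laden simplification: the real class also provides `Slice()`, `FlatTo2D()`, comparison operators, and uses mshadow's `index_t`):

```cpp
#include <cassert>
#include <cstddef>

// Simplified sketch of mshadow's Shape<dim>.
template <int dim>
struct Shape {
  static const int kDimension = dim;   // number of dimensions
  static const int kSubdim = dim - 1;  // dimensionality after one [] index
  size_t shape_[kDimension];

  // total number of elements
  size_t Size() const {
    size_t n = 1;
    for (int i = 0; i < kDimension; ++i) n *= shape_[i];
    return n;
  }
  // shape with the highest dimension peeled off
  Shape<kSubdim> SubShape() const {
    Shape<kSubdim> s;
    for (int i = 0; i < kSubdim; ++i) s.shape_[i] = shape_[i + 1];
    return s;
  }
};

// convenience constructor, mirroring mshadow's Shape2()
inline Shape<2> Shape2(size_t s0, size_t s1) {
  Shape<2> s;
  s.shape_[0] = s0;
  s.shape_[1] = s1;
  return s;
}
```

Indexing a tensor with `[]` removes the highest dimension, which is why `kSubdim` and `SubShape()` exist.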

2 Stream<gpu>: stream_, blas_handle_, blas_handle_ownership_,
               dnn_handle_, dnn_handle_ownership_
   Wait(): cudaStreamSynchronize(stream_)
   CheckIdle(): cudaStreamQuery(stream_)
   GetStream(): returns the raw cudaStream_t
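The Wait()/CheckIdle() pair is the usual async-queue pattern. A CPU-only analogue can sketch the semantics without CUDA (`FakeStream` is a hypothetical stand-in for `Stream<gpu>`): Wait() blocks like cudaStreamSynchronize, CheckIdle() polls like cudaStreamQuery.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical CPU analogue of Stream<gpu>: tasks run asynchronously
// on one worker thread, in submission order.
class FakeStream {
 public:
  FakeStream() : stop_(false), busy_(0), worker_([this] { Loop(); }) {}
  ~FakeStream() {
    { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
    cv_.notify_all();
    worker_.join();
  }
  void Push(std::function<void()> task) {
    { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(task)); }
    cv_.notify_all();
  }
  // like cudaStreamSynchronize: block until all queued work is done
  void Wait() {
    std::unique_lock<std::mutex> lk(m_);
    done_cv_.wait(lk, [this] { return q_.empty() && busy_ == 0; });
  }
  // like cudaStreamQuery: non-blocking idleness check
  bool CheckIdle() {
    std::lock_guard<std::mutex> lk(m_);
    return q_.empty() && busy_ == 0;
  }

 private:
  void Loop() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return stop_ || !q_.empty(); });
        if (stop_ && q_.empty()) return;
        task = std::move(q_.front());
        q_.pop();
        ++busy_;
      }
      task();
      { std::lock_guard<std::mutex> lk(m_); --busy_; }
      done_cv_.notify_all();
    }
  }
  std::mutex m_;
  std::condition_variable cv_, done_cv_;
  std::queue<std::function<void()>> q_;
  bool stop_;
  int busy_;
  std::thread worker_;
};
```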

3 Tensor: kDevCPU, kSubdim, *dptr_, shape_, stride_, stream_
          MemSize()

4 Saver: saveto: Save(DType &a, DType b) { a = b; }
         sv::plusto, sv::minusto, sv::multo, sv::divto
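The savers are tiny policy classes that decide how a computed value is written into the destination. A simplified sketch of the `sv` namespace (names match mshadow; the real structs carry a few extra members):

```cpp
#include <cassert>

// Sketch of mshadow's saver policy structs.
// Save(a, b) writes the computed value b into destination a.
namespace sv {
struct saveto {
  template <typename DType> static void Save(DType &a, DType b) { a = b; }
};
struct plusto {
  template <typename DType> static void Save(DType &a, DType b) { a += b; }
};
struct minusto {
  template <typename DType> static void Save(DType &a, DType b) { a -= b; }
};
struct multo {
  template <typename DType> static void Save(DType &a, DType b) { a *= b; }
};
struct divto {
  template <typename DType> static void Save(DType &a, DType b) { a /= b; }
};
}  // namespace sv
```

This is what lets one kernel template implement `=`, `+=`, `-=`, `*=`, `/=` at once: the saver is just a template parameter.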

5 Reducer (namespace red): sum, maximum, minimum
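Reducers follow the same policy-class pattern as savers; a simplified sketch (an assumption: mshadow's real reducers also define init-value and gradient helpers):

```cpp
#include <algorithm>
#include <cassert>

// Sketch of mshadow's reducer policy structs.
// Reduce(dst, src) folds one element into the accumulator.
namespace red {
struct sum {
  template <typename DType> static void Reduce(DType &dst, DType src) {
    dst += src;
  }
  template <typename DType> static DType InitValue() { return DType(0); }
};
struct maximum {
  template <typename DType> static void Reduce(DType &dst, DType src) {
    dst = std::max(dst, src);
  }
};
struct minimum {
  template <typename DType> static void Reduce(DType &dst, DType src) {
    dst = std::min(dst, src);
  }
};
}  // namespace red
```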

6 Expression engine:
6.1 type: kRValue, kMapper, kChainer, kComplex
6.2 expression classes:
    ScalarExp
    TypecastExp
    TransposeExp
    RValueExp
DotExp: Eval() --> BLASEngine<xpu, DType>::gemm()
BLASEngine: gemm(), gemv(), ger(), dot()
BLASEngine<cpu, float>: gemm(), gemv(), ger(), dot()
BLASEngine<cpu, double>: gemm(), gemv(), ger(), dot()
BLASEngine<gpu, float>: gemm(), gemv(), ger(), dot()
BLASEngine<gpu, double>: gemm(), gemv(), ger(), dot()

BinaryMapExp: element-wise binary map, built by F<OP>(lhs, rhs)
UnaryMapExp: element-wise unary map, built by F<OP>(src)
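The whole engine is the classic expression-template trick: operators build a type that merely records the computation, and nothing is evaluated until assignment. A heavily simplified, self-contained sketch (1-D, float only; `Vec` is a hypothetical stand-in for Tensor, and real mshadow evaluates through Plan<> objects and the kRValue/kMapper/kChainer/kComplex flags above):

```cpp
#include <cassert>

// CRTP base: every expression knows its concrete type.
template <typename SubType>
struct Exp {
  const SubType &self() const { return *static_cast<const SubType *>(this); }
};

// Leaf expression wrapping a raw array (stands in for Tensor).
struct Vec : public Exp<Vec> {
  float *dptr;
  int len;
  Vec(float *d, int l) : dptr(d), len(l) {}
  float Eval(int i) const { return dptr[i]; }
  // Assignment is the only place the expression tree is evaluated.
  template <typename E>
  Vec &operator=(const Exp<E> &e) {
    for (int i = 0; i < len; ++i) dptr[i] = e.self().Eval(i);
    return *this;
  }
};

// BinaryMapExp: applies OP element-wise to two sub-expressions, lazily.
template <typename OP, typename TLhs, typename TRhs>
struct BinaryMapExp : public Exp<BinaryMapExp<OP, TLhs, TRhs>> {
  const TLhs &lhs;
  const TRhs &rhs;
  BinaryMapExp(const TLhs &l, const TRhs &r) : lhs(l), rhs(r) {}
  float Eval(int i) const { return OP::Map(lhs.Eval(i), rhs.Eval(i)); }
};

struct op_plus { static float Map(float a, float b) { return a + b; } };
struct op_mul  { static float Map(float a, float b) { return a * b; } };

template <typename TLhs, typename TRhs>
BinaryMapExp<op_plus, TLhs, TRhs> operator+(const Exp<TLhs> &l,
                                            const Exp<TRhs> &r) {
  return BinaryMapExp<op_plus, TLhs, TRhs>(l.self(), r.self());
}
template <typename TLhs, typename TRhs>
BinaryMapExp<op_mul, TLhs, TRhs> operator*(const Exp<TLhs> &l,
                                           const Exp<TRhs> &r) {
  return BinaryMapExp<op_mul, TLhs, TRhs>(l.self(), r.self());
}
```

`vc = va + vb * vb` therefore compiles down to a single loop with no temporary arrays, which is the point of the design.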

7 IStream: Read(), Write()
           SaveBinary(), LoadBinary()

8 TensorContainer: pad_, data_
9 TShape: ndim_, num_heap_allocated_, data_stack_[kStackCache], kStackCache, *data_heap_
10 TBlob: *dptr_, shape_, stride_, dev_mask_, type_flag_
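The `data_stack_[kStackCache]` / `*data_heap_` pair in TShape is a small-buffer optimization: shapes with few dimensions avoid heap allocation entirely. A minimal sketch (an assumption-level simplification; the real TShape also handles copies, iterators, and conversion to `Shape<dim>`):

```cpp
#include <cassert>
#include <cstddef>

// Sketch of TShape's stack/heap storage strategy.
class TShape {
 public:
  static const size_t kStackCache = 4;  // dims stored inline up to here
  TShape() : ndim_(0), num_heap_allocated_(0), data_heap_(nullptr) {}
  ~TShape() { delete[] data_heap_; }

  // Resize; only allocates when the stack cache is too small and the
  // existing heap block (if any) cannot be reused.
  void SetDim(size_t ndim) {
    if (ndim > kStackCache && ndim > num_heap_allocated_) {
      delete[] data_heap_;
      data_heap_ = new size_t[ndim];
      num_heap_allocated_ = ndim;
    }
    ndim_ = ndim;
  }
  size_t ndim() const { return ndim_; }
  size_t *data() { return ndim_ <= kStackCache ? data_stack_ : data_heap_; }
  bool on_heap() const { return ndim_ > kStackCache; }

 private:
  size_t ndim_;
  size_t num_heap_allocated_;       // capacity of data_heap_
  size_t data_stack_[kStackCache];  // inline storage for small shapes
  size_t *data_heap_;               // fallback for large shapes
};
```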

### -------------------------------------------------------------------------
namespace mshadow-ps

1 ThreadPQueue: use_fifo, pqueue_, fqueue_, lock_, counter_
                Push(), Pop(), Abort()
2 ThreadSafeMap: lock, map
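The push/pop/abort trio is a blocking queue whose waiters can be woken up for shutdown. A FIFO-only sketch (the real ThreadPQueue also supports a priority mode, selected by `use_fifo`):

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>

// FIFO-only sketch of the ThreadPQueue idea.
// Pop() blocks until data arrives or Abort() wakes all waiters.
template <typename T>
class ThreadSafeQueue {
 public:
  ThreadSafeQueue() : aborted_(false) {}
  void Push(T v) {
    { std::lock_guard<std::mutex> lk(lock_); q_.push(std::move(v)); }
    counter_.notify_one();
  }
  // Returns false if the queue was aborted and is empty.
  bool Pop(T *out) {
    std::unique_lock<std::mutex> lk(lock_);
    counter_.wait(lk, [this] { return aborted_ || !q_.empty(); });
    if (q_.empty()) return false;
    *out = std::move(q_.front());
    q_.pop();
    return true;
  }
  // Wake every blocked Pop() so worker threads can exit cleanly.
  void Abort() {
    { std::lock_guard<std::mutex> lk(lock_); aborted_ = true; }
    counter_.notify_all();
  }

 private:
  std::mutex lock_;
  std::condition_variable counter_;
  std::queue<T> q_;
  bool aborted_;
};
```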

### -------------------------------------------------------------------------
Public methods:

1 InitTensorEngine(int device_id = 0):
   cpu: nothing to do
   gpu: check that device_id is valid, then cudaSetDevice(device_id)

2 ShutdownTensorEngine(void): nothing
3 SetDevice(int devid):

4 NewStream<Device>(create_blas_handle, create_dnn_handle):
5 DeleteStream(Stream<Device> *stream):

6 AllocSpace(cpu/gpu): watch out for the pitch! (GPU rows are padded, so stride_ may exceed the row width)
7 FreeSpace(cpu/gpu):
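The pitch warning matters because cudaMallocPitch pads each row for alignment; mshadow records the padded row length in stride_, and Copy() uses cudaMemcpy2D precisely so the pitch on each side may differ. A toy illustration (`RoundPitch` and `Offset` are hypothetical helpers, and 128-byte alignment is just an example value):

```cpp
#include <cassert>
#include <cstddef>

// Round a row length up to the alignment boundary, as a pitched
// allocator would.
inline size_t RoundPitch(size_t width_bytes, size_t align = 128) {
  return ((width_bytes + align - 1) / align) * align;
}

// With pitched storage, element (y, x) lives at y * stride + x,
// NOT y * width + x.
inline size_t Offset(size_t y, size_t x, size_t stride_elems) {
  return y * stride_elems + x;
}
```

Forgetting the stride and indexing with the logical width is the classic bug pitched allocation invites.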

8 NewTensor(shape, initv): allocate a tensor and fill it with initv
9 Copy(cpu <-> gpu): wraps cudaMemcpy2D()

10 Softmax(cpu/gpu):
11 SoftmaxGrad(cpu/gpu):
12 MapExp(cpu/gpu):
13 MapReduceKeepLowest(cpu/gpu):
14 MapReduceKeepHighDim(cpu/gpu):
15 VectorDot(dst, lhs, rhs):
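Per row, Softmax(dst, src) computes exp(x - max) / sum, subtracting the row maximum first so large inputs do not overflow. A plain-CPU sketch of one row (`SoftmaxRow` is a hypothetical helper, not mshadow's signature):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Numerically stable softmax over one row.
std::vector<float> SoftmaxRow(const std::vector<float> &src) {
  float mx = src[0];
  for (float v : src) mx = std::max(mx, v);  // row max, for stability
  std::vector<float> dst(src.size());
  float sum = 0.0f;
  for (size_t i = 0; i < src.size(); ++i) {
    dst[i] = std::exp(src[i] - mx);  // exp never sees a huge argument
    sum += dst[i];
  }
  for (float &v : dst) v /= sum;  // normalize so the row sums to 1
  return dst;
}
```

Without the max-subtraction, an input like 1000 would make exp() overflow to inf; with it, the result is unchanged mathematically but finite in float.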

### ------------------------------------------------------------------------- 