Most people assume that neural-network inference should be left to GPUs, TPUs, and other dedicated ASICs. In fact, it can also be done on ARM.
Quite a while ago (about six months), ARM released CMSIS-NN, but it drew little interest.
In practice, the overhead of running NN inference on ARM is not that large. On-device training is another matter, of course: training costs far more than inference.
Actual memory overhead (from the official example). All of the buffers below can be allocated only when needed and freed as soon as they are done:
scratch_buffer ≈ 40 KB
col_buffer ≈ 3 KB, sized as CNN_IMG_SIZE*CNN_IMG_SIZE*NUM_OUT_CH; for example, with 3 color channels (RGB) and a CNN input size of 32, that is 3*32*32 = 3072 bytes ≈ 3 KB.
Output buffer of IP1_OUT_DIM bytes; 10 bytes in the example, determined by the number of possible output classes.
ARM actually ships a good end-to-end example; I summarize the workflow here.
First, install the Caffe deep-learning framework on a PC. The rough procedure is as follows.
1) Prepare the system environment.
apt install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
apt install libboost-all-dev
apt install libatlas-base-dev
apt install libgflags-dev libgoogle-glog-dev liblmdb-dev
apt install python-dev
2) Install Anaconda2 (download the installer, then run it with bash).
wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda2-5.0.1-Linux-x86_64.sh
bash Anaconda2-5.0.1-Linux-x86_64.sh
3) Download Caffe.
git clone https://github.com/BVLC/caffe.git
4) Export the relevant environment variables in .bashrc.
export PATH=/root/anaconda2/bin:$PATH
export PYTHONPATH=/root/caffe/python:$PYTHONPATH
5) Copy the example config file, then edit it.
cp Makefile.config.example Makefile.config
6) Edits for reference (CPU-only build):
## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!
# cuDNN acceleration switch (uncomment to build with cuDNN).
# USE_CUDNN := 1
# CPU-only switch (uncomment to build without GPU support)
CPU_ONLY := 1
# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0
# This code is taken from https://github.com/sh1r0/caffe-android-li
# USE_HDF5 := 0
# unc