Companion Tutorial for the Deep Learning C++ Code (2. Basic Data Operations)

闵帆

Article link: https://blog.csdn.net/minfanphd/article/details/113803461

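Navigation

Deep Learning C++ Code (on GitHub)
Companion Tutorial for the Deep Learning C++ Code (1. Overview)
Companion Tutorial for the Deep Learning C++ Code (2. Basic Data Operations)
Companion Tutorial for the Deep Learning C++ Code (3. Reading Data Files)
Companion Tutorial for the Deep Learning C++ Code (4. ANN: Classic Neural Networks)
Companion Tutorial for the Deep Learning C++ Code (5. CNN: Convolutional Neural Networks)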

This post introduces the basic data operations.

1. The Activator class

This class manages the different activation functions.
It has three member variables:

char activationFunction;
double gamma;
double beta;

The member methods are as follows:

//The default constructor
Activator();
//The constructor
Activator(char paraActivationFunction);
//The destructor
virtual ~Activator();
//Convert to string for display.
//Most classes have this method to ease debugging.
string toString();
//Set activation function
void setActivationFunction(char paraFunction);
//Setter. Set once.
void setGamma(double paraGamma);
//Setter. Set once.
void setBeta(double paraBeta);

//The sigmoid activation function
double sigmoid(double paraValue);
//The sigmoid derive function. 
double sigmoidDerive(double paraValue);

//The tanh activation function
double tanh(double paraValue);
//The tanh derive function. 
double tanhDerive(double paraValue);

//The hard-logistic activation function
double hardLogistic(double paraValue);
//The hard-logistic derive function
double hardLogisticDerive(double paraValue);

//The hard-tanh activation function
double hardTanh(double paraValue);
//The relu activation function
double relu(double paraValue);
//The LeakyReLU activation function
double leakyRelu(double paraValue);
//The ELU activation function
double elu(double paraValue);
//The Softplus activation function
double softplus(double paraValue);
//The Softsign activation function
double softsign(double paraValue);
//The Swish activation function
double swish(double paraValue);
//The GELU activation function
double gelu(double paraValue);

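//Activate. Call the method that matches the current activation function.
double activate(double paraValue);
//Derive. Call the derivative method that matches the current activation function.
double derive(double paraValue);

//Unit test. 
//Most classes have this method to support unit testing.
void unitTest();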

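Here activate() and derive() are the public methods. The concrete activation functions are accessed only through these two, so the concrete functions themselves are declared private. Some activation functions need parameters such as gamma and beta, which should be assigned beforehand through the corresponding setters.
Other activation functions can be added if needed.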
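The dispatch idea described above can be sketched as follows. This is only a minimal illustration, not the code from the repository: the class name SketchActivator and the character codes 's', 't' and 'r' are hypothetical, and the sketch assumes that derive() receives the pre-activation input (the real class may instead take the already activated value).

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

// Hypothetical codes; the codes used by the real Activator are not given in this post.
constexpr char SIGMOID = 's';
constexpr char TANH = 't';
constexpr char RELU = 'r';

class SketchActivator
{
public:
    explicit SketchActivator(char paraFunction) : activationFunction(paraFunction) {}

    // Dispatch to the function selected by activationFunction.
    double activate(double paraValue) const
    {
        switch (activationFunction)
        {
        case SIGMOID: return sigmoid(paraValue);
        case TANH: return std::tanh(paraValue);
        case RELU: return relu(paraValue);
        default: throw std::invalid_argument("Unknown activation function.");
        }//Of switch
    }//Of activate

    // Derivative with respect to the input (an assumption of this sketch).
    double derive(double paraValue) const
    {
        switch (activationFunction)
        {
        case SIGMOID:
        {
            double tempValue = sigmoid(paraValue);
            return tempValue * (1.0 - tempValue);
        }
        case TANH:
        {
            double tempValue = std::tanh(paraValue);
            return 1.0 - tempValue * tempValue;
        }
        case RELU: return paraValue > 0 ? 1.0 : 0.0;
        default: throw std::invalid_argument("Unknown activation function.");
        }//Of switch
    }//Of derive

private:
    // The concrete functions are private; callers only see activate() and derive().
    double sigmoid(double paraValue) const { return 1.0 / (1.0 + std::exp(-paraValue)); }
    double relu(double paraValue) const { return paraValue > 0 ? paraValue : 0.0; }

    char activationFunction;
};//Of class SketchActivator

// Usage: the caller never names a concrete activation function.
void demoActivator()
{
    SketchActivator tempActivator(SIGMOID);
    assert(std::fabs(tempActivator.activate(0.0) - 0.5) < 1e-12);
    assert(std::fabs(tempActivator.derive(0.0) - 0.25) < 1e-12);
}//Of demoActivator
```

Keeping the concrete functions private means that adding a new activation function only touches the two switch statements, while all calling code stays unchanged.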

2. The MfSize class

This class manages the size of a two-dimensional datum (such as an image) and has only two member variables:

int width;
int height;

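It is currently used mainly for CNN. Note that its width/height correspond to rows/columns elsewhere, so the mapping needs careful checking. The member methods are as follows: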

//The default constructor.
MfSize();
//The constructor with enough parameters.
MfSize(int, int);
//The destructor.
virtual ~MfSize();
//Set width and height.
void setValues(int, int);
//Clone the size to me.
MfSize* cloneToMe(MfSize* paraFirstSize);
//Divide two sizes, the result is stored to me.
MfSize* divideToMe(MfSize*, MfSize*);
//Subtract two sizes, and append a value on both directions. 
//The result is stored to me.
MfSize* subtractToMe(MfSize*, MfSize*, int);
//For display
string toString();
//Unit test.
void unitTest();

Several methods here end with ToMe, which means that the result is stored in the this object.
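To make the ToMe convention concrete, here is a small sketch under stated assumptions. SketchSize is a hypothetical stand-in for MfSize; the semantics assumed here are that subtractToMe computes first - second + append in both directions (as the comment in the declaration suggests) and that divideToMe requires exact divisibility, which the post does not state.

```cpp
#include <cassert>
#include <stdexcept>

class SketchSize
{
public:
    int width;
    int height;

    SketchSize(int paraWidth = 0, int paraHeight = 0) : width(paraWidth), height(paraHeight) {}

    // this = first / second in each direction. Exact divisibility is an assumption.
    SketchSize* divideToMe(const SketchSize* paraFirst, const SketchSize* paraSecond)
    {
        if (paraSecond->width == 0 || paraSecond->height == 0
            || paraFirst->width % paraSecond->width != 0
            || paraFirst->height % paraSecond->height != 0)
        {
            throw std::invalid_argument("Sizes are not divisible.");
        }//Of if
        width = paraFirst->width / paraSecond->width;
        height = paraFirst->height / paraSecond->height;
        return this;
    }//Of divideToMe

    // this = first - second + paraAppend in each direction.
    SketchSize* subtractToMe(const SketchSize* paraFirst, const SketchSize* paraSecond, int paraAppend)
    {
        width = paraFirst->width - paraSecond->width + paraAppend;
        height = paraFirst->height - paraSecond->height + paraAppend;
        return this;
    }//Of subtractToMe
};//Of class SketchSize

// Usage: a 28 x 28 image, a 5 x 5 kernel, then 2 x 2 pooling.
void demoSize()
{
    SketchSize tempImage(28, 28), tempKernel(5, 5), tempPool(2, 2), tempResult;
    tempResult.subtractToMe(&tempImage, &tempKernel, 1);
    assert(tempResult.width == 24 && tempResult.height == 24);
    // The result object may also serve as an operand.
    tempResult.divideToMe(&tempResult, &tempPool);
    assert(tempResult.width == 12 && tempResult.height == 12);
}//Of demoSize
```

Because every method writes into an existing object and returns this, no new object is allocated along the way.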

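3. The MfIntArray class

This class manages an integer array, which can represent, for example, a random order or the label vector $Y$ of a data set. It exists because a C++ int* needs a separate variable to record its length, while int[] is inconvenient for dynamic allocation. Java would not need such a class.
It has only two member variables: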

int length;
int* data;

The member methods are as follows:

//Default constructor
MfIntArray();
//Initialize it with the given length
MfIntArray(int paraLength);
//Initialize it with the given array
MfIntArray(int paraLength, int* paraValues);
//Destructor
virtual ~MfIntArray();
//Convert to string
string toString();

//Set one value
void setValue(int paraPosition, int paraValue);
//Get the value at the given position
int getValue(int paraPosition);
//Get length. No setLength enabled
int getLength();

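//Copy from another array.
//As with the ToMe methods, this object must have the same length as the given one.
void copyFrom(MfIntArray* paraArray);
//Randomize the order.
//E.g., [3, 5, 0, 2, 4, 1]. Supports indirect addressing to visit the data in random order. See the MfDataReader class.
void randomizeOrder();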

//Code unit test
void unitTest();

This class does not support changing length; at least for now there is no such need.
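The random order and the indirect addressing mentioned in the comments can be sketched as below. The post does not show how randomizeOrder() is implemented, so this sketch swaps in a standard Fisher-Yates shuffle; the free functions randomizeOrder() and gatherInOrder() and the use of std::vector instead of int* are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Fill paraOrder with 0..n-1, then shuffle it in place (Fisher-Yates).
void randomizeOrder(std::vector<int>& paraOrder, std::mt19937& paraGenerator)
{
    int tempLength = static_cast<int>(paraOrder.size());
    for (int i = 0; i < tempLength; i++)
    {
        paraOrder[i] = i;
    }//Of for i

    for (int i = tempLength - 1; i > 0; i--)
    {
        // Pick a position in [0, i] and swap it to the end of the unshuffled part.
        std::uniform_int_distribution<int> tempDistribution(0, i);
        std::swap(paraOrder[i], paraOrder[tempDistribution(paraGenerator)]);
    }//Of for i
}//Of randomizeOrder

// Indirect addressing: read the values in shuffled order without moving them.
std::vector<int> gatherInOrder(const std::vector<int>& paraValues, const std::vector<int>& paraOrder)
{
    std::vector<int> resultValues(paraOrder.size());
    for (std::size_t i = 0; i < paraOrder.size(); i++)
    {
        resultValues[i] = paraValues[paraOrder[i]];
    }//Of for i
    return resultValues;
}//Of gatherInOrder

void demoRandomOrder()
{
    std::mt19937 tempGenerator(2021); // Fixed seed, only for reproducibility.
    std::vector<int> tempOrder(6);
    randomizeOrder(tempOrder, tempGenerator);

    std::vector<int> tempLabels = {10, 11, 12, 13, 14, 15};
    std::vector<int> tempShuffled = gatherInOrder(tempLabels, tempOrder);

    // Every label still appears exactly once.
    std::sort(tempShuffled.begin(), tempShuffled.end());
    assert(tempShuffled == tempLabels);
}//Of demoRandomOrder
```

Shuffling an index array rather than the data keeps the instances and their labels aligned, since both are read through the same order.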

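4. The MfDoubleMatrix class

A two-dimensional array of reals can hold many things, such as the data $X$ or a CNN kernel.
To store one-dimensional data, simply set the number of rows to 1.
The member variables are as follows: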

int rows;
int columns;
double** data;
Activator* activator;

Changing rows and columns is not supported, which simplifies memory management.
The member methods are as follows:

//The default constructor. Not actually used in practice.
MfDoubleMatrix();
//Initialize a matrix with given sizes.
//All values are initialized to random numbers in [0, 1]. Use fill(double, double) to change this.
MfDoubleMatrix(int paraRows, int paraColumns);
//Destructor.
virtual ~MfDoubleMatrix();
//Convert to string for display.
string toString();

//Getter.
int getRows();
//Getter.
int getColumns();
//Set a value at the given position.
double setValue(int paraRow, int paraColumn, double paraValue);
//Get a value at the given position.
double getValue(int paraRow, int paraColumn);

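//Range check.
//Check whether every element lies within the given interval; used for debugging.
bool rangeCheck(double paraLowerBound, double paraUpperBound);

//Getter.
//This method exposes the internal data to arbitrary access, which is rather dangerous;
//but it also allows convenient data[i][j] style operations, so it is used often.
double** getData();

//Setter.
void setActivator(Activator* paraActivator);
//Getter.
Activator* getActivator();
//Activate, return myself.
//Uses the activator set beforehand.
//Supports activating/deriving all elements of the matrix; useful in forward and backPropagation.
MfDoubleMatrix* activate();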

//Copy a matrix.
MfDoubleMatrix* cloneToMe(MfDoubleMatrix*);
//Add another one with the same size to me, no space allocation.
MfDoubleMatrix* addToMe(MfDoubleMatrix*, MfDoubleMatrix*);
//Each element adds the same value.
MfDoubleMatrix* addValueToMe(double);
//Each element times the same value.
MfDoubleMatrix* timesValueToMe(double);
//1 - each element.
MfDoubleMatrix* oneValueToMe();
//Minus another one with the same size to me.
MfDoubleMatrix* subtractToMe(MfDoubleMatrix*, MfDoubleMatrix*);
//Point-to-point multiply another one with the same size to me.
MfDoubleMatrix* cwiseProductToMe(MfDoubleMatrix*, MfDoubleMatrix*);
//Times to me, return myself, m*n times n*k gets m*k.
MfDoubleMatrix* timesToMe(MfDoubleMatrix*, MfDoubleMatrix*);
//Transpose to me.
MfDoubleMatrix* transposeToMe(MfDoubleMatrix*);

//Fill the matrix with the same value.
void fill(double paraValue);
//Fill the matrix with a random value between the bounds.
void fill(double paraLowerBound, double paraUpperBound);

//Convolution valid, the size is smaller, return myself.
//Used in CNN forward.
MfDoubleMatrix* convolutionValidToMe(MfDoubleMatrix *paraData, MfDoubleMatrix *paraKernel);
//Convolution full, the size is bigger, return myself.
//Used in CNN backPropagation.
MfDoubleMatrix* convolutionFullToMe(MfDoubleMatrix *paraData, MfDoubleMatrix *paraKernel);

//Rotate 180 degrees.
MfDoubleMatrix* rotate180ToMe(MfDoubleMatrix* paraMatrix);
//Scale the matrix with the given size to me.
MfDoubleMatrix* scaleToMe(MfDoubleMatrix* paraMatrix, MfSize* paraSize);
//Kronecker: copy many times.
MfDoubleMatrix* kroneckerToMe(MfDoubleMatrix* paraMatrix, MfSize* paraSize);
//Derive each element.
MfDoubleMatrix* deriveToMe(MfDoubleMatrix* paraMatrix);

//Sum up to a value. Add up all elements.
double sumUp();

//Code unit test
void unitTest();

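This class used to have some non-ToMe methods, which allocated space dynamically. I recently found that none of them are needed, so I commented them out. Safety first.
The following example illustrates the style of this class.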

MfDoubleMatrix* MfDoubleMatrix::subtractToMe(MfDoubleMatrix* paraFirstMatrix, MfDoubleMatrix* paraSecondMatrix)
{
    if ((rows != paraFirstMatrix->rows) || (rows != paraSecondMatrix->rows))
    {
        printf("MfDoubleMatrix.subtractToMe(), rows do not match.");
        throw "Rows do not match.";
    }//Of if

    if ((columns != paraFirstMatrix->columns) || (columns != paraSecondMatrix->columns))
    {
        printf("MfDoubleMatrix.subtractToMe(), columns do not match.");
        throw "Columns do not match.";
    }//Of if

    double** tempFirstData = paraFirstMatrix->data;
    double** tempSecondData = paraSecondMatrix->data;
    for (int i = 0; i < rows; i ++)
    {
        for (int j = 0; j < columns; j ++)
        {
            data[i][j] = tempFirstData[i][j] - tempSecondData[i][j];
        }//Of for j
    }//Of for i

    return this;
}//Of subtractToMe

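A few remarks:

tempFirstData and tempSecondData are merely pointers to the corresponding data (references in Java). They involve no new dynamic allocation; the only extra storage is the pointers themselves (4 bytes each on a 32-bit platform, 8 bytes on a 64-bit one). Assigning through them with array syntax avoids writing paraSecondMatrix->data[i][j] or paraSecondMatrix->getValue(i, j).
By definition the function involves three matrices, but the following usage is very common:
firstMatrix->subtractToMe(firstMatrix, secondMatrix)
It involves only two matrices.
The matrix sizes are checked, and the check in this function is complete because I fixed it while writing this document. Some other functions may not be there yet. Writing a fully robust program takes pass after pass.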
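The two-matrix usage works because an element-wise operation reads position (i, j) of each operand before writing position (i, j) of the result, so the result may alias an operand. The sketch below illustrates this together with the size check. SketchMatrix is a hypothetical stand-in that uses std::vector and std::invalid_argument instead of the double** storage and the string literal thrown in the original.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

class SketchMatrix
{
public:
    SketchMatrix(int paraRows, int paraColumns, double paraValue)
        : rows(paraRows), columns(paraColumns),
          data(paraRows, std::vector<double>(paraColumns, paraValue)) {}

    // this = first - second, element by element. All three sizes must match.
    SketchMatrix* subtractToMe(const SketchMatrix* paraFirst, const SketchMatrix* paraSecond)
    {
        if (rows != paraFirst->rows || rows != paraSecond->rows
            || columns != paraFirst->columns || columns != paraSecond->columns)
        {
            throw std::invalid_argument("Sizes do not match.");
        }//Of if

        for (int i = 0; i < rows; i++)
        {
            for (int j = 0; j < columns; j++)
            {
                // Reading and writing the same position is safe when this == paraFirst.
                data[i][j] = paraFirst->data[i][j] - paraSecond->data[i][j];
            }//Of for j
        }//Of for i
        return this;
    }//Of subtractToMe

    double getValue(int paraRow, int paraColumn) const { return data[paraRow][paraColumn]; }

private:
    int rows;
    int columns;
    std::vector<std::vector<double>> data;
};//Of class SketchMatrix

void demoSubtract()
{
    SketchMatrix tempFirst(2, 3, 5.0), tempSecond(2, 3, 2.0), tempWrong(3, 2, 0.0);

    // Two-matrix usage: the result overwrites the first operand.
    tempFirst.subtractToMe(&tempFirst, &tempSecond);
    assert(tempFirst.getValue(1, 2) == 3.0);

    // A mismatched size is rejected before any element is touched.
    bool tempCaught = false;
    try
    {
        tempFirst.subtractToMe(&tempFirst, &tempWrong);
    }
    catch (const std::invalid_argument&)
    {
        tempCaught = true;
    }//Of try
    assert(tempCaught);
    assert(tempFirst.getValue(0, 0) == 3.0);
}//Of demoSubtract
```

The same aliasing would not be safe for a matrix product, where each output element depends on a whole row and a whole column of the operands.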

To test the MfDoubleMatrix class on its own, comment out the other code in main.cpp and keep only

int main()
{
    printf("Hello world!\r\n");
    MfDoubleMatrix tempMatrix;
    tempMatrix.unitTest();
    printf("end.\r\n");
    getchar();

    return 0;
}//Of main

The other classes can be tested in the same way.
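The comments on convolutionValidToMe() and convolutionFullToMe() in this section say that the valid result is smaller and the full result is bigger than the input. A rough sketch of why: with an r x c input and a kr x kc kernel, valid convolution yields (r - kr + 1) x (c - kc + 1), and full convolution yields (r + kr - 1) x (c + kc - 1). The code below is not the repository's implementation. It assumes the kernel is not flipped, builds the full version by zero padding followed by the valid version, and returns new matrices rather than following the ToMe convention.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Valid convolution; the kernel is not flipped (an assumption of this sketch).
Matrix convolutionValid(const Matrix& paraData, const Matrix& paraKernel)
{
    if (paraData.empty() || paraKernel.empty() || paraKernel[0].empty()
        || paraKernel.size() > paraData.size()
        || paraKernel[0].size() > paraData[0].size())
    {
        throw std::invalid_argument("Kernel must be non-empty and no larger than the data.");
    }//Of if

    std::size_t tempRows = paraData.size() - paraKernel.size() + 1;
    std::size_t tempColumns = paraData[0].size() - paraKernel[0].size() + 1;
    Matrix resultMatrix(tempRows, std::vector<double>(tempColumns, 0.0));
    for (std::size_t i = 0; i < tempRows; i++)
    {
        for (std::size_t j = 0; j < tempColumns; j++)
        {
            for (std::size_t ki = 0; ki < paraKernel.size(); ki++)
            {
                for (std::size_t kj = 0; kj < paraKernel[0].size(); kj++)
                {
                    resultMatrix[i][j] += paraData[i + ki][j + kj] * paraKernel[ki][kj];
                }//Of for kj
            }//Of for ki
        }//Of for j
    }//Of for i
    return resultMatrix;
}//Of convolutionValid

// Full convolution: pad the data with zeros on every side, then run the valid version.
Matrix convolutionFull(const Matrix& paraData, const Matrix& paraKernel)
{
    if (paraData.empty() || paraKernel.empty() || paraKernel[0].empty())
    {
        throw std::invalid_argument("Data and kernel must be non-empty.");
    }//Of if

    std::size_t tempRowPad = paraKernel.size() - 1;
    std::size_t tempColumnPad = paraKernel[0].size() - 1;
    Matrix tempPadded(paraData.size() + 2 * tempRowPad,
        std::vector<double>(paraData[0].size() + 2 * tempColumnPad, 0.0));
    for (std::size_t i = 0; i < paraData.size(); i++)
    {
        for (std::size_t j = 0; j < paraData[0].size(); j++)
        {
            tempPadded[i + tempRowPad][j + tempColumnPad] = paraData[i][j];
        }//Of for j
    }//Of for i
    return convolutionValid(tempPadded, paraKernel);
}//Of convolutionFull

void demoConvolution()
{
    Matrix tempData = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
    Matrix tempKernel = {{1, 1}, {1, 1}};
    Matrix tempValid = convolutionValid(tempData, tempKernel);
    assert(tempValid.size() == 2 && tempValid[0].size() == 2); // 3 - 2 + 1
    Matrix tempFull = convolutionFull(tempData, tempKernel);
    assert(tempFull.size() == 4 && tempFull[0].size() == 4); // 3 + 2 - 1
}//Of demoConvolution
```

These size formulas are also where an MfSize-style subtraction with an appended value of 1 comes in handy.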

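5. The Mf4DTensor class

This class is commonly used in CNN to manage four-dimensional data. Its design follows that of MfDoubleMatrix, but since it can rely on the latter, it has fewer methods.
The member variables are as follows: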

int firstLength;
int secondLength;
int thirdLength;
int fourthLength;
double**** data;

The member methods are as follows:

 //The constructor.
 Mf4DTensor();
 //The constructor.
 Mf4DTensor(int, int, int, int);
 //The destructor.
 virtual ~Mf4DTensor();

 //Fill with one value.
 void fill(double);

 //Getter.
 double**** getData();

 //Convert to string for display.
 string toString();

 //Set one value.
 void setValue(int, int, int, int, double);
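 //Sum to a matrix.
 //A custom method used in CNN.
 //paraIndex specifies the index in the second dimension; the first dimension is traversed and accumulated into a two-dimensional matrix.
 void sumToMatrix(int paraIndex, MfDoubleMatrix* paraMatrix);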
 //Unit test.
 void unitTest();

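In principle, the sumToMatrix function here should be defined in the MfDoubleMatrix class, taking an Mf4DTensor object as its parameter. However, this runs into a circular include problem: if the Mf4DTensor class uses
#include "MfDoubleMatrix.h"
then the MfDoubleMatrix class cannot use
#include "Mf4DTensor.h"
So it ended up the way it is now.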
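One common C++ way around such a circular include is a forward declaration: a header that only uses pointers or references to a class can declare the class name instead of including its header. The single-file sketch below imitates that layout. It is not how the repository is organized; SketchMatrix, SketchTensor and sumFromTensorToMe are hypothetical names, and the tensor is stored as a two-level vector of matrices purely for brevity.

```cpp
#include <cassert>
#include <vector>

// Would sit in "SketchMatrix.h" in place of #include "SketchTensor.h".
class SketchTensor;

class SketchMatrix
{
public:
    SketchMatrix(int paraRows, int paraColumns, double paraValue = 0.0)
        : rows(paraRows), columns(paraColumns),
          data(paraRows, std::vector<double>(paraColumns, paraValue)) {}

    // Only a pointer is used here, so the incomplete type is sufficient.
    SketchMatrix* sumFromTensorToMe(const SketchTensor* paraTensor, int paraIndex);

    int rows;
    int columns;
    std::vector<std::vector<double>> data;
};//Of class SketchMatrix

// Would sit in "SketchTensor.h", which includes "SketchMatrix.h" as usual.
class SketchTensor
{
public:
    SketchTensor(int paraFirst, int paraSecond, int paraRows, int paraColumns, double paraValue)
        : firstLength(paraFirst), secondLength(paraSecond),
          slices(paraFirst, std::vector<SketchMatrix>(paraSecond,
              SketchMatrix(paraRows, paraColumns, paraValue))) {}

    int firstLength;
    int secondLength;
    // slices[i][j] is the matrix at first index i and second index j.
    std::vector<std::vector<SketchMatrix>> slices;
};//Of class SketchTensor

// Would sit in "SketchMatrix.cpp", which may include both headers.
// Fix the second index, traverse the first, and accumulate into this matrix.
// The slices are assumed to have the same size as this matrix.
SketchMatrix* SketchMatrix::sumFromTensorToMe(const SketchTensor* paraTensor, int paraIndex)
{
    for (int r = 0; r < rows; r++)
    {
        for (int c = 0; c < columns; c++)
        {
            data[r][c] = 0.0;
            for (int i = 0; i < paraTensor->firstLength; i++)
            {
                data[r][c] += paraTensor->slices[i][paraIndex].data[r][c];
            }//Of for i
        }//Of for c
    }//Of for r
    return this;
}//Of sumFromTensorToMe

void demoForwardDeclaration()
{
    SketchTensor tempTensor(3, 2, 2, 2, 1.5);
    SketchMatrix tempSum(2, 2);
    tempSum.sumFromTensorToMe(&tempTensor, 1);
    assert(tempSum.data[0][0] == 4.5); // 3 slices of 1.5 each
}//Of demoForwardDeclaration
```

The forward declaration works here because the method declaration only needs to know that SketchTensor is a class; its members are required only in the definition, by which point the full class is visible.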

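6. Summary

The classes in this section are all fairly simple. A few key points:

The design idea behind the ToMe family of methods. It keeps the memory used by the program constant once it is running;
A uniform programming style (such as naming rules) helps later collaborative programming. Well, call this my dream;
Self-defined methods give us full control over matrix operations and the like (we can play with them however we want), which is exactly why I wrote this program;
Many core computations of neural networks take place in MfDoubleMatrix. Once I have learned parallel computing, I will rework this class so that it can happily use GPU acceleration.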