Rewriting a convolutional network's forward pass in C++ and exactly reproducing Theano's test results

空穴来风

Category: Machine Learning. Tags: convolutional neural network, c++, theano, cnn

Article link: https://blog.csdn.net/qiaofangjie/article/details/18042407

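My requirement:

Train a neural network with Theano's CNN and save the final, converged weights. Then implement the CNN forward pass in C++, load the Theano weights, and reproduce Theano's test results.

What I ended up with:

1. The forward pass of a convolutional neural network
2. The forward and backward passes of an MLP, so it can also be used for training
Note:
To reproduce Theano's test results, the hidden layer's activation function must be tanh;
otherwise, for training the MLP, choose sigmoid as the activation function.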
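
The two activation choices mentioned above can be sketched in C++ as follows. The function names `ActivateTanh` and `ActivateSigmoid` are made up for this illustration and are not taken from the project's source.

```cpp
#include <cmath>

// Hypothetical helper names, for illustration only.

// tanh maps any real input into (-1, 1). The post states that the
// Theano-trained hidden layer uses it, so the C++ forward pass must too.
double ActivateTanh(double x)
{
	return std::tanh(x);
}

// The logistic sigmoid maps any real input into (0, 1).
double ActivateSigmoid(double x)
{
	return 1.0 / (1.0 + std::exp(-x));
}
```

Weights trained with one activation only make sense when the forward pass applies the same one, which is why the choice matters when reproducing Theano's numbers.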

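Results:

Theano's training and test run (a screenshot in the original post) reports a validation error rate of 9.23%.

My C++ program (also a screenshot in the original post) reports the same validation error rate, 9.23%, exactly reproducing Theano's result.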

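In short, there were two difficulties:

1. How can Theano's weights and test samples be shared with C++?

2. When Theano convolves, how are the feature maps of the previous layer combined and mapped onto each pixel of the current layer?

I took many detours while solving these two problems.

To reproduce Theano's test results in C++, the C++ program must be able to read the weights and test samples that Theano saves.

My reasoning went as follows:

1. Theano's weights are NumPy arrays. Reading them directly from C++ is hard: the NumPy format is awkward to parse and there is little material about it online.

2. Use Python as an intermediate converter to get the data into a form C++ can read. Reading the Theano code later, I found that training samples loaded into Python do not need to be converted to NumPy arrays; plain Python objects work. But a file dumped with cPickle carries a lot of extra formatting, which makes it unsuitable for exchanging data with C++.

3. Convert through JSON: both Python and C++ have JSON libraries, so both sides can exchange data as JSON. However, the trained weights are NumPy arrays and must become Python lists before json can write them to a file. So the question became: how do you convert a NumPy array into a Python list?

4. After a day of searching for an answer to 3, I finally found the NumPy array method tolist, which converts a NumPy array into a Python list.

5. Now both Python and C++ can use JSON. I studied the jsoncpp library and used it to read the JSON file written by Python. In my tests jsoncpp handled large files poorly: it easily ran out of memory and was very slow, so I dropped it.

6. Write C++ functions that parse the JSON file myself. Also, when generating training and test samples from pot files, generate them directly in C++ without converting to NumPy arrays.

This solved difficulty 1: the weights and test samples are exchanged between Theano and C++ as JSON, and the JSON files are parsed by my own functions.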
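
To give an idea of how little parsing such a file needs, here is a minimal sketch. It is not the author's parser: `ParseJsonNumbers` is a hypothetical helper, and it assumes the file contains nothing but nested arrays of numbers, which is what `json.dump` produces for lists of lists of floats. It relies on the standard `strtod` to recognize numbers and skips every other character.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper, not the author's parser: collects every number found
// in a JSON text that contains only nested arrays of numbers, in file order.
std::vector<double> ParseJsonNumbers(const std::string &text)
{
	std::vector<double> values;
	const char *p = text.c_str();
	while (*p != '\0')
	{
		char *end = NULL;
		double v = std::strtod(p, &end);
		if (end != p)
		{
			values.push_back(v);   // strtod consumed a number
			p = end;
		}
		else
		{
			++p;                   // skip '[', ']', ',' or whitespace
		}
	}
	return values;
}
```

Because the brackets are discarded, the caller has to know the shape of each array in advance and cut the flat sequence accordingly. The C++ code later in this post passes the row length of each weight matrix (`vecSecondDimOfWeigh`) to its loader, which suggests a similar approach.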

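For difficulty 2, look at a typical CNN diagram (the layer names S2, C3 and S4 below come from it).

Difficulty 2 in detail:

Going from S2 to C3, how does Theano choose which S2 feature maps to combine? Is the selection fixed, or are they combined dynamically by some algorithm?
In the pooling step from C3 to S4, with poolsize (2*2), how do every 4 pixels of C3 become one pixel of S4?

After a lot of analysis and comparison against Theano's output, I reached these conclusions:

Going from S2 to C3, Theano combines all of the S2 feature maps.
In the pooling step from C3 to S4, with poolsize (2*2), the maximum of every 4 pixels of C3 becomes one pixel of S4.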
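
The first conclusion can be sketched in C++ like this. `CombineAllMaps` is a hypothetical helper written for this illustration, and the row-by-row, map-after-map memory layout is an assumption. It computes one output feature map, and every input map contributes to every output pixel.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: computes ONE output feature map from ALL input maps.
// input  : numMaps maps of width*width pixels, stored one after another
// kernel : numMaps kernels of kWidth*kWidth weights, stored the same way
// The kernel is applied as stored (no flipping is done here).
std::vector<double> CombineAllMaps(const std::vector<double> &input, int numMaps, int width,
                                   const std::vector<double> &kernel, int kWidth)
{
	assert((int)input.size() == numMaps * width * width);
	assert((int)kernel.size() == numMaps * kWidth * kWidth);
	const int outWidth = width - kWidth + 1;
	std::vector<double> output(outWidth * outWidth, 0.0);
	for (int r = 0; r < outWidth; ++r)
	{
		for (int c = 0; c < outWidth; ++c)
		{
			double sum = 0.0;
			for (int m = 0; m < numMaps; ++m)   // every input map contributes
				for (int i = 0; i < kWidth; ++i)
					for (int j = 0; j < kWidth; ++j)
						sum += input[m * width * width + (r + i) * width + (c + j)]
						     * kernel[m * kWidth * kWidth + i * kWidth + j];
			output[r * outWidth + c] = sum;
		}
	}
	return output;
}
```

A layer with several output feature maps would call this once per output map, each time with that map's own set of `numMaps` kernels.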

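With that analysis the theory was mostly clear. What remained was writing the code, and the truly time-consuming part was debugging: the program kept failing to reproduce Theano's test results.

More than once I thought reproducing it was impossible; who knows what Theano does internally.

Today the code finally works, which is exciting, hence this post.

Two bugs held me back. One was a gap in theory: I had not understood a detail of Theano's convolution. The other was my own carelessness: a wrongly initialized variable. They are:

In the S2-to-C3 convolution, Theano rotates the kernel by 180 degrees before sliding it over the input in the usual sliding-window fashion (the figure from the original post is not reproduced here). I was new to this and simply did not know...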
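
The difference can be sketched in C++ as follows. The names `Rotate180`, `Correlate` and `Convolve` are made up for this example, and a single square map stored row by row is assumed. A true convolution gives the same result as the plain sliding-window sum applied with the kernel rotated by 180 degrees.

```cpp
#include <vector>

// Hypothetical helpers. A kWidth*kWidth kernel stored row by row is rotated by
// 180 degrees simply by reversing the order of its elements.
std::vector<double> Rotate180(const std::vector<double> &kernel)
{
	return std::vector<double>(kernel.rbegin(), kernel.rend());
}

// Plain sliding-window sum (cross-correlation) over the "valid" region.
std::vector<double> Correlate(const std::vector<double> &image, int width,
                              const std::vector<double> &kernel, int kWidth)
{
	const int outWidth = width - kWidth + 1;
	std::vector<double> output(outWidth * outWidth, 0.0);
	for (int r = 0; r < outWidth; ++r)
		for (int c = 0; c < outWidth; ++c)
			for (int i = 0; i < kWidth; ++i)
				for (int j = 0; j < kWidth; ++j)
					output[r * outWidth + c] +=
						image[(r + i) * width + (c + j)] * kernel[i * kWidth + j];
	return output;
}

// True convolution: the sliding-window sum with the kernel rotated by 180 degrees.
std::vector<double> Convolve(const std::vector<double> &image, int width,
                             const std::vector<double> &kernel, int kWidth)
{
	return Correlate(image, width, Rotate180(kernel), kWidth);
}
```

If the kernels are stored already rotated, as the export function in this post does, the C++ side can presumably get by with the plain sliding-window sum alone.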

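When taking the pixel maximum from C3 to S4, I assumed all pixel values were positive and initialized the running maximum to 0. Convolution outputs can be negative, so the maximum came out wrong (this bug took the longest to find; a painful lesson...)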
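
A minimal sketch of max pooling that avoids this bug is shown below. `MaxPool` is a hypothetical helper, not the project's `PoolLayer`; it assumes a square map stored row by row whose width is a multiple of the pool width. The running maximum starts from the first pixel of each window instead of 0.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: non-overlapping max pooling over poolWidth*poolWidth windows.
std::vector<double> MaxPool(const std::vector<double> &map, int width, int poolWidth)
{
	assert((int)map.size() == width * width);
	assert(width % poolWidth == 0);
	const int outWidth = width / poolWidth;
	std::vector<double> output(outWidth * outWidth);
	for (int r = 0; r < outWidth; ++r)
	{
		for (int c = 0; c < outWidth; ++c)
		{
			// Start from the first pixel of the window, NOT from 0:
			// a window may contain only negative values.
			double best = map[(r * poolWidth) * width + c * poolWidth];
			for (int i = 0; i < poolWidth; ++i)
				for (int j = 0; j < poolWidth; ++j)
				{
					double v = map[(r * poolWidth + i) * width + (c * poolWidth + j)];
					if (v > best)
						best = v;
				}
			output[r * outWidth + c] = best;
		}
	}
	return output;
}
```

With a window such as -1, -2, -3, -4, a maximum initialized to 0 would return 0, a value that does not occur in the window at all; starting from a real pixel returns -1.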

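The function that writes out Theano's weights follows. Note that it saves each convolution kernel rotated by 180 degrees; if a weight is two-dimensional, its rows and columns are swapped (to match the C++ weight layout).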

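import json
import numpy

def getDataJson(layers):
    """Collect [w, b] of every layer as plain Python lists, in the layout the C++ code expects."""
    data = []
    for layer in layers:
        w, b = layer.params
        # get_value() without borrow=True returns a copy, so w can be modified freely
        w, b = w.get_value(), b.get_value()
        wshape = w.shape
        if len(wshape) == 2:
            # fully connected layer: swap rows and columns
            w = w.transpose()
        else:
            # convolution layer: rotate every kernel by 180 degrees
            for k in range(wshape[0]):
                for j in range(wshape[1]):
                    # copy() so the rotated view does not alias the slice being overwritten
                    w[k][j] = numpy.rot90(w[k][j], 2).copy()
            # flatten to (number of kernels, input maps * kernel height * kernel width)
            w = w.reshape((wshape[0], numpy.prod(wshape[1:])))

        w = w.tolist()
        b = b.tolist()
        data.append([w, b])
    return data

def writefile(data, name = '../../tmp/src/data/theanocnn.json'):
    print ('writefile is ' + name)
    # text mode, because json.dump writes str
    with open(name, "w") as f:
        json.dump(data, f)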

Reading the weights back into Theano

def readfile(layers, nkerns, name = '../../tmp/src/data/theanocnn.json'):
    # Load the weights written by writefile
    print ('readfile is ' + name)
    with open(name, 'r') as f:
        data = json.load(f)
    readwb(data, layers, nkerns)

def readwb(data, layers, nkerns):
    """Undo getDataJson: restore Theano's layout and set the values on each layer."""
    kernSize = len(nkerns)
    inputnum = 1
    for i, layer in enumerate(layers):
        w, b = data[i]
        w = numpy.array(w, dtype='float32')
        b = numpy.array(b, dtype='float32')

        if i >= kernSize:
            # fully connected layer: swap rows and columns back
            w = w.transpose()
        else:
            # convolution layer: 5x5 kernels are hard-coded here
            w = w.reshape((nkerns[i], inputnum, 5, 5))
            for k in range(nkerns[i]):
                for j in range(inputnum):
                    # rotate back by 180 degrees; copy() avoids aliasing the slice being overwritten
                    w[k][j] = numpy.rot90(w[k][j], 2).copy()
            inputnum = nkerns[i]
        layer.W.set_value(w, borrow=True)
        layer.b.set_value(b, borrow=True)

The test samples are generated from the MNIST handwritten digit dataset. The core code is:

import json
try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3

def mnist2json_small(cnnName = 'mnist_small.json', validNumber = 10):
    dataset = '../../data/mnist.pkl'
    print('... loading data ' + dataset)

    # Load the dataset
    # (on Python 3, a pickle written by Python 2 may need pickle.load(f, encoding='latin1'))
    with open(dataset, 'rb') as f:
        train_set, valid_set, test_set = pickle.load(f)

    def np2listSmall(train_set, number):
        trains, labels = train_set
        trainfile = []
        # If the next line is commented out, only `number` validation samples are generated
        number = len(labels)
        for one in trains[:number]:
            trainfile.append(one.tolist())
        labelfile = labels[:number].tolist()
        return [trainfile, labelfile]

    smallData = valid_set
    print(len(smallData))
    valid, validlabel = np2listSmall(smallData, validNumber)
    datafile = [valid, validlabel]
    basedir = '../../tmp/src/data/'
    # basedir = './'
    with open(basedir + cnnName, 'w') as f:
        json.dump(datafile, f)

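What I learned:

Break a hard task into steps and tackle them one at a time.
When one approach is a dead end, look for another right away, just like working on a math proof back in school.
Stay positive, don't give up easily, and do your best to finish the job.
When debugging, build a reasonably complete set of test cases first; that pins down bugs quickly.

That covers my requirement and the difficulties I ran into. For any remaining small problems, I'm sure you will need far less time than I did. The code follows.

If you don't want to set up a project yourself, the C++ code is available as a VS2008 project; generate the weights with Theano as described above and it can load them and run.

C++ code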

main.cpp

#include <iostream>
#include "mlp.h"
#include "util.h"
#include "testinherit.h"
#include "neuralNetwork.h"
using namespace std;

/************************************************************************/
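/* This program implements
1. the forward pass of a convolutional neural network
2. the forward and backward passes of an MLP, so it can be used for training
Note:
To reproduce Theano's test results, the hidden layer's activation function must be tanh;
otherwise, for training the MLP, choose sigmoid as the activation function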

*/
/************************************************************************/
int main()
{

	cout << "****cnn****" << endl;
	TestCnnTheano(28 * 28, 10);
	// TestMlpMnist tests the MLP with training samples
	//TestMlpMnist(28 * 28, 500, 10);
	return 0;
}

neuralNetwork.h

#ifndef NEURALNETWORK_H
#define NEURALNETWORK_H

#include "mlp.h"
#include "cnn.h"
#include <vector>
using std::vector;

/************************************************************************/
/* This is a convolutional neural network                                */
/************************************************************************/
class NeuralNetWork
{
public:
	NeuralNetWork(int iInput, int iOut);
	~NeuralNetWork();
	void Predict(double** in_data, int n);


	double CalErrorRate(const vector<double *> &vecvalid, const vector<WORD> &vecValidlabel);

	void Setwb(vector< vector<double*> > &vvAllw, vector< vector<double> > &vvAllb);
	void SetTrainNum(int iNum);

	int Predict(double *pInputData);
	//    void Forward_propagation(double** ppdata, int n);
	double* Forward_propagation(double *);

private:
	int m_iSampleNum; //number of samples
	int m_iInput; //input dimension
	int m_iOut; //output dimension
	
	vector<CnnLayer *> vecCnns; 
	Mlp *m_pMlp;
};
void TestCnnTheano(const int iInput, const int iOut);

#endif

neuralNetwork.cpp

#include "neuralNetwork.h"

#include <iostream>
#include "util.h"
#include <iomanip>

using namespace std;

NeuralNetWork::NeuralNetWork(int iInput, int iOut):m_iSampleNum(0), m_iInput(iInput), m_iOut(iOut), m_pMlp(NULL)
{
	int iFeatureMapNumber = 20, iPoolWidth = 2, iInputImageWidth = 28, iKernelWidth = 5, iInputImageNumber = 1;

	CnnLayer *pCnnLayer = new CnnLayer(m_iSampleNum, iInputImageNumber, iInputImageWidth, iFeatureMapNumber, iKernelWidth, iPoolWidth);
	vecCnns.push_back(pCnnLayer);

	iInputImageNumber = 20;
	iInputImageWidth = 12;
	iFeatureMapNumber = 50;
	pCnnLayer = new CnnLayer(m_iSampleNum, iInputImageNumber, iInputImageWidth, iFeatureMapNumber, iKernelWidth, iPoolWidth);
	vecCnns.push_back(pCnnLayer);

	const int ihiddenSize = 1;
	int phidden[ihiddenSize] = {500};
	// construct LogisticRegression
	m_pMlp = new Mlp(m_iSampleNum, iFeatureMapNumber * 4 * 4, m_iOut, ihiddenSize, phidden);

}

NeuralNetWork::~NeuralNetWork()
{

	for (vector<CnnLayer*>::iterator it = vecCnns.begin(); it != vecCnns.end(); ++it)
	{
		delete *it;
	}
	delete m_pMlp;
}

void NeuralNetWork::SetTrainNum(int iNum)
{
	m_iSampleNum = iNum;

	for (size_t i = 0; i < vecCnns.size(); ++i)
	{
		vecCnns[i]->SetTrainNum(iNum);
	}
	m_pMlp->SetTrainNum(iNum);
	
}

int NeuralNetWork::Predict(double *pdInputdata)
{
	double *pdPredictData = NULL;
	pdPredictData = Forward_propagation(pdInputdata);

	int iResult = -1;
	
	
	iResult = m_pMlp->m_pLogisticLayer->Predict(pdPredictData);

	return iResult;
}

double* NeuralNetWork::Forward_propagation(double *pdInputData)
{
	double *pdPredictData = pdInputData;

	vector<CnnLayer*>::iterator it;
	CnnLayer *pCnnLayer;
	for (it = vecCnns.begin(); it != vecCnns.end(); ++it)
	{
		pCnnLayer = *it;
		pCnnLayer->Forward_propagation(pdPredictData);
		pdPredictData = pCnnLayer->GetOutputData();
	}
	//pCnnLayer now points to the last convolution layer and pdPredictData is its output;
	//feed that output to the MLP's forward pass
	pdPredictData = m_pMlp->Forward_propagation(pdPredictData);
	return pdPredictData;
}

void NeuralNetWork::Setwb(vector< vector<double*> > &vvAllw, vector< vector<double> > &vvAllb)
{
	for (size_t i = 0; i < vecCnns.size(); ++i)
	{
		vecCnns[i]->Setwb(vvAllw[i], vvAllb[i]);
	}
	
	size_t iLayerNum = vvAllw.size();
	//the hidden-layer index must live outside the loop, otherwise it is reset to 0 on every iteration
	int iHiddenIndex = 0;
	for (size_t i = vecCnns.size(); i < iLayerNum - 1; ++i)
	{
		m_pMlp->m_ppHiddenLayer[iHiddenIndex]->Setwb(vvAllw[i], vvAllb[i]);
		++iHiddenIndex;
	}
	m_pMlp->m_pLogisticLayer->Setwb(vvAllw[iLayerNum - 1], vvAllb[iLayerNum - 1]);
}
double NeuralNetWork::CalErrorRate(const vector<double *> &vecvalid, const vector<WORD> &vecValidlabel)
{
	cout << "Predict------------" << endl;
	int iErrorNumber = 0, iValidNumber = vecValidlabel.size();
	//iValidNumber = 1;
	for (int i = 0; i < iValidNumber; ++i)
	{
		int iResult = Predict(vecvalid[i]);
		//cout << i << ",valid is " << iResult << " label is " << vecValidlabel[i] << endl;
		if (iResult != vecValidlabel[i])
		{
			++iErrorNumber;
		}
	}

	cout << "the num of error is " << iErrorNumber << endl;
	double dErrorRate = (double)iErrorNumber / iValidNumber;
	cout << "the error rate of Train sample by softmax is " << setprecision(10) << dErrorRate * 100 << "%" << endl;

	return dErrorRate;
}

/************************************************************************/
/* 
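The test samples come from MNIST. This CNN has the same structure as the one in the Theano tutorial:
the input is a 28*28 image, followed by 2 convolution layers (convolution + pooling) with 20 and 50 feature maps,
then a fully connected layer (500 neurons), and finally an output layer of 10 neurons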

*/
/************************************************************************/
void TestCnnTheano(const int iInput, const int iOut)
{
	//build the convolutional neural network
	NeuralNetWork neural(iInput, iOut);
	//holds the Theano weights
	vector< vector<double*> > vvAllw;
	vector< vector<double> > vvAllb;
	//holds the test samples and labels
	vector<double*> vecValid;
	vector<WORD> vecLabel;
	//files containing the Theano weights and the test samples
	const char *szTheanoWeigh = "../../data/theanocnn.json", *szTheanoTest = "../../data/mnist_validall.json";
	//second dimension (column width) of each layer's weights, needed to read the json file
	vector<int> vecSecondDimOfWeigh;
	vecSecondDimOfWeigh.push_back(5 * 5);
	vecSecondDimOfWeigh.push_back(20 * 5 * 5);
	vecSecondDimOfWeigh.push_back(50 * 4 * 4);
	vecSecondDimOfWeigh.push_back(500);

	cout << "loadwb ---------" << endl;
	//load the weights
	LoadWeighFromJson(vvAllw, vvAllb, szTheanoWeigh, vecSecondDimOfWeigh);
	//set the weights on the cnn
	neural.Setwb(vvAllw, vvAllb);
	//load the test file
	LoadTestSampleFromJson(vecValid, vecLabel, szTheanoTest, iInput);
	//set the total number of test samples
	int iVaildNum = vecValid.size();
	neural.SetTrainNum(iVaildNum);
	//run the forward pass, compute the cnn's error rate and print it
	neural.CalErrorRate(vecValid, vecLabel);
	
	//free the memory allocated for the test samples
	for (vector<double*>::iterator cit = vecValid.begin(); cit != vecValid.end(); ++cit)
	{
		delete [](*cit);
	}
}
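
The hard-coded sizes above (an input width of 12 for the second convolution layer, and 50 * 4 * 4 inputs to the fully connected layer) follow from simple arithmetic. The sketch below uses a hypothetical helper, `WidthAfterConvPool`, and assumes square maps, a convolution without padding, and non-overlapping pooling.

```cpp
// Hypothetical helper: width of a square map after a convolution without
// padding followed by non-overlapping pooling.
int WidthAfterConvPool(int inputWidth, int kernelWidth, int poolWidth)
{
	const int convWidth = inputWidth - kernelWidth + 1;  // 28 -> 24, 12 -> 8
	return convWidth / poolWidth;                        // 24 -> 12, 8 -> 4
}
```

With 5*5 kernels and 2*2 pooling, a 28*28 image becomes 12*12 after the first layer and 4*4 after the second, so the 50 feature maps of the second layer give the fully connected layer 50 * 4 * 4 = 800 inputs.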

cnn.h

#ifndef CNN_H
#define CNN_H
#include "featuremap.h"
#include "poollayer.h"

#include <vector>

using std::vector;

typedef unsigned short WORD;
/**
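*This layer mimics Theano's test-time (forward) computation.
*If the input consists of num feature maps, this convolution layer has featureNum feature maps.
*Each pixel of this layer combines all num feature maps of the previous layer, without a bias.
*The result goes to the pooling layer, which takes the maximum over each poolsize window and then adds the bias; there is one bias value per feature map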
*/
class CnnLayer
{
public:
	CnnLayer(int iSampleNum, int iInputImageNumber, int iInputImageWidth, int iFeatureMapNumber,
		int iKernelWidth, int iPoolWidth);
	~CnnLayer();
	void Forward_propagation(double *pdInputData);
	void Back_propagation(double* , double* , double );

	void Train(double *x, WORD y, double dLr);
	int Predict(double *);

	void Setwb(vector<double*> &vpdw, vector<double> &vdb);
	void SetInputAllData(double **ppInputAllData, int iInputNum);
	void SetTrainNum(int iSampleNumber);
	void PrintOutputData();

	double* GetOutputData();
private:
	int m_iSampleNum;

	FeatureMap *m_pFeatureMap;
	PoolLayer *m_pPoolLayer;
	//values needed for back-propagation
	double **m_ppdDelta;
	double *m_pdInputData;
	double *m_pdOutputData;
};
void TestCnn();

#endif // CNN_H

cnn.cpp

#include "cnn.h"
#include "util.h"
#include <cassert>

CnnLayer::CnnLayer(int iSampleNum, int iInputImageNumber, int iInputImageWidth, int iFeatureMapNumber,
				   int iKernelWidth, int iPoolWidth):
    m_iSampleNum(iSampleNum), m_pdInputData(NULL), m_pdOutputData(NULL)
{
	m_pFeatureMap = new FeatureMap(iInputImageNumber, iInputImageWidth, iFeatureMapNumber, iKernelWidth);
	int iFeatureMapWidth =  iInputImageWidth - iKernelWidth + 1;
	m_pPoolLayer = new PoolLayer(iFeatureMapNumber, iPoolWidth, iFeatureMapWidth);
   
}

CnnLayer::~CnnLayer()
{
    delete m_pFeatureMap;
	delete m_pPoolLayer;
}

void CnnLayer::Forward_propagation(double *pdInputData)
{
    m_pFeatureMap->Convolute(pdInputData);
	m_pPoolLayer->Convolute(m_pFeatureMap->GetFeatureMapValue());
	m_pdOutputData = m_pPoolLayer->GetOutputData();
	/************************************************************************/
	/* Debug output for each stage of the convolution; remove once it works  */
	/************************************************************************/
	/*m_pFeatureMap->PrintOutputData();
	m_pPoolLayer->PrintOutputData();*/
}

void CnnLayer::SetInputAllData(double **ppInputAllData, int iInputNum)
{
    
}

double* CnnLayer::GetOutputData()
{
	assert(NULL != m_pdOutputData);

	return m_pdOutputData;
}

void CnnLayer::Setwb(vector<double*> &vpdw, vector<double> &vdb)
{
	m_pFeatureMap->SetWeigh(vpdw);
	m_pPoolLayer->SetBias(vdb);
	
}

void CnnLayer::SetTrainNum( int iSampleNumber )
{
	m_iSampleNum = iSampleNumber;
}

void CnnLayer::PrintOutputData()
{
	m_pFeatureMap->PrintOutputData();
	m_pPoolLayer->PrintOutputData();

}
void TestCnn()
{
	const int iFeatureMapNumber = 2, iPoolWidth = 2, iInputImageWidth = 8, iKernelWidth = 3, iInputImageNumber = 2;

	double *pdImage = new double[iInputImageWidth * iInputImageWidth * iInputImageNumber];
	double arrInput[iInputImageNumber][iInputImageWidth * iInputImageWidth];

	MakeCnnSample(arrInput, pdImage, iInputImageWidth, iInputImageNumber);

	double *pdKernel = new double[3 * 3 * iInputImageNumber];
	double arrKernel[3 * 3 * iInputImageNumber];
	MakeCnnWeigh(pdKernel, iInputImageNumber) ;
	

	CnnLayer cnn(3, iInputImageNumber, iInputImageWidth, iFeatureMapNumber, iKernelWidth, iPoolWidth);

	vector <double*> vecWeigh;
	vector <double> vecBias;
	for (int i = 0; i < iFeatureMapNumber; ++i)
	{
		vecBias.push_back(1.0);
	}
	vecWeigh.push_back(pdKernel);
	for (int i = 0; i < 3 * 3 * 2; ++i)
	{
		arrKernel[i] = i;
	}
	vecWeigh.push_back(arrKernel);
	cnn.Setwb(vecWeigh, vecBias);
	
	cnn.Forward_propagation(pdImage);
	cnn.PrintOutputData();
	


	delete []pdKernel;
	delete []pdImage;

}

featuremap.h

#ifndef FEATUREMAP_H
#define FEATUREMAP_H

#include <cassert>
#include <vector>
using std::vector;

class FeatureMap
{
public:
    FeatureMap(int iInputImageNumber, int iInputImageWidth, int iFeatureMapNumber, int iKer
