[Deep Learning Study Notes] Annotating yusugomori's SDA Code -- Preparation

xceman1997

Category: DL

Article link: https://blog.csdn.net/xceman1997/article/details/9749731

1. Basic principles of SDA

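This post continues from my previous few posts, in which I have been annotating yusugomori's deep learning code. Having reached the SDA code, I will briefly go over the basic principles along the way.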

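SDA stands for stacked denoising autoencoders. Compared with a standard DA (denoising autoencoder), there are two main differences: 1. there are more hidden layers, which is what "stacked" means; 2. the topmost layer is LR (logistic regression), which uses softmax to compute the probability of each class. Each pair of adjacent hidden layers is still trained with the standard DA training algorithm, and the final layer is trained with the LR algorithm. In fact, the hidden layers can also be replaced with RBMs. An RBM works on a different principle from a DA, but it serves the same purpose: both produce a representation of the input layer.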
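To make the layer structure concrete, here is a minimal sketch of a forward pass through such a stack: several sigmoid hidden layers followed by a softmax layer on top. This is not yusugomori's code. The `Layer` struct and the functions `affine`, `softmax`, and `stack_forward` are hypothetical names introduced only for illustration, and the sketch passes activation probabilities between layers rather than sampled 0-1 states.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for one layer: W is n_out x n_in, b has n_out entries.
struct Layer {
    std::vector<std::vector<double>> W;
    std::vector<double> b;
};

// Linear part of a layer: z[i] = b[i] + sum_j W[i][j] * x[j].
std::vector<double> affine(const Layer& l, const std::vector<double>& x) {
    std::vector<double> z(l.b.size());
    for (std::size_t i = 0; i < l.b.size(); ++i) {
        double a = l.b[i];
        for (std::size_t j = 0; j < x.size(); ++j) a += l.W[i][j] * x[j];
        z[i] = a;
    }
    return z;
}

// Softmax over the top-layer activations; subtracting the maximum
// keeps exp() from overflowing without changing the result.
std::vector<double> softmax(std::vector<double> z) {
    const double m = *std::max_element(z.begin(), z.end());
    double sum = 0.0;
    for (double& v : z) { v = std::exp(v - m); sum += v; }
    for (double& v : z) v /= sum;
    return z;
}

// Forward pass: every hidden layer applies a sigmoid, the top layer a softmax.
std::vector<double> stack_forward(const std::vector<Layer>& hidden,
                                  const Layer& top,
                                  std::vector<double> x) {
    for (const Layer& l : hidden) {
        x = affine(l, x);
        for (double& v : x) v = 1.0 / (1.0 + std::exp(-v));
    }
    return softmax(affine(top, x));
}
```

The returned vector holds one probability per class and sums to 1. Training (DA pretraining per layer, then LR on top) is not shown here.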

2. hidden layer

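In the code, both the hidden layer and the DA are components of the SDA, and the two share the same network structure. So why declare two things (a hidden layer and a DA)? Because the SDA needs a sampling step (after computing the probability, run a Bernoulli trial with that probability to get a 0-1 output), which the DA code does not have (the RBM code does). In fact, if a separate sample function were written, there would be no need for a HiddenLayer class at all. However, yusugomori's source code also contains a DNN, so perhaps writing it this way makes the DNN easier to implement.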
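The sampling step described above could be written as a standalone function, roughly as in the sketch below. This is only an illustration under my own assumptions: `sample_bernoulli` and `sample_states` are hypothetical names, and the sketch uses the standard `<random>` facilities rather than whatever random number routine the original code relies on.

```cpp
#include <random>

// Hypothetical helper: one Bernoulli trial with success probability p.
// Returns 1 with probability p and 0 otherwise.
int sample_bernoulli(double p, std::mt19937& rng) {
    if (p <= 0.0) return 0;   // guard against out-of-range probabilities
    if (p >= 1.0) return 1;
    std::bernoulli_distribution coin(p);
    return coin(rng) ? 1 : 0;
}

// Turn n activation probabilities into n sampled 0-1 states.
void sample_states(const double* prob, int n, int* state, std::mt19937& rng) {
    for (int i = 0; i < n; ++i) {
        state[i] = sample_bernoulli(prob[i], rng);
    }
}
```

A caller would create one generator, for example `std::mt19937 rng(1234);`, and reuse it across calls so that successive samples are not correlated by reseeding.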

3. HiddenLayer.h

The header file is as follows:

class HiddenLayer 
{
public:
  	int N;			// the number of training samples
  	int n_in;		// the node number of input layer
  	int n_out;		// the node number of output layer
  	double **W;		// the network weights
  	double *b;		// the bias 
  	
  	// allocate memory and initialize the parameters
  	HiddenLayer (
	  	int, 		// N
	  	int, 		// n_in
	  	int, 		// n_out
	  	double**, 	// W
	  	double*		// b
		  );
  	~HiddenLayer();
  	// calculate the value of a certain node in hidden layer
  	double output (
	  	int*, 		// input value vector
	  	double*, 	// the network weight of the node in hidden layer
	  	double		// the bias of the node in hidden layer
		  );
	// sample the 0-1 state of hidden layer given the input
  	void sample_h_given_v (
	  	int*, 		// input value vector
	  	int*		// the output 0-1 state of the hidden layer
		  );
};

4. HiddenLayer is implemented in Sda.cpp; the relevant code is as follows:

// HiddenLayer
HiddenLayer::HiddenLayer (
		int size, 			// N
		int in, 			// n_in
		int out, 			// n_out
		double **w, 		// W
		double *bp			// b
			) 
{
  	N = size;
  	n_in = in;
  	n_out = out;

  	if(w == NULL) 
  	{
  		// allocate memory for W
    	W = new double*[n_out];
    	for(int i=0; i<n_out; i++) 
			W[i] = new double[n_in];
		// the initial value
    	double a = 1.0 / n_in;
    	for(int i=0; i<n_out; i++) 
		{
      		for(int j=0; j<n_in; j++) 
 			{
        		W[i][j] = uniform(-a, a);
      		}
    	}
  	} 
  	else 
	{
    	W = w;
  	}

  	if(bp == NULL) 
  	{
    	b = new double[n_out];
    	memset (b, 0, sizeof(double) * n_out);	// I added this to zero-initialize all n_out biases
  	} 
  	else 
  	{
    	b = bp;
  	}
}

HiddenLayer::~HiddenLayer() 
{
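	// free W and b; note that this also frees arrays that were passed in by the caller
  	for(int i=0; i<n_out; i++) 
	  	delete[] W[i];	// each row was allocated with new[], so it needs delete[]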
  	delete[] W;
  	delete[] b;
}

double HiddenLayer::output (
		int *input, 
		double *w, 
		double b
			) 
{
	// iterate over all the input nodes and calculate the output of the hidden node
  	double linear_output = 0.0;
  	for(int j=0; j<n_in; j++) 
  	{
    	linear_output += w[j] * input[j];
  	}
  	linear_output += b;
  	return sigmoid(linear_output);
}

void HiddenLayer::sample_h_given_v (
		int *input, 
		int *sample
			) 
{
  	for(int i=0; i<n_out; i++) 
  	{
  		// draw the 0-1 state: one Bernoulli trial with the node's output probability
    	sample[i] = binomial(1, output(input, W[i], b[i]));
  	}
}
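The listing above calls three helpers, `uniform`, `sigmoid`, and `binomial`, whose definitions are not shown in this post. The following is a minimal sketch of what they might look like, inferred only from how they are called above; the bodies are my assumption and may differ from the actual definitions in yusugomori's source.

```cpp
#include <cmath>
#include <cstdlib>

// Assumed behavior: a uniform random number in [min, max).
double uniform(double min, double max) {
    return std::rand() / (RAND_MAX + 1.0) * (max - min) + min;
}

// Logistic sigmoid: maps any real value into (0, 1).
double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

// Assumed behavior: number of successes in n independent trials,
// each succeeding with probability p. With n = 1 this is a single
// Bernoulli trial, which is how sample_h_given_v uses it.
int binomial(int n, double p) {
    if (p < 0.0 || p > 1.0) return 0;   // reject invalid probabilities
    int count = 0;
    for (int i = 0; i < n; ++i) {
        double r = std::rand() / (RAND_MAX + 1.0);   // r is in [0, 1)
        if (r < p) ++count;
    }
    return count;
}
```

With these definitions, `binomial(1, output(input, W[i], b[i]))` returns 1 with probability equal to the sigmoid output of hidden node i and 0 otherwise.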

5. In addition, the SDA also uses LR, DA, and so on.

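The annotated header files for these can be found in my earlier posts. Their implementation code has all been placed in Sda.cpp; compared with the earlier dA.cpp and LR.cpp there is no difference, it is simply copy & paste. I will not annotate that code in detail here either; if you need it, please refer to my earlier posts in the "[Deep Learning Study Notes] Annotating yusugomori's xxx Code" series.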
