Chapter 5 Implementation of Neural Network Models for Extracting Reliable Patterns from Data
Full notes: http://note.youdao.com/noteshare?id=0627e2a45e34c2c4ae29bd04fa1af0d1
5.1 Introduction and Overview
This chapter covers three topics: the generalization ability of networks, minimizing model complexity, and the robustness of models.
5.2 Bias-Variance Tradeoff
5.3 Improving Generalization of Neural Networks
5.3.1 Illustration of Early Stopping
5.3.1.1 Effect of Initial Random Weights
5.3.1.2 Weights structure of the Trained Networks
5.3.1.3 Effect of Random Sampling
5.3.1.4 Effect of Model Complexity: Number of Hidden Neurons
5.3.1.5 Summary of Early Stopping
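The early-stopping idea illustrated in this section can be sketched in a few lines. This is a minimal illustration, not the chapter's exact experiment: the data, the linear model, and the `patience` threshold are all assumptions made here for the example.

```python
import numpy as np

# Illustrative early stopping: train by gradient descent on a training set,
# monitor error on a held-out validation set, and stop once validation
# error has not improved for `patience` consecutive epochs.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
X_tr, y_tr, X_va, y_va = X[:70], y[:70], X[70:], y[70:]

w = np.zeros(3)
best_w, best_err, patience, wait = w.copy(), np.inf, 5, 0
for epoch in range(1000):
    grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)
    w -= 0.05 * grad
    val_err = np.mean((X_va @ w - y_va) ** 2)
    if val_err < best_err:
        best_err, best_w, wait = val_err, w.copy(), 0
    else:
        wait += 1
        if wait >= patience:  # validation error stopped improving
            break
print(best_err)
```

The weights returned are `best_w`, the ones at the point of lowest validation error, not the final weights when the loop ends.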
5.3.2 Regularization
Like the previous methods, regularization improves generalization by keeping the weights small. Its advantage is that training takes less time, and once the optimum weights are reached they do not continue to grow.
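Weight-decay (L2) regularization adds a penalty term proportional to the squared weight norm to the loss, so the gradient pulls weights toward zero and they stop growing once a balance is reached. A minimal sketch, in which the data and the penalty strength `lam` are illustrative assumptions:

```python
import numpy as np

# L2 regularization (weight decay): the penalty lam * ||w||^2 adds the
# term 2 * lam * w to the gradient, shrinking weights toward zero.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=50)

def fit(lam):
    w = np.zeros(5)
    for _ in range(2000):
        grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * lam * w  # loss + penalty
        w -= 0.05 * grad
    return w

w_plain = fit(0.0)   # unregularized fit
w_reg = fit(0.5)     # regularized fit: smaller weight norm
print(np.linalg.norm(w_plain), np.linalg.norm(w_reg))
```

Comparing the two norms shows the regularized weights are kept smaller, which is the mechanism behind the improved generalization noted above.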
5.4 Reducing Structural Complexity of Networks by Pruning
5.4.1 Optimal Brain Damage
Correction: correlation reduces redundancy.
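Optimal Brain Damage ranks weights by a saliency computed from a diagonal Hessian approximation, s_k = H_kk * w_k^2 / 2, and prunes the least salient weights first. A small sketch of that ranking step; the weight values and Hessian diagonal below are made up for illustration:

```python
import numpy as np

# Optimal Brain Damage saliency: with a diagonal Hessian approximation,
# the saliency of weight w_k is s_k = H_kk * w_k**2 / 2. Weights with
# the smallest saliency are pruned first.
w = np.array([0.8, -0.05, 1.2, 0.01])    # trained weights (illustrative)
h_diag = np.array([1.0, 2.0, 0.5, 3.0])  # Hessian diagonal estimate (illustrative)

saliency = 0.5 * h_diag * w**2
prune_order = np.argsort(saliency)       # least salient weight first
print(prune_order)  # → [3 1 0 2]
```

Note that saliency depends on both the weight magnitude and the local curvature: a small weight sitting in a flat direction (small H_kk) is the safest to remove.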
5.4.1.1 Example of Network Pruning with Optimal Brain Damage
5.4.2 Network Pruning Based on Variance of Network Sensitivity
Essentially, we want to test whether the sensitivity of each parameter is statistically zero or not.
5.4.2.1 Illustration of Application of Variance Nullity in Pruning Weights
If all the weights linking a neuron to the preceding layer are eliminated, we would expect to eliminate the neuron itself as well, but in practice we do not.
5.4.2.2 Pruning Hidden Neurons Based on Variance Nullity of Sensitivity
We can calculate the sensitivity of the output to the hidden layer's outputs, then apply the same pruning method.
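A rough sketch of that idea: compute per-pattern sensitivities of the output with respect to each hidden neuron, and flag neurons whose sensitivities are all near zero as pruning candidates. This is a simplification of the variance-nullity test (the chapter's actual statistic involves a hypothesis test on the sensitivity variance); the network, data, and threshold below are assumptions for illustration.

```python
import numpy as np

# One-hidden-layer tanh network with a linear output. The third hidden
# unit receives zero input weights, so it is useless by construction.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
W1 = np.array([[1.0, -1.0, 0.0],
               [0.5,  1.0, 0.0]])   # input-to-hidden weights
W2 = np.array([1.0, -0.5, 0.02])    # hidden-to-output weights

net = X @ W1                              # hidden net inputs per pattern
sens = W2 * (1 - np.tanh(net) ** 2)       # d(output)/d(net_j) at each pattern
score = (sens ** 2).mean(axis=0)          # near-zero score -> null sensitivity
prune = np.where(score < 1e-3)[0]         # hidden units to prune (illustrative cutoff)
print(prune)  # → [2]
```

Only the deliberately useless third unit falls below the threshold; the two informative units have sensitivities well above it across the patterns.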
These pruning methods make the process more principled and faster, but they still require iteration and trial and error; the procedure is not complete and deterministic in the way decision-tree pruning is.
5.5 Robustness of a Network to Perturbation of Weights
5.5.1 Confidence Intervals for Weights
We generate a set of weights by adding random noise to the optimal weights, then calculate the mean and standard deviation of these weights, and finally compute the upper and lower confidence limits from the formula.
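The recipe above can be sketched directly. The noise scale, sample count, and the 1.96 factor (a 95% normal-theory interval) are assumptions made here, not the chapter's exact formula:

```python
import numpy as np

# Confidence intervals for weights: perturb the optimal weights with
# random noise, then compute mean, standard deviation, and the upper
# and lower limits of an interval around the mean.
rng = np.random.default_rng(3)
w_opt = np.array([0.9, -1.4, 0.3])                        # optimal weights (illustrative)
samples = w_opt + rng.normal(scale=0.05, size=(500, 3))   # noisy weight sets

mean = samples.mean(axis=0)
std = samples.std(axis=0, ddof=1)
upper = mean + 1.96 * std   # upper confidence limit per weight
lower = mean - 1.96 * std   # lower confidence limit per weight
print(lower, upper)
```

A narrow interval indicates a weight whose value is stable under perturbation; wide intervals flag weights (and hence a network) that are sensitive to noise.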
Noise affects the weights and hence the network's output. We want a network whose response is robust under different noise conditions, and we want to select optimal weights that overcome the effects of noise. A related method based on Bayesian statistics is introduced in Chapter 7.
5.6 Summary