
This is a graduation thesis from Delft University of Technology, the Netherlands (author: Aarnoud Hoekstra), 136 pages in total.





The use of neural networks, or neural classifiers as they are also called, has become common practice in pattern recognition. Neural networks are considered to be very powerful classifiers compared to classical algorithms such as the nearest neighbour method. The algorithms used in neural network applications are capable of finding a good classifier based on a limited, and in general small, number of training examples. This capability, also referred to as generalisation, is of interest from a pattern recognition point of view since a large set of parameters is estimated using a relatively small data set. In this thesis the generalisation behaviour of neural networks is studied. In particular, the questions of how this behaviour can be detected and which factors influence it are answered. To be able to answer these questions, a proper understanding of the concept of generalisation is needed, such that the results obtained from the introduced techniques can be compared. Therefore, an operational definition of generalisation is introduced, namely the expected number of errors made by a classifier on a set of test samples. The set of classifiers studied in this thesis was restricted to backpropagation-trained neural classifiers and classical algorithms such as the k nearest neighbour method and linear and quadratic classifiers. The first objective is to gain more insight into the behaviour of a neural network. Subsequently, a measure can be applied which holds for both neural and classical classifiers. By using a nonlinearity measure, insight is obtained into the generalisation behaviour of a neural network. This measure shows to what extent the network has adapted to the data set. The better it adapts to the data, the smaller its generalisation error. Too much adaptation to the data, implying a high nonlinearity, however, results in a non-generalising classifier. By monitoring the value of the nonlinearity measure during training this might be avoided.
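The abstract does not spell out how the nonlinearity measure is computed. As a rough illustration only, the sketch below assumes one plausible definition: the fraction of points, linearly interpolated between two samples that the classifier assigns to the same class, that receive a different class. The function name `nonlinearity` and the parameters `n_pairs` and `seed` are hypothetical and not taken from the thesis.

```python
import numpy as np

def nonlinearity(classify, X, n_pairs=2000, seed=0):
    """Monte Carlo estimate of an ASSUMED nonlinearity score in [0, 1].

    classify: function mapping an (n, d) array to an (n,) array of labels.
    X:        (n, d) array of samples.
    """
    rng = np.random.default_rng(seed)
    labels = classify(X)
    disagreements = 0
    used = 0
    for _ in range(n_pairs):
        i, j = rng.integers(0, len(X), size=2)
        # Only consider pairs that the classifier puts in the same class.
        if i == j or labels[i] != labels[j]:
            continue
        # Random point on the line segment between the two samples.
        alpha = rng.uniform()
        point = alpha * X[i] + (1.0 - alpha) * X[j]
        if classify(point[None, :])[0] != labels[i]:
            disagreements += 1
        used += 1
    return disagreements / used if used else 0.0


# Two toy classifiers (hypothetical, for illustration only).
def linear_rule(X):
    return (X @ np.array([1.0, -1.0]) > 0).astype(int)

def ring_rule(X):
    return (np.linalg.norm(X, axis=1) > 1.0).astype(int)
```

Under this assumed definition a linear decision rule scores zero, because each of its decision regions is convex, while a rule with a non-convex region, such as `ring_rule`, scores above zero. Since the function only needs a `classify` callable, the same score applies to neural and non-neural classifiers alike.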
The definition of the nonlinearity measure is such that it also applies to non-neural classifiers. This enables us to compare classical classifiers and neural networks. It shows that neural networks start from a linear solution and gradually adapt in a nonlinear fashion. This adaptation is even stronger, and hence results in a larger nonlinearity, for a neural classifier than for a classical one. It might therefore indicate that a neural network has a larger effective capacity than the classical classifiers. Another factor which influences the generalisation behaviour is the architecture of a network. In this thesis the architecture is determined by the number of hidden units. If the number of hidden units is larger than needed, but not too large, the network will learn faster. This is due to the redundancy present in the network. Redundancy causes the neurons to cluster. At the start of a learning cycle, neurons fulfil approximately the same functions. As the training progresses, the neurons start to specialise. This specialisation, known as symmetry breaking, can be visualised using a projection technique which maps the high dimensional space in which the hidden units move onto a two dimensional plane. In this two dimensional plane the trajectories of the hidden units during training can be depicted in order to understand the training behaviour. Finally, the reliability of a neural network classification was studied. After the training cycle one is interested in the reliability of the classifications made by a neural network. This is assessed by estimating the a posteriori probabilities of the classifications. These probabilities can be used to improve the network reliability by creating better networks, or to reject samples. We can estimate the probabilities by using confidence value estimators. These estimators can be determined using three techniques: the network outputs, the nearest neighbour method and the logistic estimator.
Using these estimators, the reliability of a network classification is checked. A problem, however, is that these estimators need an independent test set. Such a set can only be used once in order to prevent obtaining biased classifiers. This problem was circumvented by the introduction of the k nearest neighbour data generation method. Using this method a new set is generated from a learning set. This new set, referred to as a validation set, can act as a substitute for a test set. After applying the techniques presented in this thesis, it can be concluded that we have gained more insight into the generalisation behaviour of neural classifiers. In particular the nonlinearity measure is of interest since it enables the comparison of neural and non-neural classifiers in an objective manner.
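The abstract names the k nearest neighbour data generation method without describing it. The sketch below shows one plausible reading, stated here as an assumption: each learning sample is perturbed with Gaussian noise whose covariance is estimated from its k nearest neighbours of the same class. The function name `knn_generate` and the parameters `k`, `per_sample` and `seed` are hypothetical.

```python
import numpy as np

def knn_generate(X, y, k=5, per_sample=3, seed=0):
    """Generate a validation set from a learning set (ASSUMED procedure).

    X: (n, d) array of learning samples, y: (n,) array of class labels.
    Returns (X_new, y_new) with per_sample new points per learning sample.
    """
    rng = np.random.default_rng(seed)
    X_new, y_new = [], []
    for cls in np.unique(y):
        Xc = X[y == cls]
        # Pairwise squared distances within the class.
        d2 = ((Xc[:, None, :] - Xc[None, :, :]) ** 2).sum(axis=-1)
        kk = min(k, len(Xc) - 1)
        for i in range(len(Xc)):
            # Nearest neighbours of sample i, skipping the sample itself.
            nbrs = Xc[np.argsort(d2[i])[1:kk + 1]]
            # Local spread of the neighbours around sample i.
            diffs = nbrs - Xc[i]
            cov = diffs.T @ diffs / max(kk, 1)
            noise = rng.multivariate_normal(
                np.zeros(X.shape[1]), cov, size=per_sample
            )
            X_new.append(Xc[i] + noise)
            y_new.extend([cls] * per_sample)
    return np.vstack(X_new), np.array(y_new)
```

The generated points follow the local shape of each class, so the new set resembles the learning set without reusing its samples exactly. Whether the thesis uses this particular noise model is not stated in the abstract.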

  1. Introduction
  2. Generalisation
  3. Classifier nonlinearity
  4. Classifier redundancy
  5. Classification reliability
  6. Digit recognition
  7. Conclusions






