( A, B )---6*30*2---( 1, 0 )( 0, 1 )
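The network has only 6 input nodes. Training sets A and B each consist of 6 binarized images (one 6-pixel row per image); the 6 images of A contain 3 points in total and those of B contain 1 point. The test set is the binary representations of 0 to 63. The convergence error is 7e-4, and the classification accuracy is compared.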
A | B
0 0 0 0 0 0 | 0 0 0 0 0 0
0 0 0 0 1 0 | 0 0 0 1 0 0
0 0 0 1 0 0 | 0 0 0 0 0 0
0 0 0 0 1 0 | 0 0 0 0 0 0
0 0 0 0 0 0 | 0 0 0 0 0 0
0 0 0 0 0 0 | 0 0 0 0 0 0
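The data described above can be written down directly. The following is a minimal sketch, assuming each row of the table is one 6-pixel image and that the binary test patterns are written most significant bit first (column 0 on the left); the names `A`, `B` and `make_test_set` are chosen here for illustration only.

```python
import numpy as np

# Training images copied from the table: one 6-pixel row per image.
A = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])
B = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])

def make_test_set():
    """Binary expansion of 0..63, most significant bit in column 0 (assumed)."""
    return np.array([[(v >> (5 - k)) & 1 for k in range(6)] for v in range(64)])

test_set = make_test_set()
print(A.sum(), B.sum(), test_set.shape)
```

With this layout A contains 3 points, B contains 1 point, and the test set has 64 rows of 6 bits.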
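The key columns of the training set are columns 3 and 4 (counting columns from 0), so it is reasonable to assume that the training set splits the binary test set into 4 parts according to these two columns. In columns 3 and 4, A has more 1s and B has more 0s, so 11 is classified as A and 00 as B. Only A contains 01 in columns 3 and 4, so 01 is classified as A.
The results obtained are: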
A | B
0 0 0 0 1 0 *** | 0 0 0 0 0 0 *
0 0 0 0 1 1 *** | 0 0 0 0 0 1 *
0 0 0 1 1 0 *** | 0 0 1 0 0 0 *
0 0 0 1 1 1 *** | 0 0 1 0 0 1 *
0 0 1 0 1 0 *** | 0 1 0 0 0 0 *
0 0 1 0 1 1 *** | 0 1 0 0 0 1 *
0 0 1 1 1 0 *** | 0 1 1 0 0 0 *
0 0 1 1 1 1 *** | 0 1 1 0 0 1 *
0 1 0 0 1 0 *** | 1 0 0 0 0 0 *
0 1 0 0 1 1 *** | 1 0 0 0 0 1 *
0 1 0 1 1 0 *** | 1 0 1 0 0 0 *
0 1 0 1 1 1 *** | 1 0 1 0 0 1 *
0 1 1 0 1 0 *** | 1 1 0 0 0 0 *
0 1 1 0 1 1 *** | 1 1 0 0 0 1 *
0 1 1 1 1 0 *** | 1 1 1 0 0 0 *
0 1 1 1 1 1 *** | 1 1 1 0 0 1 *
1 0 0 0 1 0 *** |
1 0 0 0 1 1 *** |
1 0 0 1 1 0 *** |
1 0 0 1 1 1 *** |
1 0 1 0 1 0 *** |
1 0 1 0 1 1 *** |
1 0 1 1 1 0 *** |
1 0 1 1 1 1 *** |
1 1 0 0 1 0 *** |
1 1 0 0 1 1 *** |
1 1 0 1 1 0 *** |
1 1 0 1 1 1 *** |
1 1 1 0 1 0 *** |
1 1 1 0 1 1 *** |
1 1 1 1 1 0 *** |
1 1 1 1 1 1 *** |
The experimental results agree with this hypothesis.
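The hypothesis can be checked by counting. The sketch below groups the 64 test patterns by their bits in columns 3 and 4 and applies the hypothesised rule; it does not involve the network at all. The dictionary `RULE` and the helper `key_bits` are names introduced here for illustration, and the `"?"` entry simply marks the 10 group, which the rule leaves undecided.

```python
from collections import Counter

def key_bits(v):
    """Columns 3 and 4 (counted from 0) of the 6-bit pattern for v."""
    return format(v, "06b")[3:5]

# Hypothesised rule: 11 and 01 go to A, 00 goes to B, 10 is undecided.
RULE = {"11": "A", "01": "A", "00": "B", "10": "?"}

groups = Counter(key_bits(v) for v in range(64))
predicted = Counter(RULE[key_bits(v)] for v in range(64))

print(dict(groups))
print(dict(predicted))
```

Each of the four groups holds 16 patterns, so the rule assigns 32 patterns to A and 16 to B, matching the sizes of the two columns in the table, and leaves 16 patterns open.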
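However, columns 3 and 4 of both training sets A and B contain the pattern 10. Should test patterns with 10 in these columns be classified as A or as B? The marks *** and * below are the same ones attached to the A and B groups above.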
A/B
0 0 0 1 0 0 ***
0 0 0 1 0 1 *
0 0 1 1 0 0 ***
0 0 1 1 0 1 ***
0 1 0 1 0 0 ***
0 1 0 1 0 1 ***
0 1 1 1 0 0 ***
0 1 1 1 0 1 ***
1 0 0 1 0 0 *
1 0 0 1 0 1 *
1 0 1 1 0 0 ***
1 0 1 1 0 1 *
1 1 0 1 0 0 *
1 1 0 1 0 1 *
1 1 1 1 0 0 ***
1 1 1 1 0 1 *
( A, B )---6*n*2---( 1, 0 )( 0, 1 )
- | - | - | - | - | - |
- | - | - | 2 | 1 | - |
- | - | - | 1 | - | - |
- | - | - | - | 1 | - |
- | - | - | - | - | - |
- | - | - | - | - | - |
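Build a new network and let the number of hidden nodes n be 2, 5, 6, 7, 8, 10, 30, 50, 70, 90, 110, 130 and 150. The convergence error is 7e-4, and each configuration is trained to convergence 199 times. The test set is the 16 images whose columns 3 and 4 are 10. Observing how the average classification accuracy changes gives the following data: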
n | d | r
2 | 376148.2 | 0.567211
5 | 238242.8 | 0.5380025
6 | 223491 | 0.55936
7 | 215009 | 0.53518
8 | 209407 | 0.6149
10 | 201725 | 0.6065
30 | 171966.1 | 0.6875
50 | 150114 | 0.8068
70 | 132766 | 0.8847
90 | 118440.5 | 0.921168
110 | 106999.64 | 0.9651382
130 | 97992.106 | 0.9842965
150 | 91503.724 | 0.9918342
n: number of hidden nodes; d: number of iterations; r: classification accuracy for A.
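Here the classification accuracy for A means the fraction of the 16 images that are classified as A. It rises from about 0.57 at n = 2 to about 0.99 at n = 150, although not monotonically for small n, so the probability of being classified as A clearly grows with n.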
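To make the quantity r concrete, here is a minimal sketch of a 6*n*2 network and of the fraction of the 16 "10" patterns it assigns to A. Almost everything about the training is an assumption made for illustration: sigmoid units, squared error, per-sample gradient descent, the learning rate `lr`, the weight initialisation, and a fixed step budget `steps` in place of the 7e-4 stopping rule, whose exact definition is not given in the text. In each step one image of A is presented before one image of B. The function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def image_rows(points):
    """6x6 binary grid; each row is one 6-pixel training image."""
    img = np.zeros((6, 6))
    for row, col in points:
        img[row, col] = 1.0
    return img

A = image_rows([(1, 4), (2, 3), (3, 4)])  # the three points of A
B = image_rows([(1, 3)])                  # the single point of B

# The 16 test patterns whose columns 3 and 4 read "10".
tens = np.array([[float(b) for b in format(v, "06b")]
                 for v in range(64) if format(v, "06b")[3:5] == "10"])

def train(A, B, n_hidden, steps=2000, lr=0.5, seed=0):
    """Train a 6*n*2 net; A -> (1, 0), B -> (0, 1), A presented first (assumed)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (6, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 2)); b2 = np.zeros(2)
    targets = (np.array([1.0, 0.0]), np.array([0.0, 1.0]))
    for step in range(steps):
        i = step % 6
        for x, t in ((A[i], targets[0]), (B[i], targets[1])):
            h = sigmoid(x @ W1 + b1)
            y = sigmoid(h @ W2 + b2)
            dy = (y - t) * y * (1.0 - y)      # output-layer delta
            dh = (dy @ W2.T) * h * (1.0 - h)  # hidden-layer delta
            W2 -= lr * np.outer(h, dy); b2 -= lr * dy
            W1 -= lr * np.outer(x, dh); b1 -= lr * dh
    return W1, b1, W2, b2

def fraction_A(params, patterns):
    """Fraction of patterns whose first output exceeds the second."""
    W1, b1, W2, b2 = params
    y = sigmoid(sigmoid(patterns @ W1 + b1) @ W2 + b2)
    return float(np.mean(y[:, 0] > y[:, 1]))

def mean_fraction_A(n_hidden, runs=5, **kw):
    """Average of fraction_A over several random initialisations."""
    return float(np.mean([fraction_A(train(A, B, n_hidden, seed=s, **kw), tens)
                          for s in range(runs)]))

print(mean_fraction_A(30, runs=3, steps=500))
```

Calling `mean_fraction_A(n, runs=199)` for each n in the list would mirror the protocol described above, but nothing in this sketch is tuned to reproduce the figures in the table.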
Now swap the order in which A and B are fed to the network:
- | - | - | - | - | - |
- | - | - | 1 | 2 | - |
- | - | - | 2 | - | - |
- | - | - | - | 2 | - |
- | - | - | - | - | - |
- | - | - | - | - | - |
Running the same experiment again gives the following data:
n | d | r
2 | 376108.96 | 0.5599874
5 | 236600 | 0.51853
6 | 224544 | 0.56564
7 | 215233 | 0.5999
8 | 209272 | 0.6008
10 | 201754.5 | 0.624058
30 | 171505 | 0.6878
50 | 150543 | 0.7644
70 | 132729.1 | 0.858354
90 | 118424.1 | 0.9224246
110 | 106899.22 | 0.9425251
130 | 98129.206 | 0.9795854
150 | 91444 | 0.9903
n: number of hidden nodes; d: number of iterations; r: classification accuracy for A.
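The two accuracy curves almost coincide. So when the key columns of a test pattern resemble the key columns of both training sets, and the similarity is therefore contradictory, whether the pattern is classified as A or B depends on the feeding order and on the number of hidden nodes. The more hidden nodes there are, the more likely the pattern is assigned to the class that is fed first.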
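How close the two curves are can be quantified from the two tables. The snippet below only re-uses the reported r values; the variable names are chosen here for illustration.

```python
n = [2, 5, 6, 7, 8, 10, 30, 50, 70, 90, 110, 130, 150]

# r values from the first run and from the run with the feeding order swapped.
r_first = [0.567211, 0.5380025, 0.55936, 0.53518, 0.6149, 0.6065, 0.6875,
           0.8068, 0.8847, 0.921168, 0.9651382, 0.9842965, 0.9918342]
r_swapped = [0.5599874, 0.51853, 0.56564, 0.5999, 0.6008, 0.624058, 0.6878,
             0.7644, 0.858354, 0.9224246, 0.9425251, 0.9795854, 0.9903]

diffs = [abs(a - b) for a, b in zip(r_first, r_swapped)]
max_diff = max(diffs)
n_at_max = n[diffs.index(max_diff)]
mean_diff = sum(diffs) / len(diffs)

print(f"largest gap {max_diff:.4f} at n = {n_at_max}, mean gap {mean_diff:.4f}")
```

The largest gap between the two curves is about 0.065 (at n = 7) and the mean gap is about 0.018, which is small compared with the rise of r from roughly 0.55 to 0.99 over the range of n.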
A third group of experiments:
- | - | - | - | - | - |
- | - | - | 2 | 1 | - |
- | - | - | 1 | - | - |
- | - | - | - | 1 | - |
- | - | - | - | - | - |
- | - | - | - | - | - |
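This time the test set is the 48 images whose columns 3 and 4 are 00, 01 or 11. Each configuration is again trained to convergence 199 times, and the change in the average classification accuracy is recorded: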
n | iterations | A classification accuracy
2 | 376040.1 | 0.672425
5 | 237616 | 0.667609
10 | 201856.7 | 0.666876
30 | 171765.9 | 0.666667
50 | 150262.7 | 0.668551
70 | 132793.8 | 0.675984
90 | 118556.3 | 0.703308
110 | 106986.9 | 0.768635
130 | 98020.57 | 0.81428
150 | 91661.83 | 0.846524
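Because 01 and 11 are classified as A and 00 as B, 2/3 of the 48 images (32 of them) are classified as A, so the A classification accuracy is 2/3. This value hardly changes as n goes from 2 to 70; only from about n = 90 upward does it begin to increase noticeably.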
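The stability claim can be read off the table by measuring how far each reported accuracy lies from the 2/3 baseline. In the sketch below the tolerance of 0.01 is an arbitrary threshold picked here for illustration, not a value taken from the experiment.

```python
n = [2, 5, 10, 30, 50, 70, 90, 110, 130, 150]
r = [0.672425, 0.667609, 0.666876, 0.666667, 0.668551,
     0.675984, 0.703308, 0.768635, 0.81428, 0.846524]

baseline = 32 / 48  # 01 and 11 go to A (32 images), 00 goes to B (16 images)

# Absolute deviation of each reported accuracy from the 2/3 baseline.
deviation = {k: abs(v - baseline) for k, v in zip(n, r)}

TOLERANCE = 0.01  # hypothetical threshold for "hardly changes"
stable = [k for k in n if deviation[k] < TOLERANCE]

print(stable)
```

With this threshold the values for n = 2 through 70 all count as stable, while n = 90 and above do not.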
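So the classification performed by the neural network falls into two cases. One is regular and clear-cut, like 00, 01 and 11 in this example; this kind of classification is stable over a fairly wide range and does not depend on the network's parameter settings.
The other is ambiguous, like 10 in this example; the classification of such patterns depends on both the feeding order and the number of hidden nodes.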