Nice find; thank you for being so thorough in your review.
There indeed is an error in the code, specifically, in the calculation of iNumWeight. Here is a correction; note the commented-out line:
for ( fm=0; fm<50; ++fm)
{
    for ( ii=0; ii<5; ++ii )
    {
        for ( jj=0; jj<5; ++jj )
        {
            // iNumWeight = fm * 26;  // 26 is the number of weights per feature map
            iNumWeight = fm * 156;  // 156 is the number of weights per feature map

            NNNeuron& n = *( pLayer->m_Neurons[ jj + ii*5 + fm*25 ] );

            n.AddConnection( ULONG_MAX, iNumWeight++ );  // bias weight

            for ( kk=0; kk<25; ++kk )
            {
                // note: max val of index == 1013, corresponding to 1014 neurons in prev layer
                n.AddConnection(       2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
                n.AddConnection( 169 + 2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
                n.AddConnection( 338 + 2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
                n.AddConnection( 507 + 2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
                n.AddConnection( 676 + 2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
                n.AddConnection( 845 + 2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
            }
        }
    }
}

As you can see, the error is a copy-and-paste type of error. The previous layer (#1) had 26 weights per feature map, whereas the current layer (#2) has 156.
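Incidentally, the six hard-coded offsets (0, 169, 338, 507, 676, 845) are simply multiples of 169, since layer #1 consists of six 13x13 = 169-neuron feature maps (1014 neurons in total, as the comment above notes). Just as a sketch (this is not how the article's code is written), the six AddConnection calls could equivalently be collapsed into a loop over the previous layer's feature maps, preserving the exact same weight ordering:

    for ( kk=0; kk<25; ++kk )
    {
        for ( int map=0; map<6; ++map )  // six 13x13 feature maps in layer #1
        {
            n.AddConnection( map*169 + 2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
        }
    }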
There is actually a further error for layer #2. The given equation for the number of weights is (5x5+1)x6x50=7800. In this equation, the "+1" is intended to account for the bias weight, and yields (5x5+1)x6=156 weights for each of the 50 feature maps. However, the equation thereby implies that there are 6 bias weights for each neuron, when there should be only one. The correct equation is (5x5x6+1)x50=7550, i.e., 151 weights per feature map. The code above does not reflect this, and for that reason, if you re-do the math, you will see that there are still a few unused weights (more precisely, 5 unused weights per feature map).
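For anyone re-checking the arithmetic, here is a small stand-alone tally (my own sketch; none of these names come from the article's code):

    #include <cstdio>

    int main()
    {
        const int kernel   = 5 * 5;   // 5x5 convolution kernel
        const int prevMaps = 6;       // feature maps in layer #1
        const int currMaps = 50;      // feature maps in layer #2

        const int allocatedPerMap = ( kernel + 1 ) * prevMaps;  // 156: one bias per incoming map (wrong)
        const int neededPerMap    = kernel * prevMaps + 1;      // 151: one bias per feature map (right)

        std::printf( "allocated: %d x %d = %d\n", allocatedPerMap, currMaps, allocatedPerMap * currMaps );  // 7800
        std::printf( "needed:    %d x %d = %d\n", neededPerMap,    currMaps, neededPerMap * currMaps );     // 7550
        std::printf( "unused per feature map: %d\n", allocatedPerMap - neededPerMap );                      // 5
        return 0;
    }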
I can't change the code in the article or the download; if I did, the download for the trained neural network would no longer work (it's tightly coupled to the interconnections in the network architecture).
I'm working on a part 2 to the article, and I will incorporate your corrections there.
Meanwhile, I think it is a testament to the robustness of Dr. LeCun's architecture that such a low error rate (0.74%) can still be achieved even when there is an error in the interconnections.
Best regards,
Mike