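Chapter 1 Introduction
With the rapid development of deep learning in recent years, a growing number of learning paradigms have emerged, such as reinforcement learning, multi-task learning, and transfer learning. Recognition of handwritten digits and letters is a classic deep learning problem and is by now mature: with a CNN and a sensible network architecture, a few dozen training iterations are enough to exceed 98% recognition accuracy. This work revisits that classic problem with a less common approach, multi-task learning. Multi-task networks have attracted increasing attention because they reduce network size, improve performance, and help prevent overfitting, and the parameter-sharing mechanisms they rely on have developed alongside them. In addition to the classic hard sharing, soft sharing, and hierarchical sharing, a research team at Fudan University proposed a new mechanism, sparse sharing, in their 2020 paper "Learning Sparse Sharing Architectures for Multiple Tasks". Sparse sharing has already achieved considerable success in natural language processing, so choosing it as the parameter-sharing mechanism here is also an exploration of how it behaves in a plain ANN. In addition, digital image processing techniques are used to clean up and segment the input images, so that the character images fed to the multi-task network are as easy as possible for the network to classify. The main topics of this design are as follows: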
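- Convert the original image to grayscale, then binarize the grayscale image with an empirical threshold
- Remove the white margins above and below the text by horizontal cropping, and locate vertical cut points with a column-wise scan. To reduce the influence of noise, the candidate cut points are then screened
- Cut the image at the screened cut points so that the string is split into single characters, then resize the character images to a common size and store the data
- Build the base network and train it on each task
- Train the network on multiple tasks with sparse sharing
- Test the accuracy and robustness of the multi-task network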
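Chapter 2 Image Preprocessing
Digital image processing covers the methods and techniques by which a computer removes noise from, enhances, restores, segments, and extracts features from images. Preprocessing is critical in this design: it filters out as much image noise as possible and makes the important features easier to detect. This helps the neural network extract features, reduces the interference of noise during recognition, and can greatly improve recognition accuracy. The preprocessing steps used in this work are described below.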
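2.1 Binarization and Horizontal Cropping
Binarization sets the gray value of every pixel to either 0 or 255, so that the whole image appears in pure black and white. In other words, a grayscale image with 256 brightness levels is converted, through a suitably chosen threshold, into a binary image that still reflects the global and local features of the original. Binary images play an important role in digital image processing, and many practical systems are built around them. To process and analyze a binary image, the grayscale image must first be binarized. After that, the properties of the image depend only on the positions of the pixels with value 0 or 255 and no longer on multi-level pixel values, which simplifies further processing. In this design, the original image is read in as a grayscale image and binarized with an empirical threshold of 80. The binarization formula is shown in Figure 2-1, where maxval = 255 and thresh = 80.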
![](https://i-blog.csdnimg.cn/blog_migrate/9a7cce0b48a9a768af44a5dc0f5ba1e0.png)
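The thresholding step can be sketched in NumPy as follows. This is a minimal sketch that assumes the formula in Figure 2-1 is the standard binary threshold (pixels strictly brighter than `thresh` become `maxval`, all others become 0). The function name `binarize` and the sample array are illustrative and are not taken from the original code.

```python
import numpy as np

def binarize(gray, thresh=80, maxval=255):
    """Map a grayscale image to {0, maxval} with a fixed threshold."""
    gray = np.asarray(gray)
    # Pixels strictly brighter than thresh become maxval; the rest become 0.
    return np.where(gray > thresh, maxval, 0).astype(np.uint8)

# Illustrative 2x3 "image" with values around the threshold.
sample = np.array([[0, 79, 80], [81, 200, 255]], dtype=np.uint8)
binary = binarize(sample)
```

With dark ink on light paper, this leaves the strokes at 0 (black) and the background at 255 (white), which is the convention the cropping rules below rely on.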
The grayscale image and the binarized image are shown in Figures 2-2 and 2-3.
![](https://i-blog.csdnimg.cn/blog_migrate/95e594424ff8b9c23e35859543843b53.png)
![](https://i-blog.csdnimg.cn/blog_migrate/6226690ea6e3cba29f35701d19a68bd1.png)
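After binarization, horizontal cropping removes the white margins above and below the text. The rule is: rows in which black pixels make up more than 0.5% of the row are kept; all other rows are treated as carrying no information and are removed. The image after horizontal cropping is shown in Figure 2-4.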
![](https://i-blog.csdnimg.cn/blog_migrate/47faadceaa676ddc4c48dec2859a87cf.png)
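The row rule above can be sketched as a boolean row filter. This is a minimal sketch that applies the rule literally to every row and assumes black is stored as 0; `crop_rows` and the toy image are illustrative names, not part of the original code.

```python
import numpy as np

def crop_rows(binary, min_black_ratio=0.005):
    """Keep only the rows whose share of black (0) pixels exceeds min_black_ratio."""
    binary = np.asarray(binary)
    black_ratio = (binary == 0).mean(axis=1)   # fraction of black pixels per row
    return binary[black_ratio > min_black_ratio]

img = np.full((6, 10), 255, dtype=np.uint8)    # all-white canvas
img[2:4, 3:7] = 0                              # a small black stroke in rows 2 and 3
cropped = crop_rows(img)
```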
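2.2 Vertical Segmentation and Size Normalization
After horizontal cropping, the image is scanned column by column for vertical cut points that separate the characters of the string. The procedure is: first find a column c1 in which white pixels make up less than 99.5% of the column, then find a subsequent column c2 in which white pixels make up more than 99.5%. A character is assumed to lie between c1 and c2, and the pair is recorded. Because this method is sensitive to noise pixels, a screening step is applied to the cut points, with two conditions: (1) the distance between the two cut points must be greater than two pixels; (2) black pixels must make up more than 5% of the region between the two cut points. The cut points that pass screening are final, and the segmented image is shown in Figure 2-5.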
![](https://i-blog.csdnimg.cn/blog_migrate/88e9dc295ab8e45777d9d389bf851331.png)
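The column scan and the two screening conditions can be sketched as follows. This is a minimal sketch under stated assumptions: black is 0 and white is 255, and a character that touches the right edge is closed at the image border (that edge case is not described in the text). The name `find_cut_points` and the toy image are illustrative.

```python
import numpy as np

def find_cut_points(binary, white_ratio=0.995, min_width=2, min_black_ratio=0.05):
    """Return (c1, c2) column pairs; each pair brackets one character."""
    binary = np.asarray(binary)
    white = (binary == 255).mean(axis=0)        # fraction of white pixels per column
    segments, start = [], None
    for col, ratio in enumerate(white):
        if start is None and ratio < white_ratio:
            start = col                         # c1: first column containing ink
        elif start is not None and ratio > white_ratio:
            segments.append((start, col))       # c2: first blank column after c1
            start = None
    if start is not None:                       # assumption: close at the right border
        segments.append((start, binary.shape[1]))
    # Screening: drop segments that are too narrow or contain too little ink.
    kept = []
    for c1, c2 in segments:
        black = (binary[:, c1:c2] == 0).mean()
        if c2 - c1 > min_width and black > min_black_ratio:
            kept.append((c1, c2))
    return kept

img = np.full((10, 20), 255, dtype=np.uint8)
img[2:8, 2:7] = 0      # a character-sized blob, 5 columns wide
img[5, 10] = 0         # an isolated noise pixel
img[1:9, 13:18] = 0    # a second character
cuts = find_cut_points(img)
```

In this toy image the single noise pixel produces a one-column candidate, which the width condition rejects, so only the two character-sized segments remain.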
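Because the character images are passed to the neural network, their size must match the number of neurons in the input layer. The input layer has 784 neurons, so every image must be normalized to 784 pixels. Bilinear interpolation is used to resize each image to 28×28 pixels; the normalized images are shown in Figure 2-6.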
![](https://i-blog.csdnimg.cn/blog_migrate/803cafc9331ce7d558b58fe194ceb191.png)
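For reference, bilinear resizing to 28×28 can be written directly in NumPy. This is a minimal sketch that assumes corner-aligned sampling (the first and last output pixels map to the first and last input pixels); library routines may use a different pixel-alignment convention, and the original code for this step is not shown. The name `resize_bilinear` is illustrative.

```python
import numpy as np

def resize_bilinear(img, out_h=28, out_w=28):
    """Resize a 2-D image with bilinear interpolation (corner-aligned sampling)."""
    img = np.asarray(img, dtype=float)
    in_h, in_w = img.shape
    # Source coordinates sampled by each output row and column.
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]    # vertical interpolation weights
    wx = (xs - x0)[None, :]    # horizontal interpolation weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

char = np.zeros((40, 17))
char[:, 8] = 255.0                   # a vertical stroke
resized = resize_bilinear(char)
flat = resized.reshape(784, 1)       # one column per sample, matching the 784 input neurons
```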
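Chapter 3 Design of the Multi-Task Neural Network
3.1 Design of the Base Network
Multi-task networks are attracting growing attention because of their advantages, which can be summarized as follows:
- A multi-task network performs several tasks with a single network, which reduces the size of the network and the number of layers. For networks that must be deployed on mobile devices, the smaller size is a major advantage
- The traditional way to improve performance is to deepen the network or to add neurons to each layer, which can make the number of parameters grow explosively. A multi-task network can instead improve performance by letting the tasks reinforce one another during learning
- Overfitting is a major problem for neural networks. Traditional networks counter it with L2 regularization or dropout, whereas a multi-task network can counter it through the constraints the tasks place on one another
Training a multi-task network requires a base network as the backbone. This design uses a five-layer ANN as the backbone; its structure is shown in Figure 3-1.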
![](https://i-blog.csdnimg.cn/blog_migrate/e316a1c918f4867a2968a94cf6c25fca.png)
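To give a sense of the backbone's size, the parameter count can be computed from the layer widths. This sketch assumes the widths 784, 285, 104, 37, 12, which are read off the weight-matrix shapes in the appendix code; the variable names are illustrative.

```python
# Layer widths taken from the weight shapes in the appendix code (assumed to match Figure 3-1).
layer_sizes = [784, 285, 104, 37, 12]

# Each fully connected layer has n_out * n_in weights plus n_out biases.
params_per_layer = [
    n_out * n_in + n_out
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(params_per_layer)
print(params_per_layer, total_params)
```

Under these widths, the first layer accounts for most of the parameters, so it is also where pruning has the most room to remove weights.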
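3.2 Choosing a Parameter-Sharing Mechanism
3.2.1 Hard Sharing
Hard parameter sharing is the most common approach to multi-task learning (MTL) in neural networks. In practice it is usually implemented by sharing the hidden layers among all tasks while keeping several task-specific output layers. Although hard sharing greatly reduces the risk of overfitting, the shared hidden layers limit each subtask's ability to express its own distinctive features. Hard sharing requires the subtasks to be closely related; for heterogeneous tasks it fails to make the tasks reinforce one another and may instead cause negative transfer. The hard sharing mechanism is illustrated in Figure 3-2.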
![](https://i-blog.csdnimg.cn/blog_migrate/cd7a5cd0b22bb1e0e69c2c2ab2a7fc47.png)
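The structure of hard sharing can be sketched as one shared trunk plus one output layer per task. This is a minimal sketch; the hidden widths (64 and 32), the random weights, and the names `shared`, `heads`, and `forward_hard` are illustrative assumptions rather than the network used in this design.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=0, keepdims=True)    # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

# Shared trunk: every task uses exactly these hidden-layer weights.
shared = [rng.normal(0, 0.05, (64, 784)), rng.normal(0, 0.05, (32, 64))]
# Task-specific heads: one output layer per task (4 tasks, 12 classes each).
heads = [rng.normal(0, 0.05, (12, 32)) for _ in range(4)]

def forward_hard(x, task):
    h = x
    for w in shared:
        h = relu(w @ h)
    return softmax(heads[task] @ h)

x = rng.random((784, 5))                    # 5 samples stored column-wise
probs = forward_hard(x, task=2)
```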
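3.2.2 Soft Sharing
In soft parameter sharing, each task has its own model and its own parameters. The distance between the parameters of the models is regularized to encourage the parameters to be similar; common choices include the L2 distance and the trace norm. Soft sharing gives heterogeneous tasks more room to express their own features, but it needs many parameters and uses them inefficiently. The soft sharing mechanism is illustrated in Figure 3-3.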
![](https://i-blog.csdnimg.cn/blog_migrate/91e1df01d281d3cc16025a47d5bbf33f.png)
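The L2 variant of the soft-sharing regularizer can be sketched as a penalty on pairwise parameter distances. This is a minimal sketch; the function name `soft_sharing_penalty`, the coefficient `lam`, and the toy matrices are illustrative assumptions.

```python
import numpy as np

def soft_sharing_penalty(task_weights, lam=0.01):
    """lam times the sum of squared L2 distances between every pair of task weight matrices."""
    penalty = 0.0
    for i in range(len(task_weights)):
        for j in range(i + 1, len(task_weights)):
            penalty += np.sum((task_weights[i] - task_weights[j]) ** 2)
    return lam * penalty

# Three tasks, each with its own copy of one layer's weights.
w_a = np.ones((3, 4))
w_b = np.zeros((3, 4))
w_c = np.ones((3, 4))
penalty = soft_sharing_penalty([w_a, w_b, w_c], lam=0.5)
```

The penalty is added to the sum of the task losses, so identical task parameters cost nothing and diverging parameters are pulled back toward one another.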
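3.2.3 Hierarchical Sharing
The idea of hierarchical sharing is to attach different supervision at different layers, so that different subtasks share different hidden layers. Hierarchical sharing also gives heterogeneous tasks more room to express their own features, but the structure of the network is usually designed by hand, which depends heavily on the skill and insight of the designer. The hierarchical sharing mechanism is illustrated in Figure 3-4.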
![](https://i-blog.csdnimg.cn/blog_migrate/2bfd1951d57399098bd64d4f0708efcc.png)
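3.2.4 Sparse Sharing
In sparse sharing, all subtasks share the hidden layers, but each subtask shares them only partially. This mechanism has the following advantages:
- Partial sharing gives heterogeneous tasks room to express themselves, so the network accommodates them well (addressing the weakness of hard sharing)
- Sharing the hidden layers reduces the number of parameters, so a network that performs well is also parameter-efficient (addressing the weakness of soft sharing)
- The network structure does not have to be set by hand and does not depend on expert experience (addressing the weakness of hierarchical sharing)
The sparse sharing network is illustrated in Figure 3-5.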
![](https://i-blog.csdnimg.cn/blog_migrate/936bb246b2ab7d618f392309cba75a41.png)
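Partial sharing can be expressed with one binary mask per task over the same weight matrix, which is also how the appendix code applies it (`np.multiply(W, M)` in the forward pass and on the gradient). The sketch below is illustrative: the masks are drawn at random here, whereas in the method they come from pruning, and the names `task_weights` and `masked_update` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (4, 6))                    # one shared weight matrix
# Hypothetical binary masks for two tasks (random here, produced by pruning in the method).
M1 = (rng.random(W.shape) > 0.3).astype(float)
M2 = (rng.random(W.shape) > 0.3).astype(float)

def task_weights(W, M):
    return W * M                                  # the subnetwork a task actually uses

def masked_update(W, grad, M, alpha=0.1):
    return W - alpha * grad * M                   # only the task's own entries move

grad = np.ones_like(W)
W_new = masked_update(W, grad, M1)
shared_entries = int(np.sum(M1 * M2))             # weights used by both tasks
```

Entries where both masks are 1 are trained by both tasks, entries where only one mask is 1 are private to that task, and entries masked out by a task are neither used nor updated by it.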
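3.3 The IMP Pruning Algorithm
The IMP (Iterative Magnitude Pruning) algorithm lets each subtask extract from the base network a subnetwork suited to that task. The algorithm is shown in Figure 3-6: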
![](https://i-blog.csdnimg.cn/blog_migrate/69e146613788b9015c944fa2f0d57b33.png)
The pruning process is illustrated in Figure 3-7.
![](https://i-blog.csdnimg.cn/blog_migrate/cccf980fc6e66310004ec82467231531.png)
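A generic form of iterative magnitude pruning can be sketched as follows. This is a sketch under stated assumptions, not a transcription of Figure 3-6: it assumes the common variant that restarts from the initial weights after every pruning round and removes a fixed fraction of the surviving weights each time. The training step is a placeholder callback, and the names `prune_step`, `iterative_magnitude_pruning`, and `train_fn` are illustrative. The appendix code uses a simpler rule that removes the smallest surviving weight of each layer at every pruning step.

```python
import numpy as np

def prune_step(W, M, rate=0.1):
    """Clear the mask for the smallest-magnitude fraction `rate` of surviving weights."""
    alive = np.flatnonzero(M)                     # flat indices of surviving weights
    n_prune = int(len(alive) * rate)
    if n_prune == 0:
        return M
    magnitudes = np.abs(W.ravel()[alive])
    drop = alive[np.argsort(magnitudes)[:n_prune]]
    flat = M.copy().ravel()
    flat[drop] = 0
    return flat.reshape(M.shape)

def iterative_magnitude_pruning(W_init, train_fn, rounds=3, rate=0.1):
    M = np.ones_like(W_init)
    for _ in range(rounds):
        # Train the current subnetwork, starting again from the initial weights.
        W = train_fn(W_init * M, M)
        M = prune_step(W, M, rate)
    return M

rng = np.random.default_rng(0)
W0 = rng.normal(0, 1, (10, 10))
# Placeholder "training" that returns the masked weights unchanged.
mask = iterative_magnitude_pruning(W0, lambda W, M: W, rounds=3, rate=0.1)
```

Running the procedure once per task on that task's data yields one mask per task, and those masks define the task subnetworks used in sparse sharing.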
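Chapter 4 Experiments and Comparison
4.1 Task Division and Single-Task Experiments
The 48 digit and letter classes to be recognized (data from EMNIST) are divided into four groups of 12 classes, and each group forms one task. Each task has 500 training samples and 500 test samples, and training runs for 3,000 iterations. The division into four tasks is shown in Figure 4-1.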
![](https://i-blog.csdnimg.cn/blog_migrate/6f82d3f2edef5b9e7c403326744d038b.png)
With the tasks defined, a single-task experiment is run on the base network for each task. The results are shown in Figure 4-2.
![](https://i-blog.csdnimg.cn/blog_migrate/56f2274b9e6261e1cba7ccc7bb16e805.png)
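As the figure above shows, in the single-task setting the best training accuracy of every task is no higher than 85%, the test accuracy is no higher than 70%, and the frame rate is about 20 fps.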
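4.2 Multi-Task Experiment and Conclusions
In the multi-task experiment, each task first goes through 1,000 warm-up iterations, followed by 2,000 training iterations in which pruning is performed once every 12 iterations, with the pruning rate kept at no more than 10%. After these two steps, the four tasks are trained in parallel. Parallel training uses the same base network and the same initial parameters as the single-task experiments, but the number of training iterations is reduced to 330. The training results are shown in Figure 4-3.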
![](https://i-blog.csdnimg.cn/blog_migrate/fccd1b0b931ff6dfefaa458f0e318c9e.png)
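As the figure above shows, although the multi-task setting uses roughly nine times fewer iterations (330 versus 3,000), the best training accuracy of all four tasks exceeds 94% and the test accuracy of all four exceeds 79%. The frame rate of Task 1 also improves slightly.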
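Chapter 5 Practical Test and Summary
After training, the model was tested in practice on Tasks 1 to 4 and on a mixed task. All test images were handwritten; the test is shown in Figure 5-1.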
![](https://i-blog.csdnimg.cn/blog_migrate/317da6b6696ca6da9d13b2723f1bf679.png)
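The practical test shows that the multi-task network is accurate on each of Tasks 1 to 4 and also performs well on the mixed task. In other words, the network recognizes letters and digits through multi-task learning with high accuracy and robustness.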
Appendix: Experiment Source Code
Task 1 single-task experiment (training + testing)
import datetime

import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder


# 1. Data preprocessing
def Datapre(path):
    file = pd.read_csv(path)
    data = file['data'].values
    label = file['label'].values
    m = len(label)
    X = np.matrix(np.zeros((m, 28 * 28)))
    for i in range(m):
        # Each 'data' cell holds one image as a string; np.matrix parses it into a row.
        X[i, :] = np.matrix(data[i])
    # One-hot encode the labels. scikit-learn >= 1.2 uses sparse_output;
    # older releases use sparse=False instead.
    encoder = OneHotEncoder(sparse_output=False)
    Y = np.matrix(encoder.fit_transform(label.reshape(-1, 1)))
    return X.T, Y.T  # columns are samples


# 2. Softmax, applied to each column
def Softmax(Z):
    Z = Z - np.max(Z, axis=0)  # subtract each column's maximum for numerical stability
    E = np.exp(Z)
    return E / np.sum(E, axis=0)


# 3. ReLU
def Relu(Z):
    return np.maximum(Z, 0)


# 4. Derivative of ReLU (returns a new matrix, Z is left unchanged)
def d_Relu(Z):
    return (Z > 0).astype(float)


# 5. Forward propagation
def Forwardpropagation(X, W1, b1, W2, b2, W3, b3, W4, b4):
    Z1 = np.dot(W1, X) + b1
    A1 = Relu(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = Relu(Z2)
    Z3 = np.dot(W3, A2) + b3
    A3 = Relu(Z3)
    Z4 = np.dot(W4, A3) + b4
    A4 = Softmax(Z4)
    return A1, A2, A3, A4, Z1, Z2, Z3, Z4


# 6. Backpropagation
def Backpropagation(X, Y, A1, A2, A3, A4, W2, W3, W4, Z1, Z2, Z3):
    # len(Y) is the number of rows of Y (the class count), not the sample count,
    # so each gradient is a sum over samples divided by the class count.
    m = len(Y)
    dz4 = A4 - Y  # gradients for layer 4
    dw4 = np.dot(dz4, A3.T) / m
    db4 = np.sum(np.array(dz4), axis=1, keepdims=True) / m
    da3 = np.dot(W4.T, dz4)  # gradients for layer 3
    dz3 = np.multiply(da3, d_Relu(Z3))
    dw3 = np.dot(dz3, A2.T) / m
    db3 = np.sum(np.array(dz3), axis=1, keepdims=True) / m
    da2 = np.dot(W3.T, dz3)  # gradients for layer 2
    dz2 = np.multiply(da2, d_Relu(Z2))
    dw2 = np.dot(dz2, A1.T) / m
    db2 = np.sum(np.array(dz2), axis=1, keepdims=True) / m
    da1 = np.dot(W2.T, dz2)  # gradients for layer 1
    dz1 = np.multiply(da1, d_Relu(Z1))
    dw1 = np.dot(dz1, X.T) / m
    db1 = np.sum(np.array(dz1), axis=1, keepdims=True) / m
    return dw1, db1, dw2, db2, dw3, db3, dw4, db4


# 7. Accuracy: fraction of columns whose predicted class matches the label
def Computeaccuracy(A4, Y):
    pred = np.asarray(np.argmax(A4, axis=0)).ravel()
    true = np.asarray(np.argmax(Y, axis=0)).ravel()
    return np.mean(pred == true)


# 8. Gradient descent
def Gradiendescent(X, Y, W1, b1, W2, b2, W3, b3, W4, b4, alpha0, iters):
    best = (W1, b1, W2, b2, W3, b3, W4, b4)
    best_acc = 0
    for i in range(iters):
        alpha = alpha0 / (int(i / 850) * 0.5 + 1)  # step decay every 850 iterations
        A1, A2, A3, A4, Z1, Z2, Z3, Z4 = Forwardpropagation(X, W1, b1, W2, b2, W3, b3, W4, b4)
        accuracy = Computeaccuracy(A4, Y)
        if accuracy > best_acc:
            # A4 was produced by the current parameters, so store them before the update.
            best_acc = accuracy
            best = (W1, b1, W2, b2, W3, b3, W4, b4)
        dw1, db1, dw2, db2, dw3, db3, dw4, db4 = Backpropagation(X, Y, A1, A2, A3, A4, W2, W3, W4, Z1, Z2, Z3)
        W1 = W1 - alpha * dw1
        b1 = b1 - alpha * db1
        W2 = W2 - alpha * dw2
        b2 = b2 - alpha * db2
        W3 = W3 - alpha * dw3
        b3 = b3 - alpha * db3
        W4 = W4 - alpha * dw4
        b4 = b4 - alpha * db4
        print(i, accuracy)
    print("Task1:")
    print("Best_accuracy:", best_acc)
    return best


# 9. Initialise the parameters and run
W1 = np.matrix(np.random.randint(100, 200, (285, 784)) * 0.00001)
b1 = np.matrix(np.random.randint(100, 200, (285, 1)) * 0.00001)
W2 = np.matrix(np.random.randint(200, 300, (104, 285)) * 0.000005)
b2 = np.matrix(np.random.randint(200, 300, (104, 1)) * 0.000005)
W3 = np.matrix(np.random.randint(300, 400, (37, 104)) * 0.000003)
b3 = np.matrix(np.random.randint(300, 400, (37, 1)) * 0.000003)
W4 = np.matrix(np.random.randint(400, 500, (12, 37)) * 0.0000025)
b4 = np.matrix(np.random.randint(400, 500, (12, 1)) * 0.0000025)
alpha0 = 0.00006
iters = 3000
X, Y = Datapre(r'C:\Users\MSI-PC\Desktop\Task\Task1\task1_train.csv')
W1_best, b1_best, W2_best, b2_best, W3_best, b3_best, W4_best, b4_best = Gradiendescent(X, Y, W1, b1, W2, b2, W3, b3, W4, b4, alpha0, iters)
X_test, Y_test = Datapre(r'C:\Users\MSI-PC\Desktop\Task\Task1\task1_test.csv')
print("Begin test:")
starttime = datetime.datetime.now()  # start timing
A4 = Forwardpropagation(X_test, W1_best, b1_best, W2_best, b2_best, W3_best, b3_best, W4_best, b4_best)[3]
endtime = datetime.datetime.now()  # stop timing
accuracy_test = Computeaccuracy(A4, Y_test)
# total_seconds() covers the whole interval; .microseconds alone would drop whole seconds.
elapsed_us = (endtime - starttime).total_seconds() * 1e6
# 6000 is the sample count hard-coded by this script. elapsed_us / 6000 / 1000 is
# milliseconds per sample, so the printed value is samples per millisecond.
fps = 1 / ((elapsed_us / 6000) / 1000)
print("accuracy_test,fps:", accuracy_test, ',', fps)
Task 2 single-task experiment (training + testing)
This script is the same as the Task 1 script above, with Task 2 substituted in the two CSV paths and in the printed label:
Training data: C:\Users\MSI-PC\Desktop\Task\Task2\task2_train.csv
Test data: C:\Users\MSI-PC\Desktop\Task\Task2\task2_test.csv
Printed label: "Task2:"
Task 3 single-task experiment (training + testing)
This script is the same as the Task 1 script above, with Task 3 substituted in the two CSV paths and in the printed label:
Training data: C:\Users\MSI-PC\Desktop\Task\Task3\task3_train.csv
Test data: C:\Users\MSI-PC\Desktop\Task\Task3\task3_test.csv
Printed label: "Task3:"
Task 4 single-task experiment (training + testing)
This script is the same as the Task 1 script above, with Task 4 substituted in the two CSV paths and in the printed label:
Training data: C:\Users\MSI-PC\Desktop\Task\Task4\task4_train.csv
Test data: C:\Users\MSI-PC\Desktop\Task\Task4\task4_test.csv
Printed label: "Task4:"
多任务实验(训练)
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import datetime
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
'1.数据预处理'
def Datapre(path):
file=pd.read_csv(path)
data=file['data'].values
label=file['label'].values
m=len(label)
X=np.matrix(np.zeros((m,28*28)))
Y=np.matrix(np.zeros((m,5)))
for i in range(m): #把data由str转为matrix
data[i]=np.matrix(data[i])
X[i,:]=data[i]
encoder=OneHotEncoder(sparse=False) #把label转为矩阵后进行one-hot编码
Y=(np.matrix(encoder.fit_transform(np.matrix(label).T)))
return X.T,Y.T
'2.softmax函数'
def Softmax(Z):
col=Z.shape[1]
for i in range(col):
max=np.max(Z[:,i])
Z[:,i]-=max
Z[:,i]=np.exp(Z[:,i])
Z[:,i]=Z[:,i]/np.sum(Z[:,i])
return Z
'3.relu函数'
def Relu(Z):
Z=(np.abs(Z)+Z)/2.0
return Z
'4.relu导函数'
def d_Relu(Z):
Z[Z<=0]=0
Z[Z>0]=1
return Z
'5.前向传播'
def Forwardpropagation(X,W1,b1,W2,b2,W3,b3,W4,b4):
Z1=np.dot(W1,X)+b1
A1=Relu(Z1)
Z2=np.dot(W2,A1)+b2
A2=Relu(Z2)
Z3=np.dot(W3,A2)+b3
A3=Relu(Z3)
Z4=np.dot(W4,A3)+b4
A4=Softmax(Z4)
return A1,A2,A3,A4,Z1,Z2,Z3,Z4
'6.反向传播算法'
def Backpropagation(X,Y,A1,A2,A3,A4,W2,W3,W4,Z1,Z2,Z3):
m=len(Y)
dz4=A4-Y #求dw4、db4
dw4=np.dot(dz4,A3.T)/m
db4=np.sum(np.array(dz4),axis=1,keepdims=True)/m
da3=np.dot(W4.T,dz4) #求dw3、db3
dz3=np.multiply(da3,d_Relu(Z3))
dw3=np.dot(dz3,A2.T)/m
db3=np.sum(np.array(dz3),axis=1,keepdims=True)/m
da2 = np.dot(W3.T, dz3) # 求dw2、db2
dz2 = np.multiply(da2, d_Relu(Z2))
dw2 = np.dot(dz2, A1.T) / m
db2 = np.sum(np.array(dz2), axis=1, keepdims=True) / m
da1 = np.dot(W2.T, dz2) # 求dw1、db1
dz1= np.multiply(da1, d_Relu(Z1))
dw1 = np.dot(dz1, X.T) / m
db1 = np.sum(np.array(dz1), axis=1, keepdims=True) / m
return dw1,db1,dw2,db2,dw3,db3,dw4,db4
'7.计算测试准确率'
def Computeaccuracy(A4,Y):
m=A4.shape[1]
n=A4.shape[0]
for i in range(m):
loc=np.argmax(A4[:,i])
A4[:,i]=np.matrix(np.zeros((n,1)))
A4[loc,i]=1
accuray=np.sum(np.multiply(A4,Y))/m
return accuray
'8.梯度下降算法'
def Gradiendescent(X,Y,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M1,M2,M3,M4,mode):
if mode==1: #获取掩码模式,不需要掩码,掩码矩阵置为全1(平行训练模式为2,需要子网络)
M1=np.matrix(np.ones(M1.shape))
M2=np.matrix(np.ones(M2.shape))
M3=np.matrix(np.ones(M3.shape))
M4 = np.matrix(np.ones(M4.shape))
A1,A2,A3,A4,Z1,Z2,Z3,Z4=Forwardpropagation(X,np.multiply(W1,M1),b1,np.multiply(W2,M2),b2,np.multiply(W3,M3),b3,np.multiply(W4,M4),b4) #进行前向传播
dw1,db1,dw2,db2,dw3,db3,dw4,db4=Backpropagation(X,Y,A1,A2,A3,A4,W2,W3,W4,Z1,Z2,Z3) #进行反向传播
W1=W1-alpha*np.multiply(dw1,M1) #更新参数
b1=b1-alpha*db1
W2=W2-alpha*np.multiply(dw2,M2)
b2=b2-alpha*db2
W3=W3-alpha*np.multiply(dw3,M3)
b3=b3-alpha*db3
W4=W4-alpha*np.multiply(dw4,M4)
b4=b4-alpha*db4
return W1,W2,W3,W4,b1,b2,b3,b4,A4
'9.剪枝函数'
def Pruning(W1,W2,W3,W4,M1,M2,M3,M4):
if np.sum(M1)/np.sum(np.matrix(np.ones((M1.shape))))>0.6:
threshold1 = abs(W1[M1!= 0]).min() #为第一层剪枝
M1[abs(W1)==threshold1]=0
if np.sum(M2) / np.sum(np.matrix(np.ones((M2.shape)))) > 0.7:
threshold2 = abs(W2[M2 != 0]).min() #为第二层剪枝
M2[abs(W2) == threshold2] = 0
if np.sum(M3) /np.sum(np.matrix(np.ones((M3.shape)))) > 0.8:
threshold3 = abs(W3[M3 != 0]).min() #为第三层剪枝
M3[abs(W3) == threshold3] = 0
if np.sum(M4) /np.sum(np.matrix(np.ones((M4.shape)))) > 0.9:
threshold4 = abs(W4[M4 != 0]).min() #为第四层剪枝
M4[abs(W4) == threshold4] = 0
return M1,M2,M3,M4
'10.生成任务一的掩码矩阵(每训练10次剪一个参数)'
def Get_Mask1(X1,Y1,W1,b1,W2,b2,W3,b3,W4,b4,alpha0):
M11=np.matrix(np.ones(W1.shape)) #任务1在5层网络上的掩码矩阵
M12=np.matrix(np.ones(W2.shape))
M13=np.matrix(np.ones(W3.shape))
M14=np.matrix(np.ones(W4.shape))
for i in range(1,1001): #预训练1000轮
alpha=alpha0/(int(i/850)*0.5+1)
W1,W2,W3,W4,b1,b2,b3,b4=Gradiendescent(X1,Y1,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M11,M12,M13,M14,1)[0:8]
print('\r Warm up1: %g' % (i /1000 * 100), '%', end='')
print('\n')
for i in range(1,2001): #开始训练、剪枝
alpha=alpha0/(int(i/850)*0.5+1)
W1,W2,W3,W4,b1,b2,b3,b4=Gradiendescent(X1,Y1,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M11,M12,M13,M14,1)[0:8]
if i%12==0: #剪枝
M11,M12,M13,M14=Pruning(W1,W2,W3,W4,M11,M12,M13,M14)
print('\r Task1_pruning: %g' % (i/2000*100),'%', end='')
print('\n',"Task1_pruning end",'\n')
return M11,M12,M13,M14
'11.生成任务二的掩码矩阵(每训练10次剪一个参数)'
def Get_Mask2(X2,Y2,W1,b1,W2,b2,W3,b3,W4,b4,alpha0):
M21=np.matrix(np.ones(W1.shape)) # 任务1在5层网络上的掩码矩阵
M22=np.matrix(np.ones(W2.shape))
M23=np.matrix(np.ones(W3.shape))
M24=np.matrix(np.ones(W4.shape))
for i in range(1,1001): #预训练
alpha=alpha0/(int(i/850)*0.5+1)
W1,W2,W3,W4,b1,b2,b3,b4=Gradiendescent(X2,Y2,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M21,M22,M23,M24,1)[0:8]
print('\r Warm up2: %g' % (i / 1000 * 100), '%', end='')
print('\n')
for i in range(1,2001):
alpha=alpha0/(int(i/850)*0.5+1)
W1,W2,W3,W4,b1,b2,b3,b4=Gradiendescent(X2,Y2,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M21,M22,M23,M24,1)[0:8] #预训练
if i%12==0:
M21,M22,M23,M24=Pruning(W1,W2,W3,W4,M21,M22,M23,M24) #剪枝
print('\r Task2_pruning: %g' % (i/2000*100),'%', end='')
print('\n',"Task2_pruning end",'\n')
return M21,M22,M23,M24
'11.生成任务三的掩码矩阵(每训练10次剪一个参数)'
def Get_Mask3(X3,Y3,W1,b1,W2,b2,W3,b3,W4,b4,alpha0):
M31=np.matrix(np.ones(W1.shape)) # 任务1在5层网络上的掩码矩阵
M32=np.matrix(np.ones(W2.shape))
M33=np.matrix(np.ones(W3.shape))
M34=np.matrix(np.ones(W4.shape))
for i in range(1,1001): #预训练
alpha=alpha0/(int(i/850)*0.5+1)
W1,W2,W3,W4,b1,b2,b3,b4=Gradiendescent(X3,Y3,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M31,M32,M33,M34,1)[0:8]
print('\r Warm up3: %g' % (i / 1000 * 100), '%', end='')
print('\n')
for i in range(1,2001):
alpha=alpha0/(int(i/850)*0.5+1)
W1,W2,W3,W4,b1,b2,b3,b4=Gradiendescent(X3,Y3,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M31,M32,M33,M34,1)[0:8] #预训练
if i%12==0:
M31,M32,M33,M34=Pruning(W1,W2,W3,W4,M31,M32,M33,M34) #剪枝
print('\r Task3_pruning: %g' % (i/2000*100),'%', end='')
print('\n',"Task3_pruning end",'\n')
return M31,M32,M33,M34
'11.生成任务四的掩码矩阵(每训练10次剪一个参数)'
def Get_Mask4(X4,Y4,W1,b1,W2,b2,W3,b3,W4,b4,alpha0):
M41=np.matrix(np.ones(W1.shape)) # 任务1在5层网络上的掩码矩阵
M42=np.matrix(np.ones(W2.shape))
M43=np.matrix(np.ones(W3.shape))
M44=np.matrix(np.ones(W4.shape))
for i in range(1,1001): #预训练1000轮
alpha=alpha0/(int(i/850)*0.5+1)
W1,W2,W3,W4,b1,b2,b3,b4=Gradiendescent(X4,Y4,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M41,M42,M43,M44,1)[0:8]
print('\r Warm up4: %g' % (i / 1000 * 100), '%', end='')
print('\n')
for i in range(1,2001):
alpha=alpha0/(int(i/850)*0.5+1)
W1,W2,W3,W4,b1,b2,b3,b4=Gradiendescent(X4,Y4,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M41,M42,M43,M44,1)[0:8] #预训练
if i%12==0:
M41,M42,M43,M44=Pruning(W1,W2,W3,W4,M41,M42,M43,M44) #剪枝
print('\r Task4_pruning: %g' % (i/2000*100),'%', end='')
print('\n',"Task4_pruning end",'\n')
return M41,M42,M43,M44
'12.开始运行'
##参数初始化
W1=np.matrix(np.random.randint(100,200,(285,784))*0.00001)
b1=np.matrix(np.random.randint(100,200,(285,1))*0.00001)
W2=np.matrix(np.random.randint(200,300,(104,285))*0.000005)
b2=np.matrix(np.random.randint(200,300,(104,1))*0.000005)
W3=np.matrix(np.random.randint(300,400,(37,104))*0.000003)
b3=np.matrix(np.random.randint(300,400,(37,1))*0.000003)
W4=np.matrix(np.random.randint(400,500,(12,37))*0.0000025)
b4=np.matrix(np.random.randint(400,500,(12,1))*0.0000025)
alpha0=0.00006
iters=330 #平行训练次数
max1=0 #记录平行训练中任务1~4的最高准确率以及平均最高准确率
max2=0
max3=0
max4=0
max_mean=0
##读数据
X1,Y1=Datapre(r'C:\Users\MSI-PC\Desktop\Task\Task1\task1_train.csv')
X2,Y2=Datapre(r'C:\Users\MSI-PC\Desktop\Task\Task2\task2_train.csv')
X3,Y3=Datapre(r'C:\Users\MSI-PC\Desktop\Task\Task3\task3_train.csv')
X4,Y4=Datapre(r'C:\Users\MSI-PC\Desktop\Task\Task4\task4_train.csv')
##稀疏剪枝
M11,M12,M13,M14=Get_Mask1(X1,Y1,W1,b1,W2,b2,W3,b3,W4,b4,alpha0)
M21,M22,M23,M24=Get_Mask2(X2,Y2,W1,b1,W2,b2,W3,b3,W4,b4,alpha0)
M31,M32,M33,M34=Get_Mask3(X3,Y3,W1,b1,W2,b2,W3,b3,W4,b4,alpha0)
M41,M42,M43,M44=Get_Mask4(X4,Y4,W1,b1,W2,b2,W3,b3,W4,b4,alpha0)
##平行训练
W11=np.matrix(np.ones(W1.shape)) #储存任务一的子网络
W12=np.matrix(np.ones(W2.shape))
W13=np.matrix(np.ones(W3.shape))
W14=np.matrix(np.ones(W4.shape))
W21=np.matrix(np.ones(W1.shape)) #任务二的子网络
W22=np.matrix(np.ones(W2.shape))
W23=np.matrix(np.ones(W3.shape))
W24=np.matrix(np.ones(W4.shape))
W31=np.matrix(np.ones(W1.shape)) #任务三的子网络
W32=np.matrix(np.ones(W2.shape))
W33=np.matrix(np.ones(W3.shape))
W34=np.matrix(np.ones(W4.shape))
W41=np.matrix(np.ones(W1.shape)) #任务四的子网络
W42=np.matrix(np.ones(W2.shape))
W43=np.matrix(np.ones(W3.shape))
W44=np.matrix(np.ones(W4.shape))
b1_best=np.matrix(np.ones(b1.shape))
b2_best=np.matrix(np.ones(b2.shape))
b3_best=np.matrix(np.ones(b3.shape))
b4_best=np.matrix(np.ones(b4.shape))
'Parallel training'
for i in range(1,iters+1):
    # alpha=alpha0/(int(i/850)*0.5+1)
    alpha=alpha0
    W1,W2,W3,W4,b1,b2,b3,b4,A4=Gradiendescent(X1,Y1,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M11,M12,M13,M14,2) # train on subnetwork 1
    accuracy1_train=Computeaccuracy(A4,Y1)
    if accuracy1_train>max1:
        max1=accuracy1_train
    W1,W2,W3,W4,b1,b2,b3,b4,A4=Gradiendescent(X2,Y2,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M21,M22,M23,M24,2) # train on subnetwork 2
    accuracy2_train=Computeaccuracy(A4,Y2)
    if accuracy2_train>max2:
        max2=accuracy2_train
    W1,W2,W3,W4,b1,b2,b3,b4,A4=Gradiendescent(X3,Y3,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M31,M32,M33,M34,2) # train on subnetwork 3
    accuracy3_train=Computeaccuracy(A4,Y3)
    if accuracy3_train>max3:
        max3=accuracy3_train
    W1,W2,W3,W4,b1,b2,b3,b4,A4=Gradiendescent(X4,Y4,W1,b1,W2,b2,W3,b3,W4,b4,alpha,M41,M42,M43,M44,2) # train on subnetwork 4
    accuracy4_train=Computeaccuracy(A4,Y4)
    if accuracy4_train>max4:
        max4=accuracy4_train
    # Store the best weights for testing: the mean of the best accuracies must improve
    # and the gap between the best and worst task must not exceed 0.25
    if ((max1+max2+max3+max4)/4>max_mean) and (max(max1,max2,max3,max4)-min(max1,max2,max3,max4)<=0.25):
        max_mean=(max1+max2+max3+max4)/4
        W11=np.multiply(W1,M11) # weights of the task 1 subnetwork
        W12=np.multiply(W2,M12)
        W13=np.multiply(W3,M13)
        W14=np.multiply(W4,M14)
        W21=np.multiply(W1,M21) # weights of the task 2 subnetwork
        W22=np.multiply(W2,M22)
        W23=np.multiply(W3,M23)
        W24=np.multiply(W4,M24)
        W31=np.multiply(W1,M31) # weights of the task 3 subnetwork
        W32=np.multiply(W2,M32)
        W33=np.multiply(W3,M33)
        W34=np.multiply(W4,M34)
        W41=np.multiply(W1,M41) # weights of the task 4 subnetwork
        W42=np.multiply(W2,M42)
        W43=np.multiply(W3,M43)
        W44=np.multiply(W4,M44)
        b1_best=b1.copy() # copy so later updates cannot alter the stored biases
        b2_best=b2.copy()
        b3_best=b3.copy()
        b4_best=b4.copy()
    print('\r Train_process:%g' % (i /iters * 100),'%',' ','Task1_accuracy:',accuracy1_train, 'Task2_accuracy:',accuracy2_train,'Task3_accuracy:',accuracy3_train,'Task4_accuracy:',accuracy4_train,end='')
print("Best_train_task1:",max1)
print("Best_train_task2:",max2)
print("Best_train_task3:",max3)
print("Best_train_task4:",max4)
'Start testing the networks'
print("Begin test....")
X1_test,Y1_test=Datapre(r'C:\Users\MSI-PC\Desktop\Task\Task1\task1_test.csv')
X2_test,Y2_test=Datapre(r'C:\Users\MSI-PC\Desktop\Task\Task2\task2_test.csv')
X3_test,Y3_test=Datapre(r'C:\Users\MSI-PC\Desktop\Task\Task3\task3_test.csv')
X4_test,Y4_test=Datapre(r'C:\Users\MSI-PC\Desktop\Task\Task4\task4_test.csv')
'Test the task 1 network'
starttime1 = datetime.datetime.now() # start timing
A4=Forwardpropagation(X1_test,W11,b1_best,W12,b2_best,W13,b3_best,W14,b4_best)[3]
endtime1 = datetime.datetime.now() # stop timing
accuracy1_test=Computeaccuracy(A4,Y1_test)
time1=(endtime1 - starttime1).total_seconds()*1e6 # elapsed microseconds (.microseconds alone drops whole seconds)
fps1=1/((time1/6000)/1000) # equals 6e6/time1: samples per millisecond, assuming 6000 is the number of test samples
print("accuracy1_test,fps1:",accuracy1_test,',',fps1)
'Test the task 2 network'
starttime2 = datetime.datetime.now() # start timing
A4=Forwardpropagation(X2_test,W21,b1_best,W22,b2_best,W23,b3_best,W24,b4_best)[3]
endtime2 = datetime.datetime.now() # stop timing
accuracy2_test=Computeaccuracy(A4,Y2_test)
time2=(endtime2 - starttime2).total_seconds()*1e6
fps2=1/((time2/6000)/1000)
print("accuracy2_test,fps2:",accuracy2_test,',',fps2)
'Test the task 3 network'
starttime3 = datetime.datetime.now() # start timing
A4=Forwardpropagation(X3_test,W31,b1_best,W32,b2_best,W33,b3_best,W34,b4_best)[3]
endtime3 = datetime.datetime.now() # stop timing
accuracy3_test=Computeaccuracy(A4,Y3_test)
time3=(endtime3 - starttime3).total_seconds()*1e6
fps3=1/((time3/6000)/1000)
print("accuracy3_test,fps3:",accuracy3_test,',',fps3)
'Test the task 4 network'
starttime4 = datetime.datetime.now() # start timing
A4=Forwardpropagation(X4_test,W41,b1_best,W42,b2_best,W43,b3_best,W44,b4_best)[3]
endtime4 = datetime.datetime.now() # stop timing
accuracy4_test=Computeaccuracy(A4,Y4_test)
time4=(endtime4 - starttime4).total_seconds()*1e6
fps4=1/((time4/6000)/1000)
print("accuracy4_test,fps4:",accuracy4_test,',',fps4)
#### Only the threshold was changed: 0.9 became 0.5-0.9
'Save the model'
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W11.csv',W11, delimiter=',') # task 1 model
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W12.csv',W12, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W13.csv',W13, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W14.csv',W14, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W21.csv',W21, delimiter=',') # task 2 model
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W22.csv',W22, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W23.csv',W23, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W24.csv',W24, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W31.csv',W31, delimiter=',') # task 3 model
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W32.csv',W32, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W33.csv',W33, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W34.csv',W34, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W41.csv',W41, delimiter=',') # task 4 model
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W42.csv',W42, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W43.csv',W43, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W44.csv',W44, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\b1.csv',b1_best, delimiter=',') # bias vectors
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\b2.csv',b2_best, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\b3.csv',b3_best, delimiter=',')
np.savetxt('C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\b4.csv',b4_best, delimiter=',')
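The training script above stores one masked copy of the shared weights per task. As a minimal sketch of that idea, the toy example below uses plain Python lists instead of NumPy, and the names `shared`, `mask_a`, `mask_b`, `apply_mask` and `shared_positions` are hypothetical, not taken from the original code. Each task's subnetwork is the element-wise product of the shared weights and that task's binary mask, and the positions kept by both masks are the parameters the two tasks share.

```python
def apply_mask(weights, mask):
    """Element-wise product, the list equivalent of np.multiply(W, M)."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]


def shared_positions(mask_a, mask_b):
    """Return the (row, col) positions that both binary masks keep."""
    return [(r, c)
            for r, (row_a, row_b) in enumerate(zip(mask_a, mask_b))
            for c, (a, b) in enumerate(zip(row_a, row_b))
            if a == 1 and b == 1]


# Toy shared weight matrix and two task masks (illustrative values only).
shared = [[0.5, 1.0, 2.0],
          [1.5, -0.25, 0.75]]
mask_a = [[1, 0, 1],
          [0, 1, 1]]
mask_b = [[1, 1, 0],
          [0, 1, 0]]

subnet_a = apply_mask(shared, mask_a)        # weights task A actually uses
subnet_b = apply_mask(shared, mask_b)        # weights task B actually uses
overlap = shared_positions(mask_a, mask_b)   # parameters shared by both tasks

print(subnet_a)  # [[0.5, 0.0, 2.0], [0.0, -0.25, 0.75]]
print(subnet_b)  # [[0.5, 1.0, 0.0], [0.0, -0.25, 0.0]]
print(overlap)   # [(0, 0), (1, 1)]
```

Because only the masks differ between tasks, a single set of bias vectors and one shared weight matrix per layer are enough to reconstruct all four subnetworks.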
Multi-task experiment (testing)
import numpy as np
import matplotlib.pyplot as plt
import cv2
'1. Load the model'
W11=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W11.csv","rb"),delimiter=",",skiprows=0) # load the task 1 model
W12=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W12.csv","rb"),delimiter=",",skiprows=0)
W13=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W13.csv","rb"),delimiter=",",skiprows=0)
W14=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W14.csv","rb"),delimiter=",",skiprows=0)
W21=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W21.csv","rb"),delimiter=",",skiprows=0) # load the task 2 model
W22=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W22.csv","rb"),delimiter=",",skiprows=0)
W23=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W23.csv","rb"),delimiter=",",skiprows=0)
W24=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W24.csv","rb"),delimiter=",",skiprows=0)
W31=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W31.csv","rb"),delimiter=",",skiprows=0) # load the task 3 model
W32=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W32.csv","rb"),delimiter=",",skiprows=0)
W33=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W33.csv","rb"),delimiter=",",skiprows=0)
W34=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W34.csv","rb"),delimiter=",",skiprows=0)
W41=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W41.csv","rb"),delimiter=",",skiprows=0) # load the task 4 model
W42=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W42.csv","rb"),delimiter=",",skiprows=0)
W43=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W43.csv","rb"),delimiter=",",skiprows=0)
W44=np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\W44.csv","rb"),delimiter=",",skiprows=0)
b1=np.matrix(np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\b1.csv","rb"),delimiter=",",skiprows=0)).T # load the bias vectors as columns
b2=np.matrix(np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\b2.csv","rb"),delimiter=",",skiprows=0)).T
b3=np.matrix(np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\b3.csv","rb"),delimiter=",",skiprows=0)).T
b4=np.matrix(np.loadtxt(open("C:\\Users\\MSI-PC\\Desktop\\Task\\IPM\\Model\\b4.csv","rb"),delimiter=",",skiprows=0)).T
'2. Image-cutting helper'
def Cut_picture(img):
    ret, img = cv2.threshold(img, 80, 255, cv2.THRESH_BINARY) # binarize the image
    r1=img.shape[0]
    c=img.shape[1]
    img_new=np.zeros((1,c)) # placeholder first row; it is never removed, so it counts as one black row in the sums below
    j=0
    loc=[]
    loc_=[]
    for i in range(r1): # scan rows and drop the white margins
        if np.sum(img[i,:])<c*0.995*255: # keep rows in which more than 0.5% of the pixels are black
            img_new=np.vstack((img_new,img[i,:]))
    r2=img_new.shape[0] # scan columns and record candidate cut points
    while j<c:
        if np.sum(img_new[:,j])<r2*0.995*255: # white-to-black transition: the column carries information
            loc.append(j)
            j+=1
            while j<c:
                if np.sum(img_new[:,j])>r2*0.97*255: # black-to-white transition: a column more than 97% white is treated as empty
                    loc.append(j)
                    break
                j+=1
        j+=1
    i=0 # filter the candidate cut points
    while i < len(loc) - 1:
        if loc[i + 1] - loc[i] > 2: # minimum character width
            temp = img_new[:, loc[i]:loc[i + 1]]
            if np.sum(temp) < 0.97 * temp.shape[0] * temp.shape[1] * 255: # white share below 97%, i.e. more than 3% black pixels
                loc_.append(loc[i])
                loc_.append(loc[i+1])
                i+=2
            else:
                i+=1
        else:
            i += 1
    picture_value=np.zeros((int(len(loc_)/2),28*28))
    k=0
    i=0
    while i<len(loc_): # resize each cut character to 28x28 and store it as one row
        picture=img_new[:,loc_[i]:loc_[i+1]]
        picture=cv2.resize(picture,(28,28))
        picture=picture.reshape(-1,28*28)
        picture_value[k,:]=picture
        k+=1
        i+=2
    return picture_value
'3. ReLU function'
def Relu(Z):
    Z=(np.abs(Z)+Z)/2.0 # equals max(Z, 0) element-wise
    return Z
'4. Softmax function'
def Softmax(Z):
    col=Z.shape[1]
    for i in range(col): # each column is one sample; Z is modified in place
        col_max=np.max(Z[:,i])
        Z[:,i]-=col_max # subtract the column maximum for numerical stability
        Z[:,i]=np.exp(Z[:,i])
        Z[:,i]=Z[:,i]/np.sum(Z[:,i])
    return Z
'5. Forward propagation'
def Forwardpropagation(X,W1,b1,W2,b2,W3,b3,W4,b4):
    Z1=np.dot(W1,X)+b1
    A1=Relu(Z1)
    Z2=np.dot(W2,A1)+b2
    A2=Relu(Z2)
    Z3=np.dot(W3,A2)+b3
    A3=Relu(Z3)
    Z4=np.dot(W4,A3)+b4
    A4=Softmax(Z4)
    return A1,A2,A3,A4,Z1,Z2,Z3,Z4
'7. Prediction function'
def Predict(X,task): # predict the characters in the cut images
    # 48 labels = 4 tasks x 12 classes; row t of predict_pro covers labels 12*t .. 12*t+11
    label=['0','1','2','3','4','5','6','7','8','9','A','a','B','b','c','D','d','E','e','F','f','G','g','H','h','I','i','j','k','L','m','N','n','o','p','Q','q','R','r','S','T','t','u','v','w','x','y','z']
    predict_pro=np.zeros((4,12))
    predict_value=[]
    if task==1:
        for i in range(X.shape[0]):
            x = np.matrix(X[i, :])
            x = 255 - x # invert the image
            A4 = Forwardpropagation(x.reshape(28 * 28, -1), W11, b1, W12, b2, W13, b3, W14, b4)[3]
            predict_pro[0, :] = A4.T
            predict_value.append(label[np.argmax(predict_pro)])
    elif task==2:
        for i in range(X.shape[0]):
            x = np.matrix(X[i, :])
            x = 255 - x
            A4 = Forwardpropagation(x.reshape(28 * 28, -1), W21, b1, W22, b2, W23, b3, W24, b4)[3]
            predict_pro[1, :] = A4.T
            predict_value.append(label[np.argmax(predict_pro)])
    elif task==3:
        for i in range(X.shape[0]):
            x = np.matrix(X[i, :])
            x = 255 - x
            A4 = Forwardpropagation(x.reshape(28 * 28, -1), W31, b1, W32, b2, W33, b3, W34, b4)[3]
            predict_pro[2, :] = A4.T
            predict_value.append(label[np.argmax(predict_pro)])
    elif task==4:
        for i in range(X.shape[0]):
            x = np.matrix(X[i, :])
            x = 255 - x
            A4 = Forwardpropagation(x.reshape(28 * 28, -1), W41, b1, W42, b2, W43, b3, W44, b4)[3]
            predict_pro[3, :] = A4.T
            predict_value.append(label[np.argmax(predict_pro)])
    else: # mixed task: run all four subnetworks and keep the most confident output
        for i in range(X.shape[0]):
            x=np.matrix(X[i,:])
            x=255-x
            A4=Forwardpropagation(x.reshape(28*28,-1),W11,b1,W12,b2,W13,b3,W14,b4)[3]
            predict_pro[0,:]=A4.T
            A4=Forwardpropagation(x.reshape(28*28,-1), W21, b1, W22, b2, W23, b3, W24, b4)[3]
            predict_pro[1,:]=A4.T
            A4 = Forwardpropagation(x.reshape(28*28,-1), W31, b1, W32, b2, W33, b3, W34, b4)[3]
            predict_pro[2,:]=A4.T
            A4=Forwardpropagation(x.reshape(28*28,-1), W41, b1, W42, b2, W43, b3, W44, b4)[3]
            predict_pro[3,:]=A4.T
            predict_value.append(label[np.argmax(predict_pro)])
    return predict_value
'9. Run task 1'
img= cv2.imread(r'C:\Users\MSI-PC\Desktop\Task\IPM\picture1.png',0)
X= Cut_picture(img)
print("Task1:",Predict(X,1))
plt.figure()
plt.imshow(img,cmap="gray")
plt.title('Task1')
plt.show()
'10. Run task 2'
img= cv2.imread(r'C:\Users\MSI-PC\Desktop\Task\IPM\picture2.jpg',0)
X= Cut_picture(img)
print("Task2:",Predict(X,2))
plt.figure()
plt.imshow(img,cmap="gray")
plt.title('Task2')
plt.show()
'11. Run task 3'
img= cv2.imread(r'C:\Users\MSI-PC\Desktop\Task\IPM\picture3.jpg',0)
X= Cut_picture(img)
print("Task3:",Predict(X,3))
plt.figure()
plt.imshow(img,cmap="gray")
plt.title('Task3')
plt.show()
'12. Run task 4'
img= cv2.imread(r'C:\Users\MSI-PC\Desktop\Task\IPM\picture4.jpg',0)
X= Cut_picture(img)
print("Task4:",Predict(X,4))
plt.figure()
plt.imshow(img,cmap="gray")
plt.title('Task4')
plt.show()
'13. Run the mixed tasks'
img = cv2.imread(r'C:\Users\MSI-PC\Desktop\Task\IPM\picture5.jpg',0)
X= Cut_picture(img)
print("Multi_Task1:",Predict(X,5))
plt.figure()
plt.imshow(img,cmap="gray")
plt.title('Multi-Task1')
plt.show()
img = cv2.imread(r'C:\Users\MSI-PC\Desktop\Task\IPM\picture6.jpg',0)
X= Cut_picture(img)
print("Multi_Task2:",Predict(X,5))
plt.figure()
plt.imshow(img,cmap="gray")
plt.title('Multi-Task2')
plt.show()
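`Predict` stacks the four 12-class probability vectors into a 4x12 array and calls `np.argmax` without an axis, which returns an index into the flattened array. That index therefore equals `task_row * 12 + class_index` and can be used directly on the 48-entry label list. The sketch below reproduces this decoding in plain Python so the mapping can be checked by hand; `flat_argmax`, `decode` and the toy probabilities are hypothetical names and values introduced only for illustration.

```python
LABELS = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'a',
          'B', 'b', 'c', 'D', 'd', 'E', 'e', 'F', 'f', 'G', 'g', 'H',
          'h', 'I', 'i', 'j', 'k', 'L', 'm', 'N', 'n', 'o', 'p', 'Q',
          'q', 'R', 'r', 'S', 'T', 't', 'u', 'v', 'w', 'x', 'y', 'z']
CLASSES_PER_TASK = 12


def flat_argmax(rows):
    """Index of the largest value in the row-major flattening of `rows`
    (first occurrence on ties, as np.argmax does without an axis)."""
    flat = [p for row in rows for p in row]
    return max(range(len(flat)), key=flat.__getitem__)


def decode(rows):
    """Return (task number, label) for the most confident output."""
    idx = flat_argmax(rows)
    task_row, _class_idx = divmod(idx, CLASSES_PER_TASK)
    return task_row + 1, LABELS[idx]


# Toy output: only the task 2 subnetwork is confident, in its class 3.
probs = [[0.0] * CLASSES_PER_TASK for _ in range(4)]
probs[1] = [0.01] * CLASSES_PER_TASK
probs[1][3] = 0.89

result = decode(probs)
print(result)  # (2, 'D')
```

In the single-task branches only one row of the array is ever filled and the others stay at zero, so the same flattened index always lands inside that task's 12 labels.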