1. Convolutional structure
A convolutional neural network consists of convolutional layers, subsampling layers, and fully connected layers. Each layer has multiple feature maps, each feature map extracts one kind of feature from the input through a convolution filter, and each feature map contains multiple neurons.
Convolutional layer: the convolution operation enhances the features of the original signal and reduces noise. Its core is a convolution applied to the input image: each convolution operation corresponds to one kernel, and each kernel yields one feature map. In the figure above, the first convolution yields four feature maps and the second convolution uses three kernels; each convolution reduces the spatial resolution of the image.
The number of feature maps in a convolutional layer is specified at network initialization, while the size of each map is determined by the kernel size and the size of the previous layer's maps: if the previous layer's map is n*n and the kernel is k*k, the layer's map size is (n-k+1)*(n-k+1).
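As a quick check of this formula, a small helper (hypothetical, not part of the original code) computes the side length of a valid convolution's output:

```python
def conv_output_size(n, k):
    """Side length of the feature map produced by a valid convolution
    (no padding, stride 1) of a k*k kernel over an n*n input."""
    return n - k + 1

# A 28*28 input convolved with a 5*5 kernel yields a 24*24 feature map.
print(conv_output_size(28, 5))  # -> 24
```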
Subsampling layer: by the principle of local correlation in images, subsampling reduces the amount of computation while retaining the useful information and providing a degree of translation invariance. It compresses several pixels into one, as in the figure above where two pixels are merged into one.
The subsampling layer samples the previous layer's maps by aggregating statistics over adjacent small regions of size scale*scale. Some implementations take the maximum of each region, while the ToolBox implementation takes the mean over 2*2 regions.
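As a minimal sketch in plain Python (the names are illustrative), aggregating 2*2 regions by their mean works like this; swapping `max` in for the mean gives the max-pooling variant:

```python
def mean_pool_2x2(img):
    """Aggregate each non-overlapping 2*2 region of a 2-D image
    (a list of equal-length rows, with even height and width) into its mean."""
    h, w = len(img), len(img[0])
    return [[(img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

img = [[1, 3, 2, 4],
       [5, 7, 6, 8],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
print(mean_pool_2x2(img))  # -> [[4.0, 5.0], [0.0, 1.0]]
```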
Fully connected layer: a softmax fully connected layer whose activations are the image features extracted by the convolutional neural network.
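The softmax that turns these final activations into class probabilities can be sketched in plain Python (a numerically stabilized version, for illustration only):

```python
import math

def softmax(logits):
    """Map a list of raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # the largest logit gets the largest probability
print(sum(probs))  # -> 1.0 (up to floating-point error)
```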
2. Training algorithm
The training algorithm consists of four steps, grouped into two phases:
Phase one, forward propagation:
(1) take a sample from the sample set and feed it into the network;
(2) compute the corresponding actual output. In this phase, information is transformed stage by stage from the input layer to the output layer. This is also the process the network performs in normal operation after training is complete.
Phase two, backward propagation:
(1) compute the difference between the actual output and the corresponding target output;
(2) adjust the weight matrices so as to minimize the error.
Both phases are generally governed by an accuracy requirement.
The training procedure is as follows:
(1) choose the training set by randomly drawing N samples from the sample set;
(2) set all weights and thresholds to small random values near 0, and initialize the accuracy-control parameter and the learning rate;
(3) take an input pattern from the training set, feed it to the network, and supply its target output vector;
(4) compute the hidden-layer output vectors, then the network's actual output vector;
(5) compare the elements of the output vector with those of the target vector and compute the output error; errors also need to be computed for the hidden units;
(6) compute the adjustment for each weight and threshold in turn;
(7) apply the weight and threshold adjustments;
(8) after M iterations, check whether the error meets the accuracy requirement; if not, return to (3) and keep iterating; if so, go on to the next step;
(9) training ends; save the weights and thresholds to a file. The weights can now be considered stable and the classifier formed. For subsequent training, load the weights and thresholds from the file directly; no re-initialization is needed.
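Steps (3)-(8) above can be sketched for the simplest possible case: a single weight trained by gradient descent on a squared error (a toy illustration, not the CNN itself):

```python
def train_single_weight(samples, lr=0.1, epochs=100):
    """Fit y = w * x by gradient descent on the loss 0.5 * (y - t)^2."""
    w = 0.01  # small near-zero initial value, as in step (2)
    for _ in range(epochs):      # step (8): iterate until good enough
        for x, t in samples:     # step (3): take an input pattern
            y = w * x            # step (4): forward pass
            grad = (y - t) * x   # steps (5)-(6): error and adjustment
            w -= lr * grad       # step (7): apply the update
    return w

# With the single sample x=1, t=2 the weight converges towards 2.
print(train_single_weight([(1.0, 2.0)]))
```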
3. Experimental results
The experiment code and result figures below were adapted from material found online.
3.1 Forward and backward passes of fully connected layers
#%%
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import datasets
import os
#%%
x = tf.random.normal([2,28*28])
w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1))
b1 = tf.Variable(tf.zeros([256]))
o1 = tf.matmul(x,w1) + b1
o1
#%%
x = tf.random.normal([4,28*28])
fc1 = layers.Dense(256, activation=tf.nn.relu)
fc2 = layers.Dense(128, activation=tf.nn.relu)
fc3 = layers.Dense(64, activation=tf.nn.relu)
fc4 = layers.Dense(10, activation=None)
h1 = fc1(x)
h2 = fc2(h1)
h3 = fc3(h2)
h4 = fc4(h3)
model = tf.keras.Sequential([
layers.Dense(256, activation=tf.nn.relu) ,
layers.Dense(128, activation=tf.nn.relu) ,
layers.Dense(64, activation=tf.nn.relu) ,
layers.Dense(10, activation=None) ,
])
out = model(x)
#%%
256*784+256+128*256+128+64*128+64+10*64+10
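The expression above sums, per layer, the weights (in*out) plus the biases (out) of the 784-256-128-64-10 network; a small helper (hypothetical) makes that explicit:

```python
def dense_params(layer_dims):
    """Total trainable parameters of a stack of fully connected layers:
    each layer contributes in*out weights plus out biases."""
    return sum(i * o + o for i, o in zip(layer_dims, layer_dims[1:]))

print(dense_params([784, 256, 128, 64, 10]))  # -> 242762
```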
#%%
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# x: [60k, 28, 28],[10k, 28, 28]
# y: [60k],[10K]
(x, y), (x_test,y_test) = datasets.mnist.load_data()
# x: [0~255] => [0~1.]
x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.
y = tf.convert_to_tensor(y, dtype=tf.int32)
x_test = tf.convert_to_tensor(x_test, dtype=tf.float32) / 255.
y_test = tf.convert_to_tensor(y_test, dtype=tf.int32)
print(x.shape, y.shape, x.dtype, y.dtype)
print(tf.reduce_min(x), tf.reduce_max(x))
print(tf.reduce_min(y), tf.reduce_max(y))
train_db = tf.data.Dataset.from_tensor_slices((x,y)).batch(128)
test_db = tf.data.Dataset.from_tensor_slices((x_test,y_test)).batch(128)
train_iter = iter(train_db)
sample = next(train_iter)
print('batch:', sample[0].shape, sample[1].shape)
# [b, 784] => [b, 256] => [b, 128] => [b, 10]
# [dim_in, dim_out], [dim_out]
# tensors for hidden layer 1
w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1))
b1 = tf.Variable(tf.zeros([256]))
# tensors for hidden layer 2
w2 = tf.Variable(tf.random.truncated_normal([256, 128], stddev=0.1))
b2 = tf.Variable(tf.zeros([128]))
# tensors for hidden layer 3
w3 = tf.Variable(tf.random.truncated_normal([128, 64], stddev=0.1))
b3 = tf.Variable(tf.zeros([64]))
# output layer tensors
w4 = tf.Variable(tf.random.truncated_normal([64, 10], stddev=0.1))
b4 = tf.Variable(tf.zeros([10]))
lr = 1e-3
for epoch in range(10):  # iterate over the dataset 10 times
    for step, (x, y) in enumerate(train_db):  # for every batch
        # x: [128, 28, 28]
        # y: [128]
        # [b, 28, 28] => [b, 28*28]
        x = tf.reshape(x, [-1, 28*28])
        with tf.GradientTape() as tape:  # records gradients of tf.Variable by default
            # x: [b, 28*28]
            # hidden layer 1 forward pass, [b, 28*28] => [b, 256]
            h1 = x @ w1 + tf.broadcast_to(b1, [x.shape[0], 256])
            h1 = tf.nn.relu(h1)
            # hidden layer 2 forward pass, [b, 256] => [b, 128]
            h2 = h1 @ w2 + b2
            h2 = tf.nn.relu(h2)
            # hidden layer 3 forward pass, [b, 128] => [b, 64]
            h3 = h2 @ w3 + b3
            h3 = tf.nn.relu(h3)
            # output layer forward pass, [b, 64] => [b, 10]
            h4 = h3 @ w4 + b4
            out = h4
            # compute loss
            # out: [b, 10]
            # y: [b] => [b, 10]
            y_onehot = tf.one_hot(y, depth=10)
            # mse = mean((y - out)^2)
            # [b, 10]
            loss = tf.square(y_onehot - out)
            # mean: scalar
            loss = tf.reduce_mean(loss)
        # compute gradients
        grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3, w4, b4])
        # w1 = w1 - lr * w1_grad
        w1.assign_sub(lr * grads[0])
        b1.assign_sub(lr * grads[1])
        w2.assign_sub(lr * grads[2])
        b2.assign_sub(lr * grads[3])
        w3.assign_sub(lr * grads[4])
        b3.assign_sub(lr * grads[5])
        w4.assign_sub(lr * grads[6])
        b4.assign_sub(lr * grads[7])
        if step % 100 == 0:
            print(epoch, step, 'loss:', float(loss))
# test/evaluation
total_correct, total_num = 0, 0
for step, (x, y) in enumerate(test_db):
    # [b, 28, 28] => [b, 28*28]
    x = tf.reshape(x, [-1, 28*28])
    # [b, 784] => [b, 256] => [b, 128] => [b, 64] => [b, 10]
    h1 = tf.nn.relu(x @ w1 + b1)
    h2 = tf.nn.relu(h1 @ w2 + b2)
    h3 = tf.nn.relu(h2 @ w3 + b3)
    out = h3 @ w4 + b4
    # out: [b, 10]
    # prob: [b, 10] in [0, 1]
    prob = tf.nn.softmax(out, axis=1)
    # int64
    pred = tf.argmax(prob, axis=1)
    pred = tf.cast(pred, tf.int32)
    # y: [b], int32
    correct = tf.cast(tf.equal(pred, y), dtype=tf.int32)
    correct = tf.reduce_sum(correct)
    total_correct += int(correct)
    total_num += x.shape[0]
acc = total_correct / total_num
print('test acc:', acc)
#%%
3.2 Training a fully connected neural network
The dataset is the Auto MPG dataset: the features are the number of cylinders, displacement, horsepower, weight, acceleration, model year, and the (one-hot encoded) place of origin; the label is the fuel efficiency in miles per gallon (MPG).
The network has only three layers. It is trained on the training set and evaluated on the test set, yielding the loss curves on both datasets.
As training proceeds, the average loss on the training set keeps fluctuating, while on the test set, since the parameters fitted on the training set already capture the overall characteristics of the data, the average loss levels off.
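The mean absolute error tracked in those curves can be sketched in plain Python (the TensorFlow code below uses `losses.MAE` for the same quantity):

```python
def mae(targets, preds):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(targets, preds)) / len(targets)

print(mae([3.0, 1.0], [2.0, 2.0]))  # -> 1.0
```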
#%%
from __future__ import absolute_import, division, print_function, unicode_literals
import pathlib
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, losses
print(tf.__version__)
# download the Auto MPG dataset
dataset_path = keras.utils.get_file("auto-mpg.data", "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data")
# fuel efficiency (miles per gallon), cylinders, displacement, horsepower, weight,
# acceleration, model year, origin
column_names = ['MPG','Cylinders','Displacement','Horsepower','Weight',
'Acceleration', 'Model Year', 'Origin']
raw_dataset = pd.read_csv(dataset_path, names=column_names,
na_values = "?", comment='\t',
sep=" ", skipinitialspace=True)
dataset = raw_dataset.copy()
# inspect some of the data
dataset.tail()
dataset.head()
dataset
#%%
# count missing values, then drop them
dataset.isna().sum()
dataset = dataset.dropna()
dataset.isna().sum()
dataset
#%%
# handle the categorical column: Origin values 1, 2, 3 correspond to USA, Europe, Japan
# pop this column
origin = dataset.pop('Origin')
# write new one-hot columns based on the origin column
dataset['USA'] = (origin == 1)*1.0
dataset['Europe'] = (origin == 2)*1.0
dataset['Japan'] = (origin == 3)*1.0
dataset.tail()
# split into a training set and a test set
train_dataset = dataset.sample(frac=0.8,random_state=0)
test_dataset = dataset.drop(train_dataset.index)
#%% statistics of the data
sns.pairplot(train_dataset[["Cylinders", "Displacement", "Weight", "MPG"]],
diag_kind="kde")
#%%
# inspect statistics of the training inputs X
train_stats = train_dataset.describe()
train_stats.pop("MPG")
train_stats = train_stats.transpose()
train_stats
# pop the MPG fuel-efficiency column as the ground-truth label Y
train_labels = train_dataset.pop('MPG')
test_labels = test_dataset.pop('MPG')
# standardize the data
def norm(x):
    return (x - train_stats['mean']) / train_stats['std']
normed_train_data = norm(train_dataset)
normed_test_data = norm(test_dataset)
#%%
print(normed_train_data.shape,train_labels.shape)
print(normed_test_data.shape, test_labels.shape)
#%%
class Network(keras.Model):
    # regression network
    def __init__(self):
        super(Network, self).__init__()
        # create 3 fully connected layers
        self.fc1 = layers.Dense(64, activation='relu')
        self.fc2 = layers.Dense(64, activation='relu')
        self.fc3 = layers.Dense(1)

    def call(self, inputs, training=None, mask=None):
        # pass through the 3 layers in turn
        x = self.fc1(inputs)
        x = self.fc2(x)
        x = self.fc3(x)
        return x
model = Network()
model.build(input_shape=(None, 9))
model.summary()
optimizer = tf.keras.optimizers.RMSprop(0.001)
train_db = tf.data.Dataset.from_tensor_slices((normed_train_data.values, train_labels.values))
train_db = train_db.shuffle(100).batch(32)
# # test with the untrained model
# example_batch = normed_train_data[:10]
# example_result = model.predict(example_batch)
# example_result
train_mae_losses = []
test_mae_losses = []
for epoch in range(200):
    for step, (x, y) in enumerate(train_db):
        with tf.GradientTape() as tape:
            out = model(x)
            loss = tf.reduce_mean(losses.MSE(y, out))
            mae_loss = tf.reduce_mean(losses.MAE(y, out))
        if step % 10 == 0:
            print(epoch, step, float(loss))
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
    train_mae_losses.append(float(mae_loss))
    out = model(tf.constant(normed_test_data.values))
    test_mae_losses.append(tf.reduce_mean(losses.MAE(test_labels, out)))
plt.figure()
plt.xlabel('Epoch')
plt.ylabel('MAE')
plt.plot(train_mae_losses, label='Train')
plt.plot(test_mae_losses, label='Test')
plt.legend()
# plt.ylim([0,10])
plt.savefig('auto.svg')
plt.show()
#%%
These are the loss values on the training set, printed every ten steps: the loss drops sharply within the first few epochs and then fluctuates slightly.
3.3 Visualizing the convolution and subsampling layers
For the image file, place any image named img.jpg in the same directory as the code file.
# -*- coding: utf-8 -*-
from theano.tensor.nnet import conv
import theano.tensor as T
import numpy, theano
rng = numpy.random.RandomState(23455)
# symbolic variable
input = T.tensor4(name='input')
# initialize the weights of the convolution kernels
w_shape = (2, 3, 9, 9)  # two kernels, three channels, kernel size 9*9
w_bound = numpy.sqrt(3 * 9 * 9)
W = theano.shared(numpy.asarray(rng.uniform(low=-1.0 / w_bound, high=1.0 / w_bound, size=w_shape),
                                dtype=input.dtype), name='W')
b_shape = (2,)
b = theano.shared(numpy.asarray(rng.uniform(low=-.5, high=.5, size=b_shape),
                                dtype=input.dtype), name='b')
conv_out = conv.conv2d(input, W)
# T.TensorVariable.dimshuffle() can reshape or broadcast (add dimension)
# dimshuffle(self,*pattern)
# >>>b1 = b.dimshuffle('x',0,'x','x')
# >>>b1.shape.eval()
# array([1,2,1,1])
output = T.nnet.sigmoid(conv_out + b.dimshuffle('x', 0, 'x', 'x'))
f = theano.function([input], output)
# demo
import pylab
from PIL import Image
from matplotlib.pyplot import *
# open the image and preprocess its data
img = Image.open("img.jpg")
width, height = img.size
img = numpy.asarray(img, dtype='float32') / 256.  # (height, width, 3)
# convert the image into a 4-D tensor (1, 3, height, width)
img_rgb = img.swapaxes(0, 2).swapaxes(1, 2)  # (3, height, width)
minibatch_img = img_rgb.reshape(1, 3, height, width)
filtered_img = f(minibatch_img)
# show the original image and the two convolution outputs
pylab.figure(1)
pylab.subplot(1, 3, 1)
pylab.axis('off')
pylab.imshow(img)
title('origin image')
pylab.gray()
pylab.subplot(2, 3, 2)
pylab.axis("off")
pylab.imshow(filtered_img[0, 0, :, :])  # output of the first kernel
title('convolution 1')
pylab.subplot(2, 3, 3)
pylab.axis("off")
pylab.imshow(filtered_img[0, 1, :, :])  # output of the second kernel
title('convolution 2')
# pylab.show()
# max-pooling subsampling
from theano.tensor.signal.pool import pool_2d
input = T.tensor4('input')
maxpool_shape = (2, 2)  # pooling size 2*2
pooled_img = pool_2d(input, maxpool_shape, ignore_border=False)
maxpool = theano.function(inputs=[input],
                          outputs=[pooled_img])
pooled_res = numpy.squeeze(maxpool(filtered_img))
# pylab.figure(2)
pylab.subplot(2, 3, 5)
pylab.axis('off')
pylab.imshow(pooled_res[0, :, :])
title('down sampled 1')
pylab.subplot(2, 3, 6)
pylab.axis('off')
pylab.imshow(pooled_res[1, :, :])
title('down sampled 2')
pylab.show()
3.4 A complete neural network
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, Sequential

class BasicBlock(layers.Layer):
    # residual block
    def __init__(self, filter_num, stride=1):
        super(BasicBlock, self).__init__()
        # first convolution unit
        self.conv1 = layers.Conv2D(filter_num, (3, 3), strides=stride, padding='same')
        self.bn1 = layers.BatchNormalization()
        self.relu = layers.Activation('relu')
        # second convolution unit
        self.conv2 = layers.Conv2D(filter_num, (3, 3), strides=1, padding='same')
        self.bn2 = layers.BatchNormalization()
        if stride != 1:  # match shapes with a 1x1 convolution
            self.downsample = Sequential()
            self.downsample.add(layers.Conv2D(filter_num, (1, 1), strides=stride))
        else:  # shapes already match, identity shortcut
            self.downsample = lambda x: x

    def call(self, inputs, training=None):
        # [b, h, w, c], pass through the first convolution unit
        out = self.conv1(inputs)
        out = self.bn1(out)
        out = self.relu(out)
        # pass through the second convolution unit
        out = self.conv2(out)
        out = self.bn2(out)
        # identity (shortcut) branch
        identity = self.downsample(inputs)
        # add the outputs of the two branches
        output = layers.add([out, identity])
        output = tf.nn.relu(output)  # activation function
        return output

class ResNet(keras.Model):
    # generic ResNet implementation
    def __init__(self, layer_dims, num_classes=10):  # e.g. [2, 2, 2, 2]
        super(ResNet, self).__init__()
        # stem network for preprocessing
        self.stem = Sequential([layers.Conv2D(64, (3, 3), strides=(1, 1)),
                                layers.BatchNormalization(),
                                layers.Activation('relu'),
                                layers.MaxPool2D(pool_size=(2, 2), strides=(1, 1), padding='same')
                                ])
        # stack 4 blocks; each contains several BasicBlocks with differing strides
        self.layer1 = self.build_resblock(64, layer_dims[0])
        self.layer2 = self.build_resblock(128, layer_dims[1], stride=2)
        self.layer3 = self.build_resblock(256, layer_dims[2], stride=2)
        self.layer4 = self.build_resblock(512, layer_dims[3], stride=2)
        # reduce height and width to 1x1 with pooling
        self.avgpool = layers.GlobalAveragePooling2D()
        # final fully connected classification layer
        self.fc = layers.Dense(num_classes)

    def call(self, inputs, training=None):
        # stem network
        x = self.stem(inputs)
        # pass through the 4 blocks in turn
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        # pooling layer
        x = self.avgpool(x)
        # fully connected layer
        x = self.fc(x)
        return x

    def build_resblock(self, filter_num, blocks, stride=1):
        # helper: stack `blocks` BasicBlocks
        res_blocks = Sequential()
        # only the first BasicBlock may have stride != 1, performing downsampling
        res_blocks.add(BasicBlock(filter_num, stride))
        for _ in range(1, blocks):  # the remaining BasicBlocks all use stride 1
            res_blocks.add(BasicBlock(filter_num, stride=1))
        return res_blocks

def resnet18():
    # ResNet-18: [2, 2, 2, 2] BasicBlocks per stage
    return ResNet([2, 2, 2, 2])
Training the network on CIFAR-10:
import tensorflow as tf
from tensorflow.keras import layers, optimizers, datasets, Sequential
import os
from resnet import resnet18

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.random.set_seed(2345)

def preprocess(x, y):
    # map the data to [-1, 1]
    x = 2 * tf.cast(x, dtype=tf.float32) / 255. - 1
    y = tf.cast(y, dtype=tf.int32)  # type conversion
    return x, y

(x, y), (x_test, y_test) = datasets.cifar10.load_data()  # load the dataset
y = tf.squeeze(y, axis=1)  # drop the unnecessary dimension
y_test = tf.squeeze(y_test, axis=1)
print(x.shape, y.shape, x_test.shape, y_test.shape)

train_db = tf.data.Dataset.from_tensor_slices((x, y))  # build the training set
# shuffle, preprocess, batch
train_db = train_db.shuffle(1000).map(preprocess).batch(512)
test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test))  # build the test set
# preprocess, batch (no shuffling for the test set)
test_db = test_db.map(preprocess).batch(512)

# draw one sample
sample = next(iter(train_db))
print('sample:', sample[0].shape, sample[1].shape,
      tf.reduce_min(sample[0]), tf.reduce_max(sample[0]))

def main():
    # [b, 32, 32, 3] => [b, 10]
    model = resnet18()  # ResNet-18 network
    model.build(input_shape=(None, 32, 32, 3))
    model.summary()  # print the parameter counts
    optimizer = optimizers.Adam(learning_rate=1e-4)  # build the optimizer
    for epoch in range(100):  # training epochs
        for step, (x, y) in enumerate(train_db):
            with tf.GradientTape() as tape:
                # [b, 32, 32, 3] => [b, 10], forward pass
                logits = model(x)
                # [b] => [b, 10], one-hot encoding
                y_onehot = tf.one_hot(y, depth=10)
                # cross-entropy loss
                loss = tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True)
                loss = tf.reduce_mean(loss)
            # compute gradients
            grads = tape.gradient(loss, model.trainable_variables)
            # update the network parameters
            optimizer.apply_gradients(zip(grads, model.trainable_variables))
            if step % 50 == 0:
                print(epoch, step, 'loss:', float(loss))
        # evaluate on the test set
        total_num = 0
        total_correct = 0
        for x, y in test_db:
            logits = model(x)
            prob = tf.nn.softmax(logits, axis=1)
            pred = tf.argmax(prob, axis=1)
            pred = tf.cast(pred, dtype=tf.int32)
            correct = tf.cast(tf.equal(pred, y), dtype=tf.int32)
            correct = tf.reduce_sum(correct)
            total_num += x.shape[0]
            total_correct += int(correct)
        acc = total_correct / total_num
        print(epoch, 'acc:', acc)

if __name__ == '__main__':
    main()
As training proceeds, the test accuracy keeps improving and the loss keeps decreasing. The network has many parameters and the dataset is fairly large, so training is somewhat slow, but it eventually reaches a high accuracy.