![9042871bc38ca9e4d4d57def53dc3db0.png](https://i-blog.csdnimg.cn/blog_migrate/ff8f0d67620e86159eb9790a5ab7e3c0.jpeg)
# 1D Convolutional Neural Networks for Sequence Problems
In computer vision we routinely use convolutional neural networks, because they extract image features automatically and process data efficiently. For sequence problems, we can treat the sequence as a one-dimensional space, i.e. the data shape discussed earlier: a vector (1D tensor).

Main advantages:

- On some sequence problems, a 1D convnet matches an RNN's performance at a lower computational cost.
- 1D convnets have seen great success in audio generation and machine translation.
- For simple tasks such as text classification and time-series forecasting, a small 1D convnet can replace an RNN and run faster.
## The Form of 1D Convolution

By analogy with 2D convolution, which transforms local patches of an image to extract features from each region, 1D convolution can be understood the same way: it extracts features from local 1D patches (subsequences) of the sequence.

Digging a little deeper, consider learning on text. Convolving over the words of a sentence lets the network learn a concrete local pattern that can then be recognized elsewhere. Concretely:

- With a convolution window of length 5, as shown in the figure below,
- the network can learn words (or word fragments) of length 5 or less,
- and it can then recognize those words at any position in the input sentence.
- A character-level 1D convnet can even learn word morphology.
![1481e116c0d6941c6171abf5c0c48e8d.png](https://i-blog.csdnimg.cn/blog_migrate/92addb7db853d056f47889b1020c33d1.jpeg)
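The sliding-window operation in the figure can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a single input channel and a single filter with `'valid'` padding (no Keras involved):

```python
import numpy as np

def conv1d(sequence, kernel):
    """Slide a window of len(kernel) over a 1D sequence ('valid' padding):
    each output step is the dot product of the window with the kernel."""
    k = len(kernel)
    return np.array([np.dot(sequence[i:i + k], kernel)
                     for i in range(len(sequence) - k + 1)])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([1.0, 0.0, -1.0])   # a simple "difference" pattern detector
print(conv1d(x, w))              # output length = 5 - 3 + 1 = 3
```

In a real `Conv1D` layer the kernel spans all input channels and there are many filters, but the per-position dot product is the same idea.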
## 1D Pooling for Sequence Data

By analogy with the pooling operation in 2D convnets: extract a window from the input 1D sequence and reduce it to its maximum (max pooling) or its average (average pooling). This is essentially 1D subsampling.
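The subsampling step can be sketched as follows. This is a minimal illustration assuming non-overlapping windows (stride equal to the pool size) and `'valid'` padding, which is how `MaxPooling1D(5)` behaves by default:

```python
import numpy as np

def pool1d(sequence, pool_size, mode="max"):
    """Non-overlapping 1D pooling (stride == pool_size, 'valid' padding):
    reduce each window to its max or its mean."""
    n = len(sequence) // pool_size            # drop the incomplete tail window
    windows = np.asarray(sequence[:n * pool_size]).reshape(n, pool_size)
    return windows.max(axis=1) if mode == "max" else windows.mean(axis=1)

x = [1, 3, 2, 8, 5, 4, 7]
print(pool1d(x, 2, "max"))    # [3, 8, 5]  (the trailing 7 is dropped)
print(pool1d(x, 2, "avg"))    # [2.0, 5.0, 4.5]
```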
## Building an IMDB Model with a 1D Convnet

In Keras, the 1D convolution layer is `Conv1D`:

```python
keras.layers.Conv1D(filters, kernel_size, strides=1, padding='valid',
                    data_format='channels_last', dilation_rate=1,
                    activation=None, use_bias=True,
                    kernel_initializer='glorot_uniform', bias_initializer='zeros',
                    kernel_regularizer=None, bias_regularizer=None,
                    activity_regularizer=None, kernel_constraint=None,
                    bias_constraint=None)
```
**Input shape**: 3D tensor with shape `(batch_size, steps, input_dim)`.

**Output shape**: 3D tensor with shape `(batch_size, new_steps, filters)`.
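With `padding='valid'` and stride 1, `new_steps = steps - kernel_size + 1`. As a quick sanity check, the arithmetic below traces the sequence length through the layers of the IMDB model built in this article (pure arithmetic, no Keras needed):

```python
def conv1d_steps(steps, kernel_size, strides=1, padding="valid"):
    """Output length of a Conv1D layer: 'valid' drops the positions where
    the kernel would overhang; 'same' pads so the length is preserved."""
    if padding == "same":
        return -(-steps // strides)            # ceiling division
    return (steps - kernel_size) // strides + 1

steps = conv1d_steps(500, 7)       # first Conv1D       -> 494
steps = steps // 5                 # MaxPooling1D(5)    -> 98
steps = conv1d_steps(steps, 7)     # second Conv1D      -> 92
print(steps)                       # 92
```

These lengths (494, 98, 92) match the `Output Shape` column in the `model.summary()` shown below.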
```python
# Prepare the data
from keras.datasets import imdb
from keras.preprocessing import sequence

max_features = 10000  # number of words to consider as features
max_len = 500         # cut off reviews after this many words

print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')

# Pad/truncate every review to the same length
print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=max_len)
x_test = sequence.pad_sequences(x_test, maxlen=max_len)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
```
```
Using TensorFlow backend.
Loading data...
25000 train sequences
25000 test sequences
Pad sequences (samples x time)
x_train shape: (25000, 500)
x_test shape: (25000, 500)
```
## Building the Model

The model is a stack of:

- `Conv1D` and `MaxPooling1D` layers,
- followed by a global pooling layer or a `Flatten` layer, which turns the 3D output into a 2D one,
- and finally one or more `Dense` layers for classification or regression.

The code is as follows:
```python
from keras.models import Sequential
from keras import layers
from keras.optimizers import RMSprop

model = Sequential()
model.add(layers.Embedding(max_features, 128, input_length=max_len))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.MaxPooling1D(5))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.GlobalMaxPooling1D())
model.add(layers.Dense(1, activation='sigmoid'))  # sigmoid for binary classification
model.summary()
```
```
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_2 (Embedding)      (None, 500, 128)          1280000
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 494, 32)           28704
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 98, 32)            0
_________________________________________________________________
conv1d_4 (Conv1D)            (None, 92, 32)            7200
_________________________________________________________________
global_max_pooling1d_2 (Glob (None, 32)                0
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 33
=================================================================
Total params: 1,315,937
Trainable params: 1,315,937
Non-trainable params: 0
_________________________________________________________________
```
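The parameter counts in the summary can be verified by hand: an Embedding holds vocab × dim weights, a Conv1D holds kernel_size × input_channels × filters weights plus one bias per filter, and a Dense layer holds inputs × units weights plus biases:

```python
embedding = 10000 * 128      # max_features * embedding_dim
conv1 = 7 * 128 * 32 + 32    # kernel * in_channels * filters + biases
conv2 = 7 * 32 * 32 + 32     # second conv sees 32 channels
dense = 32 * 1 + 1           # GlobalMaxPooling1D output (32) -> 1 unit
total = embedding + conv1 + conv2 + dense
print(total)                 # 1,315,937 — matches model.summary()
```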
```python
model.compile(optimizer=RMSprop(lr=1e-4),
              loss='binary_crossentropy',
              metrics=['acc'])
history = model.fit(x_train, y_train,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)
```
```
Train on 20000 samples, validate on 5000 samples
Epoch 1/10
20000/20000 [==============================] - 6s 276us/step - loss: 0.7679 - acc: 0.5200 - val_loss: 0.6864 - val_acc: 0.5700
Epoch 2/10
20000/20000 [==============================] - 2s 89us/step - loss: 0.6682 - acc: 0.6618 - val_loss: 0.6665 - val_acc: 0.6352
Epoch 3/10
20000/20000 [==============================] - 2s 90us/step - loss: 0.6258 - acc: 0.7547 - val_loss: 0.6145 - val_acc: 0.7174
Epoch 4/10
20000/20000 [==============================] - 2s 89us/step - loss: 0.5332 - acc: 0.8044 - val_loss: 0.4957 - val_acc: 0.8056
Epoch 5/10
20000/20000 [==============================] - 2s 88us/step - loss: 0.4122 - acc: 0.8430 - val_loss: 0.4312 - val_acc: 0.8338
Epoch 6/10
20000/20000 [==============================] - 2s 88us/step - loss: 0.3445 - acc: 0.8679 - val_loss: 0.4125 - val_acc: 0.8442
Epoch 7/10
20000/20000 [==============================] - 2s 89us/step - loss: 0.3053 - acc: 0.8723 - val_loss: 0.3937 - val_acc: 0.8370
Epoch 8/10
20000/20000 [==============================] - 2s 88us/step - loss: 0.2730 - acc: 0.8618 - val_loss: 0.4081 - val_acc: 0.8222
Epoch 9/10
20000/20000 [==============================] - 2s 89us/step - loss: 0.2480 - acc: 0.8425 - val_loss: 0.4634 - val_acc: 0.7872
Epoch 10/10
20000/20000 [==============================] - 2s 88us/step - loss: 0.2252 - acc: 0.8300 - val_loss: 0.4411 - val_acc: 0.7730
```
```python
# Plot the training curves
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
```
![afcf7e1270648119acf8fbe4dae091b3.png](https://i-blog.csdnimg.cn/blog_migrate/d95c8d3329902e96dde3d64dac63fd3c.jpeg)
![07267d1cf965e905bec037714dfba9aa.png](https://i-blog.csdnimg.cn/blog_migrate/383bf57baf6712bd336637b02f028530.jpeg)
## Results

The figures above show the training and validation curves. Validation accuracy is slightly lower than an LSTM's on the same task, but the model runs faster on both CPU and GPU.