>- **🍨 This post is a learning log from the [🔗365天深度学习训练营](https://mp.weixin.qq.com/s/rbOOmire8OocQ90QM78DRA) (365-Day Deep Learning Training Camp)**
>- **🍖 Original author: [K同学啊 | tutoring and custom projects](https://mtyjkh.blog.csdn.net/)**

- System environment: WIN10-WSL2-Ubuntu22.04
- Language environment: Python 3.9.18
- Deep learning environment: PyTorch 2.1.2
- GPU: NVIDIA Tesla P40
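
This week's topic is DenseNet, the densely connected convolutional network.
Its core idea is that every layer is densely connected to all preceding layers (dense connection), so each layer can reuse the features produced by every earlier layer.
Dense connections are implemented by concatenating feature maps, which requires all of them to have the same spatial size, so pooling layers cannot simply be inserted between layers to shrink the feature maps.
To solve this, DenseNet uses a DenseBlock + Transition structure: inside a DenseBlock the feature-map size stays constant so the layers can be densely connected, and each Transition layer uses a pooling layer to reduce the feature-map size.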
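
The shape bookkeeping behind this design can be sketched without any framework. The following is a minimal illustration, not the author's code: the helper names, the 56x56x64 input, the growth rate of 32 (the number of channels each layer adds) and the compression rate of 0.5 are all assumptions chosen for the example.

```python
def concat_channels(shapes):
    """Shape after concatenating (height, width, channels) maps on the channel axis."""
    height, width, _ = shapes[0]
    for h, w, _ in shapes:
        if (h, w) != (height, width):
            raise ValueError(f"cannot concatenate {h}x{w} with {height}x{width}")
    return (height, width, sum(c for _, _, c in shapes))


def dense_block_shapes(input_shape, growth_rate, num_layers):
    """Return the input shape seen by each layer and the block's output shape."""
    features = [input_shape]  # every feature map produced so far
    layer_inputs = []
    for _ in range(num_layers):
        x = concat_channels(features)  # dense connection: reuse all earlier maps
        layer_inputs.append(x)
        features.append((x[0], x[1], growth_rate))  # each layer adds growth_rate channels
    return layer_inputs, concat_channels(features)


def transition_shape(shape, compression_rate=0.5):
    """A 1x1 conv compresses the channels, then 2x2 pooling (stride 2) halves height and width."""
    height, width, channels = shape
    return (height // 2, width // 2, int(channels * compression_rate))


# Hypothetical example: a 4-layer block on a 56x56x64 input.
layer_inputs, block_out = dense_block_shapes((56, 56, 64), growth_rate=32, num_layers=4)
print(layer_inputs)
print(block_out)
print(transition_shape(block_out))
```

Inside the block only the channel count grows, while the height and width never change. Trying to concatenate maps of different spatial sizes raises an error, which is why downsampling is confined to the Transition step.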
The code implementing DenseNet in TensorFlow is:
import tensorflow as tf  # needed for tf.concat, tf.keras.Sequential and the optimizer below
from tensorflow.keras.layers import (
    Activation, AveragePooling2D, BatchNormalization, Conv2D, Dense,
    GlobalAveragePooling2D, MaxPooling2D, ZeroPadding2D,
)
from tensorflow.keras.models import Model

class DenseLayer(Model):
    """BN -> ReLU -> 1x1 conv (bottleneck) -> BN -> ReLU -> 3x3 conv."""

    def __init__(self, bottleneck_size, growth_rate):
        super().__init__()
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.c1 = Conv2D(filters=bottleneck_size, kernel_size=(1, 1), strides=1)
        self.b2 = BatchNormalization()
        self.a2 = Activation('relu')
        self.c2 = Conv2D(filters=growth_rate, kernel_size=(3, 3), strides=1, padding='same')

    def call(self, x):
        x = self.b1(x)
        x = self.a1(x)
        x = self.c1(x)
        x = self.b2(x)
        x = self.a2(x)
        return self.c2(x)

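class DenseBlock(Model):
    """Stack of DenseLayers; each layer's output is concatenated onto its input."""

    def __init__(self, num_layers, growth_rate):
        super().__init__()
        self.dense_layers = [DenseLayer(4 * growth_rate, growth_rate) for _ in range(num_layers)]

    def call(self, x):
        for layer in self.dense_layers:
            new_x = layer(x)
            # Dense connection: append the new features along the channel axis
            x = tf.concat([x, new_x], axis=-1)
        return x
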
class Transition(Model):
    """BN -> ReLU -> 1x1 conv (channel compression) -> 2x2 average pooling."""

    def __init__(self, filters, compression_rate=0.5):
        super().__init__()
        self.b = BatchNormalization()
        self.a = Activation('relu')
        self.c = Conv2D(int(filters * compression_rate), kernel_size=(1, 1), strides=1)
        self.p = AveragePooling2D(pool_size=(2, 2), strides=2)

    def call(self, x):
        x = self.b(x)
        x = self.a(x)
        x = self.c(x)
        return self.p(x)

class DenseNet(Model):
    # A tuple default avoids sharing one mutable list between instances
    def __init__(self, block_list=(6, 12, 24, 16), compression_rate=0.5, initial_filters=64):
        super().__init__()
        self.padding = ZeroPadding2D(((3, 3), (3, 3)))  # Adjusted padding for 224x224 input
        self.c1 = Conv2D(initial_filters, kernel_size=(7, 7), strides=2, padding='valid')
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.p1 = MaxPooling2D(pool_size=(3, 3), strides=2, padding='same')

        self.blocks = tf.keras.Sequential()
        filters = initial_filters  # running channel count, used to size each Transition
        for i, num_layers in enumerate(block_list):
            self.blocks.add(DenseBlock(num_layers, growth_rate=32))
            filters += 32 * num_layers  # each DenseLayer adds growth_rate channels
            if i != len(block_list) - 1:  # No transition layer after the last block
                self.blocks.add(Transition(filters, compression_rate))
                filters = int(filters * compression_rate)

        self.p2 = GlobalAveragePooling2D()
        self.d2 = Dense(1000, activation='softmax')

    def call(self, inputs):
        x = self.padding(inputs)
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)
        x = self.blocks(x)
        x = self.p2(x)
        return self.d2(x)

model = DenseNet()
print(model)
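
To sanity-check the model definition above, the spatial size and channel count at each stage can be traced by hand. The sketch below mirrors the loop in the `DenseNet` constructor in plain Python; the helper names `conv_out`, `same_out` and `trace_densenet` are hypothetical, and a 224x224 input is assumed, as in the padding comment.

```python
import math


def conv_out(size, kernel, stride, pad):
    """Output size of a conv/pool layer with explicit symmetric zero padding."""
    return (size + 2 * pad - kernel) // stride + 1


def same_out(size, stride):
    """Output size of a layer that uses padding='same'."""
    return math.ceil(size / stride)


def trace_densenet(input_size=224, block_list=(6, 12, 24, 16),
                   growth_rate=32, compression_rate=0.5, initial_filters=64):
    """Return a list of (stage name, spatial size, channels)."""
    size = conv_out(input_size, kernel=7, stride=2, pad=3)  # ZeroPadding2D(3) + 7x7 'valid' conv
    size = same_out(size, stride=2)                         # 3x3 max pool, padding='same'
    filters = initial_filters
    stages = [("stem", size, filters)]
    for i, num_layers in enumerate(block_list):
        filters += growth_rate * num_layers                 # dense block: size unchanged
        stages.append((f"block{i + 1}", size, filters))
        if i != len(block_list) - 1:
            filters = int(filters * compression_rate)       # 1x1 conv compression
            size = conv_out(size, kernel=2, stride=2, pad=0)  # 2x2 average pool
            stages.append((f"transition{i + 1}", size, filters))
    return stages


for name, size, filters in trace_densenet():
    print(f"{name:<12} {size:>3}x{size:<3} {filters:>5} channels")
```

With the default arguments the last dense block ends at 7x7 with 1024 channels, which is what `GlobalAveragePooling2D` reduces to a 1024-dimensional vector before the final `Dense` layer.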
Train the model:
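# Set the initial learning rate
initial_learning_rate = 1e-4
opt = tf.keras.optimizers.Adam(learning_rate=initial_learning_rate)

model.compile(optimizer=opt,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model (train_ds and val_ds are the training and validation
# datasets prepared beforehand; their construction is not shown in this post)
epochs = 10
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)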
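
For readers unfamiliar with the `sparse_categorical_crossentropy` loss used in `model.compile`, here is a rough pure-Python sketch of what it and the `accuracy` metric compute. It assumes integer class labels and softmax outputs, and it is a simplification: the Keras implementation also clips probabilities for numerical stability. The sample labels and probabilities are made up for illustration.

```python
import math


def sparse_categorical_crossentropy(labels, probs):
    """Mean of -log(p[label]) over the batch; labels are integer class indices."""
    losses = [-math.log(p[y]) for y, p in zip(labels, probs)]
    return sum(losses) / len(losses)


def accuracy(labels, probs):
    """Fraction of samples whose highest-probability class equals the label."""
    hits = sum(1 for y, p in zip(labels, probs)
               if max(range(len(p)), key=p.__getitem__) == y)
    return hits / len(labels)


# Hypothetical batch of two samples over three classes.
labels = [0, 2]
probs = [[0.5, 0.25, 0.25],
         [0.1, 0.1, 0.8]]
print(sparse_categorical_crossentropy(labels, probs))
print(accuracy(labels, probs))
```

Because the labels are plain integers rather than one-hot vectors, the 1000-unit output layer works with any dataset whose class indices are below 1000.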
The results are:
Epoch 1/10
2024-03-15 14:13:45.809368: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:454] Loaded cuDNN version 8904
2024-03-15 14:13:48.629780: I external/local_tsl/tsl/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2024-03-15 14:13:52.577198: I external/local_xla/xla/service/service.cc:168] XLA service 0x296ca7e0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-03-15 14:13:52.577261: I external/local_xla/xla/service/service.cc:176] StreamExecutor device (0): Tesla P40, Compute Capability 6.1
2024-03-15 14:13:52.606880: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1710483232.769751 636683 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
57/57 [==============================] - 126s 403ms/step - loss: 1.7279 - accuracy: 0.5531 - val_loss: 3.1508 - val_accuracy: 0.3097
Epoch 2/10
57/57 [==============================] - 12s 216ms/step - loss: 0.6443 - accuracy: 0.7478 - val_loss: 2.0589 - val_accuracy: 0.4336
Epoch 3/10
57/57 [==============================] - 11s 200ms/step - loss: 0.4558 - accuracy: 0.8296 - val_loss: 1.4113 - val_accuracy: 0.4425
Epoch 4/10
57/57 [==============================] - 11s 190ms/step - loss: 0.3793 - accuracy: 0.8562 - val_loss: 0.8519 - val_accuracy: 0.6903
Epoch 5/10
57/57 [==============================] - 11s 190ms/step - loss: 0.2191 - accuracy: 0.9469 - val_loss: 0.4990 - val_accuracy: 0.8053
Epoch 6/10
57/57 [==============================] - 11s 196ms/step - loss: 0.1028 - accuracy: 0.9668 - val_loss: 0.5199 - val_accuracy: 0.8407
Epoch 7/10
57/57 [==============================] - 11s 189ms/step - loss: 0.1232 - accuracy: 0.9646 - val_loss: 3.4060 - val_accuracy: 0.4956
Epoch 8/10
57/57 [==============================] - 11s 188ms/step - loss: 0.1023 - accuracy: 0.9712 - val_loss: 0.8763 - val_accuracy: 0.7965
Epoch 9/10
57/57 [==============================] - 11s 195ms/step - loss: 0.0793 - accuracy: 0.9779 - val_loss: 0.9300 - val_accuracy: 0.8319
Epoch 10/10
57/57 [==============================] - 11s 189ms/step - loss: 0.1567 - accuracy: 0.9447 - val_loss: 1.8501 - val_accuracy: 0.5752
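
The validation metrics in this log fluctuate noticeably from epoch to epoch, so the final epoch is not necessarily the best one. As a small sketch (the helper `best_epoch` is hypothetical, and the numbers are copied from the log above), the best epochs can be picked out like this:

```python
# Validation metrics copied from the ten epochs logged above.
val_loss = [3.1508, 2.0589, 1.4113, 0.8519, 0.4990,
            0.5199, 3.4060, 0.8763, 0.9300, 1.8501]
val_accuracy = [0.3097, 0.4336, 0.4425, 0.6903, 0.8053,
                0.8407, 0.4956, 0.7965, 0.8319, 0.5752]


def best_epoch(values, higher_is_better):
    """Return (1-based epoch, value) of the best entry in a metric history."""
    pick = max if higher_is_better else min
    index = pick(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


print("best val_accuracy:", best_epoch(val_accuracy, higher_is_better=True))
print("best val_loss:", best_epoch(val_loss, higher_is_better=False))
print("final val_accuracy:", val_accuracy[-1])
```

The same lists are available after training as `history.history['val_loss']` and `history.history['val_accuracy']`.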