Conv Net
===========================
Author : Aquei:-D
# Requirements
Python 3.8.3\
CUDA compilation tools, release 11.0, V11.0.221\
PyTorch 1.6.0\
torchvision\
yacs\
PyYAML\
(standard library: sys, pickle, json, os)
# How To Use
1. Run tools/preprocess_data.py. This creates a CNN_output_data folder inside the output folder and stores the preprocessed data there as .data files.
2. Run engine/trainer.py to start training or testing (to test only, comment out the train(epoch) call in run()).
3. After a normal run, the output folder contains CNN_output_parameter, CNN_test_output, CNN_output_data, and log.txt. CNN_output_parameter holds the network and optimizer parameters and requires 2.56 GB of storage per save; CNN_test_output receives the .obj files generated for the test images.
4. To resume from the last saved parameters, uncomment the following lines in run():
~~~~
net.load_state_dict(torch.load(cfg.OUTPUT.PARAMETER + cfg.OUTPUT.SAVE_NET_FILENAME))
print('loaded net successfully!')
optimizer.load_state_dict(torch.load(cfg.OUTPUT.PARAMETER + cfg.OUTPUT.SAVE_OPTIMIZER_FILENAME))
print('loaded optimizer successfully!')
~~~~
At the same time, adjust the saved-network and saved-optimizer parameter file names in defaults.py; this is explained in detail in that file.
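The checkpoints loaded above are written periodically during training; a minimal framework-free sketch of that save cadence, where `SAVE_INTERVAL` mirrors `_C.DATASETS.SAVE_INTERVAL` and `epochs_that_save` is a hypothetical helper (not a name from this repository):

~~~~
# Sketch of the save cadence implied by the config: a checkpoint is
# written every SAVE_INTERVAL epochs (the repo itself uses torch.save
# on the network and optimizer state dicts at these points).

SAVE_INTERVAL = 100  # epochs between checkpoint saves

def epochs_that_save(total_epochs, interval=SAVE_INTERVAL):
    """Return the epoch numbers at which a checkpoint would be written."""
    return [e for e in range(1, total_epochs + 1) if e % interval == 0]

print(epochs_that_save(350))  # -> [100, 200, 300]
~~~~

At 2.56 GB per save, this interval directly controls how much disk the CNN_output_parameter folder consumes.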
# Introduction
**defaults.py** \
Stores file paths and model parameters. \
**trainer.py** \
Handles training and testing. \
**cnn.py** \
Defines the convolutional neural network (12 convolutional layers and 6 activation functions) and its Kaiming initialization. \
**draw.py** \
Visualizes the training process. \
**preprocess_data.py** \
Preprocesses the training label data. \
**datasets_transform.py** \
Processes the input image data: normalization, data validation, and data augmentation.
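The Kaiming initialization used in cnn.py draws weights from a normal distribution whose standard deviation depends on the layer's fan-in; a minimal sketch of that computation (the 3x3, 64-channel layer shape below is an illustrative assumption, not taken from this repository):

~~~~
import math

def kaiming_normal_std(fan_in, negative_slope=0.0):
    """Std of the Kaiming-normal distribution for a (leaky-)ReLU layer:
    gain = sqrt(2 / (1 + a^2)), std = gain / sqrt(fan_in)."""
    gain = math.sqrt(2.0 / (1.0 + negative_slope ** 2))
    return gain / math.sqrt(fan_in)

# For a hypothetical 3x3 conv with 64 input channels: fan_in = 64 * 3 * 3
std = kaiming_normal_std(64 * 3 * 3)
print(round(std, 4))  # -> 0.0589
~~~~

In PyTorch this is what `torch.nn.init.kaiming_normal_` computes internally before sampling the weight tensor.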
# Problem Specification
1. Currently there is not enough GPU memory to resume training after loading the saved network and optimizer parameters; however, running the test alone, without training, still works.
1. defaults.py
~~~~
# -*- coding: UTF-8 -*-
from yacs.config import CfgNode
# -----------------------------------------------------------------------------
# Config definition
# -----------------------------------------------------------------------------
_C = CfgNode()
_C.MODEL = CfgNode()
_C.MODEL.DEVICE1 = 'cuda'
_C.MODEL.DEVICE2 = 'cpu'
_C.MODEL.CONFIG = '../config/config_cache/'
# -----------------------------------------------------------------------------
# Dataset
# -----------------------------------------------------------------------------
_C.DATASETS = CfgNode()
_C.DATASETS.IMAGES_PATH = 'D:/TEST64/images/'
_C.DATASETS.SAVE_INTERVAL = 100
# interval (in epochs) between parameter saves
_C.DATASETS.TRANSFORM_RESIZE = 240
# resize all images to a uniform size
_C.DATASETS.BRIGHTNESS = (0.75, 1.5)
# brightness jitter range
_C.DATASETS.CONTRAST = (0.83333, 1.2)
# contrast jitter range
_C.DATASETS.SATURATION = (0.83333, 1.2)
# saturation jitter range
_C.DATASETS.HUE = (-0.2, 0.2)
# hue jitter range
_C.DATASETS.RANDOMROTATION = 5
# random rotation angle range (-5°, 5°)
_C.DATASETS.DEGREES = 0
# no additional rotation during the affine transform
_C.DATASETS.TRANSLATE = (0.2, 0.2)
# maximum affine translation, as fractions of width and height
_C.DATASETS.SCALE = (0.95, 1.05)
# affine scaling range
_C.DATASETS.SHEAR = (-5, 5, -5, 5)
# affine shear angle ranges
_C.DATASETS.RANDOMERASING_P = 0.5
# probability of random erasing
_C.DATASETS.RANDOMERASING_SCALE = (0.001, 0.01)
# sampled uniformly; erased area = image area * scale
_C.DATASETS.RANDOMERASING_RATIO = (0.5, 2.0)
# aspect-ratio range of the erased region
_C.DATASETS.RANDOMERASING_VALUE = 0
# pixel value used to fill the erased region
_C.DATASETS.TRANSFORM_MEAN = 0.31625810265541077
_C.DATASETS.TRANSFORM_STD = 0.28204897140662966
# ---------------------------------------------------------------------------- #
# Solver
# ---------------------------------------------------------------------------- #
_C.SOLVER = CfgNode()
_C.SOLVER.NUM_WORKERS = 8
_C.SOLVER.BASE_LR = 1e-3
_C.SOLVER.ADJUST_LR = 1e-4
_C.SOLVER.FIRST_ADJUST_LIMIT = 100
_C.SOLVER.SECOND_ADJUST_LIMIT = 1000
# ---------------------------------------------------------------------------- #
# Visualization
# ---------------------------------------------------------------------------- #
_C.VISUAL = CfgNode()
_C.VISUAL.TITLE_FRONT_SIZE = 24
_C.VISUAL.LABEL_FRONT_SIZE = 20
_C.VISUAL.X_LABEL = 'The Number of Training Iterations'
_C.VISUAL.LINE_COLOR = 'c'
_C.VISUAL.TITLE = 'The Training Process'
_C.VISUAL.LINE_LABEL = 'Loss'
# ---------------------------------------------------------------------------- #
# Input
# ---------------------------------------------------------------------------- #
_C.INPUT = CfgNode()
# _C.INPUT.VERTICS_PATH = '//192.168.20.63/ai/double_camera_data/2020-08-21/160810/output_2/total/'
_C.INPUT.VERTICS_PATH = 'D:/TEST64/labels/'
_C.INPUT.SAVE_RESIZE_IMAGES = 'D:/CNN_20210516/'
# _C.INPUT.SAVE_RESIZE_IMAGES = 'D:/CNN_1193Dataset/images/'
_C.INPUT.CHECK = '../output/check/'
_C.INPUT.BATCH_SIZE = 4
_C.INPUT.BASE_EPOCH = 1
_C.INPUT.VERTICS_NUM = 22971
_C.INPUT.PCA_DIMENSION = 57
# ---------------------------------------------------------------------------- #
# Output
# ---------------------------------------------------------------------------- #
_C.OUTPUT = CfgNode()
# path for saving and loading model parameters
_C.OUTPUT.PARAMETER
~~~~
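The SOLVER settings suggest a step-wise learning-rate schedule. The exact rule lives in trainer.py, so the reading below (base LR until the first limit, reduced LR until the second, reduced again afterwards) is an assumption, sketched with the config values from above:

~~~~
BASE_LR = 1e-3              # _C.SOLVER.BASE_LR
ADJUST_LR = 1e-4            # _C.SOLVER.ADJUST_LR
FIRST_ADJUST_LIMIT = 100    # _C.SOLVER.FIRST_ADJUST_LIMIT
SECOND_ADJUST_LIMIT = 1000  # _C.SOLVER.SECOND_ADJUST_LIMIT

def learning_rate(epoch):
    """Hypothetical step schedule: base LR up to the first limit,
    the adjusted LR up to the second, then a further tenfold drop."""
    if epoch < FIRST_ADJUST_LIMIT:
        return BASE_LR
    if epoch < SECOND_ADJUST_LIMIT:
        return ADJUST_LR
    return ADJUST_LR * 0.1

print(learning_rate(50), learning_rate(500))  # -> 0.001 0.0001
~~~~

If the schedule in trainer.py differs, only the branch conditions need to change; the config keys already carry the thresholds and rates.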