Early Detection of Lung Cancer with PyTorch: 3. Training the Model

charlessun9

Last modified 2023-03-17 09:47:51

Category: Early Detection of Lung Cancer. Tags: pytorch, deep learning, machine learning

First published 2023-03-15 17:40:56

Article link: https://blog.csdn.net/charlessun9/article/details/129442197

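This article walks through training a CT classification model with PyTorch: data loading, model construction, the training and validation loops, loss computation, weight initialization, and visualization with TensorBoard. The model is a CNN, the data is loaded through DataLoader, and training and validation metrics are recorded and displayed during training.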

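The previous chapter covered the data-processing code. This chapter covers the code that trains the model. The book's approach is to build a CNN model first without worrying about how well it performs, and then to revise and improve the model and the data later.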

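I. Main Content

Use PyTorch's DataLoader to load the data
Implement a model that classifies CT data
Set up the basic framework for our application
Record and display metrics (loss, accuracy, and so on for the training and validation passes)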

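Our basic structure is as follows:

Initialize the model and the data loading
Choose a number of epochs and loop over them, performing the following steps in each epoch

Loop over LunaDataset and fetch each batch of training data
The data loader loads the batch
Pass the batch into the model to get the results
Compute the loss from the difference between the predictions and the ground truth
Record metrics about the model's performance in a temporary data structure
Update the model weights by backpropagating the error
Fetch the validation data
Load each batch of validation data
Run the model on the validation batch and compute the loss
Record how well the model performed on the validation data
Print the progress and performance information for the epoch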
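
The outline above can be sketched roughly as follows. This is a minimal sketch and not the book's training application: the tiny linear model, the random tensors standing in for LunaDataset, and the name `run_epoch` are all assumptions made for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Random stand-in data (assumption): 8 features per sample, 2 classes.
train_ds = TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))
val_ds = TensorDataset(torch.randn(32, 8), torch.randint(0, 2, (32,)))
train_dl = DataLoader(train_ds, batch_size=16, shuffle=True)
val_dl = DataLoader(val_ds, batch_size=16)

# Placeholder model (assumption); the article's model is a CNN.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()


def run_epoch(dataloader, train):
    """Run one pass over the data and return (mean loss, accuracy)."""
    model.train(train)  # train=False switches the model to eval mode
    total_loss, correct, count = 0.0, 0, 0
    with torch.set_grad_enabled(train):  # no gradients during validation
        for inputs, labels in dataloader:
            logits = model(inputs)
            loss = loss_fn(logits, labels)
            if train:
                optimizer.zero_grad()
                loss.backward()   # backpropagate the error
                optimizer.step()  # update the weights
            # Record per-batch metrics in plain Python numbers.
            total_loss += loss.item() * len(labels)
            correct += (logits.argmax(dim=1) == labels).sum().item()
            count += len(labels)
    return total_loss / count, correct / count


history = []
for epoch in range(1, 4):
    trn_loss, trn_acc = run_epoch(train_dl, train=True)
    val_loss, val_acc = run_epoch(val_dl, train=False)
    history.append((trn_loss, trn_acc, val_loss, val_acc))
    print(f"epoch {epoch}: trn loss {trn_loss:.4f} acc {trn_acc:.2f} | "
          f"val loss {val_loss:.4f} acc {val_acc:.2f}")
```

Sharing one function between the training and validation passes keeps the metric bookkeeping identical for both, so the two sets of numbers can be compared directly.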

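II. Function Notes

As mentioned in the previous article, introducing the functions and classes used in the code alongside the more complex functions gets too messy. Starting with this article, Part II introduces the individual functions and Part III explains what each block of code does. If you run into small questions in the Part III code, look for the answer in Part II.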

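1. sys.argv

Put simply, sys.argv is a bridge for passing arguments into a program from outside. It is the list of command-line arguments, and sys.argv[0] is the script name, which is why the code below takes sys.argv[1:].

See also: "A concise explanation of sys.argv[] in Python"

# ... line 32: if the caller provides no arguments, read them from the command line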
def __init__(self, sys_argv=None):
        if sys_argv is None:
            sys_argv = sys.argv[1:]
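
A minimal sketch of why this pattern is convenient: accepting an optional argument list lets the same class be driven from the command line or from other Python code such as a notebook or a test. The class name `TrainingApp` and the `args` attribute are hypothetical and only serve the illustration.

```python
import sys


class TrainingApp:
    """Hypothetical app class that mirrors the constructor pattern above."""

    def __init__(self, sys_argv=None):
        # Fall back to the real command line (minus the script name)
        # only when the caller did not pass an explicit argument list.
        if sys_argv is None:
            sys_argv = sys.argv[1:]
        self.args = list(sys_argv)


# Called from Python code with explicit arguments instead of a shell.
app = TrainingApp(["--epochs", "2"])
print(app.args)
```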

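2. argparse.ArgumentParser()

The argparse module makes it easy to write user-friendly command-line interfaces. The program defines the arguments it requires, and argparse works out how to parse them from sys.argv. The argparse module also generates help and usage messages automatically, and reports an error when the user passes invalid arguments.

See also: "How to use argparse.ArgumentParser()"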
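
As a rough illustration, a parser for a training script might look like the sketch below. The option names and default values are assumptions chosen for this example, not necessarily the ones used in the book's code.

```python
import argparse


def parse_cli(sys_argv):
    """Parse a list of command-line strings into a namespace."""
    parser = argparse.ArgumentParser(description="Train a CT classifier (sketch)")
    # Dashes in option names become underscores in attribute names.
    parser.add_argument("--batch-size", type=int, default=32,
                        help="Batch size to use for training")
    parser.add_argument("--num-workers", type=int, default=8,
                        help="Number of background data-loading workers")
    parser.add_argument("--epochs", type=int, default=1,
                        help="Number of epochs to train for")
    return parser.parse_args(sys_argv)


cli_args = parse_cli(["--epochs", "5", "--batch-size", "64"])
print(cli_args.epochs, cli_args.batch_size, cli_args.num_workers)  # 5 64 8
```

Passing the list explicitly to `parse_args` pairs naturally with the `sys_argv=None` constructor pattern from the previous section.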

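3. Timestamps with datetime

A timestamp helps identify and keep track of individual training runs.

For the format codes accepted by datetime.datetime.now().strftime(), see:

"datetime.datetime.now().strftime"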

        self.time_str = datetime.datetime.now().strftime('%Y-%m-%d_%H.%M.%S')
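
A small sketch of what this format string produces. A fixed datetime is used here instead of `now()` so the result is reproducible; the `runs/` directory prefix is a hypothetical example of how such a string might be used.

```python
import datetime

fmt = '%Y-%m-%d_%H.%M.%S'

# A fixed moment instead of datetime.datetime.now(), for a stable result.
moment = datetime.datetime(2023, 3, 15, 17, 40, 56)
time_str = moment.strftime(fmt)
print(time_str)  # 2023-03-15_17.40.56

# The string contains no spaces or colons, so it is safe in file names.
log_dir = "runs/" + time_str  # hypothetical directory layout

# strptime with the same format recovers the original datetime.
parsed = datetime.datetime.strptime(time_str, fmt)
```

Because the fields run from year down to second, sorting these strings alphabetically also sorts the runs chronologically.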

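4. Neural network weight initialization

The purpose of weight initialization is to keep the outputs of the layer activations from exploding or vanishing during the forward pass through a deep network. If either happens, the loss gradients become too large or too small to propagate backward effectively, and even when they do propagate, the network takes longer to converge.

General-purpose templates for this can be found online.

Introduction: "Weight initialization methods in neural networks and their use in PyTorch"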
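
One common template looks roughly like the sketch below. It is an assumption for illustration rather than the exact routine used in the book: it applies Kaiming (He) normal initialization to the weights of linear and convolutional layers and sets their biases to zero. The helper name `init_weights` and the toy network are hypothetical.

```python
import torch
from torch import nn


def init_weights(model):
    """Kaiming-normal weights and zero biases for linear/conv layers."""
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d, nn.Conv3d)):
            # Scale the weight variance for ReLU activations.
            nn.init.kaiming_normal_(m.weight, mode="fan_out", nonlinearity="relu")
            if m.bias is not None:
                nn.init.zeros_(m.bias)


# Toy 3D CNN (assumption): one conv layer followed by a linear head.
net = nn.Sequential(
    nn.Conv3d(1, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(4 * 4 * 4 * 4, 2),
)
init_weights(net)

# A batch of 2 single-channel 4x4x4 volumes yields 2 class scores each.
out = net(torch.randn(2, 1, 4, 4, 4))
print(out.shape)
```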

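5. The hasattr function

hasattr() checks whether an object has a given attribute.

hasattr(object, name)
object -- the object.
name -- a string, the attribute name.
Return value:
True if the object has the attribute, False otherwise.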
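
A short sketch of the behavior, using a hypothetical `Layer` class whose `bias` attribute only exists on some instances:

```python
class Layer:
    """Hypothetical class: `bias` is only set when requested."""

    def __init__(self, with_bias):
        self.weight = [0.1, 0.2]
        if with_bias:
            self.bias = [0.0]


with_bias = Layer(with_bias=True)
without_bias = Layer(with_bias=False)

print(hasattr(with_bias, "bias"))      # True
print(hasattr(without_bias, "bias"))   # False
print(hasattr(without_bias, "weight")) # True
```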

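6. The detach function

When training a network, we may want to keep some of the parameters fixed and adjust only the rest, or train only a branch of the network without letting its gradients affect the main network. In those cases we use detach() to cut off backpropagation through the chosen branches. detach() returns a new tensor that shares the same data but is no longer connected to the computation graph.

See also: "The purpose of and difference between PyTorch's .detach() and .detach_()" (LoveMIss-Y's blog on CSDN)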
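
A minimal sketch of the effect on gradients, using made-up tensors: the detached copy is treated as a constant during backpropagation, so the gradient flows only through the path that was not detached.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2

# Same values as y, but cut off from the computation graph.
y_detached = y.detach()

# y_detached acts as a constant c = 2x, so
# d/dx sum(2x * c) = 2c = 4x (it would be 8x without detach).
loss = (y * y_detached).sum()
loss.backward()

print(x.grad)                     # 4 * x, i.e. 4, 8, 12
print(y_detached.requires_grad)   # False
```

The same call is also handy when storing per-batch metrics for logging, since detached tensors do not keep the graph alive.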

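7. The enumerateWithEstimate function

Location: util\util.py

The function relies on the yield keyword, which turns enumerateWithEstimate into a generator. It iterates over the dataset item by item and uses the time taken by the iterations so far to estimate the total time needed to go through the whole dataset.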

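import datetime
import logging
import time

# Assumed logger setup so the snippet runs on its own; the original
# util.py may configure logging differently.
log = logging.getLogger(__name__)

# Estimates how long it will take to iterate over the whole iterable. How it works:
# step 1: because of yield, items are handed out one at a time; the average time per item is delta_t = elapsed time / number of items processed so far
# step 2: using the length of the iterable, the estimated total time is t_dataset = delta_t * length of the dataset
def enumerateWithEstimate(
        iter,           # An iterable over the dataset. The goal is to estimate the time needed to go through all of it.
        desc_str,       # Descriptive text printed in the log messages. Any short string will do.
        start_ndx=0,    # Number of initial iterations excluded from timing, e.g. start_ndx=3 means timing starts after the first 3 iterations.
        print_ndx=4,    # Iteration index of the first log message (default 4); after each message, print_ndx = print_ndx * backoff.
        backoff=None,   # Factor by which the gap between log messages grows; if None, it is derived from iter_len below.
        iter_len=None,  # Length of the iterable; if not given, iter_len = len(iter).
):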
    """
    In terms of behavior, `enumerateWithEstimate` is almost identical
    to the standard `enumerate` (the differences are things like how
    our function returns a generator, while `enumerate` returns a
    specialized `<enumerate object at 0x...>`).
    However, the side effects (logging, specifically) are what make the
    function interesting.
    :param iter: `iter` is the iterable that will be passed into
        `enumerate`. Required.
    :param desc_str: This is a human-readable string that describes
        what the loop is doing. The value is arbitrary, but should be
        kept reasonably short. Things like `"epoch 4 training"` or
        `"deleting temp files"` or similar would all make sense.
    :param start_ndx: This parameter defines how many iterations of the
        loop should be skipped before timing actually starts. Skipping
        a few iterations can be useful if there are startup costs like
        caching that are only paid early on, resulting in a skewed
        average when those early iterations dominate the average time
        per iteration.
        NOTE: Using `start_ndx` to skip some iterations makes the time
        spent performing those iterations not be included in the
        displayed duration. Please account for this if you use the
        displayed duration for anything formal.
        This parameter defaults to `0`.
    :param print_ndx: determines which loop iteration that the timing
        logging will start on. The intent is that we don't start
        logging until we've given the loop a few iterations to let the
        average time-per-iteration a chance to stabilize a bit. We
        require that `print_ndx` not be less than `start_ndx` times
        `backoff`, since `start_ndx` greater than `0` implies that the
        early N iterations are unstable from a timing perspective.
        `print_ndx` defaults to `4`.
    :param backoff: This is used to determine how many iterations to skip before
        logging again. Frequent logging is less interesting later on,
        so by default we double the gap between logging messages each
        time after the first.
        `backoff` defaults to `2` unless iter_len is > 1000, in which
        case it defaults to `4`.
    :param iter_len: Since we need to know the number of items to
        estimate when the loop will finish, that can be provided by
        passing in a value for `iter_len`. If a value isn't provided,
        then it will be set by using the value of `len(iter)`.
    :return:
    """
    if iter_len is None:
        iter_len = len(iter)
 
    if backoff is None:
        backoff = 2
        while backoff ** 7 < iter_len:
            backoff *= 2
 
    assert backoff >= 2
    while print_ndx < start_ndx * backoff:
        print_ndx *= backoff
 
    log.warning("{} ----/{}, starting".format(
        desc_str,
        iter_len,
    ))
    start_ts = time.time()
    for (current_ndx, item) in enumerate(iter):
        yield (current_ndx, item)
        if current_ndx == print_ndx:
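            # ... <1> step 1: measure the time elapsed over the iterations timed so far;
            # step 2: divide to get the average time per iteration;
            # step 3: multiply by the number of timed iterations (iter_len - start_ndx) to estimate the total time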
            duration_sec = ((time.time() - start_ts)
                            / (current_ndx - start_ndx + 1)
                            * (iter_len-start_ndx)
                            )
 
            done_dt = datetime.datetime.fromtimestamp(start_ts + duration_sec)
            done_td = datetime.timedelta(seconds=duration_sec)
 
            log.info("{} {:-4}/{}, done at {}, {}".format(
                desc_str,
                current_ndx,
                iter_len,
                str(done_dt).rsplit('.', 1)[0],     # estimated wall-clock time at which the loop will finish, as of iteration current_ndx
                str(done_td).rsplit('.', 1)[0],     # estimated total duration of the loop (h:mm:ss), as of iteration current_ndx
            ))
 
            print_ndx *= backoff
 
        if current_ndx + 1 == start_ndx:
            start_ts = time.time()
 
    log.warning("{} ----/{}, done at {}".format(
        desc_str,
        iter_len,
        str(datetime.datetime.now()).rsplit('.', 1)[0],
    ))
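
The core idea can be tried out with the simplified sketch below. `enumerate_with_eta` is a hypothetical, stripped-down variant written for this illustration: it drops `start_ndx` and the logging setup and simply prints an estimate at indexes 4, 8, 16, and so on.

```python
import time


def enumerate_with_eta(iterable, total, print_ndx=4, backoff=2):
    """Yield (index, item) like enumerate, printing a total-time estimate."""
    start_ts = time.time()
    for ndx, item in enumerate(iterable):
        yield ndx, item
        # Code after `yield` runs when the caller asks for the next item,
        # so the elapsed time includes the caller's work on this item.
        if ndx == print_ndx:
            elapsed = time.time() - start_ts
            est_total = elapsed / (ndx + 1) * total  # avg per item * length
            print(f"{ndx:4}/{total}, estimated total {est_total:.3f}s")
            print_ndx *= backoff  # log less and less often


results = []
for ndx, item in enumerate_with_eta(range(20), total=20):
    time.sleep(0.001)  # stand-in for real work such as loading a batch
    results.append((ndx, item))
```

From the caller's point of view the generator behaves like `enumerate`; the timing output is purely a side effect.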