Project Training: Intelligent Biological Sequence Analysis Platform, Backend Module (5)

煮酒cos

Last modified 2022-06-07 21:50:29

Column: Project Training. Tags: machine learning

First published 2022-06-02 21:28:50

Article link: https://blog.csdn.net/qq_49215659/article/details/125107183

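This article walks through the training part of the backend module of the intelligent biological sequence analysis platform, covering parameter definitions, dataset splitting, model training, and performance evaluation. It focuses on data statistics, recording the training time of each model, and extracting and managing data during training. The learner's main training function is called to test and compare different models or parameters, and ROC and PR curves are plotted for visual analysis.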

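Training code

The train part of the code is now complete.

train code

This is the main training code that runs on the server. It is long and its variables are intricate, so it is worth unpacking step by step.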

import os

import torch


def SL_train(config, modelsORpara):
    # Make the GPU named in the config the current CUDA device
    torch.cuda.set_device(config.device)

    roc_datas, prc_datas = [], []
    repres_list, label_list = [], []
    train_seq, test_seq = [], []
    train_label, test_label = [], []
    pos_list = []
    neg_list = []
    best_performance = []
    data_statistic = []  # train pos, train neg, test pos, test neg
    time_use = []

    step_log_interval = []
    train_metric_record = []
    train_loss_record = []

    step_test_interval = []
    test_metric_record = []
    test_loss_record = []

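First, a large batch of variables is defined. Going through them in order:

roc_datas, prc_datas: data for plotting the ROC and PR curves
repres_list, label_list: the feature representations extracted by the model and the corresponding labels
train_seq, test_seq: the training sequences and the test sequences
train_label, test_label: the training-set labels and the test-set labels
pos_list, neg_list: the model's predicted confidence for positive and negative samples
best_performance: an array holding the model's best performance
data_statistic: sample counts, in the order train pos, train neg, test pos, test neg
time_use: the training time of each model
step_log_interval: the step values used as the x-axis of the epoch log plot
train_metric_record: one y-axis series of the epoch log plot, the accuracy
train_loss_record: the other y-axis series of the epoch log plot, the loss
step_test_interval, test_metric_record, test_loss_record: judging by their names, the test-set counterparts of the three lists above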
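To make a few of these containers concrete, here is a minimal sketch of how data_statistic, time_use, and the step log lists could be filled. It is not the platform's actual code: the helper names count_labels, timed, and dummy_train are hypothetical, labels are assumed to be binary (1 = positive, 0 = negative), and the "loss" is a placeholder value rather than a real training loss.

```python
import time
from collections import Counter


def count_labels(train_label, test_label):
    """Return [train pos, train neg, test pos, test neg].

    Assumes binary labels where 1 is positive and 0 is negative.
    """
    train_counts = Counter(train_label)
    test_counts = Counter(test_label)
    return [train_counts[1], train_counts[0], test_counts[1], test_counts[0]]


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed time in seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def dummy_train(steps, log_interval):
    """Stand-in for a training loop: record a fake loss every log_interval steps."""
    step_log_interval, train_loss_record = [], []
    for step in range(1, steps + 1):
        loss = 1.0 / step  # placeholder value, not a real loss
        if step % log_interval == 0:
            step_log_interval.append(step)
            train_loss_record.append(loss)
    return step_log_interval, train_loss_record


# Hypothetical toy data: 3 positive / 1 negative for training, 1 / 2 for testing
data_statistic = count_labels([1, 0, 1, 1], [0, 0, 1])

# One entry per trained model; here only a single dummy run is timed
time_use = []
(steps_logged, losses), elapsed = timed(dummy_train, steps=10, log_interval=5)
time_use.append(elapsed)
```

Keeping the x-axis values (steps_logged) and the y-axis values (losses) in parallel lists, as the original variables do, means they can be passed straight to a plotting call later.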

    if_same = config.if_same
    # NOTE: the next line unconditionally overrides the value read from config
    if_same = True
    savepath = '/data/result/' + config.learn_name
    # Create the result directory together with its plot subdirectory;
    # makedirs also creates missing parents and tolerates existing directories
    os.makedirs(savepath + '/plot', exist_ok=True)

    util_file.filiter_fasta(config.path_data, savepath, skip_first=False)
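The result-directory setup can be tried in isolation. The sketch below is an illustration under assumptions, not the project's code: prepare_result_dirs is a hypothetical helper, and a temporary directory stands in for '/data/result/'. It relies on os.makedirs with exist_ok=True, which creates missing parent directories and does not raise if the directory already exists, whereas os.mkdir fails when the parent is missing.

```python
import os
import tempfile


def prepare_result_dirs(result_root, learn_name):
    """Create <result_root>/<learn_name>/plot, including parents, if missing.

    Returns (savepath, plot_dir).
    """
    savepath = os.path.join(result_root, learn_name)
    plot_dir = os.path.join(savepath, "plot")
    os.makedirs(plot_dir, exist_ok=True)
    return savepath, plot_dir


# A temporary directory stands in for the real result root
with tempfile.TemporaryDirectory() as root:
    savepath, plot_dir = prepare_result_dirs(root, "demo_run")
    # Calling it a second time is safe because of exist_ok=True
    prepare_result_dirs(root, "demo_run")
    created = os.path.isdir(plot_dir)
```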