2024-deepfake-resnet50-100-10000

Competition link

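Preparation

Start with some basic setup, such as checking that the dataset is present in the input directory and installing third-party libraries.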

  • !ls /kaggle/input
  • !pip install transformers datasets

Writing the code

First, import the required libraries.

from torch.utils.data import DataLoader
from transformers import AutoImageProcessor,ResNetForImageClassification, AutoConfig, Trainer
from datasets import Dataset
from torch import nn
import datasets
import torch
import numpy as np
from PIL import Image
import pandas as pd
torch.manual_seed(0)
torch.backends.cudnn.deterministic = False
torch.backends.cudnn.benchmark = True

Load the data

train_label = pd.read_csv('/kaggle/input/deepfake/phase1/trainset_label.txt')
val_label = pd.read_csv('/kaggle/input/deepfake/phase1/valset_label.txt')

train_label['path'] = '/kaggle/input/deepfake/phase1/trainset/' + train_label['img_name']
val_label['path'] = '/kaggle/input/deepfake/phase1/valset/' + val_label['img_name']

Load the model

processor = AutoImageProcessor.from_pretrained("microsoft/resnet-50")
model = ResNetForImageClassification.from_pretrained("microsoft/resnet-50")

Define a new binary classification layer

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Final linear layer for binary classification
cls = nn.Sequential(
    nn.Linear(1000, 1),
    nn.Sigmoid()
)
model.add_module("cls", cls)
model = model.to(device)

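One small issue: even after adding the cls layer to model, a later call to model(**inputs) still returns the output of the original classifier layer, so the new layer has to be applied by hand with binary_result=model.cls(outputs.get("logits")). I have not found the cause yet.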
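
A likely explanation for the behavior described above: add_module only registers the new layer as a submodule (so it appears in model.parameters() and moves with model.to(device)), but it does not change the model's forward method, which keeps calling the layers it was written to call. The following is a minimal sketch of this, plus one possible wrapper that chains the two steps. TinyBackbone and BinaryClassifier are hypothetical names, and the tiny linear layer is only a stand-in for the pretrained network, not the author's model.

```python
import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for the pretrained model: 8 features -> 1000 logits."""

    def __init__(self):
        super().__init__()
        self.classifier = nn.Linear(8, 1000)

    def forward(self, x):
        # forward() is fixed code; it only calls self.classifier
        return self.classifier(x)


backbone = TinyBackbone()
# Registering a new submodule does not alter forward()
backbone.add_module("cls", nn.Sequential(nn.Linear(1000, 1), nn.Sigmoid()))

x = torch.randn(4, 8)
plain_out = backbone(x)  # still the 1000-way logits


class BinaryClassifier(nn.Module):
    """Hypothetical wrapper whose forward() applies the extra head explicitly."""

    def __init__(self, backbone):
        super().__init__()
        self.backbone = backbone

    def forward(self, x):
        logits = self.backbone(x)
        return self.backbone.cls(logits)


wrapped = BinaryClassifier(backbone)
wrapped_out = wrapped(x)  # one probability per sample
print(plain_out.shape, wrapped_out.shape)
```

With a wrapper like this, a single call returns the binary probability, which is what the manual model.cls(...) call achieves in the rest of this post.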

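Select a subset of the data

The dataset is 15.9 GB in total. Loading all of it alongside ResNet-50 would exceed GPU memory, so only the first 10,000 samples are used.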

train_dataset = Dataset.from_pandas(train_label.head(10000))
val_dataset = Dataset.from_pandas(val_label.head(10000))

After confirming the dataset is correct, delete the DataFrames to free memory

with torch.no_grad():
    outputs = model(**processor(Image.open(train_dataset[2]['path']), return_tensors="pt").to(device))
    logits = (outputs).logits
    print(type(logits),logits.shape)
    predicted_label = logits.argmax(-1).item()
    print(predicted_label)
    print(model.config.id2label[predicted_label])
del train_label, val_label

Define the DataLoaders and check that they work

train_loader = DataLoader(train_dataset,batch_size=128,shuffle=True)
val_loader = DataLoader(val_dataset,batch_size=128,shuffle=False)

for item in val_loader:
    path = item["path"]
    label = item["target"]
    print(path[:5])
    print(label[:5])
    break

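Optionally resume from a previously trained model

import os
# Note: the save step later in this post writes resnet-{epochs}-{len(train_dataset)}.ckpt,
# so copy or rename that file to ./model.ckpt to resume from it.
model_path = "./model.ckpt"
if os.path.exists(model_path):
    # The checkpoint holds the whole pickled model object, not just a state_dict.
    # Newer PyTorch releases default to weights_only=True, in which case
    # loading a full model needs torch.load(model_path, weights_only=False).
    model = torch.load(model_path)
    print("Resuming training from the previously saved model")
else:
    print("Using the Hugging Face pretrained model")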
Start training

epochs = 100
criterion = nn.BCELoss().to(device)
optimizer = torch.optim.SGD(model.cls.parameters(), lr=1e-4)
running_loss_list = []
# model = model.cpu()
model.train()
for epoch in range(epochs):
    print('Epoch {}/{}'.format(epoch+1, epochs))
    print('-' * 10)
    running_loss = 0.0
    for item in train_loader:
        optimizer.zero_grad()
        label = item["target"].unsqueeze(1)
        paths = item["path"]
        images = []
        for path in paths:
            image = processor(Image.open(path), return_tensors="pt").get('pixel_values').to(device)
            images.append(image)

        images = torch.cat(images, dim=0)
        outputs = model(pixel_values=images)
        logits = outputs.get("logits")
        binary_result = model.cls(logits)
        binary_result = binary_result.cpu()
        loss = criterion(binary_result, label.to(dtype=torch.float))
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print("-----    " , running_loss , "    -----")
    running_loss_list.append(running_loss)
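
The training loop above feeds the Sigmoid output into nn.BCELoss, which averages the binary cross-entropy -[y*log(p) + (1-y)*log(1-p)] over the batch. As a worked illustration of that formula, here is a minimal pure-Python sketch. The function name binary_cross_entropy, the eps clamp, and the sample numbers are all assumptions made for this example; PyTorch guards against log(0) in its own way.

```python
import math


def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy for probabilities in [0, 1] and 0/1 targets."""
    total = 0.0
    for p, y in zip(probs, targets):
        # Clamp to avoid log(0); an illustrative choice for this sketch
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)


# Made-up predictions and labels, only to exercise the formula
probs = [0.9, 0.2, 0.5]
targets = [1, 0, 1]
loss = binary_cross_entropy(probs, targets)
print(loss)
```

Note that running_loss in the loop is the sum of these per-batch means over an epoch, so its scale depends on the number of batches.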
    

Save the model after training

torch.save(model,f'resnet-{epochs}-{len(train_dataset)}.ckpt')
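
torch.save(model, ...) pickles the entire model object. A commonly used alternative is to save only the state_dict and load it into a freshly constructed model. The sketch below shows that round trip under stated assumptions: the small nn.Sequential is a stand-in for the real model, and the temporary file path is hypothetical.

```python
import os
import tempfile

import torch
from torch import nn

# Hypothetical stand-in for the trained model
model = nn.Sequential(nn.Linear(1000, 1), nn.Sigmoid())

# Hypothetical checkpoint location inside a temporary directory
ckpt_path = os.path.join(tempfile.mkdtemp(), "head_state.pt")

# Save only the parameters and buffers
torch.save(model.state_dict(), ckpt_path)

# Rebuild the same architecture, then load the saved weights into it
restored = nn.Sequential(nn.Linear(1000, 1), nn.Sigmoid())
state = torch.load(ckpt_path, map_location="cpu")
restored.load_state_dict(state)
```

The trade-off is that the model has to be constructed in code before loading, but the checkpoint no longer depends on pickling the class definitions.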

Plot the loss curve

import matplotlib.pyplot as plt

x = list(range(len(running_loss_list)))

plt.figure()

plt.plot(x, running_loss_list, marker='o')

plt.title('loss - epoch')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.grid(True)

plt.show()

Figure: loss curve

Validate the model

model.eval()
eval_accuracy = 0.0
for item in val_loader:
    with torch.no_grad():
        label = item["target"].unsqueeze(1)
        paths = item["path"]
        images = []
        for path in paths:
            image = processor(Image.open(path), return_tensors="pt").get('pixel_values').to(device)
            images.append(image)

        images = torch.cat(images, dim=0)
        outputs = model(pixel_values=images)
        logits = outputs.get("logits")
        binary_result = model.cls(logits)
        binary_result = binary_result.cpu()
        binary_result = (binary_result >= 0.5).to(dtype=torch.int)
        accuracy = torch.sum(binary_result == label).item()
        eval_accuracy += accuracy
print(eval_accuracy/len(val_dataset))

The final accuracy was 0.6011, and the run took 4 hours.
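
To put an accuracy figure like this in context, it can help to compare it with the score of a trivial classifier that always predicts the most common label. The sketch below computes that baseline. The function name majority_baseline_accuracy and the example labels are made up for illustration; they are not the competition data.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Made-up labels, only to show the calculation
example_labels = [1, 1, 1, 0, 0]
baseline = majority_baseline_accuracy(example_labels)
print(baseline)
```

Assuming the validation labels are available as a list (for example via list(val_dataset["target"])), the same function could be applied to them to see how far the model is above this baseline.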

The next step is to build on this and train on the full dataset.
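
Since the optimizer in this post only updates model.cls, one possible way to reduce training time on more data is to run the frozen part under torch.no_grad(), so that no gradients are computed for it. The sketch below illustrates the idea with a tiny stand-in network. The names backbone and head, the tensor shapes, and the random data are assumptions for this example, and putting the frozen part in eval mode differs from the model.train() call used in the original loop.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins: a frozen feature extractor and a trainable head
backbone = nn.Linear(8, 1000)
head = nn.Sequential(nn.Linear(1000, 1), nn.Sigmoid())

# Freeze the backbone so it is excluded from gradient computation
for p in backbone.parameters():
    p.requires_grad_(False)
backbone.eval()

optimizer = torch.optim.SGD(head.parameters(), lr=1e-4)
criterion = nn.BCELoss()

# Random inputs and labels, only to run one step
x = torch.randn(4, 8)
y = torch.tensor([[1.0], [0.0], [1.0], [0.0]])

backbone_before = backbone.weight.clone()
head_before = head[0].weight.clone()

# No autograd graph is built for the frozen forward pass
with torch.no_grad():
    features = backbone(x)

optimizer.zero_grad()
loss = criterion(head(features), y)
loss.backward()
optimizer.step()
```

After this step only the head's weights have changed. With the model from this post, the analogous change would be to wrap the model(pixel_values=images) call in torch.no_grad() and keep only the model.cls(logits) call outside it.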
