Weed Detection and Removal Based on ResNet50

1. Introduction: Weed Detection

Problem statement:

Weeds are unwelcome invaders in agricultural operations. They disrupt cultivation by stealing nutrients, water, land, and other critical resources, leading to lower yields and inefficient use of resources. One established countermeasure is spraying chemicals to kill the weeds, but these chemicals pose health risks to humans. Our goal is to use computer vision to detect weeds automatically, to develop a system that sprays only the weeds rather than the crops, and to remove the weeds from the field with targeted treatment, minimizing the environmental impact.

Expected solution:

We expect you to deploy the solution in a simulated production environment, where inference time and binary-classification accuracy (F1 score) will be the main grading criteria.
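For reference, both criteria can be computed with the standard library and scikit-learn. A minimal sketch, where predict_fn, images, and labels are hypothetical placeholders rather than anything defined in this article:

import time
from sklearn.metrics import f1_score

def evaluate(predict_fn, images, labels):
    """Score a model the way the grader does: wall-clock inference time plus binary F1."""
    start = time.time()
    predictions = [predict_fn(img) for img in images]  # predict_fn is a hypothetical stand-in
    elapsed = time.time() - start
    return elapsed, f1_score(labels, predictions, average='binary')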

Dataset:

      https://filerepo.idzcn.com/hack2023/Weed_Detection5a431d7.zip

Sample images:
crop

 

weed
Sample labels:

crop: 0 0.478516 0.560547 0.847656 0.625000

weed: 1 0.514648 0.441406 0.861328 0.671875

2. Data Preprocessing

Dataset structure:

The dataset is divided into images and labels.

(Figure: dataset directory structure)

Label meaning: the first number is the class, where 0 is crop and 1 is weed. The remaining four numbers are x_center, y_center, width, and height of the bounding box, normalized to the image size (YOLO-style ordering: width before height). So the crop label above encodes a box centered at (0.4785, 0.5605) with width 0.8477 and height 0.6250 of the image, as illustrated below:
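For illustration, here is a minimal sketch of parsing one label line in this YOLO-style format (the file name is hypothetical):

# Parse one label line: "<class> <x_center> <y_center> <width> <height>" (box values normalized)
with open('agri_0_0.txt') as f:                  # hypothetical label file name
    cls, x_c, y_c, w, h = f.readline().split()

print('class:', 'crop' if int(cls) == 0 else 'weed')
print('center:', (float(x_c), float(y_c)), 'size:', (float(w), float(h)))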

Dataset processing:

First, write the file names of all images to data.txt for convenient later processing, then apply contrast augmentation and normalization to the images:

import os
import torch
from PIL import Image
from torchvision import transforms

def get_file_name(images_dir):
    # Collect all .jpeg file names and write them to data.txt, one per line
    images_files = [f for f in os.listdir(images_dir) if f.endswith('.jpeg')]
    images_files.sort()
    with open('./data.txt', 'w') as f:  # 'w' so reruns do not append duplicate entries
        for name in images_files:
            f.write(name + '\n')
    print('File list written!')

get_file_name('/content/drive/MyDrive/weeds/data')

transformer = transforms.Compose([
    transforms.ToTensor(),
    transforms.ColorJitter(contrast=0.5),        # random contrast augmentation
    transforms.Normalize(mean=[0.5], std=[0.5])  # normalize the single channel to roughly [-1, 1]
])

train_images_tensor = []
with open('/content/data.txt', 'r') as f:
    file_name_url = [line.strip() for line in f.readlines()]
for name in file_name_url:
    image = Image.open('/content/drive/MyDrive/weeds/data/' + name)
    # Convert to grayscale ('L') so the tensor has one channel; store as float16 to save memory
    tensor = transformer(image.convert('L')).type(torch.float16)
    train_images_tensor.append(tensor)

Next, process the labels and split the data into train and test sets; the training set takes 70% of the samples and the test set the remaining 30%.

# 70/30 split of the image tensors (no shuffling: the order follows data.txt)
split = int(len(train_images_tensor) * 0.7)
image_train = train_images_tensor[:split]
image_test = train_images_tensor[split:]

# Read the class id from each label file (0 = crop, 1 = weed)
with open('/content/data.txt', 'r') as f:
    file_name_url = [line.split('.')[0] for line in f.readlines()]

train_labels_tensor = []
for name in file_name_url:
    with open('/content/drive/MyDrive/weeds/data/' + name + '.txt') as label_file:
        label = float(label_file.readline()[0])  # class id is the first character of the line
    train_labels_tensor.append(torch.tensor(label, dtype=torch.float16))

# Split the labels at the same 70/30 boundary as the images
labels_train = train_labels_tensor[:split]
labels_test = train_labels_tensor[split:]
Building the dataset:

Use the PyTorch framework to build the image-classification dataset and data loaders.

from torch.utils.data import TensorDataset, DataLoader

# Note the ordering: each batch yields (labels, images)
train_datas_tensor = torch.stack(image_train)
train_labels_tensor = torch.stack(labels_train)
test_datas_tensor = torch.stack(image_test)
test_labels_tensor = torch.stack(labels_test)
train_dataset = TensorDataset(train_labels_tensor, train_datas_tensor)
train_dataloader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_dataset = TensorDataset(test_labels_tensor, test_datas_tensor)
test_dataloader = DataLoader(test_dataset, batch_size=32, shuffle=False)  # no need to shuffle for evaluation

3. Training with ResNet50

ResNet50:

The defining feature of ResNet50 is its use of residual blocks: each block learns a residual mapping F(x) and outputs F(x) + x, so deep layers can fall back to the identity mapping. This effectively mitigates the degradation problem of deep networks and improves both performance and training stability.

The backbone of ResNet50 is shown in the figure below:

(Figure from Zhihu @臭咸鱼)

The structure of ResNet50 can be divided into several parts. The first part is a convolutional layer followed by max pooling, which preprocesses the input image, reducing its spatial size while expanding the channels. Next come four stages built from residual blocks, which extract image features; the first residual block of each stage (except the first stage) downsamples the feature map and increases its channel count. In ResNet50 each residual block consists of three convolutional layers plus a skip connection that adds the block's input to its output, which is the mechanism of residual learning. The final part is a global average pooling layer and a fully connected layer, which aggregate the features and output the classification result.
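For comparison, note that ResNet50's stages are built from bottleneck blocks (a 1x1 channel-reducing convolution, a 3x3 convolution, and a 1x1 channel-expanding convolution), not the two-3x3 basic block used in the hand-built network below. A minimal sketch of such a bottleneck block:

import torch
from torch import nn
import torch.nn.functional as F

class Bottleneck(nn.Module):
    """ResNet50-style bottleneck: 1x1 reduce -> 3x3 -> 1x1 expand (x4), plus a skip connection."""
    expansion = 4

    def __init__(self, in_channels, mid_channels, stride=1):
        super().__init__()
        out_channels = mid_channels * self.expansion
        self.conv1 = nn.Conv2d(in_channels, mid_channels, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_channels)
        self.conv2 = nn.Conv2d(mid_channels, mid_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(mid_channels)
        self.conv3 = nn.Conv2d(mid_channels, out_channels, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channels)
        # Project the input when its shape changes so the residual addition stays valid
        self.downsample = None
        if stride != 1 or in_channels != out_channels:
            self.downsample = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels))

    def forward(self, x):
        identity = self.downsample(x) if self.downsample else x
        out = F.relu(self.bn1(self.conv1(x)))
        out = F.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return F.relu(out + identity)  # skip connection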

Building the network:

The code below first hand-builds a small residual network with PyTorch for illustration (a ResNet-18-style design with two-convolution basic blocks), then switches to torchvision's pretrained ResNet50, adapting its first convolution to single-channel grayscale input and its final layer to two classes.

import torchvision
from torch import nn
import torch.nn.functional as F

class Residual(nn.Module):
    """Basic residual block: two 3x3 convolutions plus a skip connection."""
    def __init__(self, input_channels, num_channels, use_conv=False, strides=1):
        super().__init__()
        self.conv1 = nn.Conv2d(input_channels, num_channels, kernel_size=3, padding=1, stride=strides)
        self.conv2 = nn.Conv2d(num_channels, num_channels, kernel_size=3, padding=1)
        if use_conv:
            # 1x1 convolution to match shapes when the block changes size/channels
            self.conv3 = nn.Conv2d(input_channels, num_channels, kernel_size=1, stride=strides)
        else:
            self.conv3 = None
        self.bn1 = nn.BatchNorm2d(num_channels)
        self.bn2 = nn.BatchNorm2d(num_channels)

    def forward(self, X):
        Y = F.relu(self.bn1(self.conv1(X)))
        Y = self.bn2(self.conv2(Y))
        if self.conv3:
            X = self.conv3(X)
        Y += X  # skip connection: add the (possibly projected) input
        return F.relu(Y)

# Stem: 7x7 convolution + max pooling
b1 = nn.Sequential(nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
                   nn.BatchNorm2d(64), nn.ReLU(),
                   nn.MaxPool2d(kernel_size=3, stride=2, padding=1))

def resnet_block(input_channels, num_channels, num_residuals, first_block=False):
    blk = []
    for i in range(num_residuals):
        if i == 0 and not first_block:
            # First block of a stage halves the feature map and changes the channel count
            blk.append(Residual(input_channels, num_channels, use_conv=True, strides=2))
        else:
            blk.append(Residual(num_channels, num_channels))
    return blk

b2 = nn.Sequential(*resnet_block(64, 64, 2, first_block=True))
b3 = nn.Sequential(*resnet_block(64, 128, 2))
b4 = nn.Sequential(*resnet_block(128, 256, 2))
b5 = nn.Sequential(*resnet_block(256, 512, 2))
# Illustrative hand-built network (ResNet-18-style); replaced by the pretrained model below
net = nn.Sequential(b1, b2, b3, b4, b5,
                    nn.AdaptiveAvgPool2d((1, 1)),
                    nn.Flatten(), nn.Linear(512, 10))

# The model actually used: torchvision's pretrained ResNet50
net = torchvision.models.resnet50(pretrained=True)
# Adapt the first convolution to 1-channel (grayscale) input; its pretrained weights are discarded
net.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
# Replace the classifier head with a 2-class output (crop vs. weed)
num_features = net.fc.in_features
net.fc = nn.Linear(num_features, 2)
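As a quick sanity check, one can pass a dummy grayscale batch through the adapted network to confirm it emits two logits. A minimal sketch:

net.eval()  # inference behavior for BatchNorm during the check
with torch.no_grad():
    dummy = torch.randn(1, 1, 224, 224)  # one grayscale image; adaptive pooling accepts other sizes too
    print(net(dummy).shape)              # expected: torch.Size([1, 2])
net.train()  # restore training mode before the training loop below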
Training on the GPU:

Set the training parameters: the loss function is cross-entropy, and the optimizer is stochastic gradient descent (SGD) with momentum.

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net.to(device).half()  # train in float16 to cut memory use

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

Train for 19 epochs on the GPU and export the model.

from tqdm import tqdm

for epoch in range(1, 20):
    running_loss = 0.0
    num_images = 0
    loop = tqdm(enumerate(train_dataloader, 0))

    for step, data in loop:
        # Each batch is (labels, images); see the TensorDataset construction above
        labels, inputs = data[0].to(device).float(), data[1].to(device).float()
        optimizer.zero_grad()
        inputs = inputs.half()  # match the half-precision model

        outputs = net(inputs)
        # Cross-entropy loss expects integer class indices as targets
        loss = criterion(outputs, labels.long())

        loss.backward()
        optimizer.step()

        num_images += inputs.size(0)
        running_loss += loss.item()
        loop.set_description(f'Epoch [{epoch}/19]')
        loop.set_postfix(loss=running_loss / (step + 1))

print('Finished training!')

torch.save(net.state_dict(), '/content/drive/MyDrive/weeds/detectmodles/resnet.pth')

Testing the model:

Measure the exported model's prediction time and F1 score on the test set.

import time
from sklearn.metrics import f1_score

all_predictions = []
all_labels = []
net = net.float()  # convert back to float32 for evaluation
net.eval()         # inference mode: freeze BatchNorm statistics
start_time = time.time()
with torch.no_grad():
    for data in test_dataloader:
        # Batches are (labels, images); cast labels to integer class ids
        images, labels = data[1].to(device).float(), data[0].to(device).long()
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        all_predictions.extend(predicted.cpu().numpy())
        all_labels.extend(labels.cpu().numpy())

end_time = time.time()
elapsed_time = end_time - start_time
print(f'Time on the test set: {elapsed_time:.2f} seconds')
f1 = f1_score(all_labels, all_predictions, average='binary')
print(f'Test F1 score: {f1:.4f}')

The results are as follows:

Pick one image from the test set for display:

import matplotlib.pyplot as plt
import numpy as np

# Grab one batch from the test loader; batches are (labels, images)
true_labels, sample_images = next(iter(test_dataloader))
sample_images = sample_images.to(device).float()

with torch.no_grad():
    model_output = net(sample_images)

_, predicted_labels = torch.max(model_output, 1)

sample_image = sample_images.cpu().numpy()[0]  # first image of the batch, shape (1, H, W)
predicted_label = predicted_labels[0].item()
true_label = true_labels[0].long().item()

# Class names (index 0 = crop, index 1 = weed)
class_labels = ['crop', 'weed']

# The tensors are single-channel, so drop the channel axis and show as grayscale
plt.imshow(sample_image.squeeze(0), cmap='gray')
plt.title(f'TRUE LABEL IS: {class_labels[true_label]}, PREDICT LABEL IS: {class_labels[predicted_label]}')
plt.axis('off')
plt.show()

The result is as follows:

4. Moving the Model to the CPU

Rebuilding the ResNet50 model:

Rebuild the same model definition used for training: torchvision's ResNet50 with a single-channel first convolution and a two-class head.

resnet50_model = torchvision.models.resnet50(pretrained=True)
resnet50_model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
num_features = resnet50_model.fc.in_features
resnet50_model.fc = nn.Linear(num_features, 2)

Load the trained weights and move the model to the CPU.

resnet50_model.load_state_dict(torch.load('resnet.pth', map_location=torch.device('cpu')))
net = resnet50_model
net.to('cpu')
CPU testing:

On the CPU, using the course's public test set, the prediction time is 9.96 s and the F1 score is 0.9796.

all_predictions = []
all_labels = []
net = net.float()
net.eval()  # inference mode: freeze BatchNorm statistics
start_time = time.time()
with torch.no_grad():
    for data in test_dataloader:
        # Batches are (labels, images); cast labels to integer class ids
        images, labels = data[1].to(torch.device('cpu')).float(), data[0].to(torch.device('cpu')).long()
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        all_predictions.extend(predicted.cpu().numpy())
        all_labels.extend(labels.cpu().numpy())

end_time = time.time()
elapsed_time = end_time - start_time
print(f'Time on the test set: {elapsed_time:.2f} seconds')
f1 = f1_score(all_labels, all_predictions, average='binary')
print(f'Test F1 score: {f1:.4f}')

The test results are as follows:

5. Optimizing with oneAPI Tools

oneAPI optimization:

Here we use Intel Extension for PyTorch (IPEX), part of Intel's PyTorch optimizations, to speed up the model. Redefine the loss function and optimizer, then optimize the model and optimizer with the oneAPI component:

import intel_extension_for_pytorch as ipex
from torch import optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001, weight_decay=1e-4)
net.train()  # the training path of ipex.optimize expects the model in train mode
# ipex.optimize returns the IPEX-optimized model and optimizer
net, optimizer = ipex.optimize(net, optimizer=optimizer)
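For inference-only deployment, IPEX also offers an optimizer-free call: put the model in eval mode and pass it alone to ipex.optimize. A minimal sketch, assuming no further training is needed:

import intel_extension_for_pytorch as ipex

net.eval()                # the inference path of ipex.optimize expects eval mode
net = ipex.optimize(net)  # returns a copy optimized for CPU inference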
Testing the optimized model:

Test again with the course's public test set.

all_predictions = []
all_labels = []
net = net.float()
net.eval()  # inference mode: freeze BatchNorm statistics
start_time = time.time()
with torch.no_grad():
    for data in test_dataloader:
        # Batches are (labels, images); cast labels to integer class ids
        images, labels = data[1].to(torch.device('cpu')).float(), data[0].to(torch.device('cpu')).long()
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        all_predictions.extend(predicted.cpu().numpy())
        all_labels.extend(labels.cpu().numpy())

end_time = time.time()
elapsed_time = end_time - start_time
print(f'Time on the test set: {elapsed_time:.2f} seconds')
f1 = f1_score(all_labels, all_predictions, average='binary')
print(f'Test F1 score: {f1:.4f}')

The results are as follows:

After oneAPI optimization, the prediction time is 6.25 s with an F1 score of 0.9796. The prediction time drops by 3.71 s, a substantial speedup, while the F1 score is unchanged, so no accuracy is lost. Since this test set is small, the gap in prediction time, and hence the benefit of the oneAPI components, should be even more pronounced on larger test sets.
