Drone Visual Localization Dataset

Author: 计算机C9硕士，算法工程师，为奥迪在奋斗

Published 2024-09-27 07:24:46

Article link: https://blog.csdn.net/2401_83580557/article/details/142583281

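This drone visual localization dataset pairs downward-looking ground images captured by drones with the corresponding remote sensing imagery. Matching the two allows a drone to be localized quickly and precisely without accumulating error, which makes the approach a useful complement to current integrated drone navigation systems. The drone images were collected in multiple regions of China, covering diverse terrain and most parts of the country; the reference base-map imagery consists of satellite images obtained from Google Maps. The dataset is intended to support the training and testing of drone visual localization models by providing diverse data. It contains 6,742 drone images and 11 satellite images. The spatial resolution is 0.1-0.2 m for the drone images and 0.3 m for the satellite imagery, and the dataset is 16.4 GB in size.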

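Drone Visual Localization Dataset (DVLD)

Abstract

The Drone Visual Localization Dataset (DVLD) is designed specifically for drone visual localization. It supports fast, precise localization by matching downward-looking ground images captured by a drone against the corresponding remote sensing imagery. The dataset contains 6,742 drone images and 11 satellite images and covers different terrain types across multiple regions of China. The spatial resolution is 0.1-0.2 m for the drone images and 0.3 m for the satellite images, and the total data volume is about 16.4 GB. DVLD provides diverse data for training and testing drone visual localization models and can serve as a useful complement to current integrated drone navigation systems.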

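Dataset Features

Diverse terrain: the dataset covers different terrain types across multiple regions of China, including urban areas, rural areas, mountains, and plains.
High-resolution imagery: drone images have a spatial resolution of 0.1-0.2 m and satellite images 0.3 m, so fine ground detail is preserved.
Precise matching: every drone image is matched to its corresponding satellite image, which makes the data suitable for visual localization tasks.
Large sample size: 6,742 drone images and 11 satellite images provide ample training and test data.
Standard formats: images are stored in standard JPEG or TIFF format, which keeps processing and analysis simple.
Ready to use: the dataset is already split into training, validation, and test sets and can be used directly to train visual localization models.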
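
Because the two image sources have different spatial resolutions, a length measured in drone pixels does not correspond to the same number of satellite pixels. The sketch below shows the conversion under the resolutions stated above; the function name `footprint_in_satellite_pixels` is hypothetical and is not part of the dataset's tooling.

```python
def footprint_in_satellite_pixels(drone_px, drone_gsd_m, satellite_gsd_m=0.3):
    """Convert a length in drone pixels to the equivalent length in satellite pixels.

    drone_gsd_m and satellite_gsd_m are ground sample distances in metres per pixel.
    The 0.3 m default is the satellite resolution stated for this dataset.
    """
    if drone_gsd_m <= 0 or satellite_gsd_m <= 0:
        raise ValueError("ground sample distances must be positive")
    ground_m = drone_px * drone_gsd_m      # length on the ground, in metres
    return ground_m / satellite_gsd_m      # same length, in satellite pixels


# Illustrative values: a 1000-pixel-wide drone image at both ends of the 0.1-0.2 m range
for gsd in (0.1, 0.2):
    px = footprint_in_satellite_pixels(1000, gsd)
    print(f"1000 drone px at {gsd} m/px is about {px:.1f} satellite px")
```

In practice this ratio is what one would use to rescale a drone image before comparing it with a satellite crop, so that both show the ground at the same scale.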

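Dataset Composition

Total number of images:
- Drone images: 6,742
- Satellite images: 11
Spatial resolution:
- Drone images: 0.1-0.2 m
- Satellite images: 0.3 m
Data volume: about 16.4 GB
Dataset split:
- Training set: about 5,000 drone images
- Validation set: about 1,000 drone images
- Test set: about 742 drone images
Dataset structure: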

drone_visual_localization_dataset/
├── drone_images/
│   ├── train/
│   ├── val/
│   └── test/
└── satellite_images/
    ├── satellite_image_01.tif
    ├── satellite_image_02.tif
    └── ...

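The drone_images/ directory holds the downward-looking ground images captured by drones.
The satellite_images/ directory holds the satellite images obtained from Google Maps.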
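
A quick sanity check after downloading is to count the images in each folder and compare the result with the split sizes listed above. This is a minimal sketch that assumes the directory layout shown in the tree and the JPEG/TIFF extensions mentioned earlier; `count_images` and the root path are placeholders.

```python
from pathlib import Path

# Extensions assumed from the "JPEG or TIFF" description of the dataset
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".tif", ".tiff"}


def _count(directory):
    """Count image files directly inside a directory (0 if it does not exist)."""
    if not directory.is_dir():
        return 0
    return sum(
        1 for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def count_images(root):
    """Return image counts for each drone split and for the satellite folder."""
    root = Path(root)
    counts = {split: _count(root / "drone_images" / split)
              for split in ("train", "val", "test")}
    counts["satellite"] = _count(root / "satellite_images")
    return counts


# Placeholder path; replace it with the actual dataset location
print(count_images("path/to/drone_visual_localization_dataset"))
```

If the counts differ noticeably from roughly 5,000 / 1,000 / 742 drone images and 11 satellite images, the download or extraction is probably incomplete.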

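Annotations

Every drone image is matched to its corresponding satellite image and comes with geographic coordinates (latitude and longitude). The annotation file is typically provided in CSV or JSON format and contains the following fields:

image_id: unique identifier of the drone image.
latitude: latitude of the center point of the drone image.
longitude: longitude of the center point of the drone image.
satellite_image_id: unique identifier of the corresponding satellite image.
x_offset: offset of the drone image along the X axis of the satellite image.
y_offset: offset of the drone image along the Y axis of the satellite image.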
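
The latitude and longitude fields make it possible to express the distance between two image centers, or between a predicted and a true position, in meters. The following sketch uses the standard haversine formula with a spherical Earth; the coordinates are illustrative values and `haversine_m` is a hypothetical helper.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Two illustrative image centers a few hundred-thousandths of a degree apart
d = haversine_m(39.9042, 116.4074, 39.9045, 116.4078)
print(f"distance between centers: {d:.1f} m")
```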

Example annotation file (CSV format):

image_id,latitude,longitude,satellite_image_id,x_offset,y_offset
000001.jpg,39.9042,116.4074,satellite_image_01.tif,100,200
000002.jpg,39.9045,116.4078,satellite_image_01.tif,110,210
...
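
For readers who prefer not to depend on pandas, the CSV layout above can also be parsed with the standard library. This is a sketch that assumes exactly the six columns shown in the example; the `Annotation` class and `load_annotations` function are illustrative names.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    image_id: str
    latitude: float
    longitude: float
    satellite_image_id: str
    x_offset: int
    y_offset: int


def load_annotations(fileobj):
    """Parse annotation rows from an open CSV file object into typed records."""
    records = []
    for row in csv.DictReader(fileobj):
        lat, lon = float(row["latitude"]), float(row["longitude"])
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            raise ValueError(f"coordinates out of range for {row['image_id']}")
        records.append(Annotation(
            image_id=row["image_id"],
            latitude=lat,
            longitude=lon,
            satellite_image_id=row["satellite_image_id"],
            x_offset=int(row["x_offset"]),
            y_offset=int(row["y_offset"]),
        ))
    return records


# The two sample rows from the example annotation file
SAMPLE = """image_id,latitude,longitude,satellite_image_id,x_offset,y_offset
000001.jpg,39.9042,116.4074,satellite_image_01.tif,100,200
000002.jpg,39.9045,116.4078,satellite_image_01.tif,110,210
"""

records = load_annotations(io.StringIO(SAMPLE))
print(records[0])
```

With a real file, the same function can be called as `load_annotations(open(path, newline=""))`.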

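Example Code

The following Python script loads one drone image/satellite image pair from the dataset and visualizes the match. It is followed by the basic steps for training with a common deep learning framework (PyTorch).

Loading and visualizing an image pair and its match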

import os
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# Dataset directory paths
data_dir = 'path/to/dvld_dataset'
drone_image_dir = os.path.join(data_dir, 'drone_images/train')
satellite_image_dir = os.path.join(data_dir, 'satellite_images')

# Read the annotation file
annotations_path = os.path.join(data_dir, 'annotations.csv')
annotations = pd.read_csv(annotations_path)

# Pick one drone image and its corresponding satellite image
drone_image_file = annotations.iloc[0]['image_id']
satellite_image_file = annotations.iloc[0]['satellite_image_id']

drone_image_path = os.path.join(drone_image_dir, drone_image_file)
# satellite_image_id already includes the .tif extension in the sample annotations
satellite_image_path = os.path.join(satellite_image_dir, satellite_image_file)

# Load the images (cv2.imread returns None instead of raising if a file is missing)
drone_image = cv2.imread(drone_image_path, cv2.IMREAD_COLOR)
satellite_image = cv2.imread(satellite_image_path, cv2.IMREAD_COLOR)
if drone_image is None or satellite_image is None:
    raise FileNotFoundError('Could not read the drone or satellite image')

# Get the match information
x_offset = int(annotations.iloc[0]['x_offset'])
y_offset = int(annotations.iloc[0]['y_offset'])

# Visualize the match
fig, ax = plt.subplots(1, 2, figsize=(12, 6))

ax[0].imshow(cv2.cvtColor(drone_image, cv2.COLOR_BGR2RGB))
ax[0].set_title('Drone Image')
ax[0].axis('off')

ax[1].imshow(cv2.cvtColor(satellite_image, cv2.COLOR_BGR2RGB))
rect = Rectangle((x_offset, y_offset), drone_image.shape[1], drone_image.shape[0], linewidth=2, edgecolor='r', facecolor='none')
ax[1].add_patch(rect)
ax[1].set_title('Satellite Image with Matched Region')
ax[1].axis('off')

plt.show()
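
When cropping the matched region out of the satellite image instead of only drawing it, the box needs to be clipped to the image bounds. The sketch below assumes that x_offset and y_offset mark the top-left corner of the region in satellite pixels, which the annotation description does not state explicitly; `matched_region` is a hypothetical helper.

```python
def matched_region(x_offset, y_offset, region_w, region_h, sat_w, sat_h):
    """Return (left, top, right, bottom) of the matched region, clipped to the satellite image.

    Assumes the offsets are the top-left corner in satellite pixels and that
    region_w / region_h are already expressed in satellite pixels.
    Returns None when the region lies entirely outside the image.
    """
    left = max(0, x_offset)
    top = max(0, y_offset)
    right = min(sat_w, x_offset + region_w)
    bottom = min(sat_h, y_offset + region_h)
    if right <= left or bottom <= top:
        return None
    return left, top, right, bottom


# Illustrative numbers: a 50x40 region inside a 1000x1000 satellite image
print(matched_region(100, 200, 50, 40, 1000, 1000))
```

With a NumPy image array the result can be used directly as `satellite_image[top:bottom, left:right]`.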

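Training with PyTorch

Assuming PyTorch is installed and a configuration file (for example, config.yaml) is ready, the basic steps for training with PyTorch are as follows:

Install dependencies: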
```
pip install torch torchvision
```

Create the data loader: define a simple dataset and data loader that load the drone images and satellite images.

import os

import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image

class DroneLocalizationDataset(Dataset):
    def __init__(self, drone_image_dir, satellite_image_dir, annotations, transform=None):
        self.drone_image_dir = drone_image_dir
        self.satellite_image_dir = satellite_image_dir
        self.annotations = annotations
        self.transform = transform

    def __len__(self):
        return len(self.annotations)

    def __getitem__(self, idx):
        row = self.annotations.iloc[idx]
        drone_image_file = row['image_id']
        satellite_image_file = row['satellite_image_id']
        x_offset = row['x_offset']
        y_offset = row['y_offset']

        drone_image_path = os.path.join(self.drone_image_dir, drone_image_file)
        # satellite_image_id already includes the .tif extension in the sample annotations
        satellite_image_path = os.path.join(self.satellite_image_dir, satellite_image_file)

        drone_image = Image.open(drone_image_path).convert('RGB')
        satellite_image = Image.open(satellite_image_path).convert('RGB')

        if self.transform:
            drone_image = self.transform(drone_image)
            satellite_image = self.transform(satellite_image)

        return drone_image, satellite_image, (x_offset, y_offset)

# Preprocessing: resize both images to 256x256 and convert them to tensors
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
])

train_dataset = DroneLocalizationDataset(drone_image_dir, satellite_image_dir, annotations, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=4, shuffle=True, num_workers=2)
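
One thing to keep in mind with this loader is that the satellite image is resized to 256x256 while the offsets stay in original satellite pixels. A possible way to keep the regression target independent of image size is to normalize the offsets by the satellite image dimensions. This is only a sketch of that idea, not something the dataset prescribes; both function names are hypothetical.

```python
def normalize_offset(x_offset, y_offset, sat_w, sat_h):
    """Map pixel offsets to fractions of the satellite image width and height."""
    if sat_w <= 0 or sat_h <= 0:
        raise ValueError("satellite image dimensions must be positive")
    return x_offset / sat_w, y_offset / sat_h


def denormalize_offset(nx, ny, sat_w, sat_h):
    """Map normalized offsets back to pixel coordinates in the original satellite image."""
    return nx * sat_w, ny * sat_h


# Illustrative numbers: offsets (100, 200) in a 1000x2000 satellite image
nx, ny = normalize_offset(100, 200, 1000, 2000)
print(nx, ny)
print(denormalize_offset(nx, ny, 1000, 2000))
```

Normalized targets also keep the MSE loss on a comparable scale across satellite images of different sizes.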

Define the model: use PyTorch to define a simple convolutional neural network.

import torch.nn as nn
import torch.nn.functional as F

class LocalizationModel(nn.Module):
    def __init__(self):
        super(LocalizationModel, self).__init__()
        
        # Drone image branch
        self.drone_conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.drone_pool1 = nn.MaxPool2d(2, 2)
        self.drone_conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.drone_pool2 = nn.MaxPool2d(2, 2)
        self.drone_fc1 = nn.Linear(64 * 64 * 64, 512)
        
        # Satellite image branch
        self.satellite_conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.satellite_pool1 = nn.MaxPool2d(2, 2)
        self.satellite_conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.satellite_pool2 = nn.MaxPool2d(2, 2)
        self.satellite_fc1 = nn.Linear(64 * 64 * 64, 512)
        
        # Fusion layer
        self.fusion_fc1 = nn.Linear(512 + 512, 256)
        self.fusion_fc2 = nn.Linear(256, 2)  # Output (x_offset, y_offset)

    def forward(self, drone_image, satellite_image):
        # Drone image branch
        x = F.relu(self.drone_conv1(drone_image))
        x = self.drone_pool1(x)
        x = F.relu(self.drone_conv2(x))
        x = self.drone_pool2(x)
        x = x.view(-1, 64 * 64 * 64)
        x = F.relu(self.drone_fc1(x))
        
        # Satellite image branch
        y = F.relu(self.satellite_conv1(satellite_image))
        y = self.satellite_pool1(y)
        y = F.relu(self.satellite_conv2(y))
        y = self.satellite_pool2(y)
        y = y.view(-1, 64 * 64 * 64)
        y = F.relu(self.satellite_fc1(y))
        
        # Fusion
        z = torch.cat((x, y), dim=1)
        z = F.relu(self.fusion_fc1(z))
        z = self.fusion_fc2(z)
        
        return z

model = LocalizationModel()
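
The hard-coded `64 * 64 * 64` in the fully connected layers only holds for 256x256 inputs. The short calculation below applies the standard output-size formula for convolution and pooling layers to the layer settings used in the model above, and shows how the flattened size and the parameter count of the first fully connected layer follow from it. The helper names are illustrative.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def branch_feature_size(input_size=256, channels=64):
    """Flattened feature count after conv(3, pad 1) -> pool(2) -> conv(3, pad 1) -> pool(2)."""
    size = conv2d_out(input_size, kernel=3, stride=1, padding=1)  # conv1 keeps the size
    size = conv2d_out(size, kernel=2, stride=2)                   # pool1 halves it
    size = conv2d_out(size, kernel=3, stride=1, padding=1)        # conv2 keeps the size
    size = conv2d_out(size, kernel=2, stride=2)                   # pool2 halves it again
    return channels * size * size


flat = branch_feature_size(256)
fc1_params = flat * 512 + 512  # weights plus biases of Linear(flat, 512)
print(flat)        # 262144, i.e. 64 * 64 * 64
print(fc1_params)  # 134218240
```

Each branch's first linear layer therefore holds about 134 million parameters, which dominates the model size; changing the input resolution requires updating the `Linear` input size accordingly.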

Define the loss function and optimizer:

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

Train the model:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

num_epochs = 50

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for drone_images, satellite_images, offsets in train_loader:
        drone_images = drone_images.to(device)
        satellite_images = satellite_images.to(device)
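        # The default collate function turns each (x_offset, y_offset) tuple into a list of
        # two tensors of shape (batch,); stack them into one (batch, 2) float tensor
        offsets = torch.stack(offsets, dim=1).float().to(device)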

        optimizer.zero_grad()

        outputs = model(drone_images, satellite_images)
        loss = criterion(outputs, offsets)

        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(train_loader):.4f}')

print('Training complete.')

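Evaluate the model:

# val_loader is not defined above; it is assumed to be a DataLoader built the same way as
# train_loader, but over drone_images/val and the matching annotations
model.eval()
with torch.no_grad():
    for drone_images, satellite_images, offsets in val_loader:
        drone_images = drone_images.to(device)
        satellite_images = satellite_images.to(device)
        # Stack the collated (x, y) offsets into a (batch, 2) float tensor
        offsets = torch.stack(offsets, dim=1).float().to(device)

        outputs = model(drone_images, satellite_images)
        loss = criterion(outputs, offsets)

        # Further evaluation metrics, such as mean error, can be added here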
        print(f'Validation Loss: {loss.item():.4f}')
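
As one example of the additional metrics mentioned in the comment above, the mean Euclidean error between predicted and true offsets can be reported in pixels and, using the satellite resolution, in meters. This sketch assumes the offsets are expressed in satellite pixels at 0.3 m per pixel; `mean_localization_error` is a hypothetical helper that works on plain lists of (x, y) pairs.

```python
import math


def mean_localization_error(predicted, target, gsd_m=0.3):
    """Return the mean Euclidean offset error as (pixels, metres).

    predicted and target are equal-length sequences of (x, y) offsets.
    gsd_m is the assumed satellite ground sample distance in metres per pixel.
    """
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("predicted and target must be non-empty and equally long")
    errors = [math.hypot(px - tx, py - ty)
              for (px, py), (tx, ty) in zip(predicted, target)]
    mean_px = sum(errors) / len(errors)
    return mean_px, mean_px * gsd_m


# Illustrative predictions against two ground-truth offsets
pred = [(103.0, 204.0), (110.0, 210.0)]
true = [(100.0, 200.0), (110.0, 210.0)]
mean_px, mean_m = mean_localization_error(pred, true)
print(f"mean error: {mean_px:.2f} px, {mean_m:.2f} m")
```

Inside the evaluation loop, the model outputs and offsets can be converted with `.cpu().tolist()` before being passed to such a function.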

Inference and visualization:

def visualize_predictions(model, data_loader, device, n_samples=5):
    model.eval()
    with torch.no_grad():
        for i, (drone_images, satellite_images, offsets) in enumerate(data_loader):
            if i >= n_samples:
                break
            
            drone_images = drone_images.to(device)
            satellite_images = satellite_images.to(device)
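            # Stack the collated (x, y) offsets into a (batch, 2) tensor of ground-truth values
            offsets = torch.stack(offsets, dim=1).float().to(device)

            outputs = model(drone_images, satellite_images)

            fig, axs = plt.subplots(1, 2, figsize=(12, 6))
            # Tensors produced by ToTensor() are already RGB in [0, 1]; no BGR conversion is needed
            axs[0].imshow(drone_images[0].cpu().permute(1, 2, 0).numpy())
            axs[0].set_title('Drone Image')
            axs[0].axis('off')

            axs[1].imshow(satellite_images[0].cpu().permute(1, 2, 0).numpy())
            # Rectangle takes (x, y), width, height; the tensor layout is (batch, channel, height, width).
            # Note: predicted offsets are in original satellite pixels, while the image shown was resized to 256x256
            rect = Rectangle((outputs[0, 0].item(), outputs[0, 1].item()), drone_images.shape[3], drone_images.shape[2], linewidth=2, edgecolor='r', facecolor='none')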
            axs[1].add_patch(rect)
            axs[1].set_title('Satellite Image with Predicted Region')
            axs[1].axis('off')

            plt.show()

# Visualize predictions on the validation set
visualize_predictions(model, val_loader, device, n_samples=5)

With the steps above, you can use this dataset to train a baseline drone visual localization model.