Visualizing eloftr (EfficientLoFTR) Feature Matching Results

Candice_jy

Last modified: 2024-07-23 15:41:21

Tags: python opencv

First published: 2024-07-23 15:38:08

Link to this post: https://blog.csdn.net/qq_52039107/article/details/140634975

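1. Reproducing the EfficientLoFTR Results

Paper: zju3dv.github.io/efficientloftr/files/EfficientLoFTR.pdf

Code: zju3dv/EfficientLoFTR (github.com)

Test run: the weights folder contains only the outdoor weights, so testing focuses on the outdoor dataset. The command is:

bash scripts/reproduce_test/outdoor_full_auc.sh

The resulting metrics are listed and explained in Section 2.2.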

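2. Explanation of the Evaluation Metrics

2.1 The AUC metric

Consider a classification problem in which every sample has a confidence score (or, for other tasks, an error). Choosing a threshold on that score splits the predictions into positives and negatives, and each threshold gives its own confusion matrix, so sweeping the threshold produces many confusion matrices.

TPR and FPR are defined as follows; each confusion matrix yields one TPR and one FPR:

TPR = TP / (TP + FN)
FPR = FP / (FP + TN)

The ROC curve plots TPR against FPR, and every point on the curve corresponds to one confusion matrix.

The denominators of TPR and FPR are fixed: they are the numbers of positive and negative samples in the dataset, respectively.

For the numerators, we want TP to be larger and FP to be smaller, i.e. a higher TPR and a lower FPR.

This corresponds to the upper-left region of the ROC plot.

Quantitative evaluation: the larger the area under the curve (AUC), the better.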
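The threshold sweep described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a library implementation; the function names `roc_curve_points` and `roc_auc` and the toy scores and labels are made up for the example.

```python
import numpy as np

def roc_curve_points(scores, labels):
    """Return (fpr, tpr) arrays, one point per threshold (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    # Sweep thresholds from high to low; +inf gives the (0, 0) starting point.
    thresholds = np.concatenate(([np.inf], np.unique(scores)[::-1]))
    n_pos = labels.sum()      # fixed denominator of TPR
    n_neg = (~labels).sum()   # fixed denominator of FPR
    tpr, fpr = [], []
    for thr in thresholds:
        pred = scores >= thr  # one threshold -> one confusion matrix
        tpr.append(np.sum(pred & labels) / n_pos)
        fpr.append(np.sum(pred & ~labels) / n_neg)
    return np.array(fpr), np.array(tpr)

def roc_auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2))

# Toy data (invented): six samples with confidence scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
fpr, tpr = roc_curve_points(scores, labels)
print(roc_auc(fpr, tpr))  # 8/9, about 0.889, for this toy data
```

A perfect classifier reaches the upper-left corner (FPR 0, TPR 1) and gets an AUC of 1.0.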

Reference: 【小萌五分钟】 Machine Learning | Model Evaluation: ROC Curve and AUC (bilibili video)

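2.2 eloftr evaluation metrics

auc@10: 0.7166339555133289

AUC@10: the area under the cumulative pose-error curve up to a threshold of 10 degrees, normalized by the threshold. In LoFTR-style evaluation code, the pose error of an image pair is the larger of the angular errors in rotation and translation of the relative pose estimated from the matches. Note that this is not the ROC AUC of Section 2.1, and the threshold is in degrees, not pixels. Higher is better.

The value 0.7166 indicates good pose accuracy at the 10-degree threshold.

auc@20: 0.8318667754522098

AUC@20: the same area, with the pose-error threshold at 20 degrees.

The value 0.8319 is higher because the threshold is looser.

auc@5: 0.5551221177666241

AUC@5: the same area, with the pose-error threshold at 5 degrees.

The value 0.5551 is lower because this is the strictest of the three thresholds.

num_matches: 3288.1826666666666

num_matches: the number of matched point pairs the model finds. The fractional value indicates that it is an average over the image pairs in the test set.

prec@5e-04: 0.96871016885266293

Precision@5e-04: precision at a threshold of 5e-04 (0.0005). Precision is the fraction of all predicted matches that are actually correct; here a match counts as correct when its epipolar error, measured against the ground-truth relative pose, is below the threshold.

The value 0.9687 means that about 96.9% of the predicted matches have an epipolar error below 0.0005, i.e. the predicted matches are very accurate.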
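To make the auc@5/10/20 numbers more concrete, here is a minimal NumPy sketch of how an area under a cumulative error curve is typically computed in LoFTR-style evaluation: sort the per-pair pose errors (in degrees), build the recall curve, and integrate it up to each threshold. The function name `pose_error_auc` and the sample errors are assumptions for illustration; this is not copied from the repository.

```python
import numpy as np

def pose_error_auc(errors, thresholds=(5, 10, 20)):
    """Area under the cumulative error curve, normalized to [0, 1] per threshold.

    errors: one pose error per image pair, in degrees (hypothetical input).
    """
    errors = np.sort(np.asarray(errors, dtype=float))
    recall = np.arange(1, len(errors) + 1) / len(errors)
    # Prepend the origin so the curve starts at (0, 0).
    errors = np.concatenate(([0.0], errors))
    recall = np.concatenate(([0.0], recall))
    aucs = {}
    for thr in thresholds:
        last = np.searchsorted(errors, thr)  # number of curve points below thr
        x = np.concatenate((errors[:last], [thr]))
        y = np.concatenate((recall[:last], [recall[last - 1]]))
        area = np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2)  # trapezoids
        aucs[f"auc@{thr}"] = float(area / thr)
    return aucs

# Invented pose errors for four image pairs, in degrees.
print(pose_error_auc([1.0, 3.0, 8.0, 30.0]))
```

Because a looser threshold lets more pairs count and integrates over a wider range, auc@20 is never smaller than auc@10, which in turn is never smaller than auc@5, matching the ordering of the values reported above.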

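3. Visualizing the Matching Results

The test run does not produce an intuitive picture of the feature matches, so I added some code of my own to view them.

3.1 Where the matches are stored: the batch variable

File: src/lightning/lightning_loftr.py

test_step: handles each test batch. It runs the model's forward pass, records the computation time, and computes the metrics.

test_epoch_end: after testing finishes, aggregates the metrics of all batches, computes the average matching time, and logs the test results.

The intermediate variable batch (a dict) has keys including:

'bs' = {int} 1

'pair_names' = {list:2} [['xxx.jpg'],['xxx.jpg']]

'image0' = {Tensor:(1,1,480,640)}

'image1' = {Tensor:(1,1,480,640)}

'mkpts0_f' = {Tensor:(2339,2)}

'mkpts1_f' = {Tensor:(2339,2)}

'mconf' = {Tensor:(2339,)}

The values in 'image0' and 'image1' lie in the range 0-1, presumably because the images have been normalized, so I did not use them for the visualization; instead I load the original images directly via 'pair_names'.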
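If you would rather draw on the network input itself than reload the files, the normalized tensor can be turned back into a displayable image. The sketch below assumes the array comes from `batch['image0'].cpu().numpy()`, has shape (1, 1, H, W), and holds grayscale values in [0, 1] as observed above; the function name `gray_batch_to_bgr` is hypothetical.

```python
import numpy as np

def gray_batch_to_bgr(image):
    """Convert a (1, 1, H, W) float array in [0, 1] to an (H, W, 3) uint8 image."""
    img = np.asarray(image, dtype=np.float32)
    if img.ndim != 4 or img.shape[0] != 1 or img.shape[1] != 1:
        raise ValueError(f"expected shape (1, 1, H, W), got {img.shape}")
    img = img[0, 0]                                     # drop batch and channel dims
    img = np.round(np.clip(img, 0.0, 1.0) * 255).astype(np.uint8)
    return np.stack([img, img, img], axis=-1)           # gray -> 3 identical channels

# Stand-in for batch['image0'].cpu().numpy()
fake_image0 = np.random.rand(1, 1, 480, 640).astype(np.float32)
vis = gray_batch_to_bgr(fake_image0)
print(vis.shape, vis.dtype)
```

The result is grayscale, so reloading the files via 'pair_names' remains the way to get a color visualization.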

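3.2 Full code

    # These two methods are added to the LightningModule class in
    # src/lightning/lightning_loftr.py. They need these imports at the top
    # of that file:
    #   import os
    #   import cv2
    #   import numpy as np
    #   import matplotlib.pyplot as plt

    def draw_matches(self, batch):
        # Load the original color images named in pair_names
        img0_path = os.path.join('data/scannet/test/', batch['pair_names'][0][0])
        img1_path = os.path.join('data/scannet/test/', batch['pair_names'][1][0])
        image0 = cv2.imread(img0_path)
        image1 = cv2.imread(img1_path)
        if image0 is None or image1 is None:  # cv2.imread returns None on failure
            raise FileNotFoundError(f'Cannot read {img0_path} or {img1_path}')

        # Assumption: the keypoints are in the coordinate frame of the resized
        # network input, so bring the loaded images to that size
        # (a no-op if they already match)
        height1, width1 = batch['image0'].shape[-2:]
        height2, width2 = batch['image1'].shape[-2:]
        image0 = cv2.resize(image0, (width1, height1))
        image1 = cv2.resize(image1, (width2, height2))

        # Move the matches and their confidences from the GPU to NumPy
        mkpts0_f = batch['mkpts0_f'].cpu().numpy()
        mkpts1_f = batch['mkpts1_f'].cpu().numpy()
        mconf = batch['mconf'].cpu().numpy()

        # Place the two images side by side on a black canvas
        new_height = max(height1, height2)
        new_width = width1 + width2
        stitched_image = np.zeros((new_height, new_width, 3), dtype=np.uint8)
        stitched_image[:height1, :width1, :3] = image0
        stitched_image[:height2, width1:width1 + width2, :3] = image1

        # Shift the keypoints of the second image to the right half
        mkpts1_f_shifted = mkpts1_f + np.array([width1, 0])

        # Normalize confidences to [0, 1]; guard against a zero range
        mconf_norm = (mconf - np.min(mconf)) / max(np.max(mconf) - np.min(mconf), 1e-8)

        # jet(1 - conf): high-confidence matches are blue, low-confidence ones red
        colors = plt.cm.jet(1 - mconf_norm)  # RGBA floats in [0, 1]
        for pt1, pt2, color in zip(mkpts0_f, mkpts1_f_shifted, colors):
            pt1 = tuple(map(int, pt1))
            pt2 = tuple(map(int, pt2))
            # OpenCV expects BGR integers in 0-255; matplotlib gives RGB floats
            bgr = tuple(int(c * 255) for c in color[2::-1])

            cv2.line(stitched_image, pt1, pt2, bgr, 2)
            cv2.circle(stitched_image, pt1, 5, bgr, -1)
            cv2.circle(stitched_image, pt2, 5, bgr, -1)

        return stitched_image

    def save_match_result(self, batch, batch_idx):
        result_image = self.draw_matches(batch)
        os.makedirs(self.dump_dir, exist_ok=True)
        output_path = os.path.join(self.dump_dir, f'match_{batch_idx}.png')
        cv2.imwrite(output_path, result_image)
        print(f"Matching result saved to {output_path}")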
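One detail worth checking when drawing on images loaded from disk: the network input in this batch is 480x640, and the matched keypoints are most likely expressed in that resized frame. If the file on disk has a different resolution, either the image or the keypoints must be rescaled before drawing. The sketch below rescales the keypoints instead of the image; `scale_keypoints` is a hypothetical helper, and the 968x1296 size is only an example value.

```python
import numpy as np

def scale_keypoints(kpts, net_hw, orig_hw):
    """Map (x, y) keypoints from the network-input frame to the original image frame.

    kpts:    (N, 2) array of (x, y) coordinates in the resized image
    net_hw:  (height, width) of the network input, e.g. (480, 640)
    orig_hw: (height, width) of the image loaded from disk
    """
    kpts = np.asarray(kpts, dtype=np.float32)
    scale_x = orig_hw[1] / net_hw[1]
    scale_y = orig_hw[0] / net_hw[0]
    return kpts * np.array([scale_x, scale_y], dtype=np.float32)

# Example: the center of a 480x640 input mapped onto a 968x1296 image.
print(scale_keypoints([[320.0, 240.0]], (480, 640), (968, 1296)))  # [[648. 484.]]
```

Scaling the keypoints keeps the full-resolution image in the output, at the cost of a larger file.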

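4. Other Notes

4.1 Error

image = cv2.resize(image, resize)
cv2.error: OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/imgproc/src/resize.cpp:3929: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

Cause: the image path was wrong.

cv2.imread does not raise when it fails to read a file; it returns None. So by the time cv2.resize was called, the source image (ssize) was empty because it had never been loaded.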

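4.2 Parameter configuration

4.3 Some basic cv operations

(To operate on an image, it must be a NumPy array, i.e. an ndarray)

Setting paths

import os

# Joining paths
img1_path = 'scene0707_00/color/15.jpg'
img_path = os.path.join('data/test/', img1_path)

# Setting the output directory and file name
dump_dir = 'dump/eloftr_full_scannet'
output_path = os.path.join(dump_dir, f'match_{batch_idx}.png')  # batch_idx is a variable

Reading and writing images

# output_path needs an explicit file extension ('.jpg', '.png', '.bmp', ...),
# because cv2.imwrite chooses the image format from it.
# cv2.imread detects the format from the file content.
# image_path = 'xxx/img.jpg'
# output_path = 'xxx/output_name.jpg'
image = cv2.imread(image_path)
cv2.imwrite(output_path, result_image)

Printing for inspection

# Print the type of x
print(type(x))

# Print the keys of a dict
print(example_dict.keys())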

4.4 Tensor and NumPy

4.4.1 Error

File "EfficientLoFTR-main/src/lightning/lightning_loftr.py", line 138, in draw_matches
    stitched_image[:height1, :width1, :3] = image0
File "/home/puzek/anaconda3/envs/eloftr/lib/python3.8/site-packages/torch/_tensor.py", line 972, in __array__
    return self.numpy().astype(dtype, copy=False)
TypeError: can't convert cuda:1 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Cause: the code tried to convert a PyTorch tensor that lives on the GPU directly into a NumPy array, but a tensor must be on the CPU for that conversion. Move the tensor from the GPU to the CPU with Tensor.cpu() first, then convert it.

if isinstance(image0, torch.Tensor):
    image0 = image0.cpu().numpy()

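4.4.2 Tensor vs. NumPy

Tensors and NumPy arrays are both multi-dimensional arrays and are used heavily in deep learning and scientific computing. The key differences between a Tensor (here mainly PyTorch's Tensor) and a NumPy array are:

1. Library

Tensor: provided by deep learning frameworks (such as PyTorch and TensorFlow).
NumPy array: provided by NumPy, a Python library widely used for scientific computing.

2. Features and purpose

Tensor: designed for deep learning. It can run on a GPU and supports automatic differentiation (autograd), which is essential for gradient computation and backpropagation.
NumPy array: designed for general numerical and scientific computing. It offers rich linear algebra, Fourier transform, and statistics functionality, but has no GPU support and no automatic differentiation.

3. Device support

Tensor: can live on the CPU or the GPU. The .to(device) method moves a tensor between devices.
NumPy array: CPU only.

4. Automatic differentiation

Tensor: supports automatic differentiation, which training a deep learning model requires. With the requires_grad attribute set, operations on the tensor are tracked and gradients are computed automatically.
NumPy array: no automatic differentiation.

5. Operations and compatibility

Tensor: most supported operations resemble NumPy's, and PyTorch provides many NumPy-compatible interfaces.
NumPy array: provides a rich set of scientific computing functions, but none of the features specific to deep learning frameworks.

6. Conversion

Tensor to NumPy:

import torch
tensor = torch.tensor([1, 2, 3])
numpy_array = tensor.numpy()  # note: this shares memory with the tensor

NumPy to Tensor:

import torch
import numpy as np
numpy_array = np.array([1, 2, 3])
tensor = torch.tensor(numpy_array)  # this makes a copy
# or
tensor = torch.from_numpy(numpy_array)  # this shares memory with the array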

Example code

import torch
import numpy as np

# Create a NumPy array
numpy_array = np.array([[1, 2, 3], [4, 5, 6]])
print("NumPy Array:")
print(numpy_array)

# Convert the NumPy array to a PyTorch Tensor (torch.tensor makes a copy)
tensor = torch.tensor(numpy_array)
print("\nTensor from NumPy Array:")
print(tensor)

# Convert the Tensor back to a NumPy array (shares memory with the tensor)
numpy_array_from_tensor = tensor.numpy()
print("\nNumPy Array from Tensor:")
print(numpy_array_from_tensor)

# Check shared memory: compare the addresses of the underlying buffers
# (id() only identifies the Python wrapper objects, which always differ)
print("\nCheck Shared Memory:")
print(f"Tensor buffer address: {tensor.data_ptr()}")
print(f"NumPy buffer address: {numpy_array_from_tensor.ctypes.data}")
print(f"Same buffer: {tensor.data_ptr() == numpy_array_from_tensor.ctypes.data}")

# Create a new PyTorch Tensor (on the GPU if one is available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tensor_gpu = torch.tensor([[7, 8, 9], [10, 11, 12]], device=device)
print(f"\nTensor on {device}:")
print(tensor_gpu)

# Convert the GPU Tensor to a NumPy array on the CPU
numpy_array_from_tensor_gpu = tensor_gpu.cpu().numpy()
print("\nNumPy Array from GPU Tensor:")
print(numpy_array_from_tensor_gpu)

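4.4.3 Changing image dimensions

An image that started as a PyTorch tensor usually has the layout (channels, height, width) after conversion to NumPy, and has to be rearranged to (height, width, channels) first.

# If the channel dimension comes first, (1, H, W), rearrange to (H, W, 1) and replicate to 3 channels
if image0.ndim == 3 and image0.shape[0] == 1:
    image0 = np.repeat(image0.transpose(1, 2, 0), 3, axis=-1)

To turn an image of shape (1, 480, 640) into (480, 640), the first dimension has to be dropped. NumPy's squeeze() does this:

if image0.ndim == 3 and image0.shape[0] == 1:
    image0 = np.squeeze(image0, axis=0)  # drop the first dimension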
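The two cases above can be folded into one helper that always returns an (H, W, 3) array, which is the layout OpenCV drawing functions expect. This is a sketch; `chw_to_hwc3` is a hypothetical name, and it assumes the input has either 1 or 3 channels in the first dimension.

```python
import numpy as np

def chw_to_hwc3(image):
    """Convert a (H, W), (1, H, W) or (3, H, W) array to (H, W, 3)."""
    img = np.asarray(image)
    if img.ndim == 2:                       # (H, W) -> (1, H, W)
        img = img[None]
    if img.ndim != 3 or img.shape[0] not in (1, 3):
        raise ValueError(f"expected (H, W), (1, H, W) or (3, H, W), got {img.shape}")
    img = np.transpose(img, (1, 2, 0))      # channels first -> channels last
    if img.shape[-1] == 1:                  # grayscale -> 3 identical channels
        img = np.repeat(img, 3, axis=-1)
    return img

print(chw_to_hwc3(np.zeros((1, 480, 640))).shape)  # (480, 640, 3)
print(chw_to_hwc3(np.zeros((3, 480, 640))).shape)  # (480, 640, 3)
```

Note that np.repeat along the last axis only produces three channels when that axis already exists with length 1; applied to a plain (H, W) array it would widen the image to (H, 3W) instead.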