Overview
Train an image-to-image model that translates visible-light (RGB) images into thermal infrared (Thermal) images. Challenges:
- Temperature cannot be represented by RGB pixel values, so identical color information may correspond to different thermal images;
- Infrared image quality varies greatly across capture devices with different parameters, and different authors preprocess their data differently, so the datasets differ widely;
- Information that is entirely missing essentially cannot be transferred, e.g. a human body behind dense fog;
Training Data
The data below comes from publicly shared online sources, with provenance noted for each; please contact us if anything infringes copyright.
- LLVIP (paired)
https://bupt-ai-cz.github.io/LLVIP/
https://blog.csdn.net/qq_29562209/article/details/126665611
https://pan.baidu.com/s/1yxbnLUiK8xa0mAt5cDNKig (password: b88p)
- M3FD (paired)
https://github.com/dlut-dimt/TarDAL
https://pan.baidu.com/s/1GoJrrl_mn2HNQVDSUdPCrw?pwd=M3FD
- FLIR (paired, unaligned)
https://www.flir.com/oem/adas/adas-dataset-form/
https://avoid.overfit.cn/post/cb714527964e49bd9858eb2a4b2a1e62
v1: https://pan.baidu.com/s/11GJe4MdM_NH6fuENCQ2MtQ (password: 019b)
v2: https://pan.baidu.com/s/1ooLmEm39Y_LSinU860Zj1w?pwd=3cp3#list/path=%2F
- KAIST (paired)
https://pan.baidu.com/s/1V6qOIUIo2yojy-se_oWjtQ (password: 9yhh)
- VOC
https://github.com/vlkniaz/ThermalGAN
- OTCBVS
http://vcipl-okstate.org/pbvs/bench/
- BU-TIV (Thermal Infrared Video) Benchmark
http://csr.bu.edu/BU-TIV/BUTIV.html
- VTUAV
https://zhang-pengyu.github.io/DUT-VTUAV/
FLIR Data Alignment
The dataset downloaded from the official site is not aligned and cannot be used directly, so the RGB images must be aligned to the Thermal images (the RGB field of view is larger than the Thermal one). Method:
- Pick corresponding keypoints in the RGB and Thermal images, either by selecting a set manually or by generating one with an algorithm such as SIFT; the code below uses the manual approach;
- Compute the affine matrix from the point pairs;
- Apply the affine transform to the RGB image;
#!/usr/bin/env python
# coding=utf-8
import os

import cv2
import numpy as np
import tqdm

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP', '.tiff',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)


def make_dataset(dir):
    images = []
    assert os.path.isdir(dir), '%s is not a valid directory' % dir
    for root, _, fnames in sorted(os.walk(dir)):
        for fname in fnames:
            if is_image_file(fname):
                images.append(os.path.join(root, fname))
    return images


def main():
    rgb_image = cv2.imread('train/RGB/FLIR_00088.jpg')
    thermal_image = cv2.imread('train/thermal_8_bit/FLIR_00088.jpeg')
    # Manually selected corresponding points in one RGB/thermal pair.
    rgb_point = np.array(
        [
            [308, 741],
            [579, 606],
            [1328, 754],
        ],
        dtype=np.float32,
    )
    thermal_point = np.array(
        [
            [63, 238],
            [172, 182],
            [478, 244],
        ],
        dtype=np.float32,
    )
    # Estimate the affine matrix from the three point pairs.
    M = cv2.getAffineTransform(rgb_point, thermal_point)
    aligned_rgb_img = cv2.warpAffine(
        rgb_image, M, (thermal_image.shape[1], thermal_image.shape[0]))
    # cv2.imshow("affine", aligned_rgb_img)
    # cv2.waitKey()

    dataroot = './train'
    phase = 'train'
    # Input A (RGB label maps).
    dir_A = os.path.join(dataroot, phase + 'A')
    A_paths = sorted(make_dataset(dir_A))
    # Input B (real thermal images).
    dir_B = os.path.join(dataroot, phase + 'B')
    B_paths = sorted(make_dataset(dir_B))
    # Keep only the A images whose thermal counterpart exists in B.
    A_paths_aligned = []
    B_paths_aligned = []
    for line in A_paths:
        _, filename = os.path.split(line)
        B_path = os.path.join(dir_B, filename.replace('jpg', 'jpeg'))
        if os.path.exists(B_path):
            A_paths_aligned.append(line)
            B_paths_aligned.append(B_path)
    A_paths = A_paths_aligned
    B_paths = B_paths_aligned
    for idx in tqdm.trange(len(A_paths)):
        A = A_paths[idx]
        img_A = cv2.imread(A)
        img_B = cv2.imread(B_paths[idx])
        # Since the camera rig is fixed, the same affine matrix is
        # reused for every RGB image in the set.
        aligned_A = cv2.warpAffine(img_A, M, (img_B.shape[1], img_B.shape[0]))
        cv2.imwrite(A.replace('trainA', 'trainA_Aligned'), aligned_A)
        # Side-by-side A|B pair for pix2pix-style training.
        im_AB = np.concatenate([aligned_A, img_B], 1)
        cv2.imwrite(A.replace('trainA', 'train_AB'), im_AB)


if __name__ == '__main__':
    main()
Algorithm Frameworks
These fall into two classes: paired and unpaired.
paired
setA and setB must correspond one-to-one, and the image contents should ideally be aligned.
Pros: preserves content consistency after translation as much as possible;
Cons: strict data requirements, so suitable data is hard to obtain;
unpaired
setA and setB do not need to correspond.
Pros: higher data utilization;
Cons: the algorithm may hallucinate and corrupt the original image content;
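In the paired setting, the alignment script above stores each training pair as a single side-by-side A|B image. A minimal sketch of how a pix2pix-style loader might split such an image back into its two halves (purely illustrative, not the actual training code):

```python
import numpy as np


def split_ab(im_ab):
    """Split a side-by-side A|B image of shape (H, 2W, C) back into
    the RGB half A and the thermal half B."""
    width = im_ab.shape[1]
    assert width % 2 == 0, 'A|B image width must be even'
    half = width // 2
    return im_ab[:, :half], im_ab[:, half:]
```

Storing the pair as one file guarantees A and B can never get out of sync on disk, which matters for a paired method where content alignment is the whole point.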
Technical Details
Based on pix2pixHD. Given the characteristics of thermal images and the real data actually available (dominated by low-frequency information), and under time pressure, only a few simple modifications were made; more may be added as new ideas come up:
- num_D=1, keeping only the final discriminator scale;
- reweighted the VGG loss to focus more on global feature consistency;
- only the RGB image is fed in as the label;
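The VGG-loss reweighting can be illustrated as follows: pix2pixHD takes an L1 distance between VGG features of the generated and real images at several layers, and shifting weight toward the deeper layers emphasizes global (low-frequency) structure over fine texture. This is a sketch only; the skewed weights are illustrative assumptions, and NumPy arrays stand in for the actual VGG feature maps:

```python
import numpy as np


def weighted_feature_loss(feats_fake, feats_real, weights):
    """Weighted L1 distance over lists of per-layer feature maps."""
    assert len(feats_fake) == len(feats_real) == len(weights)
    return sum(
        w * np.mean(np.abs(f - r))
        for w, f, r in zip(weights, feats_fake, feats_real)
    )


# pix2pixHD's default per-layer weights already favor deeper layers;
# skewing them further (illustrative values) stresses global structure:
default_w = [1 / 32, 1 / 16, 1 / 8, 1 / 4, 1.0]
global_w = [1 / 64, 1 / 32, 1 / 16, 1 / 4, 2.0]
```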
Results are shown in the figure below (left: RGB, middle: generator output, right: GT).
In practice, the following cases are beyond what the algorithm can handle:
- shadows;
- complete occlusion, e.g. a human body behind smoke;
- low-light (night) scenes, where results are noticeably worse;