Swapping Regions or Frames Between Images

Latest recommended article published on 2024-07-25 12:21:29

人间惆怅客_

Column: Deep Learning Comprehensive. Tags: computer vision, image processing, pillow, pytorch

Article link: https://blog.csdn.net/qq_44498688/article/details/139378656

Generating a bounding box, or an interval

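Given the fraction of the image area that the rectangle should cover, together with the image width W and height H, generate a rectangle at a random position.
Then use that rectangle to exchange the corresponding region between two images.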
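
The geometry behind this can be written out explicitly. The block below is a short derivation rather than anything stated by the author: λ stands for the area fraction (the `lamb` argument of get_boundingbox() further down), and u_1, u_2 are assumed to be independent uniform samples, matching what `random.random()` returns.

```latex
\begin{aligned}
r &= \sqrt{\lambda}, \qquad r_w = rW, \qquad r_h = rH
  \;\Longrightarrow\; \frac{r_w\, r_h}{W H} = r^2 = \lambda, \\
r_x &= (1 - r)\,W\,u_1, \qquad r_y = (1 - r)\,H\,u_2,
  \qquad u_1, u_2 \sim \mathcal{U}[0, 1), \\
r_x + r_w &\le (1 - r)W + rW = W, \qquad r_y + r_h \le (1 - r)H + rH = H .
\end{aligned}
```

For example, W = 100, H = 80 and λ = 0.25 give r = 0.5, so the box is 50 × 40 pixels and its top-left corner lies in [0, 50) × [0, 40). Because the code truncates all four values to integers, the realized area fraction can be slightly below λ for sizes that do not divide evenly.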

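To keep the demonstration simple, take a batch of images mels of shape (B, C, H, W); each image i receives the rectangle region of image (i + 2) mod B:
get_cutmix() takes a torch tensor as input and returns a torch tensor.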

import math
import random

import matplotlib.pyplot as plt
import numpy as np
import torch
from PIL import Image

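def get_boundingbox(W, H, lamb=0.25):
    # lamb is the box area as a fraction of the image area, so each side scales by sqrt(lamb)
    rate = math.sqrt(lamb)
    # Top-left corner, sampled so that the box always stays inside the image
    rx, ry = (1 - rate) * W * random.random(), (1 - rate) * H * random.random()
    rw, rh = W * rate, H * rate
    # (rx, rw) run along the width axis, (ry, rh) along the height axis
    return int(rx), int(ry), int(rw), int(rh)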

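def get_cutmix(mels, lamb=0.25):
    # mels: tensor of shape (B, C, H, W)
    B, C, H, W = mels.shape
    new_mels = []
    for i in range(B):
        # Pass (W, H) in the order get_boundingbox expects
        rx, ry, rw, rh = get_boundingbox(W, H, lamb=lamb)
        j = (i + 2) % B
        # Patch from the partner image j: rows are indexed by y, columns by x
        MB = mels[j, :, ry:ry + rh, rx:rx + rw]
        # Clone so the input batch is not overwritten in place
        MA = mels[i].clone()
        MA[:, ry:ry + rh, rx:rx + rw] = MB
        new_mels.append(MA)
    return torch.stack(new_mels, dim=0)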
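
One detail worth checking in this kind of region exchange is whether each patch is read from the original batch or from an image that was already overwritten earlier in the loop. The sketch below illustrates the idea on toy data. It is not the author's function: `paste_region` is a hypothetical helper, it uses NumPy instead of torch to stay lightweight (the slicing is the same), and it applies one fixed box to every image, whereas get_cutmix() draws a new box per image.

```python
import numpy as np


def paste_region(batch, box, shift=2):
    """Hypothetical helper: image i receives the `box` region of image (i + shift) % B.

    batch has shape (B, C, H, W); box is (rx, ry, rw, rh) with rx/rw along
    the width axis and ry/rh along the height axis.
    """
    rx, ry, rw, rh = box
    out = batch.copy()  # work on a copy so every patch is read from the untouched input
    B = batch.shape[0]
    for i in range(B):
        j = (i + shift) % B
        out[i, :, ry:ry + rh, rx:rx + rw] = batch[j, :, ry:ry + rh, rx:rx + rw]
    return out


# Five constant 8x8 images: image k is filled with the value k.
batch = np.stack([np.full((1, 8, 8), k, dtype=np.float32) for k in range(5)])
mixed = paste_region(batch, box=(2, 3, 4, 4))

# Fraction of image 0 that was replaced: a 4x4 box in an 8x8 image.
replaced = (mixed[0] != batch[0]).mean()
print(replaced)
print(mixed[3, 0, 3, 2])  # image 3 takes its patch from the original image 0
```

With B = 5 the pairing is 0←2, 1←3, 2←4, 3←0, 4←1. Image 3 reads from image 0 after image 0 has already received its own patch, so copying before writing is what keeps the donor regions original.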

The figure below shows the effect of CutMix. It is taken from the paper "CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features" (arXiv:1905.04899v2).

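A similar technique exchanges a block of frames, which suits sequence-type data such as a LogMelSpectrogram:
get_rand_frame() takes a list of numpy arrays [np.array] as input and returns a list of numpy arrays [np.array].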

def get_rand_frame(mels, frame_width):
    # mels: list of 2-D arrays (n_mels, n_frames); all must share n_mels
    # and have at least frame_width frames
    B = len(mels)
    new_mels = []
    for i in range(B):
        j = (i + 2) % B
        # Pick a random source window in the donor mels[j]
        W = mels[j].shape[1]
        rx = int((W - frame_width) * random.random())
        MB = mels[j][:, rx:rx + frame_width]

        # Copy the recipient so the input list is not modified in place
        MA = mels[i].copy()
        # Pick an independent destination window in the recipient
        W = MA.shape[1]
        rx = int((W - frame_width) * random.random())
        MA[:, rx:rx + frame_width] = MB
        new_mels.append(MA)
    return new_mels

def main():
    # Local path on the author's machine; point this at your own image folder
    root = "D:/kingz/ucasFiles/paper1/"
    img_paths = ["spec_denoise", "spec_normal", "denoised_true_0", "addednoised_true_1", "denoised_1"]
    imgs = []
    h = 222  # crop every image to the same height so frames can be exchanged

    for pa in img_paths:
        img_tmp = np.array(Image.open(root + pa + ".png").convert("L"), np.float64)
        imgs.append(img_tmp[:h, :])
        print(img_tmp[:h, :].shape)
    new_imgs = get_rand_frame(imgs, frame_width=40)
    for i in range(len(new_imgs)):
        plt.figure(i)
        plt.imshow(new_imgs[i].astype(np.uint8))
        plt.xticks([])
        plt.yticks([])
        plt.savefig(f"./cutframe_{i}.png", bbox_inches='tight', pad_inches=0.0, dpi=300)


if __name__ == "__main__":
    main()

The result is shown below:

[Figure: the spectrogram images after get_rand_frame(), saved as cutframe_0.png to cutframe_4.png]
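
For readers without the image files, the frame exchange can also be tried on synthetic arrays. This is a minimal sketch under stated assumptions, not the author's code: `swap_in_frames` is a hypothetical helper, the two "spectrograms" are constant arrays of different lengths, and the offsets are drawn with NumPy's random generator instead of `random.random()`.

```python
import numpy as np


def swap_in_frames(recipient, donor, src_start, dst_start, frame_width):
    """Hypothetical helper: copy `frame_width` columns of `donor` into a copy of `recipient`.

    Both arrays have shape (n_mels, n_frames) and must share n_mels;
    their n_frames may differ, which is why two offsets are needed.
    """
    out = recipient.copy()  # leave the caller's array untouched
    out[:, dst_start:dst_start + frame_width] = donor[:, src_start:src_start + frame_width]
    return out


rng = np.random.default_rng(0)
a = np.zeros((4, 10))  # recipient: 4 mel bins, 10 frames
b = np.ones((4, 12))   # donor: same number of mel bins, 12 frames
frame_width = 3

# Independent source and destination offsets, each kept inside its own array.
src = int(rng.integers(0, b.shape[1] - frame_width + 1))
dst = int(rng.integers(0, a.shape[1] - frame_width + 1))
out = swap_in_frames(a, b, src, dst, frame_width)

print(out.shape, int(out.sum()))
```

Exactly `frame_width` columns of the output come from the donor, and the recipient keeps its original length because only a window is overwritten.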

[1]: CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features, arXiv:1905.04899v2
[2]: Anomalous Sound Detection Based on Interpolation Deep Neural Network