Reproduction notes for "Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries"

Latest recommended article published on 2023-01-12 23:17:01

ztg111

Tags: python, image processing, deep learning

Article link: https://blog.csdn.net/ztg111/article/details/124139936

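The paper is the one named in the title: another top-journal paper from Bappy, published in 2019. Overall it improves on their earlier work, which also used an LSTM for copy-move forgery detection, but this time the source code is available.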

Code: jawadbappy/forgery_localization_HLED: Implementation of "Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries" paper. (github.com)

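At first glance the code base looks enormous, but if you follow the README it is surprisingly easy to get running. The one catch is that the source uses tf.slim, a fairly niche library. Reading it is not a problem, but modifying it later may be a bit of a hassle, so I am thinking of rewriting it in Keras at some point.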

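This post is meant to help you sort out which files are needed to run training. As for what the rest of the pile is for, I don't know; I have not read everything carefully, and getting it to run was good enough for me.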

First of all, almost everyone is on TF 2.x or later by now, so the basic TF1 compatibility setup has to be added:

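# Import the TF1-style API under TensorFlow 2.x (this replaces "import tensorflow as tf").
import tensorflow.compat.v1 as tf
import tf_slim as slim  # standalone package that replaces tf.contrib.slim
from tensorflow.python.ops.rnn import static_rnn

tf.disable_v2_behavior()  # switch back to TF1 graph-mode behavior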

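tf_slim is not bundled with TensorFlow, so you have to install it yourself with pip. After that, a few function calls in the source need to be changed; once changed they look roughly like this: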

stacked_lstm_cell = tf.compat.v1.nn.rnn_cell.MultiRNNCell([
    tf.compat.v1.nn.rnn_cell.DropoutWrapper(
        tf.compat.v1.nn.rnn_cell.BasicLSTMCell(n_hidden),
        output_keep_prob=0.9)
    for _ in range(2)])
out, state = static_rnn(stacked_lstm_cell, xCell, dtype=tf.float32)

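Next comes the data needed for training: two h5 files. For the training set, the author provides three fairly large datasets on a cloud drive. After downloading and unzipping them you will find PNG files of color images and their corresponding masks. The code below converts them into the train_img.hdf5 file. It is reposted from another blogger: "Python: packing a dataset of tens of thousands of images into an HDF5 file, absolute paths" (weixin_44576543's blog on CSDN).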

import os
import time

import cv2
import h5py
import numpy as np

# Directories of the RGB images and their masks (adjust to your own paths).
img_dir = 'C:/Users/Administrator/PycharmProjects/convlstm/Bernard/Computer_Vision/spliced_copymove_NIST/rgb_imgs'
mask_dir = 'C:/Users/Administrator/PycharmProjects/convlstm/Bernard/Computer_Vision/spliced_copymove_NIST/masks'

# Sort both listings so that image i and mask i line up
# (this assumes the file names in the two directories sort in the same order).
img_paths = [img_dir + '/' + name for name in sorted(os.listdir(img_dir))]
mask_paths = [mask_dir + '/' + name for name in sorted(os.listdir(mask_dir))]

num_imgs = len(img_paths)
imagedata_shape = (num_imgs, 256, 256, 3)
maskdata_shape = (num_imgs, 256, 256)
mean = np.zeros(imagedata_shape[1:], np.float32)

start_time = time.time()

# The context manager closes the file even if an image fails to load.
with h5py.File('spliced_NIST.hdf5', mode='w') as f:
    # The dtypes are kept as np.int8, as in the original script;
    # note that pixel values above 127 do not fit in int8.
    f.create_dataset("train_img", imagedata_shape, np.int8)
    f.create_dataset("train_labels", maskdata_shape, np.int8)
    f.create_dataset("train_mean", imagedata_shape[1:], np.float32)

    for i in range(num_imgs):
        if i % 1000 == 0 and i > 1:
            print('image_data: {}/{}'.format(i, num_imgs))

        img = cv2.imread(img_paths[i])
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        # Downscale to a quarter of the original size; the dataset shapes
        # above assume this yields 256x256, i.e. 1024x1024 inputs.
        height, width = img.shape[:2]
        size = (int(width * 0.25), int(height * 0.25))
        img = cv2.resize(img, size)
        mean += img / float(num_imgs)

        mask1 = cv2.imread(mask_paths[i], cv2.IMREAD_GRAYSCALE)
        mask1 = cv2.resize(mask1, size)
        # Binarize the mask: every non-zero pixel becomes 1.
        mask1 = (mask1 > 0).astype(np.uint8)

        f['train_img'][i, ...] = img       # write the image under the "train_img" key
        f['train_labels'][i, ...] = mask1  # write the mask under the "train_labels" key

    # Write the mean image once, after all images have been accumulated.
    f["train_mean"][...] = mean

duration = time.time() - start_time
print(duration)
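
To make the two per-image steps of the conversion script easier to check in isolation, here is a minimal NumPy-only sketch on toy arrays. The helper names `binarize_mask` and `running_mean` are my own and do not appear in the reposted script; the sketch only mirrors the rule "non-zero mask pixel becomes 1" and the way the mean image is accumulated as a sum of img / N.

```python
import numpy as np

def binarize_mask(mask):
    """Map every non-zero pixel to 1 and every zero pixel to 0 (hypothetical helper)."""
    return (np.asarray(mask) > 0).astype(np.uint8)

def running_mean(images):
    """Accumulate the per-pixel mean as a sum of img / N (hypothetical helper)."""
    images = list(images)
    mean = np.zeros(images[0].shape, np.float32)
    for img in images:
        mean += img / float(len(images))
    return mean

# Toy data standing in for a resized mask and three resized RGB images.
toy_mask = np.array([[0, 255], [17, 0]], dtype=np.uint8)
toy_imgs = [np.full((2, 2, 3), v, dtype=np.uint8) for v in (0, 100, 200)]

binary = binarize_mask(toy_mask)
mean_img = running_mean(toy_imgs)
print(binary)
print(mean_img[..., 0])
```

Running this kind of check on a handful of real images before converting the full dataset is a cheap way to catch mismatched sizes or masks that are not purely black and white.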

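Next, resampling features have to be extracted from this train_img.hdf5 and stored in a second hdf5 file. The source code already contains this step, with two ways of computing the features. One of them calls CUDA and external libraries, which is a lot of trouble, and I could not get it to work. Fortunately there is also an sklearn implementation that gives roughly the same result (according to the author): change the paths, run extract_resamp_feat.py directly, and it produces train_img_feat.hdf5. With these two files you can already run training. The source code, however, uses four training sets and one test set, so ten h5 files are loaded at the same time. The changes needed there are simple, so I won't go into them here. Any number of datasets works, as long as you allocate them proportionally within each batch yourself.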
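
The idea of allocating several datasets proportionally within one batch can be sketched as follows. This is not the author's loading code: the function `mixed_batch_indices`, the dataset sizes, and the ratios are all made up for illustration, and the sketch only decides which sample indices to draw from each dataset.

```python
import numpy as np

def mixed_batch_indices(dataset_sizes, ratios, batch_size, rng):
    """Pick (dataset_id, sample_index) pairs so each dataset fills its share of a batch."""
    ratios = np.asarray(ratios, dtype=np.float64)
    counts = np.floor(ratios / ratios.sum() * batch_size).astype(int)
    counts[0] += batch_size - counts.sum()  # rounding leftovers go to the first dataset
    picks = []
    for ds_id, (size, count) in enumerate(zip(dataset_sizes, counts)):
        # Draw distinct indices and sort them for convenient indexed reads.
        idx = np.sort(rng.choice(size, size=count, replace=False))
        picks.extend((ds_id, int(i)) for i in idx)
    return picks

# Hypothetical setup: three datasets mixed 2:1:1 in a batch of 16.
rng = np.random.default_rng(0)
batch = mixed_batch_indices(dataset_sizes=[1000, 500, 200],
                            ratios=[2, 1, 1], batch_size=16, rng=rng)
print(len(batch))
```

Each pair can then be used to read one image from the corresponding image file and the matching row from its feature file, so that the two stay aligned.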

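One more point that puzzles me. When the source code processes the input img, it always divides by mx, which is 127. Looking at the hdf5 file of the 8 test images that the author provides, the values in all three channels lie between 0 and 127, so dividing by 127 is simply a normalization. The PNG images in the three datasets that the author provides, however, have values between 0 and 255, so in my opinion you should divide by 255 to get the same normalization. But before the division the source code does one more thing that I do not understand: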

Img=np.uint8(Img)
Img=np.multiply(Img,1.0/mx)

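In theory, after the uint8 cast the values all lie between 0 and 255, so why divide by 127? This may therefore not be a plain normalization, but I don't know what else it is supposed to be.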
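
To see what the two source lines do to the different value ranges, here is a small NumPy sketch. The function name `normalize` and the sample pixel values are my own; only the uint8 cast and the multiplication by 1.0/mx come from the source.

```python
import numpy as np

def normalize(img, mx):
    """Mirror the two source lines: cast to uint8, then scale by 1/mx."""
    img = np.uint8(img)
    return np.multiply(img, 1.0 / mx)

low_range = np.array([0, 64, 127])    # range seen in the author's test hdf5
full_range = np.array([0, 128, 255])  # range seen in the PNG datasets

a = normalize(low_range, 127)   # ends up in roughly [0, 1]
b = normalize(full_range, 127)  # goes above 1 for pixels larger than 127
c = normalize(full_range, 255)  # ends up in roughly [0, 1] again
print(a.max(), b.max(), c.max())
```

So if your own hdf5 file really holds values in 0-255, dividing by 127 feeds the network inputs on a different scale than the author's test data, which is why I would divide by 255 in that case.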

I have not managed to get results yet either. If anyone has, please send me a private message. Thanks.
