Image Perspective Transformation

ab0902cd

Article link: https://blog.csdn.net/ab0902cd/article/details/103278546

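When there is not enough data to train a model, data augmentation can be used to increase the variety of the training set, and perspective transformation is a fairly effective augmentation technique. This post describes how to apply a perspective transformation to an image. A perspective transformation applies a series of matrix operations built from translations and rotations of the image, which produces a 3D-like effect. For the underlying principle, see: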

https://stackoverflow.com/questions/17087446/how-to-calculate-perspective-transform-for-opencv-from-rotation-angles

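The main idea is to rotate the image plane while also rotating it about a chosen reference axis. A matrix has to be computed for each of these component transformations. The derivation is fairly involved; see the link above for details.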
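As a rough illustration of how such a matrix can be assembled, the sketch below follows the general recipe from the linked Stack Overflow discussion: lift the image to 3D, rotate, translate, and project back to 2D. The function names `perspective_matrix` and `project`, the default focal length, and the rotation order are assumptions made for this example. It is not the code from any particular library.

```python
import numpy as np


def perspective_matrix(width, height, theta=0.0, phi=0.0, gamma=0.0,
                       dx=0.0, dy=0.0, dz=None, focal=None):
    """Build a 3x3 homography from rotation angles (in degrees) and a translation.

    theta, phi, gamma rotate around the x, y and z axes respectively.
    """
    if focal is None:
        focal = float(np.hypot(width, height))  # assumed default focal length
    if dz is None:
        dz = focal  # with dz == focal and zero angles the result is the identity
    rt, rp, rg = np.deg2rad([theta, phi, gamma])

    # 2D -> 3D: move the image centre to the origin, on the plane z = 0
    A1 = np.array([[1, 0, -width / 2],
                   [0, 1, -height / 2],
                   [0, 0, 0],
                   [0, 0, 1]], dtype=float)
    # Rotations around the x, y and z axes (homogeneous 4x4 matrices)
    RX = np.array([[1, 0, 0, 0],
                   [0, np.cos(rt), -np.sin(rt), 0],
                   [0, np.sin(rt), np.cos(rt), 0],
                   [0, 0, 0, 1]])
    RY = np.array([[np.cos(rp), 0, -np.sin(rp), 0],
                   [0, 1, 0, 0],
                   [np.sin(rp), 0, np.cos(rp), 0],
                   [0, 0, 0, 1]])
    RZ = np.array([[np.cos(rg), -np.sin(rg), 0, 0],
                   [np.sin(rg), np.cos(rg), 0, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1]])
    # Translation, which also pushes the image away from the camera by dz
    T = np.array([[1, 0, 0, dx],
                  [0, 1, 0, dy],
                  [0, 0, 1, dz],
                  [0, 0, 0, 1]], dtype=float)
    # 3D -> 2D: pinhole projection, moving the origin back to the image centre
    A2 = np.array([[focal, 0, width / 2, 0],
                   [0, focal, height / 2, 0],
                   [0, 0, 1, 0]], dtype=float)
    return A2 @ T @ RX @ RY @ RZ @ A1


def project(mat, pts):
    """Apply a 3x3 homography to an N x 2 array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # N x 3 homogeneous coordinates
    out = homog @ mat.T
    return out[:, :2] / out[:, 2:3]  # divide by the homogeneous coordinate


# Rotating a 640x480 image by 45 degrees around the y axis keeps its centre fixed.
M = perspective_matrix(640, 480, phi=45)
centre = project(M, [[320, 240]])
```

With OpenCV installed, a matrix of this form could be passed to `cv2.warpPerspective` to warp the image itself.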

For an implementation, see: https://github.com/eborboihuc/rotate_3d

Its author provides transformations by arbitrary angles around the x, y, and z axes. The parameters are:

theta : the rotation around the x axis
phi : the rotation around the y axis
gamma : the rotation around the z axis

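If the black padding around the transformed image is too large, the image can be cropped. Convert the image to grayscale, threshold it, then find the bounding rectangle of the non-black connected region and crop to it. The label coordinates must be mapped through the same matrix and shifted by the crop offset. For example: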

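import os

import cv2
import numpy as np
from image_transformer import ImageTransformer  # module from the rotate_3d repository

# remake_src / remake_dst are the input and output directories (defined elsewhere).
# The loop header is an assumption: the original snippet shows only the loop body.
for imgfile in os.listdir(remake_src):
    pre, back = os.path.splitext(imgfile)  # base name and extension
    txtname = pre + '.txt'                 # label file: x1,y1,...,x4,y4,text per line
    if back == '.txt' or not os.path.exists(os.path.join(remake_src, txtname)):
        continue

    ImageTr = ImageTransformer(os.path.join(remake_src, imgfile))
    # phi=45 rotates around the y axis; the snippet relies on rotate_along_axis
    # returning both the warped image and the 3x3 matrix
    res, mat = ImageTr.rotate_along_axis(theta=0, phi=45, gamma=0, dx=100, dy=100, dz=100)

    grey = cv2.cvtColor(res, cv2.COLOR_BGR2GRAY)
    ret, thresh = cv2.threshold(grey, 10, 255, cv2.THRESH_BINARY)
    # Bounding rectangle of all non-black pixels. Passing the mask directly avoids
    # the differing return values of cv2.findContours across OpenCV versions.
    x, y, w, h = cv2.boundingRect(thresh)
    crop = res[y:y + h, x:x + w]

    lines = ''
    with open(os.path.join(remake_src, txtname), 'r', encoding='utf-8') as f:
        for line in f:
            fields = line.split(',', 8)
            pt_ = np.array(fields[0:8], np.int32).reshape(4, 2)
            pt_new = np.insert(pt_, 2, values=np.ones(4), axis=1)  # homogeneous coordinates, 4x3
            pt_result = mat.dot(pt_new.T)              # mat is 3x3, result is 3x4
            poly = np.divide(pt_result, pt_result[2])  # divide by the homogeneous coordinate
            poly_ = np.delete(poly, 2, axis=0).T - np.array([x, y])  # shift into crop coordinates
            tmp = list(map(int, poly_.reshape(-1)))
            tmp.append(fields[8])  # text label, which keeps its trailing newline
            lines += ','.join(map(str, tmp))
    with open(os.path.join(remake_dst, pre + '.txt'), 'w', encoding='utf-8') as ff:
        ff.write(lines)
    cv2.imwrite(os.path.join(remake_dst, pre + back), crop, [int(cv2.IMWRITE_JPEG_QUALITY), 95])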
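To try the cropping and label-remapping steps without the dataset, the following minimal sketch reproduces them on a synthetic image using NumPy only. The helper names `nonblack_bbox` and `warp_points` are hypothetical. The per-channel maximum is a simple stand-in for a real grayscale conversion, and the translation matrix stands in for a real perspective matrix.

```python
import numpy as np


def nonblack_bbox(img, thresh=10):
    """Return (x, y, w, h) of the smallest rectangle holding all pixels brighter than thresh."""
    # Per-channel maximum as a stand-in for grayscale conversion (an assumption)
    grey = img if img.ndim == 2 else img.max(axis=2)
    mask = grey > thresh
    rows = np.flatnonzero(mask.any(axis=1))  # indices of rows containing content
    cols = np.flatnonzero(mask.any(axis=0))  # indices of columns containing content
    if rows.size == 0:
        raise ValueError("image is entirely black")
    x, y = cols[0], rows[0]
    return int(x), int(y), int(cols[-1] - x + 1), int(rows[-1] - y + 1)


def warp_points(mat, pts):
    """Map N x 2 points through a 3x3 matrix using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    out = homog @ mat.T
    return out[:, :2] / out[:, 2:3]


# Synthetic "warped" image: a grey rectangle on a black canvas
img = np.zeros((100, 120, 3), np.uint8)
img[20:60, 30:90] = 200

x, y, w, h = nonblack_bbox(img)
crop = img[y:y + h, x:x + w]

# Stand-in transform: a pure translation that moved the content by (30, 20)
mat = np.array([[1, 0, 30],
                [0, 1, 20],
                [0, 0, 1]], dtype=float)
box = [[0, 0], [59, 0], [59, 39], [0, 39]]    # label corners in the source image
new_box = warp_points(mat, box) - [x, y]      # corners in crop coordinates
```

Subtracting the crop origin after the warp is what keeps the labels aligned with the cropped image.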

Reference: https://nbviewer.jupyter.org/github/manisoftwartist/perspectiveproj/blob/master/perspective.ipynb