PyraNet Data Preprocessing

Latest recommended article published on 2020-09-30 11:50:40

枯叶蝶KYD

Article link: https://blog.csdn.net/u013548568/article/details/79123649

1. Training

local inp = crop(img, c, s, 0, self.inputRes)

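The first step is crop, which cuts the image down to the desired shape. c is the center position, s is the scale, 0 is the rotation angle, and the input resolution is 256.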

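1.1 Image and scale preprocessing

1) scaleFactor < 2

What is scaleFactor? scaleFactor = (200 * scale) / res

Put simply, it is the ratio of the person's height in the original image (200 * scale) to 256. When scaleFactor < 2, the code sets scaleFactor = 1. The reasoning is that the person is not especially large, so a crop taken directly from the original image is enough to enclose the person, and there is no need to shrink the image first and crop from the smaller copy. The two lines below therefore only take effect when scaleFactor >= 2.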

local newSize = math.floor(math.max(ht,wd) / scaleFactor)
tmpImg = image.scale(img,newSize)

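To summarize the analysis above:

scaleFactor decides whether the image needs to be rescaled, that is, whether the person is cropped from a rescaled image. If the person is not large, the image is left alone and scaleFactor is set to 1. If the person is large, scaleFactor keeps its value and the image is scaled down by it.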
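The decision can be sketched in a few lines. This is plain Python rather than Lua Torch so that it runs on its own. The function names effective_scale_factor and resized_long_side are made up for illustration and do not appear in the PyraNet code.

```python
import math


def effective_scale_factor(scale, res=256):
    """Return 1 when the person is moderate in size, else the raw ratio."""
    scale_factor = (200 * scale) / res
    return 1 if scale_factor < 2.0 else scale_factor


def resized_long_side(ht, wd, scale, res=256):
    """Long side the image is shrunk to, or None when no pre-scaling happens."""
    sf = effective_scale_factor(scale, res)
    if sf == 1:
        return None
    return math.floor(max(ht, wd) / sf)


print(effective_scale_factor(1.0))         # person about 200 px tall: no pre-scaling
print(effective_scale_factor(4.0))         # person about 800 px tall: shrink the image
print(resized_long_side(1000, 500, 4.0))   # new long side of a 1000 x 500 image
```

A person about 200 pixels tall (scale = 1.0) takes the first branch, while a person about 800 pixels tall (scale = 4.0) causes the whole image to be shrunk before cropping.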

2) scaleFactor >= 2

    if scaleFactor < 2.0 then scaleFactor = 1

    else
        local newSize = math.floor(math.max(ht,wd) / scaleFactor)
        if newSize < 2 then
           -- Zoomed out so much that the image is now a single pixel or less
           if ndim == 2 then newImg = newImg:view(newImg:size(2),newImg:size(3)) end
           return newImg
        else
           tmpImg = image.scale(img,newSize)
           ht,wd = tmpImg:size(2),tmpImg:size(3)
        end
    end

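A new size is defined, newSize = math.floor(math.max(ht,wd) / scaleFactor). The goal is again to bring the person to roughly 256 pixels.

The author does not scale the person directly, however, but scales the whole image. Suppose a person has to shrink by a factor of 3 to reach 256: the image is first shrunk by a factor of 3, so c and s below must both be divided by 3 as well.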

    local c,s = center:float()/scaleFactor, scale/scaleFactor
    local ul = M.transform({1,1}, c, s, 0, res, true)
    local br = M.transform({res+1,res+1}, c, s, 0, res, true)

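These three lines are the key. When scaleFactor >= 2, the person in the image is too large and has to be cropped from the shrunken image. The image has already been scaled above, so c and s must be scaled by the same ratio here, and the crop is then taken from the scaled image.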

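1.2 The actual crop

Next comes the crop itself. As the explanation of transform further down shows, ul is the position in the source image that the point (0,0) maps back to, and br is the position in the source image that the point (256,256) maps back to. The idea is to enclose the person in a 256*256 box and ask how large that box is back in the source image: the answer is br - ul.

Experiments show that when scaleFactor >= 2, br - ul is usually one of (255,256), (255,255), or (256,255), whereas for scaleFactor < 2 it can be any size. Why?

It is because the scaling term of the transform matrix equals 1 in that case.

A brief derivation: in getTransform, h = 200 * s. With s = scale / scaleFactor = res / 200 this gives h = res, so t[1][1] = t[2][2] = res / h = 1. The occasional 255 appears to come from the integer truncation at the end of transform.

This holds only when scaleFactor >= 2; for scaleFactor < 2 the results are all over the place.

Why is it done this way?

My view is that once scaleFactor is large, the work is done on the image that was already scaled above, so a unit scale is reasonable because the image is already the right size. Conversely, when scaleFactor < 2 the crop has to be cut from the unscaled image.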
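The two cases can be checked numerically with a small sketch. It is plain Python, not the Torch code: rotation is left out, the helper names get_transform, inverse_point and crop_corners are hypothetical, and Python's int() is assumed to truncate the same way as Torch's :int().

```python
def get_transform(center, scale, res):
    """Scale and translation terms of getTransform (rotation left out)."""
    h = 200 * scale
    return (res / h, res * (-center[0] / h + 0.5), res * (-center[1] / h + 0.5))


def inverse_point(pt, center, scale, res):
    """Map a 1-based output point back to the source image."""
    k, tx, ty = get_transform(center, scale, res)
    x = (pt[0] - 1 - tx) / k
    y = (pt[1] - 1 - ty) / k
    # int() truncates toward zero; assumed to match Torch's :int()
    return [int(x) + 1, int(y) + 1]


def crop_corners(center, scale, res=256):
    """Return (ul, br) before the br correction, following the scaleFactor branch."""
    raw = (200 * scale) / res
    sf = 1 if raw < 2.0 else raw
    c = [center[0] / sf, center[1] / sf]
    s = scale / sf
    ul = inverse_point([1, 1], c, s, res)
    br = inverse_point([res + 1, res + 1], c, s, res)
    return ul, br


# Large person (scaleFactor = 3.125): the matrix scale is 1, sides are 255 or 256
ul, br = crop_corners([101, 300], 4.0)
print(br[0] - ul[0], br[1] - ul[1])

# Moderate person (scaleFactor set to 1): the side is close to 200 * scale = 400
ul, br = crop_corners([300, 300], 2.0)
print(br[0] - ul[0], br[1] - ul[1])
```

In the first call the x side comes out one pixel short because ul lands on a negative fractional coordinate and truncation rounds it toward zero, while br is rounded down.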

if scaleFactor >= 2 then br:add(-(br - ul - res)) end

This line forces br - ul to be exactly 256 (res).
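The arithmetic is simply br = ul + res. A tiny Python sketch of the same update on plain lists (force_square is a hypothetical name; the sample corners are made-up values):

```python
def force_square(ul, br, res=256):
    """Keep ul fixed and move br so that br - ul == res on both axes."""
    return [b - (b - u - res) for u, b in zip(ul, br)]


ul, br = [-94, -31], [161, 225]   # x side is 255, y side is 256
br = force_square(ul, br)
print(br)  # [162, 225]
```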

    local pad = math.ceil(torch.norm((ul - br):float())/2 - (br[1]-ul[1])/2)
    if rot ~= 0 then ul:add(-pad); br:add(pad) end

These lines add padding for rotation. They can be set aside for now, because the rotation angle is 0 in this crop call.

    local old_ = {1,-1,math.max(1, ul[2]), math.min(br[2], ht+1) - 1,
                       math.max(1, ul[1]), math.min(br[1], wd+1) - 1}

These lines define which region of the source image the person is cut from.

    local new_ = {1,-1,math.max(1, -ul[2] + 2), math.min(br[2], ht+1) - ul[2],
                       math.max(1, -ul[1] + 2), math.min(br[1], wd+1) - ul[1]}

These lines define where in the new image the pixels cut from the old image are placed.
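To see how old_ and new_ line up, here is a Python sketch that computes the same 1-based inclusive ranges for the row and column axes (the leading 1,-1 pair for channels is dropped). crop_ranges is a hypothetical helper and the sample corners and image size are made-up values.

```python
def crop_ranges(ul, br, ht, wd):
    """Return (old_rows, old_cols, new_rows, new_cols) as 1-based inclusive pairs."""
    old_rows = (max(1, ul[1]), min(br[1], ht + 1) - 1)
    old_cols = (max(1, ul[0]), min(br[0], wd + 1) - 1)
    new_rows = (max(1, -ul[1] + 2), min(br[1], ht + 1) - ul[1])
    new_cols = (max(1, -ul[0] + 2), min(br[0], wd + 1) - ul[0])
    return old_rows, old_cols, new_rows, new_cols


# Crop box sticks out past the top-left corner of a 300 x 200 (ht x wd) image
print(crop_ranges([-94, -31], [162, 225], 300, 200))
# ((1, 224), (1, 161), (33, 256), (96, 256))
```

The source and destination ranges always cover the same number of rows and columns, so the part of the crop box that lies outside the image simply stays zero in the new image.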

local newImg = torch.zeros(img:size(1), br[2] - ul[2], br[1] - ul[1])

This line sets the size of the new image: 256*256 when scaleFactor >= 2, otherwise some m*n.

if not pcall(function() newImg:sub(unpack(new_)):copy(tmpImg:sub(unpack(old_))) end) then 
end

This copies the region from the old image into the new image.

if scaleFactor < 2 then newImg = image.scale(newImg,res,res) end

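I was worried at first that, when scaleFactor < 2, some body parts would be cut off if the region returned by transform was larger than 256*256. They are not: newImg above has size br - ul, which guarantees the whole person is included, and a final image.scale then resizes the image to 256*256.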

function M.crop2(img, center, scale, rot, res)
    local ndim = img:nDimension()
    if ndim == 2 then img = img:view(1,img:size(1),img:size(2)) end
    local ht,wd = img:size(2), img:size(3)
    local tmpImg,newImg = img, torch.zeros(img:size(1), res, res)

    -- Debug: keep the original inputs so they can be printed if the copy fails
    local cc = center
    local ss = scale

    -- Modify crop approach depending on whether we zoom in/out
    -- This is for efficiency in extreme scaling case
    local scaleFactor = (200 * scale) / res
    if scaleFactor < 2.0 then
        scaleFactor = 1
    else
        local newSize = math.floor(math.max(ht,wd) / scaleFactor)
        if newSize < 2 then
           -- Zoomed out so much that the image is now a single pixel or less
           if ndim == 2 then newImg = newImg:view(newImg:size(2),newImg:size(3)) end
           return newImg
        else
           tmpImg = image.scale(img,newSize)
           ht,wd = tmpImg:size(2),tmpImg:size(3)
        end
    end

    -- Calculate upper left and bottom right coordinates defining crop region
    local c,s = center:float()/scaleFactor, scale/scaleFactor
    local ul = M.transform({1,1}, c, s, 0, res, true)
    local br = M.transform({res+1,res+1}, c, s, 0, res, true)

    if scaleFactor >= 2 then br:add(-(br - ul - res)) end

    -- If the image is to be rotated, pad the cropped area
    local pad = math.ceil(torch.norm((ul - br):float())/2 - (br[1]-ul[1])/2)
    if rot ~= 0 then ul:add(-pad); br:add(pad) end

    -- Define the range of pixels to take from the old image
    local old_ = {1,-1,math.max(1, ul[2]), math.min(br[2], ht+1) - 1,
                       math.max(1, ul[1]), math.min(br[1], wd+1) - 1}
    -- And where to put them in the new image
    local new_ = {1,-1,math.max(1, -ul[2] + 2), math.min(br[2], ht+1) - ul[2],
                       math.max(1, -ul[1] + 2), math.min(br[1], wd+1) - ul[1]}

    -- Initialize new image and copy pixels over
    newImg = torch.zeros(img:size(1), br[2] - ul[2], br[1] - ul[1])

    -- Debug: index 2 is the row (height) axis and index 1 the column (width) axis
    local height_his = br[2] - ul[2]
    local width_his = br[1] - ul[1]
    if not pcall(function() newImg:sub(unpack(new_)):copy(tmpImg:sub(unpack(old_))) end) then
       print("Error occurred during crop!")
       print(width_his)
       print(height_his)
       print(cc)
       print(ss)
       print(newImg:size())
    end

    -- The rotation step is left out here (see 1.5)

    if scaleFactor < 2 then newImg = image.scale(newImg,res,res) end
    if ndim == 2 then newImg = newImg:view(newImg:size(2),newImg:size(3)) end
    return newImg
end

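transform first obtains the transform matrix t from getTransform and then uses t to transform coordinates. When invert is set, the inverse of t is multiplied by the coordinates instead.

The usual relation for a transform matrix is:

t * x = x_new
x = inv(t) * x_new

With invert set, this function therefore takes a point that is known after the transform and finds where it lies in the original image. How this is done becomes clear from getTransform.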

function M.transform(pt, center, scale, rot, res, invert)
    local pt_ = torch.ones(3)
    pt_[1],pt_[2] = pt[1]-1,pt[2]-1

    local t = M.getTransform(center, scale, rot, res)
    if invert then
        t = torch.inverse(t)
    end
    local new_point = (t*pt_):sub(1,2)

    return new_point:int():add(1)
end
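Both directions can be tried out with a rotation-free Python sketch of the same arithmetic. It is not a port of the Torch code: the matrix is reduced to its three scale and translation terms, int() is assumed to truncate like Torch's :int(), and the center and scale values are made up.

```python
def transform(pt, center, scale, res, invert=False):
    """Rotation-free sketch of M.transform; pt and the result are 1-based."""
    h = 200 * scale
    k = res / h                          # t[1][1] and t[2][2]
    tx = res * (-center[0] / h + 0.5)    # t[1][3]
    ty = res * (-center[1] / h + 0.5)    # t[2][3]
    x, y = pt[0] - 1, pt[1] - 1
    if invert:
        # Inverse of "scale, then translate": undo the translation, then the scale
        x, y = (x - tx) / k, (y - ty) / k
    else:
        x, y = k * x + tx, k * y + ty
    # int() truncates toward zero; assumed to match Torch's :int()
    return [int(x) + 1, int(y) + 1]


center, scale, res = [300.5, 200.5], 2.56, 256

# Forward: a source point next to the center lands next to the middle of the box
print(transform([301, 201], center, scale, res))

# Inverse: the top-left corner of the box mapped back into the source image
print(transform([1, 1], center, scale, res, invert=True))
```

The inverse result can be negative, which is why the crop code clamps its ranges against the image bounds.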

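The inputs are a person's center position, the person's scale, the rotation angle, and the resolution after scaling.

1) Since scale is h/200, h here is the height of the person's bounding box.

2) t is the transform matrix.

3) t[1][1] and t[2][2] hold the scale needed to bring this person to a size of 256.

4) t[1][3] and t[2][3] translate the center of the person's box to (128,128).

If x = a and y = b, that is, the point is exactly the center, then after the scaling and translation it lands at (128,128). Going the other way, x = inv(t) * x_new gives the corresponding point in the unscaled original image.

5) Rotation matrix

The matrix used is the one for rotation about an arbitrary point (tx,ty). In this program tx = 128 and ty = 128, so the rotation is about (128,128).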

function M.getTransform(center, scale, rot, res)
    local h = 200 * scale
    local t = torch.eye(3)

    -- Scaling
    t[1][1] = res / h
    t[2][2] = res / h

    -- Translation
    t[1][3] = res * (-center[1] / h + .5)
    t[2][3] = res * (-center[2] / h + .5)

    -- Rotation
    if rot ~= 0 then
        rot = -rot
        local r = torch.eye(3)
        local ang = rot * math.pi / 180
        local s = math.sin(ang)
        local c = math.cos(ang)
        r[1][1] = c
        r[1][2] = -s
        r[2][1] = s
        r[2][2] = c
        -- Need to make sure rotation is around center
        local t_ = torch.eye(3)
        t_[1][3] = -res/2
        t_[2][3] = -res/2
        local t_inv = torch.eye(3)
        t_inv[1][3] = res/2
        t_inv[2][3] = res/2
        t = t_inv * r * t_ * t
    end

    return t
end

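6) Putting this together, here is what the returned matrix t does

t = t_inv * r * t_ * t

t*(x,y,1) first translates and scales so that the person is centered at (128,128) and fits inside the 256*256 box, then rotates the image about (128,128). Multiplying the inverse of t by the transformed coordinates takes them back to the original image.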
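A quick way to convince yourself of this composition is to build the matrix and check that the person's center always lands on (128,128), whatever the rotation. The sketch below is plain Python with 3x3 lists standing in for Torch tensors; matmul and apply are hypothetical helpers, and the center and scale are made-up values.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def get_transform(center, scale, rot, res):
    """Same construction as getTransform: scale, translate, rotate about the box center."""
    h = 200 * scale
    t = [[res / h, 0.0, res * (-center[0] / h + 0.5)],
         [0.0, res / h, res * (-center[1] / h + 0.5)],
         [0.0, 0.0, 1.0]]
    if rot != 0:
        ang = -rot * math.pi / 180
        s, c = math.sin(ang), math.cos(ang)
        r = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
        t_ = [[1.0, 0.0, -res / 2], [0.0, 1.0, -res / 2], [0.0, 0.0, 1.0]]
        t_inv = [[1.0, 0.0, res / 2], [0.0, 1.0, res / 2], [0.0, 0.0, 1.0]]
        t = matmul(t_inv, matmul(r, matmul(t_, t)))
    return t


def apply(t, x, y):
    """Multiply t by the homogeneous point (x, y, 1) and drop the last entry."""
    return (t[0][0] * x + t[0][1] * y + t[0][2],
            t[1][0] * x + t[1][1] * y + t[1][2])


for rot in (0, 30, 90):
    t = get_transform((300, 200), 2.0, rot, 256)
    print(rot, apply(t, 300, 200))   # the center stays at about (128, 128)
```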

1.3 Label generation

drawGaussian(out[i], transform(torch.add(pts[i],1), c, s, 0, self.outputRes), self.gsize)

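This draws the Gaussian heatmap.

Notice that only 5 arguments are passed here: the invert argument is missing. Why?

Because drawing the heatmap requires mapping coordinates onto the heatmap itself (self.outputRes on each side). This is not the inverse transform but the forward one: the person located at c with scale s is enclosed in the output box, and the question is where each keypoint lands.

So the code goes straight to local new_point = (t*pt_):sub(1,2), and t here is not the inverse of the transform matrix.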

function M.drawGaussian(img, pt, sigma)
    -- Draw a 2D gaussian
    -- Check that any part of the gaussian is in-bounds
    local tmpSize = math.ceil(3*sigma)
    local ul = {math.floor(pt[1] - tmpSize), math.floor(pt[2] - tmpSize)}
    local br = {math.floor(pt[1] + tmpSize), math.floor(pt[2] + tmpSize)}
    -- If not, return the image as is
    if (ul[1] > img:size(2) or ul[2] > img:size(1) or br[1] < 1 or br[2] < 1) then return img end
    -- Generate gaussian
    local size = 2*tmpSize + 1
    local g = image.gaussian(size)
    -- Usable gaussian range
    local g_x = {math.max(1, -ul[1]), math.min(br[1], img:size(2)) - math.max(1, ul[1]) + math.max(1, -ul[1])}
    local g_y = {math.max(1, -ul[2]), math.min(br[2], img:size(1)) - math.max(1, ul[2]) + math.max(1, -ul[2])}
    -- Image range
    local img_x = {math.max(1, ul[1]), math.min(br[1], img:size(2))}
    local img_y = {math.max(1, ul[2]), math.min(br[2], img:size(1))}
    assert(g_x[1] > 0 and g_y[1] > 0)
    img:sub(img_y[1], img_y[2], img_x[1], img_x[2]):cmax(g:sub(g_y[1], g_y[2], g_x[1], g_x[2]))
    return img
end

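When drawing the heatmap, only the part of the Gaussian that falls inside the heatmap is kept; anything outside its bounds is clipped away.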
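The clipping idea can be shown with a much simpler Python sketch. It is not a line-by-line port of drawGaussian: it uses 0-based indices, a nested list as the heatmap, and a peak-1 Gaussian whose width is taken directly from sigma, which is an assumption rather than the behaviour of image.gaussian. The 64 x 64 heatmap size is also only an example value.

```python
import math


def draw_gaussian(heatmap, pt, sigma):
    """Max-blend a peak-1 Gaussian centred at 0-based (x, y) = pt into heatmap."""
    rows, cols = len(heatmap), len(heatmap[0])
    radius = math.ceil(3 * sigma)
    x0, y0 = int(pt[0]), int(pt[1])
    # The ranges are clamped to the heatmap, so out-of-bounds parts are skipped;
    # a keypoint far outside gives empty ranges and leaves the heatmap untouched
    for y in range(max(0, y0 - radius), min(rows, y0 + radius + 1)):
        for x in range(max(0, x0 - radius), min(cols, x0 + radius + 1)):
            g = math.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
            heatmap[y][x] = max(heatmap[y][x], g)
    return heatmap


heatmap = [[0.0] * 64 for _ in range(64)]
draw_gaussian(heatmap, (10, 20), 1)     # fully inside
draw_gaussian(heatmap, (-1, 0), 1)      # partly outside: only the visible part is drawn
draw_gaussian(heatmap, (200, 200), 1)   # fully outside: nothing changes
print(heatmap[20][10])  # 1.0
```

Taking the maximum instead of overwriting mirrors the cmax call in the Lua code, so overlapping keypoints do not erase each other.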

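1.4 Open question

Do the forward and inverse transforms really match up seamlessly? Worth investigating further.

1.5 Rotation: to be covered later

2. Testing

What is different at test time? There is certainly no rotation during testing. The code is covered in the next chapter.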