Drawing bounding boxes, pitfalls included

First, the task: given the nyu_depth_v2_labeled.mat data, read the mat file in Python and draw a bounding box around every object in each image.
The labeled mat file is a MATLAB v7.3 (HDF5) file, so it is opened with h5py via h5_file = h5py.File("nyu_depth_v2_labeled.mat"), which exposes the data as matrices:

import h5py
import scipy.io

# splits.mat is an ordinary MAT file, so scipy.io can load it directly
splits = scipy.io.loadmat('splits.mat')

# nyu_depth_v2_labeled.mat is a v7.3 (HDF5) MAT file, so open it with h5py
h5_file = h5py.File('nyu_depth_v2_labeled.mat', 'r')

# iterate over all items in the mat file and print each item's name and value
for name, data in h5_file.items():
    print("Name ", name)
    if isinstance(data, h5py.Dataset):
        # a Dataset holds the data directly as a NumPy array
        print("Value", data[()])
    else:
        # otherwise it is a Group; access its sub-items individually
        print("Value", list(data.keys()))

The code above lets us inspect every field stored in the mat file; they are summarised below, with shapes as given in the official documentation (see the sketch after the list for how h5py actually reports them):

N = 1449 is the number of labeled images; H = 480 and W = 640 are the image height and width.

accelData: 4xN          accelerometer values recorded when each frame was taken;
                        contains the roll, yaw, pitch and tilt angle of the device.
depths: HxWxN           in-painted depth maps (480x640x1449).
images: HxWx3xN         RGB images (480x640x3x1449).
instances: HxWxN        instance maps (480x640x1449).
labels: HxWxN           label maps (480x640x1449); values range from 1..C, where C is the total
                        number of classes. If a pixel's label value is 0, then that pixel is 'unlabeled'.
names: Cx1              cell array of the English names of each class (C = 894).
namesToIds:             map from English label names to class IDs (with C key-value pairs).
rawDepthFilenames: Nx1  cell array of the filenames (in the Raw dataset) used for each of the
                        depth images in the labeled dataset.
rawDepths: HxWxN        raw depth maps (480x640x1449). These depth maps capture the depth images
                        after they have been projected onto the RGB image plane but before the
                        missing depth values have been filled in. Additionally, the depth
                        non-linearity from the Kinect device has been removed and the values of
                        each depth image are in meters.
rawRgbFilenames: Nx1    cell array of the filenames (in the Raw dataset) used for each of the
                        RGB images in the labeled dataset.
sceneTypes: Nx1         cell array of the scene type from which each image was taken.
scenes: Nx1             cell array of the name of the scene from which each image was taken.
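One pitfall worth flagging here: the shapes above follow the official MATLAB documentation, but h5py returns MATLAB v7.3 arrays with the axis order reversed, so HxWxN shows up in Python as NxWxH (and HxWx3xN as Nx3xWxH). A short sketch to confirm this on your own copy of the file:

import h5py

h5_file = h5py.File('nyu_depth_v2_labeled.mat', 'r')

# MATLAB stores arrays column-major, so h5py reports the dimensions reversed
for key in ['images', 'depths', 'labels', 'instances']:
    print(key, h5_file[key].shape)

# Expected, based on the HxWxN shapes documented above:
#   images    -> (1449, 3, 640, 480)
#   depths    -> (1449, 640, 480)
#   labels    -> (1449, 640, 480)
#   instances -> (1449, 640, 480)

This axis reversal is the reason the code further down transposes image and label with .T before drawing.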

Using the class label that labels stores for every pixel of every image, each object class can be framed with a bounding box:

    labels = h5_file['labels']   # N x W x H = 1449 x 640 x 480 as returned by h5py
    images = h5_file['images']   # N x 3 x W x H = 1449 x 3 x 640 x 480
    # sceneTypes holds object references to uint16 character arrays;
    # decode each one into a Python string
    scenes = [''.join(chr(c) for c in h5_file[obj_ref][:].flatten())
              for obj_ref in h5_file['sceneTypes'][0]]

    print("processing images")
    for i, image in enumerate(images):
        print("image", i + 1, "/", len(images))
        # transpose back to H x W (and H x W x 3) before drawing
        draw_box(i, scenes[i], image.T, labels[i, :, :].T)

From the mat data we take the images, labels and scenes datasets: the boxes are drawn on the images, labels gives the class of every object (pixel) in each image, and scenes lets the boxed images be saved grouped by scene category (a small sketch of the per-scene folders follows below).
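The per-scene saving itself does not appear in the snippets below, which simply write into a folder variable; a minimal sketch of deriving that folder from the scene name, assuming a hypothetical base directory output_root, could look like this:

import os

def scene_folder(output_root, scene):
    # one folder per scene category, created on demand, e.g. "out/bedroom"
    folder = os.path.join(output_root, scene)
    os.makedirs(folder, exist_ok=True)
    return folder

# usage inside the loop above:
#   folder = scene_folder("out", scenes[i])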
Next comes the key code for drawing the bounding boxes.
First, collect all the class labels that appear in an image and store them in a list:

import cv2
from imageio import imwrite as imsave   # replacement for the deprecated scipy.misc.imsave

def draw_box(i, scene, image, label):
    L = []
    shape = label.shape          # shape[0] = H, shape[1] = W
    # collect the class label of every pixel; 0 is the unlabeled/background class
    for j in range(shape[0]):
        for k in range(shape[1]):
            if label[j, k] != 0:
                L.append(label[j, k])
    # deduplicate: L1 holds each class that appears in this image exactly once
    L1 = list(set(L))

For every pixel we record the class it belongs to; class 0 is the background (unlabeled) class and is not counted. Since many pixels share the same class, the whole list is deduplicated at the end, so what remains is one entry per class that actually appears in the image (the set of classes differs from image to image).
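The double loop works, but since label is already a NumPy array, the same per-image class list can be obtained in a single call; this is just an equivalent shortcut, not the code used above:

import numpy as np

def present_classes(label):
    # classes that appear in the label map, excluding the background class 0
    classes = np.unique(label)          # sorted unique label values
    return classes[classes != 0].tolist()

# equivalent to building L pixel by pixel and then taking L1 = list(set(L))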
Next, for each class in the list, traverse the whole image to find the extent of the pixels belonging to that class:

    image = image.copy()             # cv2.rectangle needs a writable, contiguous copy
    for X in L1:
        # track the extent of all pixels whose label equals X
        minX = shape[1]              # X runs along the columns, so start from the width
        minY = shape[0]              # Y runs along the rows, so start from the height
        maxX = maxY = 0
        for j in range(shape[0]):
            for k in range(shape[1]):
                if label[j, k] == X:
                    if k < minX: minX = k
                    if k > maxX: maxX = k
                    if j < minY: minY = j
                    if j > maxY: maxY = j
        cv2.rectangle(image, (minX, minY), (maxX, maxY), (0, 255, 0), 2)
    folder = scene                   # assumed per-scene output folder; see the sketch above
    imsave("%s/%05d_bounding_box.png" % (folder, i), image)

Because of how the OpenCV Python bindings handle NumPy arrays, cv2.rectangle cannot draw directly on the array sliced from h5py and transposed with .T (its memory layout is not compatible with cv::Mat), so a copy has to be made first.
Note that for the labels array obtained in Python, shape[0] is the height H and shape[1] is the width W.
When computing the top-left and bottom-right corners of a bounding box, compare the column index k against the X bounds and the row index j against the Y bounds, since OpenCV expects points as (x, y) = (column, row).
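If you prefer to avoid the manual min/max bookkeeping (and the coordinate-swapping pitfall just described), the box for one class can also be computed with NumPy; this is an optional alternative, not the code used above:

import numpy as np

def bbox_of_class(label, X):
    # rows index j (the Y direction), cols index k (the X direction),
    # matching OpenCV's (x, y) = (column, row) point convention
    rows, cols = np.where(label == X)
    return int(cols.min()), int(rows.min()), int(cols.max()), int(rows.max())

# usage:
#   minX, minY, maxX, maxY = bbox_of_class(label, X)
#   cv2.rectangle(image, (minX, minY), (maxX, maxY), (0, 255, 0), 2)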
