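py-faster-rcnn Explained (2): The pascal_voc.py Interface

The imdb object is an instance of the pascal_voc class, which inherits from imdb and handles data interaction.

Initialization function

While initializing itself, the constructor first calls the parent class's initializer and passes in the imdb name, for example 'voc_2007_trainval'.
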
  # Imports used by the excerpts in this post (Python 2)
  import os
  import uuid
  import cPickle
  import xml.etree.ElementTree as ET
  import numpy as np
  import scipy.sparse
  from datasets.imdb import imdb
  from fast_rcnn.config import cfg

  class pascal_voc(imdb):
      def __init__(self, image_set, year, devkit_path=None):
          imdb.__init__(self, 'voc_' + year + '_' + image_set)
          self._year = year
          self._image_set = image_set
          self._devkit_path = self._get_default_path() if devkit_path is None \
                              else devkit_path
          self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
          self._classes = ('__background__', # always index 0
                           'aeroplane', 'bicycle', 'bird', 'boat',
                           'bottle', 'bus', 'car', 'cat', 'chair',
                           'cow', 'diningtable', 'dog', 'horse',
                           'motorbike', 'person', 'pottedplant',
                           'sheep', 'sofa', 'train', 'tvmonitor')
          # Map each class name to its integer label
          self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))
          self._image_ext = '.jpg'
          self._image_index = self._load_image_set_index()
          # Default to roidb handler
          self._roidb_handler = self.selective_search_roidb
          self._salt = str(uuid.uuid4())
          self._comp_id = 'comp4'

          # PASCAL specific config options
          self.config = {'cleanup'     : True,
                         'use_salt'    : True,
                         'use_diff'    : False,
                         'matlab_eval' : False,
                         'rpn_file'    : None,
                         'min_size'    : 2}

          assert os.path.exists(self._devkit_path), \
                  'VOCdevkit path does not exist: {}'.format(self._devkit_path)
          assert os.path.exists(self._data_path), \
                  'Path does not exist: {}'.format(self._data_path)
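
As a rough illustration of what the constructor sets up, the sketch below rebuilds the dataset name and the class-to-index mapping outside the class. It is written for Python 3 (so `range` replaces `xrange`), and `build_name` is a hypothetical helper introduced only for this example, not part of py-faster-rcnn.

```python
# Minimal sketch (Python 3); build_name is a hypothetical helper for illustration.
CLASSES = ('__background__',  # always index 0
           'aeroplane', 'bicycle', 'bird', 'boat',
           'bottle', 'bus', 'car', 'cat', 'chair',
           'cow', 'diningtable', 'dog', 'horse',
           'motorbike', 'person', 'pottedplant',
           'sheep', 'sofa', 'train', 'tvmonitor')


def build_name(image_set, year):
    """Mirror the string passed to imdb.__init__ in the excerpt above."""
    return 'voc_' + year + '_' + image_set


# Same idea as self._class_to_ind: class name -> integer label.
class_to_ind = dict(zip(CLASSES, range(len(CLASSES))))

print(build_name('trainval', '2007'))   # voc_2007_trainval
print(class_to_ind['__background__'])   # 0
print(class_to_ind['tvmonitor'])        # 20
```

Because the background class sits at index 0, the 20 object classes occupy labels 1 through 20.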

 
 

image_path_from_index

Given an image index such as '000001', this function returns the path of the corresponding image under JPEGImages.

  def image_path_at(self, i):
      """
      Return the absolute path to image i in the image sequence.
      """
      return self.image_path_from_index(self._image_index[i])


  def image_path_from_index(self, index):
      """
      Construct an image path from the image's "index" identifier.
      """
      image_path = os.path.join(self._data_path, 'JPEGImages',
                                index + self._image_ext)
      assert os.path.exists(image_path), \
              'Path does not exist: {}'.format(image_path)
      return image_path

_load_image_set_index

This function loads the image indices from /VOCdevkit2007/VOC2007/ImageSets/Main/<image_set>.txt.

  def _load_image_set_index(self):
      """
      Load the indexes listed in this dataset's image set file.
      """
      # Example path to image set file:
      # self._devkit_path + /VOCdevkit2007/VOC2007/ImageSets/Main/val.txt
      image_set_file = os.path.join(self._data_path, 'ImageSets', 'Main',
                                    self._image_set + '.txt')
      assert os.path.exists(image_set_file), \
              'Path does not exist: {}'.format(image_set_file)
      with open(image_set_file) as f:
          image_index = [x.strip() for x in f.readlines()]
      return image_index
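
The two lookups above can be tried without a real dataset. The following is a minimal sketch in Python 3 that uses a temporary directory as a stand-in for the VOC2007 folder; the free functions and the sample index values are assumptions made for illustration, and the image path is built without checking that the file exists.

```python
# Minimal sketch (Python 3): a throwaway directory stands in for VOC2007.
import os
import tempfile


def load_image_set_index(data_path, image_set):
    """Read one image index per line from ImageSets/Main/<image_set>.txt."""
    image_set_file = os.path.join(data_path, 'ImageSets', 'Main',
                                  image_set + '.txt')
    assert os.path.exists(image_set_file), \
        'Path does not exist: {}'.format(image_set_file)
    with open(image_set_file) as f:
        return [line.strip() for line in f]


def image_path_from_index(data_path, index, image_ext='.jpg'):
    """Build the JPEGImages path for an index (existence is not checked here)."""
    return os.path.join(data_path, 'JPEGImages', index + image_ext)


data_path = tempfile.mkdtemp()  # hypothetical stand-in for .../VOC2007
os.makedirs(os.path.join(data_path, 'ImageSets', 'Main'))
with open(os.path.join(data_path, 'ImageSets', 'Main', 'val.txt'), 'w') as f:
    f.write('000001\n000002\n')

image_index = load_image_set_index(data_path, 'val')
print(image_index)  # ['000001', '000002']
print(image_path_from_index(data_path, image_index[0]))
```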
 
 

_get_default_path

Returns the default dataset path, which here is VOCdevkit2007 under the data directory.

  def _get_default_path(self):
      """
      Return the default path where PASCAL VOC is expected to be installed.
      """
      return os.path.join(cfg.DATA_DIR, 'VOCdevkit' + self._year)
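
To see how this default interacts with the `devkit_path` argument of the constructor, here is a small sketch in Python 3. `DATA_DIR` is a placeholder for `cfg.DATA_DIR` with an assumed value, and `resolve_paths` is a hypothetical helper that does not exist in the original code.

```python
# Minimal sketch (Python 3); DATA_DIR is a placeholder for cfg.DATA_DIR.
import os

DATA_DIR = 'data'  # assumed value, for illustration only


def resolve_paths(year, devkit_path=None):
    """Return (devkit_path, data_path) the way the constructor derives them."""
    if devkit_path is None:
        # Fall back to the default location, as _get_default_path does.
        devkit_path = os.path.join(DATA_DIR, 'VOCdevkit' + year)
    data_path = os.path.join(devkit_path, 'VOC' + year)
    return devkit_path, data_path


print(resolve_paths('2007'))
print(resolve_paths('2007', devkit_path='my_devkit'))
```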

 
 

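gt_roidb

This function returns the roidb data object. It first looks in the cache directory for a cache file ending in '.pkl', which holds the roidb serialized with cPickle. If the file exists, the function loads and returns its contents directly, which saves time (so when you switch datasets, delete the cache file first, otherwise the stale cache will cause errors). If the file does not exist, the function calls the private function _load_pascal_annotation for every image index to build the roidb, saves the result to the cache file, and returns it.
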
  def gt_roidb(self):
      """
      Return the database of ground-truth regions of interest.

      This function loads/saves from/to a cache file to speed up future calls.
      """
      cache_file = os.path.join(self.cache_path, self.name + '_gt_roidb.pkl')
      if os.path.exists(cache_file):
          with open(cache_file, 'rb') as fid:
              roidb = cPickle.load(fid)
          print '{} gt roidb loaded from {}'.format(self.name, cache_file)
          return roidb

      gt_roidb = [self._load_pascal_annotation(index)
                  for index in self.image_index]
      with open(cache_file, 'wb') as fid:
          cPickle.dump(gt_roidb, fid, cPickle.HIGHEST_PROTOCOL)
      print 'wrote gt roidb to {}'.format(cache_file)

      return gt_roidb
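
The load-or-build caching pattern, and the stale-cache pitfall mentioned above, can be reproduced in a few lines. This is a minimal sketch in Python 3 that uses `pickle` in place of `cPickle`; `cached_roidb` and the toy roidb entries are hypothetical and serve only to illustrate the behavior.

```python
# Minimal sketch (Python 3, pickle instead of cPickle); names are illustrative.
import os
import pickle
import tempfile


def cached_roidb(cache_file, build_fn):
    """Load the roidb from cache_file if present, else build and cache it."""
    if os.path.exists(cache_file):
        with open(cache_file, 'rb') as fid:
            return pickle.load(fid)
    roidb = build_fn()
    with open(cache_file, 'wb') as fid:
        pickle.dump(roidb, fid, pickle.HIGHEST_PROTOCOL)
    return roidb


cache_file = os.path.join(tempfile.mkdtemp(), 'voc_2007_trainval_gt_roidb.pkl')

first = cached_roidb(cache_file, lambda: [{'boxes': [[0, 0, 9, 9]]}])
# The builder changed (think: a different dataset), but the stale cache wins.
second = cached_roidb(cache_file, lambda: [{'boxes': [[5, 5, 20, 20]]}])
print(second == first)  # True: the second builder was never called

os.remove(cache_file)  # deleting the cache forces a rebuild
third = cached_roidb(cache_file, lambda: [{'boxes': [[5, 5, 20, 20]]}])
print(third)  # [{'boxes': [[5, 5, 20, 20]]}]
```

The cache file name depends only on the dataset name, so changing the annotations under the same name is not detected; removing the file by hand is the simplest remedy.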
 
 

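_load_pascal_annotation

For each image index, this function finds the corresponding XML annotation file in the Annotations folder and loads all of its bounding-box objects. It then assembles the resulting arrays into a roidb entry and returns it.

  def _load_pascal_annotation(self, index):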
      """
      Load image and bounding boxes info from XML file in the PASCAL VOC
      format.
      """
      filename = os.path.join(self._data_path, 'Annotations', index + '.xml')
      tree = ET.parse(filename)
      objs = tree.findall('object')
      if not self.config['use_diff']:
          # Exclude the samples labeled as difficult
          non_diff_objs = [
              obj for obj in objs if int(obj.find('difficult').text) == 0]
          # if len(non_diff_objs) != len(objs):
          #     print 'Removed {} difficult objects'.format(
          #         len(objs) - len(non_diff_objs))
          objs = non_diff_objs
      num_objs = len(objs)

      boxes = np.zeros((num_objs, 4), dtype=np.uint16)
      gt_classes = np.zeros((num_objs), dtype=np.int32)
      overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
      # "Seg" area for pascal is just the box area
      seg_areas = np.zeros((num_objs), dtype=np.float32)

      # Load object bounding boxes into a data frame.
      for ix, obj in enumerate(objs):
          bbox = obj.find('bndbox')
          # Make pixel indexes 0-based
          x1 = float(bbox.find('xmin').text) - 1
          y1 = float(bbox.find('ymin').text) - 1
          x2 = float(bbox.find('xmax').text) - 1
          y2 = float(bbox.find('ymax').text) - 1
          cls = self._class_to_ind[obj.find('name').text.lower().strip()]
          boxes[ix, :] = [x1, y1, x2, y2]
          gt_classes[ix] = cls
          overlaps[ix, cls] = 1.0
          seg_areas[ix] = (x2 - x1 + 1) * (y2 - y1 + 1)

      overlaps = scipy.sparse.csr_matrix(overlaps)

      return {'boxes' : boxes,
              'gt_classes': gt_classes,
              'gt_overlaps' : overlaps,
              'flipped' : False,
              'seg_areas' : seg_areas}
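
The parsing steps above can be sketched on a hand-written annotation. The example below is Python 3 and uses plain lists instead of NumPy arrays to stay dependency-free; the XML content and the `load_boxes` helper are made up for illustration and follow the PASCAL VOC layout only as far as the excerpt requires.

```python
# Minimal sketch (Python 3, plain lists instead of NumPy arrays).
import xml.etree.ElementTree as ET

# Hand-written annotation in the PASCAL VOC layout, for illustration only.
XML = """
<annotation>
  <object>
    <name>dog</name>
    <difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <difficult>1</difficult>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""


def load_boxes(xml_text, use_diff=False):
    """Return (name, [x1, y1, x2, y2]) pairs with 0-based pixel coordinates."""
    objs = ET.fromstring(xml_text.strip()).findall('object')
    if not use_diff:
        # Drop objects flagged as difficult, as the original code does.
        objs = [o for o in objs if int(o.find('difficult').text) == 0]
    result = []
    for obj in objs:
        bbox = obj.find('bndbox')
        # VOC coordinates are 1-based, so subtract 1 from each.
        box = [float(bbox.find(tag).text) - 1
               for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        result.append((obj.find('name').text.lower().strip(), box))
    return result


boxes = load_boxes(XML)
print(boxes)  # [('dog', [47.0, 239.0, 194.0, 370.0])]
```

With `use_diff=True` the difficult "person" object is kept as well, which matches the effect of the `use_diff` config option.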
 
 

rpn_roidb

After the RPN has produced proposals, this function combines the proposal RoIs with the ground truth so that they can be fed to the network for training. It uses merge_roidbs to merge gt_roidb and rpn_roidb and returns the result. For a test set from any year other than 2007, no ground truth is loaded and only the RPN proposals are returned.

  def rpn_roidb(self):
      if int(self._year) == 2007 or self._image_set != 'test':
          gt_roidb = self.gt_roidb()
          rpn_roidb = self._load_rpn_roidb(gt_roidb)
          roidb = imdb.merge_roidbs(gt_roidb, rpn_roidb)
      else:
          roidb = self._load_rpn_roidb(None)

      return roidb

  def _load_rpn_roidb(self, gt_roidb):
      # Path to the pickled RPN proposals, set through the config dict
      filename = self.config['rpn_file']
      print 'loading {}'.format(filename)
      assert os.path.exists(filename), \
             'rpn data not found at: {}'.format(filename)
      with open(filename, 'rb') as f:
          box_list = cPickle.load(f)
      return self.create_roidb_from_box_list(box_list, gt_roidb)
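
The merge step can be pictured as a per-image concatenation. The sketch below is a simplified stand-in for imdb.merge_roidbs, written in Python 3 with plain lists, whereas the real roidb entries hold NumPy and SciPy arrays and more fields; the box values and the class label 0 used for proposals are assumptions chosen for illustration.

```python
# Minimal sketch (Python 3); a simplified stand-in for imdb.merge_roidbs that
# uses plain lists, whereas the real roidb entries hold NumPy/SciPy arrays.
def merge_roidbs(a, b):
    """Append each image's entries in b to the matching image's entries in a."""
    assert len(a) == len(b), 'both roidbs must cover the same images'
    for entry_a, entry_b in zip(a, b):
        entry_a['boxes'] = entry_a['boxes'] + entry_b['boxes']
        entry_a['gt_classes'] = entry_a['gt_classes'] + entry_b['gt_classes']
    return a


gt_roidb = [{'boxes': [[47, 239, 194, 370]], 'gt_classes': [12]}]
# Proposals carry class 0 here as a placeholder for "no ground-truth label".
rpn_roidb = [{'boxes': [[40, 230, 200, 380], [0, 0, 50, 50]],
              'gt_classes': [0, 0]}]

roidb = merge_roidbs(gt_roidb, rpn_roidb)
print(len(roidb[0]['boxes']))    # 3
print(roidb[0]['gt_classes'])    # [12, 0, 0]
```

Note that this sketch, like a list-based merge in general, modifies the first roidb in place and returns it.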

 
 