Motion-based Segmentation and Recognition Dataset

(this is a draft version of this page)

Please cite:
  (1) Segmentation and Recognition Using Structure from Motion Point Clouds, ECCV 2008 (pdf)
      Brostow, Shotton, Fauqueur, Cipolla (bibtex)
  (2) Semantic Object Classes in Video: A High-Definition Ground Truth Database (pdf)
      Pattern Recognition Letters (to appear)
      Brostow, Fauqueur, Cipolla (bibtex)
    
Description: The Cambridge-driving Labeled Video Database (CamVid) is the first collection of videos with object class semantic labels, complete with metadata. The database provides ground truth labels that associate each pixel with one of 32 semantic classes.

The database addresses the need for experimental data to quantitatively evaluate emerging algorithms. While most videos are filmed with fixed-position CCTV-style cameras, our data was captured from the perspective of a driving automobile. The driving scenario increases the number and heterogeneity of the observed object classes.

Over ten minutes of high-quality 30Hz footage is provided, with corresponding semantically labeled images at 1Hz and, in part, 15Hz. The CamVid Database offers four contributions that are relevant to object analysis researchers. First, the per-pixel semantic segmentation of over 700 images was specified manually, and was then inspected and confirmed by a second person for accuracy. Second, the high-quality, high-resolution color video images in the database represent valuable extended-duration digitized footage for those interested in driving scenarios or ego-motion. Third, we filmed calibration sequences for the camera color response and intrinsics, and computed a 3D camera pose for each frame in the sequences. Finally, in support of expanding this or other databases, we offer custom-made labeling software for assisting users who wish to paint precise class labels for other images and videos. We evaluated the relevance of the database by measuring the performance of an algorithm from each of three distinct domains: multi-class object recognition, pedestrian detection, and label propagation.
    
 Overview Video:  
Avi, 30 Mb, xVid compressed (playback tips, or get the free Mac/Windows player)
or
Mpg, 11 Mb, mpeg-1 compressed (more compatible, but lower quality)


 

CamVid Database

(just samples shown. For all the videos, see below)



 Original Video Sequences: Link to FTP server with video files (very big!)
Link to codecs + utility for extracting frames from those big files

(read the inventory.txt)
 
Labeled Images
(701 so far)
 
Link to zip file with painted class labels for stills from the video sequences.
Txt file listing classes and label colors as RGB triples (sorted).
(Note: the corresponding raw input images only, at 1Hz and already extracted from the respective videos, are here temporarily (556Mb).)
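
As a rough illustration of how the class list could be used to decode a painted label image, here is a minimal Python sketch. It assumes each line of the txt file holds three integers (R, G, B) followed by the class name, separated by whitespace; that layout, the function name parse_label_colors, and the sample rows are all assumptions for illustration, so check the actual file before relying on it.

```python
from typing import Dict, Tuple

RGB = Tuple[int, int, int]


def parse_label_colors(text: str) -> Dict[RGB, str]:
    """Map each (R, G, B) triple to its class name.

    Assumes one class per line in the form "R G B ClassName".
    """
    colors: Dict[RGB, str] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        r, g, b = (int(p) for p in parts[:3])
        colors[(r, g, b)] = " ".join(parts[3:])
    return colors


# Made-up rows in the assumed layout (NOT the real CamVid colors).
sample = "10 20 30 ClassA\n40 50 60 ClassB\n"
lookup = parse_label_colors(sample)

# Decode a tiny 1x2 "label image" given as rows of RGB pixels.
image = [[(10, 20, 30), (40, 50, 60)]]
decoded = [[lookup.get(px, "unknown") for px in row] for row in image]
print(decoded)
```

Keying the dictionary on the RGB triple keeps the per-pixel lookup to a single dictionary access; pixels whose color is not in the list fall back to "unknown".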
 
Camera extrinsics
  Link to files and code (if link breaks someday, go here)
The line that matters for getting the projection matrix of one camera is in MotBoostEvalOneFrame.m (see how LoadBoujou_2Dtrax_3dBans_Misc.m calls it):
curC = Cs( frameNum-offsetForFrameNums,    1:3);
   Example camera pose trajectory, stored in Boujou Animation Format:
each line containing "AddDecompCameraKey" has a K and R matrix and t vector,
so that P = K * R * [I | -t]
 


   seq06R0

Description: 3030 frames at 30Hz == 1:41 min
Sample Frame           
Video file in MXF format *
   
seq16E5

Description: 6120 frames at 30Hz == 3:24 min
Sample Frame      
Video files 1 and 2 in MXF format * (note: these are 2 halves of 1 zip file)



seq16E5_15Hz
(see also CamSeq01)

Description: 202 frames at 30Hz == 0:06 min
Sample Frame
Video files 1 and 2 in MXF format * (note: same files as above, but use a different script)

   
seq05VD

Description: 5130 frames at 30Hz == 2:51 min
Sample Frame
Video file in MXF format *

seq01TP

Description: 3720 frames at 30Hz == 2:04 min
Sample Frame 
Video file in MXF format *
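
The durations quoted for each sequence follow from dividing the frame count by the frame rate. A small Python sketch of that arithmetic (the function name duration_mmss is made up for illustration, and it assumes partial seconds are dropped, which matches the 0:06 quoted for the 202-frame clip):

```python
def duration_mmss(frames: int, hz: int) -> str:
    """Convert a frame count at a given rate to an m:ss string.

    Partial seconds are truncated (an assumption about how the
    page rounds its figures).
    """
    seconds = frames // hz
    return f"{seconds // 60}:{seconds % 60:02d}"


# Frame counts and rates as listed for the video sequences above.
for name, frames in [("seq06R0", 3030), ("seq16E5", 6120),
                     ("seq16E5_15Hz", 202), ("seq05VD", 5130),
                     ("seq01TP", 3720)]:
    print(name, duration_mmss(frames, 30))
```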

    
   
   Listing of (RGB)-Class assignments (alphabetical)      Listing in color-order used by MSRC (with "XX")
  
Moving objects
    Animal
    Pedestrian
    Child
    Rolling cart/luggage/pram
    Bicyclist
    Motorcycle/scooter
    Car (sedan/wagon)
    SUV / pickup truck
    Truck / bus
    Train
    Misc
Road
    Road == drivable surface
    Shoulder
    Lane markings drivable
    Non-Drivable
Ceiling
    Sky
    Tunnel
    Archway
Fixed objects
    Building
    Wall
    Tree
    Vegetation misc.
    Fence
    Sidewalk
    Parking block
    Column/pole
    Traffic cone
    Bridge
    Sign / symbol
    Misc text
    Traffic light
    Other





Hand-Labeled Frames:


seq06R0

Description: 101 frames at 1Hz == 1:41 min
Sample Frame       Preview Video




seq16E5

Description: 204 frames at 1Hz == 3:24 min
Sample Frame       Preview Video

seq16E5_15Hz
(see also CamSeq01)

Description: 101 frames at 15Hz == 0:06 min
Sample Frame       Preview Video




seq05VD

Description: 171 frames at 1Hz == 2:51 min
Sample Frame       Preview Video




seq01TP

Description: 124 frames at 1Hz == 2:04 min
Sample Frame       Preview Video










Paint-Stroke Logs of Manual Labeling:

Example log file, where each of the user's mouse-strokes was recorded to include:
the class label being applied, size and type of brush or pre-segmentation used, location of each click point and drag-path, and duration of each stroke.
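
For readers who want to analyze such logs, the fields listed above could be held in a record like the one below. This is a hypothetical Python sketch: the Stroke type, its field names, and the sample values are invented for illustration and do not reflect the actual log file syntax, which should be read from the example log itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Stroke:
    """One recorded mouse-stroke (hypothetical field names)."""
    class_label: str             # class being painted
    brush_type: str              # brush kind or pre-segmentation used
    brush_size: int              # brush size in pixels
    path: List[Tuple[int, int]]  # click point followed by drag-path points
    duration_s: float            # time spent on the stroke, in seconds


def seconds_per_class(strokes: List[Stroke]) -> Dict[str, float]:
    """Total labeling time spent on each class."""
    totals: Dict[str, float] = {}
    for s in strokes:
        totals[s.class_label] = totals.get(s.class_label, 0.0) + s.duration_s
    return totals


# Invented sample strokes, not taken from a real log.
strokes = [
    Stroke("Road", "round", 12, [(10, 200), (40, 210)], 1.5),
    Stroke("Sky", "round", 30, [(5, 5)], 0.5),
    Stroke("Road", "presegmentation", 0, [(60, 220)], 0.25),
]
print(seconds_per_class(strokes))
```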





InteractLabeler Software:

InteractLabeler.zip for Windows (3.4Mb)
InteractLabeler Documentation
InteractLabeler instructions, as given to volunteers






*MXF format:

This format is like Avi or Quicktime in that it is a wrapper for multimedia files. In our case, only the video channel has data, and it is in HD format. To decode, use this utility (link) along with the scripts provided.



   
   

