Reading the pyannote Source Code (Part 2)
Preface
This article walks through the design of the Annotation class.
Design highlights of pyannote.core.Annotation
Track
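Annotation borrows the concept of a track from multimedia processing; compare audio tracks, text tracks, and similar concepts.
The following article also gives a quick introduction.
What exactly are Text Tracks (TT) in WebVTT?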
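A pyannote.core.Annotation instance is an ordered set of non-empty tracks:
- ordered, because segments are sorted by start time (and by end time in case of a tie)
- set, because the same track cannot be added twice
- non-empty, because an empty track cannot be added
A track is a (support, name) pair, where support is a Segment instance and name is an additional identifier that makes it possible to add several tracks with the same support.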
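The three properties listed above can be imitated in a few lines of plain Python. This is only a toy sketch and not pyannote's implementation: ToySegment and ToyAnnotation are hypothetical names introduced here, and silently ignoring an empty support is an assumption of this sketch.

```python
from typing import Dict, Hashable, Iterator, NamedTuple, Tuple


class ToySegment(NamedTuple):
    """Hypothetical stand-in for pyannote.core.Segment."""
    start: float
    end: float

    def is_empty(self) -> bool:
        return self.end <= self.start


# A track is a (support, name) pair.
Track = Tuple[ToySegment, Hashable]


class ToyAnnotation:
    """Toy model of an ordered set of non-empty tracks (not the real class)."""

    def __init__(self) -> None:
        # Dictionary keys are unique, which gives the "set" property.
        self._labels: Dict[Track, str] = {}

    def add(self, support: ToySegment, name: Hashable, label: str) -> None:
        if support.is_empty():
            return  # "non-empty": this sketch simply ignores empty tracks
        self._labels[(support, name)] = label

    def itertracks(self) -> Iterator[Track]:
        # "ordered": by start time, then by end time in case of a tie.
        # Track names are left out of the sort key because they may have
        # mixed types (for example 1 and '_') that cannot be compared.
        yield from sorted(self._labels, key=lambda t: (t[0].start, t[0].end))


toy = ToyAnnotation()
toy.add(ToySegment(6, 8), "_", "Bob")
toy.add(ToySegment(1, 5), 1, "Carol")
toy.add(ToySegment(1, 5), 1, "Carol")    # same track twice: still one entry
toy.add(ToySegment(1, 5), 2, "Bob")      # same support, different name
toy.add(ToySegment(3, 3), "_", "Alice")  # empty support: ignored
tracks = list(toy.itertracks())
```

Here tracks holds three entries, with the two Segment(1, 5) tracks ahead of the Segment(6, 8) track.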
To define the annotation described above:
In [1]: from pyannote.core import Annotation, Segment
In [6]: annotation = Annotation()
...: annotation[Segment(1, 5)] = 'Carol'
...: annotation[Segment(6, 8)] = 'Bob'
...: annotation[Segment(12, 18)] = 'Carol'
...: annotation[Segment(7, 20)] = 'Alice'
...:
The form above is actually shorthand for the following:
In [6]: annotation = Annotation()
...: annotation[Segment(1, 5), '_'] = 'Carol'
...: annotation[Segment(6, 8), '_'] = 'Bob'
...: annotation[Segment(12, 18), '_'] = 'Carol'
...: annotation[Segment(7, 20), '_'] = 'Alice'
...:
All tracks share the same default name, '_'.
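One way such a shorthand could be supported is to normalize the key before storing it. The sketch below is an illustration under stated assumptions, not the library's code: ToySegment, ToyAnnotation, normalize_key and DEFAULT_TRACK are all hypothetical names.

```python
from typing import Dict, Hashable, NamedTuple, Tuple


class ToySegment(NamedTuple):
    """Hypothetical stand-in for pyannote.core.Segment."""
    start: float
    end: float


DEFAULT_TRACK = "_"  # the default track name mentioned in the text


def normalize_key(key) -> Tuple[ToySegment, Hashable]:
    """Turn either `segment` or `(segment, track)` into `(segment, track)`."""
    # The bare-segment case must be tested first: ToySegment is itself a
    # tuple of two items and would otherwise be unpacked as (segment, track).
    if isinstance(key, ToySegment):
        return key, DEFAULT_TRACK
    segment, track = key
    return segment, track


class ToyAnnotation:
    """Toy container that accepts both indexing forms (not the real class)."""

    def __init__(self) -> None:
        self._labels: Dict[Tuple[ToySegment, Hashable], str] = {}

    def __setitem__(self, key, label: str) -> None:
        self._labels[normalize_key(key)] = label

    def __getitem__(self, key) -> str:
        return self._labels[normalize_key(key)]


toy = ToyAnnotation()
toy[ToySegment(1, 5)] = "Carol"          # short form
toy[ToySegment(6, 8), "_"] = "Bob"       # explicit form
same_entry = toy[ToySegment(1, 5), "_"]  # reads back what the short form stored
```

Both forms end up under a (segment, track) key, so a value written with the short form can be read back with the explicit one.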
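Additional note:
The basic form for defining or adding an annotation entry is shown below. All four basic concepts (annotation, segment, track, and label) appear in this single line of code.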
annotation[segment, track] = label
If two tracks have the same support, give them different track names:
In [6]: annotation = Annotation(uri='my_video_file', modality='speaker')
...: annotation[Segment(1, 5), 1] = 'Carol' # track name = 1
...: annotation[Segment(1, 5), 2] = 'Bob' # track name = 2
...: annotation[Segment(12, 18)] = 'Carol'
...:
The track name does not have to be unique over the whole set of tracks.
Note
The optional uri and modality keyword arguments can be used to record
which document and which modality (e.g. speaker or face) the annotation describes.
Several convenient methods are available. Here are a few examples:
In [9]: annotation.labels() # sorted list of labels
Out[9]: ['Bob', 'Carol']
In [10]: annotation.chart() # label duration chart
Out[10]: [('Carol', 10), ('Bob', 4)]
In [11]: list(annotation.itertracks())
Out[11]: [(<Segment(1, 5)>, 1), (<Segment(1, 5)>, 2), (<Segment(12, 18)>, '_')]
In [12]: annotation.label_timeline('Carol')
Out[12]: <Timeline(uri=my_video_file, segments=[<Segment(1, 5)>, <Segment(12, 18)>])>
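To see where the numbers in Out[9], Out[10] and Out[12] come from, the same results can be recomputed by hand from the three tracks of the example. This is a plain-Python sketch with hypothetical helper names (labels, chart, label_segments); it adds up raw track durations and makes no attempt to handle overlapping tracks that carry the same label.

```python
from typing import Dict, Hashable, List, Tuple

# (start, end, track name, label) rows copied from the example above.
Row = Tuple[float, float, Hashable, str]
rows: List[Row] = [
    (1, 5, 1, "Carol"),
    (1, 5, 2, "Bob"),
    (12, 18, "_", "Carol"),
]


def labels(rows: List[Row]) -> List[str]:
    """Sorted list of distinct labels."""
    return sorted({label for _start, _end, _name, label in rows})


def chart(rows: List[Row]) -> List[Tuple[str, float]]:
    """(label, total duration) pairs, longest duration first."""
    durations: Dict[str, float] = {}
    for start, end, _name, label in rows:
        durations[label] = durations.get(label, 0) + (end - start)
    return sorted(durations.items(), key=lambda item: item[1], reverse=True)


def label_segments(rows: List[Row], wanted: str) -> List[Tuple[float, float]]:
    """(start, end) pairs of every track carrying the given label."""
    return sorted((start, end) for start, end, _name, label in rows
                  if label == wanted)


print(labels(rows))                   # ['Bob', 'Carol']
print(chart(rows))                    # [('Carol', 10), ('Bob', 4)]
print(label_segments(rows, "Carol"))  # [(1, 5), (12, 18)]
```

Carol totals 10 because her two tracks last 4 and 6 time units, while Bob's single track lasts 4.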