Linux Sound Programming Tutorial (Part 8)

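Original: Linux Sound Programming

License: CC BY-NC-SA 4.0

26. Subtitles and Closed Captions

Many karaoke systems use subtitles ¹ overlaid on some sort of movie. Programs such as kmid and my Java programs play the lyrics on some sort of canvas object, which gives a rather dull background. Video CDs or MPEG-4 files have a better background, but the lyrics are hard-coded into the background video, so there is little opportunity to manipulate them. CD+G files keep the lyrics separate from the video, but there does not seem to be any way to play them directly on Linux. They can be converted to MP3+G, which VLC can play: it loads the MP3 file and picks up the corresponding .cdg file.

This chapter looks at subtitles that can be produced independently, combined in some way with video and audio, and then played. The current situation is not entirely satisfactory.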

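Resources

Check out the following resource:

"Subtitling with Linux" tutorial ( http://sub.wordnerd.de/linux-subs.html )

Subtitle Formats

This chapter is concerned with so-called soft subtitles, where the subtitles are stored in a file separate from the video or audio and are merged in during rendering. The Wikipedia page "Subtitle (captioning)" (http://en.wikipedia.org/wiki/Subtitle_(captioning)) is a long article that discusses many of the issues around subtitles. It also contains a list of subtitle formats, but the most useful one in this context is SubStation Alpha.

MPlayer

According to the MPlayer page "Subtitles and OSD" (www.mplayerhq.hu/DOCS/HTML/en/subosd.htm), MPlayer recognizes the following formats: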

VOBsub
OGM
CC (closed caption)
MicroDVD
SubRip
SubViewer
Sami
VPlayer
RT
SSA
PJS (Phoenix Japanimation Society)
MPsub
AQTitle
JACOsub

VLC

According to VLC ( www.videolan.org/vlc/features.php?cat=sub ), the subtitle formats supported under Linux include the following:

DVD
Text files (MicroDVD, SubRIP, SubViewer, SSA1-5, SAMI, VPlayer)
Closed captions
VobSub
Universal Subtitle Format (USF)
SVCD/CVD
DVB
OGM
CMML
Kate

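If you play a video file, say XYZ.mpg, and there is also a file with the same root name and an appropriate extension, such as XYZ.ass (the extension for SubStation Alpha), VLC will automatically load the subtitle file and play it. If the subtitle file has a different name, it can be loaded from the VLC menu Video ➤ Subtitles Track. However, this does not seem to be as reliable as sharing the name.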

Gnome Subtitles

See "Gnome Subtitles 1.3 is out!" ( http://gnome-subtitles.sourceforge.net/ ). Gnome Subtitles supports the Adobe Encore DVD, Advanced Sub Station Alpha, AQ Title, DKS Subtitle Format, FAB Subtitler, Karaoke Lyrics LRC, Karaoke Lyrics VKT, Mac Sub, MicroDVD, MPlayer, MPlayer 2, MPSub, Panimator, Phoenix Japanimation Society, Power DivX, Sofni, SubCreator 1.x, SubRip, Sub Station Alpha, SubViewer 1.0, SubViewer 2.0, and ViPlay Subtitle File formats.

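SubStation Alpha

The SSA/ASS specification is at MooDub.free ( http://moodub.free.fr/video/ass-specs.doc ). It is brief and seems to contain some minor discrepancies with later specifications and implementations. For example, the time format is different. Or are the later ones all wrong?

SSA/ASS files can be used standalone. They can also be included in container formats such as Matroska files, discussed briefly in Chapter 3. When they are embedded in MKV files, some restrictions apply ( www.matroska.org/technical/specs/subtitles/ssa.html ), such as the text being converted to UTF-8 Unicode.

An ASS file is divided into several sections.

General information about the environment the subtitle file expects, such as the X and Y resolution
Style information such as colors and fonts
Event information, which gives the subtitle text along with timing information and any special effects to be applied

Under normal circumstances, you would not create such a file directly with a text editor. Instead, the program Aegisub gives you a GUI environment in which to create the file. Essentially, you just enter the lines of text, along with the start and end times at which each line should be displayed.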

Figure 26-1 shows a screenshot.

Figure 26-1. Aegisub screenshot

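Many special effects are possible. The video on Bill Creswell's blog ( https://billcreswell.wordpress.com/tag/aegisub/ ) is a good example. Here is the direct link on YouTube: www.youtube.com/watch?v=0Z0dgdglrAo .

For completeness, here is part of an ASS file that I created: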

[Script Info]
; Script generated by Aegisub 2.1.9
; http://www.aegisub.org/
Title: Default Aegisub file
ScriptType: v4.00+
WrapStyle: 0
PlayResX: 640
PlayResY: 480
ScaledBorderAndShadow: yes
Video Aspect Ratio: 0
Video Zoom: 6
Video Position: 0

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,20,&H00FFFFFF,&H00B4FCFC,&H00000008,&H80000008,0,0,0,0,100,100,0,0,1,2,2,2,10,10,10,1

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:18.22,0:00:19.94,Default,,0000,0000,0000,,Here comes the sun
Dialogue: 0,0:00:20.19,0:00:21.75,Default,,0000,0000,0000,,doo doo doo doo
Dialogue: 0,0:00:22.16,0:00:24.20,Default,,0000,0000,0000,,Here comes the sun
Dialogue: 0,0:00:24.61,0:00:28.24,Default,,0000,0000,0000,,I said it's alright
...
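To make the section layout concrete, here is a minimal Python sketch that pulls the Dialogue lines out of the [Events] section of a file like the one above. The function name parse_events and the sample text are illustrative assumptions, not part of any ASS library, and the sketch ignores everything outside [Events].

```python
# Minimal sketch: read the [Events] section of an ASS file into dictionaries.
# parse_events and SAMPLE are hypothetical names used only for illustration.

SAMPLE = """[Script Info]
Title: Default Aegisub file

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:18.22,0:00:19.94,Default,,0000,0000,0000,,Here comes the sun
Dialogue: 0,0:00:24.61,0:00:28.24,Default,,0000,0000,0000,,I said, it's alright
"""


def parse_events(ass_text):
    """Return one dict per Dialogue line, keyed by the Format field names."""
    events = []
    fields = None
    section = None
    for raw in ass_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            section = line  # entering a new section such as [Events]
        elif section == "[Events]" and line.startswith("Format:"):
            fields = [f.strip() for f in line[len("Format:"):].split(",")]
        elif section == "[Events]" and line.startswith("Dialogue:") and fields:
            # Text is the last field and may contain commas, so limit the split.
            body = line[len("Dialogue:"):].strip()
            values = body.split(",", len(fields) - 1)
            events.append(dict(zip(fields, values)))
    return events


for event in parse_events(SAMPLE):
    print(event["Start"], event["End"], event["Text"])
```

Limiting the split to the number of fields keeps commas inside the lyric text intact, which a plain split(",") would not.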

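Karaoke Effects in ASS Files

A line in an ASS file essentially consists of a time to start display, a time to end display, and the text itself. However, karaoke users are accustomed to the text being highlighted as it is played.

ASS supports two main highlighting styles.

Words are highlighted one at a time.
Text is highlighted by filling in from the left.

These effects are achieved by embedding "karaoke overrides" in the text. These are enclosed in {} and carry a duration in hundredths of a second.

The details are as follows:

Word highlighting: an override of the form {\k<time>} highlights the following word for <time> hundredths of a second. Here is an example: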
```
{\k100}Here {\k150}comes {\k50}the {\k150}sun
```
Fill highlighting: an override of the form {\kf<time>} gradually fills the following word over <time> hundredths of a second. Here is an example:
```
{\kf100}Here {\kf150}comes {\kf50}the {\kf150}sun
```
The three styles appear as follows:

Lines with no highlighting (see Figure 26-2)

Figure 26-2. Subtitles without highlighting

Word highlighting (see Figure 26-3)

Figure 26-3. Subtitles with word highlighting

Fill highlighting (see Figure 26-4)

Figure 26-4. Subtitles with fill highlighting
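The karaoke overrides are simple enough to generate mechanically. The following Python sketch builds the text of one line from a list of words and per-word durations in hundredths of a second. The function name karaoke_text is an assumption made for this illustration.

```python
# Minimal sketch: build ASS karaoke text from words and per-word durations.
# karaoke_text is a hypothetical helper, not part of any ASS tool.

def karaoke_text(words, durations_cs, tag="kf"):
    """Prefix each word with {\\k<time>} or {\\kf<time>} (time in centiseconds)."""
    if len(words) != len(durations_cs):
        raise ValueError("need exactly one duration per word")
    parts = ["{\\%s%d}%s" % (tag, dur, word)
             for word, dur in zip(words, durations_cs)]
    return " ".join(parts)


words = ["Here", "comes", "the", "sun"]
print(karaoke_text(words, [100, 150, 50, 150], tag="k"))   # word highlighting
print(karaoke_text(words, [100, 150, 50, 150], tag="kf"))  # fill highlighting
```

With these inputs the two calls reproduce the {\k...} and {\kf...} example lines shown earlier in this section.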

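Multiline Karaoke

Ideally, a karaoke system should have a "look-ahead" mechanism so that you can see the next line before you have to sing it. This can be done by showing two lines of text with overlapping times at different heights. The algorithm is as follows: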

When line N with markup is shown,
    show line N+1 without markup
After line N is finished, continue showing line N+1
When line N+1 is due to show,
     finish showing unmarked line N+1
     show line N+1 with markup

Here are some lyrics from the song "Here Comes the Sun":

Here comes the sun
doo doo doo doo
Here comes the sun
I said it's alright

The resulting ASS file should look like this:

Dialogue: 0,0:00:18.22,0:00:19.94,Default,,0000,0000,0100,,{\kf16}Here {\kf46}comes {\kf43}the {\kf67}sun
Dialogue: 0,0:00:18.22,0:00:20.19,Default,,0000,0000,0000,,doo doo doo doo
Dialogue: 0,0:00:20.19,0:00:21.75,Default,,0000,0000,0000,,{\kf17}doo {\kf25}doo {\kf21}doo {\kf92}doo
Dialogue: 0,0:00:20.19,0:00:22.16,Default,,0000,0000,0100,,Here comes the sun
Dialogue: 0,0:00:22.16,0:00:24.20,Default,,0000,0000,0100,,{\kf17}Here {\kf46}comes {\kf43}the {\kf97}sun
Dialogue: 0,0:00:22.16,0:00:24.61,Default,,0000,0000,0000,,I said it's alright

Figure 26-5 shows what it looks like.

Figure 26-5. Multiline subtitles
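The look-ahead scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a list of (start, end, marked text, plain text) tuples with times already formatted for ASS, the vertical margins 0100 and 0000 alternate as in the sample events above, and the name lookahead_events is hypothetical.

```python
# Minimal sketch of the look-ahead algorithm: each marked-up line is
# followed by a plain preview of the next line at the other height.
# lookahead_events and TEMPLATE are hypothetical names for illustration.

TEMPLATE = "Dialogue: 0,%s,%s,Default,,0000,0000,%s,,%s"


def margin_for(index):
    """Alternate the vertical margin so consecutive lines sit at two heights."""
    return "0100" if index % 2 == 0 else "0000"


def lookahead_events(lines):
    """lines: list of (start, end, marked_text, plain_text) tuples."""
    events = []
    for i, (start, end, marked, _plain) in enumerate(lines):
        # The line being sung, with karaoke markup.
        events.append(TEMPLATE % (start, end, margin_for(i), marked))
        if i + 1 < len(lines):
            next_start, _, _, next_plain = lines[i + 1]
            # Preview of the next line: shown from now until its own start.
            events.append(TEMPLATE % (start, next_start,
                                      margin_for(i + 1), next_plain))
    return events


LINES = [
    ("0:00:18.22", "0:00:19.94",
     r"{\kf16}Here {\kf46}comes {\kf43}the {\kf67}sun", "Here comes the sun"),
    ("0:00:20.19", "0:00:21.75",
     r"{\kf17}doo {\kf25}doo {\kf21}doo {\kf92}doo", "doo doo doo doo"),
]

for event in lookahead_events(LINES):
    print(event)
```

The first two events produced for this input match the first two Dialogue lines of the sample file, which is a useful check that the margins and preview times are assigned the same way.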

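libass

SubStation Alpha and its renderers seem to have had a complicated history. According to "Old and Present: VSFilter" (http://blog.aegisub.org/2010/02/old-and-present-vsfilter.html), the ASS format was finalized around 2004, and the renderer VSFilter was open-sourced at that time. Around 2007, however, development of VSFilter stopped, and several forks appeared. These introduced several extensions to the format, such as Aegisub's blur tag. Some of these forks were later merged, some were abandoned, and the code of some of them is still around.

libass ( http://code.google.com/p/libass/ ) is the main rendering library for Linux. An alternative, xy-vsfilter, claims to be faster, more reliable, and so on, but does not seem to have a Linux implementation. libass supports some of the later extensions. These seem to be the 2008 Aegisub extensions, according to "VSFilter hacks" (http://blog.aegisub.org/2008/07/vsfilter-hacks.html).

Converting KAR Files to MKV Files with ASS Subtitles

Follow these steps:

To extract the lyrics from a KAR or MIDI file, use the Java DumpSequence given in Chapter 18, as follows, to get a dump of all events: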
```
java DumpSequence  song.kar  > song.dump
```

For line-by-line display, use the following Python script to extract the lyrics and save them in the ASS format produced by Aegisub 2.1.9:

#!/usr/bin/python

import fileinput
import string
import math

TEXT_STR = "Dialogue: 0,%s,%s,Default,,0000,0000,0000,Karaoke,"

textStr = TEXT_STR
startTime = -1
endTime = -1

def printPreface():
    print '[Script Info]\r\n\
; Script generated by Aegisub 2.1.9\r\n\
; http://www.aegisub.org/\r\n\
Title: Default Aegisub file\r\n\
ScriptType: v4.00+\r\n\
WrapStyle: 0\r\n\
PlayResX: 640\r\n\
PlayResY: 480\r\n\
ScaledBorderAndShadow: yes\r\n\
Video Aspect Ratio: 0\r\n\
Video Zoom: 6\r\n\
Video Position: 0\r\n\
\r\n\
[V4+ Styles]\r\n\
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding\r\n\
Style: Default,Arial,36,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,2,2,10,10,10,1\r\n\
\r\n\
[Events]\r\n\
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text'

def timeFormat(s):
    global microSecondsPerTick

    tf = float(s)
    tf /= 62.6  #ticks per sec

    # This should be right , but is too slow
    #tf = (tf * microSecondsPerTick) / 1000000

    t = int(math.floor(tf))
    hundredths = round((tf-t)*100)
    secs = t % 60
    t /= 60
    mins = t % 60
    t /= 60
    hrs = t
    return "%01d:%02d:%02d.%02d" % (hrs, mins, secs, hundredths)

def doLyric(words):
    global textStr
    global startTime
    global endTime
    global TEXT_STR

    if words[1] == "0:":
        #print "skipping"
        return

    time = string.rstrip(words[1], ':')
    if startTime == -1:
        startTime = time
    #print words[1],
    if len(words) == 5:
        if words[4][0] == '\\' or words[4][0] == '/':
            #print "My name is %s and weight is %d kg!" % ('Zara', 21)
            #print startTime, endTime
            print textStr % (timeFormat(startTime), timeFormat(endTime)) + "\r\n",
            textStr = TEXT_STR + words[4][1:]  # drop the leading \ or / marker
            startTime = -1
        else:
            textStr += words[4]
    else:
        textStr += ' '

    endTime = time

printPreface()

for line in fileinput.input():
    words = line.split()

    if len(words)  >= 2:
        if words[0] == "Resolution:":
            ticksPerBeat = words[1]
        elif words[0] == "Length:":
            numTicks = int(words[1])
        elif words[0] == "Duration:":
            duration = int(words[1])
            microSecondsPerTick = duration/numTicks
            # print "Duration %d numTicks %d microSecondsPerTick %d" % (duration, numTicks, microSecondsPerTick)

    if len(words) >= 3 and words[2] == "Text":
        doLyric(words)

Here is an example:

python lyric2ass4kar.py song.dump > song.ass

For fill-style lyric display, use the following Python script to extract the lyrics and save them in ASS format:

#!/usr/bin/python

import fileinput
import string
import math

TEXT_STR = "Dialogue: 0,%s,%s,Default,,0000,0000,0000,,"

textStr = "{\kf%d}"
plainTextStr = ""
startTime = -1
startWordTime = -1
endTime = -1

def printPreface():
    print '[Script Info]\r\n\
; Script generated by Aegisub 2.1.9\r\n\
; http://www.aegisub.org/\r\n\
Title: Default Aegisub file\r\n\
ScriptType: v4.00+\r\n\
WrapStyle: 0\r\n\
PlayResX: 640\r\n\
PlayResY: 480\r\n\
ScaledBorderAndShadow: yes\r\n\
Video Aspect Ratio: 0\r\n\
Video Zoom: 6\r\n\
Video Position: 0\r\n\
\r\n\
[V4+ Styles]\r\n\
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding\r\n\
Style: Default,Arial,36,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,2,2,10,10,10,1\r\n\
\r\n\
[Events]\r\n\
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text'

def timeFormat(s):
    global microSecondsPerTick

    tf = float(s)

    # frames per sec should be 60: 120 beats/min, 30 ticks per beat
    # but it is too slow on 54154
    tf /= 62.6  #ticks per sec

    # This should be right , but is too slow
    # tf = (tf * microSecondsPerTick) / 1000000

    t = int(math.floor(tf))
    hundredths = round((tf-t)*100)
    secs = t % 60
    t /= 60
    mins = t % 60
    t /= 60
    hrs = t
    return "%01d:%02d:%02d.%02d" % (hrs, mins, secs, hundredths)

def durat(end, start):
    fend = float(end)
    fstart = float(start)
    d = (fend - fstart) / 62.9
    #print end, start, d
    return round(d*100)

def doLyric(words):
    global textStr
    global plainTextStr
    global startTime
    global endTime
    global TEXT_STR
    global startWordTime
    global lineNum

    if words[1] == "0:":
        #print "skipping"
        return

    time = string.rstrip(words[1], ':')
    if startTime == -1:
        startTime = time
        startWordTime = time
        previousEndTime = time
    #print words[1],
    if len(words) == 5:
        if words[4][0] == '\\' or words[4][0] == '/':
            #print "My name is %s and weight is %d kg!" % ('Zara', 21)
            #print startTime, endTime
            dur = durat(time, startWordTime)
            textStr = textStr % (dur)
            if len(words[4]) == 1:
                print TEXT_STR % (timeFormat(startTime),
                                  timeFormat(endTime)) + \
                                  textStr + "\r\n",

            # next word
            textStr = "{\kf%d}" + words[4][1:]
            startTime = -1
        else:
            textStr += words[4]
    else:
        # it's a space, gets lost by the split
        dur = durat(time, startWordTime)
        textStr = textStr % (dur) + " {\kf%d}"
        startWordTime = time

    endTime = time

printPreface()
# print "Dialogue: 0,0:00:18.22,0:00:19.94,Default,,0000,0000,0000,,{\k16}Here {\k46}comes {\k43}the {\k67}sun"

for line in fileinput.input():
    words = line.split()

    if len(words)  >= 2:
        if words[0] == "Resolution:":
            ticksPerBeat = words[1]
        elif words[0] == "Length:":
            numTicks = int(words[1])
        elif words[0] == "Duration:":
            duration = int(words[1])
            microSecondsPerTick = duration/numTicks
            # print "Duration %d numTicks %d microSecondsPerTick %d" % (duration, numTicks, microSecondsPerTick)

    if len(words) >= 3 and words[2] == "Text":
        doLyric(words)

Here is an example:

python lyric2karaokeass4kar.py song.dump > song.ass

For multiline lyric display, use the following Python script to extract the lyrics and save them in ASS format:

#!/usr/bin/python

import fileinput
import string
import math

START_EVENTS = ["Dialogue: 0,%s,%s,Default,,0000,0000,0000,,",
                "Dialogue: 0,%s,%s,Default,,0000,0000,0100,,"]

TEXT_STR = "Dialogue: 0,%s,%s,Default,,0000,0000,0000,,"
TEXT_STR2 = "Dialogue: 0,%s,%s,Default,,0000,0000,0100,,"

textStr = "{\kf%d}"
plainTextStr = ""
startTime = -1
previousStartTime = -1
startWordTime = -1
endTime = -1
previousEndTime = -1
lineNum = 0

def printPreface():
    print '[Script Info]\r\n\
; Script generated by Aegisub 2.1.9\r\n\
; http://www.aegisub.org/\r\n\
Title: Default Aegisub file\r\n\
ScriptType: v4.00+\r\n\
WrapStyle: 0\r\n\
PlayResX: 640\r\n\
PlayResY: 480\r\n\
ScaledBorderAndShadow: yes\r\n\
Video Aspect Ratio: 0\r\n\
Video Zoom: 6\r\n\
Video Position: 0\r\n\
\r\n\
[V4+ Styles]\r\n\
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding\r\n\
Style: Default,Arial,36,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,2,2,10,10,10,1\r\n\
\r\n\
[Events]\r\n\
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text'

def timeFormat(s):
    global microSecondsPerTick

    tf = float(s)
    # print "factori is %f instead of %f" % ((1.0*microSecondsPerTick / 1000000), (1.0/62.9))
    # frames per sec should be 60: 120 beats/min, 30 ticks per beat
    # but it is too slow on 54154
    tf /= 62.6  #ticks per sec

    # This should be right , but is too slow
    # tf = (tf * microSecondsPerTick) / 1000000

    t = int(math.floor(tf))
    hundredths = round((tf-t)*100)
    secs = t % 60
    t /= 60
    mins = t % 60
    t /= 60
    hrs = t
    return "%01d:%02d:%02d.%02d" % (hrs, mins, secs, hundredths)

def durat(end, start):
    fend = float(end)
    fstart = float(start)
    d = (fend - fstart) / 62.9
    #print end, start, d
    return round(d*100)

def doLyric(words):
    global textStr
    global plainTextStr
    global startTime
    global endTime
    global previousStartTime
    global previousEndTime
    global TEXT_STR
    global startWordTime
    global lineNum

    if words[1] == "0:":
        #print "skipping"
        return

    time = string.rstrip(words[1], ':')
    if startTime == -1:
        startTime = time
        startWordTime = time
        previousEndTime = time
    #print words[1],
    if len(words) == 5:
        if words[4][0] == '\\' or words[4][0] == '/':
            #print "My name is %s and weight is %d kg!" % ('Zara', 21)
            #print startTime, endTime
            dur = durat(time, startWordTime)
            textStr = textStr % (dur)

            if len(words[4]) == 1:

                if previousStartTime != -1:
                    print START_EVENTS[lineNum % 2] % (timeFormat(previousStartTime),
                                                       timeFormat(previousEndTime)) + \
                                                       plainTextStr + "\r\n",
                print START_EVENTS[lineNum % 2] % (timeFormat(startTime),
                                                   timeFormat(endTime)) + \
                                                   textStr + "\r\n",

            # next word
            lineNum += 1
            #previousEndTime = time
            textStr = "{\kf%d}" + words[4][1:]
            plainTextStr = words[4][1:]
            previousStartTime = startTime
            startTime = -1
        else:
            textStr += words[4]
            plainTextStr += words[4]
    else:
        #print textStr
        #dur = duration(time, startWordTime)
        dur = durat(time, startWordTime)
        textStr = textStr % (dur) + " {\kf%d}"
        plainTextStr += ' '
        startWordTime = time

    endTime = time

printPreface()
# print "Dialogue: 0,0:00:18.22,0:00:19.94,Default,,0000,0000,0000,,{\k16}Here {\k46}comes {\k43}the {\k67}sun"

for line in fileinput.input():
    words = line.split()

    if len(words)  >= 2:
        if words[0] == "Resolution:":
            ticksPerBeat = words[1]
        elif words[0] == "Length:":
            numTicks = int(words[1])
        elif words[0] == "Duration:":
            duration = int(words[1])
            microSecondsPerTick = duration/numTicks
            # print "Duration %d numTicks %d microSecondsPerTick %d" % (duration, numTicks, microSecondsPerTick)

    if len(words) >= 3 and words[2] == "Text":
        doLyric(words)

Here is an example:

python lyric2karaokeass4kar.py song.dump > song.ass

Use fluidsynth to convert the MIDI sound file to a WAV file.

fluidsynth -F song.wav /usr/share/sounds/sf2/FluidR3_GM.sf2 song.kar

Convert the WAV file to MP3.
```
lame song.wav song.mp3
```
Find a suitable video-only file for your background (I used one from my karaoke discs) and merge them all into an MKV file.
```
mkvmerge -o 54154.mkv 54154.mp3 54154.ass BACK01.MPG
```

The resulting MKV file can be played as a standalone file by MPlayer.

mplayer song.mkv

It can also be played by VLC, but only if the ASS file is also present.

vlc song.mkv

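Screenshots of the result, which depend on the karaoke effect chosen, were shown earlier in this chapter.

Timing is a problem, however. The default MIDI tempo is 120 beats per minute, and a common resolution is 30 ticks per beat. This gives a rate of 60 MIDI ticks per second. But you are now playing an MP3 file and an ASS file, neither of which is a MIDI file any longer, and they are not necessarily in sync. With the MIDI-to-ASS conversion done at 60 ticks per second, the lyrics run too slowly. By experiment, I found that 62.9 is a reasonable rate, at least for some files.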
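The arithmetic behind the 60 ticks per second figure, and the conversion from a tick count to an ASS timestamp, can be sketched as follows. The function names are assumptions for this illustration, and the sketch assumes a single constant tempo, which real MIDI files do not guarantee.

```python
# Minimal sketch: convert MIDI ticks to seconds and to an ASS timestamp.
# Assumes one constant tempo for the whole file; names are hypothetical.

def ticks_per_second(bpm=120.0, ticks_per_beat=30):
    """Beats per second times ticks per beat."""
    return bpm / 60.0 * ticks_per_beat


def ticks_to_seconds(ticks, bpm=120.0, ticks_per_beat=30):
    return ticks / ticks_per_second(bpm, ticks_per_beat)


def ass_timestamp(seconds):
    """Format seconds as H:MM:SS.cc, rounding to hundredths first so the
    hundredths field can never overflow to 100."""
    cs = int(round(seconds * 100))
    hrs, cs = divmod(cs, 360000)
    mins, cs = divmod(cs, 6000)
    secs, cs = divmod(cs, 100)
    return "%d:%02d:%02d.%02d" % (hrs, mins, secs, cs)


print(ticks_per_second())                   # default tempo and resolution
print(ass_timestamp(ticks_to_seconds(1093)))
# An empirically adjusted rate, such as the 62.9 mentioned above, could be
# applied by dividing the tick count by that rate instead.
print(ass_timestamp(1093 / 62.9))
```

Rounding the whole value to hundredths before splitting it into fields avoids a corner case where the fractional part alone rounds up to 100.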

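HTML5 Subtitles

HTML5 supports a video element, although exactly which browsers support which video formats varies. This includes support for subtitles and closed captions using the HTML 5.1 track element. A search will turn up several articles that discuss this in detail.

You need to prepare a file of timings and text. The format shown in the examples is a .vtt file, such as the following: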

WEBVTT

1
00:00:01.000 --> 00:00:30.000  D:vertical A:start
This is the first line of text, displaying from 1-30 seconds

2
00:00:35.000 --> 00:00:50.000
And the second line of text
separated over two lines from 35 to 50 seconds

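Here the first line is WEBVTT, and blocks of text are separated by blank lines. The format of VTT files is specified in "WebVTT: The Web Video Text Tracks Format" ( http://dev.w3.org/html5/webvtt/ ).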
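A file in this layout is easy to generate from a list of timed lines. The following Python sketch writes numbered cues with start and end times in seconds. The names make_vtt and vtt_timestamp are assumptions for this illustration, and the sketch emits no cue settings.

```python
# Minimal sketch: build a WebVTT document from (start, end, text) cues.
# make_vtt and vtt_timestamp are hypothetical helper names.

def vtt_timestamp(seconds):
    """Format seconds as HH:MM:SS.mmm."""
    ms = int(round(seconds * 1000))
    hrs, ms = divmod(ms, 3600000)
    mins, ms = divmod(ms, 60000)
    secs, ms = divmod(ms, 1000)
    return "%02d:%02d:%02d.%03d" % (hrs, mins, secs, ms)


def make_vtt(cues):
    """cues: list of (start_seconds, end_seconds, text) tuples."""
    blocks = ["WEBVTT"]
    for number, (start, end, text) in enumerate(cues, 1):
        blocks.append("%d\n%s --> %s\n%s" % (
            number, vtt_timestamp(start), vtt_timestamp(end), text))
    # Blocks are separated by blank lines, as the format requires.
    return "\n\n".join(blocks) + "\n"


print(make_vtt([
    (1, 30, "This is the first line of text"),
    (35, 50, "And the second line of text\nseparated over two lines"),
]))
```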

The HTML then references the audio/video file and the subtitle file, like this:

    <video  controls>
      <source src="output.webm" type="video/webm">
      <track src="54154.vtt" kind="subtitles" srclang="en" label="English" default />
      <!-- fallback for rubbish browsers -->
    </video>

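Figure 26-6 shows a screenshot.

Figure 26-6. HTML5 subtitles

There does not seem to be any mechanism for progressively highlighting the words in a line. JavaScript might be able to do it, but at a cursory glance it seems unlikely. This makes it unsuitable for karaoke as yet.

Conclusion

This chapter discussed methods of overlaying subtitle text on a changing video image. It is feasible, but there are only a few viable mechanisms.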

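Footnotes

1. Strictly speaking, subtitles refer to what is spoken, while closed captions may include other sounds such as a door slamming. For karaoke, there is no need to make the distinction.

27. Karaoke FluidSynth

FluidSynth is an application for playing MIDI files and a library for MIDI applications. It has no hooks for playing karaoke files. This chapter discusses an extension to FluidSynth that adds suitable hooks and then uses them to build a variety of karaoke systems.

Resources

Here are some resources:

FluidSynth home page ( http://sourceforge.net/apps/trac/fluidsynth/ )
FluidSynth download page ( http://sourceforge.net/projects/fluidsynth/ )
FluidSynth 1.1 developer documentation ( http://fluidsynth.sourceforge.net/api/ )
SourceArchive fluidsynth documentation ( http://fluidsynth.sourcearchive.com/documentation/1.1.5-1/main.html )

Players

fluidsynth is a command-line MIDI player. It runs under ALSA with the following command line: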

fluidsynth -a alsa -l <soundfont> <files...>

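Playing MIDI Files

The FluidSynth API consists of the following:

A sequencer, created using new_fluid_player
A synthesizer, created using new_fluid_synth
An audio player, created using new_fluid_audio_driver, which runs in a separate thread
A "settings" object that can be used to control many features of the other components, created with new_fluid_settings and modified with calls such as fluid_settings_setstr

A typical program to play a sequence of MIDI files using ALSA follows. It creates the various objects, sets the audio player to use ALSA, and then adds each soundfont and MIDI file to the player. The call to fluid_player_play then plays each MIDI file in turn. This program is just a repeat of the one shown in Chapter 20.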

#include <fluidsynth.h>
#include <fluid_midi.h>

int main(int argc, char** argv)
{
    int i;
    fluid_settings_t* settings;
    fluid_synth_t* synth;
    fluid_player_t* player;
    fluid_audio_driver_t* adriver;

    settings = new_fluid_settings();
    fluid_settings_setstr(settings, "audio.driver", "alsa");
    synth = new_fluid_synth(settings);
    player = new_fluid_player(synth);

    adriver = new_fluid_audio_driver(settings, synth);
    /* process command line arguments */
    for (i = 1; i < argc; i++) {
        if (fluid_is_soundfont(argv[i])) {
            fluid_synth_sfload(synth, argv[i], 1);
        } else {
            fluid_player_add(player, argv[i]);
        }
    }
    /* play the midi files, if any */
    fluid_player_play(player);
    /* wait for playback termination */
    fluid_player_join(player);
    /* cleanup */
    delete_fluid_audio_driver(adriver);
    delete_fluid_player(player);
    delete_fluid_synth(synth);
    delete_fluid_settings(settings);
    return 0;
}

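Extending FluidSynth with Callbacks

Callbacks are functions registered with an application that are called when certain events occur. To build a karaoke player, you need to know the following:

When a file is loaded, so that you can extract all of the lyrics from it for display at the right times
When each meta lyric or text event occurs as output from the sequencer, so that you can see which lyric is about to be sung

The first is fairly simple: FluidSynth has a function fluid_player_load that loads a file. You can change the code to add a suitable callback to that function, giving you access to the loaded MIDI file.

Getting lyric or text events out of the sequencer is not so easy, because they are not supposed to appear at all! The MIDI specification allows these event types in a MIDI file, but they are not wire types, so they should never be sent from a sequencer to a synthesizer. The Java MIDI API makes them available through an out-of-band call to a meta event handler. FluidSynth simply throws them away.

On the other hand, FluidSynth already has a callback to handle MIDI events sent from the sequencer to the synthesizer. It is the function fluid_synth_handle_midi_event, and it is set with the call fluid_player_set_playback_callback. What you need to do is first alter the existing FluidSynth code so that lyric and text events are passed through. Then insert a new playback callback that intercepts those events and does something with them while passing all other events on to the default handler. The default handler ignores any such events anyway, so it does not need to be changed.

I have added one new function to FluidSynth, fluid_player_set_onload_callback, and added appropriate code to pass on some meta events. Then it is a matter of writing an onload callback to walk through the MIDI data from the parsed input file and writing a suitable MIDI event callback to handle the intercepted meta events while passing the rest on to the default handler.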
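The intercept-and-delegate pattern described here is independent of FluidSynth and can be sketched in plain Python. Nothing in this sketch is a FluidSynth call: make_filter, the handler signatures, and the return value 0 are assumptions made for illustration. The only borrowed facts are the standard MIDI meta event type numbers for text (0x01) and lyric (0x05) events.

```python
# Minimal sketch of a playback callback that intercepts text and lyric
# events and passes everything else to a default handler.
# This is plain Python for illustration, not the FluidSynth API.

MIDI_TEXT = 0x01   # standard MIDI meta event type: text
MIDI_LYRIC = 0x05  # standard MIDI meta event type: lyric


def make_filter(default_handler, on_lyric):
    """Return a callback that diverts text/lyric events to on_lyric."""
    def callback(event_type, payload):
        if event_type in (MIDI_TEXT, MIDI_LYRIC):
            on_lyric(payload)
            return 0  # handled here; the default handler never sees it
        return default_handler(event_type, payload)
    return callback


seen_lyrics = []
passed_on = []


def default_handler(event_type, payload):
    passed_on.append((event_type, payload))
    return 0


callback = make_filter(default_handler, seen_lyrics.append)
callback(MIDI_LYRIC, "Here ")
callback(0x90, (60, 100))  # a note-on style event goes to the default handler
callback(MIDI_TEXT, "comes ")
print(seen_lyrics, passed_on)
```

The C program later in this chapter follows the same shape: a switch on the event type that handles the two meta types and falls through to fluid_synth_handle_midi_event for the rest.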

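These changes have been made to give a new source package, fluidsynth-1.1.6-karaoke.tar.bz2. If you just want to work from a patch file, that is fluid.patch. The patch has been submitted to the FluidSynth maintainers.

To build from this package, do the same as you normally would.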

tar jxf fluidsynth-1.1.6-karaoke.tar.bz2
cd fluidsynth-1.1.6
./configure
make clean
make

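To get ALSA support, you will need to have the libasound2-dev package installed, and likewise for Jack and other packages. You probably will not have many of them installed, so do not run make install, or you will overwrite the normal fluidsynth package, which will probably have more features.

The previous program, modified, is karaoke_player.c. It just prints out the lyric lines and the lyric events as they occur, as follows: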

#include <stdio.h>
#include <fluidsynth.h>
#include <fluid_midi.h>

/**
 * This MIDI event callback filters out the TEXT and LYRIC events
 * and passes the rest to the default event handler.
 * Here we just print the text of the event, more
 * complex handling can be done
 */
int event_callback(void *data, fluid_midi_event_t *event) {
    fluid_synth_t* synth = (fluid_synth_t*) data;
    int type = fluid_midi_event_get_type(event);
    int chan = fluid_midi_event_get_channel(event);
    if (synth == NULL) printf("Synth is null\n");
    switch(type) {
    case MIDI_TEXT:
        printf("Callback: Playing text event %s (length %d)\n",
               (char *) event->paramptr, event->param1);
        return  FLUID_OK;

    case MIDI_LYRIC:
        printf("Callback: Playing lyric event %d %s\n",
               event->param1, (char *) event->paramptr);
        return  FLUID_OK;
    }
    return fluid_synth_handle_midi_event( data, event);
}

/**
 * This is called whenever new data is loaded, such as a new file.
 * Here we extract the TEXT and LYRIC events and just print them
 * to stdout. They could e.g. be saved and displayed in a GUI
 * as the events are received by the event callback.
 */
int onload_callback(void *data, fluid_player_t *player) {
    printf("Load callback, tracks %d \n", player->ntracks);
    int n;
    for (n = 0; n < player->ntracks; n++) {
        fluid_track_t *track = player->track[n];
        printf("Track %d\n", n);
        fluid_midi_event_t *event = fluid_track_first_event(track);
        while (event != NULL) {
            switch (event->type) {
            case MIDI_TEXT:
            case MIDI_LYRIC:
                printf("Loaded event %s\n", (char *) event->paramptr);
            }
            event = fluid_track_next_event(track);
        }
    }
    return FLUID_OK;
}

int main(int argc, char** argv)
{
    int i;
    fluid_settings_t* settings;
    fluid_synth_t* synth;
    fluid_player_t* player;
    fluid_audio_driver_t* adriver;
    settings = new_fluid_settings();
    fluid_settings_setstr(settings, "audio.driver", "alsa");
    fluid_settings_setint(settings, "synth.polyphony", 64);
    synth = new_fluid_synth(settings);
    player = new_fluid_player(synth);

    /* Set the MIDI event callback to our own functions rather than the system default */
    fluid_player_set_playback_callback(player, event_callback, synth);

    /* Add an onload callback so we can get information from new data before it plays */
    fluid_player_set_onload_callback(player, onload_callback, NULL);

    adriver = new_fluid_audio_driver(settings, synth);
    /* process command line arguments */
    for (i = 1; i < argc; i++) {
        if (fluid_is_soundfont(argv[i])) {
            fluid_synth_sfload(synth, argv[i], 1);
        } else {
            fluid_player_add(player, argv[i]);
        }
    }
    /* play the midi files, if any */
    fluid_player_play(player);
    /* wait for playback termination */
    fluid_player_join(player);
    /* cleanup */
    delete_fluid_audio_driver(adriver);
    delete_fluid_player(player);
    delete_fluid_synth(synth);
    delete_fluid_settings(settings);
    return 0;
}

Assuming the new fluidsynth package is in an immediate subdirectory, to compile the program you will need to pick up the local includes and libraries.

gcc -g -I fluidsynth-1.1.6/include/ -I fluidsynth-1.1.6/src/midi/ -I fluidsynth-1.1.6/src/utils/ -c -o karaoke_player.o karaoke_player.c

gcc karaoke_player.o -Lfluidsynth-1.1.6/src/.libs -l fluidsynth -o karaoke_player

To run the program, you will also need to pick up the local library and the soundfont file.

export LD_LIBRARY_PATH=./fluidsynth-1.1.6/src/.libs/
./karaoke_player /usr/share/soundfonts/FluidR3_GM.sf2 54154.mid

The output for a typical KAR file is as follows:

Load callback, tracks 1
Track 0
Loaded event #
Loaded event 0
Loaded event 0
Loaded event 0
Loaded event 1
Loaded event

...

Callback: Playing lyric event 2 #
Callback: Playing lyric event 2 0
Callback: Playing lyric event 2 0
Callback: Playing lyric event 2 0
Callback: Playing lyric event 2 1
Callback: Playing lyric event 3

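Displaying and Coloring Text with Gtk

While there are many ways in which karaoke text can be displayed, a common pattern is to display two lines of text: the currently playing line and the next one. The current line is progressively highlighted and on completion is replaced by the next line.

You did this in Chapter 25. But the Java libraries have not been polished and are distinctly slow and heavyweight. They also seem to be low priority on Oracle's development schedule for Java. So, here you will look at an alternative GUI and make use of the FluidSynth library. I chose the Gtk library for the reasons outlined in Chapter 15.

The first task is to build up an array of lyric lines as the file is loaded. You are working with KAR format files, which have up-front information about the title, and so on, prefixed with @, and newlines prefixed with \.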

struct _lyric_t {
    gchar *lyric;
    long tick;
};
typedef struct _lyric_t lyric_t;

struct _lyric_lines_t {
    char *language;
    char *title;
    char *performer;
    GArray *lines; // array of GString *
};
typedef struct _lyric_lines_t lyric_lines_t;

GArray *lyrics;
lyric_lines_t lyric_lines;

void build_lyric_lines() {
    int n;
    lyric_t *plyric;
    GString *line = g_string_new("");
    GArray *lines =  g_array_sized_new(FALSE, FALSE, sizeof(GString *), 64);

    lyric_lines.title = NULL;

    for (n = 0; n < lyrics->len; n++) {
        plyric = g_array_index(lyrics, lyric_t *, n);
        gchar *lyric = plyric->lyric;
        int tick = plyric->tick;

        if ((strlen(lyric) >= 2) && (lyric[0] == '@') && (lyric[1] == 'L')) {
            lyric_lines.language =  lyric + 2;
            continue;
        }

        if ((strlen(lyric) >= 2) && (lyric[0] == '@') && (lyric[1] == 'T')) {
            if (lyric_lines.title == NULL) {
                lyric_lines.title = lyric + 2;
            } else {
                lyric_lines.performer = lyric + 2;
            }
            continue;
        }

        if (lyric[0] == '@') {
            // some other stuff like @KMIDI KARAOKE FILE
            continue;
        }

        if ((lyric[0] == '/') || (lyric[0] == '\\')) {
            // start of a new line
            // add to lines
            g_array_append_val(lines, line);
            line = g_string_new(lyric + 1);
        }  else {
            line = g_string_append(line, lyric);
        }
    }
    lyric_lines.lines = lines;

    printf("Title is %s, performer is %s, language is %s\n",
           lyric_lines.title, lyric_lines.performer, lyric_lines.language);
    for (n = 0; n < lines->len; n++) {
        printf("Line is %s\n", g_array_index(lines, GString *, n)->str);
    }
}

This is called from the onload callback.

int onload_callback(void *data, fluid_player_t *player) {
    long ticks = 0L;
    lyric_t *plyric;

    printf("Load callback, tracks %d \n", player->ntracks);
    int n;
    for (n = 0; n < player->ntracks; n++) {
        fluid_track_t *track = player->track[n];
        printf("Track %d\n", n);
        ticks = 0L;  /* delta times restart at the top of each track */
        fluid_midi_event_t *event = fluid_track_first_event(track);
        while (event != NULL) {
            /* dtime is the delay since the previous event of any type,
               so count every event, not just the text ones */
            ticks += event->dtime;
            switch (fluid_midi_event_get_type (event)) {
            case MIDI_TEXT:
            case MIDI_LYRIC:
                /* there's no fluid_midi_event_get_sysex()
                   or fluid_midi_event_get_time() so we
                   have to look inside the opaque struct
                */
                printf("Loaded event %s for time %ld\n",
                       (char *) event->paramptr,
                       ticks);
                plyric = g_new(lyric_t, 1);
                plyric->lyric = g_strdup(event->paramptr);
                plyric->tick = ticks;
                g_array_append_val(lyrics, plyric);
            }
            event = fluid_track_next_event(track);
        }
    }

    printf("Saved %d lyric events\n", lyrics->len);
    for (n = 0; n < lyrics->len; n++) {
        plyric = g_array_index(lyrics, lyric_t *, n);
        printf("Saved lyric %s at %ld\n", plyric->lyric, plyric->tick);
    }

    build_lyric_lines();

    return FLUID_OK;
}
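
A MIDI file stores each event's time as a delay relative to the previous event on the same track, so the absolute tick of a lyric is a running sum over every event that precedes it on that track. The following is a minimal sketch of that bookkeeping in plain C. The type demo_event_t and the function text_event_ticks are hypothetical stand-ins, not the FluidSynth structures.

```c
#include <assert.h>
#include <stddef.h>

/* A stand-in for a parsed MIDI event; not the FluidSynth struct. */
typedef struct {
    unsigned int dtime; /* ticks since the previous event on this track */
    int is_text;        /* non-zero for a text or lyric event */
} demo_event_t;

/*
 * Walk one track and store the absolute tick of each text event in out[].
 * Returns the number of text events found; at most max_out are stored.
 */
size_t text_event_ticks(const demo_event_t *events, size_t n_events,
                        long *out, size_t max_out) {
    long ticks = 0;   /* absolute time starts at zero for every track */
    size_t found = 0;
    size_t i;

    assert(events != NULL || n_events == 0);
    for (i = 0; i < n_events; i++) {
        ticks += events[i].dtime;   /* count the delay of every event */
        if (events[i].is_text) {
            if (found < max_out) {
                out[found] = ticks;
            }
            found++;
        }
    }
    return found;
}
```

Calling the function once per track keeps the totals independent, because the running sum is local to each call.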

The standard GUI part is to build an interface consisting of two labels, one above the other, to hold lines of lyrics. This is just ordinary Gtk.

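The final part is to handle lyric or text events from the sequencer. If the event begins with \, the current text in a label must be replaced with new text, after a small pause. Otherwise, the text in the label has to be progressively coloured to show what is to be sung next.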
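
The two labels take turns: when a new line starts, the label that held the line just finished is refilled with the line after the new current one. The sketch below models only that bookkeeping, without Gtk or the one-second delay. The names panel_state_t and advance_line are hypothetical, and the state is assumed to start with current_line 0 and current_panel 1 so that the first new line lands in panel 0.

```c
#include <assert.h>
#include <stddef.h>

/* Which line is being sung and which of the two labels shows it. */
typedef struct {
    int current_line;   /* index into the array of lyric lines */
    int current_panel;  /* 0 or 1: label holding the current line */
} panel_state_t;

/*
 * Handle a new-line event. On success, stores in *refill_panel the label
 * that has just been vacated and returns the index of the line to load
 * into it. Returns -1, leaving the state unchanged, when no line is left
 * to preload.
 */
int advance_line(panel_state_t *st, int n_lines, int *refill_panel) {
    assert(st != NULL && refill_panel != NULL);
    if (st->current_line + 2 >= n_lines) {
        return -1;
    }
    *refill_panel = st->current_panel;   /* the old panel gets new text */
    st->current_line += 1;
    st->current_panel = (st->current_panel + 1) % 2;
    return st->current_line + 1;         /* line after the new current one */
}
```

With five lines (index 0 being the empty line that precedes the first marker), three calls return 2, 3, and 4 while the refilled panel alternates 1, 0, 1, and the fourth call returns -1.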

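In Chapter 15, I discussed using Cairo to draw in pixbufs and Pango to structure the text. Gtk labels understand Pango directly, so you just use Pango to format the text and display it in the label. This involves constructing an HTML-like Pango markup string with the first part coloured red and the rest in black. This can be set in the label, and there is no need to use Cairo.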
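
The markup string can be assembled with ordinary string handling. The sketch below is plain C and also escapes &, < and >, since an unescaped one of those in a lyric would make the markup invalid. In a GLib program, g_markup_escape_text() performs that escaping. The names append and build_markup are illustrative only, and sung_len is assumed to be a byte count that falls on a character boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append len bytes of src to the string in dst (capacity cap). When
   escape is non-zero, &, < and > are replaced by entities.
   Returns 0 on success, -1 if dst is too small. */
static int append(char *dst, size_t cap, const char *src, size_t len,
                  int escape) {
    size_t used = strlen(dst);
    size_t i;

    for (i = 0; i < len; i++) {
        char one[2] = { src[i], '\0' };
        const char *rep = one;

        if (escape && src[i] == '&') rep = "&amp;";
        else if (escape && src[i] == '<') rep = "&lt;";
        else if (escape && src[i] == '>') rep = "&gt;";

        if (used + strlen(rep) + 1 > cap) {
            return -1;
        }
        strcpy(dst + used, rep);
        used += strlen(rep);
    }
    return 0;
}

/* Build markup with the first sung_len bytes of line in red, the rest in black. */
int build_markup(char *out, size_t cap, const char *line, size_t sung_len) {
    static const char *open_red = "<span foreground=\"red\">";
    static const char *open_black = "</span><span foreground=\"black\">";
    static const char *close_span = "</span>";
    size_t len;

    assert(out != NULL && line != NULL && cap > 0);
    len = strlen(line);
    if (sung_len > len) {
        sung_len = len;   /* never read past the end of the line */
    }
    out[0] = '\0';
    if (append(out, cap, open_red, strlen(open_red), 0) != 0 ||
        append(out, cap, line, sung_len, 1) != 0 ||
        append(out, cap, open_black, strlen(open_black), 0) != 0 ||
        append(out, cap, line + sung_len, len - sung_len, 1) != 0 ||
        append(out, cap, close_span, strlen(close_span), 0) != 0) {
        return -1;
    }
    return 0;
}
```

The resulting string is what would be handed to pango_parse_markup() to obtain the plain text and the attribute list for the label.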

The program is gtkkaraoke_player.c.

Warning

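The following program crashes regularly when trying to copy a Pango attribute list in the Gtk code that sizes a label. Debugging shows that the reference to the Pango copy function is set to NULL somewhere in Gtk when it should not be. I have no fix as yet and have not reproduced the bug in a simple enough form to log a bug report.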

#include <fluidsynth.h>
#include <fluid_midi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <gtk/gtk.h>

/* GString stuff from https://developer.gnome.org/glib/2.31/glib-Strings.html
   Memory alloc from https://developer.gnome.org/glib/2.30/glib-Memory-Allocation.html
   Packing demo from https://developer.gnome.org/gtk-tutorial/2.90/x386.html
   Thread stuff from https://developer.gnome.org/gtk-faq/stable/x481.html
   GArrays from http://www.gtk.org/api/2.6/glib/glib-Arrays.html
   Pango attributes from http://www.ibm.com/developerworks/library/l-u-pango2/
   Timeouts at http://www.gtk.org/tutorial1.2/gtk_tut-17.html
 */

struct _lyric_t {
    gchar *lyric;
    long tick;

};
typedef struct _lyric_t lyric_t;

struct _lyric_lines_t {
    char *language;
    char *title;
    char *performer;
    GArray *lines; // array of GString *
};
typedef struct _lyric_lines_t lyric_lines_t;

GArray *lyrics;

lyric_lines_t lyric_lines;

fluid_synth_t* synth;

GtkWidget *lyric_labels[2];

fluid_player_t* player;

int current_panel = 1;  // panel showing current lyric line; the first new line flips it to 0
int current_line = 0;  // which line is the current lyric
gchar *current_lyric;   // currently playing lyric line
GString *front_of_lyric;  // part of lyric to be coloured red
GString *end_of_lyric;    // part of lyric to not be coloured

gchar *markup[] = {"<span foreground=\"red\">",
                   "</span><span foreground=\"black\">",
                   "</span>"};
gchar *markup_newline[] = {"<span foreground=\"black\">",
                   "</span>"};
GString *marked_up_label;

struct _reset_label_data {
    GtkLabel *label;
    gchar *text;
    PangoAttrList *attrs;
};

typedef struct _reset_label_data reset_label_data;

/**
 * redraw a label some time later
 */
gint reset_label_cb(gpointer data) {
    reset_label_data *rdata = ( reset_label_data *) data;

    if (rdata->label == NULL) {
        printf("Label is null, cant set its text \n");
        return FALSE;
    }

    printf("Resetting label callback to \"%s\"\n", rdata->text);

    gdk_threads_enter();

    gchar *str;
    str = g_strconcat(markup_newline[0], rdata->text, markup_newline[1], NULL);

    PangoAttrList *attrs;
    gchar *text;
    pango_parse_markup (str, -1,0, &attrs, &text, NULL, NULL);

    gtk_label_set_text(rdata->label, text);
    gtk_label_set_attributes(rdata->label, attrs);

    gdk_threads_leave();

    GtkAllocation* alloc = g_new(GtkAllocation, 1);
    gtk_widget_get_allocation((GtkWidget *) (rdata->label), alloc);
    printf("Set label text to \"%s\"\n", gtk_label_get_text(rdata->label));
    printf("Label has height %d width %d\n", alloc->height, alloc->width);
    printf("Set other label text to \"%s\"\n",
           gtk_label_get_text(rdata->label == lyric_labels[0] ?
                              lyric_labels[1] : lyric_labels[0]));
    gtk_widget_get_allocation((GtkWidget *) (rdata->label  == lyric_labels[0] ?
                              lyric_labels[1] : lyric_labels[0]), alloc);
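    printf("Label has height %d width %d\n", alloc->height, alloc->width);

    // release the temporaries; the text in rdata is owned by lyric_lines
    g_free(alloc);
    g_free(str);
    g_free(rdata);

    return FALSE;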
}

/**
 * This MIDI event callback filters out the TEXT and LYRIC events
 * and passes the rest to the default event handler.
 * Here we colour the text in a Gtk label
 */
int event_callback(void *data, fluid_midi_event_t *event) {
    fluid_synth_t* synth = (fluid_synth_t*) data;
    int type = fluid_midi_event_get_type(event);
    int chan = fluid_midi_event_get_channel(event);
    if (synth == NULL) printf("Synth is null\n");
    switch(type) {
    case MIDI_TEXT:
        printf("Callback: Playing text event %s (length %d)\n",
               (char *) event->paramptr, event->param1);

        if (((char *) event->paramptr)[0] == '\\') {
            // we've got a new line, change the label text on the NEXT panel
            int next_panel = current_panel; // really (current_panel+2)%2
            int next_line = current_line + 2;
            gchar *next_lyric;

            if (current_line + 2 >= lyric_lines.lines->len) {
                return FLUID_OK;
            }
            current_line += 1;
            current_panel = (current_panel + 1) % 2;

            // set up new line as current line
            char *lyric =  event->paramptr;

            // find the next line from lyric_lines array
            current_lyric = g_array_index(lyric_lines.lines, GString *, current_line)->str;

            // lyric is in 2 parts: front coloured, end uncoloured
            front_of_lyric = g_string_new(lyric+1); // drop the leading backslash
            end_of_lyric = g_string_new(current_lyric);
            printf("New line. Setting front to %s end to \"%s\"\n", lyric+1, current_lyric);

            // update label for next line after this one
            char *str = g_array_index(lyric_lines.lines, GString *, next_line)->str;
            printf("Setting text in label %d to \"%s\"\n", next_panel, str);

            next_lyric = g_array_index(lyric_lines.lines, GString *, next_line)->str;

            gdk_threads_enter();

            // change the label after one second to avoid visual "jar"
            reset_label_data *label_data;
            label_data = g_new(reset_label_data, 1);
            label_data->label = (GtkLabel *) lyric_labels[next_panel];
            label_data->text = next_lyric;
            g_timeout_add(1000, reset_label_cb, label_data);

            // Dies if you try to flush at this point!
            // gdk_flush();

            gdk_threads_leave();
        } else {
            // change text colour as chars are played, using Pango attributes
            char *lyric =  event->paramptr;
            if ((front_of_lyric != NULL) && (lyric != NULL)) {
                // add the new lyric to the front of the existing coloured
                g_string_append(front_of_lyric, lyric);
                char *s = front_of_lyric->str;
                printf("Displaying \"%s\"\n", current_lyric);
                printf("  Colouring \"%s\"\n", s);
                printf("  Not colouring \"%s\"\n", current_lyric + strlen(s));

                // todo: avoid memory leak
                marked_up_label = g_string_new(markup[0]);
                g_string_append(marked_up_label, s);
                g_string_append(marked_up_label, markup[1]);
                g_string_append(marked_up_label, current_lyric + strlen(s));
                g_string_append(marked_up_label, markup[2]);
                printf("Marked up label \"%s\"\n", marked_up_label->str);

                /* Example from http://www.ibm.com/developerworks/library/l-u-pango2/
                 */
                PangoAttrList *attrs;
                gchar *text;
                gdk_threads_enter();
                pango_parse_markup (marked_up_label->str, -1,0, &attrs, &text, NULL, NULL);
                printf("Marked up label parsed ok\n");
                gtk_label_set_text((GtkLabel *) lyric_labels[current_panel],
                                   text);
                gtk_label_set_attributes(GTK_LABEL(lyric_labels[current_panel]), attrs);
                // Dies if you try to flush at this point!
                //gdk_flush();

                gdk_threads_leave();
            }
        }
        return  FLUID_OK;

    case MIDI_LYRIC:
        printf("Callback: Playing lyric event %d %s\n",
               event->param1, (char *) event->paramptr);
        return  FLUID_OK;

    case MIDI_EOT:
        printf("End of track\n");
        exit(0);
    }
    // default handler for all other events
    return fluid_synth_handle_midi_event( data, event);
}

/*
 * Build array of lyric lines from the MIDI file data
 */
void build_lyric_lines() {
    int n;
    lyric_t *plyric;
    GString *line = g_string_new("");
    GArray *lines =  g_array_sized_new(FALSE, FALSE, sizeof(GString *), 64);

    lyric_lines.title = NULL;

    for (n = 0; n < lyrics->len; n++) {
        plyric = g_array_index(lyrics, lyric_t *, n);
        gchar *lyric = plyric->lyric;
        int tick = plyric->tick;

        if ((strlen(lyric) >= 2) && (lyric[0] == '@') && (lyric[1] == 'L')) {
            lyric_lines.language =  lyric + 2;
            continue;
        }

        if ((strlen(lyric) >= 2) && (lyric[0] == '@') && (lyric[1] == 'T')) {
            if (lyric_lines.title == NULL) {
                lyric_lines.title = lyric + 2;
            } else {
                lyric_lines.performer = lyric + 2;
            }
            continue;
        }

        if (lyric[0] == '@') {
            // some other stuff like @KMIDI KARAOKE FILE
            continue;
        }

        if ((lyric[0] == '/') || (lyric[0] == '\\')) {
            // start of a new line
            // add to lines
            g_array_append_val(lines, line);
            line = g_string_new(lyric + 1);
        }  else {
            line = g_string_append(line, lyric);
        }
    }
    // the final line has no new-line marker after it, so add it here
    g_array_append_val(lines, line);
    lyric_lines.lines = lines;

    printf("Title is %s, performer is %s, language is %s\n",
           lyric_lines.title, lyric_lines.performer, lyric_lines.language);
    for (n = 0; n < lines->len; n++) {
        printf("Line is %s\n", g_array_index(lines, GString *, n)->str);
    }

}

/**
 * This is called whenever new data is loaded, such as a new file.
 * Here we extract the TEXT and LYRIC events and save them
 * into an array
 */
int onload_callback(void *data, fluid_player_t *player) {
    long ticks = 0L;
    lyric_t *plyric;

    printf("Load callback, tracks %d \n", player->ntracks);
    int n;
    for (n = 0; n < player->ntracks; n++) {
        fluid_track_t *track = player->track[n];
        printf("Track %d\n", n);
        ticks = 0L;  /* delta times restart at the top of each track */
        fluid_midi_event_t *event = fluid_track_first_event(track);
        while (event != NULL) {
            /* dtime is the delay since the previous event of any type,
               so count every event, not just the text ones */
            ticks += event->dtime;
            switch (fluid_midi_event_get_type (event)) {
            case MIDI_TEXT:
            case MIDI_LYRIC:
                /* there's no fluid_midi_event_get_sysex()
                   or fluid_midi_event_get_time() so we
                   have to look inside the opaque struct
                */
                printf("Loaded event %s for time %ld\n",
                       (char *) event->paramptr,
                       ticks);
                plyric = g_new(lyric_t, 1);
                plyric->lyric = g_strdup(event->paramptr);
                plyric->tick = ticks;
                g_array_append_val(lyrics, plyric);
            }
            event = fluid_track_next_event(track);
        }
    }

    printf("Saved %d lyric events\n", lyrics->len);
    for (n = 0; n < lyrics->len; n++) {
        plyric = g_array_index(lyrics, lyric_t *, n);
        printf("Saved lyric %s at %ld\n", plyric->lyric, plyric->tick);
    }

    build_lyric_lines();

    // stick the first two lines into the labels so we can see
    // what is coming
    gdk_threads_enter();
    char *str = g_array_index(lyric_lines.lines, GString *, 1)->str;
    gtk_label_set_text((GtkLabel *) lyric_labels[0], str);
    str = g_array_index(lyric_lines.lines, GString *, 2)->str;
    gtk_label_set_text((GtkLabel *) lyric_labels[1], str);
    // gdk_flush ();

    /* release GTK thread lock */
    gdk_threads_leave();

    return FLUID_OK;
}

/* Called when the windows are realized
 */
static void realize_cb (GtkWidget *widget, gpointer data) {
    /* now we can play the midi files, if any */
    fluid_player_play(player);
}

static gboolean delete_event( GtkWidget *widget,
                              GdkEvent  *event,
                              gpointer   data )
{
    /* If you return FALSE in the "delete-event" signal handler,
     * GTK will emit the "destroy" signal. Returning TRUE means
     * you don't want the window to be destroyed.
     * This is useful for popping up 'are you sure you want to quit?'
     * type dialogs. */

    g_print ("delete event occurred\n");

    /* Change TRUE to FALSE and the main window will be destroyed with
     * a "delete-event". */

    return TRUE;
}

/* Another callback */
static void destroy( GtkWidget *widget,
                     gpointer   data )
{
    gtk_main_quit ();
}

int main(int argc, char** argv)
{

    /* set up the fluidsynth stuff */
    int i;
    fluid_settings_t* settings;

    fluid_audio_driver_t* adriver;
    settings = new_fluid_settings();
    fluid_settings_setstr(settings, "audio.driver", "alsa");
    fluid_settings_setint(settings, "synth.polyphony", 64);
    fluid_settings_setint(settings, "synth.reverb.active", FALSE);
    fluid_settings_setint(settings, "synth.sample-rate", 22050);
    synth = new_fluid_synth(settings);
    player = new_fluid_player(synth);

    lyrics = g_array_sized_new(FALSE, FALSE, sizeof(lyric_t *), 1024);

    /* Set the MIDI event callback to our own functions rather than the system default */
    fluid_player_set_playback_callback(player, event_callback, synth);

    /* Add an onload callback so we can get information from new data before it plays */
    fluid_player_set_onload_callback(player, onload_callback, NULL);

    adriver = new_fluid_audio_driver(settings, synth);
    /* process command line arguments */
    for (i = 1; i < argc; i++) {
        if (fluid_is_soundfont(argv[i])) {
            fluid_synth_sfload(synth, argv[i], 1);
        } else {
            fluid_player_add(player, argv[i]);
        }
    }

    // Gtk stuff now

   /* GtkWidget is the storage type for widgets */
    GtkWidget *window;
    GtkWidget *button;
    GtkWidget *lyrics_box;

    /* This is called in all GTK applications. Arguments are parsed
     * from the command line and are returned to the application. */
    gtk_init (&argc, &argv);

    /* create a new window */
    window = gtk_window_new (GTK_WINDOW_TOPLEVEL);

    /* When the window is given the "delete-event" signal (this is given
     * by the window manager, usually by the "close" option, or on the
     * titlebar), we ask it to call the delete_event () function
     * as defined above. The data passed to the callback
     * function is NULL and is ignored in the callback function. */
    g_signal_connect (window, "delete-event",
                      G_CALLBACK (delete_event), NULL);

    /* Here we connect the "destroy" event to a signal handler.
     * This event occurs when we call gtk_widget_destroy() on the window,
     * or if we return FALSE in the "delete-event" callback. */
    g_signal_connect (window, "destroy",
                      G_CALLBACK (destroy), NULL);

    g_signal_connect (window, "realize", G_CALLBACK (realize_cb), NULL);

    /* Sets the border width of the window. */
    gtk_container_set_border_width (GTK_CONTAINER (window), 10);

    // Gtk 3.0 deprecates gtk_vbox_new in favour of gtk_grid
    // but that isn't in Gtk 2.0, so we ignore warnings for now
    lyrics_box = gtk_vbox_new(TRUE, 1);
    gtk_widget_show(lyrics_box);

    char *str = "  ";
    lyric_labels[0] = gtk_label_new(str);
    lyric_labels[1] = gtk_label_new(str);

    gtk_widget_show (lyric_labels[0]);
    gtk_widget_show (lyric_labels[1]);

    gtk_box_pack_start (GTK_BOX (lyrics_box), lyric_labels[0], TRUE, TRUE, 0);
    gtk_box_pack_start (GTK_BOX (lyrics_box), lyric_labels[1], TRUE, TRUE, 0);

    /* This packs the lyrics box into the window (a gtk container). */
    gtk_container_add (GTK_CONTAINER (window), lyrics_box);

    /* and the window */
    gtk_widget_show (window);

    /* All GTK applications must have a gtk_main(). Control ends here
     * and waits for an event to occur (like a key press or
     * mouse event). */
    gtk_main ();

    /* wait for playback termination */
    fluid_player_join(player);
    /* cleanup */
    delete_fluid_audio_driver(adriver);
    delete_fluid_player(player);
    delete_fluid_synth(synth);
    delete_fluid_settings(settings);
    return 0;
}

When run, it looks like Figure 27-1.

Figure 27-1.

Playing a Background Video with Gtk

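Chapter 15 showed how to play a background video with images (using pixbufs), text (using Cairo), and coloured text (using Pango). You can extend that by adding the dynamic text display needed for playing karaoke.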

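You can capture each lyric line in a structure that keeps the whole line, the part that has already been sung, the Pango markup for the line, and the Pango attributes.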

typedef struct _coloured_line_t {
    gchar *line;
    gchar *front_of_line;
    gchar *marked_up_line;
    PangoAttrList *attrs;
} coloured_line_t;

This structure is updated each time a MIDI lyric event occurs, in a thread listening to the FluidSynth sequencer.

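A separate thread plays the video and, on each frame, overlays the frame image with the current and next lyric lines. The result is set into a GtkImage for display by Gtk.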
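
The video thread has to pace itself between frames, and the listing sleeps for a fixed 33670 microseconds each time. FFmpeg describes frame rates as a rational number with a numerator and a denominator, so the delay can instead be derived from the stream. The following is a minimal sketch of that arithmetic only. The function name frame_delay_us is hypothetical, and decoding time is not taken into account.

```c
#include <assert.h>

/*
 * Microseconds to wait between frames for a rate of num/den frames
 * per second, rounded to the nearest microsecond.
 */
long frame_delay_us(long num, long den) {
    assert(num > 0 && den > 0);
    return (long) ((1000000LL * den + num / 2) / num);
}
```

For PAL video at 25/1 frames per second this gives 40000, and for NTSC at 30000/1001 it gives 33367.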

The program is gtkkaraoke_player_video_pango.c.

#include <fluidsynth.h>
#include <fluid_midi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <gtk/gtk.h>

#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>

// saving as pixbufs leaks memory
//#define USE_PIXBUF

/* run by
   gtkkaraoke_player_video_pango /usr/share/sounds/sf2/FluidR3_GM.sf2 /home/newmarch/Music/karaoke/sonken/songs/54154.kar
*/

/*
 * APIs:
 * GString: https://developer.gnome.org/glib/2.28/glib-Strings.html
 * Pango text attributes: https://developer.gnome.org/pango/stable/pango-Text-Attributes.html#pango-parse-markup
 * Pango layout: http://www.gtk.org/api/2.6/pango/pango-Layout-Objects.html
 * Cairo rendering: https://developer.gnome.org/pango/stable/pango-Cairo-Rendering.html#pango-cairo-create-layout
 * Cairo surface_t: http://cairographics.org/manual/cairo-cairo-surface-t.html
 * GTK+ 3 Reference Manual: https://developer.gnome.org/gtk3/3.0/
 * Gdk Pixbufs: https://developer.gnome.org/gdk/stable/gdk-Pixbufs.html
 */

struct _lyric_t {
    gchar *lyric;
    long tick;

};
typedef struct _lyric_t lyric_t;

struct _lyric_lines_t {
    char *language;
    char *title;
    char *performer;
    GArray *lines; // array of GString *
};
typedef struct _lyric_lines_t lyric_lines_t;

GArray *lyrics;

lyric_lines_t lyric_lines;

typedef struct _coloured_line_t {
    gchar *line;
    gchar *front_of_line;
    gchar *marked_up_line;
    PangoAttrList *attrs;
#ifdef USE_PIXBUF
    GdkPixbuf *pixbuf;
#endif
} coloured_line_t;

coloured_line_t coloured_lines[2];

fluid_synth_t* synth;

GtkWidget *image;
#if GTK_MAJOR_VERSION == 2
GdkPixmap *dbuf_pixmap;
#endif

int height_lyric_pixbufs[] = {300, 400}; // vertical offset of lyric in video

fluid_player_t* player;

int current_panel = 1;  // panel showing current lyric line
int current_line = 0;  // which line is the current lyric
gchar *current_lyric;   // currently playing lyric line
GString *front_of_lyric;  // part of lyric to be coloured red
//GString *end_of_lyric;    // part of lyric to not be coloured

// Colours seem to get mixed up when putting a pixbuf onto a pixbuf
// RED must be a string literal: a macro name is not expanded inside quotes
#ifdef USE_PIXBUF
#define RED "blue"
#else
#define RED "red"
#endif

gchar *markup[] = {"<span font=\"28\" foreground=\"" RED "\">",
                   "</span><span font=\"28\" foreground=\"white\">",
                   "</span>"};
gchar *markup_newline[] = {"<span foreground=\"black\">",
                           "</span>"};
GString *marked_up_label;

/* FFMpeg vbls */
AVFormatContext *pFormatCtx = NULL;
AVCodecContext *pCodecCtx = NULL;
int videoStream;
struct SwsContext *sws_ctx = NULL;
AVCodec *pCodec = NULL;

void markup_line(coloured_line_t *line) {
    GString *str =  g_string_new(markup[0]);
    g_string_append(str, line->front_of_line);
    g_string_append(str, markup[1]);
    g_string_append(str, line->line + strlen(line->front_of_line));
    g_string_append(str, markup[2]);
    printf("Marked up label \"%s\"\n", str->str);

    line->marked_up_line = str->str;
    // we have to free line->marked_up_line

    pango_parse_markup(str->str, -1,0, &(line->attrs), NULL, NULL, NULL);
    g_string_free(str, FALSE);
}

#ifdef USE_PIXBUF
void update_line_pixbuf(coloured_line_t *line) {
    //return;
    cairo_surface_t *surface;
    cairo_t *cr;

    int lyric_width = 480;
    int lyric_height = 60;
    surface = cairo_image_surface_create (CAIRO_FORMAT_ARGB32,
                                          lyric_width, lyric_height);
    cr = cairo_create (surface);

    PangoLayout *layout;
    PangoFontDescription *desc;

    // draw the attributed text
    layout = pango_cairo_create_layout (cr);
    pango_layout_set_text (layout, line->line, -1);
    pango_layout_set_attributes(layout, line->attrs);

    // centre the image in the surface
    int width, height;
    pango_layout_get_pixel_size(layout,
                                &width,
                                &height);
    cairo_move_to(cr, (lyric_width-width)/2, 0);

    pango_cairo_update_layout (cr, layout);
    pango_cairo_show_layout (cr, layout);

    // pull the pixbuf out of the surface
    unsigned char *data = cairo_image_surface_get_data(surface);
    width = cairo_image_surface_get_width(surface);
    height = cairo_image_surface_get_height(surface);
    int stride = cairo_image_surface_get_stride(surface);
    printf("Text surface width %d height %d stride %d\n", width, height, stride);

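    GdkPixbuf *old_pixbuf = line->pixbuf;
    // gdk_pixbuf_new_from_data() only wraps the surface's pixels, so take
    // a copy before the surface that owns them is destroyed
    GdkPixbuf *wrapped = gdk_pixbuf_new_from_data(data, GDK_COLORSPACE_RGB, 1, 8, width, height, stride, NULL, NULL);
    line->pixbuf = gdk_pixbuf_copy(wrapped);
    g_object_unref(wrapped);
    cairo_surface_destroy(surface);
    if (old_pixbuf != NULL) {
        g_object_unref(old_pixbuf);
    }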
}
#endif

/**
 * This MIDI event callback filters out the TEXT and LYRIC events
 * and passes the rest to the default event handler.
  */
int event_callback(void *data, fluid_midi_event_t *event) {
    fluid_synth_t* synth = (fluid_synth_t*) data;
    int type = fluid_midi_event_get_type(event);
    int chan = fluid_midi_event_get_channel(event);
    if (synth == NULL) printf("Synth is null\n");

    //return 0;

    switch(type) {
    case MIDI_TEXT:
        printf("Callback: Playing text event %s (length %d)\n",
               (char *) event->paramptr, (int) event->param1);

        if (((char *) event->paramptr)[0] == '\\') {
            int next_panel = current_panel; // really (current_panel+2)%2
            int next_line = current_line + 2;
            gchar *next_lyric;

            if (current_line + 2 >= lyric_lines.lines->len) {
                return FLUID_OK;
            }
            current_line += 1;
            current_panel = (current_panel + 1) % 2;

            // set up new line as current line
            char *lyric =  event->paramptr;
            current_lyric = g_array_index(lyric_lines.lines, GString *, current_line)->str;
            front_of_lyric = g_string_new(lyric+1); // drop the leading backslash
            printf("New line. Setting front to %s end to \"%s\"\n", lyric+1, current_lyric);

            coloured_lines[current_panel].line = current_lyric;
            coloured_lines[current_panel].front_of_line = lyric+1;
            markup_line(coloured_lines+current_panel);
#ifdef USE_PIXBUF
            update_line_pixbuf(coloured_lines+current_panel);
#endif
            // update label for next line after this one
            next_lyric = g_array_index(lyric_lines.lines, GString *, next_line)->str;

            marked_up_label = g_string_new(markup_newline[0]);

            g_string_append(marked_up_label, next_lyric);
            g_string_append(marked_up_label, markup_newline[1]);
            PangoAttrList *attrs;
            gchar *text;
            pango_parse_markup (marked_up_label->str, -1,0, &attrs, &text, NULL, NULL);

            coloured_lines[next_panel].line = next_lyric;
            coloured_lines[next_panel].front_of_line = "";
            markup_line(coloured_lines+next_panel);
#ifdef USE_PIXBUF
            update_line_pixbuf(coloured_lines+next_panel);
#endif
        } else {
            // change text colour as chars are played
            char *lyric =  event->paramptr;
            if ((front_of_lyric != NULL) && (lyric != NULL)) {
                g_string_append(front_of_lyric, lyric);
                char *s = front_of_lyric->str;
                coloured_lines[current_panel].front_of_line = s;
                markup_line(coloured_lines+current_panel);
#ifdef USE_PIXBUF
                update_line_pixbuf(coloured_lines+current_panel);
#endif
            }
        }
        return  FLUID_OK;

    case MIDI_LYRIC:
        printf("Callback: Playing lyric event %d %s\n", (int) event->param1, (char *) event->paramptr);
        return  FLUID_OK;

    case MIDI_EOT:
        printf("End of track\n");
        exit(0);
    }
    return fluid_synth_handle_midi_event( data, event);
}

void build_lyric_lines() {
    int n;
    lyric_t *plyric;
    GString *line = g_string_new("");
    GArray *lines =  g_array_sized_new(FALSE, FALSE, sizeof(GString *), 64);

    lyric_lines.title = NULL;

    for (n = 0; n < lyrics->len; n++) {
        plyric = g_array_index(lyrics, lyric_t *, n);
        gchar *lyric = plyric->lyric;
        int tick = plyric->tick;

        if ((strlen(lyric) >= 2) && (lyric[0] == '@') && (lyric[1] == 'L')) {
            lyric_lines.language =  lyric + 2;
            continue;
        }

        if ((strlen(lyric) >= 2) && (lyric[0] == '@') && (lyric[1] == 'T')) {
            if (lyric_lines.title == NULL) {
                lyric_lines.title = lyric + 2;
            } else {
                lyric_lines.performer = lyric + 2;
            }
            continue;
        }

        if (lyric[0] == '@') {
            // some other stuff like @KMIDI KARAOKE FILE
            continue;
        }

        if ((lyric[0] == '/') || (lyric[0] == '\\')) {
            // start of a new line
            // add to lines
            g_array_append_val(lines, line);
            line = g_string_new(lyric + 1);
        }  else {
            line = g_string_append(line, lyric);
        }
    }
    // the final line has no new-line marker after it, so add it here
    g_array_append_val(lines, line);
    lyric_lines.lines = lines;

    printf("Title is %s, performer is %s, language is %s\n",
           lyric_lines.title, lyric_lines.performer, lyric_lines.language);
    for (n = 0; n < lines->len; n++) {
        printf("Line is %s\n", g_array_index(lines, GString *, n)->str);
    }

}

/**
 * This is called whenever new data is loaded, such as a new file.
 * Here we extract the TEXT and LYRIC events and save them
 * into an array
 */
int onload_callback(void *data, fluid_player_t *player) {
    long ticks = 0L;
    lyric_t *plyric;

    printf("Load callback, tracks %d \n", player->ntracks);
    int n;
    for (n = 0; n < player->ntracks; n++) {
        fluid_track_t *track = player->track[n];
        printf("Track %d\n", n);
        ticks = 0L;  /* delta times restart at the top of each track */
        fluid_midi_event_t *event = fluid_track_first_event(track);
        while (event != NULL) {
            /* dtime is the delay since the previous event of any type,
               so count every event, not just the text ones */
            ticks += event->dtime;
            switch (fluid_midi_event_get_type (event)) {
            case MIDI_TEXT:
            case MIDI_LYRIC:
                /* there's no fluid_midi_event_get_sysex()
                   or fluid_midi_event_get_time() so we
                   have to look inside the opaque struct
                */
                printf("Loaded event %s for time %ld\n",
                       (char *) event->paramptr,
                       ticks);
                plyric = g_new(lyric_t, 1);
                plyric->lyric = g_strdup(event->paramptr);
                plyric->tick = ticks;
                g_array_append_val(lyrics, plyric);
            }
            event = fluid_track_next_event(track);
        }
    }

    printf("Saved %d lyric events\n", lyrics->len);
    for (n = 0; n < lyrics->len; n++) {
        plyric = g_array_index(lyrics, lyric_t *, n);
        printf("Saved lyric %s at %ld\n", plyric->lyric, plyric->tick);
    }

    build_lyric_lines();

    return FLUID_OK;
}

static void overlay_lyric(cairo_t *cr,
                          coloured_line_t *line,
                          int ht) {
    PangoLayout *layout;
    int height, width;

    if (line->line == NULL) {
        return;
    }

    layout = pango_cairo_create_layout (cr);
    pango_layout_set_text (layout, line->line, -1);
    pango_layout_set_attributes(layout, line->attrs);
    pango_layout_get_pixel_size(layout,
                                &width,
                                &height);
    cairo_move_to(cr, (720-width)/2, ht);

    pango_cairo_update_layout (cr, layout);
    pango_cairo_show_layout (cr, layout);

    g_object_unref(layout);
}

static void pixmap_destroy_notify(guchar *pixels,
                                  gpointer data) {
    printf("Destroy pixmap\n");
}

static void *play_background(void *args) {
    /* based on code from
       http://www.cs.dartmouth.edu/~xy/cs23/gtk.html
       http://cdry.wordpress.com/2009/09/09/using-custom-io-callbacks-with-ffmpeg/
    */

    int i;
    AVPacket packet;
    int frameFinished;
    AVFrame *pFrame = NULL;

    int oldSize;
    char *oldData;
    int bytesDecoded;
    GdkPixbuf *pixbuf;
    AVFrame *picture_RGB;
    char *buffer;

#if GTK_MAJOR_VERSION == 2
    GdkPixmap *pixmap;
    GdkBitmap *mask;
#endif

    pFrame=avcodec_alloc_frame();

    i=0;
    picture_RGB = avcodec_alloc_frame();
    buffer = malloc (avpicture_get_size(PIX_FMT_RGB24, 720, 576));
    avpicture_fill((AVPicture *)picture_RGB, buffer, PIX_FMT_RGB24, 720, 576);

    while(av_read_frame(pFormatCtx, &packet)>=0) {
        if(packet.stream_index==videoStream) {
            //printf("Frame %d\n", i++);
            usleep(33670);  // 29.7 frames per second
            // Decode video frame
            avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished,
                                  &packet);
            int width = pCodecCtx->width;
            int height = pCodecCtx->height;

            // create the scaling context once and reuse it for every frame
            if (sws_ctx == NULL) {
                sws_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height, pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height, PIX_FMT_RGB24, SWS_BICUBIC, NULL, NULL, NULL);
            }

            if (frameFinished) {
                printf("Frame %d\n", i++);

                sws_scale(sws_ctx,  (uint8_t const * const *) pFrame->data, pFrame->linesize, 0, pCodecCtx->height, picture_RGB->data, picture_RGB->linesize);

                pixbuf = gdk_pixbuf_new_from_data(picture_RGB->data[0], GDK_COLORSPACE_RGB, 0, 8, 720, 480, picture_RGB->linesize[0], pixmap_destroy_notify, NULL);

                /* start GTK thread lock for drawing */
                gdk_threads_enter();

#define SHOW_LYRIC
#ifdef SHOW_LYRIC
                // Create the destination surface
                cairo_surface_t *surface = cairo_image_surface_create (CAIRO_FORMAT_ARGB32,
                                                                       width, height);
                cairo_t *cr = cairo_create(surface);

                // draw the background image
                gdk_cairo_set_source_pixbuf(cr, pixbuf, 0, 0);
                cairo_paint (cr);

#ifdef USE_PIXBUF
                // draw the lyric
                GdkPixbuf *lyric_pixbuf = coloured_lines[current_panel].pixbuf;
                if (lyric_pixbuf != NULL) {
                    int width = gdk_pixbuf_get_width(lyric_pixbuf);
                    gdk_cairo_set_source_pixbuf(cr,
                                                lyric_pixbuf,
                                                (720-width)/2,
                                                 height_lyric_pixbufs[current_panel]);
                    cairo_paint (cr);
                }

                int next_panel = (current_panel+1) % 2;
                lyric_pixbuf = coloured_lines[next_panel].pixbuf;
                if (lyric_pixbuf != NULL) {
                    int width = gdk_pixbuf_get_width(lyric_pixbuf);
                    gdk_cairo_set_source_pixbuf(cr,
                                                lyric_pixbuf,
                                                (720-width)/2,
                                                 height_lyric_pixbufs[next_panel]);
                    cairo_paint (cr);
                }
#else

                overlay_lyric(cr,
                              coloured_lines+current_panel,
                              height_lyric_pixbufs[current_panel]);

                int next_panel = (current_panel+1) % 2;
                overlay_lyric(cr,
                              coloured_lines+next_panel,
                              height_lyric_pixbufs[next_panel]);
#endif
                pixbuf = gdk_pixbuf_get_from_surface(surface,
                                                     0,
                                                     0,
                                                     width,
                                                     height);

                gtk_image_set_from_pixbuf((GtkImage*) image, pixbuf);

                g_object_unref(pixbuf);         /* reclaim memory */
                //g_object_unref(layout);
                cairo_surface_destroy(surface);
                cairo_destroy(cr);
#else
                gtk_image_set_from_pixbuf((GtkImage*) image, pixbuf);
#endif /* SHOW_LYRIC */

                /* release GTK thread lock */
                gdk_threads_leave();
            }
        }
        av_free_packet(&packet);
    }
    sws_freeContext(sws_ctx);

    printf("Video over!\n");
    exit(0);
}

static void *play_midi(void *args) {
    fluid_player_play(player);

    printf("Audio finished\n");
    //exit(0);
    return NULL;   /* a pthread start routine must return a void * */
}

/* Called when the windows are realized
 */
static void realize_cb (GtkWidget *widget, gpointer data) {
    /* start the video playing in its own thread */
    pthread_t tid;
    pthread_create(&tid, NULL, play_background, NULL);

    /* start the MIDI file playing in its own thread */
    pthread_t tid_midi;
    pthread_create(&tid_midi, NULL, play_midi, NULL);
}

static gboolean delete_event( GtkWidget *widget,
                              GdkEvent  *event,
                              gpointer   data )
{
    /* If you return FALSE in the "delete-event" signal handler,
     * GTK will emit the "destroy" signal. Returning TRUE means
     * you don't want the window to be destroyed.
     * This is useful for popping up 'are you sure you want to quit?'
     * type dialogs. */

    g_print ("delete event occurred\n");

    /* Change TRUE to FALSE and the main window will be destroyed with
     * a "delete-event". */

    return TRUE;
}

/* Another callback */
static void destroy( GtkWidget *widget,
                     gpointer   data )
{
    gtk_main_quit ();
}

int main(int argc, char** argv)
{
    XInitThreads();

    int i;

    fluid_settings_t* settings;

    fluid_audio_driver_t* adriver;
    settings = new_fluid_settings();
    fluid_settings_setstr(settings, "audio.driver", "alsa");
    //fluid_settings_setint(settings, "lash.enable", 0);
    fluid_settings_setint(settings, "synth.polyphony", 64);
    fluid_settings_setint(settings, "synth.reverb.active", FALSE);
    fluid_settings_setint(settings, "synth.sample-rate", 22050);
    synth = new_fluid_synth(settings);
    player = new_fluid_player(synth);

    lyrics = g_array_sized_new(FALSE, FALSE, sizeof(lyric_t *), 1024);

    /* Set the MIDI event callback to our own functions rather than the system default */
    fluid_player_set_playback_callback(player, event_callback, synth);

    /* Add an onload callback so we can get information from new data before it plays */
    fluid_player_set_onload_callback(player, onload_callback, NULL);

    adriver = new_fluid_audio_driver(settings, synth);

    /* process command line arguments */
    for (i = 1; i < argc; i++) {
        if (fluid_is_soundfont(argv[i])) {
            fluid_synth_sfload(synth, argv[i], 1);
        } else {
            fluid_player_add(player, argv[i]);
        }
    }

    /* FFMpeg stuff */

    AVFrame *pFrame = NULL;
    AVPacket packet;

    AVDictionary *optionsDict = NULL;

    av_register_all();

    if(avformat_open_input(&pFormatCtx, "short.mpg", NULL, NULL)!=0) {
        printf("Couldn't open video file\n");
        return -1; // Couldn't open file
    }

    // Retrieve stream information
    if(avformat_find_stream_info(pFormatCtx, NULL)<0) {
        printf("Couldn't find stream information\n");
        return -1; // Couldn't find stream information
    }

    // Dump information about file onto standard error
    // (use the name of the file actually opened above, not argv[1])
    av_dump_format(pFormatCtx, 0, "short.mpg", 0);

    // Find the first video stream
    videoStream=-1;
    for(i=0; i<pFormatCtx->nb_streams; i++)
        if(pFormatCtx->streams[i]->codec->codec_type==AVMEDIA_TYPE_VIDEO) {
            videoStream=i;
            break;
        }
    if(videoStream==-1)
        return -1; // Didn't find a video stream

    for(i=0; i<pFormatCtx->nb_streams; i++)
        if(pFormatCtx->streams[i]->codec->codec_type==AVMEDIA_TYPE_AUDIO) {
            printf("Found an audio stream too\n");
            break;
        }

    // Get a pointer to the codec context for the video stream
    pCodecCtx=pFormatCtx->streams[videoStream]->codec;

    // Find the decoder for the video stream
    pCodec=avcodec_find_decoder(pCodecCtx->codec_id);
    if(pCodec==NULL) {
        fprintf(stderr, "Unsupported codec!\n");
        return -1; // Codec not found
    }

    // Open codec
    if(avcodec_open2(pCodecCtx, pCodec, &optionsDict)<0) {
        printf("Could not open codec\n");
        return -1; // Could not open codec
    }

    sws_ctx =
        sws_getContext
        (
         pCodecCtx->width,
         pCodecCtx->height,
         pCodecCtx->pix_fmt,
         pCodecCtx->width,
         pCodecCtx->height,
         PIX_FMT_YUV420P,
         SWS_BILINEAR,
         NULL,
         NULL,
         NULL
         );

    /* GTK stuff now */

    /* GtkWidget is the storage type for widgets */
    GtkWidget *window;
    GtkWidget *button;
    GtkWidget *lyrics_box;

    /* This is called in all GTK applications. Arguments are parsed
     * from the command line and are returned to the application. */
    gtk_init (&argc, &argv);

    /* create a new window */
    window = gtk_window_new (GTK_WINDOW_TOPLEVEL);

    /* When the window is given the "delete-event" signal (this is given
     * by the window manager, usually by the "close" option, or on the
     * titlebar), we ask it to call the delete_event () function
     * as defined above. The data passed to the callback
     * function is NULL and is ignored in the callback function. */
    g_signal_connect (window, "delete-event",
                      G_CALLBACK (delete_event), NULL);

    /* Here we connect the "destroy" event to a signal handler.
     * This event occurs when we call gtk_widget_destroy() on the window,
     * or if we return FALSE in the "delete-event" callback. */
    g_signal_connect (window, "destroy",
                      G_CALLBACK (destroy), NULL);

    g_signal_connect (window, "realize", G_CALLBACK (realize_cb), NULL);

    /* Sets the border width of the window. */
    gtk_container_set_border_width (GTK_CONTAINER (window), 10);

    lyrics_box = gtk_vbox_new(TRUE, 1);
    gtk_widget_show(lyrics_box);

    /*
    char *str = "     ";
    lyric_labels[0] = gtk_label_new(str);
    str =  "World";
    lyric_labels[1] = gtk_label_new(str);
    */

    image = gtk_image_new();

    //image_drawable = gtk_drawing_area_new();
    //gtk_widget_set_size_request (canvas, 720, 480);
    //gtk_drawing_area_size((GtkDrawingArea *) image_drawable, 720, 480);

    //gtk_widget_show (lyric_labels[0]);
    //gtk_widget_show (lyric_labels[1]);

    gtk_widget_show (image);

    //gtk_box_pack_start (GTK_BOX (lyrics_box), lyric_labels[0], TRUE, TRUE, 0);
    //gtk_box_pack_start (GTK_BOX (lyrics_box), lyric_labels[1], TRUE, TRUE, 0);
    gtk_box_pack_start (GTK_BOX (lyrics_box), image, TRUE, TRUE, 0);
    //gtk_box_pack_start (GTK_BOX (lyrics_box), canvas, TRUE, TRUE, 0);
    //gtk_box_pack_start (GTK_BOX (lyrics_box), image_drawable, TRUE, TRUE, 0);

    /* This packs the lyrics box into the window (a gtk container). */
    gtk_container_add (GTK_CONTAINER (window), lyrics_box);

    /* and the window */
    gtk_widget_show (window);

    /* All GTK applications must have a gtk_main(). Control ends here
     * and waits for an event to occur (like a key press or
     * mouse event). */
    gtk_main ();

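    /* gtk_main() has returned: stop the MIDI player and wait for it
     * to finish so that the cleanup below is actually reached */
    fluid_player_stop(player);
    fluid_player_join(player);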
    /* cleanup */
    delete_fluid_audio_driver(adriver);
    delete_fluid_player(player);
    delete_fluid_synth(synth);
    delete_fluid_settings(settings);

    return 0;
}