License: CC BY-NC-SA 4.0
9. Java Sound
This chapter covers the essentials of programming with sampled data using the Java Sound API. It assumes a good working knowledge of Java applications. Java Sound has been part of Java since its early days. It handles both sampled and MIDI data and is a comprehensive system.
Resources
Many resources are available for Java Sound.
- The Java Platform Standard Edition 7 API specification (http://docs.oracle.com/javase/7/docs/api/) is the reference point for all the standard Java APIs, including javax.sound.sampled.
- The "Trail: Sound" tutorial in the Java Tutorials (http://docs.oracle.com/javase/tutorial/sound/index.html) gives a good overview of both the sampled and MIDI packages.
- The audio programming FAQ at Java Sound Resources (www.jsresources.org/faq_audio.html) answers many questions about Java Sound.
- The Sound group (http://openjdk.java.net/groups/sound/) consists of the developers who design, implement, and maintain the various OpenJDK sound components. You can find out more about ongoing Java Sound development in the open source community there.
Key Java Sound Classes
These are the key classes:
- The AudioSystem class is the entry point for all the sampled-audio classes.
- The AudioFormat class specifies information about the format, such as the sample rate.
- The AudioInputStream class provides an input stream from a mixer's target line.
- The Mixer class represents an audio device.
- The SourceDataLine class represents an input line to a device.
- The TargetDataLine class represents an output line from a device.
Information About Devices
Each device is represented by a Mixer object. Ask the AudioSystem for a list of these. Each mixer has a set of target (output) lines and source (input) lines; query each mixer for them separately. The following program is called DeviceInfo.java:
import javax.sound.sampled.*;
public class DeviceInfo {
public static void main(String[] args) throws Exception {
Mixer.Info[] minfoSet = AudioSystem.getMixerInfo();
System.out.println("Mixers:");
for (Mixer.Info minfo: minfoSet) {
System.out.println(" " + minfo.toString());
Mixer m = AudioSystem.getMixer(minfo);
System.out.println(" Mixer: " + m.toString());
System.out.println(" Source lines");
Line.Info[] slines = m.getSourceLineInfo();
for (Line.Info s: slines) {
System.out.println(" " + s.toString());
}
Line.Info[] tlines = m.getTargetLineInfo();
System.out.println(" Target lines");
for (Line.Info t: tlines) {
System.out.println(" " + t.toString());
}
}
}
}
Here is part of the output on my system:
Mixers:
PulseAudio Mixer, version 0.02
Source lines
interface SourceDataLine supporting 42 audio formats, and buffers of 0 to 1000000 bytes
interface Clip supporting 42 audio formats, and buffers of 0 to 1000000 bytes
Target lines
interface TargetDataLine supporting 42 audio formats, and buffers of 0 to 1000000 bytes
default [default], version 1.0.24
Source lines
interface SourceDataLine supporting 512 audio formats, and buffers of at least 32 bytes
interface Clip supporting 512 audio formats, and buffers of at least 32 bytes
Target lines
interface TargetDataLine supporting 512 audio formats, and buffers of at least 32 bytes
PCH [plughw:0,0], version 1.0.24
Source lines
interface SourceDataLine supporting 24 audio formats, and buffers of at least 32 bytes
interface Clip supporting 24 audio formats, and buffers of at least 32 bytes
Target lines
interface TargetDataLine supporting 24 audio formats, and buffers of at least 32 bytes
NVidia [plughw:1,3], version 1.0.24
Source lines
interface SourceDataLine supporting 96 audio formats, and buffers of at least 32 bytes
interface Clip supporting 96 audio formats, and buffers of at least 32 bytes
Target lines
NVidia [plughw:1,7], version 1.0.24
Source lines
interface SourceDataLine supporting 96 audio formats, and buffers of at least 32 bytes
interface Clip supporting 96 audio formats, and buffers of at least 32 bytes
Target lines
NVidia [plughw:1,8], version 1.0.24
Source lines
interface SourceDataLine supporting 96 audio formats, and buffers of at least 32 bytes
interface Clip supporting 96 audio formats, and buffers of at least 32 bytes
Target lines
This shows the PulseAudio and ALSA mixers. Further queries can show, for example, the formats supported.
Playing Audio from a File
To play from a file, suitable objects must be created to read from the file and to write to an output device. The steps are as follows:
- Ask the AudioSystem for an AudioInputStream. It is created with the file name as a parameter.
- Create a source data line for the output. The terminology can be confusing here: the program produces output, but that is input to a data line, so the data line must act as a source for the output device. Creating the data line is a multistep process:
  - First create an AudioFormat object to specify the parameters of the data line.
  - Create a DataLine.Info for a source data line of that audio format.
  - Ask the AudioSystem for a source data line that will handle that DataLine.Info.
Following these steps, data can be read from the input stream and written to the data line. Figure 9-1 shows the UML class diagram of the relevant classes.
Figure 9-1.
Class diagram for playing audio from a file
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.FloatControl;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
public class PlayAudioFile {
/** Plays audio from given file names. */
public static void main(String [] args) {
// Check for given sound file names.
if (args.length < 1) {
System.out.println("Usage: java Play <sound file names>*");
System.exit(0);
}
// Process arguments.
for (int i = 0; i < args.length; i++)
playAudioFile(args[i]);
// Must exit explicitly since audio creates non-daemon threads.
System.exit(0);
} // main
public static void playAudioFile(String fileName) {
File soundFile = new File(fileName);
try {
// Create a stream from the given file.
// Throws IOException or UnsupportedAudioFileException
AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(soundFile);
// AudioSystem.getAudioInputStream(inputStream); // alternate audio stream from inputstream
playAudioStream(audioInputStream);
} catch (Exception e) {
System.out.println("Problem with file " + fileName + ":");
e.printStackTrace();
}
} // playAudioFile
/** Plays audio from the given audio input stream. */
public static void playAudioStream(AudioInputStream audioInputStream) {
// Audio format provides information like sample rate, size, channels.
AudioFormat audioFormat = audioInputStream.getFormat();
System.out.println("Play input audio format=" + audioFormat);
// Open a data line to play our type of sampled audio.
// Use SourceDataLine for play and TargetDataLine for record.
DataLine.Info info = new DataLine.Info(SourceDataLine.class, audioFormat);
if (!AudioSystem.isLineSupported(info)) {
System.out.println("Play.playAudioStream does not handle this type of audio on this system.");
return;
}
try {
// Create a SourceDataLine for play back (throws LineUnavailableException).
SourceDataLine dataLine = (SourceDataLine) AudioSystem.getLine(info);
// System.out.println("SourceDataLine class=" + dataLine.getClass());
// The line acquires system resources (throws LineAvailableException).
dataLine.open(audioFormat);
// Adjust the volume on the output line.
if(dataLine.isControlSupported(FloatControl.Type.MASTER_GAIN)) {
FloatControl volume = (FloatControl) dataLine.getControl(FloatControl.Type.MASTER_GAIN);
volume.setValue(6.0F);
}
// Allows the line to move data in and out to a port.
dataLine.start();
// Create a buffer for moving data from the audio stream to the line.
int bufferSize = (int) audioFormat.getSampleRate() * audioFormat.getFrameSize();
byte [] buffer = new byte[ bufferSize ];
// Move the data until done or there is an error.
try {
int bytesRead = 0;
while (bytesRead >= 0) {
bytesRead = audioInputStream.read(buffer, 0, buffer.length);
if (bytesRead >= 0) {
// System.out.println("Play.playAudioStream bytes read=" + bytesRead +
// ", frame size=" + audioFormat.getFrameSize() + ", frames read=" + bytesRead / audioFormat.getFrameSize());
// Odd sized sounds throw an exception if we don't write the same amount.
int framesWritten = dataLine.write(buffer, 0, bytesRead);
}
} // while
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("Play.playAudioStream draining line.");
// Continues data line I/O until its buffer is drained.
dataLine.drain();
System.out.println("Play.playAudioStream closing line.");
// Closes the data line, freeing any resources such as the audio device.
dataLine.close();
} catch (LineUnavailableException e) {
e.printStackTrace();
}
} // playAudioStream
} // PlayAudioFile
Recording Audio to a File
Most of the work here is in preparing the audio input stream. Once that is done, the AudioSystem method write() copies the input from the audio input stream to the output file.
To prepare the audio input stream, perform these steps:
- Create an AudioFormat object describing the input parameters.
- The microphone produces the audio, so it needs a TargetDataLine; therefore, create a DataLine.Info for a target data line.
- Ask the AudioSystem for a line satisfying that info.
- Wrap the line in an AudioInputStream.
The output is just a Java File. Then use the AudioSystem function write() to copy the stream to the file. Figure 9-2 shows the UML class diagram.
Figure 9-2.
UML diagram for recording audio to a file
The program is as follows:
import javax.sound.sampled.*;
import java.io.File;
/**
* Sample audio recorder
*/
public class Recorder extends Thread
{
/**
* The TargetDataLine that we’ll use to read data from
*/
private TargetDataLine line;
/**
* The audio format type that we’ll encode the audio data with
*/
private AudioFileFormat.Type targetType = AudioFileFormat.Type.WAVE;
/**
* The AudioInputStream that we’ll read the audio data from
*/
private AudioInputStream inputStream;
/**
* The file that we’re going to write data out to
*/
private File file;
/**
* Creates a new Audio Recorder
*/
public Recorder(String outputFilename)
{
try {
// Create an AudioFormat that specifies how the recording will be performed
// In this example we'll use 44.1KHz, 16-bit, stereo
AudioFormat audioFormat = new AudioFormat(
AudioFormat.Encoding.PCM_SIGNED, // Encoding technique
44100.0F, // Sample Rate
16, // Number of bits in each channel
2, // Number of channels (2=stereo)
4, // Number of bytes in each frame
44100.0F, // Number of frames per second
false); // Big-endian (true) or little-
// endian (false)
// Create our TargetDataLine that will be used to read audio data by first
// creating a DataLine instance for our audio format type
DataLine.Info info = new DataLine.Info(TargetDataLine.class, audioFormat);
// Next we ask the AudioSystem to retrieve a line that matches the
// DataLine Info
this.line = (TargetDataLine)AudioSystem.getLine(info);
// Open the TargetDataLine with the specified format
this.line.open(audioFormat);
// Create an AudioInputStream that we can use to read from the line
this.inputStream = new AudioInputStream(this.line);
// Create the output file
this.file = new File(outputFilename);
}
catch(Exception e) {
e.printStackTrace();
}
}
public void startRecording() {
// Start the TargetDataLine
this.line.start();
// Start our thread
start();
}
public void stopRecording() {
// Stop and close the TargetDataLine
this.line.stop();
this.line.close();
}
public void run() {
try {
// Ask the AudioSystem class to write audio data from the audio input stream
// to our file in the specified data type (PCM 44.1Khz, 16-bit, stereo)
AudioSystem.write(this.inputStream, this.targetType, this.file);
}
catch(Exception e) {
e.printStackTrace();
}
}
public static void main(String[] args) {
if (args.length == 0) {
System.out.println("Usage: Recorder <filename>");
System.exit(0);
}
try {
// Create a recorder that writes WAVE data to the specified filename
Recorder r = new Recorder(args[0]);
System.out.println("Press ENTER to start recording");
System.in.read();
// Start the recorder
r.startRecording();
System.out.println("Press ENTER to stop recording");
System.in.read();
// Stop the recorder
r.stopRecording();
System.out.println("Recording complete");
}
catch(Exception e) {
e.printStackTrace();
}
}
}
Playing the Microphone to a Speaker
This combines the previous two programs. An AudioInputStream is prepared for reading from the microphone, and a SourceDataLine is prepared for writing to the speaker. Data is copied from the first to the second by reading from the audio input stream and writing to the source data line. Figure 9-3 shows the UML class diagram.
Figure 9-3.
UML diagram for sending microphone input to a speaker
The program is as follows:
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.Line;
import javax.sound.sampled.Line.Info;
import javax.sound.sampled.TargetDataLine;
import javax.sound.sampled.FloatControl;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
public class PlayMicrophone {
private static final int FRAMES_PER_BUFFER = 1024;
public static void main(String[] args) throws Exception {
new PlayMicrophone().
playAudio();
}
private void out(String strMessage)
{
System.out.println(strMessage);
}
//This method creates and returns an
// AudioFormat object for a given set of format
// parameters. If these parameters don't work
// well for you, try some of the other
// allowable parameter values, which are shown
// in comments following the declarations.
private AudioFormat getAudioFormat(){
float sampleRate = 44100.0F; //8000,11025,16000,22050,44100
int sampleSizeInBits = 16; //8,16
int channels = 1; //1,2
boolean signed = true; //true,false
boolean bigEndian = false; //true,false
return new AudioFormat(sampleRate,
sampleSizeInBits,
channels,
signed,
bigEndian);
}//end getAudioFormat
public void playAudio() throws Exception {
AudioFormat audioFormat;
TargetDataLine targetDataLine;
audioFormat = getAudioFormat();
DataLine.Info dataLineInfo =
new DataLine.Info(
TargetDataLine.class,
audioFormat);
targetDataLine = (TargetDataLine)
AudioSystem.getLine(dataLineInfo);
/*
Line.Info lines[] = AudioSystem.getTargetLineInfo(dataLineInfo);
for (int n = 0; n < lines.length; n++) {
System.out.println("Target " + lines[n].toString() + " " + lines[n].getLineClass());
}
targetDataLine = (TargetDataLine)
AudioSystem.getLine(lines[0]);
*/
targetDataLine.open(audioFormat,
audioFormat.getFrameSize() * FRAMES_PER_BUFFER);
targetDataLine.start();
playAudioStream(new AudioInputStream(targetDataLine));
/*
File soundFile = new File( fileName );
try {
// Create a stream from the given file.
// Throws IOException or UnsupportedAudioFileException
AudioInputStream audioInputStream = AudioSystem.getAudioInputStream( soundFile );
// AudioSystem.getAudioInputStream( inputStream ); // alternate audio stream from inputstream
playAudioStream( audioInputStream );
} catch ( Exception e ) {
System.out.println( "Problem with file " + fileName + ":" );
e.printStackTrace();
}
*/
} // playAudioFile
/** Plays audio from the given audio input stream. */
public void playAudioStream( AudioInputStream audioInputStream ) {
// Audio format provides information like sample rate, size, channels.
AudioFormat audioFormat = audioInputStream.getFormat();
System.out.println( "Play input audio format=" + audioFormat );
// Open a data line to play our type of sampled audio.
// Use SourceDataLine for play and TargetDataLine for record.
DataLine.Info info = new DataLine.Info( SourceDataLine.class, audioFormat );
Line.Info lines[] = AudioSystem.getSourceLineInfo(info);
for (int n = 0; n < lines.length; n++) {
System.out.println("Source " + lines[n].toString() + " " + lines[n].getLineClass());
}
if ( !AudioSystem.isLineSupported( info ) ) {
System.out.println( "Play.playAudioStream does not handle this type of audio on this system." );
return;
}
try {
// Create a SourceDataLine for play back (throws LineUnavailableException).
SourceDataLine dataLine = (SourceDataLine) AudioSystem.getLine( info );
// System.out.println( "SourceDataLine class=" + dataLine.getClass() );
// The line acquires system resources (throws LineAvailableException).
dataLine.open( audioFormat,
audioFormat.getFrameSize() * FRAMES_PER_BUFFER);
// Adjust the volume on the output line.
if( dataLine.isControlSupported( FloatControl.Type.MASTER_GAIN ) ) {
FloatControl volume = (FloatControl) dataLine.getControl( FloatControl.Type.MASTER_GAIN );
volume.setValue( 6.0F );
}
// Allows the line to move data in and out to a port.
dataLine.start();
// Create a buffer for moving data from the audio stream to the line.
int bufferSize = (int) audioFormat.getSampleRate() * audioFormat.getFrameSize();
bufferSize = audioFormat.getFrameSize() * FRAMES_PER_BUFFER;
System.out.println("Buffer size: " + bufferSize);
byte [] buffer = new byte[ bufferSize ];
// Move the data until done or there is an error.
try {
int bytesRead = 0;
while ( bytesRead >= 0 ) {
bytesRead = audioInputStream.read( buffer, 0, buffer.length );
if ( bytesRead >= 0 ) {
System.out.println( "Play.playAudioStream bytes read=" + bytesRead +
", frame size=" + audioFormat.getFrameSize() + ", frames read=" + bytesRead / audioFormat.getFrameSize() );
// Odd sized sounds throw an exception if we don't write the same amount.
int framesWritten = dataLine.write( buffer, 0, bytesRead );
}
} // while
} catch ( IOException e ) {
e.printStackTrace();
}
System.out.println( "Play.playAudioStream draining line." );
// Continues data line I/O until its buffer is drained.
dataLine.drain();
System.out.println( "Play.playAudioStream closing line." );
// Closes the data line, freeing any resources such as the audio device.
dataLine.close();
} catch ( LineUnavailableException e ) {
e.printStackTrace();
}
} // playAudioStream
}
Where Does Java Sound Get Its Devices?
The first program in this chapter showed a list of mixer devices and their properties. How does Java get this information? This section covers JDK 1.8; OpenJDK presumably works in a similar way. You will need the Java source code from Oracle to trace this, or you can just skip ahead.
The file jre/lib/resources.jar contains a list of resources used by the JRE at runtime. It is a zip file and contains the file META-INF/services/javax.sound.sampled.spi.MixerProvider. On my system, the contents of this file are as follows:
# last mixer is default mixer
com.sun.media.sound.PortMixerProvider
com.sun.media.sound.DirectAudioDeviceProvider
The class com.sun.media.sound.PortMixerProvider is in the file java/media/src/share/native/com/sun/media/sound/PortMixerProvider.java on my system. It extends MixerProvider and implements methods such as Mixer.Info[] getMixerInfo. This class stores the device information.
Most of the work done by this class is actually performed by native methods in the C file java/media/src/share/native/com/sun/media/sound/PortMixerProvider.c, which implements the two methods nGetNumDevices and nNewPortMixerInfo used by the PortMixerProvider class. Unfortunately, there is not much joy to be found in this C file, as it just calls the C functions PORT_GetPortMixerCount and PORT_GetPortMixerDescription.
Three files contain these functions:
java/media/src/windows/native/com/sun/media/sound/PLATFORM_API_WinOS_Ports.c
java/media/src/solaris/native/com/sun/media/sound/PLATFORM_API_SolarisOS_Ports.c
java/media/src/solaris/native/com/sun/media/sound/PLATFORM_API_LinuxOS_ALSA_Ports.c
In the file PLATFORM_API_LinuxOS_ALSA_Ports.c, you will see the function calls into ALSA described in Chapter 5. These calls fill in the information about ALSA devices for use by Java Sound.
Conclusion
The Java Sound API is well documented. I have shown four simple programs here, but far more complex ones are possible. The link down to the underlying sound system was discussed briefly.
10. GStreamer
GStreamer is a library of components that can be hooked together in complex pipelines. It can be used for filtering, converting formats, and mixing. It can handle both audio and video formats, but this chapter deals only with audio. It looks at the user-level mechanisms for using GStreamer and also at the programming model for linking GStreamer components. A reference is given for writing new components.
Resources
Here are some resources:
- "Multipurpose multimedia processing with GStreamer" (www.ibm.com/developerworks/aix/library/au-gstreamer.html?ca=dgr-lnxw07GStreamer)
- The GStreamer plugin reference (http://gstreamer.freedesktop.org/documentation/)
- The "GStreamer Plugin Writer's Guide" (1.9.90) (https://gstreamer.freedesktop.org/data/doc/gstreamer/head/pwg/html/index.html)
Overview
GStreamer uses a pipeline model that connects elements: sources, filters, and sinks. Figure 10-1 shows the model.
Figure 10-1.
GStreamer pipeline model
Each element has zero or more pads, which may be source pads that produce data or sink pads that consume data, as shown in Figure 10-2.
Figure 10-2.
GStreamer source and sink pads
Pads can be static, or they can be created or destroyed dynamically in response to events. For example, to handle a container file (such as MP4), an element must read enough of the file before it can determine the format of the contained objects, such as H.264 video. Once it has done so, it can create a source pad to deliver the data to the next stage.
GStreamer is not restricted to linear pipelines like those of the command language bash. For example, a demultiplexer may need to split audio and video apart and process them separately, as shown in Figure 10-3.
Figure 10-3.
Complex GStreamer pipeline
Elements follow a state model, as follows:
GST_STATE_NULL
GST_STATE_READY
GST_STATE_PAUSED
GST_STATE_PLAYING
Typically, an element is created and then moved from NULL to PLAYING. The other states allow finer control, as in the sketch that follows.
Elements can also generate events that carry information about the state of the data stream. Events are usually handled internally, but they can also be monitored, such as events signaling the end of the stream or the format of the stream.
A plugin is a loadable block of code. Usually a plugin contains the implementation of a single element, but it may contain more.
Each pad has a list of capabilities associated with it. Each capability is a statement about what the pad can handle, covering information such as the data type (for example, audio/raw), format (S32LE, U32LE, S16LE, U16LE, and so on), and data rate (for example, 1 to 2147483647 bits per second). When a source pad is linked to a sink pad, these capabilities are used to determine how the elements will communicate; a sketch of querying them from C follows.
Command-Line Processing
There are three levels of working with GStreamer: using the command line, writing C programs (or Python, Perl, C++, and so on) that link elements together, or writing new elements. This section covers the command-line tools.
gst-inspect
Run with no arguments, the command gst-inspect (gst-inspect-1.0 on my Ubuntu system) shows a list of plugins with their elements and short descriptions. Here is a brief extract:
...
audiomixer: liveadder: AudioMixer
audioparsers: aacparse: AAC audio stream parser
audioparsers: ac3parse: AC3 audio stream parser
audioparsers: amrparse: AMR audio stream parser
audioparsers: dcaparse: DTS Coherent Acoustics audio stream parser
audioparsers: flacparse: FLAC audio parser
audioparsers: mpegaudioparse: MPEG1 Audio Parser
audioparsers: sbcparse: SBC audio parser
audioparsers: wavpackparse: Wavpack audio stream parser
audiorate: audiorate: Audio rate adjuster
...
This shows that the plugin audioparsers contains a number of elements such as aacparse, which is an "AAC audio stream parser."
When run with a plugin as the argument, gst-inspect shows more detail about the plugin.
$gst-inspect-1.0 audioparsers
Plugin Details:
Name audioparsers
Description Parsers for various audio formats
Filename /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstaudioparsers.so
Version 1.8.1
License LGPL
Source module gst-plugins-good
Source release date 2016-04-20
Binary package GStreamer Good Plugins (Ubuntu)
Origin URL https://launchpad.net/distros/ubuntu/+source/gst-plugins-good1.0
aacparse: AAC audio stream parser
amrparse: AMR audio stream parser
ac3parse: AC3 audio stream parser
dcaparse: DTS Coherent Acoustics audio stream parser
flacparse: FLAC audio parser
mpegaudioparse: MPEG1 Audio Parser
sbcparse: SBC audio parser
wavpackparse: Wavpack audio stream parser
8 features:
+-- 8 elements
Note in particular that it comes from the module gst-plugins-good. Plugins are classified by stability, licensing, and so on.
When run with an element as the argument, gst-inspect shows extensive information about the element.
$gst-inspect-1.0 aacparse
Factory Details:
Rank primary + 1 (257)
Long-name AAC audio stream parser
Klass Codec/Parser/Audio
Description Advanced Audio Coding parser
Author Stefan Kost <stefan.kost@nokia.com>
Plugin Details:
Name audioparsers
Description Parsers for various audio formats
Filename /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstaudioparsers.so
Version 1.8.1
License LGPL
Source module gst-plugins-good
Source release date 2016-04-20
Binary package GStreamer Good Plugins (Ubuntu)
Origin URL https://launchpad.net/distros/ubuntu/+source/gst-plugins-good1.0
GObject
+----GInitiallyUnowned
+----GstObject
+----GstElement
+----GstBaseParse
+----GstAacParse
Pad Templates:
SINK template: 'sink'
Availability: Always
Capabilities:
audio/mpeg
mpegversion: { 2, 4 }
SRC template: 'src'
Availability: Always
Capabilities:
audio/mpeg
framed: true
mpegversion: { 2, 4 }
stream-format: { raw, adts, adif, loas }
Element Flags:
no flags set
Element Implementation:
Has change_state() function: gst_base_parse_change_state
Element has no clocking capabilities.
Element has no URI handling capabilities.
Pads:
SINK: 'sink'
Pad Template: 'sink'
SRC: 'src'
Pad Template: 'src'
Element Properties:
name : The name of the object
flags: readable, writable
String. Default: "aacparse0"
parent : The parent of the object
flags: readable, writable
Object of type "GstObject"
disable-passthrough : Force processing (disables passthrough)
flags: readable, writable
Boolean. Default: false
This shows that it can take audio/mpeg version 2 or 4 and convert it to framed audio/mpeg version 2 or 4 in a variety of stream formats.
gst-discoverer
The command gst-discoverer (gst-discoverer-1.0 on my system) can be used to give information about a resource such as a file or a URI. On an audio file called audio_01.ogg, it gives the following:
$gst-discoverer-1.0 enigma/audio_01.ogg
Analyzing file:enigma/audio_01.ogg
Done discovering file:enigma/audio_01.ogg
Topology:
container: Ogg
audio: Vorbis
Properties:
Duration: 0:02:03.586666666
Seekable: yes
Tags:
encoder: Xiph.Org libVorbis I 20020717
encoder version: 0
audio codec: Vorbis
nominal bitrate: 112001
bitrate: 112001
container format: Ogg
gst-device-monitor
This command can give copious information about the devices on your system:
$gst-device-monitor-1.0
Probing devices...
Device found:
name : Monitor of Built-in Audio Digital Stereo (HDMI)
class : Audio/Source
caps : audio/x-raw, format=(string){ S16LE, S16BE, F32LE, F32BE, S32LE, S32BE, S24LE, S24BE, S24_32LE, S24_32BE, U8 }, layout=(string)interleaved, rate=(int)[ 1, 2147483647 ], channels=(int)[ 1, 32 ];
audio/x-alaw, rate=(int)[ 1, 2147483647 ], channels=(int)[ 1, 32 ];
audio/x-mulaw, rate=(int)[ 1, 2147483647 ], channels=(int)[ 1, 32 ];
properties:
device.description = "Monitor\ of\ Built-in\ Audio\ Digital\ Stereo\ \(HDMI\)"
device.class = monitor
alsa.card = 0
alsa.card_name = "HDA\ Intel\ HDMI"
alsa.long_card_name = "HDA\ Intel\ HDMI\ at\ 0xf7214000\ irq\ 52"
alsa.driver_name = snd_hda_intel
device.bus_path = pci-0000:00:03.0
sysfs.path = /devices/pci0000:00/0000:00:03.0/sound/card0
device.bus = pci
device.vendor.id = 8086
device.vendor.name = "Intel\ Corporation"
device.product.id = 160c
device.product.name = "Broadwell-U\ Audio\ Controller"
device.form_factor = internal
device.string = 0
module-udev-detect.discovered = 1
device.icon_name = audio-card-pci
...
That is a mass of information about the audio capabilities of my HDMI monitor, followed by similar information about the audio and video capabilities of my other devices.
gst-play
This program is a one-stop player for a variety of media files and URIs, as follows:
$gst-play-1.0 enigma/audio_01.ogg
gst-launch
The gst-launch program allows you to build a command pipeline to process media data. The format is as follows:
gst-launch <elmt> [<args>] ! <elmt> [<args>] ! ...
For example, to play a WAV file through ALSA, use the following:
$gst-launch-1.0 filesrc location=enigma/audio_01.wav ! wavparse ! alsasink
The hardest part of working with GStreamer pipelines seems to be choosing suitable plugins; it looks to be a bit of an art. See the GStreamer cheat sheet at http://wiki.oz9aec.net/index.php/Gstreamer_cheat_sheet for help.
For example, Ogg files are a container format, often containing a Vorbis audio stream and a Theora video stream (although they can contain other data formats). To play the audio, the video, or both, a demultiplexer must extract the streams from the container; they are then decoded and played. There are several ways of playing the audio, including these three:
$gst-launch-1.0 filesrc location=enigma/audio_01.ogg ! oggdemux ! vorbisdec ! audioconvert ! alsasink
$gst-launch-1.0 filesrc location=enigma/audio_01.ogg ! oggdemux ! vorbisdec ! autoaudiosink
$gst-launch-1.0 uridecodebin uri=file:enigma/audio_01.ogg ! audioconvert ! autoaudiosink
The syntax of GStreamer pipelines allows a pipeline to be split into multiple branches, for example to manage the audio and video streams separately, as sketched next. This is covered in the online GStreamer documentation.
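As a hedged sketch of that branching syntax (the element names are standard, but the file name is a placeholder), a demuxer is given a name once and then referred to twice, with queue elements decoupling the branches; gst_parse_launch() accepts the same textual syntax from C:
#include <gst/gst.h>
int main(int argc, char *argv[]) {
    gst_init(&argc, &argv);
    GError *error = NULL;
    /* one demuxer, two branches: "demux." refers back to the named element */
    GstElement *pipeline = gst_parse_launch(
        "filesrc location=movie.ogg ! oggdemux name=demux "
        "demux. ! queue ! vorbisdec ! audioconvert ! autoaudiosink "
        "demux. ! queue ! theoradec ! videoconvert ! autovideosink",
        &error);
    if (pipeline == NULL) {
        g_printerr("Parse error: %s\n", error->message);
        g_clear_error(&error);
        return 1;
    }
    gst_element_set_state(pipeline, GST_STATE_PLAYING);
    GstBus *bus = gst_element_get_bus(pipeline);
    GstMessage *msg = gst_bus_timed_pop_filtered(bus, GST_CLOCK_TIME_NONE,
        GST_MESSAGE_ERROR | GST_MESSAGE_EOS);
    if (msg != NULL) gst_message_unref(msg);
    gst_object_unref(bus);
    gst_element_set_state(pipeline, GST_STATE_NULL);
    gst_object_unref(pipeline);
    return 0;
}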
Programming
The same pipeline principles apply as with gst-launch, but of course there is much more plumbing to look after at the C programming level. The following program, from the GStreamer SDK basic tutorials at http://docs.gstreamer.com/display/GstSDK/Basic+tutorials, does the same as the last gst-launch example ($gst-launch-1.0 uridecodebin uri=... ! audioconvert ! autoaudiosink).
GStreamer elements are created by calls such as this:
data.source = gst_element_factory_make ("uridecodebin", "source");
The pipeline is built with this:
data.pipeline = gst_pipeline_new ("test-pipeline")
gst_bin_add_many (GST_BIN (data.pipeline), data.source, data.convert , data.sink, NULL);
Eventually, all the elements must be linked. For now, the convert and sink elements can be linked with the following:
gst_element_link (data.convert, data.sink)
The URI to be played is set as follows:
g_object_set (data.source, "uri", "http://docs.gstreamer.com/media/sintel_trailer-480p.webm", NULL);
The data source is a container; in my earlier example it was an Ogg container, and here it is a web media URL. A source pad will not be created on the data-source element until enough data has been read to determine the format of the data and its parameters. Consequently, the C program must add an event handler for pad-added, which is done like this:
g_signal_connect (data.source, "pad-added", G_CALLBACK (pad_added_handler), &data);
When a pad is added to the source, pad_added_handler will be called. This does a fair amount of type checking and fetches the new pad, but eventually it performs the key step of linking the source and convert elements:
gst_pad_link (new_pad, sink_pad)
The application then starts playback by changing the state to PLAYING and waits for normal termination (GST_MESSAGE_EOS) or other messages:
gst_element_set_state (data.pipeline, GST_STATE_PLAYING);
bus = gst_element_get_bus (data.pipeline);
msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
GST_MESSAGE_STATE_CHANGED | GST_MESSAGE_ERROR | GST_MESSAGE_EOS);
The last part of the code does the cleanup. The complete program is as follows:
#include <gst/gst.h>
/* Structure to contain all our information, so we can pass it to callbacks */
typedef struct _CustomData {
GstElement *pipeline;
GstElement *source;
GstElement *convert;
GstElement *sink;
} CustomData;
/* Handler for the pad-added signal */
static void pad_added_handler (GstElement *src, GstPad *pad, CustomData *data);
int main(int argc, char *argv[]) {
CustomData data;
GstBus *bus;
GstMessage *msg;
GstStateChangeReturn ret;
gboolean terminate = FALSE;
/* Initialize GStreamer */
gst_init (&argc, &argv);
/* Create the elements */
data.source = gst_element_factory_make ("uridecodebin", "source");
data.convert = gst_element_factory_make ("audioconvert", "convert");
data.sink = gst_element_factory_make ("autoaudiosink", "sink");
/* Create the empty pipeline */
data.pipeline = gst_pipeline_new ("test-pipeline");
if (!data.pipeline || !data.source || !data.convert || !data.sink) {
g_printerr ("Not all elements could be created.\n");
return -1;
}
/* Build the pipeline. Note that we are NOT linking the source at this
* point. We will do it later. */
gst_bin_add_many (GST_BIN (data.pipeline), data.source, data.convert , data.sink, NULL);
if (!gst_element_link (data.convert, data.sink)) {
g_printerr ("Elements could not be linked.\n");
gst_object_unref (data.pipeline);
return -1;
}
/* Set the URI to play */
g_object_set (data.source, "uri", "http://docs.gstreamer.com/media/sintel_trailer-480p.webm", NULL);
/* Connect to the pad-added signal */
g_signal_connect (data.source, "pad-added", G_CALLBACK (pad_added_handler), &data);
/* Start playing */
ret = gst_element_set_state (data.pipeline, GST_STATE_PLAYING);
if (ret == GST_STATE_CHANGE_FAILURE) {
g_printerr ("Unable to set the pipeline to the playing state.\n");
gst_object_unref (data.pipeline);
return -1;
}
/* Listen to the bus */
bus = gst_element_get_bus (data.pipeline);
do {
msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
GST_MESSAGE_STATE_CHANGED | GST_MESSAGE_ERROR | GST_MESSAGE_EOS);
/* Parse message */
if (msg != NULL) {
GError *err;
gchar *debug_info;
switch (GST_MESSAGE_TYPE (msg)) {
case GST_MESSAGE_ERROR:
gst_message_parse_error (msg, &err, &debug_info);
g_printerr ("Error received from element %s: %s\n", GST_OBJECT_NAME (msg->src), err->message);
g_printerr ("Debugging information: %s\n", debug_info ? debug_info : "none");
g_clear_error (&err);
g_free (debug_info);
terminate = TRUE;
break;
case GST_MESSAGE_EOS:
g_print ("End-Of-Stream reached.\n");
terminate = TRUE;
break;
case GST_MESSAGE_STATE_CHANGED:
/* We are only interested in state-changed messages from the pipeline */
if (GST_MESSAGE_SRC (msg) == GST_OBJECT (data.pipeline)) {
GstState old_state, new_state, pending_state;
gst_message_parse_state_changed (msg, &old_state, &new_state, &pending_state);
g_print ("Pipeline state changed from %s to %s:\n",
gst_element_state_get_name (old_state), gst_element_state_get_name (new_state));
}
break;
default:
/* We should not reach here */
g_printerr ("Unexpected message received.\n");
break;
}
gst_message_unref (msg);
}
} while (!terminate);
/* Free resources */
gst_object_unref (bus);
gst_element_set_state (data.pipeline, GST_STATE_NULL);
gst_object_unref (data.pipeline);
return 0;
}
/* This function will be called by the pad-added signal */
static void pad_added_handler (GstElement *src, GstPad *new_pad, CustomData *data) {
GstPad *sink_pad = gst_element_get_static_pad (data->convert, "sink");
GstPadLinkReturn ret;
GstCaps *new_pad_caps = NULL;
GstStructure *new_pad_struct = NULL;
const gchar *new_pad_type = NULL;
g_print ("Received new pad '%s' from '%s':\n", GST_PAD_NAME (new_pad), GST_ELEMENT_NAME (src));
/* If our converter is already linked, we have nothing to do here */
if (gst_pad_is_linked (sink_pad)) {
g_print (" We are already linked. Ignoring.\n");
goto exit;
}
/* Check the new pad's type */
new_pad_caps = gst_pad_get_caps (new_pad);
new_pad_struct = gst_caps_get_structure (new_pad_caps, 0);
new_pad_type = gst_structure_get_name (new_pad_struct);
if (!g_str_has_prefix (new_pad_type, "audio/x-raw")) {
g_print (" It has type '%s' which is not raw audio. Ignoring.\n", new_pad_type);
goto exit;
}
/* Attempt the link */
ret = gst_pad_link (new_pad, sink_pad);
if (GST_PAD_LINK_FAILED (ret)) {
g_print (" Type is '%s' but link failed.\n", new_pad_type);
} else {
g_print (" Link succeeded (type '%s').\n", new_pad_type);
}
exit:
/* Unreference the new pad's caps, if we got them */
if (new_pad_caps != NULL)
gst_caps_unref (new_pad_caps);
/* Unreference the sink pad */
gst_object_unref (sink_pad);
}
Writing a Plugin
Writing a new GStreamer plugin is a substantial task. The "GStreamer Plugin Writer's Guide" at https://gstreamer.freedesktop.org/data/doc/gstreamer/head/pwg/html/index.html gives extensive advice.
Conclusion
This chapter looked at using GStreamer from the command line and from a sample C program. There is an enormous list of plugins available that can satisfy many of the needs of audio/video developers. I have only scratched the surface of GStreamer, which has many other features, including integration with the GTK toolkit.
11. libao
According to the libao documentation (www.xiph.org/ao/doc/overview.html), "libao is designed to make it easy to do simple audio output using various audio devices and libraries. For this reason, complex audio control features are missing and will probably never be added. However, if you just want to be able to open whatever audio device is available and play sound, libao should be just fine."
Resources
See the following:
- The libao documentation (www.xiph.org/ao/doc/)
libao
libao is a minimal library; it basically just plays audio data. It cannot decode any of the standard file formats: there is no support for WAV, MP3, Vorbis, and so on. You must configure the format parameters of bits, channels, rate, and byte format and then send suitably formatted data to the device. Its principal use is for outputting PCM data, either after decoding by a codec or for playing simple sounds such as sine waves.
Here is a simple example from the libao web site that plays a one-second sine tone:
/*
*
* ao_example.c
*
* Written by Stan Seibert - July 2001
*
* Legal Terms:
*
* This source file is released into the public domain. It is
* distributed without any warranty; without even the implied
* warranty * of merchantability or fitness for a particular
* purpose.
*
* Function:
*
* This program opens the default driver and plays a 440 Hz tone for
* one second.
*
* Compilation command line (for Linux systems):
*
* gcc -lao -ldl -lm -o ao_example ao_example.c
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ao/ao.h>
#include <math.h>
#define BUF_SIZE 4096
int main(int argc, char **argv)
{
ao_device *device;
ao_sample_format format;
int default_driver;
char *buffer;
int buf_size;
int sample;
float freq = 440.0;
int i;
/* -- Initialize -- */
fprintf(stderr, "libao example program\n");
ao_initialize();
/* -- Setup for default driver -- */
default_driver = ao_default_driver_id();
memset(&format, 0, sizeof(format));
format.bits = 16;
format.channels = 2;
format.rate = 44100;
format.byte_format = AO_FMT_LITTLE;
/* -- Open driver -- */
device = ao_open_live(default_driver, &format, NULL /* no options */);
if (device == NULL) {
fprintf(stderr, "Error opening device.\n");
return 1;
}
/* -- Play some stuff -- */
buf_size = format.bits/8 * format.channels * format.rate;
buffer = calloc(buf_size,
sizeof(char));
for (i = 0; i < format.rate; i++) {
sample = (int)(0.75 * 32768.0 *
sin(2 * M_PI * freq * ((float) i/format.rate)));
/* Put the same stuff in left and right channel */
buffer[4*i] = buffer[4*i+2] = sample & 0xff;
buffer[4*i+1] = buffer[4*i+3] = (sample >> 8) & 0xff;
}
ao_play(device, buffer, buf_size);
/* -- Close and shutdown -- */
ao_close(device);
ao_shutdown();
return (0);
}
Conclusion
libao is not complex; it is a basic library for playing sound on whatever device is available. It suits situations where you have sound in a known PCM format.
12. FFmpeg/Libav
According to the "FFmpeg Beginner's Tutorial" (http://keycorner.org/pub/text/doc/ffmpegtutorial.htm), FFmpeg is a complete, cross-platform command-line tool capable of recording, converting, and streaming digital audio and video in various formats. It can be used to do most multimedia tasks quickly and easily, such as audio compression, audio/video format conversion, and extracting images from a video.
FFmpeg consists of a set of command-line tools and a set of libraries that can be used to convert audio (and video) files from one format to another. It works both on containers and on codecs. It is not designed for playing or recording audio; it is more of a general-purpose conversion tool.
Resources
- The FFmpeg home page (http://ffmpeg.org/)
- The FFmpeg documentation (http://ffmpeg.org/ffmpeg.html)
- The Libav home page (https://libav.org/)
- An FFmpeg and SDL tutorial (http://dranger.com/ffmpeg/), with updated code at https://github.com/chelyaev/ffmpeg-tutorial
The FFmpeg/Libav Dispute
FFmpeg began in 2000 to provide libraries and programs for handling multimedia data. However, disputes among the developers led to a fork, the Libav project, in 2011. Since then the two projects have continued almost in parallel, often borrowing from each other. Nevertheless, the situation remains fraught, and a resolution seems unlikely.
This is unfortunate for developers. Although programs can often be ported between the two systems, there are sometimes differences in the APIs and in behavior. There is also the issue of distribution support: for years, Debian and its derivatives supported only Libav and ignored FFmpeg. That has since changed, and both are now supported. See "Why Debian returned to FFmpeg" (https://lwn.net/Articles/650816/) for a discussion of some of these issues.
The FFmpeg Command-Line Tools
The principal FFmpeg tool is ffmpeg itself. Its simplest use is as a converter from one format to another, as follows:
ffmpeg -i file.ogg file.mp3
This converts an Ogg container of Vorbis codec data into an MPEG container of MP3 codec data.
The Libav equivalent is avconv, which runs similarly:
avconv -i file.ogg file.mp3
Internally, ffmpeg uses a pipeline of modules, as shown in Figure 12-1.
Figure 12-1.
FFmpeg/Libav pipeline (Source: http://ffmpeg.org/ffmpeg.html
)
If the defaults are not suitable, the multiplexer/demultiplexer and decoder/encoder can be set with options.
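For example (a hedged illustration: the codec and bit-rate flags are standard ffmpeg options, but the values are arbitrary), the following forces the MP3 encoder and a particular bit rate instead of relying on the defaults:
ffmpeg -i file.ogg -c:a libmp3lame -b:a 192k file.mp3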
Here are the other commands:
- ffprobe gives information about a file.
- ffplay is a simple media player.
- ffserver is a media server.
Design
There are many libraries for FFmpeg/Libav programming. Libav builds the following libraries:
- libavcodec
- libavdevice
- libavfilter
- libavformat
- libavresample
- libavutil
FFmpeg builds the following:
- libavcodec
- libavdevice
- libavfilter
- libavformat
- libavresample
- libavutil
- libpostproc
- libswresample
- libswscale
The extra libraries in FFmpeg are for video post-processing and scaling.
Neither of these systems is simple to use. The Libav site states, "Libav is always a very experimental and developer-driven project. It is a key component in many multimedia projects and new features are added constantly. To provide a stable foundation, major releases are cut every four to six months and maintained for at least two years."
The FFmpeg site states, "FFmpeg is always a very experimental and developer-driven project. It is a key component in many multimedia projects and new features are added constantly. Development branch snapshots work really well 99% of the time so people are not afraid to use them."
My experience is that the "experimental" nature of both projects leads to an unstable core API, with key functions regularly deprecated and replaced. For example, the function avcodec_decode_audio of libavcodec is, by version 56, up to its fourth version, avcodec_decode_audio4. Even that is now deprecated in the upstream releases (version 57) of both FFmpeg and Libav, in favor of functions such as avcodec_send_packet, which did not exist in version 56. On top of that, there are two projects with the same goals and largely, but not always, the same API. For example, FFmpeg has swr_alloc_set_opts, while Libav uses av_opt_set_int. And the A/V codecs and containers themselves are continually evolving.
The upshot is that many of the example programs on the Internet no longer compile, use deprecated APIs, or belong to the "other" system. This is not to denigrate two superbly accomplished systems; I just wish it wasn't such a mess.
Decoding an MP3 File
The following program decodes an MP3 file to a raw PCM file. It is about the simplest task you can do with FFmpeg/Libav, and unfortunately it is not simple. First, you need to be aware that you are dealing with a codec, not with a file that contains codec data. This is not an FFmpeg/Libav issue but a general one.
Files with the extension .mpg or .mp3 can contain many different formats. If I run the command file over some of the files I have, I get differing results:
BST.mp3: MPEG ADTS, layer III, v1, 128 kbps, 44.1 kHz, Stereo
Beethoven_Fr_Elise.mp3: MPEG ADTS, layer III, v1, 128 kbps, 44.1 kHz, Stereo
Angel-no-vocal.mp3: Audio file with ID3 version 2.3.0
01DooWackaDoo.mp3: Audio file with ID3 version 2.3.0, \
contains: MPEG ADTS, layer III, v1, 224 kbps, 44.1 kHz, JntStereo
The first two files contain just codec data and can be managed by the program that follows. The third and fourth are container files, holding MPEG plus ID3 data. These need to be managed using avformat functions such as av_read_frame.1
The program is essentially a standard example from the FFmpeg/Libav source distributions. It is based on ffmpeg-3.2/doc/examples/decoding_encoding.c in the FFmpeg source and libav-12/doc/examples/avcodec.c in the Libav source. In passing, both programs use avcodec_decode_audio4, which is deprecated in both upstream versions, and neither has an example of the replacement function avcodec_send_packet.
A more serious issue is that MP3 files increasingly use a planar format, in which the different channels are in different planes. The FFmpeg/Libav function avcodec_decode_audio4 handles this correctly by placing each plane in a separate data array, but when the result is output as PCM data, the planes must be interleaved. The examples do not do this, which can produce incorrect PCM data (lots of clicks followed by half-speed audio).
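As a minimal sketch of that interleaving step (assuming signed 16-bit planar data, AV_SAMPLE_FMT_S16P, with one plane per channel in the frame's data arrays), the loop looks like this:
#include <stdio.h>
#include <stdint.h>
#include <libavcodec/avcodec.h>
/* Sketch only: interleave planar 16-bit samples for raw PCM output. */
static void write_interleaved_s16p(AVFrame *frame, int channels, FILE *out) {
    int n, ch;
    for (n = 0; n < frame->nb_samples; n++) {
        for (ch = 0; ch < channels; ch++) {
            /* each plane is an array of int16_t samples for one channel */
            int16_t sample = ((int16_t *) frame->data[ch])[n];
            fwrite(&sample, sizeof sample, 1, out);
        }
    }
}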
The relevant FFmpeg functions are as follows:
- av_register_all: Registers all the possible muxers, demuxers, and protocols.
- avformat_open_input: Opens the input stream.
- av_find_stream_info: Extracts the stream information.
- av_init_packet: Sets default values in a packet.
- avcodec_find_decoder: Finds a suitable decoder.
- avcodec_alloc_context3: Sets default values in the main data structure.
- avcodec_open2: Opens the decoder.
- fread: The FFmpeg processing loop reads a buffer at a time from the data stream.
- avcodec_decode_audio4: Decodes an audio frame into raw audio data.
The rest of the code interleaves the planar data for output to the PCM file. The resulting file can be played with the following:
aplay -c 2 -r 44100 /tmp/test.sw -f S16_LE
The program is as follows:
/*
* copyright (c) 2001 Fabrice Bellard
*
* This file is part of Libav.
*
* Libav is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* Libav is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with Libav; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
// From http://code.haskell.org/∼thielema/audiovideo-example/cbits/
// Adapted to version version 2.8.6-1ubuntu2 by Jan Newmarch
/**
* @file
* libavcodec API use example.
*
* @example libavcodec/api-example.c
* Note that this library only handles codecs (mpeg, mpeg4, etc...),
* not file formats (avi, vob, etc...). See library 'libavformat' for the
* format handling
*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#ifdef HAVE_AV_CONFIG_H
#undef HAVE_AV_CONFIG_H
#endif
#include "libavcodec/avcodec.h"
#include <libavformat/avformat.h>
#define INBUF_SIZE 4096
#define AUDIO_INBUF_SIZE 20480
#define AUDIO_REFILL_THRESH 4096
void die(char *s) {
fputs(s, stderr);
exit(1);
}
/*
* Audio decoding.
*/
static void audio_decode_example(AVFormatContext* container,
const char *outfilename, const char *filename)
{
AVCodec *codec;
AVCodecContext *context = NULL;
int len;
FILE *f, *outfile;
uint8_t inbuf[AUDIO_INBUF_SIZE + FF_INPUT_BUFFER_PADDING_SIZE];
AVPacket avpkt;
AVFrame *decoded_frame = NULL;
int num_streams = 0;
int sample_size = 0;
av_init_packet(&avpkt);
printf("Audio decoding\n");
int stream_id = -1;
// To find the first audio stream. This process may not be necessary
// if you can gurarantee that the container contains only the desired
// audio stream
int i;
for (i = 0; i < container->nb_streams; i++) {
if (container->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO) {
stream_id = i;
break;
}
}
/* find the appropriate audio decoder */
AVCodecContext* codec_context = container->streams[stream_id]->codec;
codec = avcodec_find_decoder(codec_context->codec_id);
if (!codec) {
fprintf(stderr, "codec not found\n");
exit(1);
}
context = avcodec_alloc_context3(codec);
/* open it */
if (avcodec_open2(context, codec, NULL) < 0) {
fprintf(stderr, "could not open codec\n");
exit(1);
}
f = fopen(filename, "rb");
if (!f) {
fprintf(stderr, "could not open %s\n", filename);
exit(1);
}
outfile = fopen(outfilename, "wb");
if (!outfile) {
av_free(context);
exit(1);
}
/* decode until eof */
avpkt.data = inbuf;
avpkt.size = fread(inbuf, 1, AUDIO_INBUF_SIZE, f);
while (avpkt.size > 0) {
int got_frame = 0;
if (!decoded_frame) {
if (!(decoded_frame = av_frame_alloc())) {
fprintf(stderr, "out of memory\n");
exit(1);
}
} else {
av_frame_unref(decoded_frame);
}
printf("Stream idx %d\n", avpkt.stream_index);
len = avcodec_decode_audio4(context, decoded_frame, &got_frame, &avpkt);
if (len < 0) {
fprintf(stderr, "Error while decoding\n");
exit(1);
}
if (got_frame) {
printf("Decoded frame nb_samples %d, format %d\n",
decoded_frame->nb_samples,
decoded_frame->format);
if (decoded_frame->data[1] != NULL)
printf("Data[1] not null\n");
else
printf("Data[1] is null\n");
/* if a frame has been decoded, output it */
int data_size = av_samples_get_buffer_size(NULL, context->channels,
decoded_frame->nb_samples,
context->sample_fmt, 1);
// first time: count the number of planar streams
if (num_streams == 0) {
while (num_streams < AV_NUM_DATA_POINTERS &&
decoded_frame->data[num_streams] != NULL)
num_streams++;
printf("Number of streams %d\n", num_streams);
}
// first time: set sample_size from 0 to e.g 2 for 16-bit data
if (sample_size == 0) {
sample_size =
data_size / (num_streams * decoded_frame->nb_samples);
}
int m, n;
for (n = 0; n < decoded_frame->nb_samples; n++) {
// interleave the samples from the planar streams
for (m = 0; m < num_streams; m++) {
fwrite(&decoded_frame->data[m][n*sample_size],
1, sample_size, outfile);
}
}
}
avpkt.size -= len;
avpkt.data += len;
if (avpkt.size < AUDIO_REFILL_THRESH) {
/* Refill the input buffer, to avoid trying to decode
* incomplete frames. Instead of this, one could also use
* a parser, or use a proper container format through
* libavformat. */
memmove(inbuf, avpkt.data, avpkt.size);
avpkt.data = inbuf;
len = fread(avpkt.data + avpkt.size, 1,
AUDIO_INBUF_SIZE - avpkt.size, f);
if (len > 0)
avpkt.size += len;
}
}
fclose(outfile);
fclose(f);
avcodec_close(context);
av_free(context);
av_free(decoded_frame);
}
int main(int argc, char **argv)
{
const char *filename = "Beethoven_Fr_Elise.mp3";
AVFormatContext *pFormatCtx = NULL;
if (argc == 2) {
filename = argv[1];
}
// Register all formats and codecs
av_register_all();
if(avformat_open_input(&pFormatCtx, filename, NULL, NULL)!=0) {
fprintf(stderr, "Can't get format of file %s\n", filename);
return -1; // Couldn't open file
}
// Retrieve stream information
if(avformat_find_stream_info(pFormatCtx, NULL)<0)
return -1; // Couldn't find stream information
av_dump_format(pFormatCtx, 0, filename, 0);
printf("Num streams %d\n", pFormatCtx->nb_streams);
printf("Bit rate %d\n", pFormatCtx->bit_rate);
audio_decode_example(pFormatCtx, "/tmp/test.sw", filename);
return 0;
}
Conclusion
This chapter gave a brief introduction to FFmpeg/Libav, looking at the libavcodec library. FFmpeg and Libav are far more complex than this and can perform far more elaborate conversions. They can also process video, which is covered in Chapter 15.
Footnotes
1. Examples using av_read_frame are given in Chapters 15 and 21.
13. OpenMAX IL
OpenMAX is an open standard from the Khronos Group for audio and video on low-capability devices. Vendors of cards are expected to produce implementations. There are few implementations for general Linux systems, but Broadcom has implemented one of the specifications (OpenMAX IL), and its chips are used in the Raspberry Pi. The other Khronos specifications (OpenMAX AL and OpenSL ES) have been implemented for Android devices and are accessible through the Native Development Kit (NDK), but they are not meant to be used directly; they are intended for use only through the Java APIs. They are not covered in this book. This chapter discusses only OpenMAX IL.
Resources
Here are some resources:
- "The OpenMAX Integration Layer (IL) standard" from eLinux (http://elinux.oimg/The_OpenMAX_Integration_Layer_standard.pdf).
- The Khronos home page (www.khronos.org/) offers the specifications as free downloads; they are well done and quite readable.
- The LIM OpenMAX implementation (http://limoa.sourceforge.net/) is a Linux implementation. Version 1.0 can be downloaded as lim-omx-1.0.tar.gz (http://sourceforge.net/projects/limoa/files/1.0/lim-omx-1.0.tar.gz/download).
- The OpenMAX IL Bellagio package (http://omxil.sourceforge.net/) is available as source, DEB packages, and RPMs.
- The Texas Instruments OpenMax Development Guide (http://processors.wiki.ti.com/index.php/OpenMax_Development_Guide).
- "OpenMAX (Open Media Acceleration)" (www.cnx-software.com/2011/11/11/openmax-open-media-acceleration/).
Quotes
Here are some quotes:
- According to jamesh, "Using OpenMAX is a complete nightmare..." (www.raspberrypi.org/forums/viewtopic.php?t=5621).
- According to dom (www.raspberrypi.org/forums/memberlist.php?mode=viewprofile&u=754), "I've written a fair amount of [OpenMAX] client code and find it hard. There's a lot you have to get right before anything useful happens. If you're lucky, you get lots of OMX_ErrorInvalidState and OMX_ErrorBadParameter messages. If you're not, nothing happens at all."
- According to Twinkletoes (www.raspberrypi.org/forums/viewtopic.php?t=6577), "I come from a DirectShow background, which I thought was badly documented... and then I met [OpenMAX]. Lots of PPTs talk about it, but I couldn't find any documentation or code samples."
OpenMAX Concepts
The OpenMAX IL API is quite distinct from the OpenMAX AL API. The basic concept is the component, an audio/video (or other) processing unit of some type, such as a volume control, a mixer, or an output device. Each component has zero or more input and output ports, and each port can have one or more buffers that carry data.
OpenMAX IL is typically used by an A/V framework of some kind, such as OpenMAX AL. Besides OpenMAX AL, there is currently a GStreamer plugin that uses OpenMAX IL underneath. But you can also build stand-alone applications that call the OpenMAX IL API directly; collectively, these are known as IL clients.
The OpenMAX IL API is hard to use directly. Error messages are frequently unhelpful, and threads block without explanation until everything is exactly right; the silent blocking gives you no clue about what is incorrect. In addition, the examples I had to work with did not follow the specification exactly, which led to much wasted time.
OpenMAX IL components use buffers to carry data. A component typically processes data from an input buffer and places it on an output buffer. This processing is invisible to the API, which allows vendors to implement components in hardware or software, to build them on top of other A/V components, and so on. OpenMAX IL supplies mechanisms for setting and getting parameters of components, for calling standard functions on components, and for moving data in and out of components.
Although some OpenMAX IL calls are synchronous, those that can require substantial processing are asynchronous, delivering their results through callback functions. This leads naturally to a multithreaded processing model, although OpenMAX IL does not visibly use any thread library and should be agnostic about how an IL client uses threads. The Bellagio examples use pthreads, while the Broadcom examples for the Raspberry Pi use Broadcom's VideoCore OS (VCOS) threads (https://github.com/raspberrypi/userland/blob/master/interface/vcos/vcos_semaphore.h).
There are two mechanisms for getting data into and out of components. The first is for the IL client to make calls on the component; all components are required to support this mechanism. The second is to set up a tunnel between two components so that data flows along shared buffers; components are not required to support this mechanism.
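As a small hedged sketch of the second mechanism (the two handles are assumed to have been obtained with OMX_GetHandle, and the port numbers are hypothetical placeholders), the core provides OMX_SetupTunnel to join an output port to an input port:
#include <stdio.h>
#include <OMX_Core.h>
/* Sketch only: tunnel a decoder's output port to a renderer's input port.
   131 and 100 are placeholder port numbers for real component ports. */
OMX_ERRORTYPE tunnel_components(OMX_HANDLETYPE hDecoder, OMX_HANDLETYPE hRender) {
    OMX_ERRORTYPE err = OMX_SetupTunnel(hDecoder, 131, hRender, 100);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "OMX_SetupTunnel failed: %x\n", err);
    }
    return err;
}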
OpenMAX IL Components
OpenMAX IL version 1.1.2 lists a number of standard components, including (for audio) a decoder, an encoder, a mixer, a reader, a renderer, a writer, a capturer, and a processor. An IL client gets such a component by calling OMX_GetHandle(), passing in the name of the component. And that is a problem: the components do not have standard names.
The 1.1.2 specification says, "Since components are requested by name, a naming convention is defined. OpenMAX IL component names are zero terminated strings with the following format: OMX.<vendor_name>.<vendor_specified_convention>. For example: OMX.CompanyABC.MP3Decoder.productXYZ. No standardization among component names is dictated across different vendors."
At this point, you have to look at the implementations currently available, because this lack of standardization leads to differences in even the most basic programs.
Implementations
Here are the implementations.
Raspberry Pi
The Raspberry Pi has a Broadcom graphics processing unit (GPU), and Broadcom supports OpenMAX IL. The include files needed to build applications are in /opt/vc/include/IL, /opt/vc/include, and /opt/vc/include/interface/vcos/pthreads. The libraries that need to be linked are openmaxil and bcm_host, in the /opt/vc/lib directory.
The Broadcom libraries need extra code to be called in addition to the standard OpenMAX IL functions. Moreover, OpenMAX IL has many (legitimate) extensions that are not found in the specification or in other implementations. These are described in /opt/vc/include/IL/OMX_Broadcom.h. For these reasons, I define RASPBERRY_PI so that they can be dealt with.
For example, the compile line for listcomponents.c is as follows:
cc -g -DRASPBERRY_PI -I /opt/vc/include/IL -I /opt/vc/include \
-I /opt/vc/include/interface/vcos/pthreads \
-o listcomponents listcomponents.c \
-L /opt/vc/lib -l openmaxil -l bcm_host
The Broadcom implementation is closed source. It appears to be a thin wrapper around its GPU API, of which Broadcom will not release any details. This means you cannot extend the set of components or the supported codecs, since there is no information about how to build new components. Although the component setup is reasonable, at present no codecs other than PCM are supported, nor is non-GPU hardware such as USB sound cards.
OtherCrashOverride (www.raspberrypi.org/phpBB3/viewtopic.php?f=70&t=33101&p=287590#p287590) says he has managed to get the Broadcom components running under the LIM implementation, but I have not confirmed this.
As far as audio goes, the implementation on the Raspberry Pi is quite weak: all audio decoding has to be done in software, and it can play only PCM data. Video is far more impressive and is covered in my book Raspberry Pi GPU Audio Video Programming.
Bellagio
The Bellagio library needs no additional code and no extensions. There are a few minor bugs, so I define BELLAGIO to deal with them. I built from source but did not install it, so the includes and libraries are in an odd location. My compile line is as follows:
cc -g -DBELLAGIO -I ../libomxil-bellagio-0.9.3/include/ \
-o listcomponents listcomponents.c \
-L ../libomxil-bellagio-0.9.3/src/.libs -l omxil-bellagio
Here is the runtime line:
export LD_LIBRARY_PATH=../libomxil-bellagio-0.9.3/src/.libs/
./listcomponents
The Bellagio code is open source.
LIM
Downloading version 1.1 is a nuisance, because the 1.1 download uses a Git repo that has disappeared (as of November 2016). Instead, you have to run the following commands:
git clone git://limoa.git.sourceforge.net/gitroot/limoa/limoi-components
git clone git://limoa.git.sourceforge.net/gitroot/limoa/limoi-core
git clone git://limoa.git.sourceforge.net/gitroot/limoa/limoi-plugins
git clone git://limoa.git.sourceforge.net/gitroot/limoa/limutil
git clone git://limoa.git.sourceforge.net/gitroot/limoa/manifest
You have to copy the root.mk file in the build to the top-level folder containing all the code and rename it Makefile. The root.readme file has the build instructions. Thanks to OtherCrashOverride (www.raspberrypi.org/phpBB3/viewtopic.php?f=70&t=33101&p=286516#p286516) for these instructions.
Building the library hit some minor problems. I had to comment out a few lines in one video file because they referenced nonexistent structure fields, and I had to remove -Werrors from one Makefile.am, or warnings about unused variables would have aborted the compile.
The library build puts its files in a new directory in my HOME. So far I have found a few minor bugs in the implementation. My compile line is as follows:
cc -g -DLIM -I ../../lim-omx-1.1/LIM/limoi-core/include/ \
-o listcomponents listcomponents.c \
-L /home/newmarch/osm-build/lib/ -l limoa -l limoi-core
Here is the runtime line:
export LD_LIBRARY_PATH=/home/newmarch/osm-build/lib/
./listcomponents
The LIM code is open source.
Hardware-Supported Versions
You can find a list of hardware-supported versions at OpenMAX IL Conformant Products (www.khronos.org/conformance/adopters/conformant-products#openmaxil).
Implementations of the Components
The Bellagio library (you need the source package to see these files) lists only two audio components in its README:
- The OMX volume control
- The OMX mixer component
Their names (taken from the example test files) are OMX.st.volume.component and OMX.st.audio.mixer, respectively. The company behind Bellagio is STMicroelectronics (www.st.com/internet/com/home/home.jsp), which explains the st.
The Broadcom OpenMAX IL implementation used on the Raspberry Pi is better documented. If you download the firmware master for the Raspberry Pi, it lists the IL components in the documentation/ilcomponents directory. This lists the components audio_capture, audio_decode, audio_encode, audio_lowpower, audio_mixer, audio_processor, audio_render, and audio_splitter.
Many of the OpenMAX IL function calls in the Broadcom examples are buried inside Broadcom convenience functions, such as the following:
ilclient_create_component(st->client, &st->audio_render,
"audio_render",
ILCLIENT_ENABLE_INPUT_BUFFERS | ILCLIENT_DISABLE_ALL_PORTS);
This wraps OMX_GetHandle(). But at least ilclient.h states, "The provided component name is automatically prefixed with OMX.broadcom. before passing it to the IL core." So, you can conclude that the real name is, for example, OMX.broadcom.audio_render, and so on.
There is a simple way to get the supported components programmatically. First initialize the OpenMAX system with OMX_Init(), and then call OMX_ComponentNameEnum(). For successive index values, it returns a unique name each time, until it finally returns the error value OMX_ErrorNoMore.
Each component may support a number of roles, which are given by OMX_GetRolesOfComponent. The 1.1 specification lists the classes of audio components and their associated roles in section 8.6, "Standard Audio Components." The LIM library matches these; Bellagio and Broadcom do not.
The following program is listcomponents.c:
#include <stdio.h>
#include <stdlib.h>
#include <OMX_Core.h>
#ifdef RASPBERRY_PI
#include <bcm_host.h>
#endif
OMX_ERRORTYPE err;
//extern OMX_COMPONENTREGISTERTYPE OMX_ComponentRegistered[];
void listroles(char *name) {
int n;
OMX_U32 numRoles;
OMX_U8 *roles[32];
/* get the number of roles by passing in a NULL roles param */
err = OMX_GetRolesOfComponent(name, &numRoles, NULL);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Getting roles failed\n", 0);
exit(1);
}
printf(" Num roles is %d\n", numRoles);
if (numRoles > 32) {
printf("Too many roles to list\n");
return;
}
/* now get the roles */
for (n = 0; n < numRoles; n++) {
roles[n] = malloc(OMX_MAX_STRINGNAME_SIZE);
}
err = OMX_GetRolesOfComponent(name, &numRoles, roles);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Getting roles failed\n", 0);
exit(1);
}
for (n = 0; n < numRoles; n++) {
printf(" role: %s\n", roles[n]);
free(roles[n]);
}
/* This is in version 1.2
for (i = 0; OMX_ErrorNoMore != err; i++) {
err = OMX_RoleOfComponentEnum(role, name, i);
if (OMX_ErrorNone == err) {
printf(" Role of omponent is %s\n", role);
}
}
*/
}
int main(int argc, char** argv) {
int i;
unsigned char name[OMX_MAX_STRINGNAME_SIZE];
# ifdef RASPBERRY_PI
bcm_host_init();
# endif
err = OMX_Init();
if (err != OMX_ErrorNone) {
fprintf(stderr, "OMX_Init() failed\n", 0);
exit(1);
}
err = OMX_ErrorNone;
for (i = 0; OMX_ErrorNoMore != err; i++) {
err = OMX_ComponentNameEnum(name, OMX_MAX_STRINGNAME_SIZE, i);
if (OMX_ErrorNone == err) {
printf("Component is %s\n", name);
listroles(name);
}
}
printf("No more components\n");
/*
i= 0 ;
while (1) {
printf("Component %s\n", OMX_ComponentRegistered[i++]);
}
*/
exit(0);
}
The output from the Bellagio library is as follows:
Component is OMX.st.clocksrc
Num roles is 1
role: clocksrc
Component is OMX.st.clocksrc
Num roles is 1
role: clocksrc
Component is OMX.st.video.scheduler
Num roles is 1
role: video.scheduler
Component is OMX.st.video.scheduler
Num roles is 1
role: video.scheduler
Component is OMX.st.volume.component
Num roles is 1
role: volume.component
Component is OMX.st.volume.component
Num roles is 1
role: volume.component
Component is OMX.st.audio.mixer
Num roles is 1
role: audio.mixer
Component is OMX.st.audio.mixer
Num roles is 1
role: audio.mixer
Component is OMX.st.clocksrc
Num roles is 1
role: clocksrc
Component is OMX.st.clocksrc
Num roles is 1
role: clocksrc
Component is OMX.st.video.scheduler
Num roles is 1
role: video.scheduler
Component is OMX.st.video.scheduler
Num roles is 1
role: video.scheduler
Component is OMX.st.volume.component
Num roles is 1
role: volume.component
Component is OMX.st.volume.component
Num roles is 1
role: volume.component
Component is OMX.st.audio.mixer
Num roles is 1
role: audio.mixer
Component is OMX.st.audio.mixer
Num roles is 1
role: audio.mixer
No more components
This is not quite right: the OpenMAX IL specification says that each component should appear only once, not be duplicated.
The Raspberry Pi reports a large number of components but defines roles for none of them:
Component is OMX.broadcom.audio_capture
Num roles is 0
Component is OMX.broadcom.audio_decode
Num roles is 0
Component is OMX.broadcom.audio_encode
Num roles is 0
Component is OMX.broadcom.audio_render
Num roles is 0
Component is OMX.broadcom.audio_mixer
Num roles is 0
Component is OMX.broadcom.audio_splitter
Num roles is 0
Component is OMX.broadcom.audio_processor
Num roles is 0
Component is OMX.broadcom.camera
Num roles is 0
Component is OMX.broadcom.clock
Num roles is 0
Component is OMX.broadcom.coverage
Num roles is 0
Component is OMX.broadcom.egl_render
Num roles is 0
Component is OMX.broadcom.image_fx
Num roles is 0
Component is OMX.broadcom.image_decode
Num roles is 0
Component is OMX.broadcom.image_encode
Num roles is 0
Component is OMX.broadcom.image_read
Num roles is 0
Component is OMX.broadcom.image_write
Num roles is 0
Component is OMX.broadcom.read_media
Num roles is 0
Component is OMX.broadcom.resize
Num roles is 0
Component is OMX.broadcom.source
Num roles is 0
Component is OMX.broadcom.text_scheduler
Num roles is 0
Component is OMX.broadcom.transition
Num roles is 0
Component is OMX.broadcom.video_decode
Num roles is 0
Component is OMX.broadcom.video_encode
Num roles is 0
Component is OMX.broadcom.video_render
Num roles is 0
Component is OMX.broadcom.video_scheduler
Num roles is 0
Component is OMX.broadcom.video_splitter
Num roles is 0
Component is OMX.broadcom.visualisation
Num roles is 0
Component is OMX.broadcom.write_media
Num roles is 0
Component is OMX.broadcom.write_still
Num roles is 0
No more components
The output from LIM is as follows:
Component is OMX.limoi.alsa_sink
Num roles is 1
role: audio_renderer.pcm
Component is OMX.limoi.clock
Num roles is 1
role: clock.binary
Component is OMX.limoi.ffmpeg.decode.audio
Num roles is 8
role: audio_decoder.aac
role: audio_decoder.adpcm
role: audio_decoder.amr
role: audio_decoder.mp3
role: audio_decoder.ogg
role: audio_decoder.pcm
role: audio_decoder.ra
role: audio_decoder.wma
Component is OMX.limoi.ffmpeg.decode.video
Num roles is 7
role: video_decoder.avc
role: video_decoder.h263
role: video_decoder.mjpeg
role: video_decoder.mpeg2
role: video_decoder.mpeg4
role: video_decoder.rv
role: video_decoder.wmv
Component is OMX.limoi.ffmpeg.demux
Num roles is 1
role: container_demuxer.all
Component is OMX.limoi.ffmpeg.encode.audio
Num roles is 2
role: audio_encoder.aac
role: audio_encoder.mp3
Component is OMX.limoi.ffmpeg.encode.video
Num roles is 2
role: video_encoder.h263
role: video_encoder.mpeg4
Component is OMX.limoi.ffmpeg.mux
Num roles is 1
role: container_muxer.all
Component is OMX.limoi.ogg_dec
Num roles is 1
role: audio_decoder_with_framing.ogg
Component is OMX.limoi.sdl.renderer.video
Num roles is 1
role: iv_renderer.yuv.overlay
Component is OMX.limoi.video_scheduler
Num roles is 1
role: video_scheduler.binary
No more components
Getting Information About IL Components
Next, you will see how to get information about the OpenMAX IL system and about any components you use. All IL clients must initialize OpenMAX IL by calling OMX_Init(). Almost all functions return error values, and the style used by Bellagio is as follows:
err = OMX_Init();
if(err != OMX_ErrorNone) {
fprintf(stderr, "OMX_Init() failed\n", 0);
exit(1);
}
This looks to me like a reasonable style, so I follow it in what follows.
The next requirement is to get a handle to a component. This needs the vendor's name for the component, which can be found using the listcomponents.c program shown earlier. The function OMX_GetHandle takes a number of parameters, including a set of callback functions. These are needed to track the behavior of an application but are not required for the examples in this section. This code shows how to get a handle to the Bellagio volume component:
OMX_HANDLETYPE handle;
OMX_CALLBACKTYPE callbacks;
OMX_ERRORTYPE err;
err = OMX_GetHandle(&handle, "OMX.st.volume.component", NULL /*appPriv */, &callbacks);
if(err != OMX_ErrorNone) {
fprintf(stderr, "OMX_GetHandle failed\n", 0);
exit(1);
}
Components have ports, and ports have channels. Getting and setting information about these is done by the functions OMX_GetParameter(), OMX_SetParameter(), OMX_GetConfig(), and OMX_SetConfig(). The ...Parameter calls are made before a component is "loaded"; the ...Config calls are made after it is loaded.
C is not an OO language, so these are ordinary function calls (well, macros, actually). In an OO language they would be methods on an object taking another object as a parameter, as in component.method(object). In OpenMAX IL, the Get/Set functions take the invoking "object" (the component) as the first parameter, an index indicating what type of "object" the method's parameter is, and the structure for the parameter object. The index values relate to the structures listed in Table 4-2 of the 1.1 specification.
These calls take a (pointer to a) structure for the values to be filled in or extracted. The structures are all normalized so that they share common fields, such as the size of the structure. In the Bellagio examples, this is done by a macro setHeader(). The structure passed in to get port information is usually the generic structure of type OMX_PORT_PARAM_TYPE. Some fields can be accessed directly, some need to be cast to more specialized types, and some are buried inside unions and must be extracted.
Ports are labeled by integer indices. Different functions, such as audio, image, and video, have different ports. To get information about the starting value for audio ports, use the following:
setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
err = OMX_GetParameter(handle, OMX_IndexParamAudioInit, &param);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
exit(1);
}
printf("Audio ports start on %d\n",
((OMX_PORT_PARAM_TYPE)param).nStartPortNumber);
printf("There are %d open ports\n",
((OMX_PORT_PARAM_TYPE)param).nPorts);
The macro setHeader just fills in the header information, such as the version number and the size of the data structure.
The capabilities of a particular port can now be queried. You can query the port type (audio or otherwise), its direction (input or output), and information about the supported MIME types.
OMX_PARAM_PORTDEFINITIONTYPE sPortDef;
setHeader(&sPortDef, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
sPortDef.nPortIndex = 0;
err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
exit(1);
}
if (sPortDef.eDomain == OMX_PortDomainAudio) {
printf("Is an audio port\n");
} else {
printf("Is other device port\n");
}
if (sPortDef.eDir == OMX_DirInput) {
printf("Port is an input port\n");
} else {
printf("Port is an output port\n");
}
/* the Audio Port info is buried in a union format.audio within the struct */
printf("Port min buffers %d, mimetype %s, encoding %d\n",
sPortDef.nBufferCountMin,
sPortDef.format.audio.cMIMEType,
sPortDef.format.audio.eEncoding);
The Bellagio library returns "raw/audio" as the MIME type supported by its volume control component. However, that is not a valid MIME type listed in the IANA MIME Media Types (www.iana.org/assignments/media-types). The value returned for the encoding is zero, which corresponds to OMX_AUDIO_CodingUnused; this also seems incorrect.
If you try the same program on the Raspberry Pi component audio_render and the LIM component OMX.limoi.alsa_sink, you get a MIME type of NULL but an encoding value of 2, which is OMX_AUDIO_CodingPCM. PCM has a MIME type of audio/L16, so NULL does not seem appropriate.
An OpenMAX IL library allows a port to be queried for the data types it supports. This is done by querying OMX_AUDIO_PARAM_PORTFORMATTYPE objects using the index OMX_IndexParamAudioPortFormat. According to the specification, for each index value starting from zero, the call to GetParameter() should return an encoding, such as OMX_AUDIO_CodingPCM or OMX_AUDIO_CodingMp3, until there are no more supported formats, at which point the call returns OMX_ErrorNoMore.
The Bellagio code returns the value OMX_AUDIO_CodingUnused, which is incorrect. The LIM code does not set a value at all, so all you get is garbage. The Broadcom implementation works correctly, but as will be discussed, it returns values that are not actually supported. So, this call is of limited value.
The following code tests this:
void getSupportedAudioFormats(int indentLevel, int portNumber) {
OMX_AUDIO_PARAM_PORTFORMATTYPE sAudioPortFormat;
setHeader(&sAudioPortFormat, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
sAudioPortFormat.nIndex = 0;
sAudioPortFormat.nPortIndex = portNumber;
printf("Supported audio formats are:\n");
for(;;) {
err = OMX_GetParameter(handle, OMX_IndexParamAudioPortFormat, &sAudioPortFormat);
if (err == OMX_ErrorNoMore) {
printf("No more formats supported\n");
return;
}
/* This shouldn't occur, but does with Broadcom library */
if (sAudioPortFormat.eEncoding == OMX_AUDIO_CodingUnused) {
printf("No coding format returned\n");
return;
}
switch (sAudioPortFormat.eEncoding) {
case OMX_AUDIO_CodingPCM:
printf("Supported encoding is PCM\n");
break;
case OMX_AUDIO_CodingVORBIS:
printf("Supported encoding is Ogg Vorbis\n");
break;
case OMX_AUDIO_CodingMP3:
printf("Supported encoding is MP3\n");
break;
#ifdef RASPBERRY_PI
case OMX_AUDIO_CodingFLAC:
printf("Supported encoding is FLAC\n");
break;
case OMX_AUDIO_CodingDDP:
printf("Supported encoding is DDP\n");
break;
case OMX_AUDIO_CodingDTS:
printf("Supported encoding is DTS\n");
break;
case OMX_AUDIO_CodingWMAPRO:
printf("Supported encoding is WMAPRO\n");
break;
#endif
case OMX_AUDIO_CodingAAC:
printf("Supported encoding is AAC\n");
break;
case OMX_AUDIO_CodingWMA:
printf("Supported encoding is WMA\n");
break;
case OMX_AUDIO_CodingRA:
printf("Supported encoding is RA\n");
break;
case OMX_AUDIO_CodingAMR:
printf("Supported encoding is AMR\n");
break;
case OMX_AUDIO_CodingEVRC:
printf("Supported encoding is EVRC\n");
break;
case OMX_AUDIO_CodingG726:
printf("Supported encoding is G726\n");
break;
case OMX_AUDIO_CodingMIDI:
printf("Supported encoding is MIDI\n");
break;
case OMX_AUDIO_CodingATRAC3:
printf("Supported encoding is ATRAC3\n");
break;
case OMX_AUDIO_CodingATRACX:
printf("Supported encoding is ATRACX\n");
break;
case OMX_AUDIO_CodingATRACAAL:
printf("Supported encoding is ATRACAAL\n");
break;
default:
printf("Supported encoding is %d\n",
sAudioPortFormat.eEncoding);
}
sAudioPortFormat.nIndex++;
}
}
Note that the code includes enumerated values, such as OMX_AUDIO_CodingATRAC3, that are specific to the Broadcom library. These are legitimate values under the OpenMAX IL extension mechanism, but of course they are not portable values.
The Bellagio library incorrectly returns OMX_AUDIO_CodingUnused for every index value.
The Broadcom library can return many values. For example, for the audio_decode component it returns the following:
Supported audio formats are:
Supported encoding is MP3
Supported encoding is PCM
Supported encoding is AAC
Supported encoding is WMA
Supported encoding is Ogg Vorbis
Supported encoding is RA
Supported encoding is AMR
Supported encoding is EVRC
Supported encoding is G726
Supported encoding is FLAC
Supported encoding is DDP
Supported encoding is DTS
Supported encoding is WMAPRO
Supported encoding is ATRAC3
Supported encoding is ATRACX
Supported encoding is ATRACAAL
Supported encoding is MIDI
No more formats supported
Sadly, none of these except PCM is actually supported. The following is according to jamesh in "OMX_AllocateBuffer fails for audio decoder component" on the Raspberry Pi forums:
The way it works is that the component returns success for all the codecs it could support (that is, all the codecs we ever had). This is then limited by which codecs are actually installed. It would be better to detect at runtime which codecs are present, but that code has never been written because it has never been needed. It's also unlikely to ever happen, since Broadcom no longer supports audio codecs this way; they have moved from the VideoCore to the host CPU, because host CPUs are now powerful enough to handle any audio decoding task.
That's a bit sad, really.
Putting all the bits together gives the program info.c, which is as follows:
/**
Based on code
Copyright (C) 2007-2009 STMicroelectronics
Copyright (C) 2007-2009 Nokia Corporation and/or its subsidiary(-ies).
under the LGPL
*/
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/stat.h>
#include <OMX_Core.h>
#include <OMX_Component.h>
#include <OMX_Types.h>
#include <OMX_Audio.h>
#ifdef RASPBERRY_PI
#include <bcm_host.h>
#endif
OMX_ERRORTYPE err;
OMX_HANDLETYPE handle;
OMX_VERSIONTYPE specVersion, compVersion;
OMX_CALLBACKTYPE callbacks;
#define indent {int n = 0; while (n++ < indentLevel*2) putchar(' ');}
static void setHeader(OMX_PTR header, OMX_U32 size) {
/* header->nVersion */
OMX_VERSIONTYPE* ver = (OMX_VERSIONTYPE*)((char *)header + sizeof(OMX_U32));
/* header->nSize */
*((OMX_U32*)header) = size;
/* for 1.2
ver->s.nVersionMajor = OMX_VERSION_MAJOR;
ver->s.nVersionMinor = OMX_VERSION_MINOR;
ver->s.nRevision = OMX_VERSION_REVISION;
ver->s.nStep = OMX_VERSION_STEP;
*/
ver->s.nVersionMajor = specVersion.s.nVersionMajor;
ver->s.nVersionMinor = specVersion.s.nVersionMinor;
ver->s.nRevision = specVersion.s.nRevision;
ver->s.nStep = specVersion.s.nStep;
}
void printState() {
OMX_STATETYPE state;
err = OMX_GetState(handle, &state);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on getting state\n");
exit(1);
}
switch (state) {
case OMX_StateLoaded: fprintf(stderr, "StateLoaded\n"); break;
case OMX_StateIdle: fprintf(stderr, "StateIdle\n"); break;
case OMX_StateExecuting: fprintf(stderr, "StateExecuting\n"); break;
case OMX_StatePause: fprintf(stderr, "StatePause\n"); break;
case OMX_StateWaitForResources: fprintf(stderr, "StateWaitForResources\n"); break;
default: fprintf(stderr, "State unknown\n"); break;
}
}
OMX_ERRORTYPE setEncoding(int portNumber, OMX_AUDIO_CODINGTYPE encoding) {
OMX_PARAM_PORTDEFINITIONTYPE sPortDef;
setHeader(&sPortDef, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
sPortDef.nPortIndex = portNumber;
err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting OMX_PORT_DEFINITION_TYPE parameter\n",
0);
exit(1);
}
sPortDef.format.audio.eEncoding = encoding;
sPortDef.nBufferCountActual = sPortDef.nBufferCountMin;
err = OMX_SetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
return err;
}
void getPCMInformation(int indentLevel, int portNumber) {
/* assert: PCM is a supported mode */
OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
/* set it into PCM format before asking for PCM info */
if (setEncoding(portNumber, OMX_AUDIO_CodingPCM) != OMX_ErrorNone) {
fprintf(stderr, "Error in setting coding to PCM\n");
return;
}
setHeader(&sPCMMode, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
sPCMMode.nPortIndex = portNumber;
err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
if(err != OMX_ErrorNone){
indent printf("PCM mode unsupported\n");
} else {
indent printf(" PCM default sampling rate %d\n", sPCMMode.nSamplingRate);
indent printf(" PCM default bits per sample %d\n", sPCMMode.nBitPerSample);
indent printf(" PCM default number of channels %d\n", sPCMMode.nChannels);
}
/*
setHeader(&sAudioPortFormat, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
sAudioPortFormat.nIndex = 0;
sAudioPortFormat.nPortIndex = portNumber;
*/
}
void getMP3Information(int indentLevel, int portNumber) {
/* assert: MP3 is a supported mode */
OMX_AUDIO_PARAM_MP3TYPE sMP3Mode;
/* set it into MP3 format before asking for MP3 info */
if (setEncoding(portNumber, OMX_AUDIO_CodingMP3) != OMX_ErrorNone) {
fprintf(stderr, "Error in setting coding to MP3\n");
return;
}
setHeader(&sMP3Mode, sizeof(OMX_AUDIO_PARAM_MP3TYPE));
sMP3Mode.nPortIndex = portNumber;
err = OMX_GetParameter(handle, OMX_IndexParamAudioMp3, &sMP3Mode);
if(err != OMX_ErrorNone){
indent printf("MP3 mode unsupported\n");
} else {
indent printf(" MP3 default sampling rate %d\n", sMP3Mode.nSampleRate);
indent printf("  MP3 default bit rate %d\n", sMP3Mode.nBitRate);
indent printf(" MP3 default number of channels %d\n", sMP3Mode.nChannels);
}
}
void getSupportedAudioFormats(int indentLevel, int portNumber) {
OMX_AUDIO_PARAM_PORTFORMATTYPE sAudioPortFormat;
setHeader(&sAudioPortFormat, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
sAudioPortFormat.nIndex = 0;
sAudioPortFormat.nPortIndex = portNumber;
#ifdef LIM
printf("LIM doesn't set audio formats properly\n");
return;
#endif
indent printf("Supported audio formats are:\n");
for(;;) {
err = OMX_GetParameter(handle, OMX_IndexParamAudioPortFormat, &sAudioPortFormat);
if (err == OMX_ErrorNoMore) {
indent printf("No more formats supported\n");
return;
}
/* This shouldn't occur, but does with Broadcom library */
if (sAudioPortFormat.eEncoding == OMX_AUDIO_CodingUnused) {
indent printf("No coding format returned\n");
return;
}
switch (sAudioPortFormat.eEncoding) {
case OMX_AUDIO_CodingPCM:
indent printf("Supported encoding is PCM\n");
getPCMInformation(indentLevel+1, portNumber);
break;
case OMX_AUDIO_CodingVORBIS:
indent printf("Supported encoding is Ogg Vorbis\n");
break;
case OMX_AUDIO_CodingMP3:
indent printf("Supported encoding is MP3\n");
getMP3Information(indentLevel+1, portNumber);
break;
#ifdef RASPBERRY_PI
case OMX_AUDIO_CodingFLAC:
indent printf("Supported encoding is FLAC\n");
break;
case OMX_AUDIO_CodingDDP:
indent printf("Supported encoding is DDP\n");
break;
case OMX_AUDIO_CodingDTS:
indent printf("Supported encoding is DTS\n");
break;
case OMX_AUDIO_CodingWMAPRO:
indent printf("Supported encoding is WMAPRO\n");
break;
case OMX_AUDIO_CodingATRAC3:
indent printf("Supported encoding is ATRAC3\n");
break;
case OMX_AUDIO_CodingATRACX:
indent printf("Supported encoding is ATRACX\n");
break;
case OMX_AUDIO_CodingATRACAAL:
indent printf("Supported encoding is ATRACAAL\n");
break;
#endif
case OMX_AUDIO_CodingAAC:
indent printf("Supported encoding is AAC\n");
break;
case OMX_AUDIO_CodingWMA:
indent printf("Supported encoding is WMA\n");
break;
case OMX_AUDIO_CodingRA:
indent printf("Supported encoding is RA\n");
break;
case OMX_AUDIO_CodingAMR:
indent printf("Supported encoding is AMR\n");
break;
case OMX_AUDIO_CodingEVRC:
indent printf("Supported encoding is EVRC\n");
break;
case OMX_AUDIO_CodingG726:
indent printf("Supported encoding is G726\n");
break;
case OMX_AUDIO_CodingMIDI:
indent printf("Supported encoding is MIDI\n");
break;
/*
case OMX_AUDIO_Coding:
indent printf("Supported encoding is \n");
break;
*/
default:
indent printf("Supported encoding is not PCM or MP3 or Vorbis, is 0x%X\n",
sAudioPortFormat.eEncoding);
}
sAudioPortFormat.nIndex++;
}
}
void getAudioPortInformation(int indentLevel, int nPort, OMX_PARAM_PORTDEFINITIONTYPE sPortDef) {
indent printf("Port %d requires %d buffers\n", nPort, sPortDef.nBufferCountMin);
indent printf("Port %d has min buffer size %d bytes\n", nPort, sPortDef.nBufferSize);
if (sPortDef.eDir == OMX_DirInput) {
indent printf("Port %d is an input port\n", nPort);
} else {
indent printf("Port %d is an output port\n", nPort);
}
switch (sPortDef.eDomain) {
case OMX_PortDomainAudio:
indent printf("Port %d is an audio port\n", nPort);
indent printf("Port mimetype %s\n",
sPortDef.format.audio.cMIMEType);
switch (sPortDef.format.audio.eEncoding) {
case OMX_AUDIO_CodingPCM:
indent printf("Port encoding is PCM\n");
break;
case OMX_AUDIO_CodingVORBIS:
indent printf("Port encoding is Ogg Vorbis\n");
break;
case OMX_AUDIO_CodingMP3:
indent printf("Port encoding is MP3\n");
break;
default:
indent printf("Port encoding is not PCM or MP3 or Vorbis, is %d\n",
sPortDef.format.audio.eEncoding);
}
getSupportedAudioFormats(indentLevel+1, nPort);
break;
/* could put other port types here */
default:
indent printf("Port %d is not an audio port\n", nPort);
}
}
void getAllAudioPortsInformation(int indentLevel) {
OMX_PORT_PARAM_TYPE param;
OMX_PARAM_PORTDEFINITIONTYPE sPortDef;
int startPortNumber;
int nPorts;
int n;
setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
err = OMX_GetParameter(handle, OMX_IndexParamAudioInit, &param);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting audio OMX_PORT_PARAM_TYPE parameter\n", 0);
return;
}
indent printf("Audio ports:\n");
indentLevel++;
startPortNumber = param.nStartPortNumber;
nPorts = param.nPorts;
if (nPorts == 0) {
indent printf("No ports of this type\n");
return;
}
indent printf("Ports start on %d\n", startPortNumber);
indent printf("There are %d open ports\n", nPorts);
for (n = 0; n < nPorts; n++) {
setHeader(&sPortDef, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
sPortDef.nPortIndex = startPortNumber + n;
err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting OMX_PORT_DEFINITION_TYPE parameter\n", 0);
exit(1);
}
getAudioPortInformation(indentLevel+1, startPortNumber + n, sPortDef);
}
}
void getAllVideoPortsInformation(int indentLevel) {
OMX_PORT_PARAM_TYPE param;
int startPortNumber;
int nPorts;
int n;
setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
err = OMX_GetParameter(handle, OMX_IndexParamVideoInit, &param);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting video OMX_PORT_PARAM_TYPE parameter\n", 0);
return;
}
printf("Video ports:\n");
indentLevel++;
startPortNumber = param.nStartPortNumber;
nPorts = param.nPorts;
if (nPorts == 0) {
indent printf("No ports of this type\n");
return;
}
indent printf("Ports start on %d\n", startPortNumber);
indent printf("There are %d open ports\n", nPorts);
}
void getAllImagePortsInformation(int indentLevel) {
OMX_PORT_PARAM_TYPE param;
int startPortNumber;
int nPorts;
int n;
setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
err = OMX_GetParameter(handle, OMX_IndexParamImageInit, &param);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting image OMX_PORT_PARAM_TYPE parameter\n", 0);
return;
}
printf("Image ports:\n");
indentLevel++;
startPortNumber = param.nStartPortNumber;
nPorts = param.nPorts;
if (nPorts == 0) {
indent printf("No ports of this type\n");
return;
}
indent printf("Ports start on %d\n", startPortNumber);
indent printf("There are %d open ports\n", nPorts);
}
void getAllOtherPortsInformation(int indentLevel) {
OMX_PORT_PARAM_TYPE param;
int startPortNumber;
int nPorts;
int n;
setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
err = OMX_GetParameter(handle, OMX_IndexParamOtherInit, &param);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting other OMX_PORT_PARAM_TYPE parameter\n", 0);
exit(1);
}
printf("Other ports:\n");
indentLevel++;
startPortNumber = param.nStartPortNumber;
nPorts = param.nPorts;
if (nPorts == 0) {
indent printf("No ports of this type\n");
return;
}
indent printf("Ports start on %d\n", startPortNumber);
indent printf("There are %d open ports\n", nPorts);
}
int main(int argc, char** argv) {
OMX_PORT_PARAM_TYPE param;
OMX_PARAM_PORTDEFINITIONTYPE sPortDef;
OMX_AUDIO_PORTDEFINITIONTYPE sAudioPortDef;
OMX_AUDIO_PARAM_PORTFORMATTYPE sAudioPortFormat;
OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
#ifdef RASPBERRY_PI
char *componentName = "OMX.broadcom.audio_mixer";
#elif defined(LIM)
char *componentName = "OMX.limoi.alsa_sink";
#else
char *componentName = "OMX.st.volume.component";
#endif
unsigned char name[128]; /* spec says 128 is max name length */
OMX_UUIDTYPE uid;
int startPortNumber;
int nPorts;
int n;
/* override component name by command line argument */
if (argc == 2) {
componentName = argv[1];
}
# ifdef RASPBERRY_PI
bcm_host_init();
# endif
err = OMX_Init();
if(err != OMX_ErrorNone) {
fprintf(stderr, "OMX_Init() failed\n", 0);
exit(1);
}
/** Ask the core for a handle to the volume control component
*/
err = OMX_GetHandle(&handle, componentName, NULL /*app private data */, &callbacks);
if (err != OMX_ErrorNone) {
fprintf(stderr, "OMX_GetHandle failed\n", 0);
exit(1);
}
err = OMX_GetComponentVersion(handle, name, &compVersion, &specVersion, &uid);
if (err != OMX_ErrorNone) {
fprintf(stderr, "OMX_GetComponentVersion failed\n", 0);
exit(1);
}
printf("Component name: %s version %d.%d, Spec version %d.%d\n",
name, compVersion.s.nVersionMajor,
compVersion.s.nVersionMinor,
specVersion.s.nVersionMajor,
specVersion.s.nVersionMinor);
/** Get ports information */
getAllAudioPortsInformation(0);
getAllVideoPortsInformation(0);
getAllImagePortsInformation(0);
getAllOtherPortsInformation(0);
exit(0);
}
The Makefile for the Bellagio version is as follows:
INCLUDES=-I ../libomxil-bellagio-0.9.3/include/
LIBS=-L ../libomxil-bellagio-0.9.3/src/.libs -l omxil-bellagio
CFLAGS = -g
info: info.c
cc $(CFLAGS) $(INCLUDES) -o info info.c $(LIBS)
The output from using the Bellagio implementation is as follows:
Component name: OMX.st.volume.component version 1.1, Spec version 1.1
Audio ports:
Ports start on 0
There are 2 open ports
Port 0 requires 2 buffers
Port 0 is an input port
Port 0 is an audio port
Port mimetype raw/audio
Port encoding is not PCM or MP3 or Vorbis, is 0
Supported audio formats are:
No coding format returned
Port 1 requires 2 buffers
Port 1 is an output port
Port 1 is an audio port
Port mimetype raw/audio
Port encoding is not PCM or MP3 or Vorbis, is 0
Supported audio formats are:
No coding format returned
Video ports:
No ports of this type
Image ports:
No ports of this type
Other ports:
No ports of this type
The Makefile for the Raspberry Pi is as follows:
INCLUDES=-I /opt/vc/include/IL -I /opt/vc/include -I /opt/vc/include/interface/vcos/pthreads
CFLAGS=-g -DRASPBERRY_PI
LIBS=-L /opt/vc/lib -l openmaxil -l bcm_host
info: info.c
cc $(CFLAGS) $(INCLUDES) -o info info.c $(LIBS)
The output from the audio_render component on the Raspberry Pi is as follows:
Audio ports:
Ports start on 100
There are 1 open ports
Port 100 requires 1 buffers
Port 100 is an input port
Port 100 is an audio port
Port mimetype (null)
Port encoding is PCM
Supported audio formats are:
Supported encoding is PCM
PCM default sampling rate 44100
PCM default bits per sample 16
PCM default number of channels 2
Supported encoding is DDP
No more formats supported
Video ports:
No ports of this type
Image ports:
No ports of this type
Other ports:
No ports of this type
The Makefile for LIM is as follows:
INCLUDES=-I ../../lim-omx-1.1/LIM/limoi-core/include/
#LIBS=-L ../../lim-omx-1.1/LIM/limoi-base/src/.libs -l limoi-base
LIBS = -L /home/newmarch/osm-build/lib/ -l limoa -l limoi-core
CFLAGS = -g -DLIM
info: info.c
cc $(CFLAGS) $(INCLUDES) -o info info.c $(LIBS)
The LIM output for the alsa_sink component is as follows:
Component name: OMX.limoi.alsa_sink version 0.0, Spec version 1.1
Audio ports:
Ports start on 0
There are 1 open ports
Port 0 requires 2 buffers
Port 0 is an input port
Port 0 is an audio port
Port mimetype (null)
Port encoding is PCM
LIM doesn't set audio formats properly
Error in getting video OMX_PORT_PARAM_TYPE parameter
Error in getting image OMX_PORT_PARAM_TYPE parameter
Error in getting other OMX_PORT_PARAM_TYPE parameter
The LIM implementation throws errors when a component doesn't support a mode (here, an audio component doesn't support the video, image, or other modes). This is contrary to the 1.1 specification, which states the following:
"All standard components shall support the following parameters:
o OMX_IndexParamPortDefinition
o OMX_IndexParamCompBufferSupplier
o OMX_IndexParamAudioInit
o OMX_IndexParamImageInit
o OMX_IndexParamVideoInit
o OMX_IndexParamOtherInit"
I suppose you could argue that the alsa_sink component is not a standard component, so this is allowed. Well, OK...
Playing PCM Audio Files
Playing audio to an output device requires the use of an audio_render device. This is one of the standard devices in the 1.1 specification and is included in the Broadcom Raspberry Pi library, but not in the Bellagio library. LIM has a component, alsa_sink, that plays the same role.
The structure of a program to play audio is as follows:
- Initialize the library and the audio_render component.
- Repeatedly fill an input buffer and ask the component to empty it.
- Capture events from the component saying that a buffer has been emptied, so that a refill and another empty request can be scheduled.
- Clean up when finished.
Note that the Raspberry Pi audio render component will play only PCM data and that the LIM alsa_sink component can play only at 44,100Hz.
States
Initializing a component is a multistep process that depends on the state of the component. Components are created in the Loaded state. They are transitioned from one state to another by OMX_SendCommand(handle, OMX_CommandStateSet, <next state>, <param>). The next state from Loaded should be Idle, and from there Executing. There are other states that you don't need to be concerned with here.
Requests to change state are asynchronous: the send command returns immediately (well, within five milliseconds). An event handler callback function is called when the state change has actually occurred, as in the sketch below.
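For example, here is a sketch (using this chapter's globals handle and err, and the waitFor() helper defined in the next section) of requesting the Idle state and blocking until it has been reached:
/* ask the component to transition to Idle; the call returns at once */
err = OMX_SendCommand(handle, OMX_CommandStateSet, OMX_StateIdle, NULL);
if (err != OMX_ErrorNone) {
    fprintf(stderr, "Error on setting state to idle\n");
    exit(1);
}
/* ... disable ports or allocate their buffers here (see later sections) ... */
/* block until the event handler observes the change and calls wakeUp() */
waitFor(OMX_StateIdle);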
Threads
Some commands require the component to be in a particular state, and requests to put a component into a state are asynchronous. So a client can issue a request but may then have to wait until the state change has taken place. This is best done by the client suspending operation of its thread until it is woken by the state change occurring in the event handler.
Linux/Unix has standardized on the POSIX pthreads library for managing multiple threads. For our purposes, you use two parts of this library: the ability to place mutexes around critical sections and the ability to suspend and wake threads based on conditions. Pthreads is covered in many places; Blaise Barney has a short, good tutorial called "POSIX Threads Programming" ( https://computing.llnl.gov/tutorials/pthreads/#Misc ).
The functions and data you use are as follows:
pthread_mutex_t mutex;
OMX_STATETYPE currentState = OMX_StateLoaded;
pthread_cond_t stateCond;
void waitFor(OMX_STATETYPE state) {
pthread_mutex_lock(&mutex);
while (currentState != state)
pthread_cond_wait(&stateCond, &mutex);
fprintf(stderr, "Wait successfully completed\n");
pthread_mutex_unlock(&mutex);
}
void wakeUp(OMX_STATETYPE newState) {
pthread_mutex_lock(&mutex);
currentState = newState;
pthread_cond_signal(&stateCond);
pthread_mutex_unlock(&mutex);
}
pthread_mutex_t empty_mutex;
int emptyState = 0;
OMX_BUFFERHEADERTYPE* pEmptyBuffer;
pthread_cond_t emptyStateCond;
void waitForEmpty() {
pthread_mutex_lock(&empty_mutex);
while (emptyState == 1)
pthread_cond_wait(&emptyStateCond, &empty_mutex);
emptyState = 1;
pthread_mutex_unlock(&empty_mutex);
}
void wakeUpEmpty(OMX_BUFFERHEADERTYPE* pBuffer) {
pthread_mutex_lock(&empty_mutex);
emptyState = 0;
pEmptyBuffer = pBuffer;
pthread_cond_signal(&emptyStateCond);
pthread_mutex_unlock(&empty_mutex);
}
void mutex_init() {
int n = pthread_mutex_init(&mutex, NULL);
if ( n != 0) {
fprintf(stderr, "Can't init state mutex\n");
}
n = pthread_mutex_init(&empty_mutex, NULL);
if ( n != 0) {
fprintf(stderr, "Can't init empty mutex\n");
}
}
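In use, the main loop calls waitForEmpty() after handing a buffer to the component, and the EmptyBufferDone callback calls wakeUpEmpty() to release it; similarly, the event handler calls wakeUp() when a state change completes, releasing any thread blocked in waitFor(). Both callbacks appear in the complete program later in this chapter.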
Hungarian Notation in OpenMAX IL
Hungarian notation was invented by Charles Simonyi to add type or functional information to the names of variables, structures, and fields. A form of it is used heavily in the Microsoft Windows SDK. A simplified form is used in OpenMAX IL by prefixing variables, fields, and so on, as follows:
- n prefixes a number of some kind.
- p prefixes a pointer.
- s prefixes a structure or a string.
- c prefixes a callback function.
The value of these conventions is debatable.
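For illustration, here are declarations drawn from the code in this chapter:
OMX_U32 nBufferSize;                   /* n: a number */
OMX_BUFFERHEADERTYPE *pEmptyBuffer;    /* p: a pointer */
OMX_PARAM_PORTDEFINITIONTYPE sPortDef; /* s: a structure */
/* c: a callback function, as in cEventHandler and cEmptyBufferDone below */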
Callbacks
Two types of callback functions are relevant to this example: an event callback, which occurs on changes of state and on certain other events, and an empty buffer callback, which occurs when the component has emptied an input buffer. These are registered by the following:
OMX_CALLBACKTYPE callbacks = { .EventHandler = cEventHandler,
.EmptyBufferDone = cEmptyBufferDone,
};
err = OMX_GetHandle(&handle, componentName, NULL /*app private data */, &callbacks);
Component Resources
Each component has a number of ports that have to be configured. The ports are some of the resources of a component. Each port starts off enabled, but it may be set to disabled by OMX_SendCommand(handle, OMX_CommandPortDisable, <port number>, NULL).
An enabled port can have buffers allocated for transferring data into and out of the component. This can be done in two ways, as sketched after this paragraph: OMX_AllocateBuffer asks the component to perform the allocation for the client, while with OMX_UseBuffer the client hands a buffer to the component. Since there can be buffer memory alignment issues, I prefer to have the component do the allocation.
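As a sketch of the two alternatives (using this chapter's globals handle and err; here portNumber and nBufferSize stand for the port index and the buffer size read from the port definition):
OMX_BUFFERHEADERTYPE *pHdr;

/* alternative 1: the component allocates the buffer memory itself */
err = OMX_AllocateBuffer(handle, &pHdr, portNumber,
                         NULL /* app private data */, nBufferSize);

/* alternative 2: the client supplies the memory, and any
   alignment requirements are the client's problem */
OMX_U8 *pData = malloc(nBufferSize);
err = OMX_UseBuffer(handle, &pHdr, portNumber,
                    NULL /* app private data */, nBufferSize, pData);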
Here is the tricky part. To allocate or use buffers on a component, a request must be made to transition the component from the Loaded state to the Idle state. So, OMX_SendCommand(handle, OMX_CommandStateSet, OMX_StateIdle, <param>) must be called before allocating buffers. But the transition to Idle will not take place until every port is either disabled or has had all of its buffers allocated.
This last step had me banging my head against the wall for almost a week. The audio_render component has two ports: an input audio port and a time update port. While I had configured the audio port correctly, I hadn't disabled the time port because I didn't know it had one. Consequently, the transition to Idle never took place. Here is the code to handle this:
setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
err = OMX_GetParameter(handle, OMX_IndexParamOtherInit, &param);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
exit(1);
}
startPortNumber = param.nStartPortNumber;
nPorts = param.nPorts;
printf("Other has %d ports\n", nPorts);
/* and disable it */
err = OMX_SendCommand(handle, OMX_CommandPortDisable, startPortNumber, NULL);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on setting port to disabled\n");
exit(1);
}
Here is how to set the parameters of the audio port:
/** Get audio port information */
setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
err = OMX_GetParameter(handle, OMX_IndexParamAudioInit, &param);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
exit(1);
}
startPortNumber = param.nStartPortNumber;
nPorts = param.nPorts;
if (nPorts > 1) {
fprintf(stderr, "Render device has more than one port\n");
exit(1);
}
setHeader(&sPortDef, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
sPortDef.nPortIndex = startPortNumber;
err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting OMX_PORT_DEFINITION_TYPE parameter\n", 0);
exit(1);
}
if (sPortDef.eDomain != OMX_PortDomainAudio) {
fprintf(stderr, "Port %d is not an audio port\n", startPortNumber);
exit(1);
}
if (sPortDef.eDir != OMX_DirInput) {
fprintf(stderr, "Port is not an input port\n");
exit(1);
}
if (sPortDef.format.audio.eEncoding == OMX_AUDIO_CodingPCM) {
printf("Port encoding is PCM\n");
} else {
printf("Port has unknown encoding\n");
}
/* create minimum number of buffers for the port */
nBuffers = sPortDef.nBufferCountActual = sPortDef.nBufferCountMin;
printf("Number of buffers is %d\n", nBuffers);
err = OMX_SetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in setting OMX_PORT_PARAM_TYPE parameter\n", 0);
exit(1);
}
/* call to put state into idle before allocating buffers */
err = OMX_SendCommand(handle, OMX_CommandStateSet, OMX_StateIdle, NULL);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on setting state to idle\n");
exit(1);
}
err = OMX_SendCommand(handle, OMX_CommandPortEnable, startPortNumber, NULL);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on setting port to enabled\n");
exit(1);
}
nBufferSize = sPortDef.nBufferSize;
printf("%d buffers of size %d\n", nBuffers, nBufferSize);
inBuffers = malloc(nBuffers * sizeof(OMX_BUFFERHEADERTYPE *));
if (inBuffers == NULL) {
fprintf(stderr, "Can't allocate buffers\n");
exit(1);
}
for (n = 0; n < nBuffers; n++) {
err = OMX_AllocateBuffer(handle, inBuffers+n, startPortNumber, NULL,
nBufferSize);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on AllocateBuffer %i\n", err);
exit(1);
}
}
waitFor(OMX_StateIdle);
/* try setting the encoding to PCM mode */
setHeader(&sPCMMode, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
sPCMMode.nPortIndex = startPortNumber;
err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
if(err != OMX_ErrorNone){
printf("PCM mode unsupported\n");
exit(1);
} else {
printf("PCM mode supported\n");
printf("PCM sampling rate %d\n", sPCMMode.nSamplingRate);
printf("PCM nChannels %d\n", sPCMMode.nChannels);
}
Setting the Output Device
OpenMAX has a standard audio render component. But which device does it render to? The built-in sound card? A USB sound card? This is not part of OpenMAX IL; there isn't even a way to list the audio devices, only the audio components.
OpenMAX has an extension mechanism that an OpenMAX implementor can use to answer questions like this. The Broadcom core implementation has the extension types OMX_CONFIG_BRCMAUDIODESTINATIONTYPE (and OMX_CONFIG_BRCMAUDIOSOURCETYPE), which can be used to set the audio destination (source) device. Here is code to do this:
void setOutputDevice(const char *name) {
int32_t success = -1;
OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;
if (name && strlen(name) < sizeof(arDest.sName)) {
setHeader(&arDest, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
strcpy((char *)arDest.sName, name);
err = OMX_SetParameter(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on setting audio destination\n");
exit(1);
}
}
}
This is where things descend into murkiness again. The header file <IL/OMX_Broadcom.h> states that the default value of sName is "local" but doesn't give any other values. The Raspberry Pi forums say that this refers to the 3.5mm analog audio output and that HDMI is chosen by using the value "hdmi". No other values are documented, and the Broadcom OpenMAX IL doesn't appear to support any other audio devices. In particular, the current Broadcom OpenMAX IL components do not support USB audio devices for either input or output. Consequently, you can't do audio capture on the Raspberry Pi using OpenMAX IL, because it has no Broadcom-supported audio inputs.
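So, in practice, the only useful calls on the Raspberry Pi appear to be these two:
setOutputDevice("local"); /* the 3.5mm analog output (the default) */
setOutputDevice("hdmi");  /* audio over HDMI */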
The Main Loop
Once all the ports are set up, playing an audio file consists of filling buffers, waiting for them to be emptied, and refilling them until the data is finished. There are two possible styles:
- Fill the buffers once in the main loop, and then continue filling and emptying them from within the empty buffer callback.
- In the main loop, keep filling and emptying the buffers, waiting between each fill for the buffer to be emptied.
The Bellagio examples use the first technique. However, the 1.2 specification says that "...IL client shall not call IL core or component functions from within an IL callback context," so it is not a good technique. The Raspberry Pi examples use the second technique but use a nonstandard call to find the current wait time and latency. It is better to just set up more pthreads conditions and block on those.
This leads to a main loop that looks like this:
emptyState = 1;
for (;;) {
int data_read = read(fd, inBuffers[0]->pBuffer, nBufferSize);
inBuffers[0]->nFilledLen = data_read;
inBuffers[0]->nOffset = 0;
filesize -= data_read;
if (data_read <= 0) {
fprintf(stderr, "In the %s no more input data available\n", __func__);
inBuffers[0]->nFilledLen=0;
inBuffers[0]->nFlags = OMX_BUFFERFLAG_EOS;
bEOS=OMX_TRUE;
err = OMX_EmptyThisBuffer(handle, inBuffers[0]);
break;
}
if(!bEOS) {
fprintf(stderr, "Emptying again buffer %p %d bytes, %d to go\n", inBuffers[0], data_read, filesize);
err = OMX_EmptyThisBuffer(handle, inBuffers[0]);
}else {
fprintf(stderr, "In %s Dropping Empty This buffer to Audio Dec\n", __func__);
}
waitForEmpty();
printf("Waited for empty\n");
}
printf("Buffers emptied\n");
The Complete Program
The complete program is as follows:
/**
Based on code
Copyright (C) 2007-2009 STMicroelectronics
Copyright (C) 2007-2009 Nokia Corporation and/or its subsidiary(-ies).
under the LGPL
*/
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/stat.h>
#include <OMX_Core.h>
#include <OMX_Component.h>
#include <OMX_Types.h>
#include <OMX_Audio.h>
#ifdef RASPBERRY_PI
#include <bcm_host.h>
#include <IL/OMX_Broadcom.h>
#endif
OMX_ERRORTYPE err;
OMX_HANDLETYPE handle;
OMX_VERSIONTYPE specVersion, compVersion;
int fd = 0;
unsigned int filesize;
static OMX_BOOL bEOS=OMX_FALSE;
OMX_U32 nBufferSize;
int nBuffers;
pthread_mutex_t mutex;
OMX_STATETYPE currentState = OMX_StateLoaded;
pthread_cond_t stateCond;
void waitFor(OMX_STATETYPE state) {
pthread_mutex_lock(&mutex);
while (currentState != state)
pthread_cond_wait(&stateCond, &mutex);
pthread_mutex_unlock(&mutex);
}
void wakeUp(OMX_STATETYPE newState) {
pthread_mutex_lock(&mutex);
currentState = newState;
pthread_cond_signal(&stateCond);
pthread_mutex_unlock(&mutex);
}
pthread_mutex_t empty_mutex;
int emptyState = 0;
OMX_BUFFERHEADERTYPE* pEmptyBuffer;
pthread_cond_t emptyStateCond;
void waitForEmpty() {
pthread_mutex_lock(&empty_mutex);
while (emptyState == 1)
pthread_cond_wait(&emptyStateCond, &empty_mutex);
emptyState = 1;
pthread_mutex_unlock(&empty_mutex);
}
void wakeUpEmpty(OMX_BUFFERHEADERTYPE* pBuffer) {
pthread_mutex_lock(&empty_mutex);
emptyState = 0;
pEmptyBuffer = pBuffer;
pthread_cond_signal(&emptyStateCond);
pthread_mutex_unlock(&empty_mutex);
}
void mutex_init() {
int n = pthread_mutex_init(&mutex, NULL);
if ( n != 0) {
fprintf(stderr, "Can't init state mutex\n");
}
n = pthread_mutex_init(&empty_mutex, NULL);
if ( n != 0) {
fprintf(stderr, "Can't init empty mutex\n");
}
}
static void display_help() {
fprintf(stderr, "Usage: render input_file\n");
}
/** Gets the file descriptor's size
* @return the size of the file. If size cannot be computed
* (i.e. stdin, zero is returned)
*/
static int getFileSize(int fd) {
struct stat input_file_stat;
int err;
/* Obtain input file length */
err = fstat(fd, &input_file_stat);
if(err){
fprintf(stderr, "fstat failed\n", 0);
exit(-1);
}
return input_file_stat.st_size;
}
OMX_ERRORTYPE cEventHandler(
OMX_HANDLETYPE hComponent,
OMX_PTR pAppData,
OMX_EVENTTYPE eEvent,
OMX_U32 Data1,
OMX_U32 Data2,
OMX_PTR pEventData) {
fprintf(stderr, "Hi there, I am in the %s callback\n", __func__);
if(eEvent == OMX_EventCmdComplete) {
if (Data1 == OMX_CommandStateSet) {
fprintf(stderr, "Component State changed in ", 0);
switch ((int)Data2) {
case OMX_StateInvalid:
fprintf(stderr, "OMX_StateInvalid\n", 0);
break;
case OMX_StateLoaded:
fprintf(stderr, "OMX_StateLoaded\n", 0);
break;
case OMX_StateIdle:
fprintf(stderr, "OMX_StateIdle\n",0);
break;
case OMX_StateExecuting:
fprintf(stderr, "OMX_StateExecuting\n",0);
break;
case OMX_StatePause:
fprintf(stderr, "OMX_StatePause\n",0);
break;
case OMX_StateWaitForResources:
fprintf(stderr, "OMX_StateWaitForResources\n",0);
break;
}
wakeUp((int) Data2);
} else if (Data1 == OMX_CommandPortEnable){
} else if (Data1 == OMX_CommandPortDisable){
}
} else if(eEvent == OMX_EventBufferFlag) {
if((int)Data2 == OMX_BUFFERFLAG_EOS) {
}
} else {
fprintf(stderr, "Param1 is %i\n", (int)Data1);
fprintf(stderr, "Param2 is %i\n", (int)Data2);
}
return OMX_ErrorNone;
}
OMX_ERRORTYPE cEmptyBufferDone(
OMX_HANDLETYPE hComponent,
OMX_PTR pAppData,
OMX_BUFFERHEADERTYPE* pBuffer) {
fprintf(stderr, "Hi there, I am in the %s callback.\n", __func__);
if (bEOS) {
fprintf(stderr, "Buffers emptied, exiting\n");
}
wakeUpEmpty(pBuffer);
fprintf(stderr, "Exiting callback\n");
return OMX_ErrorNone;
}
OMX_CALLBACKTYPE callbacks = { .EventHandler = cEventHandler,
.EmptyBufferDone = cEmptyBufferDone,
};
void printState() {
OMX_STATETYPE state;
err = OMX_GetState(handle, &state);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on getting state\n");
exit(1);
}
switch (state) {
case OMX_StateLoaded: fprintf(stderr, "StateLoaded\n"); break;
case OMX_StateIdle: fprintf(stderr, "StateIdle\n"); break;
case OMX_StateExecuting: fprintf(stderr, "StateExecuting\n"); break;
case OMX_StatePause: fprintf(stderr, "StatePause\n"); break;
case OMX_StateWaitForResources: fprintf(stderr, "StateWaitForResources\n"); break;
default: fprintf(stderr, "State unknown\n"); break;
}
}
static void setHeader(OMX_PTR header, OMX_U32 size) {
/* header->nVersion */
OMX_VERSIONTYPE* ver = (OMX_VERSIONTYPE*)((char *)header + sizeof(OMX_U32));
/* header->nSize */
*((OMX_U32*)header) = size;
/* for 1.2
ver->s.nVersionMajor = OMX_VERSION_MAJOR;
ver->s.nVersionMinor = OMX_VERSION_MINOR;
ver->s.nRevision = OMX_VERSION_REVISION;
ver->s.nStep = OMX_VERSION_STEP;
*/
ver->s.nVersionMajor = specVersion.s.nVersionMajor;
ver->s.nVersionMinor = specVersion.s.nVersionMinor;
ver->s.nRevision = specVersion.s.nRevision;
ver->s.nStep = specVersion.s.nStep;
}
/**
* Disable unwanted ports, or we can't transition to Idle state
*/
void disablePort(OMX_INDEXTYPE paramType) {
OMX_PORT_PARAM_TYPE param;
int nPorts;
int startPortNumber;
int n;
setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
err = OMX_GetParameter(handle, paramType, &param);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
exit(1);
}
startPortNumber = param.nStartPortNumber;
nPorts = param.nPorts;
if (nPorts > 0) {
fprintf(stderr, "Other has %d ports\n", nPorts);
/* and disable it */
for (n = 0; n < nPorts; n++) {
err = OMX_SendCommand(handle, OMX_CommandPortDisable, n + startPortNumber, NULL);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on setting port to disabled\n");
exit(1);
}
}
}
}
#ifdef RASPBERRY_PI
/* For the RPi name can be "hdmi" or "local" */
void setOutputDevice(const char *name) {
int32_t success = -1;
OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;
if (name && strlen(name) < sizeof(arDest.sName)) {
setHeader(&arDest, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
strcpy((char *)arDest.sName, name);
err = OMX_SetParameter(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on setting audio destination\n");
exit(1);
}
}
}
#endif
void setPCMMode(int startPortNumber) {
OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
setHeader(&sPCMMode, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
sPCMMode.nPortIndex = startPortNumber;
sPCMMode.nSamplingRate = 48000;
err = OMX_SetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
if(err != OMX_ErrorNone){
fprintf(stderr, "PCM mode unsupported\n");
return;
} else {
fprintf(stderr, "PCM mode supported\n");
fprintf(stderr, "PCM sampling rate %d\n", sPCMMode.nSamplingRate);
fprintf(stderr, "PCM nChannels %d\n", sPCMMode.nChannels);
}
}
int main(int argc, char** argv) {
OMX_PORT_PARAM_TYPE param;
OMX_PARAM_PORTDEFINITIONTYPE sPortDef;
OMX_AUDIO_PORTDEFINITIONTYPE sAudioPortDef;
OMX_AUDIO_PARAM_PORTFORMATTYPE sAudioPortFormat;
OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
OMX_BUFFERHEADERTYPE **inBuffers;
#ifdef RASPBERRY_PI
char *componentName = "OMX.broadcom.audio_render";
#endif
#ifdef LIM
char *componentName = "OMX.limoi.alsa_sink";
#endif
unsigned char name[OMX_MAX_STRINGNAME_SIZE];
OMX_UUIDTYPE uid;
int startPortNumber;
int nPorts;
int n;
# ifdef RASPBERRY_PI
bcm_host_init();
# endif
fprintf(stderr, "Thread id is %p\n", pthread_self());
if(argc < 2){
display_help();
exit(1);
}
fd = open(argv[1], O_RDONLY);
if(fd < 0){
perror("Error opening input file\n");
exit(1);
}
filesize = getFileSize(fd);
err = OMX_Init();
if(err != OMX_ErrorNone) {
fprintf(stderr, "OMX_Init() failed\n", 0);
exit(1);
}
/** Ask the core for a handle to the audio render component
*/
err = OMX_GetHandle(&handle, componentName, NULL /*app private data */, &callbacks);
if(err != OMX_ErrorNone) {
fprintf(stderr, "OMX_GetHandle failed\n", 0);
exit(1);
}
err = OMX_GetComponentVersion(handle, name, &compVersion, &specVersion, &uid);
if(err != OMX_ErrorNone) {
fprintf(stderr, "OMX_GetComponentVersion failed\n", 0);
exit(1);
}
/** disable other ports */
disablePort(OMX_IndexParamOtherInit);
/** Get audio port information */
setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
err = OMX_GetParameter(handle, OMX_IndexParamAudioInit, &param);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
exit(1);
}
startPortNumber = param.nStartPortNumber;
nPorts = param.nPorts;
if (nPorts > 1) {
fprintf(stderr, "Render device has more than one port\n");
exit(1);
}
/* Get and check port information */
setHeader(&sPortDef, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
sPortDef.nPortIndex = startPortNumber;
err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in getting OMX_PORT_DEFINITION_TYPE parameter\n", 0);
exit(1);
}
if (sPortDef.eDomain != OMX_PortDomainAudio) {
fprintf(stderr, "Port %d is not an audio port\n", startPortNumber);
exit(1);
}
if (sPortDef.eDir != OMX_DirInput) {
fprintf(stderr, "Port is not an input port\n");
exit(1);
}
if (sPortDef.format.audio.eEncoding == OMX_AUDIO_CodingPCM) {
fprintf(stderr, "Port encoding is PCM\n");
} else {
fprintf(stderr, "Port has unknown encoding\n");
}
/* Create minimum number of buffers for the port */
nBuffers = sPortDef.nBufferCountActual = sPortDef.nBufferCountMin;
fprintf(stderr, "Number of buffers is %d\n", nBuffers);
err = OMX_SetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
if(err != OMX_ErrorNone){
fprintf(stderr, "Error in setting OMX_PORT_PARAM_TYPE parameter\n", 0);
exit(1);
}
if (sPortDef.bEnabled) {
fprintf(stderr, "Port is enabled\n");
} else {
fprintf(stderr, "Port is not enabled\n");
}
/* call to put state into idle before allocating buffers */
err = OMX_SendCommand(handle, OMX_CommandStateSet, OMX_StateIdle, NULL);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on setting state to idle\n");
exit(1);
}
err = OMX_SendCommand(handle, OMX_CommandPortEnable, startPortNumber, NULL);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on setting port to enabled\n");
exit(1);
}
/* Configure buffers for the port */
nBufferSize = sPortDef.nBufferSize;
fprintf(stderr, "%d buffers of size %d\n", nBuffers, nBufferSize);
inBuffers = malloc(nBuffers * sizeof(OMX_BUFFERHEADERTYPE *));
if (inBuffers == NULL) {
fprintf(stderr, "Can't allocate buffers\n");
exit(1);
}
for (n = 0; n < nBuffers; n++) {
err = OMX_AllocateBuffer(handle, inBuffers+n, startPortNumber, NULL,
nBufferSize);
if (err != OMX_ErrorNone) {
fprintf(stderr, "Error on AllocateBuffer %i\n", err);
exit(1);
}
}
/* Make sure we've reached Idle state */
waitFor(OMX_StateIdle);
/* Now try to switch to Executing state */
err = OMX_SendCommand(handle, OMX_CommandStateSet, OMX_StateExecuting, NULL);
if(err != OMX_ErrorNone){
exit(1);
}
/* One buffer is the minimum for Broadcom component, so use that */
pEmptyBuffer = inBuffers[0];
emptyState = 1;
/* Fill and empty buffer */
for (;;) {
int data_read = read(fd, pEmptyBuffer->pBuffer, nBufferSize);
pEmptyBuffer->nFilledLen = data_read;
pEmptyBuffer->nOffset = 0;
filesize -= data_read;
if (data_read <= 0) {
fprintf(stderr, "In the %s no more input data available\n", __func__);
pEmptyBuffer->nFilledLen=0;
pEmptyBuffer->nFlags = OMX_BUFFERFLAG_EOS;
bEOS=OMX_TRUE;
}
fprintf(stderr, "Emptying again buffer %p %d bytes, %d to go\n", pEmptyBuffer, data_read, filesize);
err = OMX_EmptyThisBuffer(handle, pEmptyBuffer);
waitForEmpty();
fprintf(stderr, "Waited for empty\n");
if (bEOS) {
fprintf(stderr, "Exiting loop\n");
break;
}
}
fprintf(stderr, "Buffers emptied\n");
exit(0);
}
Conclusion
The Khronos Group has produced specifications for audio and video on low-capability systems, and these are currently used by Android and on the Raspberry Pi. This chapter has given an introductory overview of the specifications, along with some example programs. The LIM package hasn't been updated since 2012 and the Bellagio package since 2011, so neither appears to be actively maintained. The RPi, on the other hand, is flourishing, and OpenMAX programming using its GPU is covered in detail in my book Raspberry Pi GPU Audio Video Programming.