Linux Sound Programming Tutorial (Part 4)

Original: Linux Sound Programming

License: CC BY-NC-SA 4.0

9. Java Sound

This chapter looks at the main points of programming sampled audio using the Java Sound API. It assumes a good working knowledge of Java applications. Java Sound has been around since the early days of Java. It handles both sampled and MIDI data and is a comprehensive system.

Resources

Many resources are available for Java Sound.

Key Java Sound Classes

These are the key classes:

  • The AudioSystem class is the entry point to all of the sampled audio classes.
  • The AudioFormat class specifies information about the format, such as the sample rate.
  • The AudioInputStream class provides an input stream from a mixer's target line.
  • The Mixer class represents an audio device.
  • The SourceDataLine class represents a line for input to a device.
  • The TargetDataLine class represents a line for output from a device.

Information About Devices

Each device is represented by a Mixer object. Ask the AudioSystem for a list of these. Each mixer has a set of target (output) lines and source (input) lines; query each mixer for them separately. The following program is called DeviceInfo.java:

import javax.sound.sampled.*;

public class DeviceInfo {

    public static void main(String[] args) throws Exception {

        Mixer.Info[] minfoSet = AudioSystem.getMixerInfo();
        System.out.println("Mixers:");
        for (Mixer.Info minfo: minfoSet) {
            System.out.println("   " + minfo.toString());

            Mixer m = AudioSystem.getMixer(minfo);
            System.out.println("    Mixer: " + m.toString());
            System.out.println("      Source lines");
            Line.Info[] slines = m.getSourceLineInfo();
            for (Line.Info s: slines) {
                System.out.println("        " + s.toString());
            }

            Line.Info[] tlines = m.getTargetLineInfo();
            System.out.println("      Target lines");
            for (Line.Info t: tlines) {
                System.out.println("        " + t.toString());
            }
        }
    }
}

Here is partial output from my system:

Mixers:
   PulseAudio Mixer, version 0.02
      Source lines
        interface SourceDataLine supporting 42 audio formats, and buffers of 0 to 1000000 bytes
        interface Clip supporting 42 audio formats, and buffers of 0 to 1000000 bytes
      Target lines
        interface TargetDataLine supporting 42 audio formats, and buffers of 0 to 1000000 bytes
   default [default], version 1.0.24
      Source lines
        interface SourceDataLine supporting 512 audio formats, and buffers of at least 32 bytes
        interface Clip supporting 512 audio formats, and buffers of at least 32 bytes
      Target lines
        interface TargetDataLine supporting 512 audio formats, and buffers of at least 32 bytes
   PCH [plughw:0,0], version 1.0.24
      Source lines
        interface SourceDataLine supporting 24 audio formats, and buffers of at least 32 bytes
        interface Clip supporting 24 audio formats, and buffers of at least 32 bytes
      Target lines
        interface TargetDataLine supporting 24 audio formats, and buffers of at least 32 bytes
   NVidia [plughw:1,3], version 1.0.24
      Source lines
        interface SourceDataLine supporting 96 audio formats, and buffers of at least 32 bytes
        interface Clip supporting 96 audio formats, and buffers of at least 32 bytes
      Target lines
   NVidia [plughw:1,7], version 1.0.24
      Source lines
        interface SourceDataLine supporting 96 audio formats, and buffers of at least 32 bytes
        interface Clip supporting 96 audio formats, and buffers of at least 32 bytes
      Target lines
   NVidia [plughw:1,8], version 1.0.24
      Source lines
        interface SourceDataLine supporting 96 audio formats, and buffers of at least 32 bytes
        interface Clip supporting 96 audio formats, and buffers of at least 32 bytes
      Target lines

This shows the PulseAudio and ALSA mixers. Further querying can show, for example, the formats supported.

Playing Audio from a File

To play from a file, appropriate objects must be created to read from the file and to write to an output device. These are as follows:

  • An AudioInputStream is requested from the AudioSystem. It is created with the file name as a parameter.
  • A source data line is created for output. The terminology can be confusing: the program produces output, but that is input to a data line. So, the data line must be a source for the output device. Creating the data line is a multistep process:
    • First an AudioFormat object is created to specify the parameters for the data line.
    • A DataLine.Info is created for a source data line of the audio format.
    • A source data line that will handle the DataLine.Info is requested from the AudioSystem.

Following these steps, data can be read from the input stream and written to the data line. Figure 9-1 shows the UML class diagram of the relevant classes.

Figure 9-1. Class diagram for playing audio from a file
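In outline, and with all error handling stripped, those steps come down to something like the following sketch (the file name is a placeholder):

import java.io.File;
import javax.sound.sampled.*;

public class PlaySketch {
    public static void main(String[] args) throws Exception {
        // Step 1: an AudioInputStream from the file
        AudioInputStream stream = AudioSystem.getAudioInputStream(new File("sound.wav"));
        AudioFormat format = stream.getFormat();

        // Step 2: describe and request a source data line for that format
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
        SourceDataLine line = (SourceDataLine) AudioSystem.getLine(info);
        line.open(format);
        line.start();

        // Copy from the stream to the line
        byte[] buffer = new byte[4096];
        int n;
        while ((n = stream.read(buffer, 0, buffer.length)) >= 0) {
            line.write(buffer, 0, n);
        }
        line.drain();
        line.close();
    }
}

The full program below adds argument handling, volume control, and error reporting.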

import java.io.File;
import java.io.IOException;

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.FloatControl;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;

public class PlayAudioFile {
    /** Plays audio from given file names. */
    public static void main(String [] args) {
        // Check for given sound file names.
        if (args.length < 1) {
            System.out.println("Usage: java Play <sound file names>*");
            System.exit(0);
        }

        // Process arguments.
        for (int i = 0; i < args.length; i++)
            playAudioFile(args[i]);

        // Must exit explicitly since audio creates non-daemon threads.
        System.exit(0);
    } // main

    public static void playAudioFile(String fileName) {
        File soundFile = new File(fileName);

        try {
            // Create a stream from the given file.
            // Throws IOException or UnsupportedAudioFileException
            AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(soundFile);
            // AudioSystem.getAudioInputStream(inputStream); // alternate audio stream from inputstream
            playAudioStream(audioInputStream);
        } catch (Exception e) {
            System.out.println("Problem with file " + fileName + ":");
            e.printStackTrace();
        }
    } // playAudioFile

    /** Plays audio from the given audio input stream. */
    public static void playAudioStream(AudioInputStream audioInputStream) {
        // Audio format provides information like sample rate, size, channels.
        AudioFormat audioFormat = audioInputStream.getFormat();
        System.out.println("Play input audio format=" + audioFormat);

        // Open a data line to play our type of sampled audio.
        // Use SourceDataLine for play and TargetDataLine for record.
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, audioFormat);
        if (!AudioSystem.isLineSupported(info)) {
            System.out.println("Play.playAudioStream does not handle this type of audio on this system.");
            return;
        }

        try {
            // Create a SourceDataLine for play back (throws LineUnavailableException).
            SourceDataLine dataLine = (SourceDataLine) AudioSystem.getLine(info);
            // System.out.println("SourceDataLine class=" + dataLine.getClass());

            // The line acquires system resources (throws LineUnavailableException).
            dataLine.open(audioFormat);

            // Adjust the volume on the output line.
            if(dataLine.isControlSupported(FloatControl.Type.MASTER_GAIN)) {
                FloatControl volume = (FloatControl) dataLine.getControl(FloatControl.Type.MASTER_GAIN);
                volume.setValue(6.0F);
            }

            // Allows the line to move data in and out to a port.
            dataLine.start();

            // Create a buffer for moving data from the audio stream to the line.
            int bufferSize = (int) audioFormat.getSampleRate() * audioFormat.getFrameSize();
            byte [] buffer = new byte[ bufferSize ];

            // Move the data until done or there is an error.
            try {
                int bytesRead = 0;
                while (bytesRead >= 0) {
                    bytesRead = audioInputStream.read(buffer, 0, buffer.length);
                    if (bytesRead >= 0) {
                        // System.out.println("Play.playAudioStream bytes read=" + bytesRead +
                        // ", frame size=" + audioFormat.getFrameSize() + ", frames read=" + bytesRead / audioFormat.getFrameSize());
                        // Odd sized sounds throw an exception if we don't write the same amount.
                        int framesWritten = dataLine.write(buffer, 0, bytesRead);
                    }
                } // while
            } catch (IOException e) {
                e.printStackTrace();
            }

            System.out.println("Play.playAudioStream draining line.");
            // Continues data line I/O until its buffer is drained.
            dataLine.drain();

            System.out.println("Play.playAudioStream closing line.");
            // Closes the data line, freeing any resources such as the audio device.
            dataLine.close();
        } catch (LineUnavailableException e) {
            e.printStackTrace();
        }
    } // playAudioStream
} // PlayAudioFile

Recording Audio to a File

Most of the work here is in preparing the audio input stream. Once that is done, the AudioSystem method write will copy the input from the audio input stream to the output file.

To prepare the audio input stream, perform the following steps:

  1. Create an AudioFormat object describing the input parameters.
  2. The microphone produces audio, so it needs a TargetDataLine. So, create a DataLine.Info for a target data line.
  3. Ask the AudioSystem for a line satisfying the info.
  4. Wrap the line in an AudioInputStream.

The output is just a Java File.

The stream is then copied to the file using the AudioSystem function write(). Figure 9-2 shows the UML class diagram.

Figure 9-2. UML diagram for recording audio to a file
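In outline, the preparation and the copy come down to something like this sketch (no threading or error handling; the output file name is a placeholder):

import java.io.File;
import javax.sound.sampled.*;

public class RecordSketch {
    public static void main(String[] args) throws Exception {
        // Step 1: describe the input: 44.1 kHz, 16-bit, stereo, little-endian
        AudioFormat format = new AudioFormat(44100.0F, 16, 2, true, false);

        // Steps 2 and 3: ask the AudioSystem for a matching target data line
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
        line.open(format);
        line.start();

        // Step 4: wrap the line in an AudioInputStream and copy it to a WAV file
        // (AudioSystem.write blocks until the line is stopped and closed)
        AudioInputStream stream = new AudioInputStream(line);
        AudioSystem.write(stream, AudioFileFormat.Type.WAVE, new File("out.wav"));
    }
}

The full program that follows runs the copy on its own thread so that recording can be stopped from the keyboard.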

The program is as follows:

import javax.sound.sampled.*;
import java.io.File;

/**
 * Sample audio recorder
 */
public class Recorder extends Thread
{
    /**
     * The TargetDataLine that we’ll use to read data from
     */
    private TargetDataLine line;

    /**
     * The audio format type that we’ll encode the audio data with
     */
    private AudioFileFormat.Type targetType = AudioFileFormat.Type.WAVE;

    /**
     * The AudioInputStream that we’ll read the audio data from
     */
    private AudioInputStream inputStream;

    /**
     * The file that we’re going to write data out to
     */
    private File file;

    /**
     * Creates a new Audio Recorder
     */
    public Recorder(String outputFilename)
    {
        try {
            // Create an AudioFormat that specifies how the recording will be performed
            // In this example we’ll 44.1Khz, 16-bit, stereo
            AudioFormat audioFormat = new AudioFormat(
            AudioFormat.Encoding.PCM_SIGNED,           // Encoding technique
            44100.0F,                                  // Sample Rate
            16,                                        // Number of bits in each channel
            2,                                         // Number of channels (2=stereo)
            4,                                         // Number of bytes in each frame
            44100.0F,                                  // Number of frames per second
            false);                                    // Big-endian (true) or little-
            // endian (false)

            // Create our TargetDataLine that will be used to read audio data by first
            // creating a DataLine instance for our audio format type
            DataLine.Info info = new DataLine.Info(TargetDataLine.class, audioFormat);

            // Next we ask the AudioSystem to retrieve a line that matches the
            // DataLine Info
            this.line = (TargetDataLine)AudioSystem.getLine(info);

            // Open the TargetDataLine with the specified format
            this.line.open(audioFormat);

            // Create an AudioInputStream that we can use to read from the line
            this.inputStream = new AudioInputStream(this.line);

            // Create the output file
            this.file = new File(outputFilename);
        }
        catch(Exception e) {
            e.printStackTrace();
        }
    }

    public void startRecording() {
        // Start the TargetDataLine
        this.line.start();

        // Start our thread
        start();
    }

    public void stopRecording() {
        // Stop and close the TargetDataLine
        this.line.stop();
        this.line.close();
    }

    public void run() {
        try {
            // Ask the AudioSystem class to write audio data from the audio input stream
            // to our file in the specified data type (PCM 44.1Khz, 16-bit, stereo)
            AudioSystem.write(this.inputStream, this.targetType, this.file);
        }
        catch(Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("Usage: Recorder <filename>");
            System.exit(0);
        }

        try {
            // Create a recorder that writes WAVE data to the specified filename
            Recorder r = new Recorder(args[0]);
            System.out.println("Press ENTER to start recording");
            System.in.read();

            // Start the recorder
            r.startRecording();

            System.out.println("Press ENTER to stop recording");
            System.in.read();

            // Stop the recorder
            r.stopRecording();

            System.out.println("Recording complete");
        }
        catch(Exception e) {
            e.printStackTrace();
        }
    }

}

Playing the Microphone to the Speaker

This is a combination of the previous two programs. An AudioInputStream is prepared for reading from the microphone, and a SourceDataLine is prepared for writing to the speaker. Data is copied from the first to the second by reading from the audio input stream and writing to the source data line. Figure 9-3 shows the UML class diagram.

Figure 9-3. UML diagram for sending microphone input to a speaker

The program is as follows:

import java.io.File;
import java.io.IOException;

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.Line;
import javax.sound.sampled.Line.Info;
import javax.sound.sampled.TargetDataLine;
import javax.sound.sampled.FloatControl;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;

public class PlayMicrophone {
    private static final int FRAMES_PER_BUFFER = 1024;

    public static void main(String[] args) throws Exception {

        new PlayMicrophone().playAudio();

    }

    private void out(String strMessage)
    {
        System.out.println(strMessage);
    }

  //This method creates and returns an
  // AudioFormat object for a given set of format
  // parameters.  If these parameters don't work
  // well for you, try some of the other
  // allowable parameter values, which are shown
  // in comments following the declarations.
  private  AudioFormat getAudioFormat(){
    float sampleRate = 44100.0F;    //8000,11025,16000,22050,44100
    int sampleSizeInBits = 16;      //8,16
    int channels = 1;               //1,2
    boolean signed = true;          //true,false
    boolean bigEndian = false;      //true,false
    return new AudioFormat(sampleRate,
                           sampleSizeInBits,
                           channels,
                           signed,
                           bigEndian);
  }//end getAudioFormat

    public void playAudio() throws Exception {
        AudioFormat audioFormat;
        TargetDataLine targetDataLine;

        audioFormat = getAudioFormat();
        DataLine.Info dataLineInfo =
            new DataLine.Info(
                              TargetDataLine.class,
                              audioFormat);
        targetDataLine = (TargetDataLine)
            AudioSystem.getLine(dataLineInfo);

        /*
        Line.Info lines[] = AudioSystem.getTargetLineInfo(dataLineInfo);
        for (int n = 0; n < lines.length; n++) {
            System.out.println("Target " + lines[n].toString() + " " + lines[n].getLineClass());
        }
        targetDataLine = (TargetDataLine)
            AudioSystem.getLine(lines[0]);
        */

        targetDataLine.open(audioFormat,
                            audioFormat.getFrameSize() * FRAMES_PER_BUFFER);
        targetDataLine.start();

        playAudioStream(new AudioInputStream(targetDataLine));

        /*
        File soundFile = new File( fileName );

        try {
            // Create a stream from the given file.
            // Throws IOException or UnsupportedAudioFileException
            AudioInputStream audioInputStream = AudioSystem.getAudioInputStream( soundFile );
            // AudioSystem.getAudioInputStream( inputStream ); // alternate audio stream from inputstream
            playAudioStream( audioInputStream );
        } catch ( Exception e ) {
            System.out.println( "Problem with file " + fileName + ":" );
            e.printStackTrace();
        }
        */
    } // playAudioFile

    /** Plays audio from the given audio input stream. */
    public void playAudioStream( AudioInputStream audioInputStream ) {
        // Audio format provides information like sample rate, size, channels.
        AudioFormat audioFormat = audioInputStream.getFormat();
        System.out.println( "Play input audio format=" + audioFormat );

        // Open a data line to play our type of sampled audio.
        // Use SourceDataLine for play and TargetDataLine for record.
        DataLine.Info info = new DataLine.Info( SourceDataLine.class, audioFormat );

        Line.Info lines[] = AudioSystem.getSourceLineInfo(info);
        for (int n = 0; n < lines.length; n++) {
            System.out.println("Source " + lines[n].toString() + " " + lines[n].getLineClass());
        }

        if ( !AudioSystem.isLineSupported( info ) ) {
            System.out.println( "Play.playAudioStream does not handle this type of audio on this system." );
            return;
        }

        try {
            // Create a SourceDataLine for play back (throws LineUnavailableException).
            SourceDataLine dataLine = (SourceDataLine) AudioSystem.getLine( info );
            // System.out.println( "SourceDataLine class=" + dataLine.getClass() );

            // The line acquires system resources (throws LineUnavailableException).
            dataLine.open( audioFormat,
                           audioFormat.getFrameSize() * FRAMES_PER_BUFFER);

            // Adjust the volume on the output line.
            if( dataLine.isControlSupported( FloatControl.Type.MASTER_GAIN ) ) {
                FloatControl volume = (FloatControl) dataLine.getControl( FloatControl.Type.MASTER_GAIN );
                volume.setValue( 6.0F );
            }

            // Allows the line to move data in and out to a port.
            dataLine.start();

            // Create a buffer for moving data from the audio stream to the line.
            int bufferSize = (int) audioFormat.getSampleRate() * audioFormat.getFrameSize();
            bufferSize =  audioFormat.getFrameSize() * FRAMES_PER_BUFFER;
            System.out.println("Buffer size: " + bufferSize);
            byte [] buffer = new byte[ bufferSize ];

            // Move the data until done or there is an error.
            try {
                int bytesRead = 0;
                while ( bytesRead >= 0 ) {
                    bytesRead = audioInputStream.read( buffer, 0, buffer.length );
                    if ( bytesRead >= 0 ) {
                        System.out.println( "Play.playAudioStream bytes read=" + bytesRead +
                        ", frame size=" + audioFormat.getFrameSize() + ", frames read=" + bytesRead / audioFormat.getFrameSize() );
                        // Odd sized sounds throw an exception if we don't write the same amount.
                        int framesWritten = dataLine.write( buffer, 0, bytesRead );
                    }
                } // while
            } catch ( IOException e ) {
                e.printStackTrace();
            }

            System.out.println( "Play.playAudioStream draining line." );
            // Continues data line I/O until its buffer is drained.
            dataLine.drain();

            System.out.println( "Play.playAudioStream closing line." );
            // Closes the data line, freeing any resources such as the audio device.
            dataLine.close();
        } catch ( LineUnavailableException e ) {
            e.printStackTrace();
        }
    } // playAudioStream

}

Where Does Java Sound Get Its Devices From?

The first program in this chapter showed a list of mixer devices and their properties. How does Java get this information? This section covers JDK 1.8; OpenJDK will presumably be similar. You will need the Java source code from Oracle to trace this. Alternatively, just read on.

The file jre/lib/resources.jar contains a list of resources used by the JRE at runtime. It is a zip file, and it contains the file META-INF/services/javax.sound.sampled.spi.MixerProvider. On my system, the contents of this file are as follows:

# last mixer is default mixer
com.sun.media.sound.PortMixerProvider
com.sun.media.sound.DirectAudioDeviceProvider
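If you want to check this on your own system, the file can be read straight out of the JAR with unzip (assuming JAVA_HOME points at a JDK 1.8 installation):

unzip -p $JAVA_HOME/jre/lib/resources.jar META-INF/services/javax.sound.sampled.spi.MixerProvider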

com.sun.media.sound.PortMixerProvider is in the file java/media/src/share/native/com/sun/media/sound/PortMixerProvider.java on my system. It extends MixerProvider and implements methods such as Mixer.Info[] getMixerInfo. This class stores the device information.

Most of the work done by this class is actually performed by native methods in the C file java/media/src/share/native/com/sun/media/sound/PortMixerProvider.c, which implements the two methods nGetNumDevices and nNewPortMixerInfo used by the PortMixerProvider class. Unfortunately, there is not much joy to be found in this C file, as it just calls the C functions PORT_GetPortMixerCount and PORT_GetPortMixerDescription.

Three files contain these functions:

java/media/src/windows/native/com/sun/media/sound/PLATFORM_API_WinOS_Ports.c
java/media/src/solaris/native/com/sun/media/sound/PLATFORM_API_SolarisOS_Ports.c
java/media/src/solaris/native/com/sun/media/sound/PLATFORM_API_LinuxOS_ALSA_Ports.c

In the file PLATFORM_API_LinuxOS_ALSA_Ports.c you will see the calls into ALSA described in Chapter 5. These calls fill in the information about ALSA devices for use by Java Sound.

Conclusion

The Java Sound API is well documented. I have shown four simple programs here, but far more complex programs are possible. The linkage to the underlying sound system was discussed briefly.

10. GStreamer

GStreamer is a library of components that can be hooked together in complex pipelines. It can be used for filtering, converting formats, and mixing. It can handle both audio and video formats, but this chapter deals only with audio. It looks at the user-level mechanisms for using GStreamer, as well as the programming model for linking GStreamer components. A reference is given for writing new components.

Resources

Here are some resources:

Overview

GStreamer uses a pipeline model that links together elements: sources, filters, and sinks. Figure 10-1 shows the model.

Figure 10-1. GStreamer pipeline model

Each element has zero or more pads, which may be source pads that produce data or sink pads that consume data, as shown in Figure 10-2.

Figure 10-2. GStreamer source and sink pads

Pads may be static, or they may be created and destroyed dynamically in response to events. For example, to handle a container file (such as MP4), an element must read enough of the file's contents before it can determine the format of the contained objects, such as H.264 video. Once it has done so, it can create a source pad for the next stage to consume data from.

GStreamer is not restricted to linear pipelines like the command language bash is. For example, a demultiplexer may need to separate audio and video and process each of them separately, as shown in Figure 10-3.

Figure 10-3. Complex GStreamer pipeline

Elements follow a state model, with the following states:

  • GST_STATE_NULL
  • GST_STATE_READY
  • GST_STATE_PAUSED
  • GST_STATE_PLAYING

Typically an element is created and then moved from NULL to PLAYING. The other states allow finer control.
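A minimal sketch of that lifecycle in C looks like the following (alsasink is chosen here purely as an illustrative element; any element behaves the same way):

#include <gst/gst.h>

int main(int argc, char *argv[]) {
    gst_init(&argc, &argv);

    /* A newly created element starts in the NULL state */
    GstElement *sink = gst_element_factory_make("alsasink", "sink");
    if (sink == NULL)
        return 1;

    /* Move it up to PLAYING and, when finished, back down to NULL */
    gst_element_set_state(sink, GST_STATE_PLAYING);
    /* ... use the element ... */
    gst_element_set_state(sink, GST_STATE_NULL);

    gst_object_unref(sink);
    return 0;
}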

Elements can also generate events that carry information about the state of the data stream. Events are usually handled internally, but they can also be monitored, for example events signaling the end of a stream or the format of a stream.

A plugin is a loadable block of code. Usually a plugin contains the implementation of a single element, but it may contain more.

Each pad has a list of associated capabilities. Each capability is a statement of what the pad can handle. This includes information about the type of data (for example, audio/x-raw), the format (S32LE, U32LE, S16LE, U16LE, and so on), the data rate (for example, 1–2147483647 bits per second), and so on. When a source pad is linked to a sink pad, these capabilities are used to work out how the elements will communicate.
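Capabilities can also be imposed from application code when linking two elements. Here is a minimal sketch, assuming src and sink are elements that have already been created; gst_element_link_filtered restricts the link to the given caps:

/* Restrict a link to 16-bit little-endian, 44,100 Hz, stereo raw audio */
GstCaps *caps = gst_caps_new_simple("audio/x-raw",
                                    "format", G_TYPE_STRING, "S16LE",
                                    "rate", G_TYPE_INT, 44100,
                                    "channels", G_TYPE_INT, 2,
                                    NULL);
if (!gst_element_link_filtered(src, sink, caps))
    g_printerr("Could not link elements with these caps\n");
gst_caps_unref(caps);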

Command-Line Processing

There are three levels of working with GStreamer: using the command line, writing C programs (or Python, Perl, C++, and so on) to link elements together, or writing new elements. This section looks at the command-line tools.

gst-inspect

The command gst-inspect (on my Ubuntu system, gst-inspect-1.0), run without arguments, shows a list of plugins, their elements, and a short description of each. Here is a brief extract:

...
audiomixer:  liveadder: AudioMixer
audioparsers:  aacparse: AAC audio stream parser
audioparsers:  ac3parse: AC3 audio stream parser
audioparsers:  amrparse: AMR audio stream parser
audioparsers:  dcaparse: DTS Coherent Acoustics audio stream parser
audioparsers:  flacparse: FLAC audio parser
audioparsers:  mpegaudioparse: MPEG1 Audio Parser
audioparsers:  sbcparse: SBC audio parser
audioparsers:  wavpackparse: Wavpack audio stream parser
audiorate:  audiorate: Audio rate adjuster
...

This shows that the plugin audioparsers contains a number of elements, such as aacparse, which is an "AAC audio stream parser."

When run with a plugin as an argument, gst-inspect shows more detail about the plugin.

$gst-inspect-1.0 audioparsers
Plugin Details:
  Name                     audioparsers
  Description              Parsers for various audio formats
  Filename                 /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstaudioparsers.so
  Version                  1.8.1
  License                  LGPL
  Source module            gst-plugins-good
  Source release date      2016-04-20
  Binary package           GStreamer Good Plugins (Ubuntu)
  Origin URL               https://launchpad.net/distros/ubuntu/+source/gst-plugins-good1.0

  aacparse: AAC audio stream parser
  amrparse: AMR audio stream parser
  ac3parse: AC3 audio stream parser
  dcaparse: DTS Coherent Acoustics audio stream parser
  flacparse: FLAC audio parser
  mpegaudioparse: MPEG1 Audio Parser
  sbcparse: SBC audio parser
  wavpackparse: Wavpack audio stream parser

  8 features:
  +-- 8 elements

Note in particular that it comes from the module gst-plugins-good. Plugins are classified by stability, licensing, and so on.

When run with an element as an argument, gst-inspect shows substantial information about the element.

$gst-inspect-1.0 aacparse
Factory Details:
  Rank                     primary + 1 (257)
  Long-name                AAC audio stream parser
  Klass                    Codec/Parser/Audio
  Description              Advanced Audio Coding parser
  Author                   Stefan Kost <stefan.kost@nokia.com>

Plugin Details:
  Name                     audioparsers
  Description              Parsers for various audio formats
  Filename                 /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstaudioparsers.so
  Version                  1.8.1
  License                  LGPL
  Source module            gst-plugins-good
  Source release date      2016-04-20
  Binary package           GStreamer Good Plugins (Ubuntu)
  Origin URL               https://launchpad.net/distros/ubuntu/+source/gst-plugins-good1.0

GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstBaseParse
                         +----GstAacParse

Pad Templates:
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      audio/mpeg
            mpegversion: { 2, 4 }

  SRC template: 'src'
    Availability: Always
    Capabilities:
      audio/mpeg
                 framed: true
            mpegversion: { 2, 4 }
          stream-format: { raw, adts, adif, loas }

Element Flags:
  no flags set

Element Implementation:
  Has change_state() function: gst_base_parse_change_state

Element has no clocking capabilities.
Element has no URI handling capabilities.

Pads:
  SINK: 'sink'
    Pad Template: 'sink'
  SRC: 'src'
    Pad Template: 'src'

Element Properties:
  name                : The name of the object
                        flags: readable, writable
                        String. Default: "aacparse0"
  parent              : The parent of the object
                        flags: readable, writable
                        Object of type "GstObject"
  disable-passthrough : Force processing (disables passthrough)
                        flags: readable, writable
                        Boolean. Default: false

This shows that it can take audio/mpeg version 2 or 4 and convert the data to audio/mpeg version 2 or 4 in a variety of stream formats.

gst-discoverer

The command gst-discoverer (on my system, gst-discoverer-1.0) can be used to give information about a resource, such as a file or a URI. On an audio file called audio_01.ogg, it gives the following:

$gst-discoverer-1.0 enigma/audio_01.ogg
Analyzing file:enigma/audio_01.ogg
Done discovering file:enigma/audio_01.ogg

Topology:
  container: Ogg
    audio: Vorbis

Properties:
  Duration: 0:02:03.586666666
  Seekable: yes
  Tags:
      encoder: Xiph.Org libVorbis I 20020717
      encoder version: 0
      audio codec: Vorbis
      nominal bitrate: 112001
      bitrate: 112001
      container format: Ogg

gst-device-monitor

This command can give copious information about the devices on your system:

$gst-device-monitor-1.0
Probing devices...

Device found:

        name  : Monitor of Built-in Audio Digital Stereo (HDMI)
        class : Audio/Source
        caps  : audio/x-raw, format=(string){ S16LE, S16BE, F32LE, F32BE, S32LE, S32BE, S24LE, S24BE, S24_32LE, S24_32BE, U8 }, layout=(string)interleaved, rate=(int)[ 1, 2147483647 ], channels=(int)[ 1, 32 ];
                audio/x-alaw, rate=(int)[ 1, 2147483647 ], channels=(int)[ 1, 32 ];
                audio/x-mulaw, rate=(int)[ 1, 2147483647 ], channels=(int)[ 1, 32 ];
        properties:
                device.description = "Monitor\ of\ Built-in\ Audio\ Digital\ Stereo\ \(HDMI\)"
                device.class = monitor
                alsa.card = 0
                alsa.card_name = "HDA\ Intel\ HDMI"
                alsa.long_card_name = "HDA\ Intel\ HDMI\ at\ 0xf7214000\ irq\ 52"
                alsa.driver_name = snd_hda_intel
                device.bus_path = pci-0000:00:03.0
                sysfs.path = /devices/pci0000:00/0000:00:03.0/sound/card0
                device.bus = pci
                device.vendor.id = 8086
                device.vendor.name = "Intel\ Corporation"
                device.product.id = 160c
                device.product.name = "Broadwell-U\ Audio\ Controller"
                device.form_factor = internal
                device.string = 0
                module-udev-detect.discovered = 1
                device.icon_name = audio-card-pci
...

That is copious information about the audio capabilities of my HDMI monitor, followed by further information about the audio and video capabilities of my other devices.

gst-play

This program is a one-stop shop for playing a variety of media files and URIs, as follows:

      $gst-play-1.0 enigma/audio_01.ogg

gst-launch

The gst-launch program allows you to build a command pipeline to process media data. The format is as follows:

      gst-launch <elmt> [<args>] ! <elmt> [<args>] ! ...

For example, to play a WAV file through ALSA, use the following:

      $gst-launch-1.0 filesrc location=enigma/audio_01.wav ! wavparse ! alsasink

The hardest part of working with GStreamer pipelines seems to be choosing suitable plugins; this looks to be a bit of an art. See the GStreamer cheat sheet at http://wiki.oz9aec.net/index.php/Gstreamer_cheat_sheet for help.

For example, Ogg files are a container format that will often contain a Vorbis audio stream and a Theora video stream (although they can contain other data formats). To play the audio, or the video, or both, the streams must be extracted from the container with a demultiplexer, decoded, and then played. There are multiple ways of playing the audio, including the following three:

    $gst-launch-1.0 filesrc location=enigma/audio_01.ogg ! oggdemux ! vorbisdec ! audioconvert ! alsasink

    $gst-launch-1.0 filesrc location=enigma/audio_01.ogg ! oggdemux ! vorbisdec ! autoaudiosink

    $gst-launch-1.0 uridecodebin uri=file:enigma/audio_01.ogg ! audioconvert ! autoaudiosink

The syntax of GStreamer pipelines allows a pipeline to be split into multiple branches, for example to manage audio and video streams separately. This is covered in the GStreamer documentation online.
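As a sketch of the idea, the following pipeline (the file name and element choices are illustrative and would need to match the actual media) names the demuxer and then hangs an audio branch and a video branch off it:

$gst-launch-1.0 filesrc location=file.ogg ! oggdemux name=demux \
    demux. ! queue ! vorbisdec ! audioconvert ! autoaudiosink \
    demux. ! queue ! theoradec ! videoconvert ! autovideosink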

Programming

The same pipeline principles apply as with gst-launch, but of course there is more plumbing to take care of at the C programming level. The following program, from the GStreamer SDK Basic tutorials at http://docs.gstreamer.com/display/GstSDK/Basic+tutorials, does the same as the last gst-launch example ($gst-launch-1.0 uridecodebin uri=... ! audioconvert ! autoaudiosink).

GStreamer elements are created by calls such as this:

data.source = gst_element_factory_make ("uridecodebin", "source");

The pipeline is built with this:

data.pipeline = gst_pipeline_new ("test-pipeline")
gst_bin_add_many (GST_BIN (data.pipeline), data.source, data.convert , data.sink, NULL);

Eventually all the elements must be linked. For now, convert and sink can be linked with the following:

gst_element_link (data.convert, data.sink)

The URI to play is set as follows:

g_object_set (data.source, "uri", "http://docs.gstreamer.com/media/sintel_trailer-480p.webm", NULL);

The data source is a container; in my earlier example it was an Ogg container, and here it is web media by URL. No source pad is created on the data source element until enough data has been read to determine the format and parameters of the data. Consequently, the C program must add an event handler for pad-added, which it does like this:

g_signal_connect (data.source, "pad-added", G_CALLBACK (pad_added_handler), &data);

When a pad is added to the source, pad_added_handler will be called. This does a lot of type checking and gets the new pad, but eventually performs the key step of linking the source and convert elements:

gst_pad_link (new_pad, sink_pad)

The application then starts playback by changing the state to PLAYING and waits for normal termination (GST_MESSAGE_EOS) or other messages:

gst_element_set_state (data.pipeline, GST_STATE_PLAYING);
bus = gst_element_get_bus (data.pipeline);
msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
        GST_MESSAGE_STATE_CHANGED | GST_MESSAGE_ERROR | GST_MESSAGE_EOS);

The last part of the code does the cleanup. The complete program is as follows:

#include <gst/gst.h>

/* Structure to contain all our information, so we can pass it to callbacks */
typedef struct _CustomData {
  GstElement *pipeline;
  GstElement *source;
  GstElement *convert;
  GstElement *sink;
} CustomData;

/* Handler for the pad-added signal */
static void pad_added_handler (GstElement *src, GstPad *pad, CustomData *data);

int main(int argc, char *argv[]) {
  CustomData data;
  GstBus *bus;
  GstMessage *msg;
  GstStateChangeReturn ret;
  gboolean terminate = FALSE;

  /* Initialize GStreamer */
  gst_init (&argc, &argv);

  /* Create the elements */
  data.source = gst_element_factory_make ("uridecodebin", "source");
  data.convert = gst_element_factory_make ("audioconvert", "convert");
  data.sink = gst_element_factory_make ("autoaudiosink", "sink");

  /* Create the empty pipeline */
  data.pipeline = gst_pipeline_new ("test-pipeline");

  if (!data.pipeline || !data.source || !data.convert || !data.sink) {
    g_printerr ("Not all elements could be created.\n");
    return -1;
  }

  /* Build the pipeline. Note that we are NOT linking the source at this
   * point. We will do it later. */
  gst_bin_add_many (GST_BIN (data.pipeline), data.source, data.convert , data.sink, NULL);
  if (!gst_element_link (data.convert, data.sink)) {
    g_printerr ("Elements could not be linked.\n");
    gst_object_unref (data.pipeline);
    return -1;
  }

  /* Set the URI to play */
  g_object_set (data.source, "uri", "http://docs.gstreamer.com/media/sintel_trailer-480p.webm", NULL);

  /* Connect to the pad-added signal */
  g_signal_connect (data.source, "pad-added", G_CALLBACK (pad_added_handler), &data);

  /* Start playing */
  ret = gst_element_set_state (data.pipeline, GST_STATE_PLAYING);
  if (ret == GST_STATE_CHANGE_FAILURE) {
    g_printerr ("Unable to set the pipeline to the playing state.\n");
    gst_object_unref (data.pipeline);
    return -1;
  }

  /* Listen to the bus */
  bus = gst_element_get_bus (data.pipeline);
  do {
    msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
        GST_MESSAGE_STATE_CHANGED | GST_MESSAGE_ERROR | GST_MESSAGE_EOS);

    /* Parse message */
    if (msg != NULL) {
      GError *err;
      gchar *debug_info;

      switch (GST_MESSAGE_TYPE (msg)) {
        case GST_MESSAGE_ERROR:
          gst_message_parse_error (msg, &err, &debug_info);
          g_printerr ("Error received from element %s: %s\n", GST_OBJECT_NAME (msg->src), err->message);
          g_printerr ("Debugging information: %s\n", debug_info ? debug_info : "none");
          g_clear_error (&err);
          g_free (debug_info);
          terminate = TRUE;
          break;
        case GST_MESSAGE_EOS:
          g_print ("End-Of-Stream reached.\n");
          terminate = TRUE;
          break;
        case GST_MESSAGE_STATE_CHANGED:
          /* We are only interested in state-changed messages from the pipeline */
          if (GST_MESSAGE_SRC (msg) == GST_OBJECT (data.pipeline)) {
            GstState old_state, new_state, pending_state;
            gst_message_parse_state_changed (msg, &old_state, &new_state, &pending_state);
            g_print ("Pipeline state changed from %s to %s:\n",
                gst_element_state_get_name (old_state), gst_element_state_get_name (new_state));
          }
          break;
        default:
          /* We should not reach here */
          g_printerr ("Unexpected message received.\n");
          break;
      }
      gst_message_unref (msg);
    }
  } while (!terminate);

  /* Free resources */
  gst_object_unref (bus);
  gst_element_set_state (data.pipeline, GST_STATE_NULL);
  gst_object_unref (data.pipeline);
  return 0;
}

/* This function will be called by the pad-added signal */
static void pad_added_handler (GstElement *src, GstPad *new_pad, CustomData *data) {
  GstPad *sink_pad = gst_element_get_static_pad (data->convert, "sink");
  GstPadLinkReturn ret;
  GstCaps *new_pad_caps = NULL;
  GstStructure *new_pad_struct = NULL;
  const gchar *new_pad_type = NULL;

  g_print ("Received new pad '%s' from '%s':\n", GST_PAD_NAME (new_pad), GST_ELEMENT_NAME (src));

  /* If our converter is already linked, we have nothing to do here */
  if (gst_pad_is_linked (sink_pad)) {
    g_print ("  We are already linked. Ignoring.\n");
    goto exit;
  }

  /* Check the new pad's type */
  new_pad_caps = gst_pad_get_current_caps (new_pad);
  new_pad_struct = gst_caps_get_structure (new_pad_caps, 0);
  new_pad_type = gst_structure_get_name (new_pad_struct);
  if (!g_str_has_prefix (new_pad_type, "audio/x-raw")) {
    g_print ("  It has type '%s' which is not raw audio. Ignoring.\n", new_pad_type);
    goto exit;
  }

  /* Attempt the link */
  ret = gst_pad_link (new_pad, sink_pad);
  if (GST_PAD_LINK_FAILED (ret)) {
    g_print ("  Type is '%s' but link failed.\n", new_pad_type);
  } else {
    g_print ("  Link succeeded (type '%s').\n", new_pad_type);
  }

exit:
  /* Unreference the new pad's caps, if we got them */
  if (new_pad_caps != NULL)
    gst_caps_unref (new_pad_caps);

  /* Unreference the sink pad */
  gst_object_unref (sink_pad);
}

Writing a Plugin

Writing a new GStreamer plugin is a substantial task. The "GStreamer Plugin Writer's Guide" at https://gstreamer.freedesktop.org/data/doc/gstreamer/head/pwg/html/index.html gives extensive advice.

Conclusion

This chapter looked at using GStreamer from the command line and from an example C program. There is an enormous list of plugins available that can satisfy many of the needs of audio/video developers. I have only scratched the surface of GStreamer, which has many other features, including integration with the GTK toolkit.

11. libao

According to the libao documentation (www.xiph.org/ao/doc/overview.html), "Libao is designed to make it easy to do simple audio output using various audio devices and libraries. For this reason, complex audio control features are missing and will probably never be added. However, if you just want to open whatever audio device is available and play sound, libao should be just fine."

Resources

Check out the following:

libao

libao is a minimal library; basically it just plays audio data. It cannot decode any of the standard file formats: there is no support for WAV, MP3, Vorbis, and so on. You have to configure the format parameters of bits, channels, rate, and byte format and then send appropriately formatted data to the device. Its principal use is to output PCM data, either after a codec has decoded it or for playing simple sounds such as sine waves.

Here is a simple example from the libao web site that plays one second of a sine tone:

/*
 *
 * ao_example.c
 *
 *     Written by Stan Seibert - July 2001
 *
 * Legal Terms:
 *
 *     This source file is released into the public domain.  It is
 *     distributed without any warranty; without even the implied
 *     warranty * of merchantability or fitness for a particular
 *     purpose.
 *
 * Function:
 *
 *     This program opens the default driver and plays a 440 Hz tone for
 *     one second.
 *
 * Compilation command line (for Linux systems):
 *
 *     gcc -lao -ldl -lm -o ao_example ao_example.c
 *
 */

#include <stdio.h>
#include <stdlib.h>   /* for calloc() */
#include <string.h>   /* for memset() */
#include <ao/ao.h>
#include <math.h>

#define BUF_SIZE 4096

int main(int argc, char **argv)
{
        ao_device *device;
        ao_sample_format format;
        int default_driver;
        char *buffer;
        int buf_size;
        int sample;
        float freq = 440.0;
        int i;

        /* -- Initialize -- */

        fprintf(stderr, "libao example program\n");

        ao_initialize();

        /* -- Setup for default driver -- */

        default_driver = ao_default_driver_id();

        memset(&format, 0, sizeof(format));
        format.bits = 16;
        format.channels = 2;
        format.rate = 44100;
        format.byte_format = AO_FMT_LITTLE;

        /* -- Open driver -- */
        device = ao_open_live(default_driver, &format, NULL /* no options */);
        if (device == NULL) {
                fprintf(stderr, "Error opening device.\n");
                return 1;
        }

        /* -- Play some stuff -- */
        buf_size = format.bits/8 * format.channels * format.rate;
        buffer = calloc(buf_size,
                        sizeof(char));

        for (i = 0; i < format.rate; i++) {
                sample = (int)(0.75 * 32768.0 *
                        sin(2 * M_PI * freq * ((float) i/format.rate)));

                /* Put the same stuff in left and right channel */
                buffer[4*i] = buffer[4*i+2] = sample & 0xff;
                buffer[4*i+1] = buffer[4*i+3] = (sample >> 8) & 0xff;
        }
        ao_play(device, buffer, buf_size);

        /* -- Close and shutdown -- */
        ao_close(device);

        ao_shutdown();

  return (0);
}

Conclusion

libao is not complex; it is a basic library for playing sound on whatever device is available. It will suit situations where you have sound in a known PCM format.

12. FFmpeg/Libav

According to the FFmpeg tutorial for beginners at http://keycorner.org/pub/text/doc/ffmpegtutorial.htm, FFmpeg is a complete, cross-platform command-line tool capable of recording, converting, and streaming digital audio and video in various formats. It can be used to do most multimedia tasks quickly and easily, such as audio compression, audio/video format conversion, extracting images from a video, and so on.

FFmpeg consists of a set of command-line tools and a set of libraries that can be used to convert audio (and video) files from one format to another. It works on both containers and codecs. It is not designed for playing or recording audio; it is more of a general-purpose conversion tool.

Resources

The FFmpeg/Libav Dispute

FFmpeg started in 2000 to provide libraries and programs for handling multimedia data. However, in the last few years there have been disputes among the developers, which led to the fork of the Libav project in 2011. The two projects have continued since then, nearly in parallel and frequently borrowing from each other. However, the situation remains strained, with no resolution in sight.

This is unfortunate for developers. While programs are generally portable between the two systems, there are sometimes differences in APIs and behavior. There is also the question of distribution support: for years, Debian and its derivatives supported only Libav, ignoring FFmpeg. That has now changed, and both are supported. See "Why Debian Returned to FFmpeg" (https://lwn.net/Articles/650816/) for a discussion of some of these issues.

The FFmpeg Command-Line Tools

The principal FFmpeg tool is ffmpeg itself. Its simplest use is as a converter from one format to another, as follows:

        ffmpeg -i file.ogg file.mp3

This will convert the Ogg container of Vorbis codec data to an MPEG container of MP2 codec data.

The Libav equivalent is avconv, which runs in a similar way:

      avconv -i file.ogg file.mp3

Internally, ffmpeg uses a pipeline of modules, as shown in Figure 12-1.

Figure 12-1. FFmpeg/Libav pipeline (Source: http://ffmpeg.org/ffmpeg.html)

If the defaults are not suitable, the muxers/demuxers and the decoders/encoders can be set using options.
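For example, something like the following picks the audio encoder and bit rate explicitly instead of relying on the defaults (assuming an FFmpeg build that includes libmp3lame):

ffmpeg -i file.ogg -codec:a libmp3lame -b:a 128k file.mp3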

Here are the other commands:

  • ffprobe gives information about a file.
  • ffplay is a simple media player.
  • ffserver is a media server.

Design

There are many libraries available for FFmpeg/Libav programming. Libav builds the following libraries:

  • libavcodec
  • libavdevice
  • libavfilter
  • libavformat
  • libavresample
  • libavutil

FFmpeg builds the following:

  • libavcodec
  • libavdevice
  • libavfilter
  • libavformat
  • libavresample
  • libavutil
  • libpostproc
  • libswresample
  • libswscale

The extra libraries in FFmpeg deal with video postprocessing and scaling.

Using either of these systems is not a trivial exercise. The Libav site states, "Libav is always a very experimental and developer-driven project. It is a key component in many multimedia projects and new features are added constantly. To provide a stable foundation, major releases are cut every four to six months and maintained for at least two years."

The FFmpeg site states, "FFmpeg is always a very experimental and developer-driven project. It is a key component in many multimedia projects and has new features added constantly. Development branch snapshots work really well 99% of the time so people are not afraid to use them."

My experience is that the "experimental" nature of the two projects leads to an unstable core API, with key functions regularly deprecated and replaced. For example, the function avcodec_decode_audio of libavcodec is, by version 56, up to its fourth version: avcodec_decode_audio4. Even that is now deprecated in the upstream releases (version 57) of both FFmpeg and Libav, in favor of functions such as avcodec_send_packet, which do not exist in version 56. On top of that, there are two projects with the same aims and much the same API, but not always: for example, FFmpeg has swr_alloc_set_opts, whereas Libav uses av_opt_set_int. And the audio-visual codecs and containers themselves keep evolving as well.
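For what it's worth, the replacement API pairs avcodec_send_packet with avcodec_receive_frame. A decode loop in the newer style is shaped roughly as follows; this is a sketch against the version 57 API, reusing the variable names of the program later in this chapter, not a drop-in replacement for it:

/* Feed one packet to the decoder, then drain every frame it produces */
int ret = avcodec_send_packet(context, &avpkt);
while (ret >= 0) {
    ret = avcodec_receive_frame(context, decoded_frame);
    if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
        break;          /* decoder needs more input, or the stream has ended */
    if (ret < 0)
        exit(1);        /* a real decoding error */
    /* ... use decoded_frame ... */
}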

The upshot is that many of the example programs on the Internet no longer compile, use deprecated APIs, or belong to the "other" system. This is not meant to disparage two superbly accomplished systems, just to wish the situation weren't such a mess.

Decoding an MP3 File

The following program decodes an MP3 file into a raw PCM file. It is about the simplest task that can be done with FFmpeg/Libav, and unfortunately it isn't simple. First, you have to take care about whether you are dealing with a codec or with a file containing codec data. This is not an FFmpeg/Libav issue but a general one.

Files with extensions such as .mpg or .mp3 may contain a number of different formats. If I run the command file over some files I have, I get different results.

BST.mp3: MPEG ADTS, layer III, v1, 128 kbps, 44.1 kHz, Stereo
Beethoven_Fr_Elise.mp3: MPEG ADTS, layer III, v1, 128 kbps, 44.1 kHz, Stereo
Angel-no-vocal.mp3: Audio file with ID3 version 2.3.0
01DooWackaDoo.mp3: Audio file with ID3 version 2.3.0, \
    contains: MPEG ADTS, layer III, v1, 224 kbps, 44.1 kHz, JntStereo

The first two files contain just a codec and can be managed by the following program. The third and fourth files are container files, containing MPEG+ID3 data. These need to be managed using avformat functions such as av_read_frame.1

The program is essentially a standard example from the FFmpeg/Libav source distributions. It is based on ffmpeg-3.2/doc/examples/decoding_encoding.c from the FFmpeg source and libav-12/doc/examples/avcodec.c from the Libav source. Incidentally, both programs use avcodec_decode_audio4, which is deprecated in both upstream versions, and there are no examples of the replacement function avcodec_send_packet.

A more serious problem is that MP3 files increasingly use a planar format, in which the different channels are in different planes. The FFmpeg/Libav function avcodec_decode_audio4 handles this correctly by placing each plane in a separate data array, but when the result is output as PCM data, the planes have to be interleaved. The examples don't do this, which can lead to incorrect PCM data (lots of clicks, then half-speed audio).

The relevant FFmpeg functions are as follows:

  • av_register_all: Registers all the possible muxers, demuxers, and protocols.
  • avformat_open_input: Opens the input stream.
  • av_find_stream_info: Extracts the stream information.
  • av_init_packet: Sets default values in the packet.
  • avcodec_find_decoder: Finds the appropriate decoder.
  • avcodec_alloc_context3: Sets default values in the principal data structure.
  • avcodec_open2: Opens the decoder.
  • fread: The FFmpeg processing loop reads a buffer at a time from the stream.
  • avcodec_decode_audio4: Decodes an audio frame into raw audio data.

The rest of the code interleaves the data streams for output to the PCM file. The resulting file can be played with the following:

      aplay -c 2 -r 44100 /tmp/test.sw -f S16_LE

The program is as follows:

/*
 * copyright (c) 2001 Fabrice Bellard
 *
 * This file is part of Libav.
 *
 * Libav is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * Libav is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with Libav; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */

// From http://code.haskell.org/∼thielema/audiovideo-example/cbits/
// Adapted to version version 2.8.6-1ubuntu2 by Jan Newmarch

/**
 * @file
 * libavcodec API use example.
 *
 * @example libavcodec/api-example.c
 * Note that this library only handles codecs (mpeg, mpeg4, etc...),
 * not file formats (avi, vob, etc...). See library 'libavformat' for the
 * format handling
 */

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#ifdef HAVE_AV_CONFIG_H
#undef HAVE_AV_CONFIG_H
#endif

#include "libavcodec/avcodec.h"
#include <libavformat/avformat.h>

#define INBUF_SIZE 4096
#define AUDIO_INBUF_SIZE 20480
#define AUDIO_REFILL_THRESH 4096

void die(char *s) {
    fputs(s, stderr);
    exit(1);
}

/*
 * Audio decoding.
 */
static void audio_decode_example(AVFormatContext* container,
                                 const char *outfilename, const char *filename)
{
    AVCodec *codec;
    AVCodecContext *context = NULL;
    int len;
    FILE *f, *outfile;
    uint8_t inbuf[AUDIO_INBUF_SIZE + FF_INPUT_BUFFER_PADDING_SIZE];
    AVPacket avpkt;
    AVFrame *decoded_frame = NULL;
    int num_streams = 0;
    int sample_size = 0;

    av_init_packet(&avpkt);

    printf("Audio decoding\n");

    int stream_id = -1;

    // To find the first audio stream. This process may not be necessary
    // if you can gurarantee that the container contains only the desired
    // audio stream
    int i;
    for (i = 0; i < container->nb_streams; i++) {
        if (container->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO) {
            stream_id = i;
            break;
        }
    }

    /* find the appropriate audio decoder */
    AVCodecContext* codec_context = container->streams[stream_id]->codec;
    codec = avcodec_find_decoder(codec_context->codec_id);
    if (!codec) {
        fprintf(stderr, "codec not found\n");
        exit(1);
    }

    context = avcodec_alloc_context3(codec);

    /* open it */
    if (avcodec_open2(context, codec, NULL) < 0) {
        fprintf(stderr, "could not open codec\n");
        exit(1);
    }

    f = fopen(filename, "rb");
    if (!f) {
        fprintf(stderr, "could not open %s\n", filename);
        exit(1);
    }
    outfile = fopen(outfilename, "wb");
    if (!outfile) {
        av_free(context);
        exit(1);
    }

    /* decode until eof */
    avpkt.data = inbuf;
    avpkt.size = fread(inbuf, 1, AUDIO_INBUF_SIZE, f);

    while (avpkt.size > 0) {
        int got_frame = 0;

        if (!decoded_frame) {
            if (!(decoded_frame = av_frame_alloc())) {
                fprintf(stderr, "out of memory\n");
                exit(1);
            }
        } else {
            av_frame_unref(decoded_frame);
        }
        printf("Stream idx %d\n", avpkt.stream_index);

        len = avcodec_decode_audio4(context, decoded_frame, &got_frame, &avpkt);
        if (len < 0) {
            fprintf(stderr, "Error while decoding\n");
            exit(1);
        }
        if (got_frame) {
            printf("Decoded frame nb_samples %d, format %d\n",
                   decoded_frame->nb_samples,
                   decoded_frame->format);
            if (decoded_frame->data[1] != NULL)
                printf("Data[1] not null\n");
            else
                printf("Data[1] is null\n");
            /* if a frame has been decoded, output it */
            int data_size = av_samples_get_buffer_size(NULL, context->channels,
                                                       decoded_frame->nb_samples,
                                                       context->sample_fmt, 1);
            // first time: count the number of  planar streams
            if (num_streams == 0) {
                while (num_streams < AV_NUM_DATA_POINTERS &&
                       decoded_frame->data[num_streams] != NULL)
                    num_streams++;
                printf("Number of streams %d\n", num_streams);
            }

            // first time: set sample_size from 0 to e.g 2 for 16-bit data
            if (sample_size == 0) {
                sample_size =
                    data_size / (num_streams * decoded_frame->nb_samples);
            }

            int m, n;
            for (n = 0; n < decoded_frame->nb_samples; n++) {
                // interleave the samples from the planar streams
                for (m = 0; m < num_streams; m++) {
                    fwrite(&decoded_frame->data[m][n*sample_size],
                           1, sample_size, outfile);
                }
            }
        }
        avpkt.size -= len;
        avpkt.data += len;
        if (avpkt.size < AUDIO_REFILL_THRESH) {
            /* Refill the input buffer, to avoid trying to decode
             * incomplete frames. Instead of this, one could also use
             * a parser, or use a proper container format through
             * libavformat. */
            memmove(inbuf, avpkt.data, avpkt.size);
            avpkt.data = inbuf;
            len = fread(avpkt.data + avpkt.size, 1,
                        AUDIO_INBUF_SIZE - avpkt.size, f);
            if (len > 0)
                avpkt.size += len;
        }
    }

    fclose(outfile);
    fclose(f);

    avcodec_close(context);
    av_free(context);
    av_free(decoded_frame);
}

int main(int argc, char **argv)
{
    const char *filename = "Beethoven_Fr_Elise.mp3";
    AVFormatContext *pFormatCtx = NULL;

    if (argc == 2) {
        filename = argv[1];
    }

    // Register all formats and codecs
    av_register_all();
    if(avformat_open_input(&pFormatCtx, filename, NULL, NULL)!=0) {
        fprintf(stderr, "Can't get format of file %s\n", filename);
        return -1; // Couldn't open file
    }
    // Retrieve stream information
    if(avformat_find_stream_info(pFormatCtx, NULL)<0)
        return -1; // Couldn't find stream information
    av_dump_format(pFormatCtx, 0, filename, 0);
    printf("Num streams %d\n", pFormatCtx->nb_streams);
    printf("Bit rate %d\n", pFormatCtx->bit_rate);
    audio_decode_example(pFormatCtx, "/tmp/test.sw", filename);

    return 0;
}

Conclusion

This chapter gave a brief look at FFmpeg/Libav, examining the libavcodec library. FFmpeg and Libav are far more complex than this and can perform far more elaborate transformations. They can also do video processing, which is illustrated in Chapter 15.

Footnotes 1

Examples of av_read_frame are given in Chapters 15 and 21.

13. OpenMAX IL

OpenMAX is an open standard from the Khronos Group, designed for audio and video on low-powered devices. Vendors of cards are expected to produce implementations. There are few implementations for general Linux systems, but Broadcom has implemented one of the specifications (OpenMAX IL), and its chip is used in the Raspberry Pi. The other Khronos specifications (OpenMAX AL and OpenSL ES) have been implemented in Android devices and are accessible through the Native Development Kit (NDK), but they are not intended for direct use: they are only supposed to be used through Java APIs. They are not covered in this book. This chapter discusses only OpenMAX IL.

Resources

Here are some resources:

Quotes

Here are some quotes:

OpenMAX Concepts

The OpenMAX IL API is quite distinct from the OpenMAX AL API. The basic concept is the component, an audio/video (or other) processing unit of some type, such as a volume control, a mixer, or an output device. Each component has zero or more input and output ports, and each port can have one or more buffers that carry data.

OpenMAX IL is typically used by an A/V framework of some kind, such as OpenMAX AL. In addition to OpenMAX AL, there is currently a GStreamer plugin that uses OpenMAX IL under the hood. But it is also possible to build standalone applications that call the OpenMAX IL API directly. Collectively, these are known as IL clients.

The OpenMAX IL API is hard to work with directly. Error messages are frequently unhelpful, and threads will block without explanation until everything is exactly right, with the silent blocking giving no clues about what is not right. In addition, the examples I had to work with did not follow the specification exactly, which led to much wasted time.

OpenMAX IL components use buffers to carry data. A component will usually process data from an input buffer and place the result on an output buffer. This processing is not visible to the API, so it allows vendors to implement components in hardware or software, built on top of other A/V components, and so on. OpenMAX IL gives mechanisms for setting and getting parameters of components, for calling standard functions on the components, and for getting data into and out of components.

While some OpenMAX IL calls are synchronous, those that can require substantial amounts of processing are asynchronous, communicating their results through callback functions. This leads naturally to a multithreaded processing model, although OpenMAX IL does not visibly use any thread libraries and should be agnostic about how an IL client uses threads. The Bellagio examples use pthreads, while the Broadcom examples for the Raspberry Pi use Broadcom's VideoCore OS (VCOS) threads (https://github.com/raspberrypi/userland/blob/master/interface/vcos/vcos_semaphore.h).

There are two mechanisms for getting data into and out of components. The first is where the IL client makes calls on the component; all components are required to support this mechanism. The second is where a tunnel is set up between two components so that data flows along a shared buffer; components are not required to support this mechanism.
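Tunneling, where it is supported, is set up with the standard OMX_SetupTunnel call. In sketch form, connecting an output port of one component to an input port of another looks like this (the handles and port numbers are illustrative only):

/* Tunnel output port 231 of compA into input port 100 of compB */
OMX_ERRORTYPE err = OMX_SetupTunnel(compA, 231, compB, 100);
if (err != OMX_ErrorNone) {
    fprintf(stderr, "OMX_SetupTunnel failed\n");
    exit(1);
}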

OpenMAX IL Components

The OpenMAX IL 1.1.2 specification lists a number of standard components, including (for audio) a decoder, an encoder, a mixer, a reader, a renderer, a writer, a capturer, and a processor. An IL client gets such a component by calling OMX_GetHandle(), passing in the name of the component. This is a problem: components do not have standard names.

The 1.1.2 specification says, "Since components are requested by name, a naming convention is defined. OpenMAX IL component names are zero terminated strings with the following format: OMX.<vendor_name>.<vendor_specified_convention>. For example: OMX.CompanyABC.MP3Decoder.productXYZ. No standardization among component names is dictated across different vendors."

At this point, you have to look at the implementations currently available, because this lack of standardization leads to differences between even the most basic programs.

Implementations

Here are the implementations.

Raspberry Pi

The Raspberry Pi has a Broadcom graphics processing unit (GPU), and Broadcom supports OpenMAX IL. The include files needed to build applications are in /opt/vc/include/IL, /opt/vc/include, and /opt/vc/include/interface/vcos/pthreads. The libraries that need to be linked are in the /opt/vc/lib directory and are openmaxil and bcm_host.

The Broadcom libraries require additional code to be called alongside the standard OpenMAX IL functions. In addition, the implementation has a number of (legitimate) extensions that are not found in the specification or in other implementations. These are described in /opt/vc/include/IL/OMX_Broadcom.h. For these reasons, I define RASPBERRY_PI so that they can be dealt with.

For example, the compile line for listcomponents.c is as follows:

cc -g -DRASPBERRY_PI -I /opt/vc/include/IL -I /opt/vc/include \
   -I /opt/vc/include/interface/vcos/pthreads \
   -o listcomponents listcomponents.c \
   -L /opt/vc/lib -l openmaxil -l bcm_host

The Broadcom implementation is closed source. It appears to be a thin wrapper around its GPU API, of which Broadcom will not release any details. This means you cannot extend the set of components or the supported codecs, as there is no information about how to build new components. While the set of components is reasonable, no codecs apart from PCM are currently supported, and neither is non-GPU hardware such as USB sound cards.

OtherCrashOverride (www.raspberrypi.org/phpBB3/viewtopic.php?f=70&t=33101&p=287590#p287590) reports having managed to get the Broadcom components running under the LIM implementation, but I haven't confirmed this.

As far as audio is concerned, the implementation on the Raspberry Pi is quite weak: all audio decoding has to be done in software, and it can play only PCM data. Video is far more impressive and is covered in my book Raspberry Pi GPU Audio Video Programming.

Bellagio

The Bellagio library does not require additional code or any extensions. It has a few minor bugs, so I define BELLAGIO to deal with them. I built from source but didn't install it, so the includes and libraries are in an odd location. My compile line is as follows:

cc  -g -DBELLAGIO -I ../libomxil-bellagio-0.9.3/include/ \
    -o listcomponents listcomponents.c \
    -L ../libomxil-bellagio-0.9.3/src/.libs -l omxil-bellagio

Here is the runtime line:

export LD_LIBRARY_PATH=../libomxil-bellagio-0.9.3/src/.libs/
./listcomponents

The Bellagio code is open source.

LIM

Downloading the 1.1 release is troublesome, because the 1.1 download uses a Git repo that has disappeared (as of November 2016). Instead, you have to run the following commands:

  git clone git://limoa.git.sourceforge.net/gitroot/limoa/limoi-components
  git clone git://limoa.git.sourceforge.net/gitroot/limoa/limoi-core
  git clone git://limoa.git.sourceforge.net/gitroot/limoa/limoi-plugins
  git clone git://limoa.git.sourceforge.net/gitroot/limoa/limutil
  git clone git://limoa.git.sourceforge.net/gitroot/limoa/manifest

You have to copy the root.mk file from the build into the top-level folder containing all the code and rename it Makefile. The root.readme file has the build instructions. Thanks to OtherCrashOverride (www.raspberrypi.org/phpBB3/viewtopic.php?f=70&t=33101&p=286516#p286516) for these instructions.

Building the libraries ran into some minor problems. I had to comment out a couple of lines in one video file because they referenced nonexistent structure fields, and I had to remove -Werror from one Makefile.am, or warnings about unused variables would have aborted the compile.

The library build places the files in a new directory in my HOME. So far I have found a few minor bugs in the implementation. My compile line is as follows:

cc -g -DLIM -I ../../lim-omx-1.1/LIM/limoi-core/include/ \
   -o listcomponents listcomponents.c \
   -L /home/newmarch/osm-build/lib/ -l limoa -l limoi-core

Here is the runtime line:

export LD_LIBRARY_PATH=/home/newmarch/osm-build/lib/
./listcomponents

The LIM code is open source.

Hardware-Supported Versions

You can find a list of hardware-supported versions at OpenMAX IL Conformant Products (www.khronos.org/conformance/adopters/conformant-products#openmaxil).

Implementations of Components

The Bellagio library (you need the source package to see these files) lists only two audio components in its README:

  • An OMX volume control
  • An OMX mixer component

Their names (from the example test files) are OMX.st.volume.component and OMX.st.audio.mixer, respectively. The company behind Bellagio is STMicroelectronics (www.st.com/internet/com/home/home.jsp), which explains the st.

The Broadcom OpenMAX IL implementation used on the Raspberry Pi is rather better documented. If you download the firmware master for the Raspberry Pi, the IL components are listed in the documentation/ilcomponents directory. This lists the components audio_capture, audio_decode, audio_encode, audio_lowpower, audio_mixer, audio_processor, audio_render, and audio_splitter.

Many of the OpenMAX IL function calls in the Broadcom examples are buried within Broadcom convenience functions, such as the following:

ilclient_create_component(st->client, &st->audio_render,
                         "audio_render",
                         ILCLIENT_ENABLE_INPUT_BUFFERS | ILCLIENT_DISABLE_ALL_PORTS);

This wraps OMX_GetHandle(). But at least ilclient.h states, "The provided component name is automatically prefixed with 'OMX.broadcom.' before passing it to the IL core." So, you can conclude that the real name is, for example, OMX.broadcom.audio_render, and so on.

There is a simple way of getting the supported components programmatically. First initialize the OpenMAX system with OMX_Init(), and then call OMX_ComponentNameEnum(). For successive index values it returns a unique name each time, until it finally returns the error value OMX_ErrorNoMore.

Each component may support a number of roles. These are given by OMX_GetRolesOfComponent. The 1.1 specification lists the classes of audio components and their associated roles in section 8.6, "Standard Audio Components." The LIM library matches these, while Bellagio and Broadcom do not.

The following program is listcomponents.c:

#include <stdio.h>
#include <stdlib.h>

#include <OMX_Core.h>

#ifdef RASPBERRY_PI
#include <bcm_host.h>
#endif

OMX_ERRORTYPE err;

//extern OMX_COMPONENTREGISTERTYPE OMX_ComponentRegistered[];

void listroles(char *name) {
    int n;
    OMX_U32 numRoles;
    OMX_U8 *roles[32];

    /* get the number of roles by passing in a NULL roles param */
    err = OMX_GetRolesOfComponent(name, &numRoles, NULL);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Getting roles failed\n", 0);
        exit(1);
    }
    printf("  Num roles is %d\n", numRoles);
    if (numRoles > 32) {
        printf("Too many roles to list\n");
        return;
    }

    /* now get the roles */
    for (n = 0; n < numRoles; n++) {
        roles[n] = malloc(OMX_MAX_STRINGNAME_SIZE);
    }
    err = OMX_GetRolesOfComponent(name, &numRoles, roles);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Getting roles failed\n", 0);
        exit(1);
    }
    for (n = 0; n < numRoles; n++) {
        printf("    role: %s\n", roles[n]);
        free(roles[n]);
    }

    /* This is in version 1.2
    for (i = 0; OMX_ErrorNoMore != err; i++) {
        err = OMX_RoleOfComponentEnum(role, name, i);
        if (OMX_ErrorNone == err) {
            printf("   Role of omponent is %s\n", role);
        }
    }
    */
}

int main(int argc, char** argv) {

    int i;
    unsigned char name[OMX_MAX_STRINGNAME_SIZE];

# ifdef RASPBERRY_PI
    bcm_host_init();
# endif

    err = OMX_Init();
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "OMX_Init() failed\n", 0);
        exit(1);
    }

    err = OMX_ErrorNone;
    for (i = 0; OMX_ErrorNoMore != err; i++) {
        err = OMX_ComponentNameEnum(name, OMX_MAX_STRINGNAME_SIZE, i);
        if (OMX_ErrorNone == err) {
            printf("Component is %s\n", name);
            listroles(name);
        }
    }
    printf("No more components\n");

    /*
    i= 0 ;
    while (1) {
        printf("Component %s\n", OMX_ComponentRegistered[i++]);
    }
    */
    exit(0);
}

The output from the Bellagio library is as follows:

Component is OMX.st.clocksrc
  Num roles is 1
    role: clocksrc
Component is OMX.st.clocksrc
  Num roles is 1
    role: clocksrc
Component is OMX.st.video.scheduler
  Num roles is 1
    role: video.scheduler
Component is OMX.st.video.scheduler
  Num roles is 1
    role: video.scheduler
Component is OMX.st.volume.component
  Num roles is 1
    role: volume.component
Component is OMX.st.volume.component
  Num roles is 1
    role: volume.component
Component is OMX.st.audio.mixer
  Num roles is 1
    role: audio.mixer
Component is OMX.st.audio.mixer
  Num roles is 1
    role: audio.mixer
Component is OMX.st.clocksrc
  Num roles is 1
    role: clocksrc
Component is OMX.st.clocksrc
  Num roles is 1
    role: clocksrc
Component is OMX.st.video.scheduler
  Num roles is 1
    role: video.scheduler
Component is OMX.st.video.scheduler
  Num roles is 1
    role: video.scheduler
Component is OMX.st.volume.component
  Num roles is 1
    role: volume.component
Component is OMX.st.volume.component
  Num roles is 1
    role: volume.component
Component is OMX.st.audio.mixer
  Num roles is 1
    role: audio.mixer
Component is OMX.st.audio.mixer
  Num roles is 1
    role: audio.mixer
No more components

这不太正确。OpenMAX IL 规范规定每个组件只能出现一次,不能重复。

The Raspberry Pi reports a large number of components but defines no roles for any of them.

Component is OMX.broadcom.audio_capture
  Num roles is 0
Component is OMX.broadcom.audio_decode
  Num roles is 0
Component is OMX.broadcom.audio_encode
  Num roles is 0
Component is OMX.broadcom.audio_render
  Num roles is 0
Component is OMX.broadcom.audio_mixer
  Num roles is 0
Component is OMX.broadcom.audio_splitter
  Num roles is 0
Component is OMX.broadcom.audio_processor
  Num roles is 0
Component is OMX.broadcom.camera
  Num roles is 0
Component is OMX.broadcom.clock
  Num roles is 0
Component is OMX.broadcom.coverage
  Num roles is 0
Component is OMX.broadcom.egl_render
  Num roles is 0
Component is OMX.broadcom.image_fx
  Num roles is 0
Component is OMX.broadcom.image_decode
  Num roles is 0
Component is OMX.broadcom.image_encode
  Num roles is 0
Component is OMX.broadcom.image_read
  Num roles is 0
Component is OMX.broadcom.image_write
  Num roles is 0
Component is OMX.broadcom.read_media
  Num roles is 0
Component is OMX.broadcom.resize
  Num roles is 0
Component is OMX.broadcom.source
  Num roles is 0
Component is OMX.broadcom.text_scheduler
  Num roles is 0
Component is OMX.broadcom.transition
  Num roles is 0
Component is OMX.broadcom.video_decode
  Num roles is 0
Component is OMX.broadcom.video_encode
  Num roles is 0
Component is OMX.broadcom.video_render
  Num roles is 0
Component is OMX.broadcom.video_scheduler
  Num roles is 0
Component is OMX.broadcom.video_splitter
  Num roles is 0
Component is OMX.broadcom.visualisation
  Num roles is 0
Component is OMX.broadcom.write_media
  Num roles is 0
Component is OMX.broadcom.write_still
  Num roles is 0
No more components

The output from LIM is as follows:

Component is OMX.limoi.alsa_sink
  Num roles is 1
    role: audio_renderer.pcm
Component is OMX.limoi.clock
  Num roles is 1
    role: clock.binary
Component is OMX.limoi.ffmpeg.decode.audio
  Num roles is 8
    role: audio_decoder.aac
    role: audio_decoder.adpcm
    role: audio_decoder.amr
    role: audio_decoder.mp3
    role: audio_decoder.ogg
    role: audio_decoder.pcm
    role: audio_decoder.ra
    role: audio_decoder.wma
Component is OMX.limoi.ffmpeg.decode.video
  Num roles is 7
    role: video_decoder.avc
    role: video_decoder.h263
    role: video_decoder.mjpeg
    role: video_decoder.mpeg2
    role: video_decoder.mpeg4
    role: video_decoder.rv
    role: video_decoder.wmv
Component is OMX.limoi.ffmpeg.demux
  Num roles is 1
    role: container_demuxer.all
Component is OMX.limoi.ffmpeg.encode.audio
  Num roles is 2
    role: audio_encoder.aac
    role: audio_encoder.mp3
Component is OMX.limoi.ffmpeg.encode.video
  Num roles is 2
    role: video_encoder.h263
    role: video_encoder.mpeg4
Component is OMX.limoi.ffmpeg.mux
  Num roles is 1
    role: container_muxer.all
Component is OMX.limoi.ogg_dec
  Num roles is 1
    role: audio_decoder_with_framing.ogg
Component is OMX.limoi.sdl.renderer.video
  Num roles is 1
    role: iv_renderer.yuv.overlay
Component is OMX.limoi.video_scheduler
  Num roles is 1
    role: video_scheduler.binary
No more components

Getting Information About an IL Component

Next, you will look at how to get information about the OpenMAX IL system and about any components you use. All IL clients must initialize OpenMAX IL by calling OMX_Init(). Almost all functions return error values, and the style used by Bellagio is as follows:

  err = OMX_Init();
  if(err != OMX_ErrorNone) {
      fprintf(stderr, "OMX_Init() failed\n", 0);
      exit(1);
  }

This looks like a reasonable style to me, so I follow it in what follows.

The next requirement is to get a handle to a component. This needs the vendor's name for the component, which can be found using the listcomponents.c program shown earlier. The function OMX_GetHandle takes a number of parameters, including a set of callback functions. These are needed to track the behavior of the application but are not required for the examples in this section. This code shows how to get a handle to the Bellagio volume component:

  OMX_HANDLETYPE handle;
  OMX_CALLBACKTYPE callbacks;
  OMX_ERRORTYPE err;

  err = OMX_GetHandle(&handle, "OMX.st.volume.component", NULL /*appPriv */, &callbacks);
  if(err != OMX_ErrorNone) {
      fprintf(stderr, "OMX_GetHandle failed\n", 0);
      exit(1);
  }

Components have ports, and ports have channels. Getting and setting information about these is done by the functions OMX_GetParameter(), OMX_SetParameter(), OMX_GetConfig(), and OMX_SetConfig(). The ...Parameter calls are made before a component is "loaded"; the ...Config calls are made after it is loaded.

C is not an OO language, so these are ordinary function calls (well, macros actually). In an OO language they would be methods on an object taking another object as parameter, as in component.method(object). In OpenMAX IL, the Get/Set functions take the invoking "object" as the first parameter (the component), then an index as an indicator of what type of "object" the method's parameter is, and then the structure for the parameter object. The index values relate to the structures given in Table 4-2 of the 1.1 specification.

These calls take a (pointer to a) structure that is either filled in or used to extract values. The structures are all normalized so that they share common fields, such as the size of the structure. In the Bellagio examples this is done by a macro, setHeader(). The structure passed in to get port information is often the generic structure of type OMX_PORT_PARAM_TYPE. Some of its fields can be accessed directly, some need to be cast to more specialized types, and some are buried within unions and have to be extracted.

Ports are labeled by integer indices. There are different ports for different functions, such as audio, image, and video. To get information about where the audio ports start, use the following:

  setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
  err = OMX_GetParameter(handle, OMX_IndexParamAudioInit, &param);
  if(err != OMX_ErrorNone){
      fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
    exit(1);
  }
  printf("Audio ports start on %d\n",
         ((OMX_PORT_PARAM_TYPE)param).nStartPortNumber);
  printf("There are %d open ports\n",
         ((OMX_PORT_PARAM_TYPE)param).nPorts);

setHeader() just fills in header information, such as the version number and the size of the data structure.
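
The corresponding ...Config calls follow the same pattern once the component is loaded. As a sketch only, here is how reading the standard volume configuration from the 1.1 specification might look, assuming the component supports the OMX_IndexConfigAudioVolume index:

  OMX_AUDIO_CONFIG_VOLUMETYPE sVolume;

  setHeader(&sVolume, sizeof(OMX_AUDIO_CONFIG_VOLUMETYPE));
  sVolume.nPortIndex = 0;
  err = OMX_GetConfig(handle, OMX_IndexConfigAudioVolume, &sVolume);
  if (err == OMX_ErrorNone) {
      /* sVolume.sVolume is an OMX_BS32 holding nValue, nMin, and nMax */
      printf("Volume is %d (linear scale: %d)\n",
             sVolume.sVolume.nValue, sVolume.bLinear);
  }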

Particular ports can then be queried about their capabilities. You can query the port type (audio or otherwise), the direction (input or output), and information about the supported MIME types.

  OMX_PARAM_PORTDEFINITIONTYPE sPortDef;

  setHeader(&sPortDef, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
  sPortDef.nPortIndex = 0;
  err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
  if(err != OMX_ErrorNone){
      fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
    exit(1);
  }
  if (sPortDef.eDomain == OMX_PortDomainAudio) {
      printf("Is an audio port\n");
  } else {
      printf("Is other device port\n");
  }

  if (sPortDef.eDir == OMX_DirInput) {
      printf("Port is an input port\n");
  } else {
      printf("Port is an output port\n");
  }

  /* the Audio Port info is buried in a union format.audio within the struct */
  printf("Port min buffers %d,  mimetype %s, encoding %d\n",
         sPortDef.nBufferCountMin,
         sPortDef.format.audio.cMIMEType,
         sPortDef.format.audio.eEncoding);

The Bellagio library returns "raw/audio" for the MIME type supported by its volume control component. However, that is not a valid MIME type listed in the IANA MIME media types ( www.iana.org/assignments/media-types ). The value returned for the encoding is zero, corresponding to OMX_AUDIO_CodingUnused, which doesn't seem correct either.

If you try the same program on the Raspberry Pi component audio_render or the LIM component OMX.limoi.alsa_sink, you get a MIME type of NULL but an encoding value of 2, which is OMX_AUDIO_CodingPCM. PCM has a MIME type of audio/L16, so NULL doesn't seem appropriate.

An OpenMAX IL library allows a port to be queried for the data types it supports. This is done by querying an OMX_AUDIO_PARAM_PORTFORMATTYPE object, using the index OMX_IndexParamAudioPortFormat. According to the specification, for each index value starting from zero, a call to GetParameter() should return an encoding, such as OMX_AUDIO_CodingPCM or OMX_AUDIO_CodingMP3, until there are no more supported formats, in which case the call returns OMX_ErrorNoMore.

The Bellagio code returns the value OMX_AUDIO_CodingUnused, which is incorrect. The LIM code doesn't set a value at all, so all you get is garbage. The Broadcom implementation works correctly but, as will be discussed, returns values that are not actually supported. So, this call is of limited value.

The following code tests for this:

void getSupportedAudioFormats(int indentLevel, int portNumber) {
    OMX_AUDIO_PARAM_PORTFORMATTYPE sAudioPortFormat;

    setHeader(&sAudioPortFormat, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    sAudioPortFormat.nIndex = 0;
    sAudioPortFormat.nPortIndex = portNumber;

    printf("Supported audio formats are:\n");
    for(;;) {
        err = OMX_GetParameter(handle, OMX_IndexParamAudioPortFormat, &sAudioPortFormat);
        if (err == OMX_ErrorNoMore) {
            printf("No more formats supported\n");
            return;
        }

        /* This shouldn't occur, but does with Broadcom library */
        if (sAudioPortFormat.eEncoding == OMX_AUDIO_CodingUnused) {
             printf("No coding format returned\n");
             return;
        }

        switch (sAudioPortFormat.eEncoding) {
        case OMX_AUDIO_CodingPCM:
            printf("Supported encoding is PCM\n");
            break;
        case OMX_AUDIO_CodingVORBIS:
            printf("Supported encoding is Ogg Vorbis\n");
            break;
        case OMX_AUDIO_CodingMP3:
            printf("Supported encoding is MP3\n");
            break;
#ifdef RASPBERRY_PI
        case OMX_AUDIO_CodingFLAC:
            printf("Supported encoding is FLAC\n");
            break;
        case OMX_AUDIO_CodingDDP:
            printf("Supported encoding is DDP\n");
            break;
        case OMX_AUDIO_CodingDTS:
            printf("Supported encoding is DTS\n");
            break;
        case OMX_AUDIO_CodingWMAPRO:
            printf("Supported encoding is WMAPRO\n");
            break;
#endif
        case OMX_AUDIO_CodingAAC:
            printf("Supported encoding is AAC\n");
            break;
        case OMX_AUDIO_CodingWMA:
            printf("Supported encoding is WMA\n");
            break;
        case OMX_AUDIO_CodingRA:
            printf("Supported encoding is RA\n");
            break;
        case OMX_AUDIO_CodingAMR:
            printf("Supported encoding is AMR\n");
            break;
        case OMX_AUDIO_CodingEVRC:
            printf("Supported encoding is EVRC\n");
            break;
        case OMX_AUDIO_CodingG726:
            printf("Supported encoding is G726\n");
            break;
        case OMX_AUDIO_CodingMIDI:
            printf("Supported encoding is MIDI\n");
            break;
        case OMX_AUDIO_CodingATRAC3:
            printf("Supported encoding is ATRAC3\n");
            break;
        case OMX_AUDIO_CodingATRACX:
            printf("Supported encoding is ATRACX\n");
            break;
        case OMX_AUDIO_CodingATRACAAL:
            printf("Supported encoding is ATRACAAL\n");
            break;
        default:
            printf("Supported encoding is %d\n",
                  sAudioPortFormat.eEncoding);
        }
        sAudioPortFormat.nIndex++;
    }
}

Note that this code includes enumerated values specific to the Broadcom library, such as OMX_AUDIO_CodingATRAC3. These are legitimate values according to the OpenMAX IL extension mechanism but are, of course, not portable values.

The Bellagio library incorrectly returns OMX_AUDIO_CodingUnused for every index value.

The Broadcom library can return many values. For example, for the audio_decode component it returns the following:

      Supported audio formats are:
      Supported encoding is MP3
      Supported encoding is PCM
      Supported encoding is AAC
      Supported encoding is WMA
      Supported encoding is Ogg Vorbis
      Supported encoding is RA
      Supported encoding is AMR
      Supported encoding is EVRC
      Supported encoding is G726
      Supported encoding is FLAC
      Supported encoding is DDP
      Supported encoding is DTS
      Supported encoding is WMAPRO
      Supported encoding is ATRAC3
      Supported encoding is ATRACX
      Supported encoding is ATRACAAL
      Supported encoding is MIDI
      No more formats supported

Sadly, none of these except PCM is actually supported. The following is according to jamesh in "OMX_AllocateBuffer fails for audio decoder component":

The way it works is that the component returns success for all the codecs it could support (that is, all the codecs we ever had running). That is then limited by the codecs actually installed. It would be better to detect which codecs are present at runtime, but that code has never been written because it has never been required. It is also unlikely to happen, because Broadcom no longer supports audio codecs this way; they have moved from the VideoCore to the host CPU, which is now powerful enough to handle any audio decoding task.

That's really rather sad.

Putting all the bits together gives the program info.c, as follows:

/**
   Based on code
   Copyright (C) 2007-2009 STMicroelectronics
   Copyright (C) 2007-2009 Nokia Corporation and/or its subsidiary(-ies).
   under the LGPL
*/

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/stat.h>

#include <OMX_Core.h>
#include <OMX_Component.h>
#include <OMX_Types.h>
#include <OMX_Audio.h>

#ifdef RASPBERRY_PI
#include <bcm_host.h>
#endif

OMX_ERRORTYPE err;
OMX_HANDLETYPE handle;
OMX_VERSIONTYPE specVersion, compVersion;

OMX_CALLBACKTYPE callbacks;

#define indent {int n = 0; while (n++ < indentLevel*2) putchar(' ');}

static void setHeader(OMX_PTR header, OMX_U32 size) {
    /* header->nVersion */
    OMX_VERSIONTYPE* ver = (OMX_VERSIONTYPE*)(header + sizeof(OMX_U32));
    /* header->nSize */
    *((OMX_U32*)header) = size;

    /* for 1.2
       ver->s.nVersionMajor = OMX_VERSION_MAJOR;
       ver->s.nVersionMinor = OMX_VERSION_MINOR;
       ver->s.nRevision = OMX_VERSION_REVISION;
       ver->s.nStep = OMX_VERSION_STEP;
    */
    ver->s.nVersionMajor = specVersion.s.nVersionMajor;
    ver->s.nVersionMinor = specVersion.s.nVersionMinor;
    ver->s.nRevision = specVersion.s.nRevision;
    ver->s.nStep = specVersion.s.nStep;
}

void printState() {
    OMX_STATETYPE state;
    err = OMX_GetState(handle, &state);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Error on getting state\n");
        exit(1);
    }
    switch (state) {
    case OMX_StateLoaded: fprintf(stderr, "StateLoaded\n"); break;
    case OMX_StateIdle: fprintf(stderr, "StateIdle\n"); break;
    case OMX_StateExecuting: fprintf(stderr, "StateExecuting\n"); break;
    case OMX_StatePause: fprintf(stderr, "StatePause\n"); break;
    case OMX_StateWaitForResources: fprintf(stderr, "StateWaitForResources\n"); break;
    default:  fprintf(stderr, "State unknown\n"); break;
    }
}

OMX_ERRORTYPE setEncoding(int portNumber, OMX_AUDIO_CODINGTYPE encoding) {
    OMX_PARAM_PORTDEFINITIONTYPE sPortDef;

    setHeader(&sPortDef, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
    sPortDef.nPortIndex = portNumber;
    err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting OMX_PORT_DEFINITION_TYPE parameter\n",
 0);
        exit(1);
    }

    sPortDef.format.audio.eEncoding = encoding;
    sPortDef.nBufferCountActual = sPortDef.nBufferCountMin;

    err = OMX_SetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
    return err;
}

void getPCMInformation(int indentLevel, int portNumber) {
    /* assert: PCM is a supported mode */
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;

    /* set it into PCM format before asking for PCM info */
    if (setEncoding(portNumber, OMX_AUDIO_CodingPCM) != OMX_ErrorNone) {
        fprintf(stderr, "Error in setting coding to PCM\n");
        return;
    }

    setHeader(&sPCMMode, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
    sPCMMode.nPortIndex = portNumber;
    err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    if(err != OMX_ErrorNone){
        indent printf("PCM mode unsupported\n");
    } else {
        indent printf("  PCM default sampling rate %d\n", sPCMMode.nSamplingRate);
        indent printf("  PCM default bits per sample %d\n", sPCMMode.nBitPerSample);
        indent printf("  PCM default number of channels %d\n", sPCMMode.nChannels);
    }

    /*
    setHeader(&sAudioPortFormat, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    sAudioPortFormat.nIndex = 0;
    sAudioPortFormat.nPortIndex = portNumber;
    */

}
void getMP3Information(int indentLevel, int portNumber) {
    /* assert: MP3 is a supported mode */
    OMX_AUDIO_PARAM_MP3TYPE sMP3Mode;

    /* set it into MP3 format before asking for MP3 info */
    if (setEncoding(portNumber, OMX_AUDIO_CodingMP3) != OMX_ErrorNone) {
        fprintf(stderr, "Error in setting coding to MP3\n");
        return;
    }

    setHeader(&sMP3Mode, sizeof(OMX_AUDIO_PARAM_MP3TYPE));
    sMP3Mode.nPortIndex = portNumber;
    err = OMX_GetParameter(handle, OMX_IndexParamAudioMp3, &sMP3Mode);
    if(err != OMX_ErrorNone){
        indent printf("MP3 mode unsupported\n");
    } else {
        indent printf("  MP3 default sampling rate %d\n", sMP3Mode.nSampleRate);
        indent printf("  MP3 default bits per sample %d\n", sMP3Mode.nBitRate);
        indent printf("  MP3 default number of channels %d\n", sMP3Mode.nChannels);
    }
}

void getSupportedAudioFormats(int indentLevel, int portNumber) {
    OMX_AUDIO_PARAM_PORTFORMATTYPE sAudioPortFormat;

    setHeader(&sAudioPortFormat, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    sAudioPortFormat.nIndex = 0;
    sAudioPortFormat.nPortIndex = portNumber;

#ifdef LIM
    printf("LIM doesn't set audio formats properly\n");
    return;
#endif

    indent printf("Supported audio formats are:\n");
    for(;;) {
        err = OMX_GetParameter(handle, OMX_IndexParamAudioPortFormat, &sAudioPortFormat);
        if (err == OMX_ErrorNoMore) {
            indent printf("No more formats supported\n");
            return;
        }

        /* This shouldn't occur, but does with Broadcom library */
        if (sAudioPortFormat.eEncoding == OMX_AUDIO_CodingUnused) {
             indent printf("No coding format returned\n");
             return;
        }

        switch (sAudioPortFormat.eEncoding) {
        case OMX_AUDIO_CodingPCM:
            indent printf("Supported encoding is PCM\n");
            getPCMInformation(indentLevel+1, portNumber);
            break;
        case OMX_AUDIO_CodingVORBIS:
            indent printf("Supported encoding is Ogg Vorbis\n");
            break;
        case OMX_AUDIO_CodingMP3:
            indent printf("Supported encoding is MP3\n");
            getMP3Information(indentLevel+1, portNumber);
            break;
#ifdef RASPBERRY_PI
        case OMX_AUDIO_CodingFLAC:
            indent printf("Supported encoding is FLAC\n");
            break;
        case OMX_AUDIO_CodingDDP:
            indent printf("Supported encoding is DDP\n");
            break;
        case OMX_AUDIO_CodingDTS:
            indent printf("Supported encoding is DTS\n");
            break;
        case OMX_AUDIO_CodingWMAPRO:
            indent printf("Supported encoding is WMAPRO\n");
            break;
        case OMX_AUDIO_CodingATRAC3:
            indent printf("Supported encoding is ATRAC3\n");
            break;
        case OMX_AUDIO_CodingATRACX:
            indent printf("Supported encoding is ATRACX\n");
            break;
        case OMX_AUDIO_CodingATRACAAL:
            indent printf("Supported encoding is ATRACAAL\n");
            break;
#endif
        case OMX_AUDIO_CodingAAC:
            indent printf("Supported encoding is AAC\n");
            break;
        case OMX_AUDIO_CodingWMA:
            indent printf("Supported encoding is WMA\n");
            break;
        case OMX_AUDIO_CodingRA:
            indent printf("Supported encoding is RA\n");
            break;
        case OMX_AUDIO_CodingAMR:
            indent printf("Supported encoding is AMR\n");
            break;
        case OMX_AUDIO_CodingEVRC:
            indent printf("Supported encoding is EVRC\n");
            break;
        case OMX_AUDIO_CodingG726:
            indent printf("Supported encoding is G726\n");
            break;
        case OMX_AUDIO_CodingMIDI:
            indent printf("Supported encoding is MIDI\n");
            break;

            /*
        case OMX_AUDIO_Coding:
            indent printf("Supported encoding is \n");
            break;
            */
        default:
            indent printf("Supported encoding is not PCM or MP3 or Vorbis, is 0x%X\n",
                  sAudioPortFormat.eEncoding);
        }
        sAudioPortFormat.nIndex++;
    }
}

void getAudioPortInformation(int indentLevel, int nPort, OMX_PARAM_PORTDEFINITIONTYPE sPortDef) {
    indent printf("Port %d requires %d buffers\n", nPort, sPortDef.nBufferCountMin);
    indent printf("Port %d has min buffer size %d bytes\n", nPort, sPortDef.nBufferSize);

    if (sPortDef.eDir == OMX_DirInput) {
        indent printf("Port %d is an input port\n", nPort);
    } else {
        indent printf("Port %d is an output port\n",  nPort);
    }
    switch (sPortDef.eDomain) {
    case OMX_PortDomainAudio:
        indent printf("Port %d is an audio port\n", nPort);
        indent printf("Port mimetype %s\n",
               sPortDef.format.audio.cMIMEType);

        switch (sPortDef.format.audio.eEncoding) {
        case OMX_AUDIO_CodingPCM:
            indent printf("Port encoding is PCM\n");
            break;
        case OMX_AUDIO_CodingVORBIS:
            indent printf("Port encoding is Ogg Vorbis\n");
            break;
        case OMX_AUDIO_CodingMP3:
            indent printf("Port encoding is MP3\n");
            break;
        default:
            indent printf("Port encoding is not PCM or MP3 or Vorbis, is %d\n",
                   sPortDef.format.audio.eEncoding);
        }
        getSupportedAudioFormats(indentLevel+1, nPort);

        break;
        /* could put other port types here */
    default:
        indent printf("Port %d is not an audio port\n",  nPort);
    }
}

void getAllAudioPortsInformation(int indentLevel) {
    OMX_PORT_PARAM_TYPE param;
    OMX_PARAM_PORTDEFINITIONTYPE sPortDef;

    int startPortNumber;
    int nPorts;
    int n;

    setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));

    err = OMX_GetParameter(handle, OMX_IndexParamAudioInit, &param);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting audio OMX_PORT_PARAM_TYPE parameter\n", 0);
        return;
    }
    indent printf("Audio ports:\n");
    indentLevel++;

    startPortNumber = param.nStartPortNumber;
    nPorts = param.nPorts;
    if (nPorts == 0) {
        indent printf("No ports of this type\n");
        return;
    }

    indent printf("Ports start on %d\n", startPortNumber);
    indent printf("There are %d open ports\n", nPorts);

    for (n = 0; n < nPorts; n++) {
        setHeader(&sPortDef, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
        sPortDef.nPortIndex = startPortNumber + n;
        err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
        if(err != OMX_ErrorNone){
            fprintf(stderr, "Error in getting OMX_PORT_DEFINITION_TYPE parameter\n", 0);
            exit(1);
        }
        getAudioPortInformation(indentLevel+1, startPortNumber + n, sPortDef);
    }
}

void getAllVideoPortsInformation(int indentLevel) {
    OMX_PORT_PARAM_TYPE param;
    int startPortNumber;
    int nPorts;
    int n;

    setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));

    err = OMX_GetParameter(handle, OMX_IndexParamVideoInit, &param);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting video OMX_PORT_PARAM_TYPE parameter\n", 0);
        return;
    }
    printf("Video ports:\n");
    indentLevel++;

    startPortNumber = param.nStartPortNumber;
    nPorts = param.nPorts;
    if (nPorts == 0) {
        indent printf("No ports of this type\n");
        return;
    }

    indent printf("Ports start on %d\n", startPortNumber);
    indent printf("There are %d open ports\n", nPorts);
}

void getAllImagePortsInformation(int indentLevel) {
    OMX_PORT_PARAM_TYPE param;
    int startPortNumber;
    int nPorts;
    int n;

    setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));

    err = OMX_GetParameter(handle, OMX_IndexParamImageInit, &param);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting image OMX_PORT_PARAM_TYPE parameter\n", 0);
        return;
    }
    printf("Image ports:\n");
    indentLevel++;

    startPortNumber = param.nStartPortNumber;
    nPorts = param.nPorts;
    if (nPorts == 0) {
        indent printf("No ports of this type\n");
        return;
    }

    indent printf("Ports start on %d\n", startPortNumber);
    indent printf("There are %d open ports\n", nPorts);
}

void getAllOtherPortsInformation(int indentLevel) {
    OMX_PORT_PARAM_TYPE param;
    int startPortNumber;
    int nPorts;
    int n;

    setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));

    err = OMX_GetParameter(handle, OMX_IndexParamOtherInit, &param);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting other OMX_PORT_PARAM_TYPE parameter\n", 0);
        exit(1);
    }
    printf("Other ports:\n");
    indentLevel++;

    startPortNumber = param.nStartPortNumber;
    nPorts = param.nPorts;
    if (nPorts == 0) {
        indent printf("No ports of this type\n");
        return;
    }

    indent printf("Ports start on %d\n", startPortNumber);
    indent printf("There are %d open ports\n", nPorts);
}

int main(int argc, char** argv) {

    OMX_PORT_PARAM_TYPE param;
    OMX_PARAM_PORTDEFINITIONTYPE sPortDef;
    OMX_AUDIO_PORTDEFINITIONTYPE sAudioPortDef;
    OMX_AUDIO_PARAM_PORTFORMATTYPE sAudioPortFormat;
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;

#ifdef RASPBERRY_PI
    char *componentName = "OMX.broadcom.audio_mixer";
#elif defined(LIM)
    char *componentName = "OMX.limoi.alsa_sink";
#else
    char *componentName = "OMX.st.volume.component";
#endif
    char name[128]; /* spec says 128 is max name length */
    OMX_UUIDTYPE uid;
    int startPortNumber;
    int nPorts;
    int n;

    /* override component name by command-line argument */
    if (argc == 2) {
        componentName = argv[1];
    }

# ifdef RASPBERRY_PI
    bcm_host_init();
# endif

    err = OMX_Init();
    if(err != OMX_ErrorNone) {
        fprintf(stderr, "OMX_Init() failed\n", 0);
        exit(1);
    }
    /** Ask the core for a handle to the volume control component
     */
    err = OMX_GetHandle(&handle, componentName, NULL /*app private data */, &callbacks);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "OMX_GetHandle failed\n", 0);
        exit(1);
    }
    err = OMX_GetComponentVersion(handle, name, &compVersion, &specVersion, &uid);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "OMX_GetComponentVersion failed\n", 0);
        exit(1);
    }
    printf("Component name: %s version %d.%d, Spec version %d.%d\n",
           name, compVersion.s.nVersionMajor,
           compVersion.s.nVersionMinor,
           specVersion.s.nVersionMajor,
           specVersion.s.nVersionMinor);

    /** Get  ports information */
    getAllAudioPortsInformation(0);
    getAllVideoPortsInformation(0);
    getAllImagePortsInformation(0);
    getAllOtherPortsInformation(0);

    exit(0);
}

The Makefile for the Bellagio version is as follows:

INCLUDES=-I ../libomxil-bellagio-0.9.3/include/
LIBS=-L ../libomxil-bellagio-0.9.3/src/.libs -l omxil-bellagio
CFLAGS = -g

info: info.c
        cc $(CFLAGS) $(INCLUDES) -o info info.c $(LIBS)
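
Since main() allows the component name to be overridden on the command line, the program can be pointed at any of the components reported by listcomponents.c, for example:

./info OMX.st.audio.mixer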

The output using the Bellagio implementation is as follows:

Component name: OMX.st.volume.component version 1.1, Spec version 1.1
Audio ports:
  Ports start on 0
  There are 2 open ports
    Port 0 requires 2 buffers
    Port 0 is an input port
    Port 0 is an audio port
    Port mimetype raw/audio
    Port encoding is not PCM or MP3 or Vorbis, is 0
      Supported audio formats are:
      No coding format returned
    Port 1 requires 2 buffers
    Port 1 is an output port
    Port 1 is an audio port
    Port mimetype raw/audio
    Port encoding is not PCM or MP3 or Vorbis, is 0
      Supported audio formats are:
      No coding format returned
Video ports:
  No ports of this type
Image ports:
  No ports of this type
Other ports:
  No ports of this type

The Makefile for the Raspberry Pi is as follows:

INCLUDES=-I /opt/vc/include/IL -I /opt/vc/include -I /opt/vc/include/interface/vcos/pthreads
CFLAGS=-g -DRASPBERRY_PI
LIBS=-L /opt/vc/lib -l openmaxil -l bcm_host

info: info.c
        cc $(CFLAGS) $(INCLUDES) -o info info.c $(LIBS)

The output for the audio_render component on the Raspberry Pi is as follows:

Audio ports:
  Ports start on 100
  There are 1 open ports
    Port 100 requires 1 buffers
    Port 100 is an input port
    Port 100 is an audio port
    Port mimetype (null)
    Port encoding is PCM
      Supported audio formats are:
      Supported encoding is PCM
          PCM default sampling rate 44100
          PCM default bits per sample 16
          PCM default number of channels 2
      Supported encoding is DDP
      No more formats supported
Video ports:
  No ports of this type
Image ports:
  No ports of this type
Other ports:
  No ports of this type

The Makefile for LIM is as follows:

INCLUDES=-I ../../lim-omx-1.1/LIM/limoi-core/include/
#LIBS=-L ../../lim-omx-1.1/LIM/limoi-base/src/.libs -l limoi-base
LIBS = -L /home/newmarch/osm-build/lib/ -l limoa -l limoi-core
CFLAGS = -g -DLIM

info: info.c
        cc $(CFLAGS) $(INCLUDES) -o info info.c $(LIBS)

The LIM output for the alsa_sink component is as follows:

Component name: OMX.limoi.alsa_sink version 0.0, Spec version 1.1
Audio ports:
  Ports start on 0
  There are 1 open ports
    Port 0 requires 2 buffers
    Port 0 is an input port
    Port 0 is an audio port
    Port mimetype (null)
    Port encoding is PCM
LIM doesn't set audio formats properly
Error in getting video OMX_PORT_PARAM_TYPE parameter
Error in getting image OMX_PORT_PARAM_TYPE parameter
Error in getting other OMX_PORT_PARAM_TYPE parameter

The LIM implementation throws errors when a component doesn't support a mode (here, the audio component doesn't support the video, image, or other modes). This is contrary to the 1.1 specification, which states the following:

"All standard components shall support the following parameters:
  o OMX_IndexParamPortDefinition
  o OMX_IndexParamCompBufferSupplier
  o OMX_IndexParamAudioInit
  o OMX_IndexParamImageInit
  o OMX_IndexParamVideoInit
  o OMX_IndexParamOtherInit"

I suppose you could argue that the alsa_sink component isn't a standard component, so this is allowed. Well, maybe...

Playing PCM Audio Files

Playing audio to an output device requires use of the audio_render device. This is one of the standard devices in the 1.1 specification and is included in the Broadcom Raspberry Pi library, but not in the Bellagio library. LIM has a component, alsa_sink, that plays the same role.

The structure of a program to play audio is as follows:

  1. Initialize the library and the audio render component.
  2. Repeatedly fill an input buffer and ask the component to empty it.
  3. Catch the events from the component saying a buffer has been emptied, in order to schedule refilling and re-emptying of the buffer.
  4. Clean up when finished.

Note that the Raspberry Pi audio render component will play only PCM data, and the LIM alsa_sink component can play only at 44,100 Hz.

States

Initializing a component is a multistep process that depends on the state of the component. Components are created in the Loaded state. They transition from one state to another by OMX_SendCommand(handle, OMX_CommandStateSet, <next state>, <param>). The next state from Loaded should be Idle, and from there Executing. There are other states you don't need to be concerned with here.

Requests to change state are asynchronous. The send command returns immediately (well, within 5 milliseconds). When the actual change of state occurs, an event handler callback function is called.
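
In outline, the request-then-wait pattern looks like this (a sketch using the waitFor() helper defined in the next section):

  /* ask for the Loaded -> Idle transition */
  err = OMX_SendCommand(handle, OMX_CommandStateSet, OMX_StateIdle, NULL);
  if (err != OMX_ErrorNone) {
      fprintf(stderr, "Error on setting state to idle\n");
      exit(1);
  }
  /* do any other required setup (such as allocating buffers), then
     block until the event handler reports the completed transition */
  waitFor(OMX_StateIdle);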

Threads

Some commands require the component to be in a particular state. The request to put a component into a state is asynchronous, so the client can issue the request, but it may then have to wait until the state change has actually happened. This is best done by the client suspending its thread until it is woken by the state change occurring in the event handler.

Linux/Unix has standardized on the POSIX pthreads library for managing multiple threads. For our purposes, you use two parts of this library: the ability to place mutexes around critical sections and the ability to suspend/wake threads based on conditions. Pthreads is covered in many places, and Blaise Barney has a short, good tutorial called "POSIX Threads Programming" ( https://computing.llnl.gov/tutorials/pthreads/#Misc ).

The functions and data used are as follows:

pthread_mutex_t mutex;
OMX_STATETYPE currentState = OMX_StateLoaded;
pthread_cond_t stateCond;

void waitFor(OMX_STATETYPE state) {
    pthread_mutex_lock(&mutex);
    while (currentState != state)
        pthread_cond_wait(&stateCond, &mutex);
    fprintf(stderr, "Wait successfully completed\n");
    pthread_mutex_unlock(&mutex);
}

void wakeUp(OMX_STATETYPE newState) {
    pthread_mutex_lock(&mutex);
    currentState = newState;
    pthread_cond_signal(&stateCond);
    pthread_mutex_unlock(&mutex);
}
pthread_mutex_t empty_mutex;
int emptyState = 0;
OMX_BUFFERHEADERTYPE* pEmptyBuffer;
pthread_cond_t emptyStateCond;

void waitForEmpty() {
    pthread_mutex_lock(&empty_mutex);
    while (emptyState == 1)
        pthread_cond_wait(&emptyStateCond, &empty_mutex);
    emptyState = 1;
    pthread_mutex_unlock(&empty_mutex);
}

void wakeUpEmpty(OMX_BUFFERHEADERTYPE* pBuffer) {
    pthread_mutex_lock(&empty_mutex);
    emptyState = 0;
    pEmptyBuffer = pBuffer;
    pthread_cond_signal(&emptyStateCond);
    pthread_mutex_unlock(&empty_mutex);
}

void mutex_init() {
    int n = pthread_mutex_init(&mutex, NULL);
    if ( n != 0) {
        fprintf(stderr, "Can't init state mutex\n");
    }
    n = pthread_mutex_init(&empty_mutex, NULL);
    if ( n != 0) {
        fprintf(stderr, "Can't init empty mutex\n");
    }
}

Hungarian Notation in OpenMAX IL

Hungarian notation was invented by Charles Simonyi to add type or functional information to the names of variables, structures, and fields. A form of it is used heavily in the Microsoft Windows SDK. A simplified form is used in OpenMAX IL by prefixing variables, fields, and so on, as follows:

  • n prefixes a number of some type.
  • p prefixes a pointer.
  • s prefixes a structure or a string.
  • c prefixes a callback function.

The value of these conventions is debatable.
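
For example, these declarations from this chapter's own code follow the convention:

  OMX_U32 nBufferSize;                    /* n: a number    */
  OMX_BUFFERHEADERTYPE *pEmptyBuffer;     /* p: a pointer   */
  OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;   /* s: a structure */
  OMX_ERRORTYPE cEventHandler();          /* c: a callback  */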

Callbacks

Two types of callback functions are relevant to this example: event callbacks that occur on changes of state and on some other events, and the empty-buffer callback that occurs when a component has emptied an input buffer. These are registered as follows:

OMX_CALLBACKTYPE callbacks  = { .EventHandler = cEventHandler,
            .EmptyBufferDone = cEmptyBufferDone,
};
err = OMX_GetHandle(&handle, componentName, NULL /*app private data */, &callbacks);

Component Resources

Each component has a number of ports that have to be configured. The ports are some of the resources of a component. Each port starts off enabled, but it may be set to disabled by OMX_SendCommand(handle, OMX_CommandPortDisable, <port number>, NULL).

An enabled port can have buffers allocated for transferring data into and out of the component. This can be done in two ways: OMX_AllocateBuffer asks the component to perform the allocation for the client, while with OMX_UseBuffer the client hands a buffer over to the component. Since there can be buffer memory-alignment issues, I prefer to have the component do the allocation.
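
For comparison, here is a minimal sketch of the client-allocated alternative, assuming nBufferSize and startPortNumber are set up as later in this section:

  OMX_BUFFERHEADERTYPE *pHdr;
  OMX_U8 *pData = malloc(nBufferSize);    /* the client owns this memory */

  err = OMX_UseBuffer(handle, &pHdr, startPortNumber,
                      NULL /* pAppPrivate */, nBufferSize, pData);
  if (err != OMX_ErrorNone) {
      fprintf(stderr, "Error on UseBuffer %i\n", err);
      exit(1);
  }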

Here is a tricky bit. Before buffers can be allocated or used on a component, a request must have been made to transition from the Loaded state to Idle. So, OMX_SendCommand(handle, OMX_CommandStateSet, OMX_StateIdle, <param>) has to be called before allocating the buffers. But the transition to Idle does not actually take place until each port has either been disabled or had all of its buffers allocated.

This last step had me stumped for nearly a week. The audio_render component has two ports: an audio input port and a time-update port. While I had configured the audio port correctly, I hadn't disabled the time port, because I didn't know it had one. Consequently, the transition to Idle never occurred. Here is code to handle this situation:

    setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
    err = OMX_GetParameter(handle, OMX_IndexParamOtherInit, &param);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
        exit(1);
    }
    startPortNumber = param.nStartPortNumber;
    nPorts = param.nPorts;
    printf("Other has %d ports\n", nPorts);
    /* and disable it */
    err = OMX_SendCommand(handle, OMX_CommandPortDisable, startPortNumber, NULL);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Error on setting port to disabled\n");
        exit(1);
    }

Here is how to set up the parameters of the audio port:

    /** Get audio port information */
    setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
    err = OMX_GetParameter(handle, OMX_IndexParamAudioInit, &param);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
        exit(1);
    }
    startPortNumber = param.nStartPortNumber;
    nPorts = param.nPorts;
    if (nPorts > 1) {
        fprintf(stderr, "Render device has more than one port\n");
        exit(1);
    }

    setHeader(&sPortDef, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
    sPortDef.nPortIndex = startPortNumber;
    err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting OMX_PORT_DEFINITION_TYPE parameter\n", 0);
        exit(1);
    }
    if (sPortDef.eDomain != OMX_PortDomainAudio) {
        fprintf(stderr, "Port %d is not an audio port\n", startPortNumber);
        exit(1);
    }

    if (sPortDef.eDir != OMX_DirInput) {
        fprintf(stderr, "Port is not an input port\n");
        exit(1);
    }
    if (sPortDef.format.audio.eEncoding == OMX_AUDIO_CodingPCM) {
        printf("Port encoding is PCM\n");
    }    else {
        printf("Port has unknown encoding\n");
    }

    /* create minimum number of buffers for the port */
    nBuffers = sPortDef.nBufferCountActual = sPortDef.nBufferCountMin;
    printf("Number of bufers is %d\n", nBuffers);
    err = OMX_SetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in setting OMX_PORT_PARAM_TYPE parameter\n", 0);
        exit(1);
    }

    /* call to put state into idle before allocating buffers */
    err = OMX_SendCommand(handle, OMX_CommandStateSet, OMX_StateIdle, NULL);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Error on setting state to idle\n");
        exit(1);
    }

    err = OMX_SendCommand(handle, OMX_CommandPortEnable, startPortNumber, NULL);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Error on setting port to enabled\n");
        exit(1);
    }

    nBufferSize = sPortDef.nBufferSize;
    printf("%d buffers of size is %d\n", nBuffers, nBufferSize);

    inBuffers = malloc(nBuffers * sizeof(OMX_BUFFERHEADERTYPE *));
    if (inBuffers == NULL) {
        fprintf(stderr, "Can't allocate buffers\n");
        exit(1);
    }
    for (n = 0; n < nBuffers; n++) {
        err = OMX_AllocateBuffer(handle, inBuffers+n, startPortNumber, NULL,
                                 nBufferSize);
        if (err != OMX_ErrorNone) {
            fprintf(stderr, "Error on AllocateBuffer in 1%i\n", err);
            exit(1);
        }

    }

    waitFor(OMX_StateIdle);
    /* try setting the encoding to PCM mode */
    setHeader(&sPCMMode, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
    sPCMMode.nPortIndex = startPortNumber;
    err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    if(err != OMX_ErrorNone){
        printf("PCM mode unsupported\n");
        exit(1);
    } else {
        printf("PCM mode supported\n");
        printf("PCM sampling rate %d\n", sPCMMode.nSamplingRate);
        printf("PCM nChannels %d\n", sPCMMode.nChannels);
    }

Setting the Output Device

OpenMAX has a standard audio render component. But which device does it render to? The built-in sound card? A USB sound card? This is not part of OpenMAX IL; there isn't even a way to list the audio devices, only the audio components.

OpenMAX has an extension mechanism that an OpenMAX implementor can use to answer questions like this. The Broadcom core implementation has the extension types OMX_CONFIG_BRCMAUDIODESTINATIONTYPE (and OMX_CONFIG_BRCMAUDIOSOURCETYPE) that can be used to set the audio destination (source) device. Here is code to do this:

void setOutputDevice(const char *name) {
   int32_t success = -1;
   OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;

   if (name && strlen(name) < sizeof(arDest.sName)) {
       setHeader(&arDest, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
       strcpy((char *)arDest.sName, name);

       err = OMX_SetParameter(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
       if (err != OMX_ErrorNone) {
           fprintf(stderr, "Error on setting audio destination\n");
           exit(1);
       }
   }
}

This is where it descends into murk again. The header file <IL/OMX_Broadcom.h> states that the default value of sName is "local" but doesn't give any other values. The Raspberry Pi forums say that this refers to the 3.5mm analog audio output and that HDMI is chosen by using the value "hdmi". No other values are documented, and the Broadcom OpenMAX IL doesn't seem to support any other audio devices. In particular, the current Broadcom OpenMAX IL components do not support input or output on USB audio devices. So, you can't do audio capture on the Raspberry Pi using OpenMAX IL, since it has no Broadcom-supported audio input.
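
In practice, then, the choice comes down to a single call before playback starts, for example:

  setOutputDevice("local");   /* the 3.5mm jack; use "hdmi" for HDMI */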

The Main Loop

Once all the ports are set up, playing an audio file consists of filling buffers, waiting for them to be emptied, and refilling them until the data is finished. There are two possible styles.

  • Fill the buffers once in the main loop, and then keep refilling and emptying them from within the empty-buffer callback.
  • Keep filling and emptying the buffers in the main loop, waiting between each fill for the buffer to be emptied.

The Bellagio examples use the first technique. However, the 1.2 specification says "...an IL client shall not call IL core or component functions from within an IL callback context," so it is not a good technique. The Raspberry Pi examples use the second technique, but with a nonstandard call to find the current latency and wait on that. It is better to just set up more pthreads conditions and block on those.

This leads to a main loop that looks like this:

    emptyState = 1;
    for (;;) {
        int data_read = read(fd, inBuffers[0]->pBuffer, nBufferSize);
        inBuffers[0]->nFilledLen = data_read;
        inBuffers[0]->nOffset = 0;
        filesize -= data_read;
        if (data_read <= 0) {
            fprintf(stderr, "In the %s no more input data available\n", __func__);
            inBuffers[0]->nFilledLen=0;
            inBuffers[0]->nFlags = OMX_BUFFERFLAG_EOS;
            bEOS=OMX_TRUE;
            err = OMX_EmptyThisBuffer(handle, inBuffers[0]);
            break;
        }
        if(!bEOS) {
            fprintf(stderr, "Emptying again buffer %p %d bytes, %d to go\n", inBuffers[0], data_read, filesize);
            err = OMX_EmptyThisBuffer(handle, inBuffers[0]);
        }else {
            fprintf(stderr, "In %s Dropping Empty This buffer to Audio Dec\n", __func__);
        }
        waitForEmpty();
        printf("Waited for empty\n");
    }

    printf("Buffers emptied\n");

The Complete Program

The complete program is as follows:

/**
   Based on code
   Copyright (C) 2007-2009 STMicroelectronics
   Copyright (C) 2007-2009 Nokia Corporation and/or its subsidiary(-ies).
   under the LGPL
*/

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/stat.h>
#include <pthread.h>

#include <OMX_Core.h>
#include <OMX_Component.h>
#include <OMX_Types.h>
#include <OMX_Audio.h>

#ifdef RASPBERRY_PI
#include <bcm_host.h>
#include <IL/OMX_Broadcom.h>
#endif

OMX_ERRORTYPE err;
OMX_HANDLETYPE handle;
OMX_VERSIONTYPE specVersion, compVersion;

int fd = 0;
unsigned int filesize;
static OMX_BOOL bEOS=OMX_FALSE;

OMX_U32 nBufferSize;
int nBuffers;

pthread_mutex_t mutex;
OMX_STATETYPE currentState = OMX_StateLoaded;
pthread_cond_t stateCond;

void waitFor(OMX_STATETYPE state) {
    pthread_mutex_lock(&mutex);
    while (currentState != state)
        pthread_cond_wait(&stateCond, &mutex);
    pthread_mutex_unlock(&mutex);
}

void wakeUp(OMX_STATETYPE newState) {
    pthread_mutex_lock(&mutex);
    currentState = newState;
    pthread_cond_signal(&stateCond);
    pthread_mutex_unlock(&mutex);
}

pthread_mutex_t empty_mutex;
int emptyState = 0;
OMX_BUFFERHEADERTYPE* pEmptyBuffer;
pthread_cond_t emptyStateCond;

void waitForEmpty() {
    pthread_mutex_lock(&empty_mutex);
    while (emptyState == 1)
        pthread_cond_wait(&emptyStateCond, &empty_mutex);
    emptyState = 1;
    pthread_mutex_unlock(&empty_mutex);
}

void wakeUpEmpty(OMX_BUFFERHEADERTYPE* pBuffer) {
    pthread_mutex_lock(&empty_mutex);
    emptyState = 0;
    pEmptyBuffer = pBuffer;
    pthread_cond_signal(&emptyStateCond);
    pthread_mutex_unlock(&empty_mutex);
}

void mutex_init() {
    int n = pthread_mutex_init(&mutex, NULL);
    if ( n != 0) {
        fprintf(stderr, "Can't init state mutex\n");
    }
    n = pthread_mutex_init(&empty_mutex, NULL);
    if ( n != 0) {
        fprintf(stderr, "Can't init empty mutex\n");
    }
}

static void display_help() {
    fprintf(stderr, "Usage: render input_file");
}

/** Gets the file descriptor's size
 * @return the size of the file. If size cannot be computed
 * (i.e. stdin, zero is returned)
 */
static int getFileSize(int fd) {

    struct stat input_file_stat;
    int err;

    /* Obtain input file length */
    err = fstat(fd, &input_file_stat);
    if(err){
        fprintf(stderr, "fstat failed",0);
        exit(-1);
    }
    return input_file_stat.st_size;
}

OMX_ERRORTYPE cEventHandler(
                            OMX_HANDLETYPE hComponent,
                            OMX_PTR pAppData,
                            OMX_EVENTTYPE eEvent,
                            OMX_U32 Data1,
                            OMX_U32 Data2,
                            OMX_PTR pEventData) {

    fprintf(stderr, "Hi there, I am in the %s callback\n", __func__);
    if(eEvent == OMX_EventCmdComplete) {
        if (Data1 == OMX_CommandStateSet) {
            fprintf(stderr, "Component State changed in ", 0);
            switch ((int)Data2) {
            case OMX_StateInvalid:
                fprintf(stderr, "OMX_StateInvalid\n", 0);
                break;
            case OMX_StateLoaded:
                fprintf(stderr, "OMX_StateLoaded\n", 0);
                break;
            case OMX_StateIdle:
                fprintf(stderr, "OMX_StateIdle\n",0);
                break;
            case OMX_StateExecuting:
                fprintf(stderr, "OMX_StateExecuting\n",0);
                break;
            case OMX_StatePause:
                fprintf(stderr, "OMX_StatePause\n",0);
                break;
            case OMX_StateWaitForResources:
                fprintf(stderr, "OMX_StateWaitForResources\n",0);
                break;
            }
            wakeUp((int) Data2);
        } else  if (Data1 == OMX_CommandPortEnable){

        } else if (Data1 == OMX_CommandPortDisable){

        }
    } else if(eEvent == OMX_EventBufferFlag) {
        if((int)Data2 == OMX_BUFFERFLAG_EOS) {

        }
    } else {
        fprintf(stderr, "Param1 is %i\n", (int)Data1);
        fprintf(stderr, "Param2 is %i\n", (int)Data2);
    }

    return OMX_ErrorNone;
}

OMX_ERRORTYPE cEmptyBufferDone(
                               OMX_HANDLETYPE hComponent,
                               OMX_PTR pAppData,
                               OMX_BUFFERHEADERTYPE* pBuffer) {

    fprintf(stderr, "Hi there, I am in the %s callback.\n", __func__);
    if (bEOS) {
        fprintf(stderr, "Buffers emptied, exiting\n");
    }
    wakeUpEmpty(pBuffer);
    fprintf(stderr, "Exiting callback\n");

    return OMX_ErrorNone;
}

OMX_CALLBACKTYPE callbacks  = { .EventHandler = cEventHandler,
                                .EmptyBufferDone = cEmptyBufferDone,
};

void printState() {
    OMX_STATETYPE state;
    err = OMX_GetState(handle, &state);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Error on getting state\n");
        exit(1);
    }
    switch (state) {
    case OMX_StateLoaded: fprintf(stderr, "StateLoaded\n"); break;
    case OMX_StateIdle: fprintf(stderr, "StateIdle\n"); break;
    case OMX_StateExecuting: fprintf(stderr, "StateExecuting\n"); break;
    case OMX_StatePause: fprintf(stderr, "StatePause\n"); break;
    case OMX_StateWaitForResources: fprintf(stderr, "StateWaitForResources\n"); break;
    default:  fprintf(stderr, "State unknown\n"); break;
    }
}

static void setHeader(OMX_PTR header, OMX_U32 size) {
    /* header->nVersion */
    OMX_VERSIONTYPE* ver = (OMX_VERSIONTYPE*)(header + sizeof(OMX_U32));
    /* header->nSize */
    *((OMX_U32*)header) = size;

    /* for 1.2
       ver->s.nVersionMajor = OMX_VERSION_MAJOR;
       ver->s.nVersionMinor = OMX_VERSION_MINOR;
       ver->s.nRevision = OMX_VERSION_REVISION;
       ver->s.nStep = OMX_VERSION_STEP;
    */
    ver->s.nVersionMajor = specVersion.s.nVersionMajor;
    ver->s.nVersionMinor = specVersion.s.nVersionMinor;
    ver->s.nRevision = specVersion.s.nRevision;
    ver->s.nStep = specVersion.s.nStep;
}

/**
 * Disable unwanted ports, or we can't transition to Idle state
 */
void disablePort(OMX_INDEXTYPE paramType) {
    OMX_PORT_PARAM_TYPE param;
    int nPorts;
    int startPortNumber;
    int n;

    setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
    err = OMX_GetParameter(handle, paramType, &param);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
        exit(1);
    }
    startPortNumber = param.nStartPortNumber;
    nPorts = param.nPorts;
    if (nPorts > 0) {
        fprintf(stderr, "Other has %d ports\n", nPorts);
        /* and disable it */
        for (n = 0; n < nPorts; n++) {
            err = OMX_SendCommand(handle, OMX_CommandPortDisable, n + startPortNumber, NULL);
            if (err != OMX_ErrorNone) {
                fprintf(stderr, "Error on setting port to disabled\n");
                exit(1);
            }
        }
    }
}

#ifdef RASPBERRY_PI
/* For the RPi name can be "hdmi" or "local" */
void setOutputDevice(const char *name) {
   int32_t success = -1;
   OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;

   if (name && strlen(name) < sizeof(arDest.sName)) {
       setHeader(&arDest, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
       strcpy((char *)arDest.sName, name);

       err = OMX_SetParameter(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
       if (err != OMX_ErrorNone) {
           fprintf(stderr, "Error on setting audio destination\n");
           exit(1);
       }
   }
}
#endif

void setPCMMode(int startPortNumber) {
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;

    setHeader(&sPCMMode, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
    sPCMMode.nPortIndex = startPortNumber;
    sPCMMode.nSamplingRate = 48000;
    sPCMMode.nChannels = 2;             /* stereo */

    err = OMX_SetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "PCM mode unsupported\n");
        return;
    } else {
        fprintf(stderr, "PCM mode supported\n");
        fprintf(stderr, "PCM sampling rate %d\n", sPCMMode.nSamplingRate);
        fprintf(stderr, "PCM nChannels %d\n", sPCMMode.nChannels);
    }
}

int main(int argc, char** argv) {

    OMX_PORT_PARAM_TYPE param;
    OMX_PARAM_PORTDEFINITIONTYPE sPortDef;
    OMX_AUDIO_PORTDEFINITIONTYPE sAudioPortDef;
    OMX_AUDIO_PARAM_PORTFORMATTYPE sAudioPortFormat;
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
    OMX_BUFFERHEADERTYPE **inBuffers;

#ifdef RASPBERRY_PI
    char *componentName = "OMX.broadcom.audio_render";
#endif
#ifdef LIM
    char *componentName = "OMX.limoi.alsa_sink";
#endif
    char name[OMX_MAX_STRINGNAME_SIZE];
    OMX_UUIDTYPE uid;
    int startPortNumber;
    int nPorts;
    int n;

# ifdef RASPBERRY_PI
    bcm_host_init();
# endif

    fprintf(stderr, "Thread id is %p\n", pthread_self());
    if(argc < 2){
        display_help();
        exit(1);
    }

    fd = open(argv[1], O_RDONLY);
    if(fd < 0){
        perror("Error opening input file\n");
        exit(1);
    }
    filesize = getFileSize(fd);

    err = OMX_Init();
    if(err != OMX_ErrorNone) {
        fprintf(stderr, "OMX_Init() failed\n", 0);
        exit(1);
    }
    /** Ask the core for a handle to the audio render component
     */
    err = OMX_GetHandle(&handle, componentName, NULL /*app private data */, &callbacks);
    if(err != OMX_ErrorNone) {
        fprintf(stderr, "OMX_GetHandle failed\n", 0);
        exit(1);
    }
    err = OMX_GetComponentVersion(handle, name, &compVersion, &specVersion, &uid);
    if(err != OMX_ErrorNone) {
        fprintf(stderr, "OMX_GetComponentVersion failed\n", 0);
        exit(1);
    }

    /** disable other ports */
    disablePort(OMX_IndexParamOtherInit);

    /** Get audio port information */
    setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
    err = OMX_GetParameter(handle, OMX_IndexParamAudioInit, &param);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting OMX_PORT_PARAM_TYPE parameter\n", 0);
        exit(1);
    }
    startPortNumber = param.nStartPortNumber;
    nPorts = param.nPorts;
    if (nPorts > 1) {
        fprintf(stderr, "Render device has more than one port\n");
        exit(1);
    }

    /* Get and check port information */
    setHeader(&sPortDef, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
    sPortDef.nPortIndex = startPortNumber;
    err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting OMX_PORT_DEFINITION_TYPE parameter\n", 0);
        exit(1);
    }
    if (sPortDef.eDomain != OMX_PortDomainAudio) {
        fprintf(stderr, "Port %d is not an audio port\n", startPortNumber);
        exit(1);
    }

    if (sPortDef.eDir != OMX_DirInput) {
        fprintf(stderr, "Port is not an input port\n");
        exit(1);
    }
    if (sPortDef.format.audio.eEncoding == OMX_AUDIO_CodingPCM) {
        fprintf(stderr, "Port encoding is PCM\n");
    }    else {
        fprintf(stderr, "Port has unknown encoding\n");
    }

    /* Create minimum number of buffers for the port */
    nBuffers = sPortDef.nBufferCountActual = sPortDef.nBufferCountMin;
    fprintf(stderr, "Number of bufers is %d\n", nBuffers);
    err = OMX_SetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in setting OMX_PORT_PARAM_TYPE parameter\n", 0);
        exit(1);
    }
    if (sPortDef.bEnabled) {
        fprintf(stderr, "Port is enabled\n");
    } else {
        fprintf(stderr, "Port is not enabled\n");
    }

    /* call to put state into idle before allocating buffers */
    err = OMX_SendCommand(handle, OMX_CommandStateSet, OMX_StateIdle, NULL);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Error on setting state to idle\n");
        exit(1);
    }

    err = OMX_SendCommand(handle, OMX_CommandPortEnable, startPortNumber, NULL);
    if (err != OMX_ErrorNone) {
        fprintf(stderr, "Error on setting port to enabled\n");
        exit(1);
    }

    /* Configure buffers for the port */
    nBufferSize = sPortDef.nBufferSize;
    fprintf(stderr, "%d buffers of size is %d\n", nBuffers, nBufferSize);

    inBuffers = malloc(nBuffers * sizeof(OMX_BUFFERHEADERTYPE *));
    if (inBuffers == NULL) {
        fprintf(stderr, "Can't allocate buffers\n");
        exit(1);
    }

    for (n = 0; n < nBuffers; n++) {
        err = OMX_AllocateBuffer(handle, inBuffers+n, startPortNumber, NULL,
                                 nBufferSize);
        if (err != OMX_ErrorNone) {
            fprintf(stderr, "Error on AllocateBuffer in 1%i\n", err);
            exit(1);
        }
    }
    /* Make sure we've reached Idle state */
    waitFor(OMX_StateIdle);

    /* Now try to switch to Executing state */
    err = OMX_SendCommand(handle, OMX_CommandStateSet, OMX_StateExecuting, NULL);
    if(err != OMX_ErrorNone){
        exit(1);
    }

    /* One buffer is the minimum for Broadcom component, so use that */
    pEmptyBuffer = inBuffers[0];
    emptyState = 1;
    /* Fill and empty buffer */
    for (;;) {
        int data_read = read(fd, pEmptyBuffer->pBuffer, nBufferSize);
        pEmptyBuffer->nFilledLen = data_read;
        pEmptyBuffer->nOffset = 0;
        filesize -= data_read;
        if (data_read <= 0) {
            fprintf(stderr, "In the %s no more input data available\n", __func__);
            pEmptyBuffer->nFilledLen=0;
            pEmptyBuffer->nFlags = OMX_BUFFERFLAG_EOS;
            bEOS=OMX_TRUE;
        }
        fprintf(stderr, "Emptying again buffer %p %d bytes, %d to go\n", pEmptyBuffer, data_read, filesize);
        err = OMX_EmptyThisBuffer(handle, pEmptyBuffer);
        waitForEmpty();
        fprintf(stderr, "Waited for empty\n");
        if (bEOS) {
            fprintf(stderr, "Exiting loop\n");
            break;
        }
    }
    fprintf(stderr, "Buffers emptied\n");
    exit(0);
}

Conclusion

The Khronos Group has produced specifications for audio and video on low-powered systems. These are currently used by Android and on the Raspberry Pi. This chapter gave an introductory overview of the specifications, along with some example programs. The LIM package has not been updated since 2012 and the Bellagio package since 2011, so they do not appear to be actively maintained. The RPi, on the other hand, is flourishing, and OpenMAX programming of its GPU is covered in detail in my book Raspberry Pi GPU Audio Video Programming.
