An Analysis of VideoToolbox

weixin_34008784

Original link: https://juejin.im/post/5a30de56f265da431a432f19

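For a project at work, I originally wrote an RTMPVideoPlayer with FFMPEG and OpenGL, modeled on kxmovie from GitHub. During playback and decoding, CPU and memory usage were high, and the phone would heat up after playing for a while, so I had to find a fix. After a day of searching online, I found that since iOS 8.0 Apple has opened up the VideoToolbox framework, which gives access to hardware video encoding and decoding. I spent a long time on Apple's developer site without finding any documentation for it, which was frustrating. In the end I pieced the details together from Stack Overflow and Apple's session videos and got things working.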

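API Overview

On iOS there are five video-related frameworks. From the top layer down they are: AVKit - AVFoundation - VideoToolbox - Core Media - Core Video

Of these, VideoToolbox can decompress video into a CVPixelBuffer and compress video into a CMSampleBuffer.

If you want hardware encoding or decoding, the frameworks among these five that you would use are AVKit, AVFoundation and VideoToolbox. This article covers only VideoToolbox.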

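VideoToolbox Objects

CVPixelBuffer - an uncompressed raster image buffer

CVPixelBufferPool - as the name suggests, a pool of CVPixelBuffers that can be recycled

pixelBufferAttributes - a CFDictionary that may contain the video width and height, the pixel format type (32RGBA, YCbCr420), whether the buffer can be used with OpenGL ES, and similar information

CMTime - a rational time: the numerator is a 64-bit time value and the denominator is a 32-bit time scale

CMVideoFormatDescription - the video dimensions and format (kCMPixelFormat_32BGRA, kCMVideoCodecType_H264), plus extensions carrying other information such as the color space

CMBlockBuffer - a block of arbitrary bytes; in this pipeline it holds the compressed video data

CMSampleBuffer - for a compressed video frame it contains a CMTime, a CMVideoFormatDescription and a CMBlockBuffer; for an uncompressed raster image it contains a CMTime, a CMVideoFormatDescription and a CVPixelBuffer

CMClock - wraps a time source; CMClockGetHostTimeClock() wraps mach_absolute_time()

CMTimebase - a control view on top of a CMClock. It provides time mapping: CMTimebaseSetTime(timebase, kCMTimeZero); and rate control: CMTimebaseSetRate(timebase, 1.0);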
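To make the value/timescale representation of CMTime concrete, here is a minimal plain-C sketch. RationalTime and its helper functions are hypothetical stand-ins written for this illustration and are not Core Media types; the sketch assumes non-negative values and ignores overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for CMTime: the time in seconds is value / timescale. */
typedef struct {
    int64_t value;      /* 64-bit numerator */
    int32_t timescale;  /* 32-bit denominator (ticks per second) */
} RationalTime;

RationalTime rational_time_make(int64_t value, int32_t timescale) {
    assert(timescale > 0);
    RationalTime t = { value, timescale };
    return t;
}

double rational_time_seconds(RationalTime t) {
    return (double)t.value / (double)t.timescale;
}

/* Re-express t in new_timescale, rounding to the nearest tick.
   Assumes t.value >= 0 and that the multiplication does not overflow. */
RationalTime rational_time_rescale(RationalTime t, int32_t new_timescale) {
    assert(new_timescale > 0 && t.value >= 0);
    int64_t scaled = (t.value * new_timescale + t.timescale / 2) / t.timescale;
    return rational_time_make(scaled, new_timescale);
}
```

With a 90 kHz timescale, which is common for video timestamps, one frame at 30 fps lasts 3000 ticks, and five seconds is 450000 ticks.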

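Case One - Playing a video stream

When VideoToolbox hardware decoding is used to play a stream from the network, the complete flow is: fetch the network file -> extract the compressed H.264 samples -> hand them to AVSampleBufferDisplayLayer -> playback

Looking more closely, inside the AVSampleBufferDisplayLayer stage the video is also decoded into CVPixelBuffers before it is displayed.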

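Processing

This section describes how the H.264 stream is turned into CMSampleBuffers:

The H.264 syntax has a base layer called the Network Abstraction Layer, or NAL for short. An H.264 stream is simply a sequence of NAL units (NALUs).

A NALU may contain:

Video frames (or slices of a frame) - P frames, I frames, B frames

H.264 parameter sets: the Sequence Parameter Set (SPS) and the Picture Parameter Set (PPS)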

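In the stream, the parameter sets travel in-band as NALUs of their own. After processing, they are stored out-of-band in the Format Description.

To move the SPS and PPS from the elementary stream into a Format Description, call CMVideoFormatDescriptionCreateFromH264ParameterSets().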
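When the source is an MP4 file, the SPS and PPS usually arrive in an avcC configuration record (for example FFMPEG's extradata) and not as in-band NALUs. The following plain-C sketch shows one way to locate them, assuming the standard avcC layout; AvccParameterSets and parse_avcc are hypothetical names made up for this example. It keeps only the first SPS and the first PPS.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: pointers into the caller's buffer, nothing is copied. */
typedef struct {
    const uint8_t *sps;
    size_t sps_size;
    const uint8_t *pps;
    size_t pps_size;
    int nal_length_size;   /* size in bytes of the length prefix used in the samples */
} AvccParameterSets;

/* Reads one group of parameter sets: each entry is a 16-bit big-endian length plus data.
   Stores the first entry in *first / *first_size. Returns 0 on success, -1 on bad data. */
static int read_parameter_sets(const uint8_t *data, size_t size, size_t *pos, size_t count,
                               const uint8_t **first, size_t *first_size) {
    if (count < 1) return -1;
    for (size_t i = 0; i < count; i++) {
        if (*pos + 2 > size) return -1;
        size_t len = ((size_t)data[*pos] << 8) | data[*pos + 1];
        *pos += 2;
        if (*pos + len > size) return -1;
        if (i == 0) { *first = data + *pos; *first_size = len; }
        *pos += len;
    }
    return 0;
}

/* Parses an avcC record. Returns 0 on success, -1 if the record is malformed. */
int parse_avcc(const uint8_t *data, size_t size, AvccParameterSets *out) {
    if (data == NULL || out == NULL || size < 7 || data[0] != 1) return -1;
    out->nal_length_size = (data[4] & 0x03) + 1;   /* low 2 bits: lengthSizeMinusOne */
    size_t pos = 6;
    size_t num_sps = data[5] & 0x1F;               /* low 5 bits: number of SPS */
    if (read_parameter_sets(data, size, &pos, num_sps, &out->sps, &out->sps_size) != 0) return -1;
    if (pos + 1 > size) return -1;
    size_t num_pps = data[pos++];                  /* one byte: number of PPS */
    return read_parameter_sets(data, size, &pos, num_pps, &out->pps, &out->pps_size);
}
```

The resulting pointers and sizes are in the shape that CMVideoFormatDescriptionCreateFromH264ParameterSets() expects for its parameter set arrays, and nal_length_size corresponds to its NAL unit header length argument.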

NALU header

In an elementary stream, each NALU is preceded by either 0x00 00 01 or 0x00 00 00 01 (both occur; 0x00 00 01 is used as the example below). This prefix is called the start code.
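As an illustration of how start codes delimit NALUs, the plain-C sketch below walks a buffer, accepts both the 3-byte and the 4-byte start code, and records the nal_unit_type (the low 5 bits of the byte after the start code). The function names are hypothetical and chosen for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the length (3 or 4) of a start code beginning at buf[pos], or 0 if there is none. */
size_t start_code_length(const uint8_t *buf, size_t size, size_t pos) {
    if (pos + 3 <= size && buf[pos] == 0 && buf[pos + 1] == 0 && buf[pos + 2] == 1) return 3;
    if (pos + 4 <= size && buf[pos] == 0 && buf[pos + 1] == 0 &&
        buf[pos + 2] == 0 && buf[pos + 3] == 1) return 4;
    return 0;
}

/* Stores the nal_unit_type of the first max_types NALUs in types.
   Returns the total number of NALUs found in the buffer. */
size_t list_nal_types(const uint8_t *buf, size_t size, uint8_t *types, size_t max_types) {
    size_t count = 0;
    size_t pos = 0;
    while (pos < size) {
        size_t sc = start_code_length(buf, size, pos);
        if (sc == 0) {           /* not at a start code: keep scanning */
            pos++;
            continue;
        }
        pos += sc;               /* pos now points at the NALU header byte */
        if (pos < size) {
            if (count < max_types) types[count] = buf[pos] & 0x1F;
            count++;
        }
    }
    return count;
}
```

Common values are 7 for an SPS, 8 for a PPS, 5 for an IDR slice and 1 for a non-IDR slice.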

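In an MP4 file, each NALU is instead preceded by its length, for example 0x00 00 80 00 for a 32768-byte NALU. So to turn elementary stream data into a CMSampleBuffer, replace each start code with a length prefix, then combine CMBlockBuffer + CMVideoFormatDescription + CMTime (optional). The conversion itself is done by calling CMSampleBufferCreate().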
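The start-code-to-length rewrite can be sketched in plain C as follows. This is a minimal example under a stated assumption: every NALU in the buffer is preceded by a 4-byte start code, so each prefix can be overwritten in place with a 4-byte big-endian length. is_start_code4 and annexb_to_avcc_inplace are hypothetical names for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the four bytes at p are the start code 00 00 00 01. */
int is_start_code4(const uint8_t *p) {
    return p[0] == 0 && p[1] == 0 && p[2] == 0 && p[3] == 1;
}

/* Overwrites every 4-byte start code with the big-endian length of the NALU that follows.
   Returns the number of NALUs converted, or -1 if buf does not begin with a start code. */
int annexb_to_avcc_inplace(uint8_t *buf, size_t size) {
    if (buf == NULL || size < 5 || !is_start_code4(buf)) return -1;
    int count = 0;
    size_t nalu_start = 4;                       /* first byte after the current prefix */
    while (nalu_start < size) {
        size_t next = nalu_start;
        while (next + 4 <= size && !is_start_code4(buf + next)) next++;
        if (next + 4 > size) next = size;        /* no further start code: NALU runs to the end */
        uint32_t len = (uint32_t)(next - nalu_start);
        buf[nalu_start - 4] = (uint8_t)(len >> 24);
        buf[nalu_start - 3] = (uint8_t)(len >> 16);
        buf[nalu_start - 2] = (uint8_t)(len >> 8);
        buf[nalu_start - 1] = (uint8_t)len;
        count++;
        nalu_start = next + 4;                   /* skip over the next prefix */
    }
    return count;
}
```

Streams that use 3-byte start codes cannot be converted in place this way, because the length prefix would be one byte longer than the start code it replaces; those need a copy into a larger buffer.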

Time control

If you need to control when each frame is displayed, you can do so through a CMTimebase

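CMTimebaseRef controlTimebase = NULL;
CMTimebaseCreateWithMasterClock(kCFAllocatorDefault, CMClockGetHostTimeClock(), &controlTimebase);
sbDisplayLayer.controlTimebase = controlTimebase;
CMTimebaseSetTime(sbDisplayLayer.controlTimebase, CMTimeMake(5, 1)); // start at 5 seconds
CMTimebaseSetRate(sbDisplayLayer.controlTimebase, 1.0);              // play at normal speed
CFRelease(controlTimebase); // the layer keeps its own reference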

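Summary

That is roughly how a network stream is played. In short:

1) Create an AVSampleBufferDisplayLayer

2) Convert the H.264 elementary stream into CMSampleBuffers

3) Feed the CMSampleBuffers to the AVSampleBufferDisplayLayer

4) Optionally use a custom CMTimebase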

Case Two - Getting CVPixelBuffers from a compressed stream

Creating the decoder

This step requires:

A description of the source data - CMVideoFormatDescription
The requirements for the output buffers - pixelBufferAttributes:

e.g.:

NSDictionary *destinationImageBufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:
    [NSNumber numberWithBool:YES], (id)kCVPixelBufferOpenGLESCompatibilityKey, nil];

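An output callback - VTDecompressionOutputCallback. The callback receives the following: the output CVPixelBuffer, the timestamp, the decoding error code, and information about dropped frames

The above follows Apple's presentation. The code below shows how it works in practice.

In my own project I use FFMPEG together with VideoToolbox to decode MP4 files from the network. I will not explain the FFMPEG part and only show the VideoToolbox hardware decoding part.

For more on H.264 start codes, see my other article on the subject.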

...

// Use FFMPEG's codec context to obtain the SPS, PPS and IDR data
// The SPS and PPS are in the codec's extradata
// The IDR data is in the packet's data
- (void)setupVideoDecoder {
  _pCodecCtx = _pFormatCtx->streams[_videoStream]->codec;
  
  while (av_read_frame(_pFormatCtx, &_packet) >= 0) {
    // Only handle packets that belong to the video stream
    if (_packet.stream_index == _videoStream) {
      [self.videoDecoder decodeWithCodec:_pCodecCtx packet:_packet];
    }
  }
}

...

Decoder.m

#import "UFVideoDecoder.h"
#import <VideoToolbox/VideoToolbox.h>

@interface UFVideoDecoder () {
  NSData *_spsData;
  NSData *_ppsData;
  VTDecompressionSessionRef _decompressionSessionRef;
  CMVideoFormatDescriptionRef _formatDescriptionRef;
  OSStatus _status;
}

@end

@implementation UFVideoDecoder

- (void)decodeWithCodec:(AVCodecContext *)codec packet:(AVPacket)packet {
  
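  // The parameter sets only need to be parsed once, not for every packet
  if (_formatDescriptionRef == NULL) [self findSPSAndPPSInCodec:codec];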
  [self decodePacket:packet];
}

#pragma mark - Private Methods
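// Find the SPS and PPS data
- (void)findSPSAndPPSInCodec:(AVCodecContext *)codec {
  // Replace the bytes we do not need and put a start code in front of the SPS and PPS
  // Suppose the extradata is 0x01 64 00 0A FF E1 00 19 67 64 00 00..., where the SPS starts at 67
  //  After the rewrite it becomes 0x00 00 00 01 67 64...

// Use the bitstream filter that FFMPEG provides.
// I first assumed this FFMPEG call would hand back the SPS and PPS directly, but it only rewrites the extradata with start codes.
// Note that this code keeps logging **Packet header is not contained in global extradata, corrupted stream or invalid MP4/AVCC bitstream**. It did not seem to affect the extracted data, so I ignored it.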
  uint8_t *dummy = NULL;
  int dummy_size;
  AVBitStreamFilterContext* bsfc =  av_bitstream_filter_init("h264_mp4toannexb");
  av_bitstream_filter_filter(bsfc, codec, NULL, &dummy, &dummy_size, NULL, 0, 0);
  av_bitstream_filter_close(bsfc);
  
// Locate the SPS and PPS data and work out their lengths
  int startCodeSPSIndex = 0;
  int startCodePPSIndex = 0;
  uint8_t *extradata = codec->extradata;
  for (int i = 3; i < codec->extradata_size; i++) {
    if (extradata[i] == 0x01 && extradata[i-1] == 0x00 && extradata[i-2] == 0x00 && extradata[i-3] == 0x00) {
      if (startCodeSPSIndex == 0) startCodeSPSIndex = i + 1;
      if (i > startCodeSPSIndex) {
        startCodePPSIndex = i + 1;
        break;
      }
    }
  }
  // Bail out if either start code is missing; the lengths below would be negative
  if (startCodeSPSIndex == 0 || startCodePPSIndex == 0) return;
  
  // Subtract 4 for the 4-byte start code in front of the PPS
  int spsLength = startCodePPSIndex - 4 - startCodeSPSIndex;
  int ppsLength = codec->extradata_size - startCodePPSIndex;
  
  _spsData = [NSData dataWithBytes:&extradata[startCodeSPSIndex] length:spsLength];
  _ppsData = [NSData dataWithBytes:&extradata[startCodePPSIndex] length:ppsLength];

  if (_spsData != nil && _ppsData != nil) {
    // Set H.264 parameters
    const uint8_t* parameterSetPointers[2] = { (uint8_t *)[_spsData bytes], (uint8_t *)[_ppsData bytes] };
    const size_t parameterSetSizes[2] = { [_spsData length], [_ppsData length] };
// Create the CMVideoFormatDescription (4 is the size of the NALU length prefix)
    _status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault, 2, parameterSetPointers, parameterSetSizes, 4, &_formatDescriptionRef);
    if (_status != noErr) NSLog(@"\n\nFormat Description ERROR: %d", (int)_status);
  }
  
  if (_status == noErr && _decompressionSessionRef == NULL) [self createDecompressionSession];
}

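// Forward declaration of the output callback defined further down
void decompressionSessionDecodeFrameCallback(void *, void *, OSStatus, VTDecodeInfoFlags, CVImageBufferRef, CMTime, CMTime);

// Create the session
- (void)createDecompressionSession {
  // Make sure to destroy the old VTD session
  if (_decompressionSessionRef != NULL) {
    VTDecompressionSessionInvalidate(_decompressionSessionRef);
    CFRelease(_decompressionSessionRef);
    _decompressionSessionRef = NULL;
  }
  
// Output callback
  VTDecompressionOutputCallbackRecord callbackRecord;
  callbackRecord.decompressionOutputCallback = decompressionSessionDecodeFrameCallback;
// Pass self along so that the callback can reach it
  callbackRecord.decompressionOutputRefCon = (__bridge void*)self;
  
  // pixelBufferAttributes
  NSDictionary *destinationImageBufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:[NSNumber numberWithBool:YES], (id)kCVPixelBufferOpenGLESCompatibilityKey, [NSNumber numberWithInt:kCVPixelFormatType_32BGRA], (id)kCVPixelBufferPixelFormatTypeKey, nil];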
  _status = VTDecompressionSessionCreate(NULL, _formatDescriptionRef, NULL, (__bridge CFDictionaryRef)(destinationImageBufferAttributes), &callbackRecord, &_decompressionSessionRef);

  if(_status != noErr) NSLog(@"\t\t VTD ERROR type: %d", (int)_status);
}

// Output callback
void decompressionSessionDecodeFrameCallback(void *decompressionOutputRefCon, void *sourceFrameRefCon, OSStatus status, VTDecodeInfoFlags infoFlags, CVImageBufferRef imageBuffer, CMTime presentationTimestamp, CMTime presentationDuration) {
  UFVideoDecoder *decoder = (__bridge UFVideoDecoder*)decompressionOutputRefCon;
  if (status != noErr) {
    NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
    NSLog(@"Decompressed error: %@", error);
  } else {
    [decoder.delegate getDecodeImageData:imageBuffer];
  }
  
}

// Decode IDR or non-IDR data
- (void)decodePacket:(AVPacket)packet {
  uint8_t* frame = packet.data;
  int size = packet.size;
  if (frame == NULL || size <= 4 || _decompressionSessionRef == NULL) return;
  
  int startIndex = 4; // The NALU header is the 5th byte, right after the 4-byte prefix
  int nalu_type = ((uint8_t)frame[startIndex] & 0x1F);
  // 5 is an IDR slice, 1 is a non-IDR slice
  if (nalu_type == 1 || nalu_type == 5) {
    // Create the CMBlockBuffer. kCFAllocatorNull means the packet memory is referenced, not copied or freed
    CMBlockBufferRef blockBufferRef = NULL;
    _status = CMBlockBufferCreateWithMemoryBlock(NULL, frame, size, kCFAllocatorNull, NULL, 0, size, 0, &blockBufferRef);
    if (_status != kCMBlockBufferNoErr || blockBufferRef == NULL) return;
   
    // Overwrite the first 4 bytes with the NALU length in big-endian order.
    // This assumes the packet holds exactly one NALU.
    int naluLength = size - 4;
    const uint8_t sourceBytes[] = {(uint8_t)(naluLength >> 24), (uint8_t)(naluLength >> 16), (uint8_t)(naluLength >> 8), (uint8_t)naluLength};
    _status = CMBlockBufferReplaceDataBytes(sourceBytes, blockBufferRef, 0, 4);
    
    // CMSampleBuffer
    CMSampleBufferRef sbRef = NULL;
    //        int32_t timeSpan = 90000;
    //        CMSampleTimingInfo timingInfo;
    //        timingInfo.presentationTimeStamp = CMTimeMake(0, timeSpan);
    //        timingInfo.duration =  CMTimeMake(3000, timeSpan);
    //        timingInfo.decodeTimeStamp = kCMTimeInvalid;
    const size_t sampleSizeArray[] = {(size_t)size};
    _status = CMSampleBufferCreate(kCFAllocatorDefault, blockBufferRef, true, NULL, NULL, _formatDescriptionRef, 1, 0, NULL, 1, sampleSizeArray, &sbRef);
    CFRelease(blockBufferRef); // the sample buffer holds its own reference
    if (_status != noErr || sbRef == NULL) return;
    
    // Decode. The callback does not use sourceFrameRefCon, so pass NULL
    // instead of the address of a local variable.
    VTDecodeFrameFlags flags = kVTDecodeFrame_EnableAsynchronousDecompression;
    VTDecodeInfoFlags flagOut;
    _status = VTDecompressionSessionDecodeFrame(_decompressionSessionRef, sbRef, flags, NULL, &flagOut);
    CFRelease(sbRef);
  }
}

@end


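Following these steps completes hardware decoding of the stream:

1) Demux with FFMPEG

2) Extract the SPS and PPS data and create the CMVideoFormatDescription object

3) Create the VTDecompressionSession, paying attention to the callback and the pixelBufferAttributes

4) Parse the IDR data and create the CMBlockBuffer object

5) Replace the 4 bytes in front of the IDR data with the NALU length

6) Create the CMSampleBuffer

7) Decode: VTDecompressionSessionDecodeFrame

The display part is still being written, so that is all on VideoToolbox for now.

The link below points to a Stack Overflow answer on hardware decoding. It is very detailed and makes a good reference.

Link: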

Stack Overflow - how-to-use-videotoolbox-to-decompress-h-264-video-stream

Reposted from: https://juejin.im/post/5a30de56f265da431a432f19
