iOS Live Streaming (4): Compressing and Encoding Video

imJackXu

Category: iOS. Tags: live-stream encoding, h264, video

Article link: https://blog.csdn.net/dolacmeng/article/details/86649217

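1. Why encode at all?

Raw, uncompressed video takes up a great deal of space, which makes it impractical to store or send over a network. A recorded video is therefore encoded first, then transmitted, and finally decoded for playback.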
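
To put a number on "a great deal of space", here is a minimal sketch of the arithmetic. It assumes frames stored as 8-bit YUV 4:2:0 (1.5 bytes per pixel), which is a common camera output format but is an assumption, not something stated above; the function names are illustrative.

```c
#include <stdint.h>

/* Bytes needed for one uncompressed 8-bit YUV 4:2:0 frame:
   one luma byte per pixel plus two chroma planes at quarter resolution,
   which adds up to 1.5 bytes per pixel. Assumes even width and height. */
uint64_t raw_frame_bytes(uint32_t width, uint32_t height) {
    return (uint64_t)width * height * 3 / 2;
}

/* Uncompressed data rate in bits per second at a given frame rate. */
uint64_t raw_bits_per_second(uint32_t width, uint32_t height, uint32_t fps) {
    return raw_frame_bytes(width, height) * 8 * fps;
}
```

Under these assumptions a 1280x720 stream at 30 frames per second needs 331,776,000 bit/s uncompressed, roughly 405 times the 800 * 1024 bit/s average bit rate that the demo later in this article asks the encoder for.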

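2. Why can video be compressed?

Video contains redundant information, mainly data redundancy and visual redundancy.
1. Data redundancy: the pixels of an image are strongly correlated with one another. Removing this redundancy loses no information, so it is lossless compression. It comes in two forms:

Spatial redundancy: pixels within the same frame are strongly correlated; intra-frame prediction coding removes this redundancy.
Temporal redundancy: adjacent frames resemble each other; inter-frame prediction coding removes this redundancy.

2. Visual redundancy: the human eye has limits, such as its brightness discrimination threshold and visual threshold, and it is more sensitive to luminance than to chrominance. An encoder can therefore introduce a moderate amount of error without the viewer noticing, trading some objective distortion for a smaller data size. This is lossy compression.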
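
Temporal redundancy can be illustrated with a deliberately simplified sketch: count how many pixels actually differ between two consecutive frames. Real encoders use block-based, motion-compensated prediction rather than a per-pixel comparison, so treat this only as an illustration of the idea; the function name is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts the pixels whose value differs between two equally sized
   8-bit luma frames. When most pixels are unchanged, storing only the
   differences takes far less data than storing the whole frame again. */
size_t changed_pixels(const uint8_t *prev, const uint8_t *cur, size_t count) {
    size_t changed = 0;
    for (size_t i = 0; i < count; i++) {
        if (prev[i] != cur[i]) {
            changed++;
        }
    }
    return changed;
}
```

For a mostly static scene the count is a small fraction of the frame, which is the property inter-frame prediction exploits.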

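3. Compression standards

The dominant standards today are the H.26x series led by the ITU (International Telecommunication Union). H.264 is currently the most widely deployed; with the arrival of 4K and 8K ultra-high-definition video, H.265 is gradually gaining adoption.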

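4. How H.264 compresses

(1) H.264 organizes pictures into sequences called GOPs (groups of pictures): a run of consecutive frames forms one GOP, so a GOP is the encoded data stream for one stretch of video.
(2) The frames within a GOP are classified as I-frames, B-frames, and P-frames.

I-frame: intra-coded frame (intra picture). It is the first frame of each GOP and is compressed by removing spatial redundancy only. In the simple model described here, each GOP contains exactly one I-frame.

P-frame: predictive-coded frame (predictive frame). It is encoded by removing temporal redundancy relative to an earlier, already encoded frame in the GOP (an I-frame or a P-frame). Each GOP contains one or more P-frames.

The first picture of a sequence (GOP) is called an IDR picture (instantaneous decoding refresh picture). Every IDR picture is an I-frame, but not every I-frame is an IDR picture. H.264 introduced IDR pictures so that decoding can resynchronize: when the decoder reaches an IDR picture it immediately empties the reference frame queue, outputs or discards all decoded data, looks up the parameter sets again, and starts a new sequence.

B-frame: bi-directionally predicted frame (bi-directional interpolated prediction frame). It is compressed using the differences between the current frame and both the preceding and the following reference frames, so only those differences are recorded. One or more B-frames may sit between an I-frame and a P-frame, or between two P-frames.
The I-frame is the base frame: P-frames are predicted from I-frames, and B-frames are predicted from I-frames and P-frames together. In general, I-frames compress least, P-frames compress more, and B-frames compress most.
(3) Finally, the I-frame data and the prediction differences are stored and transmitted.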
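
The frame layout described above can be sketched as a fixed pattern in display order. This is only an illustration under stated assumptions: a fixed GOP length, a fixed number of B-frames between reference frames, and a GOP length chosen so that each GOP ends on a P-frame. Real encoders pick frame types adaptively, and the function name is hypothetical.

```c
/* Returns 'I', 'P' or 'B' for the frame at display-order position `index`,
   assuming a fixed GOP of `gop_len` frames (gop_len > 0) with `b_between`
   B-frames between consecutive reference frames. */
char frame_type(unsigned index, unsigned gop_len, unsigned b_between) {
    unsigned pos = index % gop_len;   /* position inside the current GOP */
    if (pos == 0) {
        return 'I';                   /* every GOP starts with an I-frame */
    }
    /* Every (b_between + 1)-th frame is a reference P-frame,
       the frames in between are B-frames. */
    return (pos % (b_between + 1) == 0) ? 'P' : 'B';
}
```

With a GOP length of 7 and two B-frames between reference frames, fourteen consecutive frames come out as IBBPBBP IBBPBBP.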

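5. H.264 layering

H.264 divides its functionality into two layers:

Video Coding Layer (VCL): the compressed, encoded video data itself. Everything described so far belongs to the VCL.
Network Abstraction Layer (NAL): VCL data can only be transmitted or stored once it has been packed into NAL units.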

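6. NAL encapsulation

(1) Format:
The NAL layer writes each frame's data into one or more NAL units (NALUs) for transmission or storage.
A NALU consists of a NAL header and a NAL body.
In a byte stream, each NALU is preceded by a start code, usually 00 00 00 01, which marks the beginning of a new NALU. The NAL header itself is the single byte that follows the start code; its low 5 bits give the NALU type.
The NALU body carries the VCL-encoded data or other information.

(2) Process:
I-frames, P-frames, and B-frames are each packed into one or more NALUs for transmission or storage.
Before an I-frame there are also non-VCL NAL units that carry other information, such as the SPS and PPS.

SPS (Sequence Parameter Set): parameters that apply to a whole sequence
PPS (Picture Parameter Set): parameters that apply to individual pictures

In real H.264 data, each NALU is preceded by a 00 00 00 01 or 00 00 01 delimiter. Typically the first data an encoder emits is the SPS and PPS, followed by an I-frame and then the subsequent P-frames and B-frames.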
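
As a sketch of how a reader of such a stream might locate and classify NALUs, the helpers below scan for start codes and decode the NALU type from the header byte. The type values (1, 5, 6, 7, 8) come from the H.264 specification; the function names are illustrative and the scan is kept deliberately simple.

```c
#include <stddef.h>
#include <stdint.h>

/* The NALU type is stored in the low 5 bits of the NAL header byte. */
int nal_unit_type(uint8_t header) {
    return header & 0x1F;
}

/* Human-readable name for the NALU types mentioned in this article. */
const char *nal_type_name(int type) {
    switch (type) {
        case 1:  return "non-IDR slice";
        case 5:  return "IDR slice";
        case 6:  return "SEI";
        case 7:  return "SPS";
        case 8:  return "PPS";
        default: return "other";
    }
}

/* Finds the next start code (00 00 01, which also matches the tail of
   00 00 00 01) at or after offset `from`. Returns the offset of the byte
   right after the start code (the NAL header), or `len` if none is found. */
size_t next_nal_offset(const uint8_t *buf, size_t len, size_t from) {
    for (size_t i = from; i + 3 <= len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            return i + 3;
        }
    }
    return len;
}
```

Calling next_nal_offset repeatedly, each time starting from the previous result, walks the stream one NALU at a time; a file that follows the order described above would report SPS, then PPS, then an IDR slice.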

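7. Encoding approaches

(1) Hardware encoding: encoding runs off the CPU, for example on a GPU or on a dedicated DSP, FPGA, or ASIC. This keeps CPU load low but places higher demands on the GPU or other hardware. As of iOS 8, Apple provides the VideoToolbox and AudioToolbox frameworks for hardware encoding.
(2) Software encoding: encoding runs on the CPU, usually with the open-source ffmpeg + x264.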

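8. Demo code

The code below walks through the whole capture-and-encode pipeline: capture -> receive video frames -> encode each frame -> read the encoded frame's information -> write the encoded data to a file as NALUs.

Since the iPhone has capable hardware and offers the VideoToolbox and AudioToolbox frameworks to drive it, the demo uses hardware encoding.

(1) Wrap the AVFoundation video capture code from the previous article in a VideoCapture class that implements a start method, startCapture:, and a stop method, stopCapture.

Each sampleBuffer received is encoded with the VideoEncoder class created in the next step.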
VideoCapture.h

@interface VideoCapture : NSObject

- (void)startCapture:(UIView *)preview;

- (void)stopCapture;

@end

VideoCapture.m

#import "VideoCapture.h"
#import "VideoEncoder.h"
#import <AVFoundation/AVFoundation.h>

@interface VideoCapture () <AVCaptureVideoDataOutputSampleBufferDelegate>

/** Encoder object */
@property (nonatomic, strong) VideoEncoder *encoder;

/** Capture session (strong, because this object owns the session) */
@property (nonatomic, strong) AVCaptureSession *captureSession;

/** Preview layer */
@property (nonatomic, weak) AVCaptureVideoPreviewLayer *previewLayer;

/** Queue on which captured frames are delivered */
@property (nonatomic, strong) dispatch_queue_t captureQueue;

@end

@implementation VideoCapture

- (void)startCapture:(UIView *)preview
{
    // 0. Create the encoder
    self.encoder = [[VideoEncoder alloc] init];
    
    // 1. Create the capture session
    AVCaptureSession *session = [[AVCaptureSession alloc] init];
    session.sessionPreset = AVCaptureSessionPreset1280x720;
    self.captureSession = session;
    
    // 2. Add the input device (bail out if the camera input cannot be used)
    AVCaptureDevice *device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    NSError *error = nil;
    AVCaptureDeviceInput *input = [[AVCaptureDeviceInput alloc] initWithDevice:device error:&error];
    if (input == nil || ![session canAddInput:input]) {
        NSLog(@"Failed to add camera input: %@", error);
        return;
    }
    [session addInput:input];
    
    // 3. Add the output; the sample buffer delegate must be called on a serial queue
    AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
    self.captureQueue = dispatch_queue_create("video.capture.queue", DISPATCH_QUEUE_SERIAL);
    [output setSampleBufferDelegate:self queue:self.captureQueue];
    [session addOutput:output];
    
    // Set the orientation of the recorded video
    AVCaptureConnection *connection = [output connectionWithMediaType:AVMediaTypeVideo];
    [connection setVideoOrientation:AVCaptureVideoOrientationPortrait];
    
    // 4. Add the preview layer
    AVCaptureVideoPreviewLayer *previewLayer = [[AVCaptureVideoPreviewLayer alloc] initWithSession:session];
    previewLayer.frame = preview.bounds;
    [preview.layer insertSublayer:previewLayer atIndex:0];
    self.previewLayer = previewLayer;
    
    // 5. Start capturing
    [self.captureSession startRunning];
}

- (void)stopCapture {
    [self.captureSession stopRunning];
    [self.previewLayer removeFromSuperlayer];
    [self.encoder endEncode];
}

#pragma mark - Sample buffer delegate
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
    [self.encoder encodeSampleBuffer:sampleBuffer];
}

@end

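(2) Create the VideoEncoder class, which does the encoding.

setUpFileHandle creates the file that stores the encoded video.
setUpVideoSession creates and configures the compression session, compressionSession. In the call to VTCompressionSessionCreate(), the 8th argument registers the C function didCompressH264 as the encode callback.
Each sample buffer captured in the previous step is passed to encodeSampleBuffer: for encoding; once a frame has been encoded, didCompressH264 is called back.
didCompressH264 checks whether the frame is a keyframe. If it is, the SPS and PPS are extracted, converted to NSData, and written to the file as NALUs. The frame data of every frame, keyframe or not, is then written to the file as NALUs as well.
When capture ends, endEncode is called to tear down the session.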
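
The core of the callback is a format conversion: VideoToolbox hands back NALUs prefixed with a 4-byte big-endian length (AVCC layout), while a raw .h264 file expects each NALU to be prefixed with a start code (Annex B layout). The following is a minimal, framework-free sketch of that conversion in plain C, assuming 4-byte length prefixes throughout; the function name is illustrative and this is not the demo's own code, which does the same work with NSData.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Converts a buffer of NAL units prefixed with 4-byte big-endian lengths
   (AVCC) into Annex B form by writing a 00 00 00 01 start code in place of
   each length. Both prefixes are 4 bytes, so output size equals input size.
   Returns the number of bytes written, or 0 if the input is malformed or
   `out` is too small. */
size_t avcc_to_annexb(const uint8_t *in, size_t in_len,
                      uint8_t *out, size_t out_cap) {
    static const uint8_t start_code[4] = {0, 0, 0, 1};
    size_t off = 0;

    if (out_cap < in_len) {
        return 0;
    }
    while (in_len - off >= 4) {
        /* Read the big-endian NALU length byte by byte (endian-safe). */
        uint32_t nalu_len = ((uint32_t)in[off] << 24) |
                            ((uint32_t)in[off + 1] << 16) |
                            ((uint32_t)in[off + 2] << 8) |
                            (uint32_t)in[off + 3];
        if (nalu_len > in_len - off - 4) {
            return 0;                 /* length runs past the buffer */
        }
        memcpy(out + off, start_code, 4);
        memcpy(out + off + 4, in + off + 4, nalu_len);
        off += 4 + (size_t)nalu_len;
    }
    /* Leftover bytes that do not form a full prefix mean bad input. */
    return (off == in_len) ? off : 0;
}
```

Reading the length byte by byte avoids any dependence on host byte order, which is the same job CFSwapInt32BigToHost performs in the Objective-C callback.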

VideoEncoder.h

#import <UIKit/UIKit.h>
#import <VideoToolbox/VideoToolbox.h>

@interface VideoEncoder : NSObject

- (void)encodeSampleBuffer:(CMSampleBufferRef)sampleBuffer;
- (void)endEncode;

@end

VideoEncoder.m

#import "VideoEncoder.h"

@interface VideoEncoder()

/** Index of the current frame */
@property (nonatomic, assign) NSInteger frameID;

/** Compression session */
@property (nonatomic, assign) VTCompressionSessionRef compressionSession;

/** File handle used to write the output */
@property (nonatomic, strong) NSFileHandle *fileHandle;

@end

@implementation VideoEncoder

-(instancetype)init{
    if(self = [super init]){
        // 1. Set up the file handle (NSFileHandle writes the binary output)
        [self setUpFileHandle];
        // 2. Set up the compression session
        [self setUpVideoSession];
    }
    return self;
}

-(void)setUpFileHandle{
    //1. Build the output path in the sandbox's Documents directory
    NSString *file = [[NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject] stringByAppendingPathComponent:@"abc.h264"];

    //2. Remove any existing file, then create an empty one
    [[NSFileManager defaultManager] removeItemAtPath:file error:nil];
    [[NSFileManager defaultManager] createFileAtPath:file contents:nil attributes:nil];
    
    //3. Open the file for writing
    self.fileHandle = [NSFileHandle fileHandleForWritingAtPath:file];
}

-(void)setUpVideoSession{
    //1. Frame counter, used to build timestamps
    self.frameID = 0;
    
    //2. Output width and height. Note: screen bounds are in points and
    //   need not match the pixel size of the captured frames.
    int width = [UIScreen mainScreen].bounds.size.width;
    int height = [UIScreen mainScreen].bounds.size.height;
    
    //3. Create the compression session that encodes each frame
    OSStatus status = VTCompressionSessionCreate(NULL, width, height, kCMVideoCodecType_H264, NULL, NULL, NULL, didCompressH264, (__bridge void *)(self), &_compressionSession);
    if (status != noErr) {
        NSLog(@"H264: VTCompressionSessionCreate failed with status %d", (int)status);
        return;
    }
    
    //4. Enable real-time output (live streaming needs frames as soon as they are encoded)
    VTSessionSetProperty(self.compressionSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
    
    //5. Expected frame rate: 30 frames per second
    int fps = 30;
    CFNumberRef fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
    VTSessionSetProperty(self.compressionSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);
    CFRelease(fpsRef);

    //6. Bit rate: the average in bits per second, plus a hard limit given as
    //   [bytes, seconds] (here 1.5x the average, measured over 1 second)
    int bitRate = 800*1024;
    CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
    VTSessionSetProperty(self.compressionSession, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);
    CFRelease(bitRateRef);
    NSArray *limit = @[@(bitRate*1.5/8),@(1)];
    VTSessionSetProperty(self.compressionSession, kVTCompressionPropertyKey_DataRateLimits, (__bridge CFArrayRef)limit);
    
    //7. Keyframe interval of 30 frames (the GOP length)
    int frameInterval = 30;
    CFNumberRef frameIntervalRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
    VTSessionSetProperty(self.compressionSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRef);
    CFRelease(frameIntervalRef);
    
    //8. Get ready to encode
    VTCompressionSessionPrepareToEncodeFrames(self.compressionSession);
}

- (void)encodeSampleBuffer:(CMSampleBufferRef)sampleBuffer{
    //1. Get the image buffer out of the sample buffer
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    
    //2. Build the timestamp from the frame counter. A timescale of 30 matches
    //   the expected frame rate, so frame N is presented at N/30 seconds.
    CMTime presentationTimeStamp = CMTimeMake(self.frameID++, 30);
    VTEncodeInfoFlags flags;
    
    //3. Encode the frame
    OSStatus statusCode = VTCompressionSessionEncodeFrame(self.compressionSession,
                                                          imageBuffer,
                                                          presentationTimeStamp,
                                                          kCMTimeInvalid,
                                                          NULL, (__bridge void * _Nullable)(self), &flags);
    if (statusCode != noErr) {
        NSLog(@"H264: VTCompressionSessionEncodeFrame failed with status %d", (int)statusCode);
    }
}


// Called by VideoToolbox when a frame has been encoded
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer) {
    //1. Bail out if encoding reported an error
    if (status != noErr) {
        return;
    }
    
    //2. Recover the encoder object passed in when the session was created
    VideoEncoder* encoder = (__bridge VideoEncoder*)outputCallbackRefCon;

    //3. A frame is a keyframe when it lacks the NotSync attachment
    bool isKeyframe = !CFDictionaryContainsKey(CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true),0), kCMSampleAttachmentKey_NotSync);
    // For keyframes, extract the SPS and PPS first
    if(isKeyframe){
        // Format description of the encoded output
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        
        // SPS is parameter set 0
        size_t sparameterSetSize, sparameterSetCount;
        const uint8_t *sparameterSet;
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, NULL);

        // PPS is parameter set 1
        size_t pparameterSetSize, pparameterSetCount;
        const uint8_t *pparameterSet;
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, NULL);
        
        // Wrap the SPS and PPS in NSData so they can be written to the file
        NSData *sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
        NSData *pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];
        
        // Write them to the file
        [encoder gotSpsPps:sps pps:pps];
    }
    
    // Get the block buffer that holds the encoded data
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {
        size_t bufferOffset = 0;
        static const int AVCCHeaderLength = 4; // Each NALU is prefixed with a 4-byte big-endian length, not with a 00 00 00 01 start code
        
        // Walk the NALUs in the buffer (written so the unsigned comparison cannot underflow)
        while (bufferOffset + AVCCHeaderLength < totalLength) {
            uint32_t NALUnitLength = 0;
            // Read the NAL unit length
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);
            
            // Convert from big-endian to host byte order
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
            
            NSData* data = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];
            [encoder gotEncodedData:data isKeyFrame:isKeyframe];
            
            // Advance to the next NALU
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
    }
}

- (void)gotSpsPps:(NSData*)sps pps:(NSData*)pps
{
    // 1. Build the start code that precedes each NALU
    const char bytes[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytes) - 1;
    NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
    
    // 2. Write start code + SPS, then start code + PPS
    [self.fileHandle writeData:ByteHeader];
    [self.fileHandle writeData:sps];
    [self.fileHandle writeData:ByteHeader];
    [self.fileHandle writeData:pps];
    
}
- (void)gotEncodedData:(NSData*)data isKeyFrame:(BOOL)isKeyFrame
{
    NSLog(@"gotEncodedData %d", (int)[data length]);
    if (self.fileHandle != nil)
    {
        const char bytes[] = "\x00\x00\x00\x01";
        size_t length = (sizeof bytes) - 1; //string literals have implicit trailing '\0'
        NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
        [self.fileHandle writeData:ByteHeader];
        [self.fileHandle writeData:data];
    }
}


- (void)endEncode {
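    if (self.compressionSession == NULL) {
        return;
    }
    // Flush pending frames, then tear down the session
    VTCompressionSessionCompleteFrames(self.compressionSession, kCMTimeInvalid);
    VTCompressionSessionInvalidate(self.compressionSession);
    CFRelease(self.compressionSession);
    self.compressionSession = NULL;
    // Close the output file
    [self.fileHandle closeFile];
    self.fileHandle = nil;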
}


@end

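(3) In the main view controller, ViewController.m, add buttons that start and stop capture and implement the corresponding actions:

#import "ViewController.h"
#import "VideoCapture.h"

@interface ViewController () 

/** Video capture object */
@property (nonatomic, strong) VideoCapture *videoCapture;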

@end

@implementation ViewController

- (IBAction)startCapture {
    [self.videoCapture startCapture:self.view];
}

- (IBAction)stopCapture {
    [self.videoCapture stopCapture];
}

- (VideoCapture *)videoCapture {
    if (_videoCapture == nil) {
        _videoCapture = [[VideoCapture alloc] init];
    }
    return _videoCapture;
}

@end

Demo source code: https://github.com/dolacmeng/encodeWithVideoToolBox
