Reading Every Frame of a Video in iOS: Video Processing with AV Foundation

  1. Construct an AVAssetReader:

    NSError *error = nil;
    AVAssetReader *asset_reader = [[AVAssetReader alloc] initWithAsset:asset error:&error];
    // (error checking goes here)
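    As a minimal sketch of that error check (the full post below does essentially the same thing), you might just log and bail out:

    if (error != nil)
    {
        NSLog(@"Could not create asset reader: %@", error.localizedDescription);
        return;
    }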

  2. Get the video track(s) from your asset:

    NSArray* video_tracks = [asset tracksWithMediaType:AVMediaTypeVideo];
    AVAssetTrack* video_track = [video_tracks objectAtIndex:0];
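    Indexing blindly into that array will throw if the asset has no video track, so a defensive sketch (the full post below does the equivalent by checking the track count) is:

    if ([video_tracks count] == 0)
        return; // no video track to read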

  3. Set the desired video frame format:

    NSMutableDictionary* dictionary = [[NSMutableDictionary alloc] init];
    [dictionary setObject:[NSNumber numberWithInt:<format code from CVPixelBuffer.h>] forKey:(NSString*)kCVPixelBufferPixelFormatTypeKey];

    Note that certain pixel formats just will not work, and if you're doing something real-time, some formats perform better than others (BGRA is faster than ARGB, for instance).
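    For example, to request 32-bit BGRA output (the format the full post below settles on):

    [dictionary setObject:[NSNumber numberWithUnsignedInt:kCVPixelFormatType_32BGRA]
                   forKey:(NSString*)kCVPixelBufferPixelFormatTypeKey];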

  4. Construct the actual track output and add it to the asset reader:

    AVAssetReaderTrackOutput* asset_reader_output = [[AVAssetReaderTrackOutput alloc] initWithTrack:video_track outputSettings:dictionary];
    [asset_reader addOutput:asset_reader_output];
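    AVAssetReader also exposes canAddOutput:, so a slightly more defensive version of that last line might be:

    if ([asset_reader canAddOutput:asset_reader_output])
        [asset_reader addOutput:asset_reader_output];
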
  5. Kick off the asset reader:

    [asset_reader startReading];
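    startReading returns a BOOL, so immediate failures can be caught up front (the reader's error property says why):

    if (![asset_reader startReading])
        NSLog(@"Could not start reading: %@", asset_reader.error);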

  6. Read off the samples:

    CMSampleBufferRef buffer = NULL;
    while ( [asset_reader status]==AVAssetReaderStatusReading )
    {
        buffer = [asset_reader_output copyNextSampleBuffer];
        if (buffer)
        {
            // ... process the sample buffer ...
            CFRelease(buffer); // copyNextSampleBuffer follows the Create/Copy rule, so release each buffer
        }
    }
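    When the loop exits, it's worth checking whether the reader actually completed or failed partway through, for example:

    if ([asset_reader status] == AVAssetReaderStatusFailed)
        NSLog(@"Reading failed: %@", asset_reader.error);
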
    Video Processing with AV Foundation
    November 18, 2010

    UPDATE: The app talked about in this post has been released. More information about Videoscope here.

    I've been working on a project that required playing back video in iOS. It seems that most people wanting to play back video are just interested in the basic operation of opening a video and passing playback off to an MPMoviePlayerController.

    MPMoviePlayerController is a really nice way to just point to a URL (either local or networked) and say "go." As with all really simple API layers, there's not much further you can go with it.

    What I really wanted to do was to be able to point to a URL (for me, a video saved in the device's Photo Library) and get direct access to the pixel data. New in iOS 4.0 is a pile of classes to do just this (and much, much more) in the AVFoundation framework. There is actually quite a bit of great documentation on it, but there are so many classes that need to work together that it's a pretty daunting task to get started.

    This post is going to cover just reading in a video track from a specified URL that points to a local QuickTime file; however, it should be applicable to other bits of AV Foundation.

    The first bit of required data is an NSURL object. In my case, I was using a UIImagePickerController and retrieving the URL of the movie that was picked by the user:

    - (void)imagePickerController:(UIImagePickerController *)picker didFinishPickingMediaWithInfo:(NSDictionary *)info
    {
        NSString *mediaType = [info objectForKey:UIImagePickerControllerMediaType];
        if ([mediaType isEqualToString:(NSString *)kUTTypeMovie])
            [self readMovie:[info objectForKey:UIImagePickerControllerMediaURL]];
        [self dismissModalViewControllerAnimated:YES];
    }
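    As an aside, here is a sketch of the imports everything in this post assumes (linking the corresponding frameworks is up to your project):

    #import <MobileCoreServices/MobileCoreServices.h> // kUTTypeMovie
    #import <AVFoundation/AVFoundation.h>             // AVURLAsset, AVAssetReader, ...
    #import <CoreMedia/CoreMedia.h>                   // CMSampleBufferRef
    #import <CoreVideo/CoreVideo.h>                   // CVPixelBuffer accessors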

    One note: the kUTTypeMovie constant is defined in the MobileCoreServices framework. Now to actually do something with that URL. I floundered for a while, but got there through a combination of blog posts, message-board threads, and finally a great page in the iOS reference guide, the AV Foundation Programming Guide: Playback.

    This guide is pretty buried. In fact, just now it took me a little while to find it again. Although much of the guide is geared towards playback into a view, a lot of it is very applicable to general AV Foundation programming. After a lot of trial and error, I came up with this:

    - (void)readMovie:(NSURL *)url
    {
        AVURLAsset *asset = [AVURLAsset URLAssetWithURL:url options:nil];
        [asset loadValuesAsynchronouslyForKeys:[NSArray arrayWithObject:@"tracks"] completionHandler:^{
            dispatch_async(dispatch_get_main_queue(), ^{
                AVAssetTrack *videoTrack = nil;
                NSArray *tracks = [asset tracksWithMediaType:AVMediaTypeVideo];
                if ([tracks count] == 1)
                {
                    videoTrack = [tracks objectAtIndex:0];

                    NSError *error = nil;
                    // _movieReader is a member variable
                    _movieReader = [[AVAssetReader alloc] initWithAsset:asset error:&error];
                    if (error)
                        NSLog(@"%@", error.localizedDescription);

                    NSString *key = (NSString *)kCVPixelBufferPixelFormatTypeKey;
                    NSNumber *value = [NSNumber numberWithUnsignedInt:kCVPixelFormatType_32BGRA];
                    NSDictionary *videoSettings = [NSDictionary dictionaryWithObject:value forKey:key];
                    [_movieReader addOutput:[AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:videoTrack
                                                                                       outputSettings:videoSettings]];
                    [_movieReader startReading];
                }
            });
        }];
    }

    What is this doing? First, we create an AVURLAsset with the given URL. Then we tell that asset to load its tracks asynchronously with a given completionHandler. The completion handler gets called when the track loading completes (presumably in another thread). In this completion handler we dispatch a chunk of instructions to run in the main queue. Honestly, I'm not totally sure that we need to dispatch to the main queue, but the AV Foundation guide said so...so who am I to argue?

    When we say to load the "tracks" asynchronously, it's only loading the metadata required to actually start reading the data inside those tracks. In my case the QuickTimes being accessed are very simple and probably could be loaded synchronously, but if someone opens a 2-hour movie into this, that could take some time.
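    One thing the code above glosses over: the completion handler fires even if loading failed, so the programming guide suggests checking the key's status before touching the tracks. A minimal sketch of that check, placed at the top of the same completion handler:

    NSError *loadError = nil;
    AVKeyValueStatus status = [asset statusOfValueForKey:@"tracks" error:&loadError];
    if (status != AVKeyValueStatusLoaded)
    {
        NSLog(@"Failed to load tracks: %@", loadError);
        return; // bail out of the completion handler
    }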

    In any case, once the track data is loaded into the AVURLAsset, we can actually pull out all the tracks from the asset, and we can specify which type of track we care about. In this case, I assume we have one video track, and can safely (sort of) pull out the video track. We can then create our AVAssetReader with the asset and an output to use the video track that we just found. This AVAssetReaderTrackOutput is specified with both the track and a dictionary of settings. Conveniently, the settings are pretty close to those used in AVCaptureVideoDataOutput. That makes me very happy.

    Finally, we call the startReading method on the AVAssetReader. The iOS documentation says that the startReading method tells the reader to start preparing samples to be retrieved. In theory it will be reading far enough ahead so that you can do real-time playback. I haven't actually tested that bit, but it does seem to be close to real-time.

    Probably a way more-readable way to deal with all this is to put the whole block into a method on my class, then inside the dispatch_async block, just call that method. Oh well. Live and learn.

    Okay...so the AVAssetReader is reading...now what? We can start requesting sample buffers from it. It's surprisingly straightforward.

    - (void)readNextMovieFrame
    {
        if (_movieReader.status == AVAssetReaderStatusReading)
        {
            AVAssetReaderTrackOutput *output = [_movieReader.outputs objectAtIndex:0];
            CMSampleBufferRef sampleBuffer = [output copyNextSampleBuffer];
            if (sampleBuffer)
            {
                CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);

                // Lock the image buffer
                CVPixelBufferLockBaseAddress(imageBuffer, 0);

                // Get information about the image
                uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
                size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
                size_t width = CVPixelBufferGetWidth(imageBuffer);
                size_t height = CVPixelBufferGetHeight(imageBuffer);

                //
                // Here's where you can process the buffer!
                // (your code goes here)
                //
                // Finish processing the buffer!
                //

                // Unlock the image buffer
                CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
                CFRelease(sampleBuffer);
            }
        }
    }

    All that is happening here is that we first verify that the AVAssetReader is actually reading (which really means that the reader has a sample buffer we can grab); then we get a copy of the buffer and process it with the Core Media and Core Video libraries. In this case we end up with an image in 32-bit-per-pixel, 8-bit-per-channel BGRA format (as we specified up in the readMovie method). This gives us DIRECT access to the pixel data.
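    As a concrete (and purely illustrative) example of that direct access, here is how you might read one pixel out of the locked BGRA buffer, using the same variables as readNextMovieFrame above. The coordinates are made up; the important detail is indexing rows with bytesPerRow, since rows can be padded:

    size_t x = 10, y = 20; // hypothetical pixel coordinates
    uint8_t *pixel = baseAddress + y * bytesPerRow + x * 4;
    uint8_t blue  = pixel[0]; // BGRA byte order: B, G, R, A
    uint8_t green = pixel[1];
    uint8_t red   = pixel[2];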

    One last note: the AVAssetReader doesn't loop when it hits the end. You need to detect that it finished (via the reader's status property) and kick the reading off again.
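    A reader can't be rewound once it finishes, so "restarting" means building a fresh one from the same asset. A minimal sketch, assuming a hypothetical _movieURL ivar that holds the original URL (and manual retain/release, in keeping with the era of this post):

    if (_movieReader.status == AVAssetReaderStatusCompleted)
    {
        [_movieReader release];      // we alloc'd it in readMovie:
        _movieReader = nil;
        [self readMovie:_movieURL];  // _movieURL is a hypothetical ivar saved earlier
    }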

    That's all there is to it...the first blog post at 7twenty7!
