iOS 4 and direct access to the camera
Introduction
The iPhone SDK 4 brought a lot of interesting features. Among them, direct access to the camera is a real asset for AR (augmented reality) applications, and more generally for any application that processes the image/video to modify it or extract information. I played around with the new APIs, so I'm going to tell you more about them and share some snippets showing how to use them. If you want to know more about the AVFoundation framework, AVCaptureDeviceInput, AVCaptureVideoDataOutput and AVCaptureSession, this article is for you.
How to access the raw data of the camera
So, Apple finally released the "V" of the AVFoundation framework. It provides handy components to get the raw data from the camera. Basically, the path to follow is:
- Set up an AVCaptureDeviceInput instance and tell it to read the data provided by the camera.
- Set up an AVCaptureVideoDataOutput instance. This object will output the data and pass it to its delegate.
- Now that the input and the output are ready, you have to set up the component that connects them: AVCaptureSession plays this role. To set up an AVCaptureSession instance you specify the input and the output, and then you can call [yourAVCaptureSession startRunning].
- From then on, every time a new frame is captured, the AVCaptureVideoDataOutput delegate is notified.
A minimal sketch of this wiring is shown right below; the complete sample later in the post fills in the details.
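The names in the sketch are illustrative and error handling is omitted; it assumes a controller that adopts AVCaptureVideoDataOutputSampleBufferDelegate and exposes a retained captureSession property, as in the full sample later in this post.
/*Minimal wiring sketch: camera input -> video data output -> capture session*/
AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:camera error:nil];
AVCaptureVideoDataOutput *output = [[[AVCaptureVideoDataOutput alloc] init] autorelease];
/*The delegate callbacks will be delivered on this serial queue*/
dispatch_queue_t queue = dispatch_queue_create("cameraQueue", NULL);
[output setSampleBufferDelegate:self queue:queue];
dispatch_release(queue);
/*The session connects the input and the output and drives the capture*/
self.captureSession = [[[AVCaptureSession alloc] init] autorelease];
[self.captureSession addInput:input];
[self.captureSession addOutput:output];
[self.captureSession startRunning];
/*From now on -captureOutput:didOutputSampleBuffer:fromConnection: is called for every frame*/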
Let's do some coding
Here is a code sample you can use to try out some of the possibilities offered by the SDK 4.
[Update] I made a small Xcode project which shows how to use this sample. It's on my GitHub.
[Update] As a lot of people were asking how to use a queue other than the main one for the processing, I updated the code to show how to do it. Pay attention to the fact that all the display calls have to be made on the main thread (UIKit is not thread-safe).
#import <UIKit/UIKit.h>
#import <AVFoundation/AVFoundation.h>
#import <CoreGraphics/CoreGraphics.h>
#import <CoreVideo/CoreVideo.h>
#import <CoreMedia/CoreMedia.h>
/*!
@class MyAVController
@author Benjamin Loulier
@brief Controller demonstrating how to get direct access to the camera using the iPhone SDK 4
*/
@interface MyAVController : UIViewController <AVCaptureVideoDataOutputSampleBufferDelegate> {
AVCaptureSession *_captureSession;
UIImageView *_imageView;
CALayer *_customLayer;
AVCaptureVideoPreviewLayer *_prevLayer;
}
/*!
@brief The capture session takes the input from the camera and captures it
*/
@property (nonatomic, retain) AVCaptureSession *captureSession;
/*!
@brief The UIImageView we use to display the image generated from the imageBuffer
*/
@property (nonatomic, retain) UIImageView *imageView;
/*!
@brief The CALayer we use to display the CGImageRef generated from the imageBuffer
*/
@property (nonatomic, retain) CALayer *customLayer;
/*!
@brief The CALayer subclass provided by Apple to display the video corresponding to a capture session
*/
@property (nonatomic, retain) AVCaptureVideoPreviewLayer *prevLayer;
/*!
@brief This method initializes the capture session
*/
- (void)initCapture;
@end
#import "MyAVController.h"
@implementation MyAVController
@synthesize captureSession = _captureSession;
@synthesize imageView = _imageView;
@synthesize customLayer = _customLayer;
@synthesize prevLayer = _prevLayer;
#pragma mark -
#pragma mark Initialization
- (id)init {
self = [super init];
if (self) {
/*We initialize some variables (they might not be initialized depending on what is commented out)*/
self.imageView = nil;
self.prevLayer = nil;
self.customLayer = nil;
}
return self;
}
- (void)viewDidLoad {
[super viewDidLoad];
/*We initialize the capture*/
[self initCapture];
}
- (void)initCapture {
/*We setup the input*/
AVCaptureDeviceInput *captureInput = [AVCaptureDeviceInput
deviceInputWithDevice:[AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo]
error:nil];
/*We set up the output*/
AVCaptureVideoDataOutput *captureOutput = [[AVCaptureVideoDataOutput alloc] init];
/*While a frame is being processed in the -captureOutput:didOutputSampleBuffer:fromConnection: delegate method, no other frames are added to the queue.
If you don't want this behaviour, set the property to NO */
captureOutput.alwaysDiscardsLateVideoFrames = YES;
/*We specify a minimum duration for each frame (play with this setting to avoid having too many frames waiting
in the queue, because that can cause memory issues). It is the inverse of the maximum framerate.
In this example we set a minimum frame duration of 1/10 second, so a maximum framerate of 10fps: we say that
we are not able to process more than 10 frames per second.*/
//captureOutput.minFrameDuration = CMTimeMake(1, 10);
/*We create a serial queue to handle the processing of our frames*/
dispatch_queue_t queue;
queue = dispatch_queue_create("cameraQueue", NULL);
[captureOutput setSampleBufferDelegate:self queue:queue];
dispatch_release(queue);
// Set the video output to store frames in BGRA (it is supposed to be faster)
NSString* key = (NSString*)kCVPixelBufferPixelFormatTypeKey;
NSNumber* value = [NSNumber numberWithUnsignedInt:kCVPixelFormatType_32BGRA];
NSDictionary* videoSettings = [NSDictionary dictionaryWithObject:value forKey:key];
[captureOutput setVideoSettings:videoSettings];
/*And we create a capture session*/
self.captureSession = [[[AVCaptureSession alloc] init] autorelease];
/*We add input and output*/
[self.captureSession addInput:captureInput];
[self.captureSession addOutput:captureOutput];
/*We use medium quality; on the iPhone 4 this demo would lag too much at a higher resolution, because the conversion to CGImage and UIImage requires too many resources for 720p.*/
[self.captureSession setSessionPreset:AVCaptureSessionPresetMedium];
/*We add the Custom Layer (We need to change the orientation of the layer so that the video is displayed correctly)*/
self.customLayer = [CALayer layer];
self.customLayer.frame = self.view.bounds;
self.customLayer.transform = CATransform3DRotate(CATransform3DIdentity, M_PI/2.0f, 0, 0, 1);
self.customLayer.contentsGravity = kCAGravityResizeAspectFill;
[self.view.layer addSublayer:self.customLayer];
/*We add the imageView*/
self.imageView = [[[UIImageView alloc] init] autorelease];
self.imageView.frame = CGRectMake(0, 0, 100, 100);
[self.view addSubview:self.imageView];
/*We add the preview layer*/
self.prevLayer = [AVCaptureVideoPreviewLayer layerWithSession: self.captureSession];
self.prevLayer.frame = CGRectMake(100, 0, 100, 100);
self.prevLayer.videoGravity = AVLayerVideoGravityResizeAspectFill;
[self.view.layer addSublayer: self.prevLayer];
/*We start the capture*/
[self.captureSession startRunning];
}
#pragma mark -
#pragma mark AVCaptureVideoDataOutputSampleBufferDelegate
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
fromConnection:(AVCaptureConnection *)connection
{
/*We create an autorelease pool because we are not on the main queue, so our code is
not executed on the main thread; we have to create an autorelease pool for the thread we are in*/
NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
/*Lock the image buffer*/
CVPixelBufferLockBaseAddress(imageBuffer,0);
/*Get information about the image*/
uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
/*Create a CGImageRef from the CVImageBufferRef*/
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef newContext = CGBitmapContextCreate(baseAddress, width, height, 8, bytesPerRow, colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
CGImageRef newImage = CGBitmapContextCreateImage(newContext);
/*We release some components*/
CGContextRelease(newContext);
CGColorSpaceRelease(colorSpace);
/*We display the result on the custom layer. All the display work must be done on the main thread because
UIKit is not thread-safe, and since we are not on the main thread (remember we did not use the main queue)
we use performSelectorOnMainThread to ask our CALayer to display the CGImage.*/
[self.customLayer performSelectorOnMainThread:@selector(setContents:) withObject: (id) newImage waitUntilDone:YES];
/*We display the result on the image view (we need to change the orientation of the image so that the video is displayed correctly).
Same thing as for the CALayer: we are not on the main thread, so...*/
UIImage *image= [UIImage imageWithCGImage:newImage scale:1.0 orientation:UIImageOrientationRight];
/*We release the CGImageRef*/
CGImageRelease(newImage);
[self.imageView performSelectorOnMainThread:@selector(setImage:) withObject:image waitUntilDone:YES];
/*We unlock the image buffer*/
CVPixelBufferUnlockBaseAddress(imageBuffer,0);
[pool drain];
}
#pragma mark -
#pragma mark Memory management
- (void)viewDidUnload {
self.imageView = nil;
self.customLayer = nil;
self.prevLayer = nil;
}
- (void)dealloc {
[_captureSession release];
[_imageView release];
[_customLayer release];
[_prevLayer release];
[super dealloc];
}
@end
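As a side note, the delegate above uses performSelectorOnMainThread: to get the display calls back onto the main thread. An equivalent approach (my own sketch, not part of the sample on GitHub) is to dispatch a block to the main queue with GCD, which is also available on iOS 4. The fragment below would replace the two performSelectorOnMainThread: calls, the UIImage creation and the CGImageRelease in the delegate:
/*Sketch only: newImage is the CGImageRef created in the delegate above.
Ownership of newImage (+1 from CGBitmapContextCreateImage) is handed over to the block,
which releases it once the main thread is done with it.*/
dispatch_async(dispatch_get_main_queue(), ^{
self.customLayer.contents = (id)newImage;
self.imageView.image = [UIImage imageWithCGImage:newImage scale:1.0 orientation:UIImageOrientationRight];
CGImageRelease(newImage);
});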
Ok cool, but what can I do with it?
Actually what you can do is pretty simple but very instructive. This code shows how to capture data from the camera and then display it on a view in three different ways:
- 1 - Using an AVCaptureVideoPreviewLayer instance; this class provided by Apple is a subclass of CALayer, and you just specify the capture session whose video you want to display.
- 2 - Getting the raw data and processing it to create a CGImageRef that is written to a CALayer.
- 3 - Getting the raw data, processing it to create a CGImageRef, then turning that CGImageRef into a UIImage and finally displaying it in a UIImageView.
Performance
The code can probably be improved, but for now the only way to get something as smooth as the Camera app is the first method. So if you want to add information over the camera output, my advice would be:
- Display the AVCaptureVideoPreviewLayer.
- Use the delegate method to process the data (to do some pattern detection, for instance).
- Then create an overlay - using a CALayer for example - where you display your information (a rough sketch of this setup follows the list).
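The frames and the CATextLayer overlay in the sketch are illustrative choices of mine; it assumes a capture session configured as in the sample above and #import <QuartzCore/QuartzCore.h> for CATextLayer.
/*Preview layer: smooth camera rendering handled entirely by Apple*/
AVCaptureVideoPreviewLayer *previewLayer = [AVCaptureVideoPreviewLayer layerWithSession:self.captureSession];
previewLayer.frame = self.view.bounds;
previewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill;
[self.view.layer addSublayer:previewLayer];
/*Overlay layer: the only thing we update with our processing results*/
CATextLayer *overlay = [CATextLayer layer];
overlay.frame = CGRectMake(0, 0, self.view.bounds.size.width, 40);
overlay.string = @"no pattern detected yet";
[self.view.layer addSublayer:overlay];
/*In -captureOutput:didOutputSampleBuffer:fromConnection: run the detection on the capture queue,
then update only the overlay, on the main thread*/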
You could also try to apply effects to the video: since AVCaptureVideoPreviewLayer is a subclass of CALayer, you can try modifying it as if it were a regular CALayer. And if you do, please let me know the results ;-)
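For what it's worth, an untested starting point would simply be to poke at standard CALayer properties on the preview layer (prevLayer being the AVCaptureVideoPreviewLayer from the sample above; no guarantee every effect works):
/*Untested sketch: treat the preview layer like any other CALayer*/
self.prevLayer.opacity = 0.8f;
self.prevLayer.cornerRadius = 10.0f;
self.prevLayer.masksToBounds = YES;
self.prevLayer.transform = CATransform3DMakeRotation(M_PI / 8.0f, 0.0f, 0.0f, 1.0f);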
Finally, if you really need to show an image computed from the raw data, the fastest way is surprisingly the UIImage method.
Ok, I think that's all for my first technical post. If you have questions or remarks, the comment section is waiting for you. And don't hesitate; as this is a new exercise for me, I would really appreciate your feedback.
bbye
Ben