Item 30: Use ARC to Make Reference Counting Easier
Reference counting is a fairly easy concept to understand (see Item 29). The semantics of where retains and releases need to appear are easily expressed. So with the Clang compiler project came a static analyzer that is able to indicate the location of problems with the reference counting. For example, consider the following snippet written with manual reference counting:
if ([self shouldLogMessage]) {
    NSString *message = [[NSString alloc] initWithFormat:
                         @"I am object, %p", self];
    NSLog(@"message = %@", message);
}
This code has a memory leak because the message object is not released at the end of the if statement. Since it cannot be referenced outside the if statement, the object is leaked. The rules governing why this is a leak are straightforward. The call to NSString’s alloc method returns an object with a +1 retain count. But there is no balancing release. These rules are easy to express, and a computer could easily apply these rules and tell us that the object has been leaked. That’s exactly what the static analyzer does.
The static analyzer was taken one step further. Since it is able to tell you where there are memory-management problems, it should easily be able to go ahead and fix them by adding in the required retain or release, right? That is the idea from which Automatic Reference Counting (ARC) was born. ARC does exactly what it says in the name: makes reference counting automatic. So in the preceding code snippet, the message object would automatically have a release added in just before the end of the if statement scope, automatically turning the code into the following:
if ([self shouldLogMessage]) {
    NSString *message = [[NSString alloc] initWithFormat:
                         @"I am object, %p", self];
    NSLog(@"message = %@", message);
    [message release]; ///< Added by ARC
}
The important thing to remember with ARC is that reference counting is still being performed. But ARC adds in the retains and releases for you. ARC does more than apply memory-management semantics to methods that return objects, as you will see. But it is these core semantics, which have become standard throughout Objective-C, on which ARC is built.
Because ARC adds retains, releases, and autoreleases for you, calling memory-management methods directly under ARC is illegal. Specifically, you cannot call the following methods:
retain
release
autorelease
dealloc
Calling any of these methods directly will result in a compiler error because doing so would interfere with ARC’s being able to work out what memory-management calls are required. You have to put your trust in ARC to do the right thing, which can be daunting for developers used to manual reference counting.
In fact, ARC does not call these methods through the normal Objective-C message dispatch but instead calls lower-level C variants. This is optimal, since retains and releases are performed frequently, and so saving CPU cycles here is a big win. For example, the equivalent for retain is objc_retain. This is also why it is illegal to override retain, release, or autorelease, as these methods are never called directly. For the rest of this item, I will usually talk about the equivalent Objective-C method rather than the lower-level C variants. This should help if your background is with manual reference counting.
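Code that must build both with and without ARC cannot call these methods unconditionally. As a minimal sketch (not taken from the examples in this item; the function name EOCMakeMessage is hypothetical, and Clang is assumed as the compiler), the __has_feature(objc_arc) check can guard the manual call so that it is compiled only under manual reference counting:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, for illustration only.
static NSString *EOCMakeMessage(id object) {
    NSString *message =
        [[NSString alloc] initWithFormat:@"I am object, %p", object];
#if __has_feature(objc_arc)
    // ARC: an explicit autorelease here would be a compiler error.
    return message;
#else
    // Manual reference counting: balance the +1 from alloc.
    return [message autorelease];
#endif
}
```

Either way, the caller receives a string that it does not own, so the calling code looks the same in both modes.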
Method-Naming Rules Applied by ARC
The memory-management semantics dictated through method names have long been convention in Objective-C, but ARC has cemented them as hard rules. The rules are simple and relate to the method name. A method returning an object returns it owned by the caller if its method name begins with one of the following:
alloc
new
copy
mutableCopy
“Owned by the caller” means that the code calling any of the four methods listed is responsible for releasing the returned object. That is to say, the object will have a positive retain count, of which exactly 1 must be balanced by the calling code. The retain count may be greater than 1 if the object has been retained additionally and autoreleased, which is one reason why the retainCount method is not useful (see Item 36).
Any other method name indicates that any returned object will be returned not owned by the calling code. In these cases, the object will be returned autoreleased, so that the value is alive across the method call boundary. If it wants to ensure that the object stays alive longer, the calling code must retain it.
ARC automatically handles all memory management required to maintain these rules, including the code for returning objects autoreleased, as illustrated in the following code:
+ (EOCPerson*)newPerson {
    EOCPerson *person = [[EOCPerson alloc] init];
    return person;
    /**
     * The method name begins with 'new', and since 'person'
     * already has an unbalanced +1 retain count from the
     * 'alloc', no retains, releases, or autoreleases are
     * required when returning.
     */
}

+ (EOCPerson*)somePerson {
    EOCPerson *person = [[EOCPerson alloc] init];
    return person;
    /**
     * The method name does not begin with one of the "owning"
     * prefixes, therefore ARC will add an autorelease when
     * returning 'person'.
     * The equivalent manual reference counting statement is:
     *   return [person autorelease];
     */
}

- (void)doSomething {
    EOCPerson *personOne = [EOCPerson newPerson];
    // ...
    EOCPerson *personTwo = [EOCPerson somePerson];
    // ...
    /**
     * At this point, 'personOne' and 'personTwo' go out of
     * scope, therefore ARC needs to clean them up as required.
     * - 'personOne' was returned as owned by this block of
     *   code, so it needs to be released.
     * - 'personTwo' was returned not owned by this block of
     *   code, so it does not need to be released.
     * The equivalent manual reference counting cleanup code
     * is:
     *   [personOne release];
     */
}
ARC standardizes the memory-management rules through naming conventions, something that newcomers to the language often see as unusual. Very few other languages put as much emphasis on naming as Objective-C does. Becoming comfortable with this concept is crucial to being a good Objective-C developer. ARC helps with the process because it does a lot of the work for you.
In addition to adding in retains and releases, ARC has other benefits. It is also able to perform optimizations that would be difficult or impossible to do by hand. For example, at compile time, ARC can collapse retains, releases, and autoreleases to cancel them out, if possible. If it sees that the same object is being retained multiple times and released multiple times, ARC can sometimes remove pairs of retains and releases.
ARC also includes a runtime component. The optimizations that occur here are even more interesting and should help prove why all future code should be written under ARC. Recall that some objects are returned from methods autoreleased. Sometimes, the calling code needs to retain the object straightaway, as in this scenario:
// From a class where _myPerson is a strong instance variable
_myPerson = [EOCPerson personWithName:@"Bob Smith"];
The call to personWithName: returns a new EOCPerson autoreleased. But the compiler also needs to add a retain when setting the instance variable, since it holds a strong reference. Therefore, the preceding code is equivalent to the following in a world of manual reference counting:
EOCPerson *tmp = [EOCPerson personWithName:@"Bob Smith"];
_myPerson = [tmp retain];
You would be correct to note here that the autorelease from the personWithName: method and the retain are extraneous. It would be beneficial for performance to remove both. But code compiled under ARC needs to be compatible with non-ARC code, for backward compatibility. ARC could have removed the concept of autorelease and dictated that all objects returned from methods be returned with a +1 retain count. However, that would break backward compatibility.
But ARC does in fact contain runtime behavior to detect the situation of extraneous autorelease plus immediate retain. It does this through a special function that is run when an object is returned autoreleased. Instead of a plain call to the object’s autorelease method, it calls objc_autoreleaseReturnValue. This function inspects the code that is going to be run immediately after returning from the current method. If it is detected that this is going to be a retain of the returned object, a flag is set within a global data structure (processor dependent) instead of performing the autorelease. Similarly, the calling code that retains an autoreleased object returned from a method uses a function called objc_retainAutoreleasedReturnValue instead of calling retain directly. This function checks the flag and, if set, doesn’t perform retain. This extra work to set and check flags is faster than performing autorelease and retain.
The following code illustrates this optimization by showing how ARC uses these special functions:
// Within EOCPerson class
+ (EOCPerson*)personWithName:(NSString*)name {
    EOCPerson *person = [[EOCPerson alloc] init];
    person.name = name;
    return objc_autoreleaseReturnValue(person);
}

// Code using EOCPerson class
EOCPerson *tmp = [EOCPerson personWithName:@"Matt Galloway"];
_myPerson = objc_retainAutoreleasedReturnValue(tmp);
These special functions have processor-specific implementations to make use of the most optimal solution. The following pseudocode implementations explain what happens:
id objc_autoreleaseReturnValue(id object) {
    if ( /* caller will retain object */ ) {
        set_flag(object);
        return object; ///< No autorelease
    } else {
        return [object autorelease];
    }
}

id objc_retainAutoreleasedReturnValue(id object) {
    if (get_flag(object)) {
        clear_flag(object);
        return object; ///< No retain
    } else {
        return [object retain];
    }
}
The way in which objc_autoreleaseReturnValue detects whether the calling code is going to immediately retain the object is processor specific. Only the author of the compiler can implement this, since it uses inspection of the raw machine-code instructions. The author of the compiler is the only person who can ensure that the code in the calling method is arranged in such a way that detection like this is possible.
This is just one such optimization that is made possible by putting memory management in the hands of the compiler and the runtime. It should help to illustrate why using ARC is such a good idea. As the compiler and runtime mature, I’m sure that other optimizations will be making an appearance.
Memory-Management Semantics of Variables
ARC also handles memory management of local variables and instance variables. By default, every variable is said to hold a strong reference to the object. This is important to understand, particularly with instance variables, since for certain code, the semantics can be different from manual reference counting. For example, consider the following code:
@interface EOCClass : NSObject {
    id _object;
}
@end

@implementation EOCClass
- (void)setup {
    _object = [EOCOtherClass new];
}
@end
The _object instance variable does not automatically retain its value under manual reference counting but does under ARC. Therefore, when the setup method is compiled under ARC, the method transforms into this:
- (void)setup {
    id tmp = [EOCOtherClass new];
    _object = [tmp retain];
    [tmp release];
}
Of course, in this situation, retain and release can be cancelled out. So ARC does this, leaving the same code as before. But this comes in handy when writing a setter. Before ARC, you may have written a setter like this:
- (void)setObject:(id)object {
    [_object release];
    _object = [object retain];
}
But this reveals a problem. What if the new value being set is the same as the one already held by the instance variable? If the instance variable held the only reference to that object, the release in the setter would cause the retain count to drop to 0, and the object would be deallocated. The subsequent retain would then be sent to a deallocated object, which would cause the application to crash. ARC makes this sort of mistake impossible. The equivalent setter under ARC is this:
- (void)setObject:(id)object {
    _object = object;
}
ARC performs a safe setting of the instance variable by retaining the new value before releasing the old one, so the object can never be deallocated partway through the assignment. You may have understood this under manual reference counting and written your setters correctly, but with ARC, you don’t have to worry about such edge cases.
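The safety of the ARC setter can be checked with a small experiment. The following is only a sketch: EOCHolder and EOCValueSurvivesSelfAssignment are hypothetical names, and the code assumes compilation with ARC enabled (-fobjc-arc). It sets an instance variable to the value it already holds and then uses a weak reference to confirm that the object is still alive:

```objc
#import <Foundation/Foundation.h>

// Hypothetical class, for illustration only.
@interface EOCHolder : NSObject
- (void)setObject:(id)object;
- (id)object;
@end

@implementation EOCHolder {
    id _object; // Strong by default under ARC
}

- (void)setObject:(id)object {
    _object = object; // ARC retains the new value, releases the old
}

- (id)object {
    return _object;
}
@end

// Returns YES if the value survives being set to itself.
static BOOL EOCValueSurvivesSelfAssignment(void) {
    EOCHolder *holder = [EOCHolder new];
    __weak id weakValue = nil;
    @autoreleasepool {
        id value = [NSObject new];
        weakValue = value;
        [holder setObject:value];
    } // 'value' is gone; 'holder' keeps the only strong reference
    [holder setObject:[holder object]]; // Same object in, same object out
    return weakValue != nil;
}
```

The sketch checks only the observable outcome. It does not prove the exact order of the retain and release that ARC emits.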
The semantics of local and instance variables can be altered through the application of the following qualifiers:
__strong The default; the value is retained.
__unsafe_unretained The value is not retained and is potentially unsafe, as the object may have been deallocated already by the time the variable is used again.
__weak The value is not retained but is safe because it is automatically set to nil when the object it references is deallocated.
__autoreleasing This special qualifier is used when an object is passed by reference to a method. The value is autoreleased on return.
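The __autoreleasing qualifier is most often seen on NSError out-parameters. The following is a minimal sketch of that pattern, assuming ARC is enabled; the class EOCNumberParser, its method, and the error domain are hypothetical names introduced only for illustration:

```objc
#import <Foundation/Foundation.h>

// Hypothetical class and error domain, for illustration only.
@interface EOCNumberParser : NSObject
+ (BOOL)parseInteger:(NSString *)text
               value:(NSInteger *)outValue
               error:(NSError * __autoreleasing *)outError;
@end

@implementation EOCNumberParser
+ (BOOL)parseInteger:(NSString *)text
               value:(NSInteger *)outValue
               error:(NSError * __autoreleasing *)outError {
    NSScanner *scanner = [NSScanner scannerWithString:text];
    NSInteger value = 0;
    // Succeed only if the whole string is a single integer.
    if ([scanner scanInteger:&value] && [scanner isAtEnd]) {
        if (outValue) { *outValue = value; }
        return YES;
    }
    if (outError) {
        // Stored autoreleased; the caller does not own the error.
        *outError = [NSError errorWithDomain:@"EOCParserErrorDomain"
                                        code:1
                                    userInfo:nil];
    }
    return NO;
}
@end
```

A caller declares an ordinary local such as NSError *error = nil and passes &error. The compiler arranges for the value written through the __autoreleasing pointer to be copied back into the strong local.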
For example, to make an instance variable behave the same as it does without ARC, you would apply the __weak or __unsafe_unretained attribute:
@interface EOCClass : NSObject {
    id __weak _weakObject;
    id __unsafe_unretained _unsafeUnretainedObject;
}
@end
In either case, when setting the instance variable, the object will not be retained. Automatic nilling of weak references with the __weak qualifier is available only in the later versions of the runtime (Mac OS X 10.7 and iOS 5.0 onward) because it relies on runtime features that were added in those releases.
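The nilling behavior of __weak can be observed directly. This is a minimal sketch, assuming ARC is enabled and a runtime that supports zeroing weak references; the function name EOCWeakReferenceIsZeroed is hypothetical:

```objc
#import <Foundation/Foundation.h>

// Returns YES if the weak reference reads as nil once the last strong
// reference to the object has gone away.
static BOOL EOCWeakReferenceIsZeroed(void) {
    __weak NSObject *weakObject = nil;
    @autoreleasepool {
        NSObject *strongObject = [NSObject new]; // +1, owned by this scope
        weakObject = strongObject;               // Not retained
    } // ARC releases 'strongObject', so the object is deallocated
    return weakObject == nil;
}
```

Had the variable been declared __unsafe_unretained instead, it would still hold the address of the deallocated object at this point, and using it would be undefined behavior.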
When applied to local variables, the qualifiers are often used to break retain cycles that can be introduced with blocks (see Item 40). A block automatically retains all objects it captures, which can sometimes lead to a retain cycle if an object retaining a block is retained by the block. A __weak local variable can be used to break the retain cycle:
NSURL *url = [NSURL URLWithString:@"http://www.example.com/"];
EOCNetworkFetcher *fetcher =
    [[EOCNetworkFetcher alloc] initWithURL:url];
EOCNetworkFetcher * __weak weakFetcher = fetcher;
[fetcher startWithCompletion:^(BOOL success){
    NSLog(@"Finished fetching from %@", weakFetcher.url);
}];
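To see that the weak capture really does avoid the cycle, the pattern can be reduced to a self-contained sketch. EOCSimpleFetcher is a hypothetical stand-in that merely stores its completion block; it is not the EOCNetworkFetcher class used above, and ARC is assumed to be enabled:

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for an object that keeps its completion block.
@interface EOCSimpleFetcher : NSObject
@property (nonatomic, copy) void (^completion)(BOOL success);
@end

@implementation EOCSimpleFetcher
@end

// Returns YES if the fetcher is deallocated once the local strong
// reference goes out of scope.
static BOOL EOCFetcherIsDeallocatedWithWeakCapture(void) {
    __weak EOCSimpleFetcher *observer = nil;
    @autoreleasepool {
        EOCSimpleFetcher *fetcher = [EOCSimpleFetcher new];
        observer = fetcher;
        EOCSimpleFetcher * __weak weakFetcher = fetcher;
        fetcher.completion = ^(BOOL success) {
            // Captures weakFetcher, so the block does not retain
            // the fetcher that owns it.
            NSLog(@"Finished (%d): %@", success, weakFetcher);
        };
    }
    return observer == nil;
}
```

If the block captured fetcher itself rather than weakFetcher, the fetcher would own the block and the block would own the fetcher, so neither would be deallocated.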
ARC Handling of Instance Variables
As explained, ARC also handles the memory management of instance variables. Doing so requires ARC to automatically generate the required cleanup code during deallocation. Any variables holding a strong reference need releasing, which ARC does by hooking into the dealloc method. With manual reference counting, you would have found yourself writing dealloc methods that look like this:
- (void)dealloc {
    [_foo release];
    [_bar release];
    [super dealloc];
}
With ARC, this sort of dealloc method is not required; the generated cleanup routine will perform these two releases for you by stealing a feature from Objective-C++. An Objective-C++ object has to call the destructors for all C++ objects held by the object during deallocation. When the compiler sees that an object contains C++ objects, it generates a method called .cxx_destruct. ARC piggybacks on this method and emits the required cleanup code within it.
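The effect of the generated cleanup code can be observed without looking at .cxx_destruct itself. The sketch below assumes ARC is enabled; EOCOwner and EOCIvarIsReleasedWithOwner are hypothetical names. The class has a strong instance variable and no dealloc method, yet the object held by the instance variable is released when its owner goes away:

```objc
#import <Foundation/Foundation.h>

// Hypothetical class with a strong instance variable and no dealloc.
@interface EOCOwner : NSObject
- (instancetype)initWithFoo:(id)foo;
@end

@implementation EOCOwner {
    id _foo;
}

- (instancetype)initWithFoo:(id)foo {
    if ((self = [super init])) {
        _foo = foo; // Retained by ARC
    }
    return self;
}
@end

// Returns YES if the object held in _foo is deallocated along with
// its owner.
static BOOL EOCIvarIsReleasedWithOwner(void) {
    __weak id weakFoo = nil;
    @autoreleasepool {
        id foo = [NSObject new];
        weakFoo = foo;
        EOCOwner *owner = [[EOCOwner alloc] initWithFoo:foo];
        (void)owner; // Only its lifetime matters here
    }
    return weakFoo == nil;
}
```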
However, you still need to clean up any non-Objective-C objects if you have any, such as CoreFoundation objects or heap memory allocated with malloc(). But you do not need to call the superclass implementation of dealloc as you did before. Recall that calling dealloc explicitly under ARC is illegal. So ARC, along with generating and running the .cxx_destruct method for you, also automatically calls the superclass’s dealloc method. Under ARC, a dealloc method may end up looking like this:
- (void)dealloc {
    CFRelease(_coreFoundationObject);
    free(_heapAllocatedMemoryBlob);
}
The fact that ARC generates deallocation code means that usually, a dealloc method is not required. This often considerably reduces the size of a project’s source code and helps to reduce boilerplate code.
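For the cases in which a dealloc method is still needed, a fuller sketch may help. It assumes ARC is enabled and an Apple platform where CoreFoundation is available; the class EOCBuffer and its methods are hypothetical and exist only to show where the manual cleanup goes:

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical class that owns resources ARC does not manage.
@interface EOCBuffer : NSObject
- (instancetype)initWithLabel:(const char *)label size:(size_t)size;
- (NSUInteger)labelLength;
@end

@implementation EOCBuffer {
    CFStringRef _label; // CoreFoundation object: not handled by ARC
    void *_bytes;       // malloc'd memory: not handled by ARC
}

- (instancetype)initWithLabel:(const char *)label size:(size_t)size {
    if ((self = [super init])) {
        _label = CFStringCreateWithCString(kCFAllocatorDefault, label,
                                           kCFStringEncodingUTF8);
        _bytes = malloc(size);
    }
    return self;
}

- (NSUInteger)labelLength {
    return _label ? (NSUInteger)CFStringGetLength(_label) : 0;
}

- (void)dealloc {
    if (_label) {
        CFRelease(_label); // CFRelease must not be passed NULL
    }
    free(_bytes);          // free(NULL) is harmless
    // No [super dealloc]; ARC calls the superclass implementation.
}
@end
```

The NULL check matters because creation of the CoreFoundation string can fail, and CFRelease, unlike free, must not be called with NULL.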
Overriding the Memory-Management Methods
Before ARC, it was possible to override the memory-management methods. For example, a singleton implementation often overrode release to be a no-op, as a singleton cannot be released. This is now illegal under ARC because doing so could interfere with ARC’s understanding of an object’s lifetime. Also, because the methods are illegal to call and override, ARC makes the optimization of not going through an Objective-C message dispatch (see Item 11) when it needs to perform a retain, release, or autorelease. Instead, these operations are implemented with C functions deep in the runtime. This means that ARC is able to do optimizations such as the one described earlier when returning an autoreleased object that is immediately retained.
Things to Remember
Automatic Reference Counting (ARC) frees the developer from having to worry about most memory management. Using ARC reduces boilerplate code from classes.
ARC handles the object life cycle almost entirely by adding in retains and releases as it sees appropriate. Variable qualifiers can be used to indicate memory-management semantics; previously, retains and releases were manually arranged.
Method names have always been used to indicate memory-management semantics of returned objects. ARC has solidified these and made it impossible not to follow them.
ARC handles only Objective-C objects. In particular, this means that CoreFoundation objects are not handled, and the appropriate CFRetain/CFRelease calls must be applied.