Automatic Reference Counting

From The Apple Wiki

Automatic Reference Counting (ARC) is an Objective-C language feature that automates memory management of Objective-C objects and blocks. The compiler inserts the retain and release calls that the programmer would otherwise have written by hand. Unlike a garbage collector, it doesn't change the runtime model.

It was announced at WWDC 2011, and is fully supported from OS X 10.7 (Lion) and iOS 5 onward. By using a compatibility library called "ARCLite", apps can also use it when targeting OS X 10.6 (Snow Leopard) and iOS 4, though without zeroing weak references.

For a general explanation of what ARC is, the WWDC 2011 talk is probably the best resource. This article (for now) focuses on the internal changes, which are especially useful to know when reading code from a decompiler.

Runtime functions

ARC added several new functions to the Objective-C runtime.

For example, the usual explanation of ARC is that it adds [obj retain] and [obj release] calls for you, but internally, it's actually calling new functions objc_retain(obj) and objc_release(obj). The rationale for using these new functions is that a C function call requires smaller code than an Objective-C message send, and more importantly, that the clang optimizer can more easily identify and manipulate retains and releases in the generated code.
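As a rough illustration of the shape of the generated code, the sketch below models a strong local variable in plain C. Everything here is a hypothetical stand-in: ToyObject, toy_retain and toy_release are not runtime APIs, they only mimic the role of objc_retain and objc_release, and the toy release never deallocates.

```c
#include <stddef.h>

/* Hypothetical stand-in for an Objective-C object: just a reference count. */
typedef struct ToyObject {
    int refcount;
} ToyObject;

/* Stand-in for objc_retain: a plain C call, not a message send. */
static ToyObject *toy_retain(ToyObject *obj) {
    if (obj != NULL) obj->refcount++;
    return obj;
}

/* Stand-in for objc_release. A real release would deallocate at zero. */
static void toy_release(ToyObject *obj) {
    if (obj != NULL) obj->refcount--;
}

/* The programmer only writes the middle line. The retain and release
   around it are the kind of calls the compiler adds for a strong local. */
static int use_object(ToyObject *obj) {
    ToyObject *local = toy_retain(obj);          /* inserted: local takes ownership */
    int seen = local ? local->refcount : 0;      /* the code the programmer wrote */
    toy_release(local);                          /* inserted: local leaves scope */
    return seen;
}
```

Because both inserted calls are ordinary function calls with a known name, an optimizer can pair them up and remove a retain/release pair that cancels out, which is the rationale given above.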

The full list of runtime functions is in the clang documentation.

Autoreleased return values

There's a special optimization for autoreleased return values. Some methods such as getters are supposed to return objects with a net +0 reference count, which is done by autoreleasing the object before returning it. The caller of those methods often retains the return value immediately after receiving it.

With manual reference counting, a developer would write:

-(NSString*)name {
    NSString* retval = [[NSString alloc] initWithFormat: ...]; // +1 from alloc/init
    return [retval autorelease];                               // released later by the pool: net +0
}

-(void)caller {
    self->thingName = [[self->thing name] retain];             // caller takes its own +1
}

Under ARC, the source code would not have any memory management calls and would simply have return retval;. The compiler will add the needed retain and release calls automatically.

However, it's wasteful to autorelease the object (conceptually decrementing the reference count) just to have the caller immediately retain it again. In addition, putting an object into the autorelease pool has its own overhead, and it keeps the object alive for longer than it otherwise would be.

Instead of using objc_autorelease and objc_retain, the compiler generates these runtime calls:

-(NSString*)name {
    NSString* retval = [[NSString alloc] initWithFormat: ...];
    return objc_autoreleaseReturnValue(retval);
}

-(void)caller {
    self->thingName = objc_retainAutoreleasedReturnValue([self->thing name]);
}

(Note that objc_retainAutoreleaseReturnValue and objc_retainAutoreleasedReturnValue are very different functions!)

At runtime, these two functions basically detect if the other is present, and if so they skip the reference count operations. If an optimized callee detects that the caller is also optimized, it doesn't autorelease the object. If an optimized caller detects the callee is also optimized, it doesn't retain the object again. This effectively "hands off ownership" of the retain count from callee to caller, without needing extra retain/release calls.

  1. In the caller, the compiler emits the call to objc_retainAutoreleasedReturnValue in a specific way that the callee can recognize (architecture-specific, see below).
  2. At runtime, objc_autoreleaseReturnValue looks at the caller's instructions as data, by reading via the return address. Since the call to autoreleaseReturnValue is (usually) a tail call, this reads instructions from the actual caller method, not the callee.
  3. If the instructions show that the caller is also "optimized", then autoreleaseReturnValue stores the object pointer in a dedicated thread-local storage (TLS) slot, and doesn't perform the autorelease. Otherwise, it calls objc_autorelease as usual.
  4. In the caller, objc_retainAutoreleasedReturnValue checks if the object pointer matches the TLS slot. If so, it means the callee did the optimization, so retainAutoreleasedReturnValue can skip the retain call because the object is already at +1 refcount. It will clear the TLS slot for the next call and return. If the TLS slot doesn't match, it means the callee is not optimized, so retainAutoreleasedReturnValue retains the object as usual.

This skips both the autorelease on the callee and the retain on the caller.
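The four steps above can be modeled in a few lines of C. This is only a sketch under stated assumptions: ToyObject and every toy_ function are hypothetical stand-ins, the autorelease pool is reduced to a flag, and the caller_is_optimized parameter replaces the real return-address inspection, which is architecture-specific.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object: a reference count plus a flag recording whether
   it was put into the (imaginary) autorelease pool. */
typedef struct ToyObject {
    int  refcount;
    bool in_pool;
} ToyObject;

/* Stand-in for the dedicated thread-local storage slot. */
static _Thread_local ToyObject *handoff_slot = NULL;

static ToyObject *toy_retain(ToyObject *obj) {
    obj->refcount++;
    return obj;
}

/* Toy autorelease: only records that a release is pending in the pool. */
static ToyObject *toy_autorelease(ToyObject *obj) {
    obj->in_pool = true;
    return obj;
}

/* Callee side (steps 2 and 3). The flag is an assumption of this sketch;
   the real function inspects the caller's instructions instead. */
static ToyObject *toy_autoreleaseReturnValue(ToyObject *obj, bool caller_is_optimized) {
    if (caller_is_optimized) {
        handoff_slot = obj;          /* hand the +1 to the caller, no autorelease */
        return obj;
    }
    return toy_autorelease(obj);     /* fall back to the normal behaviour */
}

/* Caller side (step 4). */
static ToyObject *toy_retainAutoreleasedReturnValue(ToyObject *obj) {
    if (handoff_slot == obj) {
        handoff_slot = NULL;         /* clear the slot for the next call */
        return obj;                  /* already at +1, skip the retain */
    }
    return toy_retain(obj);          /* callee was not optimized: retain as usual */
}
```

With an optimized pair, an object that starts at +1 stays at +1 and never enters the pool. With an unoptimized caller flag, the same object ends up in the pool and at +2 after the retain, which matches the manual reference counting version.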

Note that this mechanism ensures that if either the caller or the callee is not optimized (e.g. because it isn't using ARC), the other side will do the normal autorelease or retain operation to preserve correctness. This also applies if the compiler doesn't tail-call objc_autoreleaseReturnValue in the callee for whatever reason.

objc_retainAutoreleaseReturnValue

When the object being returned does not already have a +1 reference count, it has to be retained first. For example, when returning an ivar, return self->name turns into return [[self->name retain] autorelease].

Normally this would be compiled into objc_autoreleaseReturnValue(objc_retain(self->name)), but instead the compiler emits a call to another runtime function objc_retainAutoreleaseReturnValue(self->name) (again, this is different from "retainAutoreleasedReturnValue").

Before iOS 9 and OS X 10.11, this function was simply a wrapper for objc_autoreleaseReturnValue(objc_retain(obj)). It existed as a combined function purely as a minor code size optimization.

(In iOS 9+ / OS X 10.11+, this is more complex; to be documented later.)

How the optimization is recognized

The way the callee recognizes if the caller is optimized depends on the CPU architecture:

x86_64

On x86_64, the compiler ensures that the caller method calls objc_retainAutoreleasedReturnValue immediately after calling the callee, with nothing in between except the mov that passes the return value as the first argument. This means the caller (in Intel syntax) looks like:

call objc_msgSend ; call [self->thing name], return value in rax
mov rdi, rax      ; use it as the first argument of the next call
call objc_retainAutoreleasedReturnValue

In the callee, objc_autoreleaseReturnValue reads from the return address to detect the pattern of mov rdi, rax followed by a call or jump to objc_retainAutoreleasedReturnValue.
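A simplified version of that check can be sketched in C as a byte comparison. The function name and the code/len parameters are hypothetical, and the sketch assumes the usual encodings: 48 89 c7 for mov rdi, rax, e8 for a relative call and e9 for a relative jump. It does not verify that the call target really is objc_retainAutoreleasedReturnValue, which the runtime also has to establish.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* code: the bytes found at the return address.
   len:  how many of those bytes may be read (hypothetical parameter,
         the real runtime reads its caller's code directly). */
static bool caller_has_x86_64_pattern(const uint8_t *code, size_t len) {
    /* 48 89 c7 = mov rdi, rax */
    static const uint8_t mov_rdi_rax[] = { 0x48, 0x89, 0xc7 };

    if (code == NULL || len < sizeof mov_rdi_rax + 1) {
        return false;
    }
    if (memcmp(code, mov_rdi_rax, sizeof mov_rdi_rax) != 0) {
        return false;
    }
    /* e8 = call rel32, e9 = jmp rel32; the target is not checked here. */
    uint8_t opcode = code[sizeof mov_rdi_rax];
    return opcode == 0xe8 || opcode == 0xe9;
}
```

Any other instruction between the method call and the mov changes the bytes at the return address, so the check fails and the callee falls back to a normal autorelease.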

Code comments in the objc4 runtime say the same approach is used by 32-bit x86, but this isn't true. The optimization is disabled altogether on x86 (both OS X and iOS Simulator).

armv7

In the caller, the compiler ensures the method call is immediately followed by the marker instruction mov r7, r7:

bl objc_msgSend ; call [self->thing name]
mov r7, r7
bl objc_retainAutoreleasedReturnValue

In the callee, objc_autoreleaseReturnValue reads from the return address and checks for the mov r7, r7 instruction. Both the ARM and Thumb encodings of the instruction are checked.
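The marker check can be sketched in C as follows. The function name and its parameters are hypothetical, and the sketch assumes the standard little-endian encodings of mov r7, r7: 0xe1a07007 as a 32-bit ARM instruction and 0x463f as a 16-bit Thumb instruction. It also assumes the instruction set mode is already known, whereas on real hardware it would have to be derived from the return address.

```c
#include <stdbool.h>
#include <stdint.h>

/* code:  bytes at the return address (at least 4 for ARM, 2 for Thumb).
   thumb: true if the caller runs in Thumb mode (assumed to be known). */
static bool caller_has_armv7_marker(const uint8_t *code, bool thumb) {
    if (thumb) {
        /* Assemble the 16-bit instruction byte by byte (little-endian). */
        uint16_t insn = (uint16_t)(code[0] | (code[1] << 8));
        return insn == 0x463f;               /* Thumb: mov r7, r7 */
    }
    /* Assemble the 32-bit instruction byte by byte (little-endian). */
    uint32_t insn = (uint32_t)code[0]
                  | ((uint32_t)code[1] << 8)
                  | ((uint32_t)code[2] << 16)
                  | ((uint32_t)code[3] << 24);
    return insn == 0xe1a07007u;              /* ARM: mov r7, r7 */
}
```

Reading the bytes individually avoids unaligned loads, which matters because a 32-bit instruction is not guaranteed to sit on a 4-byte boundary in mixed ARM/Thumb code.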

arm64

In the caller, the compiler ensures the method call is immediately followed by the marker instruction mov x29, x29:

bl objc_msgSend ; call [self->thing name]
mov x29, x29
bl objc_retainAutoreleasedReturnValue

In the callee, objc_autoreleaseReturnValue reads from the return address and checks for the mov x29, x29 instruction.
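A minimal C sketch of this check is shown below. The function name and parameter are hypothetical, and it assumes the standard encoding of mov x29, x29, which is 0xaa1d03fd stored little-endian.

```c
#include <stdbool.h>
#include <stdint.h>

/* code: the 4 bytes found at the return address. */
static bool caller_has_arm64_marker(const uint8_t *code) {
    /* Assemble the 32-bit instruction byte by byte (little-endian). */
    uint32_t insn = (uint32_t)code[0]
                  | ((uint32_t)code[1] << 8)
                  | ((uint32_t)code[2] << 16)
                  | ((uint32_t)code[3] << 24);
    return insn == 0xaa1d03fdu;              /* mov x29, x29 */
}
```

Since arm64 instructions have a fixed 32-bit width, a single comparison is enough, unlike the armv7 case with its two encodings.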

Resources