Accessor methods revisited

Introduction

In a previous article we saw that accessor methods can be useful in ensuring that memory is managed correctly. We also saw just one accessor method pattern. As most developers quickly come to realise, however, accessor methods come in a number of different forms. In this article we examine the various flavours, and discuss the differences between them.

For the purposes of illustration, consider a Person object, with two instance variables:

@interface Person : NSObject
{
    NSString *_name;
    Person *_spouse;
}
@end

Notice that the variable names are prefixed with an underscore -- this discourages the developer from accessing them directly in the class implementation! To discourage others from using them, they might also be declared @private. Do not use an underscore prefix for private method names -- this pattern is reserved by Apple.

To get and set the name, we should define two accessor methods. Following the convention established in the previous article, they would be defined thus:

- (NSString *)name 
{
    return _name;
}
- (void)setName:(NSString *)newName 
{
    [newName retain];
    [_name release];
    _name = newName;
}

The order of the calls is important. There is a possibility that _name and newName may be the same object. If they are, and if self is the only object retaining it, calling release first will cause the object to be freed (and the subsequent retain will be sent to a freed object, resulting in a crash).

For most situations, these would be fine. In some circumstances, however, alternatives may be desirable. Most of the variants of the accessor methods only affect the "setter", so we will ignore the "getter" method for the moment.

First variant

Since accessor methods are likely to be called frequently, it is worth ensuring that they are efficient. This first variant checks whether the old and new values are identical. If they are, then we don't bother with the reassignment (or the "expensive" memory management).

- (void)setName:(NSString *)newName 
{
    if (_name != newName) {
        [newName retain];
        [_name release];
        _name = newName;
    }
}

Use of the if test represents a tradeoff: we avoid memory management in the case where it is unnecessary, but at the cost of always performing the test, even when the two values differ. We can't rationally decide whether always using the test is an appropriate rule unless we know: (a) the relative cost of the processing overhead of each; and (b) the ratio of accessor calls that might involve identical objects to those that won't. The latter is an application-specific issue; in most cases, however, the frequency of _name and newName being the same is likely to be low, so this variant will overall be less efficient than the original.

Second variant

When using "get" accessors, it is generally assumed that the value returned remains valid for the scope of the current code block. This assumption may not always be true, particularly if you change the value within the block. Consider the following example of a traditional Western wedding (where the bride adopts the groom's name) which uses a hypothetical setSpouse:andUpdateName: method which might in turn employ our accessor methods:

+ (void)makeHusband:(Person *)husband andWife:(Person *)wife
{ 
    NSString *oldName = [wife name];
    [husband setSpouse:wife andUpdateName:NO];
    [wife setSpouse:husband andUpdateName:YES];

    NSLog(@"Changed %@'s name to %@", oldName, [wife name]);
}

If the wife is the only object retaining the string for her original name, and we assume that it will be released (and so freed -- something we might overlook in a first implementation) in [wife setSpouse:husband andUpdateName:YES], the reference to it as oldName will consequently become invalid. This method will therefore fail with a runtime error (sending a message to a freed object) in NSLog. The remedy is to ensure that the old value is not released immediately. This may be achieved using autorelease in the set method:

- (void)setName:(NSString *)newName 
{
    if (_name != newName) {
        [_name autorelease];
        _name = [newName retain];
    }
}

This variant is preferred for applications where performance is an important consideration (we'll explain this later); however, it still allows scope for one possible error.

Defensive programming

Suppose the setSpouse:andUpdateName: method were implemented such that it used its own autorelease pool and called setName:, as in this example (for illustrative purposes only!):

- (void)setSpouse:(Person *)newSpouse andUpdateName:(BOOL)shouldUpdateName {

    [self setSpouse:newSpouse];
    if (shouldUpdateName) {
        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
        
        NSArray *oldNames = [[self name] componentsSeparatedByString:@" "];
        NSArray *newNames = [[newSpouse name] componentsSeparatedByString:@" "];

        NSString *newName = [NSString stringWithFormat:@"%@ %@",
            [oldNames objectAtIndex:0], [newNames objectAtIndex:1]];
        [self setName:newName];
        [pool release];
    }
}
Given this implementation, the makeHusband:andWife: method above will again fail in NSLog if the accessors are implemented as per the second variant. To ensure that both:
  1. Memory management is maintained using accessors; and
  2. Variables accessed remain valid for the scope of a block
the following combination of the get and set methods should be used:
- (NSString *)name 
{
    return [[_name retain] autorelease];
}
- (void)setName:(NSString *)newName 
{
    [newName retain];
    [_name release];
    _name = newName;
}

The getter method retains the object and then autoreleases it into the current autorelease pool, so that the returned value will remain valid for the calling block.

Since autorelease is a more expensive operation than release, however, and in this pattern autorelease may be called frequently, in applications where high performance is a major constraint you might wish to use the prior pattern, but explicitly retain and release any returned values.
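
As a minimal sketch of that last suggestion, assuming manual reference counting (no ARC) and the second-variant accessors, a caller can take ownership of a returned value for as long as it needs it. RenamePerson is a hypothetical helper introduced here for illustration; it is not part of the Person class described above.

```objective-c
#import <Foundation/Foundation.h>

// Minimal Person using the "second variant" accessors: plain getter,
// autoreleasing setter. Compile with manual reference counting (no ARC).
@interface Person : NSObject
{
    NSString *_name;
}
- (NSString *)name;
- (void)setName:(NSString *)newName;
@end

@implementation Person
- (NSString *)name
{
    return _name;
}
- (void)setName:(NSString *)newName
{
    if (_name != newName) {
        [_name autorelease];
        _name = [newName retain];
    }
}
- (void)dealloc
{
    [_name release];
    [super dealloc];
}
@end

// Hypothetical helper: renames a person inside a local autorelease pool
// and returns a description of the change.
static NSString *RenamePerson(Person *person, NSString *newName)
{
    // Take ownership of the old value before anything can release it.
    NSString *oldName = [[person name] retain];

    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    [person setName:newName];
    [pool release];    // the setter's autorelease of the old name fires here

    NSString *message = [NSString stringWithFormat:@"Changed %@'s name to %@",
        oldName, [person name]];
    [oldName release]; // balance the explicit retain above
    return message;
}
```

The explicit retain costs one extra line at each call site that keeps a value across a change, which is the price of the cheaper getter.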

It is also worth pointing out that one disadvantage of using autorelease is that it can mask and delay "over-releasing" errors (not that you should have any if you follow the Golden Rules of memory management!). This tends to make debugging cases of over-release much more difficult.

There is, however, one further consideration.

Attribute or relationship?

One of the basic tenets of Object-Oriented Programming is data encapsulation, so it is generally considered that allowing external agents to modify instance variables directly is a Bad Thing. Consider this, rather contrived, example:

- (void)setName:(NSString *)newName 
{
    [newName retain];
    [_name release];
    _name = newName;
    [self makeAndCacheInitialsFromName];
}
If the person's name is set to an NSMutableString, the following might occur:
    NSMutableString *ms = [NSMutableString stringWithString:@"Jo"];
    [aPerson setName:ms];

    [ms appendString:@" Smith"];

Although Jo's name will now be "Jo Smith", the cached initials will still be "J". Whilst this particular example might appear unlikely, recall also that in all the setter methods, what is happening is that a pointer is being reassigned. If you are not careful, a number of persons could end up with their name variables all pointing to the same mutable string. This latter situation arises comparatively frequently when using NSText objects to display variables from a succession of different objects (consider [aPerson setName:[anNSText string]]). For this reason, if there is a likelihood that a mutable version of an object may be passed as a value, it may be worth making a private copy of the object:

- (void)setName:(NSString *)newName 
{
    [_name autorelease];
    _name = [newName copy];
}

When trying to determine when to use this approach, bear in mind the difference between attributes and relationships. Whilst this is appropriate for a Person's name, it may not be appropriate for their spouse! Ask yourself the question "Do I want the value, or the actual object?"
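
The distinction can be sketched as follows, assuming manual reference counting (no ARC). The setSpouse: accessor shown here is an assumption made for illustration, since the article does not list one; the point is only that the name is copied while the spouse is retained.

```objective-c
#import <Foundation/Foundation.h>

// Compile with manual reference counting (no ARC).
@interface Person : NSObject
{
    NSString *_name;    // attribute: we want the value
    Person *_spouse;    // relationship: we want the actual object
}
- (NSString *)name;
- (void)setName:(NSString *)newName;
- (Person *)spouse;
- (void)setSpouse:(Person *)newSpouse;
@end

@implementation Person
- (NSString *)name
{
    return [[_name retain] autorelease];
}
- (void)setName:(NSString *)newName
{
    // Copy: later changes to a mutable argument cannot affect us.
    [_name autorelease];
    _name = [newName copy];
}
- (Person *)spouse
{
    return [[_spouse retain] autorelease];
}
- (void)setSpouse:(Person *)newSpouse
{
    // Retain, not copy: the spouse must remain the very same Person.
    // If both partners retain each other, one of them has to be sent
    // setSpouse:nil before either can be deallocated.
    [newSpouse retain];
    [_spouse release];
    _spouse = newSpouse;
}
- (void)dealloc
{
    [_name release];
    [_spouse release];
    [super dealloc];
}
@end
```

With these accessors, appending to a mutable string after passing it to setName: leaves the person's name unchanged, while [aPerson spouse] returns the identical object that was passed to setSpouse:.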

Summary

This article has presented a number of different patterns for accessor methods, and there are more. A number of factors will influence your decision as to which is most suitable for a given set of circumstances. The final variant:

- (ClassName *)variableName 
{
    return [[variableName retain] autorelease];
}
- (void)setVariableName:(ClassName *)newVariableName 
{
    [newVariableName retain];
    [variableName release];
    variableName = newVariableName;
}

should be used if you want to be as sure as possible that messages are not inadvertently sent to freed objects. Most developers, however, do not follow this pattern, and the second variant is much more common:

- (ClassName *)variableName 
{
    return variableName;
}
- (void)setVariableName:(ClassName *)newVariableName 
{
    [newVariableName retain];
    [variableName autorelease];
    variableName = newVariableName;
}

This may be used for applications where performance is more important, provided you are willing to take extra care over the possibility that other methods might use their own autorelease pools.

You should also consider the possibility that mutable objects could be passed as attribute values, and if necessary make a private copy.


Undo management and change notification

In some applications you need to provide support for undo management and/or change notification. For example, a setter that sends a change notification:

- (void)setVariableName:(ClassName *)newVariableName 
{
    if (variableName != newVariableName) {
        [newVariableName retain];
        [variableName autorelease];
        variableName = newVariableName;
        [self sendVariableNamedChangedNotification];
    }
}
Here, since the test for change must be made anyway, it is worth putting the reassignment and memory management calls within the conditional block.
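
The same shape lends itself to undo support. The following is a minimal sketch, assuming manual reference counting (no ARC) and assuming that each Person is handed an NSUndoManager when it is created; the initWithUndoManager: initialiser and the _undoManager instance variable are hypothetical and not part of the Person class described earlier.

```objective-c
#import <Foundation/Foundation.h>

// Compile with manual reference counting (no ARC).
@interface Person : NSObject
{
    NSString *_name;
    NSUndoManager *_undoManager;   // hypothetical: supplied by the owner
}
- (id)initWithUndoManager:(NSUndoManager *)undoManager;
- (NSString *)name;
- (void)setName:(NSString *)newName;
@end

@implementation Person
- (id)initWithUndoManager:(NSUndoManager *)undoManager
{
    self = [super init];
    if (self != nil) {
        _undoManager = [undoManager retain];
    }
    return self;
}
- (NSString *)name
{
    return [[_name retain] autorelease];
}
- (void)setName:(NSString *)newName
{
    if (_name != newName) {
        // Record how to get back to the old value. The undo manager
        // keeps the old name alive for as long as it holds the action.
        [_undoManager registerUndoWithTarget:self
                                    selector:@selector(setName:)
                                      object:_name];
        [newName retain];
        [_name autorelease];
        _name = newName;
    }
}
- (void)dealloc
{
    // The undo manager does not retain its targets, so remove our actions.
    [_undoManager removeAllActionsWithTarget:self];
    [_undoManager release];
    [_name release];
    [super dealloc];
}
@end
```

Because undoing calls setName: with the old value, the same method also registers the matching redo action.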


Multi-threaded applications

If the Person object is accessed from more than one thread, it is possible for a variable to be accessed and "cached" (as per the previous example) in one thread, reassigned and released in another, and then the original (freed object) accessed again in the first. To ensure thread safety, we need to place a lock (NSLock) around the reassignment.

Let us assume we add a lock instance variable to the person class:

@interface Person : NSObject
{
    NSString *_name;
    Person *_spouse;
    NSLock * _nameLock;
}
@end

_nameLock should be created in the Person's init method. Given this addition, here is one implementation of the set method:

- (void)setName:(NSString *)newName
{
    [_nameLock lock];

    [newName retain];
    [_name autorelease];
    _name = newName;

    [_nameLock unlock];
}

The following version is rather more complex, but makes the locked code shorter, especially since there are no method calls in the locked portion. This is beneficial since critical (locked) sections of code should be as short and fast as possible to reduce the possibility of lock contention.

- (void)setName:(NSString *)newName
{
    id originalValue;

    [newName retain];

    [_nameLock lock];
    originalValue = _name;
    _name = newName;
    [_nameLock unlock];

    [originalValue release];
}

We also need locks in the get accessor, around the retain, to guarantee that the variable is not stale by the time we retain it.

- (NSString *)name
{
    id tmp = nil;

    [_nameLock lock];
    tmp = [_name retain];
    [_nameLock unlock];

    return [tmp autorelease];
}

The retain/autorelease pair is needed to make it "more thread-safe"; without it, even if we lock to obtain the variable, it might be stale by the time the caller gets around to using it.
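
Putting the pieces of this section together, a minimal sketch of the class might look as follows, assuming manual reference counting (no ARC). The init and dealloc methods are not shown in the article and are assumptions about how the lock would be created and released; _spouse is left out for brevity.

```objective-c
#import <Foundation/Foundation.h>

// Compile with manual reference counting (no ARC).
@interface Person : NSObject
{
    NSString *_name;
    NSLock *_nameLock;
}
- (NSString *)name;
- (void)setName:(NSString *)newName;
@end

@implementation Person
- (id)init
{
    self = [super init];
    if (self != nil) {
        _nameLock = [[NSLock alloc] init];   // guards _name
    }
    return self;
}
- (NSString *)name
{
    NSString *tmp;

    [_nameLock lock];
    tmp = [_name retain];
    [_nameLock unlock];

    return [tmp autorelease];
}
- (void)setName:(NSString *)newName
{
    NSString *originalValue;

    [newName retain];

    // Only the pointer swap happens while the lock is held.
    [_nameLock lock];
    originalValue = _name;
    _name = newName;
    [_nameLock unlock];

    [originalValue release];
}
- (void)dealloc
{
    // By this point no other thread may still be using the object.
    [_name release];
    [_nameLock release];
    [super dealloc];
}
@end
```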

Wrong

A number of developers seem to implement the set method as follows:

-(void)setName:(NSString *)newName
{
    id   old = _name;

    _name = [newName retain];
    [old release];
}
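
in the belief that this assures thread-safety. Unfortunately, it does not. All it does is change the possible locus of failure: if a thread switch occurs between the assignment of old and the reassignment of _name, a second thread can, for example, run the same method, read the same old value, and the object will then be released twice.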

Multi-threading issues

Multi-threaded programming is hard, and the overheads may be high, both in terms of programmer effort, and in CPU cycles for assuring thread-safety. As Bill Bumgarner put it (June 7, 2002; cocoa-dev@lists.apple.com), "...a lot of well intentioned effort to achieve greater performance or responsiveness through threading simply leads to an application that is either intermittently unstable or has so many points of synchronization that it might as well be single threaded."

On the other hand, as Paul Williamson (private email) countered: "CPU overheads tend to be high when you try to build thread-safety into low-level mechanisms, as we're doing here, in hopes of reducing the burden on higher-layer programmers. If the architecture is designed carefully and the threads are well decoupled, thread safety mechanisms are only needed a small percentage of the time, and so don't necessarily have a big performance impact. That might mean requiring higher-layer code to be thread-aware, which again makes development more difficult."

In a more sophisticated architecture it might make more sense for the Person class to conform to the NSLocking protocol and leave locking up to the callers. This might be particularly true if there were lots of Persons in the program but there was only an occasional risk of multi-thread access to a Person.
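
A minimal sketch of that last arrangement, assuming manual reference counting (no ARC): Person adopts NSLocking and forwards lock and unlock to an internal NSLock, while the accessors themselves stay lock-free. The _lock instance variable and the AppendSurname function are hypothetical names used only for this illustration.

```objective-c
#import <Foundation/Foundation.h>

// Compile with manual reference counting (no ARC).
// Person adopts NSLocking; the accessors themselves take no lock.
@interface Person : NSObject <NSLocking>
{
    NSString *_name;
    NSLock *_lock;
}
- (NSString *)name;
- (void)setName:(NSString *)newName;
@end

@implementation Person
- (id)init
{
    self = [super init];
    if (self != nil) {
        _lock = [[NSLock alloc] init];
    }
    return self;
}
- (void)lock
{
    [_lock lock];
}
- (void)unlock
{
    [_lock unlock];
}
- (NSString *)name
{
    return _name;
}
- (void)setName:(NSString *)newName
{
    [newName retain];
    [_name release];
    _name = newName;
}
- (void)dealloc
{
    [_name release];
    [_lock release];
    [super dealloc];
}
@end

// Hypothetical caller that knows the Person may be shared between threads.
static NSString *AppendSurname(Person *person, NSString *surname)
{
    NSString *result;

    [person lock];
    result = [NSString stringWithFormat:@"%@ %@", [person name], surname];
    [person setName:result];
    [person unlock];

    return result;
}
```

Code that only ever touches a Person from one thread can skip the lock and unlock calls entirely, which is where the saving comes from; the cost is that every caller sharing a Person across threads must remember to bracket its accesses.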


Many thanks to Ali Ozer for his inspirational presentation at WWDC 2002; and to Ali, Ian P. Cardenas, Bill Cheeseman, John Hörnkvist, Richard Jackson, Michael B. Johnson, Nat!, John Randolph, Paul Williamson and Don Yacktman for their comments and suggestions. Special thanks are due to Edwin Zacharias for his input and example application regarding multi-threading. Improvements on early versions of this article are due to them; errors and omissions remain mine.