Tuesday, April 8, 2008

You're Doing it Wrong! A Brief Discussion of Categories and the Cocoa Way

Let's talk about a part of the Objective-C language that is somewhat unique: Categories. Well, not completely unique: the latest edition of C# has added something called Extension Methods, which is very similar, but it hasn't been around long enough to have really sunk into the collective psyche of .NET developers yet. Other than this recent addition to C# and the obscure language TOM, I don't personally know of any programming languages that have a direct equivalent to Objective-C categories. There probably are some floating out there (Ruby lets you reopen an existing class, and Python lets you attach methods to most classes at runtime, which gets you to a similar place by a different road), but of the "biggie" languages in the world of object-oriented programming, neither Java nor C++ has an equivalent construct.

Categories are über-cool, and almost universally underused by programmers new to Objective-C. Although I'm not a big fan of C# or .NET, I have to give Microsoft kudos for recognizing awesomeness and incorporating it. Let's keep our fingers crossed that they use it as effectively as the NeXT and Apple engineers have in Cocoa.

A lot of people who come to Cocoa from other languages and read Apple's reference book on Objective-C skim over the section on Categories, say "cool," and then move on. Heck, Apple's own Objective-C primer makes no mention of categories, which is a shame, because Cocoa programmers use categories a lot. So, what are they, you ask?

Simply put, categories allow you to add methods to existing classes without subclassing them. You can't add instance variables (colloquially called "iVars" or "ivars" in the Cocoa world), which also means you can't add synthesized properties, but you can add both class and instance methods with impunity.
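
To make that concrete, here is a minimal sketch of a category that adds one class method and one instance method to NSString. The category name (Shouting) and both method names are made up for illustration; nothing here comes from Cocoa itself.

```objc
#import <Foundation/Foundation.h>

// Hypothetical category, for illustration only: one class method and
// one instance method added to the existing NSString class.
@interface NSString (Shouting)
+ (NSString *)stringByRepeatingString:(NSString *)piece count:(NSUInteger)count;
- (NSString *)shoutedString;
@end

@implementation NSString (Shouting)

// Class method: builds a new string by repeating `piece` `count` times.
+ (NSString *)stringByRepeatingString:(NSString *)piece count:(NSUInteger)count
{
    NSMutableString *result = [NSMutableString string];
    for (NSUInteger i = 0; i < count; i++) {
        [result appendString:piece];
    }
    return result;
}

// Instance method: `self` is the string receiving the message, exactly
// as it would be in a method declared by NSString itself.
- (NSString *)shoutedString
{
    return [[self uppercaseString] stringByAppendingString:@"!"];
}

@end
```

With the category imported, both methods can be called as if NSString had always had them:
NSString *loud = [@"hello" shoutedString];
NSString *repeated = [NSString stringByRepeatingString:@"ab" count:3];
Here loud is @"HELLO!" and repeated is @"ababab". No subclass is involved, and no existing call site has to change.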

So why is this cool? Let's take an example, and look at the design choices available to us in Objective-C versus other languages. I'll use a real-world but very simple example here, but there are countless examples in a programmer's day-to-day life where categories can make your overall code base more elegant, many of which are more complex and more compelling than this one.

One of my current side projects is building two Cocoa-based NNTP clients (for accessing Usenet): one for the Mac and the other for the iPhone. Now, NNTP servers expect date information to be passed to them in a very specific way: the format specified in RFC 822. A date string for NNTP needs to look something like this:
08 Apr 2008 12:14:40 -0500
Formatting dates with Objective-C/Cocoa is relatively easy, so I can create a string in this format from an NSDate object by doing this:

NSString *dateString = [theDate descriptionWithCalendarFormat:@"%d %b %Y %H:%M:%S %z" timeZone:nil locale:nil];
But where does this code go? Do I really want to put this code everywhere I need to use it? Well, maybe... like I said, this is a simple example for purposes of illustration, and not as compelling as many other real-world examples, but play along with me here, okay?

In an NNTP client, this same long line of code would probably need to be put into dozens or maybe even hundreds of places. But what if RFC 822 were to change? Or what if, after using similar code in dozens of places, I realize that I have made a mistake? What a pain!

There are many ways of dealing with this problem in other languages. I could, for example, create a private static method or a function (depending on language) that takes a date and returns a correctly formatted string. In C or any of its supersets, I could choose to create a preprocessor macro using #define to avoid having the same logic in multiple places. I could also choose to subclass NSDate and add a method in the subclass to return this string. In some cases, that might be appropriate, but in a statically typed language like C++ or Java, I probably don't want to subclass just to get a single method, because it would force a lot of type coercion and messiness into my code.
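
For comparison, here is roughly what the function and macro alternatives might look like. This is only a sketch: the names NNTPDateString and NNTP_DATE_STRING are hypothetical, and the formatting call is the same one used earlier in this post (Apple has since deprecated it on the Mac, so expect a compiler warning on newer SDKs).

```objc
#import <Foundation/Foundation.h>

// Option 1: a plain C function (hypothetical name). The logic lives in
// one place, but callers must remember a free-floating function that
// has no visible connection to NSDate.
static NSString *NNTPDateString(NSDate *date)
{
    return [date descriptionWithCalendarFormat:@"%d %b %Y %H:%M:%S %z"
                                      timeZone:nil
                                        locale:nil];
}

// Option 2: a preprocessor macro (hypothetical name). Same idea, but the
// expression is pasted into every call site by the preprocessor, so
// there is no type checking on the argument.
#define NNTP_DATE_STRING(date) \
    [(date) descriptionWithCalendarFormat:@"%d %b %Y %H:%M:%S %z" \
                                 timeZone:nil \
                                   locale:nil]
```

Both work, and both centralize the format string. What neither gives you is the natural "ask the date object for it" call style that the rest of Cocoa uses.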

Without categories, subclassing might be a good choice in Objective-C because it is dynamically typed and could be done with minimal messiness. But we can do it with NO messiness thanks to categories. In Objective-C, I can simply add a method to the existing NSDate class to do what I need. Anywhere I need to use this new method, I just make sure to import the header file for my category. In my simple example, the category header file (.h) would look like this:
#import <Cocoa/Cocoa.h>
#import <AppKit/AppKit.h>

// NNTP category on NSDate: adds RFC 822 date formatting
@interface NSDate (NNTP)

- (NSString *)rfc822DateString;

@end
and my implementation (.m) file would look like this:
#import "NSDate-NNTP.h"

@implementation NSDate(NNTP)
- (NSString *)rfc822DateString
{
    return [self descriptionWithCalendarFormat:@"%d %b %Y %H:%M:%S %z" timeZone:nil locale:nil];
}
@end
Once I have this, anywhere I need a date formatted in this particular way, I simply call this new method right on the date object, like this:
NSString *dateString = [myDate rfc822DateString];
The result? Very readable and much shorter code, with the actual logic in a single place, which makes it easy to maintain.
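
Because the logic lives in one place, the implementation can also be swapped out without touching any caller. As a hedged sketch of that idea, here is a variant built on NSDateFormatter instead of descriptionWithCalendarFormat:, which Apple later deprecated on the Mac and never shipped on the iPhone. The category name, the method name, and the choice of the en_US_POSIX locale are my assumptions, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical variant of the NNTP category, named differently so it can
// coexist with the original. Assumes ARC.
@interface NSDate (NNTPFormatter)
- (NSString *)rfc822DateStringUsingFormatter;
@end

@implementation NSDate (NNTPFormatter)

- (NSString *)rfc822DateStringUsingFormatter
{
    NSDateFormatter *formatter = [[NSDateFormatter alloc] init];

    // A fixed locale keeps month abbreviations in English regardless of
    // the user's language settings.
    formatter.locale = [NSLocale localeWithLocaleIdentifier:@"en_US_POSIX"];

    // dd = 2-digit day, MMM = abbreviated month, yyyy = 4-digit year,
    // HH:mm:ss = 24-hour time, Z = numeric zone offset such as -0500.
    formatter.dateFormat = @"dd MMM yyyy HH:mm:ss Z";

    return [formatter stringFromDate:self];
}

@end
```

Callers would use it the same way as before:
NSString *dateString = [myDate rfc822DateStringUsingFormatter];
Creating a formatter on every call is wasteful if this runs in a tight loop; caching one would be a reasonable next step, but I've left that out to keep the sketch short.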

If you want to see a more complex use of Objective-C categories, check out this earlier blog posting where I showed how to add network socket support to the delivered NSFileHandle class.

Categories make it easier to stick with the overriding design philosophy used throughout Cocoa, and their presence allows the Application Kit and UIKit to have much shallower class hierarchies than you find in other object-oriented application toolkits. If you're working in Cocoa/Objective-C, you really, really should get into the habit of using categories whenever it makes sense. When does it make sense? Any time you think to yourself, "Gee, I wish the delivered class [X] had a method that does [Y]".
