Saturday, July 29, 2006

Access argc and argv from Anywhere

Say you're in some random function that's stuck deep down in the middle of some big program you're writing on the Mac. And now assume that you'd like to have access to the arguments that were passed to main() when the program started. How can you do this?

Well, as it turns out, this work has already been done for us. Let's have a look inside /usr/lib/libSystem.B.dylib, shall we?

$ nm /usr/lib/libSystem.B.dylib | grep _NSGetArg
...
9003a382 T __NSGetArgc
90020f60 T __NSGetArgv
...

Ahh, so it looks like we found some symbols (functions) whose names look very revealing (see man nm for details on what the output above means). The C compiler automatically prepends a leading underscore to symbol names, so the actual function names should be _NSGetArgc(void) and _NSGetArgv(void). Let's pick one and try to use it.

I didn't know of a header that declares _NSGetArgc() when I tried this (Apple's <crt_externs.h> does), so we'll declare it ourselves. We want to link against the version of the function that's in libSystem, so we'll declare it extern. Let's take a first stab:
$ cat nsargc.c
#include <stdio.h>
extern int _NSGetArgc(void);
int main(void) {
    printf("argc=%d\n", _NSGetArgc());
    return 0;
}
$ gcc -o nsargc nsargc.c -Wall -std=c99
$ ./nsargc foo bar
argc=8192

Well, that's certainly not correct. Maybe we're using the function wrong. The real "argc" should be an int, but maybe this function returns a pointer to it instead of returning the actual value. Let's try that:
$ cat nsargc.c
#include <stdio.h>
extern int *_NSGetArgc(void);
int main(void) {
    printf("argc=%d\n", *_NSGetArgc());
    return 0;
}
$ gcc -o nsargc nsargc.c -Wall -std=c99
$ ./nsargc foo bar
argc=3

Hey! Now that's more like it. So it looks like these functions may return pointers to the values we want rather than the actual values we want. Now let's try this with argv as well.
$ cat NSGetArgs.c
#include <stdio.h>

extern int *_NSGetArgc(void);
extern char ***_NSGetArgv(void);

void DoStuff(void) {
    printf("%20s = %d\n", "_NSGetArgc()", *_NSGetArgc());

    char **argv = *_NSGetArgv();
    for (int i = 0; argv[i] != NULL; ++i)
        printf("%15s [%02d] = '%s'\n", "_NSGetArgv()", i, argv[i]);
}

int main(void) {
    DoStuff();
    return 0;
}
$ gcc -o NSGetArgs NSGetArgs.c -Wall -std=c99
$ ./NSGetArgs foo bar
        _NSGetArgc() = 3
   _NSGetArgv() [00] = './NSGetArgs'
   _NSGetArgv() [01] = 'foo'
   _NSGetArgv() [02] = 'bar'


Sweet, it looks like that works. So now we can get access to argc and argv from anywhere within a program. And notice that we didn't even need to declare the arguments in main's signature.
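As a rough sketch of how you might put this to use, here's a helper that checks whether a given flag was passed on the command line, callable from any function. The names args_contain() and process_has_flag() are made up for this example, and the Mac-only part is wrapped in #ifdef __APPLE__ so the file still compiles elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#ifdef __APPLE__
/* Declared by hand, as above; both return pointers to the real values. */
extern int *_NSGetArgc(void);
extern char ***_NSGetArgv(void);
#endif

/* Returns true if flag appears among argv[1] .. argv[argc - 1].
   (Hypothetical helper; argv[0] is skipped because it's the program name.) */
bool args_contain(int argc, char **argv, const char *flag) {
    assert(flag != NULL);
    for (int i = 1; i < argc; ++i) {
        if (strcmp(argv[i], flag) == 0)
            return true;
    }
    return false;
}

#ifdef __APPLE__
/* Same check, but against the arguments the process was started with.
   No argc/argv needs to be threaded through the call chain. */
bool process_has_flag(const char *flag) {
    return args_contain(*_NSGetArgc(), *_NSGetArgv(), flag);
}
#endif
```

With that in place, some deeply nested function on the Mac could call process_has_flag("foo") without main() ever handing its arguments down.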

Here are a few similar functions that may be interesting.
$ nm /usr/lib/libSystem.B.dylib | grep _NSGet | grep ' T '
9002a55f T _NSGetNextSearchPathEnumeration
9003a382 T __NSGetArgc
90020f60 T __NSGetArgv
90003074 T __NSGetEnviron
90029e2d T __NSGetMachExecuteHeader
90027506 T __NSGetProgname
9014aa84 T _NSGetSectionDataInObjectFileImage
90036106 T __NSGetExecutablePath
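
Here's a small sketch that tries three of these. It assumes Apple's <crt_externs.h> (which declares _NSGetProgname() and _NSGetEnviron()) and <mach-o/dyld.h> (which declares _NSGetExecutablePath()). The wrapper names program_name(), environment_count(), and executable_path() are my own, and on anything that isn't a Mac the wrappers just report failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __APPLE__
#include <crt_externs.h>  /* _NSGetProgname, _NSGetEnviron */
#include <mach-o/dyld.h>  /* _NSGetExecutablePath */
#endif

/* Program name as recorded at startup, or NULL when not on a Mac. */
const char *program_name(void) {
#ifdef __APPLE__
    return *_NSGetProgname();
#else
    return NULL;
#endif
}

/* Number of entries in the process environment, or -1 when not on a Mac. */
int environment_count(void) {
#ifdef __APPLE__
    char **env = *_NSGetEnviron();
    int n = 0;
    while (env[n] != NULL)
        ++n;
    return n;
#else
    return -1;
#endif
}

/* Copies the executable's path into buf, which holds size bytes.
   Returns 0 on success, -1 if buf is too small or we're not on a Mac. */
int executable_path(char *buf, uint32_t size) {
    assert(buf != NULL);
#ifdef __APPLE__
    return _NSGetExecutablePath(buf, &size);
#else
    (void)size;
    return -1;
#endif
}
```

Like _NSGetArgc() and _NSGetArgv(), the first two hand back pointers to the values, so they need a dereference. _NSGetExecutablePath() is the odd one out: it fills a buffer you supply.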

Tuesday, July 25, 2006

The Singleton Smell

I'm a big fan of using object-oriented design patterns, especially the classics popularized by the GoF. Design patterns are the OO analog of algorithms. Just like a good software engineer needs to be knowledgeable about data structures and algorithms, knowledge of OO patterns is a requirement in today's OO software world (disclaimer: the previous statement is only my opinion). And just like bubble sort can be misapplied to solve a problem, OO patterns can also be used inappropriately.

One of the most (if not the most) misused patterns is the Singleton. The singleton can be very useful when applied correctly, but it can also make for some poorly designed and almost unmaintainable software. The singleton is probably the easiest pattern to understand from the GoF's Design Patterns book, which may be why it's so often misused by engineers who are new to patterns. In a nutshell, the singleton pattern attempts to ensure that only one instance of a class is ever created. Many callers may use the class; they just end up using the same instance.

Code that uses a lot of singletons gives off a distinct smell, a smell that a software engineer should recognize. It's similar to, but more potent than, the smell given off by global variables, because a singleton is effectively a global variable. Singletons are generally global in scope, thus allowing any class, at any level, access to the singleton (read: global variable). This makes for classes that are tightly coupled.

Additionally, singletons let you design classes (or more (in)appropriately, "implementations") without having to think about the class's interface, or how they interact with other classes. The singleton doesn't need to be an argument to the class's constructor, or to methods, so it's often not considered when thinking through the object model for your code. This is similar to the way that C functions don't need to declare global variables in their parameter list, because again, they're globals, so the function's interface doesn't need to consider them. And I think we'd probably all agree that global variables are generally not the best idea.

Singletons are often misused in situations where you have multiple classes that all need to access the same instance of an object. In this case, the singleton is used as a convenience to let all the classes access this one central global variable. It would likely be a better idea to think about the classes' interfaces, possibly add an extra parameter here and there, and avoid the singleton altogether. Now, not only do the classes communicate interface to interface, but you may also have a new reusable class (the one that used to be a singleton)!

It's generally possible (and usually a good idea) to replace singletons with non-singletons, but it often requires a little extra thought. But this is a good thing and it's one of the best reasons to get rid of singletons. Once you think through your classes' interfaces, you'll likely discover that you can loosen the coupling of your classes by pushing their interaction out to their interfaces rather than leaving it buried down in their implementations. This also allows you to add documentation about this interaction in the class's *interface* rather than code comments in its implementation.

Singletons can also make unit testing difficult. If class A uses the singleton class B in A's implementation, then unit testing A requires that B be all set up and able to run correctly. However, if A's constructor were changed to take a B as an argument, then A's unit test could simply create a mock B object, and just focus on the testing of A (which is what a unit test is supposed to do).
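
Here's a rough sketch of that idea in plain C, where a struct of function pointers plays the role of B's interface. Every name here (Clock, Session, fake_clock(), and so on) is hypothetical and only meant to illustrate the shape of the thing: A (the Session) is handed its B (the Clock) at creation time instead of reaching for a global instance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* B: the dependency, expressed as an interface. */
typedef struct Clock {
    long (*now)(void *ctx);  /* returns the current time in seconds */
    void *ctx;               /* state private to the implementation */
} Clock;

/* A: receives its Clock through its "constructor". */
typedef struct Session {
    Clock clock;
    long started_at;
} Session;

Session session_create(Clock clock) {
    assert(clock.now != NULL);
    Session s = { clock, clock.now(clock.ctx) };
    return s;
}

/* True once more than max_age seconds have passed since creation. */
bool session_expired(const Session *s, long max_age) {
    return s->clock.now(s->clock.ctx) - s->started_at > max_age;
}

/* The real B, backed by the system clock. */
static long system_now(void *ctx) {
    (void)ctx;
    return (long)time(NULL);
}

Clock system_clock(void) {
    Clock c = { system_now, NULL };
    return c;
}

/* A mock B for unit tests: the time is whatever the test says it is. */
static long fake_now(void *ctx) {
    return *(long *)ctx;
}

Clock fake_clock(long *now) {
    Clock c = { fake_now, now };
    return c;
}
```

A unit test for Session can now pass in fake_clock(&now), bump now forward, and check session_expired() without waiting on, or setting up, the real clock. The dependency also shows up right in session_create()'s parameter list, where it can be documented.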

Now, singletons aren't always evil. They can be very useful in some situations. I won't go into those examples now, but I just want to be on record as having said that they are not always bad. The main point here is, do not jump to a singleton solution just for convenience. Singletons are convenient because they can be accessed from anywhere (think, global variables), but this should be avoided in favor of thinking through class interactions and making your classes communicate via their interfaces. The singleton pattern should be used when you really need to ensure that only one instance of a class is ever created. And don't use a singleton without fully understanding why you really need one.

So, please take a big whiff of your code. Do you smell a lot of singletons? If so, consider refactoring it or make sure you understand why you actually need them. The extra thought you put into your class design now will pay back double when it comes time for maintenance.

Uh, sorry this was a little out of order and jumbled... I just wanted to jot down a few thoughts that were floating around my head on the way home tonight.

Thursday, July 20, 2006

Command Line Processing in Cocoa

Why would you want to process the command line in Cocoa? I mean, Cocoa's all GUI and other cool stuff, right? Well, yes... but there's really nothing quite as cool as the command line, is there? Good, so let's get to it.

The typical ways to parse command line arguments on Unix systems are to use either getopt(), getopt_long(), or just parse argv yourself. Well now Cocoa (actually, it's Foundation that provides this) offers an even easier alternative. Enter NSUserDefaults. Let's just jump right to a quick example.

// File: args.m
// Compile with: gcc -o args args.m -framework Foundation
#import <Foundation/Foundation.h>

int main(int argc, char *argv[]) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    // The standard user defaults object parses the command line for us
    NSUserDefaults *args = [NSUserDefaults standardUserDefaults];

    NSLog(@"boolArg = %d", [args boolForKey:@"boolArg"]);
    // The cast keeps %ld correct whether integerForKey: returns int or NSInteger
    NSLog(@"intArg = %ld", (long)[args integerForKey:@"intArg"]);
    NSLog(@"floatArg = %f", [args floatForKey:@"floatArg"]);
    NSLog(@"stringArg = %@", [args stringForKey:@"stringArg"]);

    [pool release];
    return 0;
}

First we get access to the standard user defaults object, which will actually do all the parsing of the command line. Then we just access command line arguments as if they were stored in the user defaults object. Here are a few sample runs:

$ ./args
2006-07-20 22:24:52.996 args[21633] boolArg = 0
2006-07-20 22:24:52.997 args[21633] intArg = 0
2006-07-20 22:24:52.997 args[21633] floatArg = 0.000000
2006-07-20 22:24:52.997 args[21633] stringArg = (null)

$ ./args -intArg 18
2006-07-20 22:25:41.923 args[21640] boolArg = 0
2006-07-20 22:25:41.923 args[21640] intArg = 18
2006-07-20 22:25:41.923 args[21640] floatArg = 0.000000
2006-07-20 22:25:41.923 args[21640] stringArg = (null)

$ ./args -intArg 18 -stringArg "foo bar" -floatArg 3.14159 -boolArg YES
2006-07-20 22:26:15.129 args[21644] boolArg = 1
2006-07-20 22:26:15.129 args[21644] intArg = 18
2006-07-20 22:26:15.129 args[21644] floatArg = 3.141590
2006-07-20 22:26:15.129 args[21644] stringArg = foo bar


Arguments are case-sensitive, they can be specified in any order, and in general, NSUserDefaults is pretty smart about processing them. For example, a true bool value can be specified as YES, Y, y, 1, 123, etc., whereas a false bool value can be NO, no, n, 0, etc. Also, note that arguments are specified with a single leading - rather than the -- that is typical of most "long" Unix command line options.
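
To make the "-key value" convention concrete, here's a rough C sketch of doing the same kind of lookup by hand. This is only an approximation for illustration, not how NSUserDefaults is implemented. The names arg_value() and arg_bool() are made up, and the bool rule is my guess that covers just the spellings listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Returns the value that follows "-key" in argv, or NULL if it's absent.
   The comparison is case-sensitive and the order of arguments doesn't matter. */
const char *arg_value(int argc, char **argv, const char *key) {
    assert(key != NULL);
    for (int i = 1; i + 1 < argc; ++i) {
        if (argv[i][0] == '-' && strcmp(argv[i] + 1, key) == 0)
            return argv[i + 1];
    }
    return NULL;
}

/* Assumed rule: values starting with Y or y, or parsing as a nonzero
   integer, are true; everything else, including a missing key, is false. */
bool arg_bool(int argc, char **argv, const char *key) {
    const char *value = arg_value(argc, argv, key);
    if (value == NULL)
        return false;
    if (value[0] == 'Y' || value[0] == 'y')
        return true;
    return strtol(value, NULL, 10) != 0;
}
```

That's already a fair amount of code for two types, which is a good argument for letting NSUserDefaults do the work.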

I haven't verified this, but I'd imagine that NSUserDefaults accesses argv via the NSProcessInfo class. It also leaves the argv that's passed to main unharmed in case you want to do any additional processing.

Tuesday, July 18, 2006

What I'm reading: Mac OS X Internals

It's been a while so I figured I should post something. It'd be nice if I blogged as often as I read books, so I think I'll try to make "What I'm Reading" posts and talk about some of the tech books that I'm reading.

Anyway, I just finished Amit Singh's book Mac OS X Internals, and it's fantastic. It's great. It's packed full of awesome technical details and it reads very well. You can check out the book and my review of it at Amazon.