Tuesday, December 26, 2006

Lists and Queues

Linked lists are often among the first data structures we learn about in computer science. They're simple in concept as well as implementation; an implementation could be developed in the time allotted to a technical interview, for example. However, computer scientists and engineers are often lazy, meaning that they don't want to do more work than is needed. Thankfully, with 4.4BSD came the sys/queue.h header file, which contains many useful functions for dealing with singly- and doubly-linked lists, tail queues, and circular queues. These functions (actually, they are implemented as C macros) are described in the QUEUE(3) man page.

In this post, we'll take a quick look at a few of the functions required to use these structures. We'll work through a small (and virtually useless) example that uses a doubly-linked list to store even numbers. The first thing we'll do is focus on the central data structure that we want to be "linked" together. In our example, we will create a simple data structure for holding one even number.

struct EvenNumber {
    int val;
};


In order to add this structure to a list, we need to modify it slightly by adding an entry for holding pointers to the next and previous EvenNumber entries.

struct EvenNumber {
    int val;
    LIST_ENTRY(EvenNumber) numbers;
};


LIST_ENTRY is a macro that expands to a structure holding the pointers for the next and previous entries. We can see this by having gcc display the output from the preprocessor. (We won't look at the preprocessor output at every step in this post, but studying the output of gcc -E on your own can be very enlightening.)

$ gcc -E list.c | grep "int val" -A2 -B1
struct EvenNumber {
int val;
struct { struct EvenNumber *le_next; struct EvenNumber **le_prev; } numbers;
};


OK, we've now modified our EvenNumber structure so that it can be inserted into a list simply by adding a LIST_ENTRY field to our structure. Next, we need a way to reference the list as a whole. We do this using the LIST_HEAD macro that declares a new structure which holds a pointer to the first element in the list.

LIST_HEAD(EvenNumbersHead, EvenNumber) even_numbers_head;


The preprocessor will expand this to the following (reformatted for readability).

struct EvenNumbersHead {
    struct EvenNumber *lh_first;
} even_numbers_head;


We can see that the first argument to LIST_HEAD is the name of the new structure that we're creating, and the second argument is the name of the structure that will be linked together in the list.

Before we can use the list, we need to initialize the head by calling LIST_INIT with a pointer to our "head" instance, which effectively creates an empty list.

LIST_INIT(&even_numbers_head);


At this point we have a doubly-linked list that is referenced by the variable even_numbers_head. We may add to the list instances of the EvenNumber structure using the macros LIST_INSERT_HEAD, LIST_INSERT_BEFORE, or LIST_INSERT_AFTER.
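LIST_INSERT_HEAD takes the list head, while LIST_INSERT_BEFORE and LIST_INSERT_AFTER take an element that is already in the list. Here is a minimal sketch of all three working together. The helpers build_in_order() and list_values() are hypothetical names made up for this illustration, and the nodes are supplied by the caller so that no allocation is involved.

```c
#include <stddef.h>
#include <sys/queue.h>

struct EvenNumber {
    int val;
    LIST_ENTRY(EvenNumber) numbers;   /* next/previous links */
};

/* Declares struct EvenNumbersHead without defining a variable of that type. */
LIST_HEAD(EvenNumbersHead, EvenNumber);

/* Hypothetical helper: links four caller-provided nodes so that the list
 * reads 0, 2, 4, 6, using each of the three insert macros. */
static void build_in_order(struct EvenNumbersHead *head, struct EvenNumber node[4])
{
    node[0].val = 0;
    node[1].val = 2;
    node[2].val = 4;
    node[3].val = 6;

    LIST_INIT(head);
    LIST_INSERT_HEAD(head, &node[2], numbers);        /* list: 4       */
    LIST_INSERT_BEFORE(&node[2], &node[0], numbers);  /* list: 0 4     */
    LIST_INSERT_AFTER(&node[0], &node[1], numbers);   /* list: 0 2 4   */
    LIST_INSERT_AFTER(&node[2], &node[3], numbers);   /* list: 0 2 4 6 */
}

/* Hypothetical helper: copies the values, in list order, into out (at most
 * cap entries) and returns how many were copied. */
static size_t list_values(struct EvenNumbersHead *head, int *out, size_t cap)
{
    size_t n = 0;
    struct EvenNumber *np;

    LIST_FOREACH(np, head, numbers) {
        if (n == cap)
            break;
        out[n++] = np->val;
    }
    return n;
}
```

Calling build_in_order() on a head and an array of four nodes, then list_values(), should yield 0, 2, 4, 6. Note that LIST_INSERT_BEFORE needs no reference to the head, which is what the le_prev pointer in the expanded structure makes possible.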

The basic structure of our program is outlined here.

struct EvenNumber {
    int val;
    LIST_ENTRY(...) numbers;
};

LIST_HEAD(...) even_numbers_head;

int main(void) {
    LIST_INIT(&even_numbers_head);

    for (int i = 0; i < 100; i++) {
        ...
        LIST_INSERT_HEAD(&even_numbers_head, ...);
    }

    LIST_FOREACH(..., &even_numbers_head, numbers) {
        ...
        LIST_REMOVE(theNum, numbers);
        ...
    }

    return 0;
}
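One way the elided parts of the outline might be filled in is sketched below. It is not the list.c from this post: push_evens() and drain() are hypothetical helper names, and instead of calling LIST_REMOVE inside LIST_FOREACH the sketch pops from the head. That choice is deliberate, because LIST_FOREACH reads the current node's next pointer after the loop body runs, so freeing the node inside the loop would leave it reading freed memory.

```c
#include <stdlib.h>
#include <sys/queue.h>

struct EvenNumber {
    int val;
    LIST_ENTRY(EvenNumber) numbers;
};

LIST_HEAD(EvenNumbersHead, EvenNumber);

/* Hypothetical helper: pushes every even number in [0, limit) onto the
 * head of the list. Returns 0 on success, -1 if malloc fails. */
static int push_evens(struct EvenNumbersHead *head, int limit)
{
    for (int i = 0; i < limit; i += 2) {
        struct EvenNumber *n = malloc(sizeof(*n));
        if (n == NULL)
            return -1;
        n->val = i;
        LIST_INSERT_HEAD(head, n, numbers);
    }
    return 0;
}

/* Hypothetical helper: removes and frees every entry, front to back.
 * Up to cap values are copied into out in removal order; the return
 * value is the total number of entries removed. */
static size_t drain(struct EvenNumbersHead *head, int *out, size_t cap)
{
    size_t count = 0;
    struct EvenNumber *n;

    while ((n = LIST_FIRST(head)) != NULL) {
        if (count < cap)
            out[count] = n->val;
        count++;
        LIST_REMOVE(n, numbers);   /* unlink first ...            */
        free(n);                   /* ... then release the memory */
    }
    return count;
}
```

After LIST_INIT on a head, push_evens(&head, 12) followed by drain() should hand back 10, 8, 6, 4, 2, 0, the same reverse order discussed in this post.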


The full source for this example can be seen in list.c. We can compile and run the program as follows.

$ gcc -o list list.c -std=c99 -Wall
$ ./list
[...output omitted...]
10
8
6
4
2
0


We see that the output is in reverse order. This is because we are using LIST_INSERT_HEAD, which adds the new entry at the beginning of the list. If we want to display the list in increasing order (ignoring that we could simply pipe the output through sort -n) we could add each new entry to the end (or tail) of the list rather than the head. However, for this we should use a different data structure: a TAILQ. TAILQs work very similarly to the LISTs already discussed, so here's the full program source using a TAILQ rather than a LIST.
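As a rough idea of what the TAILQ variant looks like, here is a minimal sketch (not the original listing). The helper names append_evens(), copy_values(), and free_all() are hypothetical. The main differences from the LIST version are the TAILQ_ prefix, TAILQ_INSERT_TAIL, and the fact that TAILQ_REMOVE also needs the head.

```c
#include <stdlib.h>
#include <sys/queue.h>

struct EvenNumber {
    int val;
    TAILQ_ENTRY(EvenNumber) numbers;
};

TAILQ_HEAD(EvenNumbersHead, EvenNumber);

/* Hypothetical helper: appends every even number in [0, limit) to the
 * tail of the queue. Returns 0 on success, -1 if malloc fails. */
static int append_evens(struct EvenNumbersHead *head, int limit)
{
    for (int i = 0; i < limit; i += 2) {
        struct EvenNumber *n = malloc(sizeof(*n));
        if (n == NULL)
            return -1;
        n->val = i;
        TAILQ_INSERT_TAIL(head, n, numbers);
    }
    return 0;
}

/* Hypothetical helper: copies up to cap values into out, head to tail,
 * and returns how many were copied. */
static size_t copy_values(struct EvenNumbersHead *head, int *out, size_t cap)
{
    size_t count = 0;
    struct EvenNumber *n;

    TAILQ_FOREACH(n, head, numbers) {
        if (count == cap)
            break;
        out[count++] = n->val;
    }
    return count;
}

/* Hypothetical helper: unlinks and frees every entry. */
static void free_all(struct EvenNumbersHead *head)
{
    struct EvenNumber *n;

    while ((n = TAILQ_FIRST(head)) != NULL) {
        TAILQ_REMOVE(head, n, numbers);   /* unlike LIST_REMOVE, takes the head */
        free(n);
    }
}
```

With TAILQ_INIT on the head and append_evens(&head, 12), walking the queue should give 0, 2, 4, 6, 8, 10, so no sort -n is needed.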

For more information about these convenient functions, see the comments in /usr/include/sys/queue.h as well as the QUEUE(3) man page. Also note that these functions are available (and heavily used) within the kernel. However, when used in the kernel the header file to include is /System/Library/Frameworks/Kernel.framework/Headers/sys/queue.h.

...now, to see if my wife is finished getting ready yet...

Friday, December 15, 2006

A Kernel Extension... by hand

I generally like using Xcode. It typically does the job, and usually even makes the job easier. It shields the user from having to worry about mundane details that are all too common when building software. However, for the same reason it can increase a developer's productivity, it can also make it more difficult to understand what's actually going on. For example, Xcode has a template to create a Generic Kernel Extension. The template compiles with no modifications. The resulting kernel extension (KEXT) can then be loaded and unloaded without having to write one line of code. This is super cool, but it also hides some of the details of what a KEXT really is. So, today we'll write a "Hello, World" KEXT from scratch without the help of Xcode.

(Note that the issues above are not unique to Xcode; they apply to almost all IDEs.)

Well, let's just jump right in. Basically, a KEXT is a bundle on Mac OS X (it's also a "package"), which means it is a directory structure with some predefined form and an Info.plist file. Our sample KEXT will be named "MyKext.kext", and it will be in a directory structure that looks like this:

$ find MyKext.kext
MyKext.kext
MyKext.kext/Contents
MyKext.kext/Contents/Info.plist
MyKext.kext/Contents/MacOS
MyKext.kext/Contents/MacOS/MyKext


To start off, we need to make the basic directory structure.
$ cd
$ mkdir -p MyKext.kext/Contents/MacOS
$ cd MyKext.kext/Contents/MacOS


Now we can start writing our code. Our code will contain two main routines: MyKextStart() and MyKextStop(), which are called when the KEXT is loaded and unloaded respectively. It will also contain some required bookkeeping code that's needed in order to make our compiled binary proper. Our start and stop routines look like:
// File: mykext.c
#include <libkern/libkern.h>
#include <mach/mach_types.h>

kern_return_t MyKextStart(kmod_info_t *ki, void *d) {
    printf("Hello, World!\n");
    return KERN_SUCCESS;
}

kern_return_t MyKextStop(kmod_info_t *ki, void *d) {
    printf("Goodbye, World!\n");
    return KERN_SUCCESS;
}

... more to come in a minute


After these two functions (in the same mykext.c file) we need to put the required bookkeeping stuff.
extern kern_return_t _start(kmod_info_t *ki, void *data);
extern kern_return_t _stop(kmod_info_t *ki, void *data);

KMOD_EXPLICIT_DECL(net.unixjunkie.kext.MyKext, "1.0.0d1", _start, _stop)
__private_extern__ kmod_start_func_t *_realmain = MyKextStart;
__private_extern__ kmod_stop_func_t *_antimain = MyKextStop;
__private_extern__ int _kext_apple_cc = __APPLE_CC__;


This stuff basically declares some needed structures, and it also sets up our routines (MyKextStart() and MyKextStop()) so that they're called on load and unload, by assigning them to the _realmain and _antimain symbols respectively.

OK, now comes the tricky part: the compile. KEXTs are compiled statically, they can only use certain headers that are available in the kernel, and they can't link with the standard C library. These requirements basically translate into a gcc command like the following:

$ gcc -static mykext.c -o MyKext -fno-builtin -nostdlib -lkmod -r -mlong-branch -I/System/Library/Frameworks/Kernel.framework/Headers -Wall


If the planets are properly aligned, you won't get any errors or warnings, and you'll end up with a Mach-O object file in the current directory named MyKext. This is your actual compiled KEXT. (At this point, it's fun to inspect this file using otool. For example, otool -hV MyKext, and otool -l MyKext. Read the man page for otool(1) for more details here.)

Now, the last thing we need to do (before we actually load this thing up) is to give our KEXT an Info.plist. The easiest way to do this is to copy another KEXT's Info.plist file, and change the names of a few things. For this example, I'm going to copy /System/Library/Extensions/webdav_fs.kext/Contents/Info.plist.
$ cd ..
$ pwd
/Users/jgm/MyKext.kext/Contents
$ cp /System/Library/Extensions/webdav_fs.kext/Contents/Info.plist .


Now, you'll need to edit the file and change the value of the "CFBundleExecutable" key to MyKext, and the value of "CFBundleIdentifier" to net.unixjunkie.kext.MyKext (or whatever you set that value to in your mykext.c file).

Okay, it's show time. To load any KEXT, all files in the KEXT must be owned by root and be in group wheel. The files must also have certain permissions in order to load. Here are the steps to load the KEXT.

$ cd /tmp
$ sudo -s
# cp -rp ~/MyKext.kext .
# chown -R root:wheel MyKext.kext
# chmod -R 0644 MyKext.kext
# kextload -v MyKext.kext
kextload: extension MyKext.kext appears to be valid
kextload: loading extension MyKext.kext
kextload: sending 1 personality to the kernel
kextload: MyKext.kext loaded successfully
# tail -1 /var/log/system.log
Dec 15 20:15:47 jgm-mac kernel[0]: Hello, World!


We can see that our MyKextStart() was called. Now let's unload it and see what happens.

# kextunload -v MyKext.kext
kextunload: unload kext MyKext.kext succeeded


Wahoo! It looks like we made a kernel extension with our own 10 fingers, and it worked! :-)

That was fun. Check out Amit Singh's Mac OS X Internals book for more cool bits about what that required "bookkeeping" stuff was in our source file, and why it's required.

Monday, November 20, 2006

Too Much Comment Spam Lately

Hey folks. I started getting tons of comment spam on this blog lately, so I had to disable the comments. Requiring a valid blogger login didn't help, nor did a captcha. Hopefully, I'll be able to turn them back on shortly.

In the meantime, if anyone has a burning comment that they can't hold on to, you can email me at:

$(echo tert.havkwhaxvr@arg | tr a-z@. n-za-m.@)

Monday, October 30, 2006

UPDATE: The char *apple[] Argument Vector

Way back in February I posted about the char *apple[] argument vector, i.e. the "secret" 4th parameter to all binaries executed on Mac OS X. I just wanted to update that post with one small piece of information.

The value of apple[0] isn't always the path to the executed binary image on disk. Specifically, if a symlink points to the file, apple[0] will refer to the symlink. For example:

$ cat -n apple.c
1 #include <stdio.h>
2 int main(int argc, char *argv[], char *envp[], char *apple[]) {
3 printf("apple[0] = %s\n", apple[0]);
4 return 0;
5 }
$ gcc -o apple apple.c
$ ./apple
apple[0] = ./apple
$ ln -s apple foo
$ ./foo
apple[0] = ./foo


Ideally, what we'd like is the path to the executing binary image, even if it was executed via a symlink. We can do this using apple[0] and the function realpath(3) to resolve all the symlinks. For example:

$ cat -n apple.c 
1 #include <sys/param.h>
2 #include <stdlib.h>
3 #include <stdio.h>
4 int main(int argc, char *argv[], char *envp[], char *apple[]) {
5 char resolved[PATH_MAX];
6 realpath(apple[0], resolved);
7 printf("resolved apple[0] = %s\n", resolved);
8 return 0;
9 }
$ gcc -o apple apple.c
$ ln -s apple foo
$ ./apple
resolved apple[0] = /Users/jgm/apple
$ ./foo
resolved apple[0] = /Users/jgm/apple


That's about it.

Saturday, October 14, 2006

Google Calculator from the Command Line

I'm not sure if you're aware of Google's built-in calculator, but it's totally awesome. It does all sorts of basic calculations, almost any unit conversion you can dream up, and for you programmers, it's great for doing base conversions. But opening a web browser to www.google.com just to make sure you converted 0x34 to decimal in your head correctly isn't always convenient (this is of course assuming that no other calculators exist, such as bc ;-) ).

So, I threw together a very simple command line interface to Google Calculator -- aptly named gcalc. It's also a good example of how Cocoa's powerful classes can let you whip up useful tools quickly, much like a scripting language. The help screen shows some example usages:

$ gcalc
gcalc version 0.1 by Greg Miller

Usage: gcalc [-d] <calculator query>

example: gcalc "5+2*2"
example: gcalc 5!
example: gcalc "sqrt(-4)"
example: gcalc "160 pounds * 4000 feet in calories"
example: gcalc avogadros number
example: gcalc 0b110111010 + 0x33 in decimal
example: gcalc 22 lira in yen
example: gcalc 2 to the power of 5


You can download the source and a prebuilt universal binary from the Google code project page at http://code.google.com/p/uj-gcalc/

Or if you just want to glance at the source, you can check it out here:

Saturday, October 07, 2006

LaunchServices From a root Daemon?

At the end of this post I mention that I'd like to find a way to start a process in the console user's session, from a root process in the startup item context. I do not know of a documented way to do this, but to be clear, there is currently (at least as of 10.4.8) a way to do it simply using LaunchServices.

LaunchServices is the recommended way to launch an application on OS X. When you click an item in the Dock, it's launched using LaunchServices. When you double-click on a PDF file, it is LaunchServices that figures out that Preview.app is the best candidate to handle that file type (because you certainly don't have Adobe Acrobat installed), opens Preview.app, and tells it to open said file.

LaunchServices is a sub-framework of the ApplicationServices umbrella framework, which is not daemon safe according to TN2083. "Daemon safe" means that daemon processes running in the root bootstrap context (aka startup item context) are allowed to link with and use the framework. One reason a framework would not be daemon safe is if it uses the WindowServer process.

The WindowServer is not only in charge of managing windows on screen, it's also intimately involved with process management (with some help from the loginwindow process). As a matter of fact, when you launch an application it will ultimately be the WindowServer that does a fork() and exec() to start your process. You can use ps to see that the WindowServer is indeed the parent for most of your processes.

$ ps jaxww | grep WindowServe[r]
windowse 59 1 59 .../CoreGraphics.framework/Resources/WindowServer -daemon
$ ps jx | awk '{if ($3 == 59) print}'
jgm 132 59 ... /System/Library/CoreServices/Dock.app/Contents/MacOS/Dock -psn_0_393217
jgm 134 59 ... /System/Library/CoreServices/SystemUIServer.app/Contents/MacOS/SystemUIServer -psn_0_524289
jgm 136 59 ... /System/Library/CoreServices/Finder.app/Contents/MacOS/Finder -psn_0_655361
jgm 139 59 ... /Applications/Google Notifier.app/Contents/MacOS/Google Notifier -psn_0_786433
jgm 197 59 ... /Applications/Utilities/Terminal.app/Contents/MacOS/Terminal -psn_0_1572865
jgm 257 59 ... /Applications/Safari.app/Contents/MacOS/Safari -psn_0_2490369

(Note that if you look at the Parent Process field in Activity Monitor, it will incorrectly report the Dock as the parent of processes that were launched by clicking their dock icon. You can verify that this is wrong by looking at the ps output.)

OK, back on topic. So we know the WindowServer is important to launching processes. Actually when we launch a process in typical Cocoa fashion using the NSWorkspace class, LaunchServices is invoked behind the scenes to send a message to the WindowServer via mach messages, requesting that the WindowServer fork() up a new process in the console user's session. So in order to use LaunchServices (or NSWorkspace), we need to be able to communicate with the WindowServer, which is the reason why LaunchServices (and therefore ApplicationServices and therefore Cocoa) is not "daemon safe".

However, as TN2083 points out, there is a "global window server" reference available in the startup item context. This window server reference appears to be an artifact from the past as Quinn explains in the Technote:

The reasons for this non-obvious behavior are lost in the depths of history. However, the fact that this works at all is pretty much irrelevant because there are important caveats that prevent it from being truly useful.

But the fact remains that there is a reference, and we can see it with the BootstrapDump command.
$ sudo /usr/libexec/StartupItemContext ~/bin/BootstrapDump  | grep Window
"com.apple.windowserver" by "/System/Library/.../Resources/WindowServer"


However, the only users allowed to connect to the WindowServer process are root and the console user. (The console user is the "current" user in a fast-user-switched environment. The console user is the owner of the /dev/console device.) But if we're running as root we should be able to connect to the window server using standard LaunchServices calls. Let's try.

$ cat -n launch.m
1 #import <Cocoa/Cocoa.h>
2 int main(void) {
3 [[NSWorkspace sharedWorkspace]
4 launchApplication:@"TextEdit"];
5 return 0;
6 }
$ gcc -o launch launch.m -framework Cocoa
$ sudo /usr/libexec/StartupItemContext ./launch

This will start TextEdit in my current user session. We could use BootstrapDump again to verify that the TextEdit process is indeed running in my session (it is). The command /usr/bin/open links against Cocoa, and works the same way, so I can simply type sudo /usr/libexec/StartupItemContext /usr/bin/open -a TextEdit to do the same thing.

As I mentioned above, only the console user and root are allowed to connect to the WindowServer process, so if your daemon is running as nobody (for example), it won't be able to do this.

$ sudo /usr/libexec/StartupItemContext /usr/bin/sudo -u nobody -s
$ open -a TextEdit
kCGErrorRangeCheck : Window Server communications from outside of session allowed for root and console user only
INIT_Processeses(), could not establish the default connection to the WindowServer.Abort trap

(The first command above starts a shell running as user nobody in the startup item context.)

So, we can see that from the startup item context, as root, we are able to launch a process in the console user's session, simply using normal LaunchServices calls. But note that this is undocumented behavior, and it is not guaranteed to work at any point in the future. It merely worked in this example.

Friday, October 06, 2006

Finder's Locum

Jumping right in, let's consider this little example:

$ mkdir -p foo/bar
$ sudo chown -R root:wheel foo
Password:
$ rm -rf foo
rm: foo/bar: Permission denied
rm: foo: Directory not empty


We create two directories, foo/ and a subdir bar/. We change both of these directories to be owned by root and in group wheel. Then, as a non-root user, we try to recursively delete foo/, and not too surprisingly it fails.

Notice that an error is displayed for foo/bar before the error for foo/. This is because the system call to remove a directory -- rmdir(2) -- requires the directory to be empty before it can be removed. This means that directory hierarchies are removed in a depth-first order. In order for foo/ to be removed, it must be empty, so to make it empty we must remove foo/bar/, etc.

As a quick aside, removing a file (or directory) in Unix does not require write permission to the file! Let me repeat that. You can remove a file if you have write access to the directory in which the file resides -- even if root owns the file. Quick example:
$ mkdir test
$ cd test
$ touch hi.txt
$ sudo chown root:wheel hi.txt
Password:
$ ls -al
total 0
drwxr-xr-x 3 jgm jgm 102 Oct 6 23:13 ./
drwxr-xr-x 31 jgm jgm 1054 Oct 6 23:13 ../
-rw-r--r-- 1 root wheel 0 Oct 6 23:13 hi.txt
$ rm hi.txt
$ ls -al
total 0
drwxr-xr-x 2 jgm jgm 68 Oct 6 23:15 ./
drwxr-xr-x 31 jgm jgm 1054 Oct 6 23:13 ../


This shows that I can delete a root-owned file as long as I have write access to the directory. (See chmod(2) for details about how the sticky bit on a directory affects this behavior [this is why shared tmp directories generally have the sticky bit set].)
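The same point can be made without sudo. The sketch below does not reproduce the root-owned case; instead it creates a file that nobody has write permission to and then removes it, which works for the same reason: unlink(2) modifies the directory, not the file. unlink_readonly_file() is a hypothetical helper name.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: creates path with mode r--r--r-- and then removes
 * it again. Returns 0 if the unlink succeeded, -1 on any failure. */
static int unlink_readonly_file(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0444);
    if (fd < 0)
        return -1;
    close(fd);

    /* The file is not writable by anyone, yet this still succeeds as long
     * as we have write (and search) permission on the parent directory. */
    return unlink(path);
}
```

Calling it on a fresh name in a directory you can write to should return 0, and the file should be gone afterwards.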

Thinking back to the original issue, this also explains why we were unable to rm -rf foo. foo/bar/ needed to be deleted first, but deleting it requires write access to its parent directory, foo/, which was owned by root and not writable by us. So, all that makes sense now.

Interestingly, if we use the Finder and drag the folder foo/ to the trash, we are able to empty the trash. We're not prompted for a password, it just works. How does the Finder do this?

Well, the Finder application links with a private Apple framework named DesktopServicesPriv.framework, which helps with taking out the trash. Bundled as a resource in the framework is a setuid root binary named Locum, which the Finder uses to delete files that it normally wouldn't have access to delete.

$ cd /System/Library/PrivateFrameworks/DesktopServicesPriv.framework/
$ ls -l Resources/Locum
-rwsr-xr-x 1 root wheel 108940 Mar 26 2006 Resources/Locum*


And we can watch what happens when we drag foo/ to the trash then empty it, using fs_usage.
$ sudo fs_usage | grep Locum
...
23:33:48 rmdir /.vol/234881026/2148678/bar 0.000303 Locum
23:33:48 rmdir /.vol/234881026/2147406/foo 0.000264 Locum
...


Now the question is, is this secure and safe? Well, it's probably fine. I imagine that Apple has rigorously tested and reviewed Locum. Conceivably, it has smarts to guarantee that it only deletes things out of a .Trashes folder. It possibly even does a little handshake to guarantee that its parent is a Finder process. This is all speculation, but the point is, I don't immediately see any big holes here (though this is very different than the behavior one would find on a typical Unix system).

I think the only remaining question is, what the hell does Locum mean? Thankfully we have Wikipedia to help us out: http://en.wikipedia.org/wiki/Locum.

Friday, September 29, 2006

User Notification from the Startup Context

Mac OS X is a great Unix, but since it's a combination of Mach and BSD it has some parts that are new to many traditional Unix users. All Unix systems have the concept of sessions with regard to collections of processes (man getsid(2) for more details), but Mac OS X adds an additional, and very different, concept of a session.

For a quick introduction to sessions on Mac OS X, see TN2083, and for the best description see Amit Singh's Mac OS X Internals book (chapters 5 & 9). In a nutshell, when the system first boots it has one session -- the root session, or the startup context. All processes started in this session will themselves be in this session. launchd, syslogd, configd, and all system daemons run in the startup context. launchd starts loginwindow.app to handle user logins, but the loginwindow app also creates a new session for each user as they log in. So, every user on the system runs in their own session, and every user's session is different from the startup context (a user's session is also different each time they log in).

Apple provides a sample program called BootstrapDump that will show you all the mach services that are visible in a given context. For example, we can download and compile BootstrapDump.c with:

$ curl -s http://developer.apple.com/samplecode/BootstrapDump/BootstrapDump.zip > tmp.zip
$ unzip -q tmp.zip
$ cd BootstrapDump/
$ gcc -o BootstrapDump BootstrapDump.c
$ ./BootstrapDump
...
"com.apple.PowerManagement.control"
"com.apple.SystemConfiguration.PPPController"
"com.apple.network.EAPOLController"
"com.apple.network.IPConfiguration"
"com.apple.windowserver.active"
...


This program is very useful for debugging and exploring the system. We can also see the difference in the services running in my login session vs the services available to the startup context.

$ ./BootstrapDump | wc -l
227
$ sudo /usr/libexec/StartupItemContext ./BootstrapDump | wc -l
42


Daemons are background processes, so they should very rarely need to display a notice to a user, but occasionally the need arises. Since only the user who's currently sitting at the console is allowed access (according to documentation) to the window server, how can daemons (or kernel extensions) notify the user of certain important events? This question really boils down to: how can something running in the startup context display a notification to the console user in the console user's context?

As it turns out, Apple has provided us with three solutions to this problem:

1) CFUserNotifications
2) Libunc
3) KUNC

The first two are very similar (almost copy-n-paste identical), and the third is built on the first one. I won't go into detail about using these APIs, rather I'll talk about how they work. See the available documentation and header files for usage details about the APIs.

These APIs are NOT intended for use in regular applications. Rather, applications that the user launches should use normal Carbon/Cocoa methods to display windows or alerts (NSRunAlertPanel, etc).

CFUserNotifications



From the CFUserNotification.h header file:

A CFUserNotification is a notification intended to be presented to a user at the console (if one is present). This is for the use of processes that do not otherwise have user interfaces, but may need occasional interaction with a user.


One of the convenience functions available to work with CFUserNotifications is CFUserNotificationDisplayAlert(), which is a blocking call that simply displays an alert window on the console user's screen. If we look at the source, we can see that the function CFUserNotificationSendRequest() is eventually called and it works by sending a mach message to the mach port named "com.apple.UNCUserNotification". This service may then be responsible for displaying the message. If we look at this service with BootstrapDump we can see that the service indeed exists, and that messages sent to this service will cause the UserNotificationCenter.app application to be launched "on demand".

$ BootstrapDump | grep UserNo
"com.apple.UNCUserNotification" by "/System/Library/CoreServices/UserNotificationCenter.app/Contents/MacOS/UserNotificationCenter" on demand


Let's see all this in action with a test program:
$ cat -n cfu.m
1 #import <CoreFoundation/CoreFoundation.h>
2 int main(void) {
3 CFUserNotificationDisplayAlert(0, 0, NULL, NULL, NULL,
4 CFSTR("header"), CFSTR("message"), CFSTR("default button"),
5 CFSTR("alt button"), CFSTR("other button"), NULL);
6 return 0;
7 }
$ gcc -o cfu cfu.m -framework CoreFoundation
$ sudo /usr/libexec/StartupItemContext ./cfu


Then in another shell use ps and look for a process named UserNotificationCenter, and make note of its PID, then use BootstrapDump to see what its context looks like:

$ ps aux | grep UserNo[t]
jgm 740 0.0 -0.2 230132 4696 ?? Ss 4:39PM 0:00.22 .../Contents/MacOS/UserNotificationCenter
$ sudo BootstrapDump 740 | wc -l
233
$ BootstrapDump $$ | wc -l
233
$ diff <(sudo BootstrapDump 740 ) <(BootstrapDump $$)


We can see that the UserNotificationCenter process was indeed started on demand, and that its mach context looks just like mine. So we see that calling CFUserNotificationDisplayAlert() sends a mach message to the mach port named "com.apple.UNCUserNotification", which causes UserNotificationCenter.app to be launched on demand to handle the request (think of this as how inetd starts servers on demand).

The next question is, how does this work? And who's listening on the mach port named by "com.apple.UNCUserNotification" before UserNotificationCenter.app is launched on demand, i.e., who starts UserNotificationCenter.app? A little poking around on the system indicates that loginwindow.app knows about the "com.apple.UNCUserNotification" mach service, and it also knows the details of my mach session, so loginwindow.app is the most likely candidate.

$ cat /System/Library/CoreServices/loginwindow.app/Contents/MacOS/loginwindow | strings | grep UserNo
UNCUserNotification
com.apple.UNCUserNotification
/System/Library/CoreServices/UserNotificationCenter.app/Contents/MacOS/UserNotificationCenter


(See this post to see why I use cat before strings.)

So to recap the whole thing: a daemon process running in the startup context can create a CFUserNotification, which will send a mach message to "com.apple.UNCUserNotification". loginwindow.app will notice this and fork and exec UserNotificationCenter.app in the correct user session to actually handle displaying the user notification window.

Libunc - in libSystem



This is a system level re-implementation of the same functionality provided by CFUserNotification. Note that it does not use CFUserNotification to do its job, rather it's actually a reimplementation of it. The source is available here.

KUNC



As strange as it may seem to some, the situation may even arise where a kernel extension may need to display a notification to a user. And sure enough, Apple has provided us with this functionality too. The KUNC APIs can be used to do just this. Apple has some documentation about these APIs here, and the source is available as part of xnu here.

Code running within the kernel may call an API, for example, KUNCUserNotificationDisplayNotice(), that will display a notification window on the console user's screen (and running in their context). This works very much like CFUserNotifications, and as a matter of fact, it is built on the CFUserNotification API. The main difference is that KUNC uses another userland daemon (/usr/libexec/kuncd) to transform KUNC API calls to CFUserNotification API calls. Presumably, this is done because kernel extensions can't link against CoreFoundation.

The kernel maintains a special mach port called the user notification port. This port can be read and set using the MIG calls host_get_UNDServer() and host_set_UNDServer(). When launchd is starting up upon boot, it runs the program /usr/libexec/register_mach_bootstrap_servers to register old-style mach servers defined in /etc/mach_init.d. One of these servers is /usr/libexec/kuncd (registered with /etc/mach_init.d/kuncd.plist). When this server is registered, register_mach_bootstrap_servers calls host_set_UNDServer() to set the host user notification port to a port that will start /usr/libexec/kuncd on demand. As we can see from BootstrapDump (and kuncd.plist) the service is named "com.apple.system.Kernel[UNC]Notifications".

$ BootstrapDump | grep kunc
"com.apple.system.Kernel[UNC]Notifications" by "/usr/libexec/kuncd" on demand


After this, KUNC APIs called from within the kernel (note: KUNC can only be called from within the kernel) send a message to the host user notification port, and /usr/libexec/kuncd is started on demand to handle the request. kuncd itself is started by launchd and runs in the startup item context, but it simply makes CFUserNotification calls which then handle the rest of the request exactly as described above.

KUNC actually has one more interesting function: KUNCExecute(), which takes a path to an executable, a UID, and a GID. It doesn't make much sense to have the kernel fork() and exec() like a normal process, so it's interesting to consider how this works. Basically, KUNCExecute() again sends a mach message to the host user notification port, and /usr/libexec/kuncd answers that request, but this time CFUserNotification is not used. Rather, the execution is handled by a call to _SCDPExecCommand() from the SystemConfiguration framework. This function simply does a fork(), setuid(), and execv() of the binary. The key point here is that the program run with KUNCExecute() is NOT run in a user's login context; it's just run in the startup context like launchd and other daemons.
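The general fork(), setuid(), execv() pattern looks roughly like the sketch below. To be clear, this is not the _SCDPExecCommand() source: run_as() is a hypothetical helper, the setgid() call is my assumption based on KUNCExecute() taking a GID, and a real daemon would also need to deal with supplementary groups. The thing to notice is that nothing here touches the caller's Mach bootstrap context, so the child simply inherits it.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with the given UID and GID and waits
 * for it. Returns the child's exit status, or -1 on failure. */
static int run_as(uid_t uid, gid_t gid, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: drop the group first; once the UID has been dropped we
         * may no longer be allowed to change the GID. */
        if (setgid(gid) != 0 || setuid(uid) != 0)
            _exit(126);
        execv(argv[0], argv);
        _exit(127);                 /* only reached if execv failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, an argv of { "/bin/sh", "-c", "exit 3", NULL } with the caller's own UID and GID should make run_as() return 3.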


All of these notification methods can be visualized by considering this figure:

However, note that the libunc/libSystem approach mentioned above is not pictured. If it were, it would show up as another framework (library) in magenta similar to CoreFoundation in the figure.

I was recently asked how a daemon could display a notification to a user, and I wasn't sure. But after a little poking around, it appears that there are a few good options. I had also hoped to find a way to run a command as the console user, but I didn't find that. If someone knows how to do that, I'd love to know. Also, if anyone sees any inaccuracies in what I've written, please let me know.

[UPDATE: 10/7/2006:
Please see this post for more info about launching a process in a user context from the startup item context.]

Monday, September 25, 2006

Interesting behavior of the strings command

I got a little frustrated with strings the other day, so I figured I'd share. strings appears to try to be "smart" about handling fat (universal) binaries by only processing the binary for the host architecture. This means that if you run strings /some/binary/file it may not actually show you all the strings in the file if the binary is fat. An alternative is to instead use cat /some/binary/file | strings or you can use the undocumented -arch option, like strings -arch all /some/binary/file.

For example:

$ cd /Applications/Camino.app/Contents/MacOS/
$ file Camino
Camino: Mach-O fat file with 2 architectures
Camino (for architecture i386): Mach-O executable i386
Camino (for architecture ppc): Mach-O executable ppc
$ strings Camino | wc -l
27224
$ strings -arch ppc Camino | wc -l
27224
$ strings -arch i386 Camino | wc -l
27654
$ strings -arch all Camino | wc -l
54878

Monday, September 04, 2006

Software Design Patterns

Why they are good and how to write them.


Wednesday, August 30, 2006

Pruning Empty Directories

Just a quickie here. I had a fairly large directory structure that was pretty deep and had many empty directories, where an "empty" directory may still have subdirectories which are themselves empty (i.e. if I did mkdir -p foo/bar/baz, foo/ would be "empty"). The quickest way I found to clean up all the empty directories was to have find do a depth-first traversal (-d) of the directory structure, then use rmdir, which only deletes empty directories. Something like:

$ find -d . -print0 | xargs -0 rmdir 2> /dev/null

Sunday, August 20, 2006

APUE2e Acknowledgement

It's no secret that Advanced Programming in the UNIX Environment, by W. Richard Stevens and Stephen A. Rago, is a staple for all developers who write code for any flavor of Unix. If you don't have at least one copy already, I'd strongly encourage you to pick one up.

That said, don't forget to check out the book's website at apuebook.com. And while you're there, check out item 18 on the "Additional Acknowledgements" page ;-) Anyway, here's a little bit more detail about that particular issue.

The SUSv3 (Single Unix Specification version 3) states that... "If connect() fails, the state of the socket is unspecified. Conforming applications should close the file descriptor and create a new socket before attempting to reconnect." And as an example, retrying connect() doesn't always work on Darwin 8.6.0 and FreeBSD 6.0-RC1 (the only versions of these OSes that I checked).

The case I found where retrying connect() doesn't work is when I try to connect() to a port that's not listening. The client (calling connect()) sends the SYN, a RST is received (as expected) and connect() returns -1 with errno set to ECONNREFUSED. This is all as expected. However, if that same socket is used to attempt the connect() again, no packets are sent and connect() immediately fails with EINVAL. This code illustrates:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <strings.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    struct sockaddr_in remote_addr;

    bzero(&remote_addr, sizeof(remote_addr));
    remote_addr.sin_family = AF_INET;
    remote_addr.sin_port = htons(3333);
    inet_pton(AF_INET, "127.0.0.1", &remote_addr.sin_addr);

    int sock = socket(AF_INET, SOCK_STREAM, 0);

    /* Retry on the SAME socket: this is the non-portable part. */
    while (connect(sock, (struct sockaddr *)&remote_addr,
                   sizeof(remote_addr)) == -1) {
        perror("failed to connect");
        sleep(2);
    }
    ...
}


Again, on Darwin and FreeBSD the second time through the while-loop, EINVAL is immediately returned. And since no packets are actually sent, if the port at 127.0.0.1:3333 ever does open up, it will not be detected.

On the 2.4 Linux kernel I tested, the code does what I initially expected and it returns ECONNREFUSED every time.

Since the SUSv3 says that a failed connect() leaves the socket in an undefined state, I don't think this is actually a bug. But it looks like it also means that the connect_retry() code in figure 16.9 (of APUE2e) is not portable.

So, to summarize, the issue is that if a connect() call fails for any reason, the state of the socket is undefined. To be portable, you must close the socket and create a new one before calling connect() again.
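Here's a minimal sketch of what the portable version might look like. This is not the code from APUE2e; connect_with_retry() and its parameters are names I made up for illustration. The only important part is that the socket is created inside the loop and closed after every failed attempt.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to connect to `addr` up to `max_tries` times,
 * sleeping `delay_secs` between attempts. Returns a connected socket
 * descriptor, or -1 if every attempt failed.
 */
int connect_with_retry(const struct sockaddr_in *addr, int max_tries,
                       unsigned int delay_secs) {
    for (int attempt = 0; attempt < max_tries; ++attempt) {
        /* A fresh socket per attempt: never reuse one after a failed connect(). */
        int sock = socket(AF_INET, SOCK_STREAM, 0);
        if (sock == -1)
            return -1;

        if (connect(sock, (const struct sockaddr *)addr, sizeof(*addr)) == 0)
            return sock;           /* connected; the caller owns the descriptor */

        close(sock);               /* state is unspecified now, so discard it */
        if (attempt + 1 < max_tries)
            sleep(delay_secs);
    }
    return -1;
}
```

With this shape, every attempt actually puts a SYN on the wire, so a port that opens up later will be noticed.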

When I emailed Stephen Rago about this, he was very responsive and nice. He feels that this bug lies with the sockets implementation, but he added an FAQ on the book's website about it anyway.

Again, if you're reading this blog, and you don't already have a copy of this book, you should probably go get one now.

Thursday, August 17, 2006

Old (but useful) Shell Tricks

I used to have a somewhat long list of somewhat interesting Unix/Shell tips-n-tricks on an old version of my website. A few folks asked me where it was, and since it's no longer available, I figured I'd repost a handful of the tips:




  1. From within an executing script, how do I find the directory where the script lives?


    A simple pwd doesn't work because it gives the Present Working Directory, and we want to know the directory where the script actually resides on disk. I can't remember the previous solution I came up with, but here's another solution that should work:
    #!/bin/bash
    dir=$(dirname "$(echo "$0" | sed -e "s,^\([^/]\),$(pwd)/\1,")")
    echo "I live in $dir"

    Here's how it works. $0 is the name of the script as it was executed, so this may be foo.sh, ./foo.sh, /tmp/foo.sh, etc. This gets sent to sed, which uses a basic regular expression that says "If the first character of $0 is NOT a forward slash, then prepend $0 with my present working directory ($(pwd)), followed by whatever that first character was (\1)". Finally, we take the dirname of this value and assign it to dir. If the first character in $0 is a forward slash, then we were invoked via an absolute path and so we don't want to change anything.



  2. How do I copy a directory structure from one machine to another?


    $ tar cf - some_directory | ssh kramer "( cd /path/to/destination; tar xf - )"

    tar cf - some_directory creates a tar file of some_directory but the dash (-) tells tar to write to STDOUT instead of writing to an actual file on disk. The STDOUT from the first tar command is piped to STDIN of the next command. The right hand side of the pipe says to log into the host kramer using ssh and run the commands cd and tar xf -. The trick is with the commands "( cd /path/to/destination; tar xf - )". The parens create a subshell, in which the current directory is changed to /path/to/destination, and tar xf - reads from STDIN and extracts the tar file. This STDIN is the same STDIN that was sent to us over the pipe from STDOUT of the first tar command. Thus the directory structure on jerry gets tar'd up, transferred to kramer, then extracted all in one fell swoop.



  3. How do I diff two files on different machines (using Bash)?


    $ diff <(ssh -n george cat /etc/passwd) <(ssh -n kramer cat /etc/passwd)

    This is just the Bash Process Substitution trick.



  4. How can I run a shell script on a remote host without copying the script out?


    One way would be:
    jerry:~$ cat myscript.sh | ssh kramer /bin/sh

    How this one works is pretty simple, but it's often overlooked as an option for getting things done. The shell script is written to STDOUT. /bin/sh is executed on the remote server, and it reads myscript.sh on STDIN, thus executing the local copy of the script. This is really convenient for some things. The only problem I see with this one is that, as written, you can't pass command line arguments to the script (though running the remote shell as /bin/sh -s arg1 arg2 should set them as positional parameters).



  5. How can I run long pipe lines of commands on a remote host via SSH without escaping all the meta characters?



    jerry:~$ ssh kramer <<EOF
    ps -ef | grep http | awk '{print \$NF}'
    EOF

    The only tricks here are the use of a bash here document, and the fact that the command line is typed directly into ssh's STDIN so there's no need to escape things like pipes and semi-colons, etc. However, notice that you do still need to escape dollar signs because they'll still be interpreted as shell variables.



  6. How do I change every occurrence of a string in multiple files?


    Why, use perl pie!
    perl -p -i -e 's/jerry/george/g' *.txt

    See perl -h for a description of the flags.




Friday, August 11, 2006

A quick read of Mac OS X Internals

I often hear people comment on how thick the Mac OS X Internals book is. Well, it is thick, but not too thick. I've already read it cover-to-cover twice, and will show that it's even possible to tackle in one sitting. Check it out. ;-)

Sunday, August 06, 2006

WWDC 2006

I'm getting ready for WWDC which starts tomorrow. I'm very excited, and hopefully it'll fuel some interesting posts coming up.

Friday, August 04, 2006

Tracing Objective-C Messages

Tools like strace, ltrace, truss, ktrace, etc, are very cool, and necessary if you really want to understand how things work. They allow you to watch what a process is doing by showing you when certain functions are called. It would also be really cool if we could see similar information as Objective-C messages are sent.

So, I read through the Objective-C runtime code and discovered a way. A few days later I found a good blog post by Dave Dribin here that outlines the basic idea that I had used. However, his solution requires you to recompile libobjc.dylib, which is undesirable as well as unrealistic in many cases.

Please take a few moments to read his post (again, here), then come back and read the rest of this...

...

OK, as he explains, the symbol that we want access to "_logObjcMessageSends" isn't exported (remember, nm showed it as a little "t") so he rebuilds the libobjc dylib in order to export the symbol. I'd like to propose an alternate solution that doesn't require touching libobjc.dylib.

Rather than looking up the symbol address using dlsym(), we should use the often overlooked nlist(3) function, which will return us the address of "private" symbols. So, in our dylib that we want to insert with DYLD_INSERT_LIBRARIES, we could have code like:

...
typedef int (*ObjCLogProc)(BOOL, const char *, const char *, SEL);
typedef int (*LogObjcMessageSendsFunc)(ObjCLogProc);

struct nlist nl[2];
bzero(&nl, sizeof(struct nlist) * 2);
nl[0].n_un.n_name = "_logObjcMessageSends";

if (nlist("/usr/lib/libobjc.dylib", nl) < 0 || nl[0].n_type == N_UNDF) {
    fprintf(stderr, "nlist(%s, %s) failed\n",
            "/usr/lib/libobjc.dylib",
            nl[0].n_un.n_name);
    return;
}

LogObjcMessageSendsFunc fcn = (LogObjcMessageSendsFunc) nl[0].n_value;
(fcn)(&MyLogObjCMessageSendFunction);
...

This code uses nlist() to look up the address of _logObjcMessageSends. The symbol it's looking up happens to be "private", but that's OK. Then once it has the address of the symbol, it casts it to a pointer to a function with the correct signature. Once that's done, the new function pointer is used just like any ol' function.

So, this solution works just like Dave Dribin's, but it doesn't require a recompile of the Objective-C runtime.

Saturday, July 29, 2006

Access argc and argv from Anywhere

Say you're in some random function that's stuck deep down in the middle of some big program you're writing on the Mac. And now assume that you'd like to have access to the arguments that were passed to main() when the program started. How can you do this?

Well, as it turns out, this work has already been done for us. Let's have a look inside /usr/lib/libSystem.B.dylib shall we.

$ nm /usr/lib/libSystem.B.dylib | grep _NSGetArg
...
9003a382 T __NSGetArgc
90020f60 T __NSGetArgv
...

Ahh, so it looks like we found some symbols (functions) whose names look very revealing (man nm to see more details on what the output means above). The C compiler automatically prepends a leading underscore to symbol names, so the actual function names should be: _NSGetArgc(void) and _NSGetArgv(void). Let's pick one and try to use it.

Since the function _NSGetArgc() is not declared in any header files (that I know of), we'll need to declare it ourselves. But we want to link against the version of the function that's in libSystem, so we'll declare it extern. Let's take a first stab:
$ cat nsargc.c
#include <stdio.h>
extern int _NSGetArgc(void);
int main(void) {
printf("argc=%d\n", _NSGetArgc());
return 0;
}
$ gcc -o nsargc nsargc.c -Wall -std=c99
$ ./nsargc foo bar
argc=8192

Well, that's certainly not correct. Maybe we're using the function wrong. The real "argc" should be an int, but maybe this function returns a pointer to it instead of returning the actual value. Let's try that:
$ cat nsargc.c
#include <stdio.h>
extern int *_NSGetArgc(void);
int main(void) {
printf("argc=%d\n", *_NSGetArgc());
return 0;
}
$ gcc -o nsargc nsargc.c -Wall -std=c99
$ ./nsargc foo bar
argc=3

Hey! Now that's more like it. So it looks like these functions may return pointers to the values we want rather than the actual values we want. Now let's try this with argv as well.
$ cat NSGetArgs.c
#include <stdio.h>

extern int *_NSGetArgc(void);
extern char ***_NSGetArgv(void);

void DoStuff(void) {
    printf("%20s = %d\n", "_NSGetArgc()", *_NSGetArgc());

    char **argv = *_NSGetArgv();
    for (int i = 0; argv[i] != NULL; ++i)
        printf("%15s [%02d] = '%s'\n", "_NSGetArgv()", i, argv[i]);
}

int main(void) {
    DoStuff();
    return 0;
}
$ gcc -o NSGetArgs NSGetArgs.c -Wall -std=c99
$ ./NSGetArgs foo bar
_NSGetArgc() = 3
_NSGetArgv() [00] = './NSGetArgs'
_NSGetArgv() [01] = 'foo'
_NSGetArgv() [02] = 'bar'


Sweet, it looks like that works. So now we can get access to argc and argv from anywhere within a program. And notice that we didn't even need to declare the arguments in main's signature.

Here are a few similar functions that may be interesting.
$ nm /usr/lib/libSystem.B.dylib | grep _NSGet | grep ' T '
9002a55f T _NSGetNextSearchPathEnumeration
9003a382 T __NSGetArgc
90020f60 T __NSGetArgv
90003074 T __NSGetEnviron
90029e2d T __NSGetMachExecuteHeader
90027506 T __NSGetProgname
9014aa84 T _NSGetSectionDataInObjectFileImage
90036106 T __NSGetExecutablePath

Tuesday, July 25, 2006

The Singleton Smell

I'm a big fan of using Object Oriented design patterns, especially the classics popularized by the GoF. Design patterns are the OO analog of algorithms. Just like a good software engineer needs to be knowledgeable about data structures and algorithms, knowledge of OO patterns is a requirement in today's OO software world (disclaimer: the previous statement is only my opinion). And just like bubblesort can be misapplied to solve a problem, OO patterns can also be used inappropriately.

One of the most (if not the most) misused patterns is the Singleton. The singleton can be very useful when applied correctly, but it can also make for some poorly designed and almost unmaintainable software. The singleton is probably the easiest pattern to understand from the GoF's Design Patterns book (above), which may be why it's so often misused by engineers who are new to patterns. In a nutshell, the singleton pattern attempts to ensure that only one instance of a class is ever created. Many callers may use the class; they just end up using the same instance.

Code that uses a lot of singletons gives off a distinct smell. A smell that a software engineer should recognize. It's similar to, but more potent than, the smell given off by global variables, because a singleton is effectively a global variable. Singletons are generally global in scope, thus allowing any class, at any level, access to the singleton (read: global variable). This makes for classes that are tightly coupled.

Additionally, singletons let you design classes (or more (in)appropriately, "implementations") without having to think about the class's interface, or how they interact with other classes. The singleton doesn't need to be an argument to the class's constructor, or to methods, so it's often not considered when thinking through the object model for your code. This is similar to the way that C functions don't need to declare global variables in their parameter list, because again, they're globals, so the function's interface doesn't need to consider them. And I think we'd probably all agree that global variables are generally not the best idea.

Singletons are often misused in situations where you have multiple classes that all need to access the same instance of an object. In this case, the singleton is used as a convenience to let all the classes access this one central global variable. It would likely be a better idea to think about the classes' interfaces, possibly add an extra parameter here and there, and avoid the singleton altogether. Now, not only do the classes communicate interface to interface, but you may also have a new reusable class (the one that used to be a singleton)!

It's generally possible (and usually a good idea) to replace singletons with non-singletons, but it often requires a little extra thought. But this is a good thing and it's one of the best reasons to get rid of singletons. Once you think through your class's interfaces, you'll likely discover that you can loosen the coupling of your classes by pushing their interaction out to their interfaces rather than leaving it buried down in their implementations. This also allows you to add documentation about this interaction in the class's *interface* rather than code comments in its implementation.

Singletons can also make unit testing difficult. If class A uses the singleton class B in A's implementation, then unit testing A requires that B be all set up and able to run correctly. However, if A's constructor were changed to take a B as an argument, then A's unit test could simply create a mock B object, and just focus on the testing of A (which is what a unit test is supposed to do).
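I have OO languages in mind here, but the same idea can be sketched even in plain C. In the rough example below, every name (Clock, Session, fake_now, and so on) is made up for illustration: Session plays the role of A, and Clock plays the role of B. Because Session is handed its Clock rather than reaching for a global one, a test can pass in a fake.

```c
#include <stddef.h>

/* The dependency ("B"): an interface expressed as a struct of function pointers. */
struct Clock {
    long (*now)(void *ctx);   /* returns the current time in seconds */
    void *ctx;                /* opaque state passed back to now() */
};

/* The class under test ("A"): it is handed its Clock instead of
 * reaching for a global/singleton clock. */
struct Session {
    const struct Clock *clock;
    long started_at;
};

/* "Constructor": the dependency is an explicit argument. */
void session_start(struct Session *s, const struct Clock *clock) {
    s->clock = clock;
    s->started_at = clock->now(clock->ctx);
}

/* Seconds since the session started, according to the injected clock. */
long session_elapsed(const struct Session *s) {
    return s->clock->now(s->clock->ctx) - s->started_at;
}

/* A mock Clock implementation for tests: reports whatever time the test sets. */
long fake_now(void *ctx) {
    return *(long *)ctx;
}
```

A test would build a struct Clock around fake_now, start a Session with it, bump the fake time, and check session_elapsed(), all without a real clock being set up anywhere.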

Now, singletons aren't always evil. They can be very useful in some situations. I won't go into those examples now, but I just want to be on record as having said that they are not always bad. The main point here is, do not jump to a singleton solution just for convenience. Singletons are convenient because they can be accessed from anywhere (think, global variables), but this should be avoided in favor of thinking through class interactions and making your classes communicate via their interfaces. The singleton pattern should be used when you really need to ensure that only one instance of a class is ever created. And don't use a singleton without fully understanding why you really need one.

So, please take a big whiff of your code. Do you smell a lot of singletons? If so, consider refactoring it or make sure you understand why you actually need them. The extra thought you put into your class design now will pay back double when it comes time for maintenance.

Uh, sorry this was a little out of order and jumbled... I just wanted to jot down a few thoughts that were floating around my head on the way home tonight.

Thursday, July 20, 2006

Command Line Processing in Cocoa

Why would you want to process the command line in Cocoa? I mean, Cocoa's all GUI and other cool stuff, right? Well, yes.. but there's really nothing quite as cool as the command line, is there? Good, so let's get to it.

The typical ways to parse command line arguments on Unix systems are to use either getopt(), getopt_long(), or just parse argv yourself. Well now Cocoa (actually, it's Foundation that provides this) offers an even easier alternative. Enter NSUserDefaults. Let's just jump right to a quick example.

// File: args.m
// Compile with: gcc -o args args.m -framework Foundation
#import <Foundation/Foundation.h>

int main(int argc, char *argv[]) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    NSUserDefaults *args = [NSUserDefaults standardUserDefaults];

    NSLog(@"boolArg = %d", [args boolForKey:@"boolArg"]);
    NSLog(@"intArg = %d", [args integerForKey:@"intArg"]);
    NSLog(@"floatArg = %f", [args floatForKey:@"floatArg"]);
    NSLog(@"stringArg = %@", [args stringForKey:@"stringArg"]);

    [pool release];
    return 0;
}

First we get access to the standard user defaults object, which will actually do all the parsing of the command line. Then we just access command line arguments like they were stored in the user defaults object. Here are a few sample runs:

$ ./args
2006-07-20 22:24:52.996 args[21633] boolArg = 0
2006-07-20 22:24:52.997 args[21633] intArg = 0
2006-07-20 22:24:52.997 args[21633] floatArg = 0.000000
2006-07-20 22:24:52.997 args[21633] stringArg = (null)

$ ./args -intArg 18
2006-07-20 22:25:41.923 args[21640] boolArg = 0
2006-07-20 22:25:41.923 args[21640] intArg = 18
2006-07-20 22:25:41.923 args[21640] floatArg = 0.000000
2006-07-20 22:25:41.923 args[21640] stringArg = (null)

$ ./args -intArg 18 -stringArg "foo bar" -floatArg 3.14159 -boolArg YES
2006-07-20 22:26:15.129 args[21644] boolArg = 1
2006-07-20 22:26:15.129 args[21644] intArg = 18
2006-07-20 22:26:15.129 args[21644] floatArg = 3.141590
2006-07-20 22:26:15.129 args[21644] stringArg = foo bar


Arguments are case-sensitive, they can be specified in any order, and in general, NSUserDefaults is pretty smart about processing them. For example, a true bool value can be specified as YES, Y, y, 1, 123, etc, whereas a false bool value can be NO, no, n, 0, etc. Also, note that arguments are specified with a single leading - rather than -- which is typical of most "long" Unix command line options.

I haven't verified this, but I'd imagine that NSUserDefaults accesses argv via the NSProcessInfo class. It also leaves the argv that's passed to main unharmed in case you want to do any additional processing.

Tuesday, July 18, 2006

What I'm reading: Mac OS X Internals

It's been a while so I figured I should post something. It'd be nice if I blogged as often as I read books, so I think I'll try to make "What I'm Reading" posts and talk about some of the tech books that I'm reading.

Anyway, I just finished Amit Singh's book Mac OS X Internals, and it's fantastic. It's great. It's packed full of awesome technical details and it reads very well. You can check out the book and my review of it at Amazon.

Sunday, May 21, 2006

Fetching Darwin Source the Simple Way

It's been a while since my last post, so I figured I'd post some little thing...

If you're at all like me, you love reading OpenSource source code to see how things are really done. Like, maybe you want to see how bash does process substitution, or you want to see if shell IO redirection is done using dup, dup2, or open. The easy way to answer these questions is to simply read the source code. And since we're all Mac users here, we'll choose to read the Darwin source.

So, I wrote a simple little (no joke, really simple and little) script to let you see what source is available and download and extract it for you. It's called snagdar.sh, and can be found here.

When run with no arguments it simply displays a list of all available Darwin packages. This is useful for grepping to find the package you may want. Then, once you find the package, just run snagdar.sh again passing it a regexp to match the package you want. If multiple packages match the regex, they will all be snagged.

So, say I want to see how lsof(8) works, I can run:

$ snagdar.sh | grep lsof
lsof 20 Other

to see if an lsof package exists. We see that it does, and we'd like the source for it, so we can simply do:
$ snagdar.sh lsof

+++++ Snagging http://darwinsource.opendarwin.org/tarballs/other/lsof-20.tar.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 549k 100 549k 0 0 573k 0 --:--:-- --:--:-- --:--:-- 639k


Then, I'll end up with a directory named lsof-20/ in my current dir which contains the source for lsof.

This is just a small, simple script that happens to make my daily life a little bit easier.

UPDATE: 9/8/2006
Since Open Darwin has closed its doors, this script no longer works. Darwin source can still be fetched from Apple, but it requires a login to download. I'll update the script when I get time.

UPDATE: 5/10/2007
Snagdar now works again. See this post.

Tuesday, April 18, 2006

MacBook Pro

I got my work supplied MacBook Pro last week, and I've got two words... AWE-SOME! It's waaaay fast. On my dual 2GHz G5, Firefox bounces the dock icon about 5-8 times before opening. With the new universal Firefox binary, it bounces 1-2x on my MBP! That's friggin' fast! Now, I don't actually use Firefox (mostly because it's historically been too slow to start), but it's still an interesting measure.

Here are a few other interesting little tid-bits I came across on my first run through the new system:



  • You can tell if an application is universal by looking at the "Get Info" window in the Finder, or by using the file command from the Terminal. And you can tell if a process is actually running under Rosetta at runtime because /usr/libexec/oah/translate will be mapped into its address space. Just
    lsof -p PID | grep translate




  • You can run an application under Rosetta from the command line by using the command /usr/libexec/oah/translate. For example:
    /usr/libexec/oah/translate /bin/ls



  • On PPC, function arguments are passed in CPU registers starting with $r3. So, the function call foo(1, 2, 3) would have 0x1 in $r3, 0x2 in $r4, etc. On Intel, function arguments are passed on the stack, so the function call foo(1, 2, 3) would have 0x1 at $ebp+8, 0x2 at $ebp+12, etc.



  • In Objective-C, a method call like [foo add:5] actually gets compiled into a C function call like
    objc_msgSend(foo, @selector(add:), 5)
    where the receiver (foo) becomes "self" inside the method. And as we just saw, Intel Macs pass function arguments on the stack. So, the standard way to print "self" in gdb on a PPC Mac is
    po $r3
    (remember, $r3 has the first argument on PPC -- "self"), but on Intel it turns into
    po *(int *)($ebp+8)
    (po is print-object).



  • If you need to debug (using gdb) a PPC binary on an Intel Mac, you can do some basic stuff by setting the OAH_GDB environment variable to YES, then starting the application. Then in a new window, start gdb like
    gdb --oah
    then use gdb's attach command to attach to the running process like normal. This will even show you PPC style registers and stuff in gdb. Pretty cool for basic debugging.


Saturday, April 08, 2006

Fantastic Darwin Code Browser Online

The source for Darwin is available online here, but there's not a good way to browse and search the code. Until now. Enter the OpenGrok source browser for Darwin. It's friggin' sweet and "wicked fast".

Thursday, April 06, 2006

The little hidden built-in calculator

I just learned this little trick a few days ago, but I guess it's not really all that new. Almost all Cocoa text areas can perform calculations right inline. You just have to type a calculation, highlight it, hit Command-Shift-8, then the calculation will be evaluated and replaced by the answer. Pretty cool. For example, if I type 3^(2*pi), then highlight it and hit Command-Shift-8, it will be replaced with 995.041644892855.

You'll probably notice that the Script Editor application opens when you do this. That's because the feature is implemented as a Service provided by the Script Editor. What's really happening is that the highlighted text is being executed as AppleScript. So, as I sit here typing into this text area in Safari, I could highlight and execute

tell application "Safari" to display dialog "Hello"
(don't ask me why I'd want to run that AppleScript from right here).

I'm sure there's much cooler stuff you can do with this trick, but regardless it's pretty cool.

Tuesday, March 28, 2006

Ignore case in vim searches

Ahhh, there it is! I've often wanted to know how to make "/" searches in vim case insensitive, and today somebody at work enlightened me. If \c appears anywhere in a pattern the whole pattern is assumed to be case insensitive. So to search for the string "root" while ignoring case you'd use

/\croot
Take a look at :help ignorecase in vim for more info.

Saturday, March 25, 2006

The difference between foo() and foo(void)

We often see two different ways to declare a method that takes no arguments. The two common forms are:


  1. int foo();

  2. int foo(void);


Which one is correct? And what's the difference?

Well, in C++ the two forms are equivalent and they both declare a function that takes no arguments. If you try to call the function and pass in an argument, the compiler will give an error.

On the other hand, C and Objective-C both treat the two forms differently. In these languages the first form declares a function that takes an unknown number of arguments, whereas the second form declares a function that takes no arguments at all. So, in C the following is valid code:
int foo() {
    return 5;
}
int main() {
    return foo(1, 2, 3);
}
The compiler doesn't complain, and the code runs fine (a C++ compiler would give an error on this same code).

Generally what you want in C and Objective-C is to use the second form and include the explicit void to indicate that the function should take no arguments. However, it's more common in C++ to use the first form because it's equivalent and shorter.

Wednesday, March 22, 2006

Strange difference in ps output

I was asked an interesting question today. It was basically something like, "when I type ps -ef on my Linux box it displays nice wide output, but when I pipe it through grep [or pipe through anything] I have to use the -l option to ps in order for it to display long output".

To illustrate the problem we need a command with a long command line that will run long enough for us to see it. This sounds like a job for sleep(1) with a reeeeaaaallly long time argument.

$ sleep 1000000000000000000000000000000 &
$ ps -ef
UID PID PPID C STIME TTY TIME CMD
[... output omitted ...]
jgm 8264 8156 0 01:03 pts/0 00:00:00 sleep 1000000000000000000000000000000
[... output omitted ...]
$ ps -ef | grep sleep
jgm 8264 8156 0 01:03 pts/0 00:00:00 sleep 10000000000000000000000000


So, it's clear that when we grep ps output, some of the zeros are truncated from the command. But why?

Well, my assumption is that ps checks if its output descriptor (FD 1) is a terminal, and if so, it detects the width of the terminal so that the output is nicely formatted on the terminal and the lines do not wrap. When we use grep the output file descriptor for ps is *not* a terminal (it's a pipe) so it has no "width". In this case ps has to guess a width. And it appears to pick the standard 80 columns wide.

$ ps -ef | grep sleep | wc
1 9 81


Yep, 80 chars (plus the newline character).

The typical C function used to determine if a file descriptor refers to a terminal is isatty(3), which likely translates to an ioctl(2) system call. Let's see if we can verify our hypothesis using strace(1), which is a Linux tool that allows us to see system calls.

(strace writes output to STDERR so we need to grep STDERR)
$ strace ps -ef 2>&1 | grep ioctl
ioctl(1, 0x5413, 0xbffff100) = -1 EINVAL (Invalid argument)
read(7, "grep\0ioctl\0", 2047) = 11
write(1, " 00:00:00 grep ioctl\njgm "..., 78 00:00:00 grep ioctl


Hmm, we do see a failed ioctl() call. The first argument is 1, which is STDOUT, the second arg is 0x5413, and the third is just something on my stack. After reading the ioctl(2) man page I see that the second argument is the "request type". So, let's grep through some of the standard system headers to figure out what this 0x5413 thing is.

$ grep -r 0x5413 /usr/include/*/ioctl*
/usr/include/asm/ioctls.h:#define TIOCGWINSZ 0x5413

Ahhh! 0x5413 indicates a request to get the window size. Just like we thought. Looking back at the strace output we see that the ioctl() call to get the window size failed (EINVAL) so it couldn't get the window size, and must have just used a default value.

Now, the one last check we can do is take a look at the strace output when the output actually does go to a terminal.

$ strace ps -ef 2> ps.out
[... regular ps output omitted ...]
$ grep ioctl ps.out
ioctl(1, 0x5413, {ws_row=24, ws_col=141, ws_xpixel=846, ws_ypixel=336}) = 0


OK, so it looks like our assumption was correct, and I think we verified it thoroughly enough.

New Gmail Notifier

Google released a new Gmail Notifier for Mac OS X today. It just has a few small updates, like automatic self updating and it's now a universal binary, and best of all it finally has non-ugly icons!

Sunday, February 26, 2006

Google Page Creator (pages.google.com)

Wow! If you haven't checked out Google's new Page Creator product, drop what you're doing and check it out now. It's a WYSIWYG web page creator, done online in AJAX, that generates XHTML strict sites very easily. It's really cool and fun to play with. And again, it amazes me what Google can do with JavaScript.

Sunday, February 19, 2006

Change __MyCompanyName__ in Xcode

If you're sick of seeing __MyCompanyName__ in the header comments of all your Xcode files, you can set the default company name in Xcode with:


defaults write com.apple.Xcode PBXCustomTemplateMacroDefinitions
-dict ORGANIZATIONNAME "Blah, Inc"

(all entered on one line, of course)

Thursday, February 16, 2006

Squashing a Real Bug on Darwin

I previously talked about a really cool bash trick called process substitution, which allows you to use a process almost anywhere you can use a file. For example, diff <(ls dir1) <(ls dir2) would allow me to diff the contents of dir1 and dir2.

The Problem



For the most part, process substitution works great on Mac OS X, but there are cases where it doesn't work. For example:

$ diff <(echo foo) <(echo bar)

produces no output, when clearly the string "foo" differs from the string "bar". But why?


Troubleshooting



Well, let's check out the tools in our toolbox: gdb, gcc, vm_stat, vmmap, nm, otool, stat, etc. Hmm, let's try a few more experiments first.

$ diff <(echo foo) <(echo bar)
$ diff <(echo foo) <(echo barX)
1c1
< foo
---
> barX

$ diff <(echo foo) <(echo bar)
$ diff <(echo foo) <(sleep 1; echo bar)
1c1
< foo
---
> bar

Interesting...

$ stat <(echo foo) <(echo bar)
520093697 0 prw-rw---- 1 jgm jgm 0 4 "Feb 16 21:12:21 2006" "Feb 16 21:12:21 2006" "Feb 16 21:12:21 2006" 512 8 0 /dev/fd/63
520093697 0 prw-rw---- 1 jgm jgm 0 4 "Feb 16 21:12:21 2006" "Feb 16 21:12:21 2006" "Feb 16 21:12:21 2006" 512 8 0 /dev/fd/62

Very interesting! According to stat(1), the two named pipes have identical attributes except for the file name. What's more, the two files have the same inode number!? But what's that you say? How can two different files (i.e. not hard links) on the same filesystem have the same inode number? According to POSIX this isn't allowed. So maybe diff is taking a shortcut and saying "hey, the files have the same inode number and are on the same filesystem, so they must be the same". Maybe.

Well, let's get the source code for diff and check it out.

$ curl http://darwinsource.opendarwin.org/tarballs/other/gnudiff-13.tar.gz > gnudiff-13.tar.gz
$ tar -zxvf gnudiff-13.tar.gz
$ cd gnudiff-13
$ make

OK, now let's test our freshly built diff.

$ /tmp/gnudiff/Build/src/diff <(echo foo) <(echo bar)
$

Yep, our new version has the same problem that we want to fix.

OK, so let's take a look at some source. gnudiff/src/diff.c looks like a good place to start. Just search for the word "main" and we can quickly check out the main function to get an idea of how diff starts to do what it does. Around line 713 we see the call
int status = compare_files ((struct comparison *) 0, from_file, argv[optind]);
which looks very promising. We find the definition of this function at line 1047. Now just skim through this function to get an idea what it does. Around line 1214 we see a comment that looks very promising!

if ((same_files = (cmp.file[0].desc != NONEXISTENT
&& cmp.file[1].desc != NONEXISTENT
&& 0 < same_file (&cmp.file[0].stat, &cmp.file[1].stat)
&& same_file_attributes (&cmp.file[0].stat, &cmp.file[1].stat)))
&& no_diff_means_no_output) {
/* The two named files are actually the same physical file.
We know they are identical without actually reading them. */

}

Oh, I bet this has something to do with the problem! What do those two "same_file*" functions do?

Armed with grep we find them defined as macros in gnudiff/src/system.h around line 361. These two macros basically check some file data returned by stat(2) to see if two files are identical. The attributes checked are things like inode number, uid, gid, size, mtime, ctime, etc. These are all attributes that were identical when we checked the stat output of our two named pipes. Take a second and glance back up at the output from stat <(echo foo) <(echo bar). I'll wait... back? OK. So, it sorta makes sense why that diff may have failed. And it also makes sense why diff <(echo foo) <(sleep 1; echo bar) would have worked. Can you guess why? (hint: think about the modification times for each fifo)


A Fix



What's the best fix for this? One could argue that the problem is that the HFS+ filesystem allows two files on the same filesystem to have the same inode number, but HFS+ really isn't an inode based filesystem. On HFS+, inode numbers are really just the volume's catalog node ID. Plus, it's probably a big pain to modify the Darwin Kernel or the HFS+ filesystem code.

Maybe we can instead fix the problem in diff. According to this technote a CNID of zero is never used and indicates nil, so maybe diff should not shortcut any files with an inode of 0? Let's try it. In gnudiff/src/system.h, make the following modification to the same_file(s, t) macro:

# define same_file(s, t) \
((((s)->st_ino == (t)->st_ino) \
&& ((s)->st_dev == (t)->st_dev)) \
&& ((s)->st_ino != 0) && ((t)->st_ino != 0) \
|| same_special_file (s, t))

Then recompile with make (if necessary, type make clean; make). Now, let's see if we fixed the problem:

$ /tmp/gnudiff/Build/src/diff <(echo foo) <(echo bar)
1c1
< foo
---
> bar


YAY! That seems to have fixed the problem!


Conclusion



I possibly skipped the most important first step here, and that is to use Google to see if someone else already figured out my problem! I'll probably go do that now! ;-)

In the meantime, I haven't tested this solution thoroughly, but I imagine it's safe. Hopefully, this (or some other) fix will make it into the diff code soon. G'nite.

Tuesday, February 14, 2006

The char *apple[] Argument Vector

We're all familiar with the arguments passed to the main function by the OS:


  1. int argc

  2. char *argv[]

  3. char *envp[]


But programs started on Mac OS X (i.e. Darwin) actually have access to another argument - the apple vector. The apple vector is defined as char *apple[] and it's passed as the 4th argument to the main() function (it's actually stored right after envp on the stack).

But what is it used for? Well, Apple can use the apple vector to pass whatever "hidden" parameters they want to any program. And they do actually use it, too. Currently, apple[0] contains the path where the executing binary was found on disk. What's that you say? How is apple[0] different from argv[0]? The difference is that argv[0] can be set to any arbitrary value when execve(2) is called. For example, shells often differentiate a login shell from a regular shell by starting login shells with the first character in argv[0] being a -. For example:

$ ps aux | grep -- -bash
jgm 262 0.0 0.1 27820 752 p1 S 5Feb06 0:01.58 -bash

So, we can see that the bash login shell on my Mac was started with a dash in its name. In this example, bash's argv[0] would equal -bash, but its apple[0] would contain the path to where the bash binary was actually found (likely apple[0] would be /bin/bash).

Let's write a simple program to see all this in action:

// Compile with: gcc -o apple apple.c
#include <stdio.h>
int main(int argc, char *argv[], char *envp[], char *apple[]) {
printf("argv[0] = %s\n", argv[0]);
printf("apple[0] = %s\n", apple[0]);
return 0;
}

And here are a few runs:

$ ./apple
argv[0] = ./apple
apple[0] = ./apple


$ PATH=. apple
argv[0] = apple
apple[0] = ./apple


$ PATH=/Users/jgm apple
argv[0] = apple
apple[0] = /Users/jgm/apple


So, we can see that apple[0] is not the same as argv[0] and that it contains the path to where the executing image was found on disk (taking into account the $PATH).

Now, if we want to test the bash example above (where argv[0] doesn't match the binary name), we can write another small test program:

// Compile with: gcc -o exec_apple exec_apple.c
#include <unistd.h>
int main() {
char *theArgv[] = {"-apple", NULL};
execve("./apple", theArgv, NULL);
return 1;
}

And a run:

$ ./exec_apple
argv[0] = -apple
apple[0] = ./apple

So, just as we expected; argv[0] can really be set to anything by execve(2) but apple[0] should always contain the real path to the executing binary image.

Pretty neat huh?

UPDATE 10/30/2006 here

Monday, February 13, 2006

Nil and nil

Objective-C has some very interesting data types that often are misunderstood. Many of them can be found in /usr/include/objc/objc.h, or other files in that same directory. Below is a snippet taken from objc.h that shows the declaration of some of these types:


// objc.h
#import <objc/objc-api.h>

typedef struct objc_class *Class;

typedef struct objc_object {
Class isa;
} *id;

typedef struct objc_selector *SEL;
typedef id (*IMP)(id, SEL, ...);
typedef signed char BOOL;

#define YES (BOOL)1
#define NO (BOOL)0

#ifndef Nil
#define Nil 0 /* id of Nil class */
#endif

#ifndef nil
#define nil 0 /* id of Nil instance */
#endif


Let's cover some of them in a little more detail here:



id

This is not equivalent to void *. As the snippet from the header above indicates, id is a pointer to a struct objc_object, which is basically a pointer to an instance of any class derived from the Object (or NSObject) base class. Notice that id is already a pointer, so you do not need the asterisk when using id. For example: id foo = nil declares a nil pointer to an instance of any subclass of NSObject, whereas id *foo = nil declares a pointer to a pointer to an instance of a subclass of NSObject.


nil

This is equivalent to the C language's NULL value. It is defined in objc/objc.h and is used to refer to an Objective-C object instance pointer that points to nothing.


Nil

Yes, this is sort-of different than nil but they're defined in the same file. Nil (with a capital 'N') is used to define a pointer to an Objective-C class (type Class) that points to nothing.


SEL

Now this one is fun and interesting. SEL is the type of a "selector" which identifies the name of a method (not the implementation). So, for example, the methods -[Foo count] and -[Bar count] both share a selector, namely the selector "count". A SEL is a pointer to a struct objc_selector, but what the heck is an objc_selector? Well, it's defined differently depending on whether you're using the GNU Objective-C runtime or the NeXT Objective-C runtime (like Mac OS X). It turns out that Mac OS X maps SELs to simple C strings. For example, if we define a Foo class with a - (int)blah method, the code NSLog(@"SEL = %s", @selector(blah)); would output SEL = blah.


IMP

From the header above IMP is declared as id (*IMP)(id, SEL, ...), so it's a pointer to a function that takes an id (the "self" pointer), the SEL that was called, and some other variable arguments.


Method

The Method type is defined in objc/objc-class.h as:

typedef struct objc_method *Method;
struct objc_method {
SEL method_name;
char *method_types;
IMP method_imp;
};

So, this kind of ties together some of the other types that we talked about: a Method is a type that relates a selector to its implementation.


Class

From above, Class is defined to be a pointer to a struct objc_class, which is declared in objc/objc-class.h as:

struct objc_class {
struct objc_class *isa;
struct objc_class *super_class;
const char *name;
long version;
long info;
long instance_size;
struct objc_ivar_list *ivars;
struct objc_method_list **methodLists;
struct objc_cache *cache;
struct objc_protocol_list *protocols;
};

I'm not going to get into much detail here, other than to show the declaration. We'll talk more about this in a future post.





Well, that's about it for now. These are all important types and concepts in Objective-C and I thought they would be good to talk about. More later...

Saturday, February 11, 2006

Messaging nil in Objective-C

Sending a message to a nil object doesn't make much sense in many programming languages. For example, if you do this in Java you'll get the dreaded NullPointerException. But sending a message ("sending a message" in Objective-C is similar to "calling a method" in other OO languages) to a nil object is defined, okay, and incredibly useful in Objective-C. Actually, one of the most common coding idioms in Objective-C is:


Foo *foo = [[Foo alloc] init];

which creates a Foo instance by sending the +alloc message to the Foo class, then sending the -init message to the returned instance. However, if +alloc fails and returns nil, the -init message will be sent to a nil object, which simply ends up setting foo to nil (which is probably exactly what we'd want to happen anyway).


I'd like to see an example


OK, let's write some sample code to test this.

// Compile with: gcc -o nil nil.m -framework Foundation
#import <Foundation/Foundation.h>

@interface Foo : NSObject
- (NSString *)sayHi;
@end

@implementation Foo
- (NSString *)sayHi {
return @"Hello, World!";
}
@end

int main() {
Foo *foo = nil;
NSLog(@"Greeting = %@", [foo sayHi]);
return 0;
}

2006-02-11 20:49:26.372 nil[3406] Greeting = (null)

So, we can see that when we send the message -sayHi to a nil pointer the return value is nil.


How does this work?


The compiler turns message calls like [targetObject someSelector] into a C function call like objc_msgSend(targetObject, someSelector). So, to figure out what this returns we simply need to figure out what objc_msgSend() does when its first argument is nil. Well, we can download the source for the Objective-C runtime from Apple here. The file we're interested in is objc-msg-ppc.s (yes, it's in PPC assembly). If we search for "ENTRY _objc_msgSend" we'll see the function we're looking for. The comments are very useful in this file and we can pretty easily see that it checks if its first argument (passed in register r3), which happens to be the target object, is nil and if so it does a few other things and eventually returns nil. And since C functions on PowerPC chips return integer and pointer values in register r3 nothing needs to be done; the function simply returns and the result is that the caller thinks the function (or "message") returned nil. And since integers are returned the same way as pointers, sending a message that returns an int will return 0, simply because nil is #define'd to be 0 (/usr/include/objc/objc.h).


But what if the method returns a float?


Let's see...

#import <Foundation/Foundation.h>

@interface Foo : NSObject
- (float)blah;
@end

@implementation Foo
- (float)blah {
return 5.0;
}
@end

int main() {
Foo *foo = nil;
NSLog(@"blah = %f", [foo blah]);
return 0;
}

2006-02-11 21:20:20.948 nil[3441] blah = 0.000000

So, it looks like messages that return a float return 0.0 like we'd expect. Wrong! Change the test code as indicated:

void g(float f) {}
int main() {
g(2.0);

2006-02-11 21:22:47.094 nil[3452] blah = 2.000000

Ah-ha! Now the return value for messaging our nil object was 2.0! So, it looks like the return value in this case is whatever value happens to be in the appropriate floating point register.


Interesting! So what does it mean?


All this neat stuff means that it *is* safe to send a message to nil when:

  • The method is declared to return a pointer

  • The method is declared to return an integer type whose size is less than or equal to sizeof(void *) (4 bytes, i.e. 32 bits, on a 32-bit machine)



and it is NOT safe when

  • The method returns any floating point value

  • The method returns an integer type whose size is greater than sizeof(void *) (for example, a long long on a 32-bit machine)



Also, it's usually *not* safe to message nil when the message returns a structure.

Conclusion



The ability to send messages to nil is an incredibly cool and powerful feature of Objective-C, but it may not always do what you intend. I've read that Apple is trying to standardize the behavior of messaging nil (they'll likely guarantee that it will "always" return a zero value), but this is currently not the case.

For more info check out these docs.

*DISCLAIMER: I've simplified a few things here to make this more understandable. I also did not cover issues related to messaging nil on Intel chips. Maybe I'll leave some of these things for future posts. If you have questions about any of this, or simply think I'm wrong about something, please post a comment. I'll get back to you as soon as possible. I love to discuss this stuff :-)