Tuesday, December 26, 2006

Lists and Queues

Linked lists are often among the first data structures we learn about in computer science. They're simple in concept as well as implementation; one could be written in the time allotted to a technical interview, for example. However, computer scientists and engineers are often lazy, meaning that they don't want to do more work than is needed. Thankfully, with 4.4BSD came the `sys/queue.h` header file, which contains many useful functions for dealing with singly- and doubly-linked lists, tail queues, and circular queues. These functions (actually, they are implemented as C macros) are described in the `QUEUE(3)` man page.

In this post, we'll take a quick look at a few of the functions required to use these structures. We'll work through a small (and virtually useless) example that uses a doubly-linked list to store even numbers. The first thing we'll do is focus on the central data structure that we want to be "linked" together. In our example, we will create a simple data structure for holding one even number.

```
struct EvenNumber {
  int val;
};
```

In order to add this structure to a list, we need to modify it slightly by adding an entry for holding pointers to the next and previous `EvenNumber` entries.

```
struct EvenNumber {
  int val;
  LIST_ENTRY(EvenNumber) numbers;
};
```

`LIST_ENTRY` is a macro that expands to a structure holding the pointers for the next and previous entries. We can see this by having `gcc` display the output from the preprocessor. (We won't look at the preprocessor output at every step in this post, but studying the output of `gcc -E` on your own can be very enlightening.)

```
$ gcc -E list.c | grep "int val" -A2 -B1
struct EvenNumber {
  int val;
  struct { struct EvenNumber *le_next; struct EvenNumber **le_prev; } numbers;
};
```

OK, we've now modified our `EvenNumber` structure so that it can be inserted into a list simply by adding a `LIST_ENTRY` field to our structure. Next, we need a way to reference the list as a whole. We do this using the `LIST_HEAD` macro that declares a new structure which holds a pointer to the first element in the list.

```
LIST_HEAD(EvenNumbersHead, EvenNumber) even_numbers_head;
```

The preprocessor will expand this to the following (reformatted for readability).

```
struct EvenNumbersHead {
  struct EvenNumber *lh_first;
} even_numbers_head;
```

We can see that the first argument to `LIST_HEAD` is the name of the new structure that we're creating, and the second argument is the name of the structure that will be linked together in the list.

Before we can use the list, we need to initialize the head by calling `LIST_INIT` with a pointer to our "head" instance, which effectively creates an empty list.

```
LIST_INIT(&even_numbers_head);
```

At this point we have a doubly-linked list that is referenced by the variable `even_numbers_head`. We may add to the list instances of the `EvenNumber` structure using the macros `LIST_INSERT_HEAD`, `LIST_INSERT_BEFORE`, or `LIST_INSERT_AFTER`.

The basic structure of our program is outlined here.

```
struct EvenNumber {
  int val;
  LIST_ENTRY(...) numbers;
};

LIST_HEAD(...) even_numbers_head;

int main(void) {
  LIST_INIT(&even_numbers_head);
  for (int i = 0; i < 100; i++) {
    ...
    LIST_INSERT_HEAD(&even_numbers_head, ...);
  }
  LIST_FOREACH(..., &even_numbers_head, numbers) {
    ...
    LIST_REMOVE(theNum, numbers);
    ...
  }
  return 0;
}
```

The full source for this example can be seen in list.c. We can compile and run the program as follows.

```
$ gcc -o list list.c -std=c99 -Wall
$ ./list
[...output omitted...]
10
8
6
4
2
0
```
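For readers without list.c at hand, here is a minimal sketch of how the outline might be filled in. The helper names `fill_evens` and `drain_evens` are my own, not from the original source, and I've assumed the list entries are allocated with `malloc`. Note that the drain loop pops the first element repeatedly instead of calling `LIST_REMOVE` inside `LIST_FOREACH`, since freeing the current entry would leave the iterator pointing at freed memory.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/queue.h>

struct EvenNumber {
  int val;
  LIST_ENTRY(EvenNumber) numbers;  /* next/prev linkage */
};

/* Declares struct EvenNumbersHead (the type only, no variable). */
LIST_HEAD(EvenNumbersHead, EvenNumber);

/* Hypothetical helper: push every even number in [0, limit) onto the
 * head of the list. Returns the number of entries added, or -1 if
 * malloc fails. */
int fill_evens(struct EvenNumbersHead *head, int limit) {
  int count = 0;
  for (int i = 0; i < limit; i++) {
    if (i % 2 != 0) continue;
    struct EvenNumber *n = malloc(sizeof(*n));
    if (n == NULL) return -1;
    n->val = i;
    LIST_INSERT_HEAD(head, n, numbers);
    count++;
  }
  return count;
}

/* Hypothetical helper: print, unlink, and free every entry from the
 * front of the list. Returns the number of entries removed. */
int drain_evens(struct EvenNumbersHead *head) {
  int count = 0;
  while (!LIST_EMPTY(head)) {
    struct EvenNumber *n = LIST_FIRST(head);
    printf("%d\n", n->val);
    LIST_REMOVE(n, numbers);
    free(n);
    count++;
  }
  return count;
}
```

To use it, declare a `struct EvenNumbersHead`, call `LIST_INIT` on it, then call `fill_evens(&head, 100)` followed by `drain_evens(&head)`. Because every insert goes to the head, the values come back largest first.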

We see that the output is in reverse order. This is because we are using `LIST_INSERT_HEAD`, which adds the new entry at the beginning of the list. If we want to display the list in increasing order (ignoring that we could simply pipe the output through `sort -n`) we could add each new entry to the end (or tail) of the list rather than the head. However, for this we should use a different data structure: a `TAILQ`. TAILQs work very similarly to the LISTs already discussed, so here's the full program source using a TAILQ rather than a LIST.
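The TAILQ listing itself isn't reproduced here, so as a stand-in, this is a rough sketch of what such a version could look like. As before, `fill_evens_tailq` and `drain_evens_tailq` are names I made up for illustration, and heap allocation is my assumption. The main differences from the LIST version are `TAILQ_INSERT_TAIL` and the extra head argument that `TAILQ_REMOVE` takes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/queue.h>

struct EvenNumber {
  int val;
  TAILQ_ENTRY(EvenNumber) numbers;  /* next/prev linkage */
};

/* Declares struct EvenNumbersHead, which tracks both first and last. */
TAILQ_HEAD(EvenNumbersHead, EvenNumber);

/* Hypothetical helper: append every even number in [0, limit) to the
 * tail of the queue. Returns the number added, or -1 if malloc fails. */
int fill_evens_tailq(struct EvenNumbersHead *head, int limit) {
  int count = 0;
  for (int i = 0; i < limit; i++) {
    if (i % 2 != 0) continue;
    struct EvenNumber *n = malloc(sizeof(*n));
    if (n == NULL) return -1;
    n->val = i;
    TAILQ_INSERT_TAIL(head, n, numbers);
    count++;
  }
  return count;
}

/* Hypothetical helper: print, unlink, and free entries front to back.
 * Returns the number of entries removed. */
int drain_evens_tailq(struct EvenNumbersHead *head) {
  int count = 0;
  while (!TAILQ_EMPTY(head)) {
    struct EvenNumber *n = TAILQ_FIRST(head);
    printf("%d\n", n->val);
    TAILQ_REMOVE(head, n, numbers);
    free(n);
    count++;
  }
  return count;
}
```

After `TAILQ_INIT(&head)`, filling and then draining prints the even numbers in increasing order, since each new entry lands after the ones already queued.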

For more information about these convenient functions, see the comments in `/usr/include/sys/queue.h` as well as the `QUEUE(3)` man page. Also note that these functions are available (and heavily used) within the kernel. However, when used in the kernel the header file to include is `/System/Library/Frameworks/Kernel.framework/Headers/sys/queue.h`.

...now, to see if my wife is finished getting ready yet...

Friday, December 15, 2006

A Kernel Extension... by hand

I generally like using Xcode. It typically does the job, and usually even makes the job easier. It shields the user from having to worry about mundane details that are all too common when building software. However, for the same reason it can increase a developer's productivity, it can also make it more difficult to understand what's actually going on. For example, Xcode has a template to create a Generic Kernel Extension. The template compiles with no modifications. The resulting kernel extension (KEXT) can then be loaded and unloaded without having to write one line of code. This is super cool, but it also hides some of the details of what a KEXT really is. So, today we'll write a "Hello, World" KEXT from scratch without the help of Xcode.

(Note that the issues above do not apply to Xcode alone; they apply to almost all IDEs.)

Well, let's just jump right in. Basically, a KEXT is a bundle on Mac OS X (it's also a "package"), which means it is a directory structure with some predefined form and an Info.plist file. Our sample KEXT will be named "MyKext.kext", and it will be in a directory structure that looks like this:

```
$ find MyKext.kext
MyKext.kext
MyKext.kext/Contents
MyKext.kext/Contents/Info.plist
MyKext.kext/Contents/MacOS
MyKext.kext/Contents/MacOS/MyKext
```

To start off, we need to make the basic directory structure.

```
$ cd
$ mkdir -p MyKext.kext/Contents/MacOS
$ cd MyKext.kext/Contents/MacOS
```

Now we can start writing our code. Our code will contain two main routines: `MyKextStart()` and `MyKextStop()`, which are called when the KEXT is loaded and unloaded respectively. It will also contain some required bookkeeping code that's needed in order to make our compiled binary proper. Our start and stop routines look like:

```
// File: mykext.c
#include <libkern/libkern.h>
#include <mach/mach_types.h>

kern_return_t MyKextStart(kmod_info_t *ki, void *d) {
  printf("Hello, World!\n");
  return KERN_SUCCESS;
}

kern_return_t MyKextStop(kmod_info_t *ki, void *d) {
  printf("Goodbye, World!\n");
  return KERN_SUCCESS;
}

// ... more to come in a minute
```

After these two functions (in the same `mykext.c` file) we need to put the required bookkeeping stuff.

```
extern kern_return_t _start(kmod_info_t *ki, void *data);
extern kern_return_t _stop(kmod_info_t *ki, void *data);

KMOD_EXPLICIT_DECL(net.unixjunkie.kext.MyKext, "1.0.0d1", _start, _stop)
__private_extern__ kmod_start_func_t *_realmain = MyKextStart;
__private_extern__ kmod_stop_func_t *_antimain = MyKextStop;
__private_extern__ int _kext_apple_cc = __APPLE_CC__;
```

This stuff basically declares some needed structures, and it also sets up our routines (`MyKextStart()` and `MyKextStop()`) so that they're called on load and unload, by assigning them to the `_realmain` and `_antimain` symbols respectively.

OK, now comes the tricky part: the compile. KEXTs are compiled statically; they can only use certain headers that are available in the kernel, and they can't link with the standard C library. These requirements basically translate into a `gcc` command like the following:

```
$ gcc -static mykext.c -o MyKext -fno-builtin -nostdlib -lkmod -r -mlong-branch -I/System/Library/Frameworks/Kernel.framework/Headers -Wall
```

If the planets are properly aligned, you won't get any errors or warnings, and you'll end up with a Mach-O object file in the current directory named `MyKext`. This is your actual compiled KEXT. (At this point, it's fun to inspect this file using `otool`. For example, `otool -hV MyKext`, and `otool -l MyKext`. Read the man page for `otool(1)` for more details here.)

Now, the last thing we need to do (before we actually load this thing up) is to give our KEXT an Info.plist. The easiest way to do this is to copy another KEXT's Info.plist file, and change the names of a few things. For this example, I'm going to copy `/System/Library/Extensions/webdav_fs.kext/Contents/Info.plist`.

```
$ cd ..
$ pwd
/Users/jgm/MyKext.kext/Contents
$ cp /System/Library/Extensions/webdav_fs.kext/Contents/Info.plist .
```

Now, you'll need to edit the file and change the value of the "CFBundleExecutable" key to MyKext, and the value of "CFBundleIdentifier" to net.unixjunkie.kext.MyKext (or whatever you set that value to in your `mykext.c` file).

Okay, it's show time. To load any KEXT, all files in the KEXT must be owned by `root` and be in group `wheel`. The files must also have certain permissions in order to load. Here are the steps to load the KEXT.

```
$ cd /tmp
$ sudo -s
# cp -rp ~/MyKext.kext .
# chown -R root:wheel MyKext.kext
# chmod -R 0644 MyKext.kext
# kextload -v MyKext.kext
kextload: extension MyKext.kext appears to be valid
kextload: loading extension MyKext.kext
kextload: sending 1 personality to the kernel
kextload: MyKext.kext loaded successfully
# tail -1 /var/log/system.log
Dec 15 20:15:47 jgm-mac kernel[0]: Hello, World!
```

We can see that our `MyKextStart()` was called. Now let's unload it and see what happens.

```
# kextunload -v MyKext.kext
kextunload: unload kext MyKext.kext succeeded
```

Wahoo! It looks like we made a kernel extension with our own 10 fingers, and it worked! :-)

That was fun. Check out Amit Singh's Mac OS X Internals book for more cool bits about what that required "bookkeeping" stuff was in our source file, and why it's required.

Monday, November 20, 2006

Too Much Comment Spam Lately

Hey folks. I started getting tons of comment spam on this blog lately, so I had to disable the comments. Requiring a valid blogger login didn't help, nor did a captcha. Hopefully, I'll be able to turn them back on shortly.

In the meantime, if anyone has a burning comment that they can't hold on to, you can email me at:

`$(echo tert.havkwhaxvr@arg | tr a-z@. n-za-m.@)`

Monday, October 30, 2006

UPDATE: The `char *apple[]` Argument Vector

Way back in February I posted about the `char *apple[]` argument vector, i.e. the "secret" 4th parameter to all binaries executed on Mac OS X. I just wanted to update that post with one small piece of information.

The value of `apple[0]` isn't always the path to the executed binary image on disk. Specifically, if a symlink points to the file, `apple[0]` will refer to the symlink. For example:

```
$ cat apple.c
#include <stdio.h>
int main(int argc, char *argv[], char *envp[], char *apple[]) {
  printf("apple[0] = %s\n", apple[0]);
  return 0;
}
$ gcc -o apple apple.c
$ ./apple
apple[0] = ./apple
$ ln -s apple foo
$ ./foo
apple[0] = ./foo
```

Ideally, what we'd like is the path to the executing binary image, even if it was executed via a symlink. We can do this using `apple[0]` and the function `realpath(3)` to resolve all the symlinks. For example:

```
$ cat apple.c
#include <sys/param.h>
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char *argv[], char *envp[], char *apple[]) {
  char resolved[PATH_MAX];
  realpath(apple[0], resolved);
  printf("resolved apple[0] = %s\n", resolved);
  return 0;
}
$ gcc -o apple apple.c
$ ln -s apple foo
$ ./apple
resolved apple[0] = /Users/jgm/apple
$ ./foo
resolved apple[0] = /Users/jgm/apple
```

Saturday, October 14, 2006

Google Calculator from the Command Line

I'm not sure if you're aware of Google's built-in calculator, but it's totally awesome. It does all sorts of basic calculations, almost any unit conversion you can dream up, and for you programmers, it's great for doing base conversions. But opening a web browser to www.google.com just to make sure you converted `0x34` to decimal in your head correctly isn't always convenient (this is of course assuming that no other calculators exist, such as `bc` ;-) ).

So, I threw together a very simple command line interface to Google Calculator -- aptly named `gcalc`. It's also a good example of how Cocoa's powerful classes can let you whip up useful tools quickly, much like a scripting language. The help screen shows some example usages:

```
$ gcalc
gcalc version 0.1 by Greg Miller
Usage: gcalc [-d] <calculator query>
example:  gcalc "5+2*2"
example:  gcalc 5!
example:  gcalc "sqrt(-4)"
example:  gcalc "160 pounds * 4000 feet in calories"
example:  gcalc avogadros number
example:  gcalc 0b110111010 + 0x33 in decimal
example:  gcalc 22 lira in yen
example:  gcalc 2 to the power of 5
```

Or if you just want to glance at the source, you can check it out here:

Saturday, October 07, 2006

LaunchServices From a `root` Daemon?

At the end of this post I mention that I'd like to find a way to start a process in the console user's session, from a root process in the startup item context. I do not know of a documented way to do this, but to be clear, there is currently (at least as of 10.4.8) a way to do it simply using LaunchServices.

LaunchServices is the recommended way to launch an application on OS X. When you double-click an item in the dock, it's launched using LaunchServices. When you double-click on a PDF file, it is LaunchServices that figures out that Preview.app is the best candidate to handle that file type (because you certainly don't have Adobe Acrobat installed); it then opens Preview.app and tells it to open said file.

LaunchServices is a sub-framework of the ApplicationServices umbrella framework, which is not daemon safe according to TN2083. "Daemon safe" means that daemon processes running in the root bootstrap context (aka startup item context) are allowed to link with and use the framework. One reason a framework would not be daemon safe is if it uses the WindowServer process.

The WindowServer is not only in charge of managing windows on screen, it's also intimately involved with process management (with some help from the `loginwindow` process). As a matter of fact, when you launch an application it will ultimately be the WindowServer that does a `fork()` and `exec()` to start your process. You can use `ps` to see that the `WindowServer` is indeed the parent for most of your processes.

```
$ ps jaxww | grep WindowServe[r]
windowse    59     1    59  .../CoreGraphics.framework/Resources/WindowServer -daemon
$ ps jx | awk '{if ($3 == 59) print}'
jgm    132    59    ... /System/Library/CoreServices/Dock.app/Contents/MacOS/Dock -psn_0_393217
jgm    134    59    ... /System/Library/CoreServices/SystemUIServer.app/Contents/MacOS/SystemUIServer -psn_0_524289
jgm    136    59    ... /System/Library/CoreServices/Finder.app/Contents/MacOS/Finder -psn_0_655361
jgm    139    59    ... /Applications/Google Notifier.app/Contents/MacOS/Google Notifier -psn_0_786433
jgm    197    59    ... /Applications/Utilities/Terminal.app/Contents/MacOS/Terminal -psn_0_1572865
jgm    257    59    ... /Applications/Safari.app/Contents/MacOS/Safari -psn_0_2490369
```

(Note that the Parent Process field in `Activity Monitor` reports the `Dock` as the parent of processes that were launched by clicking their dock icon. This is incorrect, as can be verified by looking at the output of `ps`.)

OK, back on topic. So we know the `WindowServer` is important to launching processes. Actually when we launch a process in typical Cocoa fashion using the `NSWorkspace` class, LaunchServices is invoked behind the scenes to send a message to the `WindowServer` via mach messages, requesting that the WindowServer `fork()` up a new process in the console user's session. So in order to use LaunchServices (or `NSWorkspace`), we need to be able to communicate with the WindowServer, which is the reason why LaunchServices (and therefore ApplicationServices and therefore Cocoa) is not "daemon safe".

However, as TN2083 points out, there is a "global window server" reference available in the startup item context. This window server reference appears to be an artifact from the past as Quinn explains in the Technote:

The reasons for this non-obvious behavior are lost in the depths of history. However, the fact that this works at all is pretty much irrelevant because there are important caveats that prevent it from being truly useful.

But the fact remains that there is a reference, and we can see it with the `BootstrapDump` command.

```
$ sudo /usr/libexec/StartupItemContext ~/bin/BootstrapDump  | grep Window
"com.apple.windowserver" by "/System/Library/.../Resources/WindowServer"
```

However, the only users allowed to connect to the WindowServer process are `root` and the console user. (The console user is the "current" user in a fast-user-switched environment. The console user is the owner of the `/dev/console` device.) But if we're running as `root` we should be able to connect to the window server using standard LaunchServices calls. Let's try.

```
$ cat launch.m
#import <Cocoa/Cocoa.h>
int main(void) {
  [[NSWorkspace sharedWorkspace]
    launchApplication:@"TextEdit"];
  return 0;
}
$ gcc -o launch launch.m -framework Cocoa
$ sudo /usr/libexec/StartupItemContext ./launch
```

This will start TextEdit in my current user session. We could use `BootstrapDump` again to verify that the TextEdit process is indeed running in my session (it is). The command `/usr/bin/open` links against Cocoa, and works the same way, so I can simply type `sudo /usr/libexec/StartupItemContext /usr/bin/open -a TextEdit` to do the same thing.

As I mentioned above, only the console user and root are allowed to connect to the `WindowServer` process, so if your daemon is running as `nobody` (for example), it won't be able to do this.

```
$ sudo /usr/libexec/StartupItemContext /usr/bin/sudo -u nobody -s
$ open -a TextEdit
kCGErrorRangeCheck : Window Server communications from outside of session allowed for root and console user only
INIT_Processeses(), could not establish the default connection to the WindowServer.
Abort trap
```

(The first command above starts a shell running as user `nobody` in the startup item context.)

So, we can see that from the startup item context, as root, we are able to launch a process in the console user's session, simply using normal LaunchServices calls. But note that this is undocumented behavior, and it is not guaranteed to work at any point in the future. It merely worked in this example.

Friday, October 06, 2006

Finder's Locum

Jumping right in, let's consider this little example:

```
$ mkdir -p foo/bar
$ sudo chown -R root:wheel foo
Password:
$ rm -rf foo
rm: foo/bar: Permission denied
rm: foo: Directory not empty
```

We create two directories, `foo/` and a subdir `bar/`. We change both of these directories to be owned by `root` and in group `wheel`. Then, as a non-root user, we try to recursively delete `foo/`, and not too surprisingly it fails.

Notice that an error is displayed for `foo/bar` before the error for `foo/`. This is because the system call to remove a directory -- `rmdir(2)` -- requires the directory to be empty before it can be removed. This means that directory hierarchies are removed in a depth-first order. In order for `foo/` to be removed, it must be empty, so to make it empty we must remove `foo/bar/`, etc.

As a quick aside, removing a file (or directory) in Unix does not require write permission to the file! Let me repeat that. You can remove a file if you have write access to the directory in which the file resides -- even if `root` owns the file. Quick example:

```
$ mkdir test
$ cd test
$ touch hi.txt
$ sudo chown root:wheel hi.txt
Password:
$ ls -al
total 0
drwxr-xr-x    3 jgm   jgm     102 Oct  6 23:13 ./
drwxr-xr-x   31 jgm   jgm    1054 Oct  6 23:13 ../
-rw-r--r--    1 root  wheel     0 Oct  6 23:13 hi.txt
$ rm hi.txt
$ ls -al
total 0
drwxr-xr-x    2 jgm  jgm    68 Oct  6 23:15 ./
drwxr-xr-x   31 jgm  jgm  1054 Oct  6 23:13 ../
```

This shows that I can delete a `root` owned file as long as I have write access to the directory. (See `chmod(2)` for details about how the sticky bit on a directory affects this behavior [this is why shared tmp directories generally have the sticky bit set].)

Thinking back to the original issue, this also explains why we were unable to `rm -rf foo`. `foo/bar/` needed to be deleted first, but in order to delete it we need write access to its parent directory, `foo/`, which was owned by `root` and not writable by us. So, all that makes sense now.

Interestingly, if we use the Finder and drag the folder `foo/` to the trash, we are able to empty the trash. We're not prompted for a password, it just works. How does the Finder do this?

Well, the Finder application links with a private Apple framework named `DesktopServicesPriv.framework`, which helps with taking out the trash. Bundled as a resource in the framework is a setuid root binary named `Locum`, which the Finder uses to delete files that it normally wouldn't have access to delete.

```
$ cd /System/Library/PrivateFrameworks/DesktopServicesPriv.framework/
$ ls -l Resources/Locum
-rwsr-xr-x   1 root  wheel  108940 Mar 26  2006 Resources/Locum*
```

And we can watch what happens when we drag `foo/` to the trash then empty it, using `fs_usage`.

```
$ sudo fs_usage | grep Locum
...
23:33:48  rmdir   /.vol/234881026/2148678/bar   0.000303   Locum
23:33:48  rmdir   /.vol/234881026/2147406/foo   0.000264   Locum
...
```

Now the question is, is this secure and safe? Well, it's probably fine. I imagine that Apple has rigorously tested and reviewed Locum. Conceivably, it has smarts to guarantee that it only deletes things out of a `.Trashes` folder. It possibly even does a little handshake to guarantee that its parent is a Finder process. This is all speculation, but the point is, I don't immediately see any big holes here (though this is very different than the behavior one would find on a typical Unix system).

I think the only remaining question is, what the hell does Locum mean? Thankfully we have Wikipedia to help us out: http://en.wikipedia.org/wiki/Locum.

Friday, September 29, 2006

User Notification from the Startup Context

Mac OS X is a great Unix, but since it's a combination of Mach and BSD it has some parts that are new to many traditional Unix users. All Unix systems have the concept of sessions with regard to collections of processes (see `getsid(2)` for more details), but Mac OS X adds an additional, and very different, concept of a session.

For a quick introduction to sessions on Mac OS X, see TN2083, and for the best description see Amit Singh's Mac OS X Internals book (chapters 5 & 9). In a nutshell, when the system first boots it has one session -- the root session, or the startup context. All processes started in this session will themselves be in this session. `launchd`, `syslogd`, `configd`, and all system daemons run in the startup context. `launchd` starts `loginwindow.app` to handle user logins, but the loginwindow app also creates a new session for each user as they log in. So, every user on the system runs in their own session, and every user's session is different from the startup context (a user's session is also different each time they log in).

Apple provides a sample program called BootstrapDump that will show you all the mach services that are visible in a given context. For example, we can download and compile `BootstrapDump.c` with:

```
$ curl -s http://developer.apple.com/samplecode/BootstrapDump/BootstrapDump.zip > tmp.zip
$ unzip -q tmp.zip
$ cd BootstrapDump/
$ gcc -o BootstrapDump BootstrapDump.c
$ ./BootstrapDump
...
"com.apple.PowerManagement.control"
"com.apple.SystemConfiguration.PPPController"
"com.apple.network.EAPOLController"
"com.apple.network.IPConfiguration"
"com.apple.windowserver.active"
...
```

This program is very useful for debugging and exploring the system. We can also see the difference in the services running in my login session vs the services available to the startup context.

```
$ ./BootstrapDump | wc -l
     227
$ sudo /usr/libexec/StartupItemContext ./BootstrapDump | wc -l
      42
```

Daemons are background processes, so they should very rarely need to display a notice to a user, but occasionally the need arises. Since only the user who's currently sitting at the console is allowed access (according to documentation) to the window server, how can daemons (or kernel extensions) notify the user of certain important events? This question really boils down to: how can something running in the startup context display a notification to the console user in the console user's context?

As it turns out, Apple has provided us with three solutions to this problem:

1) CFUserNotification
2) Libunc
3) KUNC

The first two are very similar (almost copy-n-paste identical), and the third is built on the first one. I won't go into detail about using these APIs, rather I'll talk about how they work. See the available documentation and header files for usage details about the APIs.

These APIs are NOT intended for use in regular applications. Rather applications that the user launches should use normal Carbon/Cocoa methods to display windows or alerts (NSRunAlertPanel, etc).

A CFUserNotification is a notification intended to be presented to a user at the console (if one is present). This is for the use of processes that do not otherwise have user interfaces, but may need occasional interaction with a user.

One of the convenience functions available to work with CFUserNotifications is `CFUserNotificationDisplayAlert()`, which is a blocking call that simply displays an alert window on the console user's screen. If we look at the source, we can see that the function `CFUserNotificationSendRequest()` is eventually called and it works by sending a mach message to the mach port named "com.apple.UNCUserNotification". This service may then be responsible for displaying the message. If we look at this service with `BootstrapDump` we can see that the service indeed exists, and that messages sent to this service will cause the `UserNotificationCenter.app` application to be launched "on demand".

```
$ BootstrapDump | grep UserNo
"com.apple.UNCUserNotification" by "/System/Library/CoreServices/UserNotificationCenter.app/Contents/MacOS/UserNotificationCenter" on demand
```

Let's see all this in action with a test program:

```
$ cat cfu.m
#import <CoreFoundation/CoreFoundation.h>
int main(void) {
  CFUserNotificationDisplayAlert(0, 0, NULL, NULL, NULL,
    CFSTR("header"), CFSTR("message"), CFSTR("default button"),
    CFSTR("alt button"), CFSTR("other button"), NULL);
  return 0;
}
$ gcc -o cfu cfu.m -framework CoreFoundation
$ sudo /usr/libexec/StartupItemContext ./cfu
```

Then in another shell use `ps` and look for a process named UserNotificationCenter, and make note of its PID, then use BootstrapDump to see what its context looks like:

```
$ ps aux | grep UserNo[t]
jgm        740   0.0 -0.2   230132   4696  ??  Ss    4:39PM   0:00.22 .../Contents/MacOS/UserNotificationCenter
$ sudo BootstrapDump 740 | wc -l
     233
$ BootstrapDump $$ | wc -l
     233
$ diff <(sudo BootstrapDump 740 ) <(BootstrapDump $$)
```

We can see that the UserNotificationCenter process was indeed started on demand, and that its mach context looks just like mine. So we see that calling `CFUserNotificationDisplayAlert()` sends a mach message to the mach port named "com.apple.UNCUserNotification", which causes `UserNotificationCenter.app` to be launched on demand to handle the request (think of this as how inetd starts servers on demand).

The next question is, how does this work? And who's listening on the mach port named by "com.apple.UNCUserNotification" before `UserNotificationCenter.app` is launched on demand, i.e., who starts `UserNotificationCenter.app`? A little poking around on the system indicates that `loginwindow.app` knows about the "com.apple.UNCUserNotification" mach service, and it also knows the details of my mach session, so `loginwindow.app` is the most likely candidate.

```
$ cat /System/Library/CoreServices/loginwindow.app/Contents/MacOS/loginwindow | strings | grep UserNo
UNCUserNotification
com.apple.UNCUserNotification
/System/Library/CoreServices/UserNotificationCenter.app/Contents/MacOS/UserNotificationCenter
```

(See this post to see why I use `cat` before `strings`.)

So to recap the whole thing: a daemon process running in the startup context can create a CFUserNotification, which will send a mach message to "com.apple.UNCUserNotification". `loginwindow.app` will notice this and fork and exec `UserNotificationCenter.app` in the correct user session to actually handle displaying the user notification window.

Libunc - in libSystem

This is a system level re-implementation of the same functionality provided by CFUserNotification. Note that it does not use CFUserNotification to do its job, rather it's actually a reimplementation of it. The source is available here.

KUNC

As strange as it may seem to some, the situation may even arise where a kernel extension may need to display a notification to a user. And sure enough, Apple has provided us with this functionality too. The KUNC APIs can be used to do just this. Apple has some documentation about these APIs here, and the source is available as part of xnu here.

Code running within the kernel may call an API, for example, `KUNCUserNotificationDisplayNotice()`, that will display a notification window on the console user's screen (and running in their context). This works very much like CFUserNotifications, and as a matter of fact, it is built on the CFUserNotification API. The main difference is that KUNC uses another userland daemon (`/usr/libexec/kuncd`) to transform KUNC API calls to CFUserNotification API calls. Presumably, this is done because kernel extensions can't link against CoreFoundation.

The kernel maintains a special mach port called the user notification port. This port can be set and retrieved using the MIG calls `host_set_UNDServer()` and `host_get_UNDServer()`. When `launchd` is starting up upon boot, it runs the program `/usr/libexec/register_mach_bootstrap_servers` to register old-style mach servers defined in `/etc/mach_init.d`. One of these servers is `/usr/libexec/kuncd` (registered with `/etc/mach_init.d/kuncd.plist`). When this server is registered, `register_mach_bootstrap_servers` calls `host_set_UNDServer()` to set the host user notification port to a port that will start `/usr/libexec/kuncd` on demand. As we can see from `BootstrapDump` (and `kuncd.plist`) the service is named "com.apple.system.Kernel[UNC]Notifications".

```
$ BootstrapDump | grep kunc
"com.apple.system.Kernel[UNC]Notifications" by "/usr/libexec/kuncd" on demand
```

After this, KUNC APIs called from within the kernel (note: KUNC can only be called from within the kernel) send a message to the host user notification port, and `/usr/libexec/kuncd` is started on demand to handle the request. `kuncd` itself is started by `launchd` and runs in the startup item context, but it simply makes CFUserNotification calls which then handle the rest of the request exactly as described above.

KUNC actually has one more interesting function: `KUNCExecute()`, which takes a path to an executable, a UID, and a GID. It doesn't make much sense to have the kernel `fork()` and `exec()` like a normal process, so it's interesting to consider how this works. Basically, `KUNCExecute()` again sends a mach message to the host user notification port, `/usr/libexec/kuncd` answers that request, but this time CFUserNotification is not used. Rather the execution is handled by a call to `_SCDPExecCommand()` from the SystemConfiguration framework. This method simply does a `fork()`, `setuid()`, and `execv()`, of the binary. The key point here is that the program run with `KUNCExecute()` is NOT run in a user's login context, rather it's just run in the startup context like `launchd` and other daemons.
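The `fork()`, `setuid()`, `execv()` sequence is a common Unix pattern, and a generic sketch of it looks something like the following. To be clear, this is not the source of `_SCDPExecCommand()`; the function name `run_as` is hypothetical, and the `setgid()` call is my own assumption based on `KUNCExecute()` taking a GID. A real daemon would also want to deal with supplementary groups and the child's environment, which this sketch ignores.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run path with argv as the given user and group.
 * Returns the child's exit status, or -1 on failure. */
int run_as(const char *path, char *const argv[], uid_t uid, gid_t gid) {
  pid_t pid = fork();
  if (pid < 0) return -1;

  if (pid == 0) {
    /* Child: change the group first, because once we have become a
     * non-root user we may no longer be allowed to change it. */
    if (setgid(gid) != 0 || setuid(uid) != 0) _exit(126);
    execv(path, argv);
    _exit(127);  /* only reached if execv() failed */
  }

  /* Parent: wait for the child and report how it exited. */
  int status;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Nothing in this sequence involves a login session. The child simply inherits whatever context the calling daemon runs in, which fits the observation that programs started this way stay in the startup context.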

All of these notification methods can be visualized by considering this figure:

However, note that the libunc/libSystem approach mentioned above is not pictured. If it were, it would show up as another framework (library) in magenta similar to CoreFoundation in the figure.

I was recently asked how a daemon could display a notification to a user, and I wasn't sure. But after a little poking around, it appears that there are a few good options. I had also hoped to find a way to run a command as the console user, but I didn't find that. If someone knows how to do that, I'd love to know. Also, if anyone sees any inaccuracies in what I've written, please let me know.

[UPDATE: 10/7/2006:

Monday, September 25, 2006

Interesting behavior of the `strings` command

I got a little frustrated with `strings` the other day, so I figured I'd share. `strings` appears to try to be "smart" about handling fat (universal) binaries by only processing the binary for the host architecture. This means that if you run `strings /some/binary/file` it may not actually show you all the strings in the file if the binary is fat. An alternative is to instead use `cat /some/binary/file | strings` or you can use the undocumented `-arch` option, like `strings -arch all /some/binary/file`.

For example:

```
$ cd /Applications/Camino.app/Contents/MacOS/
$ file Camino
Camino: Mach-O fat file with 2 architectures
Camino (for architecture i386): Mach-O executable i386
Camino (for architecture ppc):  Mach-O executable ppc
$ strings Camino | wc -l
   27224
$ strings -arch ppc Camino | wc -l
   27224
$ strings -arch i386 Camino | wc -l
   27654
$ strings -arch all Camino | wc -l
   54878
```
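It helps to remember that a fat file is just a small big-endian header followed by one complete Mach-O image per architecture, so any tool has to decide which slices to look at. Here's a rough sketch of reading that header. The function names are made up for illustration, and the code assumes the standard layout: the magic `0xcafebabe`, a slice count, then one 20-byte record per architecture whose first field is the CPU type.

```c
#include <stddef.h>
#include <stdint.h>

#define FAT_MAGIC_BE 0xcafebabeu   /* fat header magic, stored big-endian */
#define FAT_HEADER_SIZE 8          /* magic + slice count */
#define FAT_ARCH_SIZE 20           /* cputype, cpusubtype, offset, size, align */

/* Read a 32-bit big-endian value from a byte buffer. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Return the number of architecture slices described by the header,
 * 0 if the buffer does not start with the fat magic (a thin file),
 * or -1 if the buffer is too short to hold all the slice records.
 */
int fat_arch_count(const unsigned char *buf, size_t len)
{
    if (len < FAT_HEADER_SIZE || be32(buf) != FAT_MAGIC_BE)
        return 0;
    uint32_t n = be32(buf + 4);
    if (n > (len - FAT_HEADER_SIZE) / FAT_ARCH_SIZE)
        return -1;
    return (int)n;
}

/* CPU type of slice i; the caller must ensure i < fat_arch_count(). */
uint32_t fat_arch_cputype(const unsigned char *buf, size_t i)
{
    return be32(buf + FAT_HEADER_SIZE + i * FAT_ARCH_SIZE);
}
```

For the Camino binary above, this would report two slices; in the Mach-O headers CPU type 7 is i386 and 18 is PowerPC.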

Monday, September 04, 2006

Software Design Patterns

Why they are good and how to write them.

Wednesday, August 30, 2006

Pruning Empty Directories

Just a quickie here. I had a fairly large directory structure that was pretty deep and had many empty directories, where an "empty" directory may still have subdirectories which are themselves empty (i.e. if I did `mkdir -p foo/bar/baz`, `foo/` would be "empty"). The quickest way I found to clean up all the empty directories was to have `find` do a depth-first traversal (`-d`) of the directory structure, then use `rmdir`, which only deletes empty directories. Something like:

`\$ find -d . -print0 | xargs -0 rmdir 2> /dev/null`

Sunday, August 20, 2006

APUE2e Acknowledgement

It's no secret that Advanced Programming in the UNIX Environment, by W. Richard Stevens and Stephen A. Rago, is a staple for all developers who write code for any flavor of Unix. If you don't have at least one copy already, I'd strongly encourage you to pick one up.

That said, don't forget to check out the book's website at apuebook.com. And while you're there, check out item 18 on the "Additional Acknowledgements" page ;-) Anyway, here's a little bit more detail about that particular issue.

The SUSv3 (Single Unix Specification version 3) states that... "If `connect()` fails, the state of the socket is unspecified. Conforming applications should close the file descriptor and create a new socket before attempting to reconnect." And as an example, retrying `connect()` doesn't always work on Darwin 8.6.0 and FreeBSD 6.0-RC1 (the only versions of these OSes that I checked).

The case I found where retrying `connect()` doesn't work is when I try to `connect()` to a port that's not listening. The client (calling `connect()`) sends the SYN, a RST is received (as expected) and `connect()` returns -1 with errno set to ECONNREFUSED. This is all as expected. However, if that same socket is used to attempt the `connect()` again, no packets are sent and `connect()` immediately fails with EINVAL. This code illustrates:

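```
#include <arpa/inet.h>    /* inet_pton() */
#include <netinet/in.h>   /* struct sockaddr_in */
#include <stdio.h>        /* perror() */
#include <strings.h>      /* bzero() */
#include <sys/socket.h>   /* socket(), connect() */
#include <unistd.h>       /* sleep() */

int main(void) {
  struct sockaddr_in remote_addr;
  bzero(&remote_addr, sizeof(remote_addr));
  remote_addr.sin_family = AF_INET;
  remote_addr.sin_port = htons(3333);
  inet_pton(AF_INET, "127.0.0.1", &remote_addr.sin_addr);

  int sock = socket(AF_INET, SOCK_STREAM, 0);

  /* The same socket is reused after every failure: the non-portable part. */
  while (connect(sock, (struct sockaddr *)&remote_addr,
                 sizeof(remote_addr)) == -1) {
    perror("failed to connect");
    sleep(2);
  }
  ...
}
```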

Again, on Darwin and FreeBSD the second time through the while-loop, EINVAL is immediately returned. And since no packets are actually sent, if the port at 127.0.0.1:3333 ever does open up, it will not be detected.

On the 2.4 Linux kernel I tested, the code does what I initially expected and it returns ECONNREFUSED every time.

Since the SUSv3 says that a failed `connect()` leaves the socket in an unspecified state, I don't think this is actually a bug. But it looks like it also means that the `connect_retry()` code in figure 16.9 (of APUE2e) is not portable.

So, to summarize, the issue is that if a `connect()` call fails for any reason, the state of the socket is undefined. To be portable, you must close the socket and create a new one before calling `connect()` again.
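As a sketch of what the portable version might look like (this is not the code from APUE2e or its FAQ; `connect_retry_fresh()` and `loopback_addr()` are hypothetical names), the loop below creates a new socket for every attempt and closes it on failure:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Build an IPv4 loopback address for the given port (host byte order). */
struct sockaddr_in loopback_addr(unsigned short port)
{
    struct sockaddr_in a;
    memset(&a, 0, sizeof a);
    a.sin_family = AF_INET;
    a.sin_port = htons(port);
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return a;
}

/*
 * Try to connect up to max_tries times, using a fresh socket for every
 * attempt. Returns a connected descriptor (owned by the caller), or -1.
 */
int connect_retry_fresh(const struct sockaddr *addr, socklen_t addrlen,
                        int max_tries, unsigned int delay_sec)
{
    for (int attempt = 0; attempt < max_tries; ++attempt) {
        int sock = socket(addr->sa_family, SOCK_STREAM, 0);
        if (sock < 0)
            return -1;
        if (connect(sock, addr, addrlen) == 0)
            return sock;

        /* The socket's state is unspecified after a failed connect(),
         * so throw it away instead of reusing it. */
        close(sock);
        if (attempt + 1 < max_tries)
            sleep(delay_sec);
    }
    return -1;
}
```

With the example above, the call would be something like `connect_retry_fresh((struct sockaddr *)&remote_addr, sizeof(remote_addr), 5, 2)`. Because each attempt sends a real SYN, a port that opens up later will be noticed.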

When I emailed Stephen Rago about this, he was very responsive and nice. He feels that this bug lies with the sockets implementation, but he added an FAQ on the book's website about it anyway.

Again, if you're reading this blog, and you don't already have a copy of this book, you should probably go get one now.

Thursday, August 17, 2006

Old (but useful) Shell Tricks

I used to have a somewhat long list of somewhat interesting Unix/Shell tips-n-tricks on an old version of my website. A few folks asked me where it was, and as it's no longer available, I figured I'd repost a handful of the tips:

1. From within an executing script, how do I find the directory where the script lives?

A simple `pwd` doesn't work because it gives the Present Working Directory, and we want to know the directory where the script actually resides on disk. I can't remember the previous solution I came up with, but here's another solution that should work:
```
#!/bin/bash
dir=$(dirname $(echo $0 | sed -e "s,^\([^/]\),$(pwd)/\1,"))
echo I live in $dir
```

Here's how it works. `$0` is the name of the script as it was executed, so this may be `foo.sh`, `./foo.sh`, `/tmp/foo.sh`, etc. This gets sent to `sed`, which uses a basic regular expression that says "if the first character of `$0` is NOT a forward slash, then prepend `$0` with my present working directory (`$(pwd)`) followed by whatever that first character was (`\1`)". Finally, we take the `dirname` of this value and assign it to `dir`. If the first character in `$0` is a forward slash, then we were invoked via an absolute path and so we don't want to change anything.

2. How do I copy a directory structure from one machine to another?

`\$ tar cf - some_directory | ssh kramer "( cd /path/to/destination; tar xf - )"`

`tar cf - some_directory` creates a tar file of `some_directory`, but the dash (`-`) tells tar to write to STDOUT instead of writing to an actual file on disk. The STDOUT from the first tar command is piped to STDIN of the next command. The right hand side of the pipe says to log into the host `kramer` using `ssh` and run the commands `cd` and `tar xf -`. The trick is with the commands `"( cd /path/to/destination; tar xf - )"`. The parens create a subshell, in which the current directory is changed to `/path/to/destination`, and `tar xf -` reads from STDIN and extracts the tar file. This STDIN is the same STDIN that was sent to us over the pipe from STDOUT of the first tar command. Thus the directory structure on `jerry` gets `tar`'d up, transferred to `kramer`, then extracted all in one fell swoop.

3. How do I diff two files on different machines (using Bash)?

`\$ diff <(ssh -n george cat /etc/passwd) <(ssh -n kramer cat /etc/passwd)`

This is just the Bash Process Substitution trick.

4. How can I run a shell script on a remote host without copying the script out?

One way would be:
`jerry:~\$ cat myscript.sh | ssh kramer /bin/sh`

How this one works is pretty simple, but it is often overlooked as an option for getting things done. The shell script is written to STDOUT. `/bin/sh` is executed on the remote server, and it reads `myscript.sh` on STDIN, thus executing the local copy of the script. This is very convenient for some things. The only problem I see with this one is that you can't pass command line arguments to the script the usual way (though invoking the remote shell as `/bin/sh -s arg1 arg2` should set the script's positional parameters).

5. How can I run long pipe lines of commands on a remote host via SSH without escaping all the meta characters?

```
jerry:~$ ssh kramer <<EOF
ps -ef | grep http | awk '{print \$NF}'
EOF
```

The only tricks here are the use of a bash here document, and the fact that the command line is typed directly into `ssh`'s STDIN so there's no need to escape things like pipes and semi-colons, etc. However, notice that you do still need to escape dollar signs because they'll still be interpreted as shell variables.

6. How do I change every occurrence of a string in multiple files?

Why, use perl pie!
`perl -p -i -e 's/jerry/george/g' *.txt`

See `perl -h` for a description of the flags.

Friday, August 11, 2006

A quick read of Mac OS X Internals

I often hear people comment on how thick the Mac OS X Internals book is. Well, it is thick, but not too thick. I've already read it cover-to-cover twice, and will show that it's even possible to tackle in one sitting. Check it out. ;-)

Sunday, August 06, 2006

WWDC 2006

I'm getting ready for WWDC which starts tomorrow. I'm very excited, and hopefully it'll fuel some interesting posts coming up.

Friday, August 04, 2006

Tracing Objective-C Messages

Tools like `strace`, `ltrace`, `truss`, `ktrace`, etc, are very cool, and necessary if you really want to understand how things work. They allow you to watch what a process is doing by showing you when certain functions are called. It would also be really cool if we could see similar information as Objective-C messages are sent.

So, I read through the Objective-C runtime code and discovered a way. A few days later I found a good blog post by Dave Dribin here that outlines the basic idea that I had used. However, his solution requires you to recompile `libobjc.dylib`, which is undesirable as well as unrealistic in many cases.

Please take a few moments to read his post (again, here), then come back and read the rest of this...

...

OK, as he explains, the symbol that we want access to "`_logObjcMessageSends`" isn't exported (remember, `nm` showed it as a little "t") so he rebuilds the libobjc dylib in order to export the symbol. I'd like to propose an alternate solution that doesn't require touching `libobjc.dylib`.

Rather than looking up the symbol address using `dlsym()`, we should use the often overlooked `nlist(3)` function, which will return us the address of "private" symbols. So, in our dylib that we want to insert with `DYLD_INSERT_LIBRARIES`, we could have code like:

```
...
typedef int (*ObjCLogProc)(BOOL, const char *, const char *, SEL);
typedef int (*LogObjcMessageSendsFunc)(ObjCLogProc);

struct nlist nl[2];
bzero(&nl, sizeof(struct nlist) * 2);
nl[0].n_un.n_name = "_logObjcMessageSends";

if (nlist("/usr/lib/libobjc.dylib", nl) < 0 || nl[0].n_type == N_UNDF) {
  fprintf(stderr, "nlist(%s, %s) failed\n",
          "/usr/lib/libobjc.dylib",
          nl[0].n_un.n_name);
  return;
}

LogObjcMessageSendsFunc fcn = (LogObjcMessageSendsFunc) nl[0].n_value;
(fcn)(&MyLogObjCMessageSendFunction);
...
```

This code uses `nlist()` to look up the address of `_logObjcMessageSends`. The symbol it's looking up happens to be "private", but that's OK. Then once it has the address of the symbol, it casts it to a pointer to a function with the correct signature. Once that's done, the new function pointer is used just like any ol' function.

So, this solution works just like Dave Dribin's, but it doesn't require a recompile of the Objective-C runtime.

Saturday, July 29, 2006

Access `argc` and `argv` from Anywhere

Say you're in some random function that's stuck deep down in the middle of some big program you're writing on the Mac. And now assume that you'd like to have access to the arguments that were passed to `main()` when the program started. How can you do this?

Well, as it turns out, this work has already been done for us. Let's have a look inside `/usr/lib/libSystem.B.dylib` shall we.

```
$ nm /usr/lib/libSystem.B.dylib | grep _NSGetArg
...
9003a382 T __NSGetArgc
90020f60 T __NSGetArgv
...
```

Ahh, so it looks like we found some symbols (functions) whose names look very revealing (man `nm` to see more details on what the output means above). The C compiler automatically prepends a leading underscore to symbol names, so the actual function names should be: `_NSGetArgc(void)` and `_NSGetArgv(void)`. Let's pick one and try to use it.

Since the function `_NSGetArgc()` is not declared in any header files (that I know of), we'll need to declare it ourselves. But we want to link against the version of the function that's in libSystem, so we'll declare it `extern`. Let's take a first stab:
```
$ cat nsargc.c
#include <stdio.h>
extern int _NSGetArgc(void);
int main(void) {
  printf("argc=%d\n", _NSGetArgc());
  return 0;
}
$ gcc -o nsargc nsargc.c -Wall -std=c99
$ ./nsargc foo bar
argc=8192
```

Well, that's certainly not correct. Maybe we're using the function wrong. The real "argc" should be an int, but maybe this function returns a pointer to it instead of returning the actual value. Let's try that:
```
$ cat nsargc.c
#include <stdio.h>
extern int *_NSGetArgc(void);
int main(void) {
  printf("argc=%d\n", *_NSGetArgc());
  return 0;
}
$ gcc -o nsargc nsargc.c -Wall -std=c99
$ ./nsargc foo bar
argc=3
```

Hey! Now that's more like it. So it looks like these functions may return pointers to the values we want rather than the actual values we want. Now let's try this with argv as well.
```
$ cat NSGetArgs.c
#include <stdio.h>
extern int *_NSGetArgc(void);
extern char ***_NSGetArgv(void);
void DoStuff(void) {
  printf("%20s =  %d\n", "_NSGetArgc()", *_NSGetArgc());
  char **argv = *_NSGetArgv();
  for (int i = 0; argv[i] != NULL; ++i)
    printf("%15s [%02d] = '%s'\n", "_NSGetArgv()", i, argv[i]);
}
int main(void) {
  DoStuff();
  return 0;
}
$ gcc -o NSGetArgs NSGetArgs.c -Wall -std=c99
$ ./NSGetArgs foo bar
        _NSGetArgc() =  3
   _NSGetArgv() [00] = './NSGetArgs'
   _NSGetArgv() [01] = 'foo'
   _NSGetArgv() [02] = 'bar'
```

Sweet, it looks like that works. So now we can get access to argc and argv from anywhere within a program. And notice that we didn't even need to declare the arguments in main's signature.

Here are a few similar functions that may be interesting.
```
$ nm /usr/lib/libSystem.B.dylib | grep _NSGet | grep ' T '
9002a55f T _NSGetNextSearchPathEnumeration
9003a382 T __NSGetArgc
90020f60 T __NSGetArgv
90003074 T __NSGetEnviron
90029e2d T __NSGetMachExecuteHeader
90027506 T __NSGetProgname
9014aa84 T _NSGetSectionDataInObjectFileImage
90036106 T __NSGetExecutablePath
```

Tuesday, July 25, 2006

The Singleton Smell

I'm a big fan of using Object Oriented design patterns, especially the classics popularized by the GoF. Design patterns are the OO analog of algorithms. Just like a good software engineer needs to be knowledgeable about data structures and algorithms, knowledge of OO patterns is a requirement in today's OO software world (disclaimer: the previous statement is only my opinion). And just like bubblesort can be misapplied to solve a problem, OO patterns can also be used inappropriately.

One of the most (if not the most) misused patterns is the Singleton. The singleton can be very useful when applied correctly, but it can also make for some poorly designed and almost unmaintainable software. The singleton is probably the easiest pattern to understand from the GoF's Design Patterns book (above), which may be why it's so often misused by engineers who are new to patterns. In a nutshell, the singleton pattern attempts to ensure that only one instance of a class is ever created. Many callers may use the class; they just end up using the same instance.

Code that uses a lot of singletons gives off a distinct smell. A smell that a software engineer should recognize. It's similar to, but more potent than, the smell given off by global variables, because a singleton is effectively a global variable. Singletons are generally global in scope, thus allowing any class, at any level, access to the singleton (read: global variable). This makes for classes that are tightly coupled.

Additionally, singletons let you design classes (or more (in)appropriately, "implementations") without having to think about the class's interface, or how they interact with other classes. The singleton doesn't need to be an argument to the class's constructor, or to methods, so it's often not considered when thinking through the object model for your code. This is similar to the way that C functions don't need to declare global variables in their parameter list, because again, they're globals, so the function's interface doesn't need to consider them. And I think we'd probably all agree that global variables are generally not the best idea.

Singletons are often misused in situations where you have multiple classes that all need to access the same instance of an object. In this case, the singleton is used as a convenience to let all the classes access this one central global variable. It would likely be a better idea to think about the classes' interfaces, possibly add an extra parameter here and there, and avoid the singleton altogether. Now, not only do the classes communicate interface to interface, but you may also have a new reusable class (the one that used to be a singleton)!

It's generally possible (and usually a good idea) to replace singletons with non-singletons, but it often requires a little extra thought. But this is a good thing and it's one of the best reasons to get rid of singletons. Once you think through your class's interfaces, you'll likely discover that you can loosen the coupling of your classes by pushing their interaction out to their interfaces rather than leaving it buried down in their implementations. This also allows you to add documentation about this interaction in the class's *interface* rather than code comments in its implementation.

Singletons can also make unit testing difficult. If class A uses the singleton class B in A's implementation, then unit testing A requires that B be all set up and able to run correctly. However, if A's constructor were changed to take a B as an argument, then A's unit test could simply create a mock B object, and just focus on the testing of A (which is what a unit test is supposed to do).
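The same idea can be sketched even in plain C, where a struct of function pointers stands in for B's interface. Everything here is hypothetical and only for illustration: `Clock` plays the role of B, `Session` plays the role of A, and `fake_now()` is the mock a unit test would supply.

```c
/* The interface A depends on: anything that can report the current time. */
struct Clock {
    long (*now)(void *ctx);   /* returns seconds since some epoch */
    void *ctx;                /* implementation-specific state */
};

/* "Class A": a session that expires. It is handed its Clock when it is
 * initialized instead of reaching for a global, singleton clock. */
struct Session {
    const struct Clock *clock;
    long expires_at;
};

/* The "constructor": the dependency is part of the interface. */
void session_init(struct Session *s, const struct Clock *clock, long ttl)
{
    s->clock = clock;
    s->expires_at = clock->now(clock->ctx) + ttl;
}

/* Nonzero once the injected clock has reached the expiry time. */
int session_expired(const struct Session *s)
{
    return s->clock->now(s->clock->ctx) >= s->expires_at;
}

/* A mock Clock for unit tests: the time is whatever the test says it is. */
long fake_now(void *ctx)
{
    return *(long *)ctx;
}
```

A test can point `fake_now()` at a local `long`, initialize a `Session` with a 60 second lifetime, then advance that variable and check `session_expired()`, all without a real clock being set up anywhere.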

Now, singletons aren't always evil. They can be very useful in some situations. I won't go into those examples now, but I just want to be on record as having said that they are not always bad. The main point here is, do not jump to a singleton solution just for convenience. Singletons are convenient because they can be accessed from anywhere (think, global variables), but this should be avoided in favor of thinking through class interactions and making your classes communicate via their interfaces. The singleton pattern should be used when you really need to ensure that only one instance of a class is ever created. And don't use a singleton without fully understanding why you really need one.

So, please take a big whiff of your code. Do you smell a lot of singletons? If so, consider refactoring it or make sure you understand why you actually need them. The extra thought you put into your class design now will pay back double when it comes time for maintenance.

Uh, sorry this was a little out of order and jumbled... I just wanted to jot down a few thoughts that were floating around my head on the way home tonight.

Thursday, July 20, 2006

Command Line Processing in Cocoa

Why would you want to process the command line in Cocoa? I mean, Cocoa's all GUI and other cool stuff, right? Well, yes.. but there's really nothing quite as cool as the command line, is there? Good, so let's get to it.

The typical ways to parse command line arguments on Unix systems are to use either `getopt()`, `getopt_long()`, or just parse argv yourself. Well now Cocoa (actually, it's Foundation that provides this) offers an even easier alternative. Enter `NSUserDefaults`. Let's just jump right to a quick example.

```
// File: args.m
// Compile with: gcc -o args args.m -framework Foundation
#import <Foundation/Foundation.h>

int main(int argc, char *argv[]) {
  NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
  NSUserDefaults *args = [NSUserDefaults standardUserDefaults];

  NSLog(@"boolArg   = %d", [args boolForKey:@"boolArg"]);
  NSLog(@"intArg    = %d", [args integerForKey:@"intArg"]);
  NSLog(@"floatArg  = %f", [args floatForKey:@"floatArg"]);
  NSLog(@"stringArg = %@", [args stringForKey:@"stringArg"]);

  [pool release];
  return 0;
}
```

First we get access to the standard user defaults object, which will actually do all the parsing of the command line. Then we just access command line arguments as if they were stored in the user defaults object. Here are a few sample runs:

```
$ ./args
2006-07-20 22:24:52.996 args[21633] boolArg   = 0
2006-07-20 22:24:52.997 args[21633] intArg    = 0
2006-07-20 22:24:52.997 args[21633] floatArg  = 0.000000
2006-07-20 22:24:52.997 args[21633] stringArg = (null)
$ ./args -intArg 18
2006-07-20 22:25:41.923 args[21640] boolArg   = 0
2006-07-20 22:25:41.923 args[21640] intArg    = 18
2006-07-20 22:25:41.923 args[21640] floatArg  = 0.000000
2006-07-20 22:25:41.923 args[21640] stringArg = (null)
$ ./args -intArg 18 -stringArg "foo bar" -floatArg 3.14159 -boolArg YES
2006-07-20 22:26:15.129 args[21644] boolArg   = 1
2006-07-20 22:26:15.129 args[21644] intArg    = 18
2006-07-20 22:26:15.129 args[21644] floatArg  = 3.141590
2006-07-20 22:26:15.129 args[21644] stringArg = foo bar
```

Arguments are case-sensitive, they can be specified in any order, and in general, NSUserDefaults is pretty smart about processing them. For example, a true bool value can be specified as YES, Y, y, 1, 123, etc., whereas a false bool value can be NO, no, n, 0, etc. Also, note that arguments are specified with a single leading `-` rather than the `--` that is typical of most "long" Unix command line options.

I haven't verified this, but I'd imagine that `NSUserDefaults` accesses argv via the `NSProcessInfo` class. It also leaves the argv that's passed to main unharmed in case you want to do any additional processing.

Tuesday, July 18, 2006

What I'm reading: Mac OS X Internals

It's been a while so I figured I should post something. It'd be nice if I blogged as often as I read books, so I think I'll try to make "What I'm Reading" posts and talk about some of the tech books that I'm reading.

Anyway, I just finished Amit Singh's book Mac OS X Internals, and it's fantastic. It's great. It's packed full of awesome technical details and it reads very well. You can check out the book and my review of it at Amazon.

Sunday, May 21, 2006

Fetching Darwin Source the Simple Way

It's been a while since my last post, so I figured I'd post some little thing...

If you're at all like me, you love reading OpenSource source code to see how things are really done. Like, maybe you want to see how bash does process substitution, or you want to see if shell IO redirection is done using `dup`, `dup2`, or `open`. The easy way to answer these questions is to simply read the source code. And since we're all Mac users here, we'll choose to read the Darwin source.

So, I wrote a simple little (no joke, really simple and little) script to let you see what source is available and download and extract it for you. It's called `snagdar.sh`, and can be found here.

When run with no arguments it simply displays a list of all available Darwin packages. This is useful for grepping to find the package you may want. Then, once you find the package, just run `snagdar.sh` again passing it a regexp to match the package you want. If multiple packages match the regex, they will all be snagged.

So, say I want to see how lsof(8) works, I can run:

```
$ snagdar.sh | grep lsof
lsof    20      Other
```

to see if an lsof package exists. We see that it does, and we'd like the source for it, so we can simply do:
```
$ snagdar.sh lsof
+++++ Snagging http://darwinsource.opendarwin.org/tarballs/other/lsof-20.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  549k  100  549k    0     0   573k      0 --:--:-- --:--:-- --:--:--  639k
```

Then, I'll end up with a directory named `lsof-20/` in my current dir which contains the source for lsof.

This is just a small, simple script that happens to make my daily life a little bit easier.

UPDATE: 9/8/2006
Since Open Darwin has closed its doors, this script no longer works. Darwin source can still be fetched from Apple, but it requires a login to download. I'll update the script when I get time.

UPDATE: 5/10/2007
Snagdar now works again. See this post.

Tuesday, April 18, 2006

MacBook Pro

I got my work supplied MacBook Pro last week, and I've got two words... AWE-SOME! It's waaaay fast. On my dual 2GHz G5, Firefox bounces the dock icon about 5-8 times before opening. With the new universal Firefox binary, it bounces 1-2x on my MBP! That's friggin' fast! Now, I don't actually use Firefox (mostly because it's historically been too slow to start), but it's still an interesting measure.

Here are a few other interesting little tid-bits I came across on my first run through the new system:

• You can tell if an application is universal by looking at the "Get Info" window in the Finder, or by using the `file` command from the Terminal. And you can tell if a process is actually running under Rosetta at runtime because `/usr/libexec/oah/translate` will be mapped into its address space. Just
`lsof -p PID | grep translate`

• You can run an application under Rosetta from the command line by using the command `/usr/libexec/oah/translate`. For example:
`/usr/libexec/oah/translate /bin/ls`

• On PPC, function arguments are passed in CPU registers starting with `$r3`. So, the function call `foo(1, 2, 3)` would have `0x1` in `$r3`, `0x2` in `$r4`, etc. On Intel, function arguments are passed on the stack, so the function call `foo(1, 2, 3)` would have `0x1` at `$ebp+8`, `0x2` at `$ebp+12`, etc.

• In Objective-C, a method call like `[foo add:5]` actually gets compiled into a C function call like
`objc_msgSend(foo, @selector(add:), 5)`
And as we just saw, Intel Macs pass function arguments on the stack. So, the standard way to print "`self`" in `gdb` on a PPC Mac is
`po $r3`
(remember, `$r3` has the first argument on PPC -- "self"), but on Intel it turns into
`po *(int *)($ebp+8)`
(`po` is `print-object`).

• If you need to debug (using `gdb`) a PPC binary on an Intel Mac, you can do some basic stuff by setting the `OAH_GDB` environment variable to `YES`, then starting the application. Then in a new window, start gdb like
`gdb --oah`
then use gdb's `attach` command to attach to the running process like normal. This will even show you PPC style registers and stuff in gdb. Pretty cool for basic debugging.

Saturday, April 08, 2006

Fantastic Darwin Code Browser Online

The source for Darwin is available online here, but there's not a good way to browse and search the code. Until now. Enter the OpenGrok source browser for Darwin. It's friggin' sweet and "wicked fast".

Thursday, April 06, 2006

The little hidden built-in calculator

I just learned this little trick a few days ago, but I guess it's not really all that new. Almost all Cocoa text areas can perform calculations right inline. You just have to type a calculation, highlight it, hit `Command-Shift-8`, then the calculation will be evaluated and replaced by the answer. Pretty cool. For example, if I type `3^(2*pi)`, then highlight it and hit `Command-Shift-8`, it will be replaced with `995.041644892855`.

You'll probably notice that the Script Editor application opens when you do this. That's because the feature is implemented as a Service provided by the Script Editor. What's really happening is that the highlighted text is being executed as AppleScript. So, as I sit here typing into this text area in Safari, I could highlight and execute

`tell application "Safari" to display dialog "Hello"`
(don't ask me why I'd want to run that AppleScript from right here).

I'm sure there's much cooler stuff you can do with this trick, but regardless it's pretty cool.

Tuesday, March 28, 2006

Ignore case in vim searches

Ahhh, there it is! I've often wanted to know how to make "/" searches in vim case insensitive, and today somebody at work enlightened me. If `\c` appears anywhere in a pattern the whole pattern is assumed to be case insensitive. So to search for the string "root" while ignoring case you'd use

`/\croot`
Take a look at `:help ignorecase` in vim for more info.

Saturday, March 25, 2006

The difference between `foo()` and `foo(void)`

We often see two different ways to declare a method that takes no arguments. The two common forms are:

1. `int foo();`

2. `int foo(void);`

Which one is correct? And what's the difference?

Well, in C++ the two forms are equivalent and they both declare a function that takes no arguments. If you try to call the function and pass in an argument, the compiler will give an error.

On the other hand, C and Objective-C both treat the two forms differently (at least through C17; C23 changes the empty parentheses to mean no arguments, as in C++). In these languages the first form declares a function that takes an unspecified number of arguments, whereas the second form declares a function that takes no arguments at all. So, in C the following is valid code:
```
int foo() {
  return 5;
}
int main() {
  return foo(1, 2, 3);
}
```
The compiler doesn't complain, and the code runs fine (a C++ compiler would give an error on this same code).

Generally what you want in C and Objective-C is to use the second form and include the explicit `void` to indicate that the function should take no arguments. However, it's more common in C++ to use the first form because it's equivalent and shorter.

Wednesday, March 22, 2006

Strange difference in `ps` output

I was asked an interesting question today. It was basically something like, "when I type `ps -ef` on my Linux box it displays nice wide output, but when I pipe it through `grep` [or pipe through anything] I have to use the `-l` option to `ps` in order for it to display long output".

To illustrate the problem we need a command with a long command line that will run long enough for us to see it. This sounds like a job for `sleep(1)` with a reeeeaaaallly long time argument.

```
$ sleep 1000000000000000000000000000000 &
$ ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
[... output omitted ...]
jgm       8264  8156  0 01:03 pts/0    00:00:00 sleep 1000000000000000000000000000000
[... output omitted ...]
$ ps -ef | grep sleep
jgm       8264  8156  0 01:03 pts/0    00:00:00 sleep 10000000000000000000000000
```

So, it's clear that when we grep `ps` output, some of the zeros are truncated from the command. But why?

Well, my assumption is that `ps` checks if its output descriptor (FD 1) is a terminal, and if so, it detects the width of the terminal so that the output is nicely formatted on the terminal and the lines do not wrap. When we use grep the output file descriptor for `ps` is *not* a terminal (it's a pipe) so it has no "width". In this case `ps` has to guess a width. And it appears to pick the standard 80 columns wide.

```
$ ps -ef | grep sleep | wc
      1       9      81
```

Yep, 80 chars (plus the newline character).

The typical C function used to determine if a file descriptor refers to a terminal is `isatty(3)`, which likely translates to an `ioctl(2)` system call. Let's see if we can verify our hypothesis using `strace(1)`, which is a Linux tool that allows us to see system calls.

(strace writes output to STDERR so we need to grep STDERR)
```
$ strace ps -ef 2>&1 | grep ioctl
ioctl(1, 0x5413, 0xbffff100)            = -1 EINVAL (Invalid argument)
read(7, "grep\0ioctl\0", 2047)          = 11
write(1, "   00:00:00 grep ioctl\njgm      "..., 78   00:00:00 grep ioctl
```

Hmm, we do see a failed `ioctl()` call. The first argument is 1, which is STDOUT, the second arg is 0x5413, and the third is just an address on my stack. After reading the `ioctl(2)` man page I see that the second argument is the "request type". So, let's grep through some of the standard system headers to figure out what this 0x5413 thing is.

```
$ grep -r 0x5413 /usr/include/*/ioctl*
/usr/include/asm/ioctls.h:#define TIOCGWINSZ    0x5413
```

Ahhh! 0x5413 indicates a request to get the window size. Just like we thought. Looking back at the strace output we see that the `ioctl()` call to get the window size failed (`EINVAL`) so it couldn't get the window size, and must have just used a default value.
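The behavior we just inferred can be sketched in a few lines of C. This is not the actual `ps` source, just a minimal illustration of the idea: ask the descriptor for its window size with `TIOCGWINSZ` and fall back to 80 columns when the request fails. The names `output_columns` and `FALLBACK_COLUMNS` are made up for this example.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumed default, based on the 80-character lines we measured with wc. */
enum { FALLBACK_COLUMNS = 80 };

/* Return the width to format output for: the terminal's column count if
 * fd refers to a terminal that reports one, otherwise the fallback. */
int output_columns(int fd) {
    struct winsize ws;

    if (ioctl(fd, TIOCGWINSZ, &ws) == -1 || ws.ws_col == 0)
        return FALLBACK_COLUMNS;    /* pipe, file, or size unknown */
    return ws.ws_col;
}
```

With a pipe on descriptor 1 the `ioctl()` fails, just like in the strace output, and `output_columns(STDOUT_FILENO)` returns 80. On a terminal it should return the terminal's `ws_col` value instead.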

Now, the one last check we can do is take a look at the strace output when the output actually does go to a terminal.

```
$ strace ps -ef 2> ps.out
[... regular ps output omitted ...]
$ grep ioctl ps.out
ioctl(1, 0x5413, {ws_row=24, ws_col=141, ws_xpixel=846, ws_ypixel=336}) = 0
```

OK, so it looks like our assumption was correct, and I think we verified it thoroughly enough.

New Gmail Notifier

Google released a new Gmail Notifier for Mac OS X today. It has just a few small updates: it now updates itself automatically, it's a universal binary, and, best of all, it finally has non-ugly icons!

Sunday, February 26, 2006

Wow! If you haven't checked out Google's new Page Creator product, drop what you're doing and check it out now. It's a WYSIWYG web page creator, done online in AJAX, that generates XHTML strict sites very easily. It's really cool and fun to play with. And again, it amazes me what Google can do with JavaScript.

Sunday, February 19, 2006

Change `__MyCompanyName__` in Xcode

If you're sick of seeing `__MyCompanyName__` in the header comments of all your Xcode files, you can set the default company name in Xcode with:

`defaults write com.apple.Xcode PBXCustomTemplateMacroDefinitions -dict ORGANIZATIONNAME "Blah, Inc"`

(all entered on one line, of course)

Thursday, February 16, 2006

Squashing a Real Bug on Darwin

I previously talked about a really cool bash trick called process substitution, which allows you to use a process almost anywhere you can use a file. For example, `diff <(ls dir1) <(ls dir2)` would allow me to diff the contents of `dir1` and `dir2`.

The Problem

For the most part, process substitution works great on Mac OS X, but there are cases where it doesn't work. For example:
`$ diff <(echo foo) <(echo bar)`

produces no output, when clearly the string "foo" differs from the string "bar". But why?

Troubleshooting

Well, let's check out the tools in our toolbox: gdb, gcc, vm_stat, vmmap, nm, otool, stat, etc. Hmm, let's try a few more experiments first.
```
$ diff <(echo foo) <(echo bar)
$ diff <(echo foo) <(echo barX)
1c1
< foo
---
> barX
$ diff <(echo foo) <(echo bar)
$ diff <(echo foo) <(sleep 1; echo bar)
1c1
< foo
---
> bar
```

Interesting...
```
$ stat <(echo foo) <(echo bar)
520093697 0 prw-rw---- 1 jgm jgm 0 4 "Feb 16 21:12:21 2006" "Feb 16 21:12:21 2006" "Feb 16 21:12:21 2006" 512 8 0 /dev/fd/63
520093697 0 prw-rw---- 1 jgm jgm 0 4 "Feb 16 21:12:21 2006" "Feb 16 21:12:21 2006" "Feb 16 21:12:21 2006" 512 8 0 /dev/fd/62
```

Very interesting! According to `stat(1)` the two named pipes created have identical attributes except for the file name. What's more, the two files have the same inode number (the second field, 0)!? But what's that you say? How can two different files (i.e. not hard links) on the same filesystem have the same inode number? According to POSIX this isn't allowed. So maybe `diff` is trying to do a quick short circuit and saying "hey, the files have the same inode number and are on the same filesystem, so they must be the same". Maybe.
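If you'd like to repeat this observation without the shell in the way, a small C helper can create two unrelated pipes and compare what `fstat(2)` reports for them. This is a sketch under assumptions: the function name is made up, it uses anonymous pipes from `pipe(2)` rather than whatever bash sets up for `<(...)`, and it is untested on Darwin.

```c
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if two unrelated pipes report the same (st_dev, st_ino) pair,
 * 0 if they differ, and -1 if a pipe could not be created or stat'ed. */
int unrelated_pipes_share_identity(void) {
    int first[2], second[2];
    struct stat a, b;
    int result = -1;

    if (pipe(first) == -1)
        return -1;
    if (pipe(second) == -1) {
        close(first[0]);
        close(first[1]);
        return -1;
    }

    /* Compare the read ends; each pipe is a distinct kernel object. */
    if (fstat(first[0], &a) == 0 && fstat(second[0], &b) == 0)
        result = (a.st_dev == b.st_dev && a.st_ino == b.st_ino);

    close(first[0]);
    close(first[1]);
    close(second[0]);
    close(second[1]);
    return result;
}
```

A result of 1 would match the `stat` output above, where both pipes show the same device and an inode number of 0. A result of 0 means the system hands each pipe its own inode number.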

Well, let's get the source code for diff and check it out.
```
$ curl http://darwinsource.opendarwin.org/tarballs/other/gnudiff-13.tar.gz > gnudiff-13.tar.gz
$ tar -zxvf gnudiff-13.tar.gz
$ cd gnudiff-13
$ make
```

OK, now let's test our freshly built `diff`.
```
$ /tmp/gnudiff/Build/src/diff <(echo foo) <(echo bar)
$
```

Yep, our new version has the same problem that we want to fix.

OK, so let's take a look at some source. gnudiff/src/diff.c looks like a good place to start. Just search for the word "main" and we can quickly check out the main function to get an idea of how diff starts to do what it does. Around line 713 we see the call
`int status = compare_files ((struct comparison *) 0, from_file, argv[optind]);`
which looks very promising. We find the definition of this function at line 1047. Now just skim through this function to get an idea what it does. Around line 1214 we see a comment that looks very promising!
```c
if ((same_files
     = (cmp.file[0].desc != NONEXISTENT
        && cmp.file[1].desc != NONEXISTENT
        && 0 < same_file (&cmp.file[0].stat, &cmp.file[1].stat)
        && same_file_attributes (&cmp.file[0].stat, &cmp.file[1].stat)))
    && no_diff_means_no_output)
  {
    /* The two named files are actually the same physical file.
       We know they are identical without actually reading them.  */
  }
```

Oh, I bet this has something to do with the problem! What do those two "same_file*" functions do?

Armed with grep we find them defined as macros in gnudiff/src/system.h around line 361. These two macros basically check some file data returned by `stat(2)` to see if two files are identical. The attributes checked are things like inode number, uid, gid, size, mtime, ctime, etc. All attributes that were identical when we checked the stat output of our two named pipes. Take a second and glance back up at the output from `stat <(echo foo) <(echo bar)`. I'll wait... back? OK. So, it sorta makes sense why that diff may have failed. And it also makes sense why `diff <(echo foo) <(sleep 1; echo bar)` would have worked. Can you guess why? (hint: think about the modification times for each fifo)
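To make the shortcut concrete, here is a simplified stand-in for those two checks written as plain C functions. These are not the macros from `system.h`; the function names and the exact set of fields compared are assumptions based on the description above.

```c
#include <sys/stat.h>

/* Simplified identity test: same device and same inode number. */
int same_identity(const struct stat *a, const struct stat *b) {
    return a->st_ino == b->st_ino && a->st_dev == b->st_dev;
}

/* Simplified attribute test: the remaining metadata must match too. */
int same_attributes(const struct stat *a, const struct stat *b) {
    return a->st_mode == b->st_mode
        && a->st_uid == b->st_uid
        && a->st_gid == b->st_gid
        && a->st_size == b->st_size
        && a->st_mtime == b->st_mtime
        && a->st_ctime == b->st_ctime;
}

/* True when a diff-like tool could declare the files identical
 * without reading a single byte from either one. */
int can_skip_reading(const struct stat *a, const struct stat *b) {
    return same_identity(a, b) && same_attributes(a, b);
}
```

Two pipes created in the same second with the same size pass both tests, so the contents are never compared. Delay one of them by a second and `st_mtime` differs, `can_skip_reading` returns 0, and the real comparison runs.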

A Fix

What's the best fix for this? One could argue that the problem is that the HFS+ filesystem allows two files on the same filesystem to have the same inode number, but HFS+ really isn't an inode based filesystem. On HFS+, inode numbers are really just the volume's catalog node ID. Plus, it's probably a big pain to modify the Darwin Kernel or the HFS+ filesystem code.

Maybe we can instead fix the problem in diff. According to this technote a CNID of zero is never used and indicates nil, so maybe diff should not shortcut any files with an inode of 0? Let's try it. In `gnudiff/src/system.h`, make the following modification to the `same_file(s, t)` macro (the two `!= 0` tests are the new part):

```c
# define same_file(s, t) \
    ((((s)->st_ino == (t)->st_ino) \
      && ((s)->st_dev == (t)->st_dev)) \
     && ((s)->st_ino != 0) && ((t)->st_ino != 0) \
     || same_special_file (s, t))
```

Then recompile with `make` (if necessary, type `make clean; make`). Now, let's see if we fixed the problem:
```
$ /tmp/gnudiff/Build/src/diff <(echo foo) <(echo bar)
1c1
< foo
---
> bar
```

YAY! That seems to have fixed the problem!

Conclusion

I possibly skipped the most important first step here, which is to use Google to see if someone else already figured out my problem! I'll probably go do that now! ;-)

In the meantime, I haven't tested this solution thoroughly, but I imagine it's safe. Hopefully, this (or some other) fix will make it into the diff code soon. G'nite.

Tuesday, February 14, 2006

The `char *apple[]` Argument Vector

We're all familiar with the arguments passed to the `main` function by the OS:

1. `int argc`

2. `char *argv[]`

3. `char *envp[]`

But programs started on Mac OS X (i.e. Darwin) actually have access to another argument - the `apple` vector. The `apple` vector is defined as `char *apple[]` and it's passed as the 4th argument to the `main()` function (it's actually stored right after `envp` on the stack).
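Since the `apple` vector sits right after `envp`, it can also be located by hand: walk `envp` to its terminating `NULL` and step one slot further. The helper below is only a sketch of that idea, and `vector_after_envp` is a made-up name. It relies entirely on the Darwin stack layout described above. On other systems whatever follows `envp` is something else (on Linux, the auxiliary vector), so the result must not be treated as an array of strings there.

```c
#include <stddef.h>

/* Walk past envp's terminating NULL and return the slot that follows it.
 * On Darwin, per the layout described above, that slot is apple[0]. */
char **vector_after_envp(char **envp) {
    char **p = envp;

    while (*p != NULL)
        p++;            /* skip every NAME=value entry */
    return p + 1;       /* first slot after the NULL terminator */
}
```

On Darwin, `vector_after_envp(envp)[0]` inside `main()` should give the same string as `apple[0]`.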

But what is it used for? Well, Apple can use the `apple` vector to pass whatever "hidden" parameters they want to any program. And they do actually use it, too. Currently, `apple[0]` contains the path where the executing binary was found on disk. What's that you say? How is `apple[0]` different from `argv[0]`? The difference is that `argv[0]` can be set to any arbitrary value when `execve(2)` is called. For example, shells often differentiate a login shell from a regular shell by starting login shells with the first character in `argv[0]` being a `-`. For example:
```
$ ps aux | grep -- -bash
jgm 262 0.0 0.1 27820 752 p1 S 5Feb06 0:01.58 -bash
```

So, we can see that the bash login shell on my Mac was started with a dash in its name. In this example, bash's `argv[0]` would equal `-bash`, but its `apple[0]` would contain the path to where the bash binary was actually found (likely `apple[0]` would be `/bin/bash`).
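The login-shell convention itself is easy to express in C. This is a minimal sketch of the check as described above, not code taken from bash, and the function name is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if argv0 follows the login-shell convention of starting
 * with a dash (for example "-bash"), and 0 otherwise. */
int started_as_login_shell(const char *argv0) {
    return argv0 != NULL && argv0[0] == '-';
}
```

A shell would call something like `started_as_login_shell(argv[0])` early in `main()`. Note that this tells us nothing about where the binary lives, which is exactly the gap `apple[0]` fills.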

Let's write a simple program to see all this in action:
```c
// Compile with: gcc -o apple apple.c
#include <stdio.h>

int main(int argc, char *argv[], char *envp[], char *apple[]) {
  printf("argv[0] = %s\n", argv[0]);
  printf("apple[0] = %s\n", apple[0]);
  return 0;
}
```

And here are a few runs:

```
$ ./apple
argv[0] = ./apple
apple[0] = ./apple
$ PATH=. apple
argv[0] = apple
apple[0] = ./apple
$ PATH=/Users/jgm apple
argv[0] = apple
apple[0] = /Users/jgm/apple
```

So, we can see that `apple[0]` is not the same as `argv[0]` and that it contains the path to where the executing image was found on disk (taking into account the `\$PATH`).

Now, if we want to test the bash example above (where `argv[0]` doesn't match the binary name), we can write another small test program:

```c
// Compile with: gcc -o exec_apple exec_apple.c
#include <unistd.h>

int main() {
  // argv[0] is deliberately different from the binary's name
  char *theArgv[] = {"-apple", NULL};
  execve("./apple", theArgv, NULL);
  return 1;  // only reached if execve() fails
}
```

And a run:
```
$ ./exec_apple
argv[0] = -apple
apple[0] = ./apple
```

So, just as we expected: `argv[0]` can really be set to anything by `execve(2)`, but `apple[0]` should always contain the real path to the executing binary image.

Pretty neat huh?

UPDATE 10/30/2006 here

Monday, February 13, 2006

Nil and nil

Objective-C has some very interesting data types that often are misunderstood. Many of them can be found in `/usr/include/objc/objc.h`, or other files in that same directory. Below is a snippet taken from `objc.h` that shows the declaration of some of these types:

```objc
// objc.h
#import <objc/objc-api.h>

typedef struct objc_class *Class;

typedef struct objc_object {
  Class isa;
} *id;

typedef struct objc_selector  *SEL;
typedef id      (*IMP)(id, SEL, ...);
typedef signed char   BOOL;

#define YES             (BOOL)1
#define NO              (BOOL)0

#ifndef Nil
#define Nil 0   /* id of Nil class */
#endif

#ifndef nil
#define nil 0   /* id of Nil instance */
#endif
```

Let's cover some of them in a little more detail here:

`id`

This is not equivalent to `void *`. As the snippet from the header above indicates, `id` is a pointer to a `struct objc_object`, which basically means a pointer to an instance of any class derived from the `Object` (or `NSObject`) base class. Notice that `id` is already a pointer, so you do not need the asterisk when using `id`. For example, `id foo = nil` declares a `nil` pointer to an instance of any subclass of `NSObject`, whereas `id *foo = nil` declares a pointer to a pointer to such an instance.

`nil`

This is equivalent to the C language's `NULL` value. It is defined in `objc/objc.h` and is used to refer to an Objective-C object instance pointer that points to nothing.

`Nil`

Yes, this is sort-of different than `nil` but they're defined in the same file. `Nil` (with a capital 'N') is used to define a pointer to an Objective-C class (type `Class`) that points to nothing.

`SEL`

Now this one is fun and interesting. `SEL` is the type of a "selector", which identifies the name of a method (not the implementation). So, for example, the methods `-[Foo count]` and `-[Bar count]` both share a selector, namely the selector "count". A `SEL` is a pointer to a `struct objc_selector`, but what the heck is an `objc_selector`? Well, it's defined differently depending on whether you're using the GNU Objective-C runtime or the NeXT Objective-C runtime (the one Mac OS X uses). It turns out that Mac OS X maps `SEL`s to simple C strings. For example, if we define a `Foo` class with a `- (int)blah` method, the code `NSLog(@"SEL = %s", @selector(blah));` would output `SEL = blah`.

`IMP`

From the header above `IMP` is declared as `id (*IMP)(id, SEL, ...)`, so it's a pointer to a function that takes an `id` (the "`self`" pointer), the `SEL` that was called, and some other variable arguments.

`Method`

The `Method` type is defined in `objc/objc-class.h` as:
```c
typedef struct objc_method *Method;

struct objc_method {
  SEL method_name;
  char *method_types;
  IMP method_imp;
};
```

This ties together some of the other types we talked about: a `Method` relates a selector (`method_name`) to its implementation (`method_imp`), along with a string describing the method's types (`method_types`).
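To see how a selector, a type string, and an implementation fit together, here is a toy model in plain C. Everything in it is a stand-in: the `toy_` types and functions are invented for this example, selectors are modeled as C strings compared with `strcmp`, and the lookup is a simple linear search. The real runtime's data structures and lookup are more involved.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for SEL, IMP, and Method; not the runtime's definitions. */
typedef const char *toy_sel;
typedef int (*toy_imp)(void *self, toy_sel cmd);

struct toy_method {
    toy_sel method_name;        /* the selector, e.g. "count" */
    const char *method_types;   /* illustrative type string for the method */
    toy_imp method_imp;         /* the function that implements it */
};

/* Linear search: return the implementation registered for sel, or NULL. */
toy_imp toy_lookup(const struct toy_method *methods, size_t count,
                   toy_sel sel) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(methods[i].method_name, sel) == 0)
            return methods[i].method_imp;
    }
    return NULL;
}

/* Example implementation that a "count" selector could map to. */
int toy_count(void *self, toy_sel cmd) {
    (void)self;
    (void)cmd;
    return 3;
}
```

Given a table containing `{ "count", "i@:", toy_count }`, looking up `"count"` returns `toy_count`, while looking up a selector that isn't in the table returns `NULL`.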

`Class`

From above, `Class` is defined to be a pointer to a `struct objc_class`, which is declared in `objc/objc-class.h` as:
```c
struct objc_class {
  struct objc_class *isa;
  struct objc_class *super_class;
  const char *name;
  long version;
  long info;
  long instance_size;
  struct objc_ivar_list *ivars;
  struct objc_method_list **methodLists;
  struct objc_cache *cache;
  struct objc_protocol_list *protocols;
};
```

I'm not going to get into much detail here, other than to show the declaration. We'll talk more about this in a future post.

Well, that's about it for now. These are all important types and concepts in Objective-C and I thought they would be good to talk about. More later...

Saturday, February 11, 2006

Messaging `nil` in Objective-C

Sending a message to a `nil` object doesn't make much sense in many programming languages. For example, if you do this in Java you'll get the dreaded `NullPointerException`. But sending a message ("sending a message" in Objective-C is similar to "calling a method" in other OO languages) to a `nil` object is defined, okay, and incredibly useful in Objective-C. Actually, one of the most common coding idioms in Objective-C is:

`Foo *foo = [[Foo alloc] init];`

which creates a `Foo` instance by sending the `+alloc` message to the `Foo` class, then sending the `-init` message to the returned instance. However, if `+alloc` fails and returns `nil`, the `-init` message will be sent to `nil`, which simply ends up setting `foo` to `nil` (which is probably exactly what we'd want to happen anyway).

I'd like to see an example

OK, let's write some sample code to test this.
```objc
// Compile with: gcc -o nil nil.m -framework Foundation
#import <Foundation/Foundation.h>

@interface Foo : NSObject
- (NSString *)sayHi;
@end

@implementation Foo
- (NSString *)sayHi {
  return @"Hello, World!";
}
@end

int main() {
  Foo *foo = nil;
  NSLog(@"Greeting = %@", [foo sayHi]);
  return 0;
}
```

2006-02-11 20:49:26.372 nil[3406] Greeting = (null)

So, we can see that when we send the message `-sayHi` to a nil pointer the return value is `nil`.

How does this work?

The compiler turns message calls like `[targetObject someSelector]` into a C function call like `objc_msgSend(targetObject, someSelector)`. So, to figure out what this returns we simply need to figure out what `objc_msgSend()` does when its first argument is nil. Well, we can download the source for the Objective-C runtime from Apple here. The file we're interested in is objc-msg-ppc.s (yes, it's in PPC assembly). If we search for "ENTRY _objc_msgSend" we'll see the function we're looking for. The comments are very useful in this file and we can pretty easily see that it checks if its first argument (passed in register `r3`), which happens to be the target object, is nil and if so it does a few other things and eventually returns `nil`. And since C functions on PowerPC chips return integer and pointer values in register `r3` nothing needs to be done; the function simply returns and the result is that the caller thinks the function (or "message") returned `nil`. And since integers are returned the same way as pointers, sending a message that returns an `int` will return `0`, simply because `nil` is `#define`'d to be `0` (`/usr/include/objc/objc.h`).
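The nil check can be modeled in a few lines of C. This is a toy sketch of the behavior described above, not a translation of the assembly in objc-msg-ppc.s. The `toy_` names are made up, and each object carries a single handler function so that the dispatch step stays trivial.

```c
#include <stddef.h>

/* Toy stand-ins for id and IMP; not Apple's definitions. */
typedef struct toy_object *toy_id;
typedef toy_id (*toy_imp)(toy_id self, const char *sel);

struct toy_object {
    toy_imp respond;    /* one handler per object keeps the sketch short */
};

/* Model of the dispatch: a nil receiver short-circuits to a zero result
 * and no method implementation is ever called. */
toy_id toy_msg_send(toy_id receiver, const char *sel) {
    if (receiver == NULL)
        return NULL;
    return receiver->respond(receiver, sel);
}

/* Example handler: answers every message by returning the receiver. */
toy_id toy_return_self(toy_id self, const char *sel) {
    (void)sel;
    return self;
}
```

Sending any selector to `NULL` through `toy_msg_send` yields `NULL`, which mirrors what we saw with `[foo sayHi]`. Unlike this model, the real function does not know the return type, which is why only values returned in the same register as a pointer come back as zero.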

But what if the method returns a float?

Let's see...
```objc
#import <Foundation/Foundation.h>

@interface Foo : NSObject
- (float)blah;
@end

@implementation Foo
- (float)blah {
  return 5.0;
}
@end

int main() {
  Foo *foo = nil;
  NSLog(@"blah = %f", [foo blah]);
  return 0;
}
```

2006-02-11 21:20:20.948 nil[3441] blah = 0.000000

So, it looks like messages that return a float return 0.0 like we'd expect. Wrong! Change the test code as indicated:
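```objc
void g(float f) {}

int main() {
  g(2.0);  // new call, placed before the existing code in main()
  // ... rest of main() unchanged ...
```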

2006-02-11 21:22:47.094 nil[3452] blah = 2.000000

Ah-ha! Now the return value for messaging our `nil` object was `2.0`! So, it looks like the return value in this case is whatever value happens to be in the appropriate floating point register.

Interesting! So what does it mean?

All this neat stuff means that it *is* safe to send a message to `nil` when:

• The method is declared to return a pointer

• The method is declared to return an integer type no larger than `sizeof(void *)` (4 bytes, or 32 bits, on a 32-bit machine)

and it is NOT safe when

• The method returns any floating point value

• The method returns an integer type larger than `sizeof(void *)` (for example, a 64-bit `long long` on a 32-bit machine)

Also, it's usually *not* safe to message `nil` when the message returns a structure.

Conclusion

The ability to send messages to `nil` is an incredibly cool and powerful feature of Objective-C, but it may not always do what you intend. I've read that Apple is trying to standardize the behavior of messaging `nil` (they'll likely guarantee that it will "always" return a zero value), but this is currently not the case.

*DISCLAIMER: I've simplified a few things here to make this more understandable. I also did not cover issues related to messaging `nil` on Intel chips. Maybe I'll leave some of these things for future posts. If you have questions about any of this, or simply think I'm wrong about something, please post a comment. I'll get back to you as soon as possible. I love to discuss this stuff :-)