Tuesday, December 26, 2006

Lists and Queues

Linked lists are often among the first data structures we learn about in computer science. They're simple in concept as well as implementation; An implementation could be developed in the time allotted to a technical interview, for example. However, computer scientists and engineers are often lazy, meaning that they don't want to do more work than is needed. Thankfully, with 4.4BSD came the sys/queue.h header file which contains many useful functions for dealing with singly- and doubly-linked lists, tail queues, and circular queues. These functions (actually, they are implemented as C macros) are described in the QUEUE(3) man page.

In this post, we'll take a quick look at a few of the functions required to use these structures. We'll work through a small (and virtually useless) example that uses a doubly-linked list to store even numbers. The first thing we'll do is focus on the central data structure that we want to be "linked" together. In our example, we will create a simple data structure for holding one even number.

struct EvenNumber {
int val;
};


In order to add this structure to a list, we need to modify it slightly by adding an entry for holding pointers to the next and previous EvenNumber entries.

struct EvenNumber {
int val;
LIST_ENTRY(EvenNumber) numbers;
};


LIST_ENTRY is a macro that expands to a structure holding the pointers for the next and previous entries. We can see this by having gcc display the output from the preprocessor. (We won't look at the preprocessor output at every step in this post, but studying the output of gcc -E on your own can be very enlightening.)

$ gcc -E list.c | grep "int val" -A2 -B1
struct EvenNumber {
int val;
struct { struct EvenNumber *le_next; struct EvenNumber **le_prev; } numbers;
};


OK, we've now modified our EvenNumber structure so that it can be inserted into a list simply by adding a LIST_ENTRY field to our structure. Next, we need a way to reference the list as a whole. We do this using the LIST_HEAD macro that declares a new structure which holds a pointer to the first element in the list.

LIST_HEAD(EvenNumbersHead, EvenNumber) even_numbers_head;


The preprocessor will expand this to the following (reformatted for readability).

struct EvenNumbersHead {
struct EvenNumber *lh_first;
} even_numbers_head;


We can see that the first argument to LIST_HEAD is the name of the new structure that we're creating, and the second argument is the name of the structure that will be linked together in the list.

Before we can use the list, we need to initialize the head by calling LIST_INIT with a pointer to our "head" instance, which effectively creates an empty list.

LIST_INIT(&even_numbers_head);


At this point we have a doubly-linked list that is referenced by the variable even_numbers_head. We may add to the list instances of the EvenNumber structure using the macros LIST_INSERT_HEAD, LIST_INSERT_BEFORE, or LIST_INSERT_AFTER.

The basic structure of our program is outlined here.

struct EvenNumber {
int val;
LIST_ENTRY(...) numbers;
};

LIST_HEAD(...) even_numbers_head;

int main(void) {
LIST_INIT(&even_numbers_head);

for (int i = 0; i < 100; i++) {
...
LIST_INSERT_HEAD(&even_numbers_head, ...);
}

LIST_FOREACH(..., &even_numbers_head, numbers) {
...
LIST_REMOVE(theNum, numbers);
...
}

return 0;
}


The full source for this example can be seen in list.c. We can compile and run the program as follows.

$ gcc -o list list.c -std=c99 -Wall
$ ./list
[...output omitted...]
10
8
6
4
2
0


We see that the output is in reverse order. This is because we are using LIST_INSERT_HEAD, which adds the new entry at the beginning of the list. If we want to display the list in increasing order (ignoring that we could simply pipe the output through sort -n) we could add each new entry to the end (or tail) of the list rather than the head. However, for this we should use a different data structure; a TAILQ. TAILQs work very similarly to the LISTs already discussed, so here's the full program source using a TAILQ rather than a LIST.

For more information about these convenient functions, see the comments in /usr/include/sys/queue.h as well as the QUEUE(3) man page. Also note that these functions are available (and heavily used) within the kernel. However, when used in the kernel the header file to include is /System/Library/Frameworks/Kernel.framework/Headers/sys/queue.h.

...now, to see if my wife is finished getting ready yet...

4 comments:

Mathias De Belder said...

Very nice, I hadn't heard or read about sys/queue.h before and just assumed everyone rolled their own or used something like glib's GList. However, aren't you freeing the `theNum' pointer prematurely in your tailq example code ? As far as I can understand the macro implementations for TAILQ_FOREACH and TAILQ_FOREACHREVERSE, they reference the just deleted field during their for-loop processing ((var) = ((var)->field.tqe_next) for instance), so merely removing an element from the list prior to deletion isn't enough.

So isn't this more correct ?

#include sys/types.h
#include sys/queue.h
#include stdlib.h
#include stdio.h

typedef struct EvenNumbers {
TAILQ_ENTRY(EvenNumbers) next;
ssize_t val;
} EvenNumbers;

typedef TAILQ_HEAD(EvenNumbersHead, EvenNumbers) EvenNumbersHead;

int
main(int argc, char *argv[])
{
ssize_t i;
EvenNumbers *np, *prevp;
EvenNumbersHead *numbers_head;

numbers_head = malloc(sizeof(EvenNumbersHead));
TAILQ_INIT(numbers_head);

for (i = 2; i < 11; ++i) {
if (i % 2) { continue; }
np = malloc(sizeof(EvenNumbers));
np->val = i;
TAILQ_INSERT_TAIL(numbers_head, np, next);
}

TAILQ_FOREACH(np, numbers_head, next) {
fprintf(stdout, "%d ", np->val);
}
fprintf(stdout, "\n");

prevp = NULL;

TAILQ_FOREACH_REVERSE(np, numbers_head, EvenNumbersHead, next) {
if (prevp) { free(prevp); }
fprintf(stdout, "%d ", np->val);
TAILQ_REMOVE(numbers_head, np, next);
prevp = np;
}

if (prevp) { free(prevp); }

free(numbers_head);

fprintf(stdout, "\n");

return EXIT_SUCCESS;
}

Mary Leigh Burke said...

Thank you so much for explaining the "entries" parameter in TAILQ_ENTRY. I'm a tech writer
documenting some sample code that uses this macro, and I swear I've been everywhere in the Internet universe looking for some explanation. Whoever wrote the man pages obviously didn't have a clue, nor did the umpteen other tech writers who copied the man pages. You are a life saver!

Anonymous said...

when removing from the middle of the queue use TAILQ_FOREACH_MUTABLE rather than TAILQ_FOREACH, MUTABLE has check that queue does not break in some cases.

M M W said...

Dear Mary Leigh Burke,

for a complete (and clean) example of use, i do recommend to take a look here (freebsd src tree):

src/sys/kern/vfs_mount.c
sys/dev/ata/ata-queue.c

you can get a web source preview from the Robert's website,

http://fxr.watson.org/

(FreeBSD and Linux Kernel Cross-Reference)

Cheers!