Wednesday, March 22, 2006

Strange difference in ps output

I was asked an interesting question today. It went something like this: "When I type ps -ef on my Linux box it displays nice wide output, but when I pipe it through grep (or through anything else) I have to add the -w option to ps to get the wide output back."

To illustrate the problem we need a command with a long command line that will run long enough for us to see it. This sounds like a job for sleep(1) with a reeeeaaaallly long time argument.

$ sleep 1000000000000000000000000000000 &
$ ps -ef
[... output omitted ...]
jgm 8264 8156 0 01:03 pts/0 00:00:00 sleep 1000000000000000000000000000000
[... output omitted ...]
$ ps -ef | grep sleep
jgm 8264 8156 0 01:03 pts/0 00:00:00 sleep 10000000000000000000000000

So, it's clear that when we grep ps output, some of the zeros are truncated from the command. But why?

Well, my assumption is that ps checks whether its output descriptor (FD 1) is a terminal and, if so, asks for the terminal's width so that it can format the output without wrapped lines. When we pipe into grep, the output descriptor for ps is *not* a terminal (it's a pipe), so it has no "width". In that case ps has to guess a width, and it appears to pick the traditional 80 columns.
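The terminal-versus-pipe distinction is easy to see from the shell itself. Here is a minimal sketch, assuming a POSIX shell, where the test [ -t 1 ] succeeds only when FD 1 is a terminal. The function name describe_stdout is made up for illustration, and this only mirrors the idea; it is not how ps is implemented.

```shell
#!/bin/sh
# Report whether this process's standard output (FD 1) is a terminal.
# The message goes to stderr so that it never travels down the pipe.
describe_stdout() {
    if [ -t 1 ]; then
        echo "stdout is a terminal" >&2
    else
        echo "stdout is not a terminal" >&2
    fi
}

describe_stdout          # typed at a prompt, FD 1 is the terminal
describe_stdout | cat    # FD 1 is now a pipe, so the -t test fails
```

Typed at an interactive prompt, the two calls should disagree, which is the same situation ps finds itself in with and without grep.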

$ ps -ef | grep sleep | wc
1 9 81

Yep, wc counts 81 characters: 80 chars plus the newline character.
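The wc check measures a single line only because grep happened to match one. To look at every line ps writes into a pipe, something like the following sketch could be used. It assumes a POSIX awk, and max_line_length is a made-up helper name.

```shell
#!/bin/sh
# Print the length of the longest line read from standard input.
# awk's length() excludes the trailing newline.
max_line_length() {
    awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }'
}

# ps writes into a pipe here, so it cannot ask a terminal for its width.
ps -ef | max_line_length
```

If the 80-column guess is right, a ps that behaves like the one above should never report more than 80 here.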

The typical C function used to determine if a file descriptor refers to a terminal is isatty(3), which likely translates to an ioctl(2) system call. Let's see if we can verify our hypothesis using strace(1), which is a Linux tool that allows us to see system calls.

(strace writes its trace to STDERR, so we redirect STDERR into the pipe with 2>&1 before grepping.)
$ strace ps -ef 2>&1 | grep ioctl
ioctl(1, 0x5413, 0xbffff100) = -1 EINVAL (Invalid argument)
read(7, "grep\0ioctl\0", 2047) = 11
write(1, " 00:00:00 grep ioctl\njgm "..., 78 00:00:00 grep ioctl

Hmm, we do see a failed ioctl() call. The first argument is 1, which is STDOUT, the second is 0x5413, and the third is just a pointer to something on my stack. (The read and write lines are noise: they matched only because ps was reading and printing the command line of our own grep ioctl.) After reading the ioctl(2) man page I see that the second argument is the "request type". So, let's grep through some of the standard system headers to figure out what this 0x5413 thing is.

$ grep -r 0x5413 /usr/include/*/ioctl*
/usr/include/asm/ioctls.h:#define TIOCGWINSZ 0x5413

Ahhh! 0x5413 indicates a request to get the window size. Just like we thought. Looking back at the strace output we see that the ioctl() call to get the window size failed (EINVAL) so it couldn't get the window size, and must have just used a default value.
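The plain grep used earlier also matched unrelated read and write lines. A tidier variant is sketched below, assuming an strace that supports -e trace=ioctl (trace only ioctl calls) and -o (write the trace to a file instead of STDERR). The names trace_ps_ioctls and ps.trace are made up for illustration. Newer strace versions may print the request by name rather than as 0x5413.

```shell
#!/bin/sh
# Trace only the ioctl calls made by "ps -ef" while its stdout is a pipe.
# The trace is written to the file named by $1 rather than to stderr.
trace_ps_ioctls() {
    strace -e trace=ioctl -o "$1" ps -ef | cat > /dev/null
}

if command -v strace > /dev/null 2>&1; then
    trace_ps_ioctls ps.trace
    # Show only the calls made on FD 1.
    grep 'ioctl(1,' ps.trace || echo "no ioctl on FD 1 was recorded" >&2
else
    echo "strace is not installed" >&2
fi
```

Because the trace goes to its own file, none of the regular ps output can get mixed into it.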

Now, the one last check we can do is take a look at the strace output when the output actually does go to a terminal.

$ strace ps -ef 2> ps.out
[... regular ps output omitted ...]
$ grep ioctl ps.out
ioctl(1, 0x5413, {ws_row=24, ws_col=141, ws_xpixel=846, ws_ypixel=336}) = 0

OK, so it looks like our assumption was correct, and I think we verified it thoroughly enough.
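As for the original question, one way around the guessed width is to ask ps for unlimited width, so that the terminal check stops mattering. This is a sketch that assumes the procps ps found on most Linux systems, where -w widens the output and giving it twice removes the limit. psgrep is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: list all processes at unlimited width, then filter.
# Writing the pattern as '[s]leep' keeps grep from matching its own entry.
psgrep() {
    ps -efww | grep -- "$1"
}

sleep 1000000000000000000000000000000 &
psgrep '[s]leep'
kill $!
```

With the width limit lifted, the full run of zeros should survive the trip through the pipe.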

1 comment:

Mike Doel said...

Nicely done. Interestingly, I see that this isn't standard behavior across all implementations of ps. On my Powerbook, using iTerm, it appears that ps doesn't vary output using that same technique.