[sf-lug] ps and grep
Kristian Erik Hermansen
kristian.hermansen at gmail.com
Sat Mar 8 13:48:00 PST 2008
Michael,
You need to write a book titled 'Ridiculous Shell Scripting'. While
this isn't the most impressive example of yours that I have seen, it
is still definitely ridiculous and appreciated! I always enjoy your
posts :-)
On Fri, Mar 7, 2008 at 3:55 AM, Michael Paoli
<Michael.Paoli at cal.berkeley.edu> wrote:
> regarding:
> > From: Tom Haddon <tom at greenleaftech.net>
> > Subject: ps and grep
>
> > Bit of an elementary question, this, but can someone remind me why:
> > ps fuwxx | grep <something>
> > returns "grep <something>" in the list if finds? Intuition would suggest
> > that the ps is happening first, and so the grep command wouldn't show in
> > the list. One of those things that was explained to me once, but seems
> > to have slipped through my sieve-like memory...
> et seq. (see:
> http://linuxmafia.com/pipermail/sf-lug/2008q1/date.html
> )
>
> So, what *really* happens?
>
> When in doubt test :-). With strace or the like, this is fairly easy to
> do. So, ... we run our little test, and use strace:
>
> $ echo $$
> 8070
> $ >strace.out 2>&1 strace -fv -eall -s2048 -p $$ &
> [13] 7916
> $ ps fuwx | grep something
> michael 7918 0.0 0.0 1536 448 pts/1 S+ 00:50 0:00 \_ grep something
> $ kill -2 %13
> [13]- Done strace -fv -eall -s2048 -p $$ >strace.out 2>&1
> $ cut -c-72 strace.out | fgrep 'pipe
> > exec
> > fork'
> pipe([3, 4]) = 0
> fork(Process 7917 attached
> [pid 7917] execve("/bin/ps", ["ps", "fuwx"], ["SSH_AGENT_PID=8063", "GP
> [pid 8070] fork( <unfinished ...>
> [pid 8070] <... fork resumed> ) = 7918
> [pid 7918] execve("/bin/grep", ["grep", "something"], ["SSH_AGENT_PID=8
> [pid 7918] <... execve resumed> ) = 0
> $
>
> So, ... what *really* happened? First we see our original parent shell
> PID is 8070. That shell uses pipe(2) to create an anonymous pipe. The
> shell then forks and execs ps. The shell then forks and execs grep.
> So, if ps is execed before grep, why does ps show the grep command in
> its output?
> There are two likely contributing factors as to why that is the case -
> at least in our test. First of all, a pipe can cause the kernel to
> block a process. Both ends of a pipe created with pipe(2) are already
> open, so ps doesn't have to wait for a reader before it starts, but a
> write to the pipe blocks once the pipe's buffer is full (64KiB by
> default on Linux since 2.6.11, one page - typically 4KiB - before
> that; POSIX only guarantees that writes of up to PIPE_BUF bytes, at
> least 512, are atomic), and a read from the pipe - by grep in our
> case - blocks while the pipe is empty and a writer still has it open.
> That's one factor that may significantly impact the timing. Another is
> that Linux (and Unix, etc.) is a multiuser multiprocessing operating
> system. In the general case, there's no specific guarantee as to
> which PID will get CPU cycles first - they may even both get CPU
> cycles concurrently on an SMP system. So, generally speaking, it's a
> bit of a race as to which one gets CPU cycles first, and how far along
> each is in doing its tasks. The grep process is in the process table
> (and in the /proc file system, which ps may read for that information)
> as soon as the shell has forked it, but until it execs grep it still
> shows up as a copy of the shell. So it becomes a race as to whether
> ps reads that process table entry before or after grep is execed - and
> the PIDs being connected by a pipe, and/or the scheduling that
> otherwise occurs, will determine whether or not our ps picks up our
> grep command.
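
One quick way to see that the shell starts both sides of a pipeline
without waiting for the first one to finish, as the trace above
suggests, is to time a pipeline of two sleeps. This is only a sketch:
the two-second sleeps and the variable names are arbitrary choices for
illustration, and it assumes a date(1) that understands +%s (GNU and
BSD date both do).

```shell
# If the shell ran the two commands one after the other, this would
# take about 4 seconds; started concurrently, it takes about 2.
start=$(date +%s)
sleep 2 | sleep 2
end=$(date +%s)
elapsed=$((end - start))
echo "elapsed: ${elapsed}s"
```

The same concurrency is what lets ps and grep race each other in the
test above.
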
>
> Does it always work this way? Even for the same system and such?
>
> Let's try a bit, and take a count:
>
> $ h=0; n=1; while [ $n -le 100 ]; do
> > ps fuwxx | >>/dev/null 2>&1 grep something && h=`expr $h + 1`
> > n=`expr $n + 1`
> > done; unset n; echo $h; unset h
> 30
> $
>
> So, ... we find in our quick little test sample, 30% of the time ps
> picked up our grep process, and 70% of the time it didn't ... so,
> obviously it's a race as to whether or not ps reads the process table -
> or even the process table slot with the grep process in it - before, or
> after grep is execed.
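
The counting loop above can also be written with POSIX arithmetic
expansion instead of expr. The following is a sketch rather than the
original test: the function name count_hits and its two parameters are
made up for illustration, and it assumes a ps that accepts the same
fuwxx options (procps does). The count it prints will vary from run to
run and from system to system.

```shell
# count_hits PATTERN [RUNS]
# Run "ps fuwxx | grep PATTERN" RUNS times (default 100) and print how
# many times grep found a match, i.e. how often ps caught the grep.
# Best run from an interactive shell or a script file, so that PATTERN
# doesn't also appear in some other process's command line.
count_hits() {
    pattern=$1
    runs=${2:-100}
    hits=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        if ps fuwxx 2>/dev/null | grep "$pattern" >/dev/null 2>&1; then
            hits=$((hits + 1))
        fi
        i=$((i + 1))
    done
    echo "$hits"
}

count_hits something 100
```
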
>
> Not every shell, operating system flavor and scheduler will
> necessarily behave quite as we've observed above, but the above is at
> least what I found with some quick tests on the Linux distribution
> at my fingertips, and under the bash shell.
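
Since the result depends on a race, anything that shouldn't see the
grep itself can't rely on the timing. One common idiom, not used in the
post above, is to write the pattern as a bracket expression, so that the
text on grep's own command line no longer matches the pattern. The
sketch below checks that property without depending on what happens to
be running; the function name matches_own_cmdline is made up for
illustration.

```shell
# matches_own_cmdline PATTERN
# Succeeds if the command line "grep PATTERN" would itself be matched
# by PATTERN, which is what makes grep show up in its own ps listing.
matches_own_cmdline() {
    printf '%s\n' "grep $1" | grep "$1" >/dev/null
}

for p in 'something' '[s]omething'; do
    if matches_own_cmdline "$p"; then
        echo "$p: matches its own command line"
    else
        echo "$p: does not match its own command line"
    fi
done

# Against the live process list (output depends on what is running):
# ps fuwxx | grep '[s]omething'
```
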
>
> _______________________________________________
> sf-lug mailing list
> sf-lug at linuxmafia.com
> http://linuxmafia.com/mailman/listinfo/sf-lug
>
--
Kristian Erik Hermansen
--
"It has been just so in all my inventions. The first step is an
intuition--and comes with a burst, then difficulties arise. This thing
gives out and then that--'Bugs'--as such little faults and
difficulties are called--show themselves and months of anxious
watching, study and labor are requisite before commercial success--or
failure--is certainly reached" -- Thomas Edison in a letter to
Theodore Puskas on November 18, 1878