klogd problem: questions about busybox behavior (1.15.3)

Paul Smith paul at mad-scientist.net
Wed May 26 16:34:20 UTC 2010


On Sat, 2010-05-22 at 21:43 +0200, Denys Vlasenko wrote:
> Because dmesg does not need to sit and wait for a new portion of data.
> It just dumps entire buffer. This operation does not require per-process
> pointers to buffer.

Ah.  That makes sense.

> > Trying to determine why I have two klogd processes I see this: the
> > bring-up of the system is managed by busybox init.  To start the system
> > it runs:
> > 
> >         ::sysinit:/etc/init.d/rcS start
> >         ::once:/etc/rootfs start
> > 
> > In the "rootfs" script is where klogd is started, then it does some
> > things such as change the system time, timezone, hostname, etc.  The way
> > things work we restart the logging system multiple times as these
> > changes are implemented, so the log messages have the right timezone and
> > hostname.
> > 
> > The first time I restart I kill klogd, but init (apparently) never reaps
> > the child so it's running as a zombie.
> 
> Please show me the process *tree*. In this tree, is this zombie a child
> of init, or a child of some other process? If former, then it's
> a bug in init. Try newer busybox. If it doesn't help, let me know.

Definitely it's a child of init.  Sorry, I should have made that clear.
My build of busybox ps doesn't have options to show PPID, but I looked
at /proc/pid/status to find it; here's some output I threw into my
'rootfs' script referenced above:

Here is the process state after I've started the logging, before anything
odd happens:

> 1
>  1084 root      2196 S    syslogd -D -s 200 -b 1 
>  1086 root      2196 S    klogd 
>  2049 root      2200 S    grep logd 
> Name:   klogd
> Pid:    1086
> PPid:   1
> TracerPid:      0
> Name:   syslogd
> Pid:    1084
> PPid:   1
> TracerPid:      0
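
For reference, a minimal sketch of a helper that prints Name/Pid/PPid/TracerPid
lines like the ones above by reading /proc/<pid>/status directly.  This is not
the actual code from the 'rootfs' script: the function name show_parents is
hypothetical, and it assumes a Linux /proc plus a POSIX sh with sed and grep.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): for every process
# whose Name: field in /proc/<pid>/status equals one of the arguments,
# print its Name, Pid, PPid and TracerPid lines.
show_parents() {
    for status in /proc/[0-9]*/status; do
        # A process may exit between the glob and the read; ignore errors.
        name=$(sed -n 's/^Name:[[:space:]]*//p' "$status" 2>/dev/null)
        for want in "$@"; do
            if [ "$name" = "$want" ]; then
                grep -e '^Name:' -e '^Pid:' -e '^PPid:' -e '^TracerPid:' \
                    "$status" 2>/dev/null
            fi
        done
    done
}
```

It could be called as "show_parents klogd syslogd" right after "ps | grep logd".
Zombies still have a status file, so they show up as well.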

Now I do some stuff, then I run "killall -q -TERM klogd" (and the same for
syslogd) and start both daemons again:

> 2
>  1086 root         0 Z    [klogd]
>  2078 root      2196 S    syslogd -D -s 200 -b 1 
>  2080 root      2196 S    klogd 
>  2082 root      2200 S    grep logd 
> Name:   klogd
> Pid:    2080
> PPid:   1
> TracerPid:      0
> Name:   klogd
> Pid:    1086
> PPid:   1
> TracerPid:      0
> Name:   syslogd
> Pid:    2078
> PPid:   1
> TracerPid:      0

Note how syslogd was reaped, but not klogd!  Crazy.  Now a bit later in
that same script, klogd is finally reaped:

> 4
>  2078 root      2196 S    syslogd -D -s 200 -b 1 
>  2080 root      2196 S    klogd 
>  2128 root      2200 S    grep logd 
> Name:   klogd
> Pid:    2080
> PPid:   1
> TracerPid:      0
> Name:   syslogd
> Pid:    2078
> PPid:   1
> TracerPid:      0

But then I restart them and this time I see both daemons as zombies:

> 5
>  2078 root         0 Z    [syslogd]
>  2080 root         0 Z    [klogd]
>  2138 root      2196 S    syslogd -D -s 200 -b 1 
>  2140 root      2196 S    klogd 
>  2142 root      2200 S    grep logd 
> Name:   klogd
> Pid:    2140
> PPid:   1
> TracerPid:      0
> Name:   klogd
> Pid:    2080
> PPid:   1
> TracerPid:      0
> Name:   syslogd
> Pid:    2138
> PPid:   1
> TracerPid:      0
> Name:   syslogd
> Pid:    2078
> PPid:   1
> TracerPid:      0

and that state persists for a while, while I start/stop dropbear and do
various other things:

> 7
>  2078 root         0 Z    [syslogd]
>  2080 root         0 Z    [klogd]
>  2138 root      2196 S    syslogd -D -s 200 -b 1 
>  2140 root      2196 S    klogd 
>  2183 root      2204 S    grep logd 
> Name:   klogd
> Pid:    2140
> PPid:   1
> TracerPid:      0
> Name:   klogd
> Pid:    2080
> PPid:   1
> TracerPid:      0
> Name:   syslogd
> Pid:    2138
> PPid:   1
> TracerPid:      0
> Name:   syslogd
> Pid:    2078
> PPid:   1
> TracerPid:      0

but eventually it all clears up BEFORE I exit my script (so it's
definitely not just waiting until the script is done):

> 8
>  2138 root      2196 S    syslogd -D -s 200 -b 1 
>  2140 root      2196 S    klogd 
>  2207 root      2200 S    grep logd 
> Name:   klogd
> Pid:    2140
> PPid:   1
> TracerPid:      0
> Name:   syslogd
> Pid:    2138
> PPid:   1
> TracerPid:      0

Getting a new busybox into my environment takes some finagling but I'll
give it a try.
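
When ps cannot show PPID, the "child of init or of some other process?"
question can also be checked for all zombies at once.  A rough sketch, assuming
a Linux /proc and awk; list_zombies is a hypothetical name, and only the first
word of a process name is printed:

```shell
#!/bin/sh
# Hypothetical helper: print "PID PPID NAME" for every process whose
# State: field in /proc/<pid>/status is Z (zombie).
list_zombies() {
    for status in /proc/[0-9]*/status; do
        # Collect the fields of one status file, then decide in END.
        # Errors from processes that vanished mid-scan are discarded.
        awk '
            $1 == "Name:"  { name = $2 }
            $1 == "State:" { state = $2 }
            $1 == "Pid:"   { pid = $2 }
            $1 == "PPid:"  { ppid = $2 }
            END { if (state == "Z") print pid, ppid, name }
        ' "$status" 2>/dev/null
    done
}
```

A zombie whose second column is 1 is waiting on init; any other value points at
a different parent that has not called wait() yet.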

> init must reap zombies. But it can only reap _its children_.

Sure, of course.  Just FYI I've been hacking UNIX systems for a LOT of
years... let's just say I was familiar with all this before Linux was
even a gleam in Linus' eye :-).  I don't mind the extra explanation,
just letting you know you don't need to make that effort on my account.

Cheers!


