[BusyBox] init does not kill reparented processes (MYSTERY SOLVED)

Erik Andersen andersen at codepoet.org
Fri Aug 27 15:10:13 MDT 2004


On Fri Aug 27, 2004 at 08:07:40PM +0200, Ignacio García Pérez wrote:
> Hi,
> 
> In a previous message, I was complaining about init not doing its duty of
> reaping zombie processes. I've found out why. I tried busybox-1.00-pre2 and
> found it does not exhibit the problem, and then noticed that init.c has
> undergone a rewrite somewhere in between pre2 and pre10. The important
> difference is that in pre2 init.c is using a SIGCHLD handler to handle child
> termination, while in pre10, it just uses an infinite loop in main, with a
> sleep(1) inside.
[-----------snip-----------]
> While I understand my approach is not the most correct one, I feel that
> SYSINIT processes not finishing should not interfere with the task of zombie
> process reaping, so I suggest init.c is reverted to the SIGCHLD approach.

Sigh.  As you no doubt know, as a special case SIG_IGN for
SIGCHLD means the signal isn't actually ignored, instead the
kernel does automatic child reaping.  We used to do that, and
then we received a fairly steady stream of bug reports about init
deadlocking, so we reverted the SIGCHLD stuff.

    http://www.uclibc.org/cgi-bin/cvsweb/busybox/init/init.c?r1=1.171&r2=1.172

With that reverted, the theory was the kernel should reparent all
zombie processes to init, which should be sleeping in wait(2)
waiting for just such a thing to happen so it can reap any
zombies.  If that is not happening then I am truly at a loss to
explain why.
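
As a rough sketch of that reaping step (simplified, not the actual
init.c code, and with made-up function names), a loop around
waitpid(-1, ..., WNOHANG) collects every child that has already exited
without blocking on the ones still running. The demo only forks direct
children; processes reparented to a real PID 1 would be collected by
the same call:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Collect every child that has already exited.  Never blocks.
 * Returns the number of children reaped by this call. */
int reap_exited_children(void)
{
	int status, reaped = 0;

	while (waitpid(-1, &status, WNOHANG) > 0)
		reaped++;
	return reaped;
}

/* Hypothetical demo: fork n short-lived children, then poll until
 * all of them have been reaped.  Returns the total, or -1 if fork
 * fails. */
int spawn_and_reap(int n)
{
	struct timespec delay = { 0, 10 * 1000 * 1000 };	/* 10 ms */
	int i, total = 0;

	for (i = 0; i < n; i++) {
		pid_t pid = fork();

		if (pid < 0)
			return -1;
		if (pid == 0)
			_exit(0);	/* child: exit immediately */
	}

	/* Keep polling; a child that has not exited yet is simply
	 * picked up on a later pass. */
	while (total < n) {
		total += reap_exited_children();
		nanosleep(&delay, NULL);
	}
	return total;
}
```

Calling spawn_and_reap(3) forks three children and returns once all
three have been collected.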

Anyway, doing what you suggest I _do_ see init deadlocking.  Take
the following patch and apply it, and init will deadlock.  Then
disable "signal(SIGCHLD, chld_handler)" and it will run as
expected...

--- init/init.c	16 Aug 2004 09:29:42 -0000	1.204
+++ init/init.c	27 Aug 2004 21:08:41 -0000
@@ -1053,12 +1053,34 @@
 	run_actions(RESPAWN);
         return;
 }
-                                                            
-extern int init_main(int argc, char **argv)
+
+void chld_handler(int sig)
 {
-	struct init_action *a;
 	pid_t wpid;
 	int status;
+	struct init_action *a;
+
+	/* Wait for a child process to exit */
+	wpid = wait(&status);
+	while (wpid > 0) {
+		/* Find out who died and clean up their corpse */
+		for (a = init_action_list; a; a = a->next) {
+			if (a->pid == wpid) {
+				/* Set the pid to 0 so that the process gets
+				 * restarted by run_actions() */
+				a->pid = 0;
+				message(LOG, "Process '%s' (pid %d) exited.  "
+						"Scheduling it for restart.",
+						a->command, wpid);
+			}
+		}
+		/* see if anyone else is waiting to be reaped */
+		wpid = waitpid (-1, &status, WNOHANG);
+	}
+}
+
+extern int init_main(int argc, char **argv)
+{
 
 	if (argc > 1 && !strcmp(argv[1], "-q")) {
 		return kill_init(SIGHUP);
@@ -1082,6 +1104,7 @@
 	signal(SIGCONT, cont_handler);
 	signal(SIGSTOP, stop_handler);
 	signal(SIGTSTP, stop_handler);
+	signal(SIGCHLD, chld_handler);
 
 	/* Turn off rebooting via CTL-ALT-DEL -- we get a
 	 * SIGINT on CAD so we can shut things down gracefully... */
@@ -1158,23 +1181,8 @@
 		/* Don't consume all CPU time -- sleep a bit */
 		sleep(1);
 
-		/* Wait for a child process to exit */
-		wpid = wait(&status);
-		while (wpid > 0) {
-			/* Find out who died and clean up their corpse */
-			for (a = init_action_list; a; a = a->next) {
-				if (a->pid == wpid) {
-					/* Set the pid to 0 so that the process gets
-					 * restarted by run_actions() */
-					a->pid = 0;
-					message(LOG, "Process '%s' (pid %d) exited.  "
-							"Scheduling it for restart.",
-							a->command, wpid);
-				}
-			}
-			/* see if anyone else is waiting to be reaped */
-			wpid = waitpid (-1, &status, WNOHANG);
-		}
+		/* See if any children have exited */
+		chld_handler(0);
 	}
 }
 
 -Erik

--
Erik B. Andersen             http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--

