Race in reboot/poweroff path at init?

Denys Vlasenko vda.linux at googlemail.com
Thu Nov 2 14:33:48 UTC 2017


On Tue, Oct 10, 2017 at 10:52 AM, Jeremy Kerr <jk at ozlabs.org> wrote:
> Hi all,
>
> I've been debugging an issue where we can't reboot or poweroff a machine
> in the early stages of busybox init. Using the poweroff case as an
> example:
>
>  - kernel starts /sbin/init
>
>  - kernel receives a poweroff event, so calls __orderly_poweroff.
>    Effectively, these will just call out to the /sbin/poweroff usermode
>    helper.
>
>  - /sbin/poweroff just does a:
>
>      kill(1, SIGUSR2);
>
>  - However, /sbin/init has not yet installed a signal handler for
>    SIGUSR2. Because we're PID 1, this means the signal is ignored, and
>    so the command to poweroff the machine is dropped.
>
>  - init keeps booting rather than powering off.
>
> In our particular case, the "poweroff event" is an IPMI soft shutdown
> message. However, the same would apply for any other path that involves
> orderly_poweroff or orderly_reboot.
>
> Even though the signal handlers are installed fairly early in init, we
> can still hit the race between this and the SIGUSR2 being sent fairly
> reliably.
>
> I see a couple of options for resolving this:
>
>  - installing the signal handlers even earlier in init_main(). However,
>    this will only reduce the window for lost events, rather than
>    eliminating it; or

Sure, this should be done. How about this:

--- a/init/init.c
+++ b/init/init.c
@@ -1064,6 +1064,12 @@ int init_main(int argc UNUSED_PARAM, char **argv)
 #endif

     if (!DEBUG_INIT) {
+        /* Some users send poweroff signals to init VERY early.
+         * To handle this, mask signals early,
+         * and unmask them only after signal handlers are installed.
+         */
+        sigprocmask_allsigs(SIG_BLOCK);
+
         /* Expect to be invoked as init with PID=1 or be invoked as linuxrc */
         if (getpid() != 1
          && (!ENABLE_LINUXRC || applet_name[0] != 'l') /* not linuxrc? */
@@ -1204,6 +1187,8 @@ int init_main(int argc UNUSED_PARAM, char **argv)
             + (1 << SIGHUP)  /* reread /etc/inittab */
 #endif
             , record_signo);
+
+        sigprocmask_allsigs(SIG_UNBLOCK);
     }

     /* Now run everything that needs to be run */

This covers the code which opens and parses /etc/inittab,
which can be slow (if storage is slow) and can make
the race realistic in the real world.
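
For illustration only, here is a standalone sketch of the same
block / install handlers / unblock pattern outside of busybox.
It is not the init code: it uses plain POSIX sigprocmask() and
sigaction() instead of the busybox helpers, and the names on_usr2()
and early_signal_survives() are made up for this example. The signal
is raised while it is blocked and no handler exists yet, so it stays
pending and should be delivered once the mask is lifted.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* only needed by callers that assert on the result */
#include <signal.h>
#include <stddef.h>

/* Hypothetical flag and handler, named for this sketch only. */
static volatile sig_atomic_t got_usr2 = 0;

static void on_usr2(int signo)
{
    (void)signo;
    got_usr2 = 1;
}

/*
 * Returns 1 if a SIGUSR2 raised before the handler was installed
 * is still delivered after unblocking, 0 if it was lost,
 * -1 if a system call failed.
 */
int early_signal_survives(void)
{
    sigset_t all;
    struct sigaction sa;

    got_usr2 = 0;

    /* 1. Block all signals first, as the patch does at the top of init_main(). */
    sigfillset(&all);
    if (sigprocmask(SIG_BLOCK, &all, NULL) != 0)
        return -1;

    /* 2. The "too early" signal: it is blocked, so it only becomes pending. */
    if (raise(SIGUSR2) != 0)
        return -1;

    /* 3. Slow setup (e.g. parsing a config file) would go here.
     *    Afterwards, install the handler. */
    sa.sa_handler = on_usr2;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR2, &sa, NULL) != 0)
        return -1;

    /* 4. Unblock: the pending SIGUSR2 is delivered to on_usr2() now. */
    if (sigprocmask(SIG_UNBLOCK, &all, NULL) != 0)
        return -1;

    return got_usr2;
}
```

Calling early_signal_survives() from a small test program should
return 1. Note this only shows that a blocked signal is kept pending;
it says nothing about signals arriving before the mask is set.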

Can you test whether this change makes the race go away in your case?

>  - using a synchronous channel to send the shutdown/reboot message
>    between the poweroff/reboot helpers, rather than an asynchronous
>    signal. Say, have init listening on a socket, allowing the poweroff
>    binary to wait and/or retry.
>
> However, before I go down the wrong path here: does anyone have other
> ideas that might help eliminating dropped poweroff/reboot events?

The test that processes are being reaped is a good idea.

