Race in reboot/poweroff path at init?
Denys Vlasenko
vda.linux at googlemail.com
Thu Nov 2 14:33:48 UTC 2017
On Tue, Oct 10, 2017 at 10:52 AM, Jeremy Kerr <jk at ozlabs.org> wrote:
> Hi all,
>
> I've been debugging an issue where we can't reboot or poweroff a machine
> in the early stages of busybox init. Using the poweroff case as an
> example:
>
> - kernel starts /sbin/init
>
> - kernel receives a poweroff event, so calls __orderly_poweroff.
> Effectively, these will just call out to the /sbin/poweroff usermode
> helper.
>
> - /sbin/poweroff just does a:
>
> kill(1, SIGUSR2);
>
> - However, /sbin/init has not yet installed a signal handler for
> SIGUSR2. Because we're PID 1, this means the signal is ignored, and
> so the command to poweroff the machine is dropped.
>
> - init keeps booting rather than powering off.
>
> In our particular case, the "poweroff event" is an IPMI soft shutdown
> message. However, the same would apply for any other path that involves
> orderly_poweroff or orderly_reboot.
>
> Even though the signal handlers are installed fairly early in init, we
> can still hit the race between this and the SIGUSR2 being sent fairly
> reliably.
>
> I see a couple of options for resolving this:
>
> - installing the signal handlers even earlier in init_main(). However,
> this will only reduce the window for lost events, rather than
> eliminating it; or
Sure, this should be done. How about this:
--- a/init/init.c
+++ b/init/init.c
@@ -1064,6 +1064,12 @@ int init_main(int argc UNUSED_PARAM, char **argv)
 #endif
 	if (!DEBUG_INIT) {
+		/* Some users send poweroff signals to init VERY early.
+		 * To handle this, mask signals early,
+		 * and unmask them only after signal handlers are installed.
+		 */
+		sigprocmask_allsigs(SIG_BLOCK);
+
 		/* Expect to be invoked as init with PID=1 or be invoked as linuxrc */
 		if (getpid() != 1
 		 && (!ENABLE_LINUXRC || applet_name[0] != 'l') /* not linuxrc? */
@@ -1204,6 +1187,8 @@ int init_main(int argc UNUSED_PARAM, char **argv)
 			+ (1 << SIGHUP) /* reread /etc/inittab */
 #endif
 			, record_signo);
+
+		sigprocmask_allsigs(SIG_UNBLOCK);
 	}
 	/* Now run everything that needs to be run */
This covers the code which opens and parses /etc/inittab.
That step can be slow (if storage is slow), which makes
the race realistic in the real world.
Can you test whether this change makes the race go away in your case?
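
For illustration only, here is a minimal standalone sketch of the same
block-then-unblock idea, written with plain POSIX calls instead of the
busybox helpers used in the patch. The function name early_signal_demo()
and the self-sent SIGUSR2 are assumptions made for the demo; this is not
init code. It should show that a signal sent while everything is blocked
stays pending and reaches the handler once the mask is lifted.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

/* Last signal seen by the handler; 0 means none yet. */
static volatile sig_atomic_t recorded_signo;

static void record_signo(int signo)
{
	recorded_signo = signo;
}

/* Hypothetical demo, not busybox code.
 * Returns the signal number delivered after unblocking,
 * 0 if nothing was delivered, or -1 on error. */
int early_signal_demo(void)
{
	sigset_t all;
	struct sigaction sa;

	recorded_signo = 0;

	/* 1. Block all signals before any slow startup work. */
	sigfillset(&all);
	if (sigprocmask(SIG_BLOCK, &all, NULL) != 0)
		return -1;

	/* 2. A "poweroff" request arrives while no handler is installed.
	 *    Because it is blocked, it stays pending. (Without the block,
	 *    an ordinary process would be terminated here.) */
	if (kill(getpid(), SIGUSR2) != 0)
		return -1;

	/* 3. Slow startup, e.g. parsing a config file, would happen here. */

	/* 4. Install the handler. */
	sa.sa_handler = record_signo;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	if (sigaction(SIGUSR2, &sa, NULL) != 0)
		return -1;

	/* 5. Unblock: the pending SIGUSR2 is delivered to the handler. */
	if (sigprocmask(SIG_UNBLOCK, &all, NULL) != 0)
		return -1;

	return recorded_signo;
}
```

In a single-threaded program, calling early_signal_demo() from main() is
expected to return SIGUSR2, i.e. the early signal is not lost.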
> - using a synchronous channel to send the shutdown/reboot message
> between the poweroff/reboot helpers, rather than an asynchronous
> signal. Say, have init listening on a socket, allowing the poweroff
> binary to wait and/or retry.
>
> However, before I go down the wrong path here: does anyone have other
> ideas that might help eliminating dropped poweroff/reboot events?
The test that processes are being reaped is a good idea.