[PATCH] crond: add LOG_NDELAY to openlog

Mon Dec 11 07:58:36 UTC 2023

Hello List,

After digging further into textbook APUE 2/e and TLPI in this weekend,
it looks like more clear to explain what is happening.
In vfork() implementation, 
1. The process address space is shared among parent and child processes.
The process address space, including data, stack, and heap, which means any 
modification to a static variable (in data segment) is visible to others.
That's why vfork() is much faster and less memory footprint than fork(),
which is very efficient for entry-level cpu in embedded system.
2. The file descriptors table is not shared among parent and child processes. 
Parent and children have its own fd table, because vfork() finally call 
clone() syscall without CLONE_FILES.

After vfork() and before exec(), just in the middle of vfork() and exec(), 
once the syslog() is launched and this 1st-time-call created the DGRAM socket 
to the /dev/log, 
there is an inconsistent state or a mismatched state between these 2 resources:
1. 'LogFile' in data segment. 'LogFile' is a glibc syslog.c internal static 
variable, since the address space is shared, child's modification to it 
would be visible to the parent once parent is resumed.
2. File descriptors table. Since the fd table is not shared and each process 
has its own copy, child's new fd is only resident in its own fd table, it
would not affect the parent fd table; any newly created fd in child is not 
visible to the parent once parent is resumed.

Prepare three steps to intentionally trigger this issue,
1. This home directory is not existed. for example:
# ls /share/homes/admin
ls: cannot access '/share/homes/admin': No such file or directory
2. An invalid line with a prefix '<' is added to config file, for example:
< 1 * * * /etc/init.d/flush_memory.sh >/dev/null 2>&1
3. The crond is launched as daemon when system bootup, for example:
/usr/sbin/crond -l 9 -c /tmp/cron/crontabs

When crond (parent, pid=21156) enters start_jobs() loop and vfork() each job,
in my case would be two children processes: 24657 and 24658.
vfork()                                 = 24657
vfork()                                 = 24658

Since vfork() guaranteed child to be scheduled for CPU before parent,
let us see these children first.

The 1st child pid=24657 enters change_user() > (chdir_or_warn()) > chdir(),
found home dir is not existed and call syslog() for logging,
since the DGRAM socket to /dev/log is not created by parent yet,
child pid=24657 would create it by 1st-time-call syslog(),
and get the fd=4 into its fd table, in the mean while,
the libc.so (syslog.c at glibc) internal static variable set to 'LogFile=4'.
https://elixir.bootlin.com/glibc/glibc-2.21/source/misc/syslog.c#L64

chdir("/share/homes/admin")             = -1 ENOENT (No such file or directory)
getpid()                                = 24657
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4
connect(4, {sa_family=AF_UNIX, sun_path="/dev/log"}, 110) = 0
sendto(4, "<75>Dec 11 10:00:01 crond[24657]: can't change directory to '/share/homes/admin'\n", 81, MSG_NOSIGNAL, NULL, 0) = 81
chdir("/tmp/cron")
setpgid(0, 0)
execve("/bin/sh", ["/bin/sh", "-c", ...) = 0

The 2nd child pid=24658 goes through the same path as 1st child,
but a bit different output; when 2nd child found home dir is not existed,
it directly calls 'sendto(4, ...' for logging and failed with EBADF.
Why sendto() knows it is '4'? From glibc static var 'LogFile'.
This '4' is came from the 'LogFile=4', in the libc.so (syslog.c at glibc),
its value '4' had already been set by the 1st child pid=24657.
With vfork() all processes shared the address space including data segment,
since 'LogFile=4' is a static var and is resident in the data segment,
pid=24657 and pid=24658 has the same copy of it, which means 1st child
pid=24657's modification to 'LogFile=4' is visible to 2nd child pid=24658.
Why sendto() returned EBADF? 'Not found' in the fd table.
Because 2nd child pid=24658's fd table actually does not have this fd=4,
so sendto()/close() with fd=4 returned EBADF is as expected.
With vfork() the CLONE_FILES is not passed to clone() so the file descriptor 
table is not shared among processes, each process has its own copy of fd 
table.

chdir("/share/homes/admin")             = -1 ENOENT (No such file or directory)
getpid()                                = 24658
sendto(4, "<75>Dec 11 10:00:01 crond[24658]: can't change directory to '/share/homes/admin'\n", 81, MSG_NOSIGNAL, NULL, 0) = -1 EBADF (Bad file descriptor)
close(4)                                = -1 EBADF (Bad file descriptor)
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4
connect(4, {sa_family=AF_UNIX, sun_path="/dev/log"}, 110) = 0
sendto(4, "<75>Dec 11 10:00:01 crond[24658]: can't change directory to '/share/homes/admin'\n", 81, MSG_NOSIGNAL, NULL, 0) = 81
chdir("/tmp/cron")
setpgid(0, 0)
execve("/bin/sh", ["/bin/sh", "-c", ...) = 0

Now let us move on next to see the parent and here is the problem,
while crond goes to load_crontab()'s loop for next parsing config file,
config_read() -> get_line_with_continuation() -> ... -> read() would be 
blocked on fd=4, and whole crond would be blocked, not responding any more.
Why sendto() knows it is '4'? From glibc static var 'LogFile'.
Parent use the same copy of staic variable 'LogFile=4'.
Why sendto() returned ENOTSOCK?
Because 'fd=4' is a regular file fd in parent's its own fd table, 
and it is not a socket fd.
By glibc syslog.c logic would close it and re-create it again,
then fd=4 is a DGRAM socket connected to /dev/log for daemon logging, 
not a regular file for parsing jobs any more as earlier mentioned.
Last line 'read(4, ...' is expecting to read a regular file and got an EOF 
to exit the load_crontab()'s loop, but actually it would be blocked on a 
unexpected DGRAM socket fd.

read(4, "# m h dom m dow cmd\n0-59/20 3 * * * /sbin/adjust_time\n0 1 * * * /etc/init.d/flush_"..., 4096) = 2293
 > /lib/libc-2.21.so(__read_nocancel+0x7) [0xdb570]
 > /lib/libc-2.21.so(_IO_file_underflow+0x150) [0x73c90]
 > /lib/libc-2.21.so(getdelim+0x2a0) [0x69740]
 > /bin/busybox(get_line_with_continuation+0x2e) [0xa55f9]
sendto(4, "<75>Dec 11 10:01:00 crond[21156]: user admin: parse error at <\n", 63, MSG_NOSIGNAL, NULL, 0) = -1 ENOTSOCK (Socket operation on non-socket)
 > /lib/libc-2.21.so(__send+0x1d) [0xe92fd]
 > /lib/libc-2.21.so(__vsyslog_chk+0x26b) [0xe3abb]
 > /lib/libc-2.21.so(syslog+0x8f) [0xe3e2f]
 > /bin/busybox(bb_verror_msg+0x27c) [0x8711]
close(4)                                = 0
 > /lib/libc-2.21.so(__close_nocancel+0x7) [0xdbb90]
 > /lib/libc-2.21.so(closelog_internal.part.0+0xf) [0xe382f]
 > /lib/libc-2.21.so(__vsyslog_chk+0x493) [0xe3ce3]
 > /lib/libc-2.21.so(syslog+0x8f) [0xe3e2f]
 > /bin/busybox(bb_verror_msg+0x27c) [0x8711]
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4
 > /lib/libc-2.21.so(socket+0x7) [0xe94c7]
 > /lib/libc-2.21.so(openlog_internal+0x1a0) [0xe37f0]
 > /lib/libc-2.21.so(__vsyslog_chk+0x4aa) [0xe3cfa]
 > /lib/libc-2.21.so(syslog+0x8f) [0xe3e2f]
 > /bin/busybox(bb_verror_msg+0x27c) [0x8711]
connect(4, {sa_family=AF_UNIX, sun_path="/dev/log"}, 110) = 0
 > /lib/libc-2.21.so(__connect_nocancel+0x7) [0xe9050]
 > /lib/libc-2.21.so(openlog_internal+0x62) [0xe36b2]
 > /lib/libc-2.21.so(__vsyslog_chk+0x4aa) [0xe3cfa]
 > /lib/libc-2.21.so(syslog+0x8f) [0xe3e2f]
 > /bin/busybox(bb_verror_msg+0x27c) [0x8711]
sendto(4, "<75>Dec 11 10:01:00 crond[21156]: user admin: parse error at <\n", 63, MSG_NOSIGNAL, NULL, 0) = 63
 > /lib/libc-2.21.so(__send+0x1d) [0xe92fd]
 > /lib/libc-2.21.so(__vsyslog_chk+0x4d2) [0xe3d22]
 > /lib/libc-2.21.so(syslog+0x8f) [0xe3e2f]
 > /bin/busybox(bb_verror_msg+0x27c) [0x8711]
read(4,

--

Regards,
Jones Syue | 薛懷宗
QNAP Systems, Inc.