[PATCH 1/1] hwclock: Verify RTC file descriptor; use reentrant functions
Bryan Evenson
bevenson at melinkcorp.com
Mon May 5 15:45:28 UTC 2014
Issac,
> -----Original Message-----
> From: Isaac Dunham [mailto:ibid.ag at gmail.com]
> Sent: Friday, May 02, 2014 8:39 PM
> To: Bryan Evenson
> Cc: Denys Vlasenko; busybox; griebl at gmx.de
> Subject: Re: [PATCH 1/1] hwclock: Verify RTC file descriptor; use reentrant
> functions
>
> On Fri, May 02, 2014 at 06:11:41PM +0000, Bryan Evenson wrote:
> > Denys,
> >
> > > -----Original Message-----
> > > From: Denys Vlasenko [mailto:vda.linux at googlemail.com]
> > > Sent: Friday, May 02, 2014 3:10 AM
> > > To: Bryan Evenson
> > > Cc: Ralf.Friedl at online.de; busybox; griebl at gmx.de
> > > Subject: Re: [PATCH 1/1] hwclock: Verify RTC file descriptor; use
> > > reentrant functions
> > >
> > >
> > > I don't see how this can happen. Do you have evidence rtc_xopen()
> > > did return a result less than zero?
> >
> > I don't have evidence, but that seemed to be the most likely result through
> code inspection. rtc_xopen() returns the result of xopen(), which returns
> the result of open(). open() returns -1 if an error occurred, meaning
> rtc_xopen() could return -1 if an error occurred opening the rtc file.
> >
>
> ...except that xopen() does not just return the result of open().
> xopen() calls xopen3(), which has this:
> ret = open(pathname, flags, mode);
> if (ret < 0) {
> bb_perror_msg_and_die("can't open '%s'", pathname);
> }
>
> bb_perror_msg_and_die should do exactly what it says: exit verbosely.
>
> So no, xopen() cannot ever return -1 unless you have a broken compiler.
> rtc_xopen() similarly cannot return -1.
>
> As a rule of thumb, all the x*() functions will abort if they do not succeed.
Thanks for the explanation. I misread that section and was thinking there was a way for it to return the result of open().
I did get strace loaded on my system and I used that to get a better understanding of where the error occurs. I modified my test script as follows:
----------
#!/bin/sh
i=0
while [ 1 ]; do
strace -f -o /home/root/hwclock_output_"$$_$i".txt /sbin/hwclock -w -u
: $((i++))
sleep 1;
done
----------
Again, on my system I was able to get an instance of hwclock to enter an uninterruptable sleep and hang within seconds when I ran two instances of the test script. By checking the PID of the hung process and the PIDs in the output files, I found the strace of the hung process, which is shown below.
----------
1808 execve("/sbin/hwclock", ["/sbin/hwclock", "-w", "-u"], [/* 13 vars */]) = 0
1808 brk(0) = 0x9c000
1808 uname({sys="Linux", node="at91sam9x5ek", ...}) = 0
1808 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
1808 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
1808 fstat64(3, {st_mode=S_IFREG|0644, st_size=7053, ...}) = 0
1808 mmap2(NULL, 7053, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6f8b000
1808 close(3) = 0
1808 open("/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
1808 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\354\225JJ4\0\0\0"..., 512) = 512
1808 fstat64(3, {st_mode=S_IFREG|0755, st_size=1245200, ...}) = 0
1808 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f8a000
1808 mmap2(0x4a490000, 1284496, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x4a490000
1808 mprotect(0x4a5bc000, 32768, PROT_NONE) = 0
1808 mmap2(0x4a5c4000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x12c) = 0x4a5c4000
1808 mmap2(0x4a5c8000, 6544, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4a5c8000
1808 close(3) = 0
1808 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f89000
1808 set_tls(0xb6f894c0, 0xb6f89b98, 0x4a487058, 0xb6f894c0, 0x4a487058) = 0
1808 mprotect(0x4a5c4000, 8192, PROT_READ) = 0
1808 mprotect(0x4a486000, 4096, PROT_READ) = 0
1808 munmap(0xb6f8b000, 7053) = 0
1808 getuid32() = 0
1808 open("/dev/rtc", O_WRONLY|O_LARGEFILE) = 3
1808 gettimeofday({1399298707, 765884}, NULL) = 0
1808 brk(0) = 0x9c000
1808 brk(0xbd000) = 0xbd000
1808 open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 4
1808 fcntl64(4, F_GETFD) = 0x1 (flags FD_CLOEXEC)
1808 fstat64(4, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
1808 fstat64(4, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
1808 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f8c000
1808 read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0\0"..., 4096) = 3519
1808 _llseek(4, -24, [3495], SEEK_CUR) = 0
1808 read(4, "\nEST5EDT,M3.2.0,M11.1.0\n", 4096) = 24
1808 close(4) = 0
1808 munmap(0xb6f8c000, 4096) = 0
1808 ioctl(3, RTC_SET_TIME, {tm_sec=7, tm_min=5, tm_hour=14, tm_mday=5, tm_mon=4, tm_year=114, ...}
----------
And here is the output from the hwclock instance that occurred from the other test script just before this one:
----------
1799 execve("/sbin/hwclock", ["/sbin/hwclock", "-w", "-u"], [/* 13 vars */]) = 0
1799 brk(0) = 0x9c000
1799 uname({sys="Linux", node="at91sam9x5ek", ...}) = 0
1799 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
1799 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
1799 fstat64(3, {st_mode=S_IFREG|0644, st_size=7053, ...}) = 0
1799 mmap2(NULL, 7053, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6f98000
1799 close(3) = 0
1799 open("/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
1799 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\354\225JJ4\0\0\0"..., 512) = 512
1799 fstat64(3, {st_mode=S_IFREG|0755, st_size=1245200, ...}) = 0
1799 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f97000
1799 mmap2(0x4a490000, 1284496, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x4a490000
1799 mprotect(0x4a5bc000, 32768, PROT_NONE) = 0
1799 mmap2(0x4a5c4000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x12c) = 0x4a5c4000
1799 mmap2(0x4a5c8000, 6544, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4a5c8000
1799 close(3) = 0
1799 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f96000
1799 set_tls(0xb6f964c0, 0xb6f96b98, 0x4a487058, 0xb6f964c0, 0x4a487058) = 0
1799 mprotect(0x4a5c4000, 8192, PROT_READ) = 0
1799 mprotect(0x4a486000, 4096, PROT_READ) = 0
1799 munmap(0xb6f98000, 7053) = 0
1799 getuid32() = 0
1799 open("/dev/rtc", O_WRONLY|O_LARGEFILE) = 3
1799 gettimeofday({1399298707, 387953}, NULL) = 0
1799 brk(0) = 0x9c000
1799 brk(0xbd000) = 0xbd000
1799 open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 4
1799 fcntl64(4, F_GETFD) = 0x1 (flags FD_CLOEXEC)
1799 fstat64(4, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
1799 fstat64(4, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
1799 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f99000
1799 read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0\0"..., 4096) = 3519
1799 _llseek(4, -24, [3495], SEEK_CUR) = 0
1799 read(4, "\nEST5EDT,M3.2.0,M11.1.0\n", 4096) = 24
1799 close(4) = 0
1799 munmap(0xb6f99000, 4096) = 0
1799 ioctl(3, RTC_SET_TIME, {tm_sec=7, tm_min=5, tm_hour=14, tm_mday=5, tm_mon=4, tm_year=114, ...}) = 0
1799 exit_group(0) = ?
1799 +++ exited with 0 +++
----------
So on my system, hwclock is hanging on the ioctl to set the time. Both the instance that set the time and the one that could not set the time are trying to set the RTC to the same time. So I'm assuming from this output that there is something going wrong with the ioctl on my RTC driver when two separate processes attempt to set the time at the same time. If I'm interpreting this correctly, I need to track down what the RTC driver is doing wrong with the RTC_SET_TIME ioctl.
>
> HTH,
> Isaac Dunham
Thanks,
Bryan
More information about the busybox
mailing list