Re: nosh: service-dt-scanner gets repeatedly killed by SIGABRT

From: Jonathan de Boyne Pollard <J.deBoynePollard-newsgroups_at_NTLWorld.com>
Date: Sun, 12 Jul 2015 13:45:20 +0100

Guillermo:
> Jonathan de Boyne Pollard:
>> If there's no error output, crank up strace and see what the last few system calls are. It's probably worthwhile doing that anyway, in fact.
> [...]
>
> a read() call on the file descriptor returned by the inotify_init() that produces an EINVAL error, followed by rt_sigprocmask() with a SIG_UNBLOCK argument, and the tgkill() that sends the SIGABRT.

Remember that I said that my immediate suspicion was a (fourth)
libkqueue bug? It's a fourth libkqueue bug.

And it's here:

* https://github.com/mheily/libkqueue/blob/master/src/linux/vnode.c#l70

As the inotify(7) manual page says, read() fails with EINVAL if the
buffer that it is given is too small to hold the next event. And events
can be larger than sizeof(struct inotify_event), since a variable-length
name can follow the fixed-size header. libkqueue doesn't deal with this
failure properly, leading to a call to abort():

* https://github.com/mheily/libkqueue/blob/master/src/linux/platform.c#l181
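
To make the EINVAL behaviour concrete, here is a minimal sketch in C. It
is not libkqueue's code. The names ONE_EVENT_MAX, read_inotify_once()
and queue_named_event(), and the /tmp/inotify-demo-XXXXXX path, are
hypothetical and exist only for this illustration. The buffer size is the
one that inotify(7) gives as sufficient for at least one event.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for mkdtemp() under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Per inotify(7): the fixed header, plus a name of up to NAME_MAX
 * bytes, plus its terminating NUL, always holds at least one event. */
#define ONE_EVENT_MAX (sizeof(struct inotify_event) + NAME_MAX + 1)

/* Hypothetical helper: read once from an inotify descriptor with a
 * buffer of the given size. Returns the byte count, or -errno. */
static long read_inotify_once(int fd, size_t bufsize)
{
    char *buf;
    ssize_t n;
    long result;

    assert(bufsize > 0);
    buf = malloc(bufsize);
    if (buf == NULL)
        return -ENOMEM;
    n = read(fd, buf, bufsize);
    result = (n < 0) ? -(long)errno : (long)n;
    free(buf);
    return result;
}

/* Hypothetical demo setup: watch a fresh temporary directory and create
 * one file in it, so that an event carrying a name is queued. The file
 * and directory are removed again; the queued event remains readable.
 * Returns the inotify descriptor, or -1 on failure. */
static int queue_named_event(void)
{
    char dir[] = "/tmp/inotify-demo-XXXXXX";
    char path[sizeof dir + sizeof "/example"];
    FILE *f;
    int fd;

    if (mkdtemp(dir) == NULL)
        return -1;
    fd = inotify_init();
    if (fd >= 0 && inotify_add_watch(fd, dir, IN_CREATE) < 0) {
        close(fd);
        fd = -1;
    }
    if (fd >= 0) {
        snprintf(path, sizeof path, "%s/example", dir);
        f = fopen(path, "w");      /* queues IN_CREATE for "example" */
        if (f != NULL) {
            fclose(f);
            unlink(path);
        } else {
            close(fd);
            fd = -1;
        }
    }
    rmdir(dir);
    return fd;
}
```

On kernels since 2.6.21, calling read_inotify_once() on that descriptor
with sizeof(struct inotify_event) bytes should yield -EINVAL, because
the queued event carries a name. Calling it again with ONE_EVENT_MAX
bytes should succeed, since a failed read leaves the event queued.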

nosh code never calls abort(), never calls raise(SIGABRT), and would
have printed some kind of message if an unhandled exception had led to
an abort being raised by the C++ library.

The output that you are seeing from service-dt-scanner is because of a
spurious wakeup.

* https://github.com/mheily/libkqueue/blob/master/src/linux/platform.c#l199

You can turn these debug messages on with the KQUEUE_DEBUG=1 environment
variable (and by compiling the library in debug mode), apparently.

* https://github.com/mheily/libkqueue/blob/master/src/common/kqueue.c#l68

libkqueue is receiving events from inotify that the caller of kevent()
isn't actually interested in, resulting in a spurious wakeup from the
call to kevent() with no actual event to report. The output to standard
error is a minor bug in service-dt-scanner, because it assumes that
every time that it is woken up and kevent() returns successfully there
will be at least one event. It's finding nonsense in the event buffer
and printing out a debug message when it ignores the nonsense. This is
fixed in version 1.18, but this isn't really the cause of your problem
here. It's just distracting log noise.

The problem here is that inotify is waking kevent() up because you
listed the directory. I suspect this change in your version of
libkqueue, at first glance:

* https://github.com/mheily/libkqueue/commit/e41cc259a0318b0e7925521d0fe3bc7433971ace

After the spurious wakeup, there is a second event enqueued by the
kernel that is bigger than sizeof(struct inotify_event). Whether that's
an uninteresting event too, and whether it is also caused by your
listing the directory, is unknown. libkqueue isn't passing a buffer big
enough to read it so that we can see what it is, and is abort()ing when
the kernel returns an error because the read buffer is too small.

This will be a tricky one for the libkqueue people to fix, since
libkqueue isn't currently geared up to process multiple events from
inotify at once, which it would have to be prepared for if it were to
start using a bigger buffer. But it is a libkqueue problem to be
fixed. All that service-dt-scanner is doing is registering just one
event of interest, and calling kevent() in a fairly tight loop that's in
fact doing nothing else (apart from dumping the value of the spurious
event).
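
For illustration, handling several events from one read() could look
roughly like the following sketch in C. This is not libkqueue code and
not a proposed patch. The names for_each_inotify_event(), event_handler
and count_named() are hypothetical. It relies only on the layout that
inotify(7) documents: each event is a fixed header followed by len bytes
of name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/inotify.h>

/* Hypothetical callback type: receives a copy of the fixed header and a
 * pointer to the name (an empty string when the event has no name). */
typedef void (*event_handler)(const struct inotify_event *hdr,
                              const char *name, void *ctx);

/* Walk a buffer that one read() on an inotify descriptor filled, calling
 * handle() once per event. Returns the number of events handled, or -1
 * if the buffer ends in the middle of an event. */
static int for_each_inotify_event(const char *buf, size_t nbytes,
                                  event_handler handle, void *ctx)
{
    size_t off = 0;
    int count = 0;

    assert(buf != NULL && handle != NULL);
    while (off < nbytes) {
        struct inotify_event hdr;
        const char *name = "";

        if (nbytes - off < sizeof hdr)
            return -1;
        /* Copy the header out, to avoid assuming buffer alignment. */
        memcpy(&hdr, buf + off, sizeof hdr);
        if (nbytes - off - sizeof hdr < hdr.len)
            return -1;
        if (hdr.len > 0)
            name = buf + off + sizeof hdr;
        handle(&hdr, name, ctx);
        /* The next event starts right after this one's name bytes. */
        off += sizeof hdr + hdr.len;
        ++count;
    }
    return count;
}

/* Example handler: counts the events that carry a non-empty name. */
static void count_named(const struct inotify_event *hdr,
                        const char *name, void *ctx)
{
    if (hdr->len > 0 && name[0] != '\0')
        ++*(int *)ctx;
}
```

A caller would pass the buffer and the byte count that read() returned,
and an int for count_named() to increment. Mapping each inotify event
back onto the kevent() filters that were actually registered is the
part that this sketch leaves out.
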
Received on Sun Jul 12 2015 - 12:45:20 UTC