Re: How to report a death by signal ? from Olivier Brunel on 2015-02-18 (skaware)

From: Olivier Brunel <jjk_at_jjacky.com>
Date: Wed, 18 Feb 2015 14:04:11 +0100

On 02/18/15 13:27, Laurent Bercot wrote:
> On 18/02/2015 11:58, Peter Pentchev wrote:
>> OK, so the "not using the whole range of valid exit codes" point rules
>> out my obvious reply - "do what the shell does - exit 128 + signum".
>
> Well the shell is happily ignoring the problem, but it doesn't mean
> it has solved it. The shell reserves a few exit codes, then does some
> best effort, hoping its invoked commands do not step on its feet.
> It works because most commands will avoid exiting something > 125,
> but it's still a convention, and most importantly, the shell itself
> does not follow that convention (it obviously cannot!)
> So, something like sh -c "sh -c foobar" does not report errors
> properly: for 126 and 127, there's no way to know if the code belongs
> to the inner shell or the outer shell, and for 128+, there's no way
> to know if the inner shell or the foobar process got killed.
>
> Shells get away with it because when they're nested, it's usually
> auto-subshell magic and users don't want to know about the inner
> shell; but here, I'm trying to solve the problem for execline commands,
> and those tend to be nested a lot - so I definitely cannot reserve codes
> for the outer command, because the inner command may very well use the
> same ones too.
>
>
>> Now the question is, do you want to solve this problem in general, or do
>> you want to solve it for a particular combination of programs, even if
>> new programs may be added to that combination in the future, but only
>> under certain rules? If it's the former (in general), then, sorry, I
>> don't have a satisfactory answer for you, and the fact that the POSIX
>> shell still keeps the "exit 128 + signum" behavior mostly means that
>> nobody else has come up with a better one, either (or it might be
>> available at least as some kind of an option).
>
> It just means that nobody cares about shell exit codes. Error handling,
> if any, is done inside of shell scripts anyway; and in most scripts, a
> random signal killing a running command isn't even something people think
> about, and I'm sure there are hilarious behaviours hiding in dark corners
> of very popular shell scripts, that fortunately remain asleep to this day.
>
> For execline, however, I cannot use the same casual approach. Execline
> scripts live and die by proper exit code reporting, and carelessness may
> lead to very obvious breakage.
>
>
>> Personally, I quite like the idea of some kind of a pipe (be it a
>> pipe(2) pair of file descriptors or an AF_UNIX/PF_UNSPEC socketpair or
>> some other kind of communication channel based on file descriptors),
>> even if it is only unidirectional:
>
> Oh, don't get me wrong, I'm a fan of child-to-parent communication via
> pipes, and I use it wherever applicable. Unfortunately, the child may
> be anything here, so I need something generic.

I don't follow: what's wrong with using an fd? Because that was my idea
as well: P returns the child's exit code, or 255 if the child was killed
by a signal. If there's a need to tell an exit with 255 apart from a
signal, an option taking an fd shall be used, and the process P will then
also write to said fd the exit code, or 255 + signal number (as a string).

Note that the child C has nothing to do with this here: it is P that
waits for it and gets the wstat, possibly writing to the fd given as
an option, and it is the grandparent G that needs to specify an fd to P
and then read it to get the full status info.
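
To make the three roles concrete, here is a minimal sketch in Python
(chosen for brevity). The names `run_and_report` and `grandparent`, the
`status_fd` parameter and the trailing newline are illustrative
assumptions, not an existing execline or s6 interface. For simplicity P
and G are plain functions in one process here; in a real setup P would be
its own program and G would pass the fd number through a command-line
option.

```python
import os
import signal
import sys


def run_and_report(argv, status_fd=None):
    """Role of P: run argv as child C, wait for it, return P's exit code.

    If status_fd is given (hypothetical option), also write the full
    status to it as a decimal string: the exit code, or 255 + signal number.
    """
    pid = os.fork()
    if pid == 0:
        # Child C: it knows nothing about the reporting fd.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # exec failed; 127 mirrors the shell convention
    _, wstat = os.waitpid(pid, 0)
    if os.WIFSIGNALED(wstat):
        exit_code = 255                    # all P can say via its own exit
        full = 255 + os.WTERMSIG(wstat)    # unambiguous value for the fd
    else:
        exit_code = full = os.WEXITSTATUS(wstat)
    if status_fd is not None:
        os.write(status_fd, b"%d\n" % full)
    return exit_code


def grandparent(argv):
    """Role of G: hand P a pipe, then read the full status back."""
    r, w = os.pipe()
    exit_code = run_and_report(argv, status_fd=w)
    os.close(w)
    with os.fdopen(r) as f:
        full = int(f.read())
    return exit_code, full
```

With this encoding, a value of 0..255 read from the fd is a real exit
code and anything above 255 is a signal, so a child that exits 255 and a
child that gets killed no longer look the same to G.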

Though if you want "shell compatibility" you could also have an option
to return the exit code, or 128+signum when signaled. Similarly, one
would then either be fine with that, or have to use the fd to get the
complete info.
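
As an illustration of why the fd would still be needed in that mode, the
sketch below (Python again; `wait_status_of` and `shell_style_exit_code`
are hypothetical helper names) maps a wait status to a code the way the
shell does. A child that exits with 128+signum and a child killed by that
signal end up with the same code.

```python
import os
import signal
import sys


def wait_status_of(argv):
    """Run argv as a child and return its raw wait status (wstat)."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # exec failed
    _, wstat = os.waitpid(pid, 0)
    return wstat


def shell_style_exit_code(wstat):
    """Shell convention: exit code as-is, or 128 + signal number."""
    if os.WIFSIGNALED(wstat):
        return 128 + os.WTERMSIG(wstat)
    return os.WEXITSTATUS(wstat)


def demo():
    """Return the shell-style codes of an 'exited' and a 'killed' child."""
    code = 128 + signal.SIGKILL
    exited = wait_status_of(
        [sys.executable, "-c", "raise SystemExit(%d)" % code])
    killed = wait_status_of(
        [sys.executable, "-c",
         "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"])
    return shell_style_exit_code(exited), shell_style_exit_code(killed)
```

Both values returned by `demo()` are equal even though the wait statuses
differ, which is the ambiguity the fd is meant to remove.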

> Thanks for your input !
>
Received on Wed Feb 18 2015 - 13:04:11 UTC
