Re: Some suggestions on old-fashioned usage with s6 2.10.x

From: Laurent Bercot <ska-supervision_at_skarnet.org>
Date: Mon, 15 Feb 2021 14:58:39 +0000

>I do not really understand their excuse here. CLI incompatibility is
>trivially solvable by creating links (or so) for `halt' / `poweroff' /
>`reboot', and even the `shutdown' command can be a wrapper for an `atd'
>based mechanism.

  The options! The options all need to be compatible. :) And for
"shutdown", they would never implement a wrapper themselves; I would
have to do it for them - which is exactly what I did, although it's
a C program that actually implements shutdown, not a wrapper around an
atd program I can't assume will be present on the system.

  I'm not defending distros here, but it *is* true that a drop-in
replacement, in general, is a lot easier to deal with than a drop-in-
most-of-the-time-maybe-but-not-with-that-option replacement. Anyone
who has tried to replace GNU coreutils with busybox can relate.


> In case they complain about the implementation of the
>CLI, the actual interface to `shutdownd' is not that similar to the
>`telinit' interface (at least to the one I think it is) either.

  Which is why s6-l-i also comes with a runleveld service, for people
who need the telinit interface. shutdownd is only for the actual
stages 3 and 4, not service management (of which telinit is a
now-obsolete forerunner).


>If I understand it correctly, letting `s6-svscan' exec() stage 3 also
>achieves immunity to `kill -KILL -1'. I also find this "old-fashioned"
>approach conceptually and implementationally simpler than an army of
>`s6-supervise' restarting only to be killed again

  What army? By the time the final kill happens, the service manager
has brought everything down, and shutdownd has cleaned up the scandir,
only leaving it with what *should* be restarted. You seem to think
I haven't given these basic things the two minutes of attention they
deserve.

  Conceptually, the "old-fashioned" approach may be simpler, yes.
Implementationally, I disagree that it is, and I'll give you a very
simple example to illustrate it; but it's not the only thing that
implementations must pay attention to - there are a few other quirks
that I've stumbled upon and that disappear when s6-svscan remains
pid 1 until the very end.

  You're going to kill every process. The zombies need to be reaped,
else you won't be able to unmount the filesystems. So your pid 1
needs to be able to wait for children it doesn't know it has
(foreground does not) and guarantee that it doesn't try unmounting
the filesystems before having reaped everything (a shell does not give
ordering guarantees when it gets a SIGCHLD, even though it works in
practice). So for this specific use I had to add a special case to
execline's wait command, "wait { }", that waits on *everything*, and
also make sure that wait doesn't die, because it's going to run as
pid 1, even if only very briefly.
  And after that, you need to make sure to unmount the filesystems
immediately, because if you spawn any other processes, you first have
to wait on them as well.
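
  To make the ordering requirement concrete, here is a minimal C sketch
(not s6 or execline source; the mount point and error code are
placeholders) of the "reap everything, then unmount" step that a bare
pid 1 has to perform after the final kill - essentially the job that
"wait { }" does in that spot:

/* Sketch only: reap every child, then unmount.  As pid 1 we are the
   parent of every orphan left behind by the final kill, so wait()
   only reports ECHILD once nothing is left running. */
#include <errno.h>
#include <sys/wait.h>
#include <sys/mount.h>

int main (void)
{
  for (;;)
  {
    if (wait(0) == -1)
    {
      if (errno == ECHILD) break ;   /* no children left: safe to go on */
      if (errno == EINTR) continue ;
      return 111 ;  /* anything else: don't risk unmounting a busy fs */
    }
  }
  umount("/home") ;                         /* placeholder mount point */
  mount(0, "/", 0, MS_REMOUNT | MS_RDONLY, 0) ;  /* remount root ro */
  /* then exec into whatever performs the actual reboot(2) call */
  return 0 ;
}

  Note that foreground alone cannot give you this: it only waits for
the one child it spawned, so processes reparented to pid 1 would never
be waited for, and you would have no guarantee that everything is gone
before the unmount.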

  For every process that may run as pid 1, you need extra special care.
Using an interpreter program as pid 1 means your interpreter needs to
have been designed for it. Using execline means every execline binary
that may run as pid 1 needs to be vetted for it. If your shutdown
sequence is e.g. written in Lisp, and your Lisp interpreter handles
pid 1 duties correctly, okay, that's fair, but that's *two* programs
that need to do it, when one would be enough.
  s6-svscan has already been designed for that and provides all the
guarantees you need. When s6-svscan is running as pid 1, it takes a
lot of the mental burden off the shutdown sequence.


> and a `shutdownd'
>restarting to execute the halting procedure (see some kind of "state"
>here? Functional programmers do not hate it for nothing).

  Yes, there is one bit of state involved. I think our feeble human minds,
and a fortiori computers, can handle one bit of state.


> I know this
>seems less recoverable than the `shutdownd' approach, but does that
>count as a reason strong enough to warrant the latter approach, if the
>halting procedure has already been distilled to its bare essentials
>and is virtually immune to all non-fatal problems (that is, excluding
>something as severe as the absence of a `reboot -f' implementation)?

  My point is that making the halting procedure virtually immune to all
non-fatal problems is *more difficult* when you tear down the
supervision tree early. I am more confident in the shutdownd approach,
because it is less fragile, more forgiving. If there's a bug in it, it
will be easy to fix.

  I understand that the barebones approach is intellectually more
satisfying - it's more minimalistic, more symmetrical, etc. But shutting
down a machine is *not* symmetrical to booting it. When you boot, you
start with nothing and need a precise sequence of instructions in order
to build up to a functional system. When you shut down, you already
have a fully functional system, one that has proven to work, and you
just need to clean up and make sure you don't stop with an incoherent
state;
you don't need to deconstruct the working system you have in order to
poweroff with the minimal amount of stuff! As long as you can cleanly
unmount the filesystems, nobody cares what your process tree looks like
when the machine is going to be *down*.

  In this instance, the existence of a reliable pid 1 with well-known
behaviour is a strong guarantee that makes writing a shutdown sequence
easy enough. Voluntarily getting rid of that guarantee and making your
system more fragile because technically supervision is not *needed*
anymore may make sense from an academic perspective, and may be
aesthetically more pleasing, but from an engineering standpoint, it is
not a good idea.


>What I intend to express is that unconditionally correlating "a bunch
>of [...] scripts" to "a 'screwdriver and duct tape' feel" is a typical
>systemd fallacy. You seemed to be confusing "scripts containing lots of
>boilerplate" with "scripts that are minimised and clear".

  The "screwdriver and duct tape" feel does not come from the fact that
those are scripts; it comes from the fact that the scripts run in a less
forgiving environment where they have to provide the necessary guarantees
themselves, as opposed to keeping using the framework that has been
running for the whole lifetime of the system and that is still valid and
helpful, even though for once you have to interact with it and tell it
to stop supervising some services because we're shutting down - which is
the exact kind of situation the supervision API was made for.

  The distinction is similar to doing things in kernel space vs. in user
space. If I have a task to do and have a kernel running, I prefer to do
the task in user space - it's more comfortable and less error-prone, and
if someone wishes to do it in kernel space, my reaction will be "why?
this is more hackish, they're probably trying to flex their kernel
programmer muscles, good engineering says this belongs in user space".
Running naked scripts as pid 1 when you don't have to kinda gives me
the same feeling.


>According to Guillermo's observation about the behavioural similarity
>between slew's `rc.boot'/`rc.halt' and the current mechanism with
>s6-linux-init, if I understand the big picture correctly enough, the
>fundamental difference between the approaches might be the difference in
>languages (to avoid further digression, here I expressly avoid talking
>about Lisp ;) and the attendant difference in dependencies. Speaking of
>the latter, I do not find declaring dependence on things like `rc' and
>BusyBox really a problem to any packager of systemd. Speaking of the
>former, the "old-fashioned" approach is obviously more flexible; I have
>also said that it is probably shorter and perhaps clearer.

  The fundamental difference is that the current s6-linux-init hardcodes
a lot of things in stage 1, purposefully. Yes, it is less flexible -
though you *still* have a stage 1 hook if you really need it - but the
whole point is to make stage 1 entirely turnkey and foolproof, and only
hand off to the user when the supervision framework is in place and
they don't have to worry about basic things like not being able to log
into the system. Same reason why I prefer the shutdownd approach:
minimize and automate all the parts where the supervision tree is not
operational, so that users can always assume that nothing they do is
going to brick the system.

  It bears repeating that the main criticism I've received for the s6
ecosystem is, overwhelmingly, the *abundance* of moving parts, and the
difficulty of grasping the big picture. The current s6-linux-init helps
with this, by hiding a lot of fragile moving parts and making it
*easier* to switch to s6 as an init system without having to fully
understand the intricate details of stage 1.
  Of course, it's not necessarily perceived as a benefit by tinkerers
like you, who do not mind, or even enjoy, the extra DIY feel. I'm
sorry - but if you need that kind of flexibility in stage 1, you are
perfectly capable of building your own stage 1 without s6-linux-init.

  I also disagree that the script approach is shorter and/or clearer.
It may be clearer to people who read a script better than a doc page
(or C code), but I don't think it should matter as long as the doc is
accurate; if it's not, that's what should be fixed. And the source code
may be shorter with a scripted stage 1, for sure, but the code paths
taken by the CPU are way shorter with the C version, and make fewer
assumptions. I'm confident that the current s6-linux-init breaks in
significantly fewer situations than its previous incarnation.

--
  Laurent