On Thu, Apr 23, 2015 at 4:40 PM, Laurent Bercot <ska-skaware_at_skarnet.org>
wrote:
>
>  So, I've been planning to write s6-rc, a complete startup/shutdown script
> system based on s6, with complete dependency management, and of course
> optimal parallelization - a real init system done right.
>  I worked on the design, and I think I have it more or less down; and I
> started coding.
>
>  Then Olivier released anopa: http://jjacky.com/anopa/
>
>  anopa is pretty close to my vision. It's well-designed. It's good.
> There *are* essential differences with s6-rc, though, and some of them
> are important enough that I don't want to immediately stop writing s6-rc
> and start endorsing anopa instead.
>
>  This post tries to explain how s6-rc is supposed to work, and how it
> differs from anopa, and why I find the differences important. What I hope
> to achieve is a design discussion, with Olivier of course, but also other
> people interested in the subject, on how an ideal init system should work.
>
>  My goals are to:
>  - reach a decision point: should I keep writing s6-rc or drop it ?
> Dropping it can probably only happen if Olivier agrees on making a few
> modifications to anopa, based on the present discussion, but I don't
> think it will be the case because some of those modifications are
> pretty hardcore.
>  - if I keep writing s6-rc: benefit from this discussion and from
> Olivier's experience to avoid pitfalls or designs that would not stand
> the test of real-life situations.
>
>  So, on to it.
>
>
>  Three kinds of services
>  -----------------------
>
>  Like anopa, s6-rc works internally with two kinds of services: longrun,
> which is simply defined by a service directory that will be directly
> managed by s6, and oneshot, which is defined by a directory containing
> data (a start script, a stop script, and some optional stuff).
>
>  s6-rc allows the user to provide a third kind of service: a "bundle".
> A bundle is simply a set of other services. Starting a bundle means
> starting all the services contained in the bundle.
>  A bundle can be used to emulate a SysV runlevel: the user can put all the
> services he needs into a single bundle, then tell s6-rc to change the
> machine state to "exactly that bundle".
>  Bundles can of course contain other bundles.
>
>  A oneshot or a longrun are called atomic services, as opposed to a bundle,
> which is not atomic.
>  Bundles are useful for the user, because "oneshot" and "longrun" are
> often too small a granularity. For instance, the "Samba" service is made
> of two longruns, smbd and nmbd, but it's still a single service. So,
> samba would be a bundle containing smbd and nmbd.
>
>  Also, the smbd daemon itself could want its own logger, smbd-log.
> Correct daemon operation depends on the existence of a logger (a daemon
> cannot start if its logger isn't working). So smbd would actually be a
> bundle of two longruns, smbd-run (which is the smbd process itself) and
> smbd-log (which is the logger process), and smbd-run would depend on
> smbd-log.
>
>  Users who want to start Samba don't want to deal with smbd-run, smbd-log,
> nmbd-run and nmbd-log manually, so they would just start "samba", and
> s6-rc would resolve "samba" to the proper set of atomic services.
>
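A rough Python sketch of that resolution step, as I understand it. The table layout and the function name are made up for illustration; they are not s6-rc's actual data format:

```python
# Hypothetical service table: a bundle maps to the list of its contents,
# an atomic service (longrun or oneshot) maps to None.
SERVICES = {
    "samba": ["smbd", "nmbd"],
    "smbd": ["smbd-run", "smbd-log"],
    "nmbd": ["nmbd-run", "nmbd-log"],
    "smbd-run": None,
    "smbd-log": None,
    "nmbd-run": None,
    "nmbd-log": None,
}


def resolve(name, services=SERVICES, _seen=None):
    """Return the set of atomic services that `name` expands to."""
    seen = set() if _seen is None else _seen
    if name in seen:
        # Already expanded on another path; this also stops bundle loops.
        return set()
    seen.add(name)
    contents = services[name]
    if contents is None:
        # Atomic service: it resolves to itself.
        return {name}
    atomic = set()
    for member in contents:
        atomic |= resolve(member, services, seen)
    return atomic


print(sorted(resolve("samba")))
# ['nmbd-log', 'nmbd-run', 'smbd-log', 'smbd-run']
```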
>
>  Source, compiled and live
>  -------------------------
>
>  Unlike anopa, s6-rc does not operate directly at run-time on the
> user-provided service definitions. Why ? Because user-provided data is
> error-prone, and boot time is a horrible time for debugging. Also, s6-rc
> uses a complete graph of all services for dependency management, and
> generating that graph at run-time is costly.
>
>  Instead, s6-rc provides a "s6-rc-compile" utility that takes the
> user-provided service definitions, the "source", and compiles it into
> binary form in a place in the root filesystem, the "compiled".
>
>  At run-time, s6-rc ignores the source, but reads its data from the
> compiled, which can be on a read-only filesystem. It also needs a
> read-write place to maintain information about its state; this place is
> called the "live". Unlike the compiled, the live is small: it can reside
> in RAM.
>
OK. But what about a running system that is already using a compiled
database? Say I install a new package containing a service (or a bundle)
and want to start that service without rebooting the whole computer.
Can s6-rc-compile compile just the extra bit and add it at run-time,
without recompiling everything from scratch (until the next reboot)?
>
>  The point of this separation is multifold: efficiency (all checks,
> parsing and graph generation performed at compile-time), safety (the
> compiled can be write-protected), and clarity (separation of user-
> modifiable data, current configuration data, and current live data).
>
>  Atomic services can be very small. It can be a single line of shell
> for a oneshot, for instance. I fully expect package developers to
> produce source definitions with multiple atomic services (and dependencies
> between those services) and a bundle representing the whole package.
> I expect the total number of atomic services on a typical reasonably
> loaded machine to be around a thousand. Yes, it can grow very fast -
> so having a compiled database isn't a luxury.
>
>
>  Run-time
>  --------
>
>  At run-time, s6-rc only works in *stage 2*.
>  That is important, and one of the few things I do not like in anopa:
> stage 1 should be completely off-limits to any tool.
>
>  s6-rc only wants a machine with a s6-svscan running on a scandir. It does
> not care what happened before. It does not care whether s6-svscan is
> process 1 or not.
>
>  This does not mean s6-rc cannot handle one-time initialization. On the
> contrary, my view is that one-time initialization should be deferred to
> stage 2 as much as possible, with an absolutely minimal stage 1. For those
> who want to run s6-svscan as process 1 (and they're right), I intend to
> work on a s6-init package that will provide suitable minimal stage 1s
> depending on the OS and user configuration; they will start stage 2 with
> s6-svscan running on an empty scandir - save the catch-all logger and
> maybe a getty.
>
>  In stage 2, the user should start by running the "s6-rc-init" program,
> which is roughly the equivalent of anopa's "aa-enable".
> s6-rc-init will initialize the live area, and also start all the
> supervisors for all the defined longrun services (so that notifications
> work properly later on). Service directories are copied from the compiled
> to the live, and initially they all have a down file so the supervisors
> are started but not the services. Down files, like the rest of service
> directories, are managed directly by s6-rc: once the user relinquishes
> her machine state management to s6-rc, she does not tinker manually with
> service directories ever again.
>
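To make sure I follow the copy-with-down-file step, here is a minimal Python sketch. The directory layout (one subdirectory per longrun under some "compiled" path) and the name init_live are assumptions of mine, not how s6-rc-init is actually laid out:

```python
import shutil
from pathlib import Path


def init_live(compiled_servicedirs, live_servicedirs):
    """Copy every service directory to the live area and mark it down.

    Both arguments are pathlib.Path objects; the layout is hypothetical.
    """
    live_servicedirs.mkdir(parents=True, exist_ok=True)
    names = []
    for src in sorted(compiled_servicedirs.iterdir()):
        if not src.is_dir():
            continue
        dst = live_servicedirs / src.name
        shutil.copytree(src, dst)
        # The down file keeps the service itself from starting when its
        # supervisor is brought up.
        (dst / "down").touch()
        names.append(src.name)
    return names
```

Linking the live service directories into the scandir and notifying s6-svscan is left out on purpose.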
>  After s6-rc-init has been run, the user can simply invoke the service
> management engine, the "s6-rc" program itself. "s6-rc -u servicelist"
> will bring up all the services in servicelist. servicelist can contain
> bundle names: s6-rc will first resolve everything into a set of atomic
> services, then start everything, beginning with the services it needs to
> bring up and that have no dependencies. As soon as the dependencies are
> solved for a service belonging to the set, s6-rc starts this service.
>  s6-rc exits when it has no more services in waiting.
>
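For my own understanding, a small Python sketch of that start order. It groups services into "waves", which is coarser than what is described above (there, each service starts as soon as its own dependencies are up), and the service names and the deps mapping are invented for the example:

```python
def start_waves(wanted, deps):
    """Group services into waves; a wave's members could start in parallel.

    deps maps a service name to the list of services it depends on.
    """
    # Pull in the transitive dependencies of what was asked for.
    needed, stack = set(), list(wanted)
    while stack:
        svc = stack.pop()
        if svc not in needed:
            needed.add(svc)
            stack.extend(deps.get(svc, ()))

    up, waves = set(), []
    while len(up) < len(needed):
        # Ready: not up yet, and every dependency is already up.
        ready = sorted(s for s in needed - up if set(deps.get(s, ())) <= up)
        if not ready:
            raise ValueError("dependency cycle")
        waves.append(ready)
        up.update(ready)
    return waves


deps = {
    "smbd-run": ["smbd-log", "network"],
    "smbd-log": [],
    "network": ["mount-fs"],
    "mount-fs": [],
}
print(start_waves(["smbd-run"], deps))
# [['mount-fs', 'smbd-log'], ['network'], ['smbd-run']]
```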
>  The s6-rc program itself is pretty small. I have finished writing its
> code: the source is less than 25 kB long. All the complexity has been
> moved to the data structures - basically to s6-rc-compile. I like the
> idea that the main engine, the program that actually starts and stops
> services and that the boot process lives by, is small and simple; all
> the hard stuff is handled offline.
>  Oh, and s6-rc does not use malloc. :)
>
>  If an error occurs, i.e. a start script fails, s6-rc marks this service,
> and recursively all that depends on it, as unavailable for this run. It
> will keep running until it has started everything it has been asked to
> start and that does not depend on the failing service. It then exits
> nonzero.
>  There is no retry policy. Users can loop around the s6-rc invocation if
> they want to implement a retry policy: "s6-rc -u servicelist" is
> idempotent if servicelist does not change between invocations.
>
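The failure handling, as I read it, could be sketched like this in Python. The start callable is a stand-in for running a start script or s6-svc -U, the order argument is assumed to already respect dependencies, and all names are hypothetical:

```python
def bring_up(order, deps, start):
    """Try to start services in `order`; skip whatever depends on a failure.

    Returns (started, unavailable, exit_code).
    """
    started, unavailable = [], set()
    for svc in order:
        if any(d in unavailable for d in deps.get(svc, ())):
            # Depends, directly or not, on something that failed.
            unavailable.add(svc)
        elif start(svc):
            started.append(svc)
        else:
            # Its own start action failed.
            unavailable.add(svc)
    return started, unavailable, (1 if unavailable else 0)


deps = {"a": [], "b": ["a"], "c": ["b"], "d": []}
result = bring_up(["a", "b", "c", "d"], deps, start=lambda svc: svc != "b")
# result[0] == ['a', 'd'], result[1] == {'b', 'c'}, result[2] == 1
```

A retry loop would then just call bring_up again with the same list until the exit code is 0 or the caller gives up.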
>  A oneshot start script is considered "pending" by s6-rc while it is
> running; it is considered successful when it exits zero, and a failure
> when it exits nonzero or is killed. A "start" action for a longrun
> service is a "s6-svc -U" invocation: it is successful when the daemon is
> running and has notified its readiness. A timeout can be defined, just
> like with anopa.
>  It is possible, though not recommended, for s6-rc to assume autoreadiness
> for a longrun service (i.e. start it with "s6-svc -u").
>
>
>  Symmetry and dependencies
>  -------------------------
>
>  Dependencies are provided by the user, in the source; they are tied to
> atomic services (a bundle cannot depend on anything).
>  An atomic service can depend on any other service. A dependency on a
> bundle means a dependency on every atomic service contained in that
> bundle.
>  s6-rc-compile reads all the dependencies and creates the complete DAG in
> compiled. Cycles, of course, result in a compilation error.
> s6-rc-compile also automatically creates the dual DAG of reverse
> dependencies.
>
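A minimal Python sketch of the two things s6-rc-compile is described as doing here: rejecting cycles and deriving the reverse graph. It assumes bundles have already been expanded and that every dependency also appears as a key; the in-memory dict is my own stand-in for whatever the compiled format really is:

```python
def compile_graph(deps):
    """Check `deps` for cycles and return the reverse-dependency map."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, done
    color = {svc: WHITE for svc in deps}

    def visit(svc):
        color[svc] = GREY
        for dep in deps[svc]:
            if color[dep] == GREY:
                # We came back to a service on the current path: a cycle.
                raise ValueError(f"dependency cycle through {dep}")
            if color[dep] == WHITE:
                visit(dep)
        color[svc] = BLACK

    for svc in deps:
        if color[svc] == WHITE:
            visit(svc)

    # Reverse graph: dependency -> services that depend on it.
    reverse = {svc: [] for svc in deps}
    for svc, needs in deps.items():
        for dep in needs:
            reverse[dep].append(svc)
    return reverse
```

Bringing things down would then walk the reverse map the same way bringing things up walks deps.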
>  Bringing stuff up (s6-rc -u) and bringing stuff down (s6-rc -d) are
> symmetrical. They are handled the exact same way by s6-rc, calling the
> "start" or "stop" script for oneshots, or calling "s6-svc -U" or
> "s6-svc -d" for longruns, and using either the direct dependency graph
> or the reverse dependency graph as needed.
>
>
>  Live updates
>  ------------
>
>  There's a complex thing that anopa more or less evades but that I feel is
> necessary in order to be adopted by a distribution: live updates. I'm not
> exactly sure yet how to proceed, but I have a vague idea, and I would like
> more input on the subject.
>
>  Users will upgrade their packages. They will sometimes need to restart
> longrun services. If not much has changed, it's easy: they can do it with
> s6-svc without touching the global state, so it's not s6-rc's or anopa's
> concern. However, sometimes things change: new daemons are introduced,
> new dependencies are introduced, etc.
>
>  My view is that packages should provide source definitions, and after an
> update, the distribution should invoke s6-rc-compile again. This is easy
> enough, but then the live state does not match the current compiled
> service database anymore. anopa has a similar problem with its current
> service repository.
>
>  I am thinking about a utility, "s6-rc-update", that would take the live,
> the old compiled and the new compiled as inputs, and that would update
> the live as smartly as possible, with carefully designed heuristics;
> users could also tell s6-rc-update exactly what to do via annotations in
> the source, that s6-rc-compile would translate into the new compiled.
>
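Just to have something concrete to discuss, here is one naive reading of such an update step in Python. It compares services by name only, treats each definition as an opaque comparable value, and is in no way the heuristics being proposed; every name in it is hypothetical:

```python
def naive_update_plan(live_up, old_compiled, new_compiled):
    """Decide what to do with each service that is currently up.

    live_up: set of names currently up.
    old_compiled / new_compiled: dict mapping name -> definition.
    """
    # Gone from the new database: nothing left to manage it with.
    removed = {s for s in live_up if s not in new_compiled}
    # Still there, but the definition differs from the old one.
    changed = {
        s for s in live_up
        if s in new_compiled and old_compiled.get(s) != new_compiled[s]
    }
    kept = live_up - removed - changed
    return {
        "stop": sorted(removed),
        "restart": sorted(changed),
        "keep": sorted(kept),
    }
```

Renamed services and changed dependencies are exactly the cases this gets wrong, which is presumably where the annotations would come in.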
>
>  Tricky implementation details
>  -----------------------------
>
>  What good is a new init system if it's vulnerable to the old sysvrc
> pitfalls ? :P
>
>  One of the main issues with sysvrc is that scripts are run as scions of
> the invoking shell. So, a sysvrc script run by boot scripts isn't run with
> the same environment as the same script run manually by an admin, and
> this is very difficult to harden.
>
>  Supervision suites solved that problem for longrun services. Since
> daemons are started by the supervision tree, and never by an admin's shell
> or the boot scripts' shell, they are always started with a reproducible
> environment.
>
>  But what about oneshot services ? What about "start" and "stop" scripts ?
> anopa actually runs them as children of "aa-start" and "aa-stop".
> Which *may* be just as problematic. We need a way to run oneshot scripts
> in the same reproducible manner as daemons.
>
>  s6-rc does this. Every s6-rc script invocation will be reproducible.
>  I'll let you guys think a little about how it does it; I'm both very
> proud and very disgusted by the solution. If you manage to guess how
> s6-rc does it, it means that your mind is just as warped as mine; but
> no matter whether you think that's genius or that's horrible, or both,
> it's something anopa does not do. :)
>
>
>  Nice things anopa does
>  ----------------------
>
>  There are a lot of nice things anopa does, and that I may shamelessly
> copy if Olivier accepts: for instance, all the terminal manipulation.
> Progress bars are shiny. :)
>
>  However, I won't add progress bars to s6-rc if it makes it significantly
> more complex (read: if it really needs heap memory). And I don't
> understand the need for the pipe from aa-start to the start script:
> what kind of information does aa-start give its child ? If you remove
> that pipe, you can run the start script with the user's stdin, so you
> don't have to add noecho support to aa-start - asking for passwords can
> be performed entirely by the start script.
>
>  I would also like to hear more about the "wants" dependencies. Is that
> a thing ? What does it mean exactly ? And what is the point of
> oneshots not marked "essential" ? Generally speaking, anopa separates
> service ordering and service dependencies; I would like to hear more
> about the goal of that differentiation. Is s6-rc's hard dependency model
> ("if A depends on B, then B will start first, and A will not start if B
> can't be successfully started") insufficient ? Could I have some real-life
> examples of this ?
>
>  ... and that was more than long enough for a first post on the subject.
> Thanks for having read that far :)
>
> --
>  Laurent
>
Received on Thu Apr 23 2015 - 16:39:37 UTC