Re: process supervisor - considerations for docker

From: Dreamcat4 <dreamcat4_at_gmail.com>
Date: Fri, 27 Feb 2015 12:31:32 +0000

On Fri, Feb 27, 2015 at 10:19 AM, Gorka Lertxundi <glertxundi_at_gmail.com> wrote:
> Dreamcat4, pull requests are always welcome!
>
> 2015-02-27 0:40 GMT+01:00 Laurent Bercot <ska-supervision_at_skarnet.org>:
>
>> On 26/02/2015 21:53, John Regan wrote:
>>
>>> Besides, the whole idea here is to make an image that follows best
>>> practices, and best practices state we should be using a process
>>> supervisor that cleans up orphaned processes and stuff. You should be
>>> encouraging people to run their programs, interactively or not, under
>>> a supervision tree like s6.
>>>
>>
>> The distinction between "process" and "service" is key here, and I
>> agree with John.
>>
>> <long design rant>
>> There's a lot of software out there that seems built on the assumption
>> that
>> a program should do everything within a single executable, and that
>> processes
>> that fail to address certain issues are incomplete and the program needs to
>> be patched.
>>
>> Under Unix, this assumption is incorrect. Unix is mostly defined by its
>> simple and efficient interprocess communication, so a Unix program is best
>> designed as a *set* of processes, with the right communication channels
>> between them, and the right control flow between those processes. Using
>> Unix primitives the right way allows you to accomplish a task with minimal
>> effort by delegating a lot to the operating system.
>>
>> This is how I design and write software: to take advantage of the design
>> of Unix as much as I can, to perform tasks with the lowest possible amount
>> of code.
>> This requires isolating basic building blocks, and providing those
>> building
>> blocks as binaries, with the right interface so users can glue them
>> together on the command line.
>>
>> Take the "syslogd" service. The "rsyslogd" way is to have one executable,
>> rsyslogd, that provides the syslogd functionality. The s6 way is to combine
>> several tools to implement syslogd; the functionality already exists, even
>> if it's not immediately apparent. This command line should do:
>>
>> pipeline s6-ipcserver-socketbinder /dev/log \
>>   s6-envuidgid nobody s6-applyuidgid -Uz \
>>   s6-ipcserverd ucspilogd "" \
>> s6-envuidgid syslog s6-applyuidgid -Uz s6-log /var/log/syslogd
>>
>>
> I love puzzles.
>
>
>> Yes, that's one unique command line. The syslogd implementation will take
>> the form of two long-running processes, one listening on /dev/log (the
>> syslogd socket) as user nobody, and spawning a short-lived ucspilogd
>> process
>> for every connection to syslog; and the other writing the logs to the
>> /var/log/syslogd directory as user syslog and performing automatic
>> rotation.
>> (You can configure how and where things are logged by writing a real s6-log
>> script at the end of the command line.)
>>
>> Of course, in the real world, you wouldn't write that. First, because s6
>> provides some shortcuts for common operations so the real command lines
>> would be a tad shorter, and second, because you'd want the long-running
>> processes to be supervised, so you'd use the supervision infrastructure
>> and write two short run scripts instead.
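>>
>> (As a rough, untested sketch - the service directory names here are
>> illustrative, not prescribed - those two run scripts could look like
>> this. The listener, e.g. /service/syslogd/run:
>>
>> #!/bin/execlineb -P
>> s6-ipcserver-socketbinder /dev/log
>> s6-envuidgid nobody
>> s6-applyuidgid -Uz
>> s6-ipcserverd ucspilogd
>>
>> and the logger, /service/syslogd/log/run, whose stdin the supervision
>> tree connects to the listener's stdout:
>>
>> #!/bin/execlineb -P
>> s6-envuidgid syslog
>> s6-applyuidgid -Uz
>> s6-log /var/log/syslogd)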
>>
>> (And so, to provide syslogd functionality to one client, you'd really have
>> 1 s6-svscan process, 2 s6-supervise processes, 1 s6-ipcserverd process,
>> 1 ucspilogd process and 1 s6-log process. Yes, 6 processes. This is not as
>> insane as it sounds. Processes are not a scarce resource on Unix; the
>> scarce resources are RAM and CPU. The s6 processes have been designed to
>> take *very* little of those, so the total amount of RAM and CPU they all
>> use is still smaller than the amount used by a single rsyslogd process.)
>>
>> There are good reasons to program this way. Mostly, it amounts to writing
>> as little code as possible. If you look at the source code for every single
>> command that appears on the insane command line above, you'll find that
>> it's
>> pretty short, and short means maintainable - which is the most important
>> quality to have in a codebase, especially when there's just one guy
>> maintaining it.
>> Using high-level languages also reduces the source code's size, but it
>> adds the interpreter's or run-time system's overhead, and a forest of
>> dependencies. What is then run on the machine is not lightweight by any
>> measure. (Plus, most of those languages are total crap.)
>>
>> Anyway, my point is that it often takes several processes to provide a
>> service, and that it's a good thing. This practice should be encouraged.
>> So, yes, running a service under a process supervisor is the right design,
>> and I'm happy that John, Gorka, Les and other people have figured it out.
>>
>> s6 itself provides the "process supervision" service not as a single
>> executable, but as a set of tools. s6-svscan doesn't do it all, and it's
>> by design. It's just another basic building block. Sure, it's a bit special
>> because it can run as process 1 and is the root of the supervision tree,
>> but that doesn't mean it's a turnkey program - the key lies in how it's
>> used together with other s6 and Unix tools.
>> That's why starting s6-svscan directly as the entrypoint isn't such a
>> good idea. It's much more flexible to run a script as the entrypoint
>> that performs a few basic initialization steps then execs into s6-svscan.
>> Just like you'd do for a real init. :)
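>>
>> (A minimal sketch of such an entrypoint - the scan directory path and
>> the setup step are placeholders, not fixed conventions:
>>
>> #!/bin/execlineb -P
>> foreground { s6-echo "doing one-time container setup here" }
>> s6-svscan /service)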
>> </long design rant>
>>
>>
>>
>>> Heck, most people don't *care* about this kind of thing because they
>>> don't even know. So if you just make /init the ENTRYPOINT, 99% of
>>> people will probably never even realize what's happening. If they can
>>> run `docker run -ti imagename /bin/sh` and get a working, interactive
>>> shell, and the container exits when they type "exit", then they're
>>> good to go! Most won't even question what the image is up to, they'll
>>> just continue on getting the benefits of s6 without even realizing it.
>>>
>>
>> Ideally, that's what would happen. We must ensure that the abstraction
>> holds steadily, though - there's nothing worse than a leaky abstraction.
>>
>>
>>> The main thing I'm concerned about is preserving proper shell
>>> quoting, because sometimes args can be like --flag='some thing'.
>>
>> This is a solved problem.
>> The entrypoint we're talking about is trivial to write in execline,
>> and I'll support Gorka, or anyone else, who does that. Since the
>> container will already have execline, using it for the entrypoint
>> costs nothing, and it makes command line handling and transmission
>> utterly trivial: it's exactly what I wrote it for.
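>>
>> (A sketch of why: a script beginning with #!/bin/execlineb -S0 has
>> the command line substituted in as whole argv words, so
>>
>> #!/bin/execlineb -S0
>> $@
>>
>> execs into the command with an argument like --flag='some thing'
>> still being a single word - no re-quoting, no word-splitting.)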
>>
>>
> I'll really appreciate your help! Using execline as the default scripting
> language will facilitate any conversion to other bases like BusyBox.

Anyway, let's not worry *too much* about what is essentially a part of
the documentation, for a feature which we have not implemented yet!
Whoever ends up documenting it and blogging about it will ultimately
decide how such features are finally explained to a general public
audience. There are other kinds of work to be done before that time
comes.

Thanks to Gorka's image we have a central place for development. May I
please suggest how we can proceed to implement the work discussed, and
split it up into smaller pieces. Possible tasks:

* Convert Gorka's current "/init" script from bash --> execline
* Testing on Gorka's ubuntu base image

* Add support for spawning the argv[], for Docker's CMD and ENTRYPOINT
  (see the sketch after this list)
* Testing on Gorka's ubuntu base image
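
As a strawman for those first two tasks (untested, and every path here
is hypothetical), an execline "/init" might look roughly like:

#!/bin/execlineb -S0
# Docker runs this as the ENTRYPOINT and appends the CMD as our argv.
# Start the supervision tree in the background...
background { s6-svscan /etc/s6/service }
# ...then exec into the user's command, so an interactive
# `docker run -ti image /bin/sh` behaves as John described.
# (A real script would also need to handle an empty CMD.)
$@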

* Create a new base image for Alpine (thanks for mentioning it earlier, John)
* Possibly create other base image(s), e.g. debian, arch, BusyBox and so on.
* Test them, refine them.

* Document the new set of s6 base images
* Blog about them
* Inform the Phusion people we have created a successor
* Inform 1(+) member(s) of Docker Inc to make them aware of the new s6 images.

To be clear - there is a lot of work described ^^ there. I have no idea,
and I don't mind, who of us does which ones.

* The steps can be done in whatever order (apart from the first 3 steps)...
* It seems like a lot of work for any 1 individual to do everything.
* Hopefully as a group effort we won't end up duplicating pieces of work.

I am happy to help (if I can), but at the same time I don't want to
dictate to others, or take control over it. I have no previous
experience with execline, Alpine, or s6 yet, but I am happy to learn
and start using them.

Of course the proper channel ATM is to open issues and PRs on Gorka's
current s6-ubuntu base image, so we can open those fairly soon and
move the discussion there.


Another thing we don't need to worry about right now (but may become a
consideration later on):

* Once there are 2+ similar s6 images:
  * May be worth consulting Docker Inc employees about official / base
    image builds on the hub.
  * May be worth consulting the writers of the base images we are
    using, e.g. 'ubuntu' etc.
* Is it possible to convert to a github organisation, to house multiple
  images?
  * Can be helpful for growing support, and for others who come on
    board later on to add new distros.
* May be worth ensuring uniform behaviour of the common s6 components
  across the different distros' s6 base images.
  * e.g. a central place of structured and consistent documentation
    that covers all the similar s6 base images together.

Again, I'm not mandating that we need to do any of those things at all,
as it should not be my decision whatsoever. But it's a good idea to
keep those possibilities in mind when doing the near-term work. "Try to
keep it general", basically.

For example:

I see a lot of good ideas in Gorka's base image about fixing APT. It
may be that some of those great ideas can be fed back upstream to the
official ubuntu base image itself. Then (if they are receptive
upstream) those fixes can later be removed from Gorka's s6-specific
ubuntu base image (being a child of that). Which generally improves the
level of standardization, granularity (when people decide whether to
choose s6 or not), etc.


>
>
>> --
>> Laurent
>>
>>