Re: A program that can get exactly the log of a supervised process?

From: Laurent Bercot <ska-supervision_at_skarnet.org>
Date: Sun, 24 Oct 2021 06:20:53 +0000

>Any idea on how the log "teeing" may be done cleanly (and portably
>if possible; something akin to `tail -f' seems unsuitable because of
>potential log rotation), and perhaps any flaw or redundancy in the
>design above?

  The obstacle I have always bumped against when trying to do similar
things is that the teeing program always has to remain there, even
after it has done its job (it has read the readiness line and informed
the supervisor). And so, instead of "service | logger", your data
flow permanently gains a stage: it is "service | loggrep | logger"
before readiness, and "service | cat | logger" after readiness (the
best loggrep can do, once it has read its readiness line, is exec
into cat or reimplement cat's functionality).
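
  For illustration, here is a minimal sketch of what such a loggrep
could look like. It is not a real program: it assumes the readiness
line can be recognized by a fixed substring passed as argv[1], and it
assumes the s6 notification-fd convention (a newline written to fd 3),
which only works if loggrep inherits that fd, e.g. because it is
spawned from the service's own run script:

    /* loggrep.c - sketch only, not a real skarnet.org program.
     * Copies stdin to stdout; when a line containing argv[1] goes by,
     * signals readiness by writing a newline to fd 3 (the assumed
     * notification fd), then execs into cat so the data keeps
     * flowing untouched. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main (int argc, char **argv)
    {
      char line[4096];
      if (argc < 2) return 100;
      while (fgets(line, sizeof line, stdin))
      {
        fputs(line, stdout);             /* forward the line to the logger */
        fflush(stdout);
        if (strstr(line, argv[1]))       /* readiness line spotted */
        {
          write(3, "\n", 1);             /* tell the supervisor we're ready */
          close(3);
          execlp("cat", "cat", (char *)0);  /* become "service | cat | logger" */
          return 111;                    /* exec failed */
        }
      }
      return 0;                          /* producer closed the pipe: EOF */
    }

  Before readiness the data flow is "service | loggrep | logger"; after
the exec it is "service | cat | logger", i.e. the permanent extra stage
described above.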

  That wouldn't be a huge performance problem, especially if "cat" can
do zero-copy transfers, but it is definitely a reliability problem:
  - loggrep must die when the service dies, so that a new loggrep can
be run when the service starts again. This means loggrep cannot be run
as a separate supervised service in the same service pipeline. (If
loggrep were to restart independently of the service, it would need to
check whether the service is ready, and run as 'cat' if it is. This is
doable, but more complex.)
  - That means that either the pipe between service and loggrep cannot
be held, or loggrep must receive an additional notification when the
service dies. This is, again, doable, but more complex.
  - If loggrep isn't supervised, and the pipe isn't being held, then
killing loggrep will cause a broken pipe, which means a service restart
with a lost line of logs, exactly what supervision aims to avoid.

  So basically, either loggrep is a simple tee-like program, in which
case you weaken the supervision properties of the service, or the
functionality needs to be embedded in the supervision architecture:
loggrep becomes a consumer for service and a producer for logger (which
is easy with s6-rc but not so much with pure s6), and it must always
watch the state of service (which is not so easy with s6-rc, where you
don't know the full path to another service's directory).
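
  For the s6-rc data flow half, the source definitions would look
roughly like this (service names invented for illustration; the
producer-for / consumer-for / pipeline-name files are the regular
s6-rc pipeline mechanism). Note that this only covers the easy half:
it says nothing about loggrep watching the state of service, which is
the hard part:

    service/
      type            (contains "longrun")
      run             (starts the daemon)
      producer-for    (contains "service-loggrep")
    service-loggrep/
      type            (contains "longrun")
      run             (starts loggrep)
      consumer-for    (contains "service")
      producer-for    (contains "service-log")
    service-log/
      type            (contains "longrun")
      run             (starts s6-log)
      consumer-for    (contains "service-loggrep")
      pipeline-name   (contains "service-pipeline")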

  In short: practical issues. It's impossible to do that in a clean,
satisfying way.

  And it entirely makes sense that it's so difficult, because the very
idea is to use the *data flow* to inform the *control flow*, and that
is inherently dangerous and not planned for in supervision
architectures.
Making your control flow depend on your data flow is not a good pattern
at all, and I really wish daemons would stop doing that.

  I'm sorry I don't have a better answer.

--
  Laurent