LOL, well, I am running a drill to see how resilient runit can be, and this
is one of its minor downfalls.
And yes, I did run "killall -9 runsv" to simulate it being killed for
whatever reason.
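For anyone repeating the drill, here is a rough sketch of how the leftover
daemon can be spotted afterwards. It assumes runsv's usual layout, where
<service dir>/supervise/pid holds the PID of the supervised process, and a
ps that accepts the POSIX "-o ppid=" / "-o comm=" options. check_orphan is a
hypothetical helper name, not part of runit.

```shell
#!/bin/sh
# check_orphan (hypothetical helper): report whether the process recorded in
# a runit service directory is still a child of runsv.
# Assumption: "$1/supervise/pid" is the pid file written by runsv.
# Usage (the path is only an example): check_orphan /etc/service/memcached
check_orphan() {
    pidfile="$1/supervise/pid"

    # A missing or empty pid file means there is no recorded service process.
    if [ ! -s "$pidfile" ]; then
        echo "no-pid"
        return 0
    fi
    pid=$(cat "$pidfile")

    # Look up the parent PID; an empty answer means the process is gone.
    ppid=$(ps -o ppid= -p "$pid" 2>/dev/null | tr -d ' ')
    if [ -z "$ppid" ]; then
        echo "dead"
        return 0
    fi

    # A supervised service is a direct child of runsv. After runsv is
    # killed, the service is re-parented (typically to PID 1).
    parent=$(ps -o comm= -p "$ppid" 2>/dev/null)
    case "$parent" in
        runsv|*/runsv) echo "supervised" ;;
        *)             echo "orphaned" ;;
    esac
}
```

In my drill the memcached process would show up as "orphaned" after the
killall. Note the pid file can be stale once runsv is gone, so treat the
result as a hint rather than proof.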
On Thu, Jun 23, 2016 at 9:37 AM, Colin Booth <cathexis_at_gmail.com> wrote:
> On Jun 22, 2016 5:39 PM, "Thomas Lau" <tlau_at_tetrioncapital.com> wrote:
> >
> > here is the run script:
> >
> > #!/bin/sh
> > exec 2>&1
> > echo "*** Starting service ..."
> > RUNASUSER="tlau"
> > RUNASUID=$(getent passwd $RUNASUSER | cut -d: -f3)
> > RUNASGROUPS=$(id -G $RUNASUSER | tr ' ' ':')
> > exec chpst -u :$RUNASUID:$RUNASGROUPS /usr/bin/memcached -vvv -m 64
> >
> Your runscript seems ok, I'd probably do the user lookup differently but
> that's mostly a style thing.
> >
> > I just tested -P; it doesn't help. I could kill the runsv process and the
> > memcached daemon was still running.
> >
> > I know OOM might not kill it; I'm just trying to simulate what could
> > happen. Who knows, I might be working on a system at 3 AM and
> > accidentally kill runsv. :) I want to find out how fault tolerant runit is.
>
> Oh, it sounded like you were having real issues. Runsv (and the rest of the
> runit suite) should stay functional well after the rest of the system has
> been run into the ground. As long as you avoid issuing a "killall -9 runsv"
> or something equally catastrophic you should be ok. Generally speaking, as
> long as you use the sv command instead of direct signaling, you're good to
> go. In my history of using supervision-heavy systems and getting paged
> awake at 3 AM, screwing up the supervisor is low on my list of mistakes.
>
> Cheers!
> -Colin
>
--
Thomas Lau
Director of Infrastructure
Tetrion Capital Limited
Direct: +852-3976-8903
Mobile: +852-9323-9670
Address: Suite 2716, Two IFC, Central, Hong Kong
Received on Thu Jun 23 2016 - 01:46:26 UTC