Re: Rare runsv logging problem
On Fri, Jul 25, 2014 at 10:32 PM, James Powell <james4591_at_hotmail.com> wrote:
> Another thing could be that the service may not need a log. I've directed a lot of unwanted output to /dev/null.
My service needs a log. I need to know the things these programs send
to stderr/stdout.
> Can you post one of your run files as an example?
Sure.
run:
#!/bin/sh
mkdir -p /mnt/log/baz
chown -R user1 /mnt/log/baz
cd /opt/baz/current
exec chpst -u user1 ./run_baz 2>&1
log/run:
#!/bin/sh
# The main run script takes care of ensuring the log dir exists.
exec svlogd -ttt /mnt/log/baz/
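
For spotting the bad state before output starts leaking to runsvdir, a check along these lines might help. It is only a sketch: it assumes the "unable to open supervise/ok" warning quoted below refers to log/supervise/ok, the FIFO that runsv normally creates for the log service, and the function name check_log_supervise is made up for illustration.

```shell
#!/bin/sh
# check_log_supervise DIR
# Hypothetical helper (not part of runit): reports whether the log service
# under the runit service directory DIR still has its supervise/ok FIFO.
check_log_supervise() {
    svc=$1
    # A service directory without ./log has no log service to check.
    if [ ! -d "$svc/log" ]; then
        echo "SKIP: $svc"
        return 0
    fi
    # Assumption: in the bad state this FIFO is missing, which would match
    # the "unable to open supervise/ok" warning printed by sv.
    if [ ! -p "$svc/log/supervise/ok" ]; then
        echo "BAD: $svc"
        return 1
    fi
    echo "OK: $svc"
}
```

It could be run over every service from cron or a monitoring check, for example: for d in /etc/services/*; do check_log_supervise "$d"; done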
>
> Sent from my Windows Phone
> ________________________________
> From: James Powell<mailto:james4591_at_hotmail.com>
> Sent: 7/25/2014 9:35 PM
> To: Caleb Spare<mailto:cespare_at_gmail.com>; supervision_at_list.skarnet.org<mailto:supervision_at_list.skarnet.org>
> Subject: RE: Rare runsv logging problem
>
> My question is why are you running Upstart? Runit has its own init, so Upstart is pointless. Runit's binary should maintain runsv. The problem could also come from improper handling in the run script.
>
> Sent from my Windows Phone
> ________________________________
> From: Caleb Spare<mailto:cespare_at_gmail.com>
> Sent: 7/25/2014 5:16 PM
> To: supervision_at_list.skarnet.org<mailto:supervision_at_list.skarnet.org>
> Subject: Rare runsv logging problem
>
> Hi,
>
> I've been using runit for a while now and it has been mostly
> wonderful. I'm noticing a persistent issue and I'm not sure how to
> debug it.
>
> On the servers we're running Ubuntu and we use runit 2.1.1 via the
> default package that comes with the distro. Upstart runs runsvdir and
> we use runit to manage all of our application processes. Each
> application has a simple ./run and ./log/run; the latter execs svlogd
> (this is all a typical configuration, as I understand it).
>
> The problem I'm seeing is that, very occasionally, runsv will get into
> a bad state where svlogd is not running. (I'm not sure if it fails to
> start svlogd or if this happens later on after it has been running
> properly.) When the problem occurs, pstree shows something like this:
>
> runsvdir-+-runsv-+-foo---5*[{foo}]
>          |       `-svlogd
>          |-runsv-+-bar---21*[{bar}]
>          |       `-svlogd
>          `-runsv---baz---250*[{baz}]
>
> Here you can see that the baz process does not have an associated
> svlogd process. Further:
>
> $ sudo sv s foo
> run: foo: (pid 4885) 526260s; run: log: (pid 875) 526517s
> $ sudo sv s baz
> run: baz: (pid 2337) 2983swarning: baz: unable to open supervise/ok:
> file does not exist
> ; run: log: (pid 2337) 2983s
>
> Two strange things there: the warning about supervise/ok and also that
> the pid for 'log' is the same as for 'baz'.
>
> When runsv is in this bad state, the output from baz goes right to
> runsvdir and ends up in /var/log/upstart/runsvdir.log.
>
> The fix I've been using is to 'sv d baz' and then kill the offending
> runsv process. Runsvdir will quickly restart it and then everything
> will be working:
>
> runsvdir-+-runsv-+-foo---5*[{foo}]
>          |       `-svlogd
>          |-runsv-+-baz---25*[{baz}]
>          |       `-svlogd
>          `-runsv-+-bar---20*[{bar}]
>                  `-svlogd
>
> I'm unsure what causes this rare problem. We only do simple things
> with runit: sv {t,d,u}. When we deploy services, we rsync a
> directory from elsewhere on the box into /etc/services/<name> and then
> 'sv t <name>'. That source dir only has ./run, ./finish, and
> ./log/run.
>
> Any ideas of what we might be doing wrong, or how to otherwise avoid
> this issue? Or if not, what I could do to further debug?
>
> Sorry for the long email; I wanted to be thorough in my description
> and avoid making assumptions about what could be causing this problem.
>
> Thanks,
> Caleb Spare
Received on Sat Jul 26 2014 - 06:16:50 UTC