Re: patch: sv check should wait when svrun is not ready

From: Avery Payne <avery.p.payne_at_gmail.com>
Date: Tue, 16 Jun 2015 10:42:56 -0700

I'm not the maintainer of any C code, anywhere. While I do host a
mirror or two on Bitbucket, I only do humble scripts, sorry. Gerrit is
around; he's just a bit elusive.

On 6/16/2015 9:37 AM, Buck Evan wrote:
> I'd still like to get this merged.
>
> Avery: are you the current maintainer?
> I haven't seen Gerrit Pape on the list.
>
> On Tue, Feb 17, 2015 at 4:49 PM, Buck Evan <buck_at_yelp.com> wrote:
>
> On Tue, Feb 17, 2015 at 4:20 PM, Avery Payne
> <avery.p.payne_at_gmail.com> wrote:
> >
> > On 2/17/2015 11:02 AM, Buck Evan wrote:
> >>
> >> I think there are only three cases here:
> >>
> >> 1. Users that would have gotten immediate failure, and no amount of
> >> spinning would help. These users will see their error delayed by
> >> $SVWAIT seconds, but no other difference.
> >> 2. Users that would have gotten immediate failure, but could have
> >> gotten a success within $SVWAIT seconds. All of these users will of
> >> course be glad of the change.
> >> 3. Users that would not have gotten immediate failure. None of these
> >> users will see the slightest change in behavior.
> >>
> >> Do you have a particular scenario in mind when you mention "breaking
> >> lots of existing installations elsewhere due to a default behavior
> >> change"? I don't see that there is any case this change would break.
> <snip>
>
> Thanks for the thoughtful reply, Avery. My background is also
> "maintaining business software", although putting it in those terms
> gives me horrific visions of Java servlets and SOAP protocols.
>
> > I have to look at it from a viewpoint of "what is everything else in
> > the system expecting when this code is called". This means thinking
> > in terms of code-as-API, so that calls elsewhere don't break.
>
> As a matter of API, sv-check does sometimes take up to $SVWAIT
> seconds to fail.
> Any caller to sv-check will be expecting this (strictly limited)
> delay, in the exceptional case.
> My patch just extends this existing, documented behavior to the
> special case of "unable to open supervise/ok".
> The API is unchanged, just the amount of time to return the result
> is changed.
>
> > This happens because the use of "sv check (child)" follows the
> > convention of "check, and either succeed fast or fail fast", ...
>
> Either you're confused about what sv-check does, or I'm confused about
> what you're saying.
> sv-check generally doesn't fail fast (except in the special case I'm
> trying to make no longer fail fast -- svrun is not started).
> Generally it will spin for $SVWAIT seconds before failing.
>
> > Without that fast-fail, the logged hint never occurs; the sysadmin
> > now has to figure out which of three possible services in a
> > dependency chain are causing the hang.
>
> Even if I put the above issue aside, you wouldn't get a hang,
> you'd get the failure message you're familiar with, just several
> seconds (default: 7) later. The sysadmin wouldn't search any more than
> previously. He would however find that the system fails less often,
> since it has that 7 seconds of tolerance now. This is how sv-check
> behaves already when a ./check script exits nonzero.
>
>
> > While this is implemented differently from other installations,
> > there are known cases similar to what I am doing, where people have
> > ./run scripts like this:
> >
> > #!/bin/sh
> > sv check child-service || exit 1
> > exec parent-service
>
> This would still work just fine, just strictly more often.
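
The bounded wait discussed above (keep retrying for up to $SVWAIT
seconds, then report failure) can be sketched in plain shell. This is
only an illustration, not how sv is implemented: retry_until is a
hypothetical helper, and the one-second polling interval is an
assumption.

```shell
#!/bin/sh
# Hypothetical helper, not part of runit: run a command repeatedly until
# it succeeds or the timeout (in whole seconds) has elapsed.
# Usage: retry_until SECONDS COMMAND [ARGS...]
retry_until() {
    timeout=$1
    shift
    elapsed=0
    while :; do
        # Success at any point ends the wait immediately.
        "$@" && return 0
        # Out of time: report failure, just later than a fail-fast check.
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1    # assumed polling interval
        elapsed=$((elapsed + 1))
    done
}
```

Called as "retry_until 7 test -e supervise/ok" (with test -e standing in
for the real attempt to open the file), it returns 0 as soon as the file
appears and 1 after roughly seven seconds otherwise. A caller that only
looks at the exit status sees the same result either way; only the time
to failure changes.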
>
>
Received on Tue Jun 16 2015 - 17:42:56 UTC
