[Pkg-sysvinit-devel] NTP bites dovecot
Henrique de Moraes Holschuh
hmh at debian.org
Fri Jun 19 02:14:13 UTC 2009
On Thu, 18 Jun 2009, martin f krafft wrote:
> also sprach Henrique de Moraes Holschuh <hmh at debian.org> [2009.06.18.0528 +0200]:
> > If it is userspace that is doing it, it can be fixed by applying the
> > proper adjustment to the RTC, hwclock can do it through /etc/adjtime.
> > See also package adjtimex.
> You mean those approaches would not cause dovecot to die?
The kernel could do it properly, because it can set the clock and fix it
(apply the drift, and refuse to bring system time backwards) while userspace
is still frozen. Userspace might not like the system time jumping forward,
but usually if you have such an application that cannot be suspended because
it can't deal with large time jumps, you already know it (e.g.: timidity).
I say the kernel "could" do it properly, because I am not sure it is really
applying the drift on resume in the first place. Nor have I checked if it
refuses to move the time backwards when restoring system time from the RTC
on the resume path (that is obviously a very broken RTC that counts
backwards, and not your problem, which seems to be an RTC that is too fast).
Now, if userspace is the one setting the time on resume, you will have to
adjust the clock with stuff running, and that's bad. You *cannot* jump
backwards, so you will need slewing in that case, which makes each clock
second shorter (or longer) than a real second for a long time, and that can certainly
cause problems for certain classes of applications.
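For a rough feel of how long slewing takes, here is a minimal Python sketch.
It assumes a constant slew rate of 500 ppm, which as far as I know is the
usual upper limit for ntpd; the function name and the default value are mine,
for illustration only, and a real daemon will not hold a constant rate.

```python
def slew_duration_seconds(offset_s: float, slew_ppm: float = 500.0) -> float:
    """Wall-clock seconds needed to absorb offset_s at a constant slew rate.

    slew_ppm is an assumed constant rate in parts per million.
    """
    if slew_ppm <= 0:
        raise ValueError("slew rate must be positive")
    # At N ppm the clock gains or loses N microseconds per real second.
    return abs(offset_s) / (slew_ppm * 1e-6)


if __name__ == "__main__":
    seconds = slew_duration_seconds(600.0)
    print(f"{seconds / 86400:.1f} days to slew away 600 s at 500 ppm")
```

Under those assumptions a 600s error needs about two weeks of uninterrupted
slewing, and longer if the daemon slews more gently than the maximum.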
IMHO, the best we can do _by default_ in that case is to simply refuse to
step the time backwards (and log warnings, of course), and make sure to not
let chrony and ntp do time stepping outside of system boot. ntp will slew
by default if it can't step, but by such a small amount that it takes
*weeks* to correct a 600s error. IMHO that's unlikely to hose most
applications, so I consider that an acceptable solution. I don't know about
BTW, when one needs to tell the kernel about the RTC drift, one needs an
external reference to measure the drift. Chrony and ntp can do it
automatically using the network clocks as reference, but I don't know how
well it is working. adjtimex had docs on how to do it by hand and write it
to /etc/adjtime, and the initscript that called hwclock would use that
automatically, I think (but I last looked at it more than 5 years ago...).
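Measuring the drift by hand boils down to comparing the RTC against the
reference twice, some days apart. A minimal sketch in Python follows; the
function names are hypothetical, and the sign convention (positive means the
RTC runs fast) is my own choice here. hwclock keeps its drift factor in
/etc/adjtime in seconds per day, but check hwclock(8) for the sign and the
exact file layout before writing anything there by hand.

```python
SECONDS_PER_DAY = 86400.0


def drift_seconds_per_day(error_start_s: float, error_end_s: float,
                          elapsed_s: float) -> float:
    """RTC drift in seconds per day from two comparisons with a reference.

    error_*_s is (RTC time - reference time) at each measurement and
    elapsed_s is the reference time between the two measurements.
    Positive result: the RTC gains time (runs fast).
    """
    if elapsed_s <= 0:
        raise ValueError("measurements must be taken in chronological order")
    return (error_end_s - error_start_s) / (elapsed_s / SECONDS_PER_DAY)


def drift_ppm(error_start_s: float, error_end_s: float,
              elapsed_s: float) -> float:
    """The same drift expressed in parts per million."""
    return drift_seconds_per_day(error_start_s, error_end_s, elapsed_s) \
        / SECONDS_PER_DAY * 1e6


if __name__ == "__main__":
    # Made-up example: an RTC that gained 600 s over 30 days.
    print(drift_seconds_per_day(0.0, 600.0, 30 * SECONDS_PER_DAY))
    print(round(drift_ppm(0.0, 600.0, 30 * SECONDS_PER_DAY), 1))
```

The longer the interval between the two measurements, the less the
measurement error matters, so a few days is a sensible minimum.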
> > As for running ntpdate/ntp -g (and whatever makes chrony step the
> > clock backwards) outside of early boot is well into Don't Do That
> > territory...
> Debian and Ubuntu both do it, on resume as well as on if-up. I am
> not sure this is still the case, but it certainly used to be.
That doesn't surprise me at all. Note that it is only a problem if you tell
ntp/chrony that it is OK to step the clock backwards on ifup scripts.
> > > And yes, it sucks to be stuck with sub-par hardware.
> > I can imagine. But this really _is_ something we should be able
> > to work around properly without moving system time backwards after
> > early userspace. That's why ntp has that driftfile thing, and why
> > hwclock has /etc/adjtime, and why adjtimex exists.
> So the suggestion is to slow down/speed up the clock gracefully
> until the time is right, rather than jumping it?
If that won't hose your applications, yes. There is a class of applications
that cannot tolerate a second that consistently is not a second over large
periods of time... but then, you usually have proper ntp setups and
permanent network connection if you run such stuff.
It is certainly a less troublesome default than jumping the clock around.
Note that I have nothing against asking the user what he wants to do. If he
wants jumps, it is his problem if dovecot abends, etc.
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot