Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?
Dominic Hargreaves
dom at earth.li
Mon Jun 1 15:14:32 UTC 2015
On Sun, May 24, 2015 at 07:38:19PM +0300, Apollon Oikonomopoulos wrote:
> On 16:38 Sun 24 May , Ben Hutchings wrote:
> > On Sun, 2015-05-24 at 14:09 +0300, Niko Tyni wrote:
> > > On Sun, May 24, 2015 at 02:55:00PM +0800, Paul Wise wrote:
> > > > On Sat, 2015-05-23 at 19:10 +0200, Dominic Hargreaves wrote:
> > > >
> > > > > This is rather strange; any ideas from DSA?
> > > >
> > > > The underlying hosts do not have the same issue.
> > > >
> > > > All of the guests use the same virtual CPU version/flags.
> > > >
> > > > All of the guests use the same Linux kernel version.
> > >
> > > Thanks for the update.
> > >
> > > > I guess diving into the Linux implementation of times(2) for clues would
> > > > be the next step for figuring out what the issue is here.
> > >
> > > I'm taking the kernel maintainers in the loop. The status here is that
> > > times(2) seems to be misbehaving on some i386 and amd64 debian.org virtual
> > > hosts running jessie (under ganeti/qemu, with jessie on the underlying
> > > hosts too). These hosts include at least barriere and x86-grnet-01.
> > >
> > > The misbehaviour is that user time stays at zero all the time, as seen
> > > for example with 'time yes'. This is making perl fail to build from
> > > source due to test failures, and I'd expect it to affect other things too.
> > >
> > > Any help is appreciated.
> >
> > I can't reproduce this, but wonder if it's related to #784960?
>
> There seems to be something fundamentally broken in
> barriere.debian.org's CPU time accounting, not related to times(2) per
> se. Just issuing
>
> yes >/dev/null
>
> and firing up top -d1 gives the following interesting results:
>
> - `yes' shows up taking 100% CPU time as expected, but
> - pressing `1' shows that all CPUs are idle (!)
>
> htop OTOH displays all CPUs as constantly 100% busy, which is
> inconsistent with the system's load average (~0.8 at that point).
>
> Also watching the output of `cat /proc/$(pidof yes)/stat | awk '{ print
> $14, $15 }'' ($14 is utime, $15 is stime per proc(5)) indeed shows 100%
> system time and 0 user time.
>
> If you look at the `top' stats for all CPUs of barriere.debian.org, it
> looks as if the only thing that's correctly being accounted for is
> iowait time.
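
For anyone repeating the /proc check quoted above on another host, here is a minimal Python sketch of the same idea. The function names are my own and purely illustrative; the field positions (14 for utime, 15 for stime) are the ones from proc(5) mentioned above. Unlike the awk one-liner, it splits after the closing parenthesis of the comm field, so a process name containing spaces does not shift the columns.

```python
def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may contain spaces,
    # so only split the part that follows the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11] and rest[12].
    return int(rest[11]), int(rest[12])


def read_cpu_ticks(pid):
    """Read utime and stime for a running process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_cpu_ticks(f.read())
```

On an affected host I would expect utime to stay at 0 for a busy `yes' process while stime keeps growing, matching the description above.
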

It looks like the same thing has happened again on x86-grnet-01, meaning
we have issues[1] on

  x86-grnet-01
  brahms
  binet

but not

  babin
  x86-csail-01

Buildd admins: could the amd64 build of perl 5.22.0~rc2-2 please be
given back, to see if it lands on a working host?

DSA: can you identify any differences between the working hosts and the
others that would help identify the root of this problem, assuming that
they all exhibit the same easy-to-reproduce behaviour seen above?

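In case it helps with comparing hosts, here is a rough Python sketch of a self-contained check. It spins in user space for a short while and reports how much user and system CPU time the kernel accounted to the process via os.times(), which wraps times(2). The function name and the one-second default are arbitrary choices of mine, not something taken from the perl test suite.

```python
import os
import time


def cpu_time_deltas(burn_seconds=1.0):
    """Busy-loop for roughly burn_seconds of wall time.

    Returns (user_delta, system_delta) in seconds, as accounted by times(2).
    """
    before = os.times()
    deadline = time.monotonic() + burn_seconds
    counter = 0
    while time.monotonic() < deadline:
        counter += 1  # pure user-space work
    after = os.times()
    return after.user - before.user, after.system - before.system


if __name__ == "__main__":
    user, system = cpu_time_deltas()
    print(f"user={user:.2f}s system={system:.2f}s")
    if user == 0.0:
        print("user time stayed at zero; CPU accounting looks broken here")
```

On a healthy host most of the time should show up as user time. On the affected hosts I would expect user to stay at zero, as with 'time yes'.
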
Thanks!
Dominic.
[1] <https://buildd.debian.org/status/logs.php?pkg=perl&arch=amd64>