[pkg-gnupg-maint] Bug#972525: sbuild randomly fails to sign changes file despite valid signature keys

Wookey wookey at wookware.org
Mon Feb 5 15:53:06 GMT 2024


I've been seeing this regularly, and getting hundreds of 'dupload
failed' emails as a result (they get sent every 5 mins now after it
goes wrong).  I've not been keeping records (because I just bin
those hundreds of emails), but it happens most weeks, and I've had two
this week (opencv and gnuradio).

I'll start collecting info here to see if we can narrow this down a bit.

So, latest is
opencv_4.6.0+dfsg-13.1~exp1_armel built for experimental on arm-ubc-06
The changes file is dated Feb  4 03:29.
Looking in the build logs I see it was built and uploaded successfully 3hrs later on
arm-arm-03:
https://buildd.debian.org/status/architecture.php?a=armel&suite=experimental&buildd=buildd_armel-arm-ubc-06
https://buildd.debian.org/status/architecture.php?a=armel&suite=experimental&buildd=buildd_armel-arm-arm-03

The second build started 1hr after the .changes file for the first one was made, so I guess there is a timeout of 1hr after the log arrives, and if there is no upload by then the buildd assumes failure and schedules another build?

I have noticed before that usually by the time I look at the failed
upload there is already a new build uploaded. It would be nice if the
buildds tidied up after themselves once the build is in the archive
and stopped sending tiresome email awaiting a manual clear-up. Once a
new build has been issued the old failed upload should be removed. I'm
not quite sure exactly what that check should look like. Alternatively
we could stop sending very frequent mail to buildd admins, and let the
'are files a week old' script tidy them up in due course.
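
One possible shape for that check, as a minimal sketch only: parse the leftover .changes file name into package, version and architecture, and treat the leftover as removable if that exact triple is already known to be in the archive. The function names and the `installed` set are hypothetical; how a buildd would actually learn what has been installed is not shown here.

```python
from pathlib import PurePosixPath


def parse_changes_name(path):
    """Split 'pkg_version_arch[-buildd].changes' into (pkg, version, arch)."""
    name = PurePosixPath(path).name
    if not name.endswith(".changes"):
        raise ValueError(f"not a .changes file: {path}")
    stem = name[: -len(".changes")]
    # The buildd file names in the log carry a '-buildd' suffix after the arch.
    if stem.endswith("-buildd"):
        stem = stem[: -len("-buildd")]
    parts = stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected file name: {name}")
    return tuple(parts)


def superseded(leftovers, installed):
    """Return the leftover paths whose (pkg, version, arch) is in `installed`.

    `installed` is a hypothetical set of triples for builds already in the
    archive; populating it is outside this sketch.
    """
    return [path for path in leftovers if parse_changes_name(path) in installed]
```

Anything returned by `superseded()` would be a candidate for automatic removal (and for no further mail); anything else would still need a human to look at it.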

The actual error on the failed log is:
Finished at 2024-02-04T03:29:15Z
Signature with key '764BC9A1354021955868EF5CC98724D9AA73AAA3' requested:
 signfile buildinfo /home/buildd/build/opencv_4.6.0+dfsg-13.1~exp1_armel-buildd.buildinfo 764BC9A1354021955868EF5CC98724D9AA73AAA3
gpg: error running '/usr/bin/gpg-agent': exit status 2
gpg: failed to start agent '/usr/bin/gpg-agent': General error
gpg: can't connect to the agent: General error
gpg: keydb_search failed: No agent running
gpg: skipped "764BC9A1354021955868EF5CC98724D9AA73AAA3": No agent running
gpg: /tmp/debsign.e1vK8yhj/opencv_4.6.0+dfsg-13.1~exp1_armel-buildd.buildinfo: clear-sign failed: No agent running
debsign: gpg error occurred!  Aborting....
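
If the agent failure is transient (an assumption, not something the log proves), one way to make this less annoying without fixing it would be to retry the signing step a few times before giving up. A minimal sketch of such a wrapper follows; `run_with_retry` and its parameters are made up for illustration and are not part of sbuild or debsign.

```python
import subprocess
import sys
import time


def run_with_retry(cmd, attempts=3, delay=5.0, runner=subprocess.run):
    """Run cmd, retrying while it exits with a non-zero status.

    Returns the number of the attempt that succeeded, or 0 if all failed.
    `runner` is injectable so the logic can be exercised without gpg.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        print(
            f"attempt {attempt}/{attempts} failed with status {result.returncode}",
            file=sys.stderr,
        )
        if attempt < attempts:
            # Give a busy machine a moment before trying the agent again.
            time.sleep(delay)
    return 0
```

The command passed in would be whatever signing invocation the buildd already uses; a result of 0 means the upload still needs manual attention, and a result greater than 1 would be worth logging as evidence that the failure really is transient.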

Looking in var/log/messages at that time (on arm-ubc-06),
we see that some script which logs there starts 1 second after sbuild returns, at 03:29:16,
and sends no more messages after 03:30:38. So it takes 1m22s (82s) to run.
Does that indicate the suspected high load which might be making gpg fail?
I don't think we log load per se, do we? It has a lot of debs to check so I think it just takes a while.
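
If load is not being logged, a small sampler could record it alongside the existing messages, so the next failure can be lined up against the load at that moment. A rough sketch, assuming something (cron, or a hook around the signing step) calls it and appends the line to a log; the function name and output format are made up.

```python
import os
import time


def format_load_sample(now=None, load=None):
    """Return one log line with a UTC timestamp and the load averages.

    `now` (epoch seconds) and `load` (a 3-tuple) default to the current
    time and os.getloadavg(); they are parameters only to ease testing.
    """
    if now is None:
        now = time.time()
    if load is None:
        load = os.getloadavg()  # 1, 5 and 15 minute load averages
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    return (
        f"{stamp} load1={load[0]:.2f} "
        f"load5={load[1]:.2f} load15={load[2]:.2f}"
    )
```

The timestamp uses the same format as the "Finished at" line in the sbuild log, which should make the two easy to correlate.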

times for running that script in this log for various packages:
 16s singular_4.3.2-p10+ds-1.1~exp1_armel
 15s redland_1.0.17-3.1~exp1_armel
  9s shapetools_1.4pl6-16.1~exp1_armel
  9s solvespace_3.1+ds1-3.1~exp1_armel
  8s secsipidx_1.3.2-2.1~exp1_armel
  8s scamper_20211212-1.2~exp1_armel
  6s rttr_0.9.6+dfsg1-6.1~exp1_armel
  8s mfem_4.5.2+ds-1.5~exp1_armel
 82s opencv_4.6.0+dfsg-13.1~exp1_armel  (failed to sign)
 15s rhvoice_1.8.0+dfsg-3.1~exp1_armel
  5s libposix-2008-perl_0.23-1_armel
  4s rust-expectrl_0.7.1-2_armel
  5s liblinux-fd-perl_0.016-1_armel
 10s swami_2.2.2-2.1~exp1_armel
  6s symmetrica_3.0.1+ds-2.1~exp1_armel
  9s muffin_5.8.1-2.1~exp1_armel
  5s t4kcommon_0.1.1-11.1~exp1_armel
  4s netperfmeter_1.9.6-1_armel
  6s pidgin-skype_20240122+gitab786a3+dfsg-2_armel
  6s tinyframe_0.1.1-4.1~exp1_armel
  5s toontag_0.0~git20220105193632.41237ef-2.1~exp1_armhf
  4s tse3_0.3.1-6.1~exp1_armel
 86s gcc-10_10.5.0-3_armel
  5s lomiri-camera-app_4.0.5+dfsg-1_armel
  8s opendmarc_1.4.2-4.1~exp1_armel
154s libreoffice_24.2.0-1~bpo12+1_armel
 59s gnuradio_3.10.9.2-1.1~exp1_armel    (failed to sign)

So opencv is not the longest package to process: libreoffice takes quite a lot longer, and gcc-10 slightly longer. But most are way quicker.
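
To put numbers on that, the timing table above can be parsed and ranked. This is only a sketch over the 27 rows listed in this message; the helper names are made up.

```python
import re
import statistics

# Timing table copied from this message.
TIMINGS = """\
 16s singular_4.3.2-p10+ds-1.1~exp1_armel
 15s redland_1.0.17-3.1~exp1_armel
  9s shapetools_1.4pl6-16.1~exp1_armel
  9s solvespace_3.1+ds1-3.1~exp1_armel
  8s secsipidx_1.3.2-2.1~exp1_armel
  8s scamper_20211212-1.2~exp1_armel
  6s rttr_0.9.6+dfsg1-6.1~exp1_armel
  8s mfem_4.5.2+ds-1.5~exp1_armel
 82s opencv_4.6.0+dfsg-13.1~exp1_armel  (failed to sign)
 15s rhvoice_1.8.0+dfsg-3.1~exp1_armel
  5s libposix-2008-perl_0.23-1_armel
  4s rust-expectrl_0.7.1-2_armel
  5s liblinux-fd-perl_0.016-1_armel
 10s swami_2.2.2-2.1~exp1_armel
  6s symmetrica_3.0.1+ds-2.1~exp1_armel
  9s muffin_5.8.1-2.1~exp1_armel
  5s t4kcommon_0.1.1-11.1~exp1_armel
  4s netperfmeter_1.9.6-1_armel
  6s pidgin-skype_20240122+gitab786a3+dfsg-2_armel
  6s tinyframe_0.1.1-4.1~exp1_armel
  5s toontag_0.0~git20220105193632.41237ef-2.1~exp1_armhf
  4s tse3_0.3.1-6.1~exp1_armel
 86s gcc-10_10.5.0-3_armel
  5s lomiri-camera-app_4.0.5+dfsg-1_armel
  8s opendmarc_1.4.2-4.1~exp1_armel
154s libreoffice_24.2.0-1~bpo12+1_armel
 59s gnuradio_3.10.9.2-1.1~exp1_armel    (failed to sign)
"""

LINE_RE = re.compile(r"^\s*(\d+)s\s+(\S+)\s*(\(failed to sign\))?\s*$")


def parse_timings(text):
    """Return a list of (seconds, package, failed) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append((int(m.group(1)), m.group(2), m.group(3) is not None))
    return rows


def summarise(rows):
    """Return the median duration and the rows sorted slowest first."""
    median = statistics.median(seconds for seconds, _, _ in rows)
    slowest_first = sorted(rows, key=lambda row: row[0], reverse=True)
    return median, slowest_first
```

On this data the median is 8s, and the four slowest runs are libreoffice (154s), gcc-10 (86s), opencv (82s) and gnuradio (59s). Both signing failures are among those four, but the two slowest runs signed fine, so a long run looks like a risk factor rather than a sufficient cause.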

I have noticed that it's usually larger packages that go wrong (libreoffice, gcc, binutils), but not always.

Not sure if any of this info helps, but those are my investigations
for today. Suggestions welcome for other monitoring, or for the best way
to work around it: not fixing it, just making it less annoying.

Wookey
-- 
Principal hats:  Debian, Wookware, ARM
http://wookware.org/