Reproducible builds in December 2023

Chris Lamb chris at reproducible-builds.org
Fri Jan 12 16:09:00 GMT 2024


--------------------------------------------------------------------
        o
      ⬋   ⬊      December 2023 in Reproducible Builds
     o     o
      ⬊   ⬋      https://reproducible-builds.org/reports/2023-12/
        o
--------------------------------------------------------------------

Welcome to the December 2023 report from the Reproducible Builds [0]
project! In these reports we outline the most important things that we
have been up to over the past month. As a rather rapid recap, whilst
anyone may inspect the source code of free software for malicious flaws,
almost all software is distributed to end users as pre-compiled binaries
(more info: [1]).

 [0] https://reproducible-builds.org
 [1] https://reproducible-builds.org/#why-does-it-matter


                                    §


"Reproducible Builds: Increasing the Integrity of Software Supply
Chains" awarded IEEE Software "Best Paper" award
-----------------------------------------------------------------

In February 2022, we announced in these reports [2] that a paper written
by Chris Lamb [3] and Stefano Zacchiroli [4] was now available in the
March/April 2022 issue of IEEE Software [5]. The paper is titled
"Reproducible Builds: Increasing the Integrity of Software Supply
Chains" [6] (PDF [7]).

This month, IEEE Software [8] announced that this paper has won their
Best Paper award [9] for 2022.

 [2] https://reproducible-builds.org/reports/2023-02/
 [3] https://chris-lamb.co.uk
 [4] https://upsilon.cc/~zack/
 [5] https://ieeexplore.ieee.org/abstract/document/9403390
 [6] https://arxiv.org/abs/2104.06020
 [7] https://arxiv.org/pdf/2104.06020
 [8] https://www.computer.org/csdl/magazine/so
 [9] https://twitter.com/ieeesoftware/status/1736684911690436868

                                    §


Reproducibility to affect package migration policy in Debian
------------------------------------------------------------

In a post summarising the activities of the Debian Release Team [10] at
a recent in-person Debian event in Cambridge, UK [11], Paul Gevers
announced a change to the way packages are "migrated" into the staging
area for the next stable Debian release, based on their
reproducibility status:

> The folks from the Reproducibility Project have come a long way since
they started working on it 10 years ago, and we believe it's time for
the next step in Debian. Several weeks ago, we enabled a migration
policy in our migration software that checks for regression in
reproducibility. At this moment, that is presented as just for info, but
we intend to change that to delays in the not so distant future. We
eventually want all packages to be reproducible. To stimulate
maintainers to make their packages reproducible now, we'll soon start to
apply a bounty [speedup] for reproducible builds, like we've done with
passing autopkgtests [12] for years. We'll reduce the bounty for
successful autopkgtests at that moment in time.

 [10] https://wiki.debian.org/Teams/ReleaseTeam
 [11] https://wiki.debian.org/DebianEvents/gb/2023/MiniDebConfCambridge
 [12] https://people.debian.org/~eriberto/README.package-tests.html

                                    §


Speranza: "Usable, privacy-friendly software signing"
-----------------------------------------------------

Kelsey Merrill, Karen Sollins, Santiago Torres-Arias and Zachary Newman
have developed a new system called Speranza, which is aimed at
reassuring software consumers that the product they are getting has not
been tampered with and is coming directly from a source they trust. A
write-up on TechXplore.com [13] goes into more detail:

> "What we have done," explains Sollins, "is to develop, prove correct,
and demonstrate the viability of an approach that allows the [software]
maintainers to remain anonymous." Preserving anonymity is obviously
important, given that almost everyone—software developers included—value
their confidentiality. This new approach, Sollins adds, "simultaneously
allows [software] users to have confidence that the maintainers are, in
fact, legitimate maintainers and, furthermore, that the code being
downloaded is, in fact, the correct code of that maintainer." [14]

The corresponding paper [15] is published on the arXiv [16] preprint
server in various formats, and the announcement has also been covered in
MIT News [17].

 [13] https://techxplore.com/news/2023-12-boosting-faith-authenticity-source-software.html
 [14] https://techxplore.com/news/2023-12-boosting-faith-authenticity-source-software.html
 [15] https://arxiv.org/abs/2305.06463
 [16] https://arxiv.org/
 [17] https://news.mit.edu/2023/speranza-boosting-faith-authenticity-open-source-software-1211

                                    §


Nondeterministic Git bundles
----------------------------

Paul Baecher [18] published an interesting blog post on "Reproducible
git bundles" [19]. For those who are not familiar with them, Git bundles
are used for the "offline" transfer of Git objects without an active
server sitting on the other side of a network connection. Paul
describes building a backup tool for his entire system, but:

> I noticed that a small but fixed subset of [Git] repositories are
getting backed up despite having no changes made. That is odd because I
would think that repeated bundling of the same repository state should
create the exact same bundle. However [it] turns out that for some
repositories, bundling is nondeterministic.

Paul goes on to describe his solution, noting that "forcing git to be
single threaded makes the output deterministic". The article was also
discussed on Hacker News [20].
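
As a rough illustration of the idea, the following Python sketch builds a
"git bundle" invocation that asks Git to pack with a single thread, and
compares the checksums of two bundles created from the same repository.
Git's "pack.threads" setting is a real configuration option, but treating
it as the exact mechanism used in the post is an assumption on our part,
and the helper names and file paths below are purely hypothetical:

```python
import hashlib
import subprocess
from pathlib import Path
from typing import List


def bundle_command(repo: str, bundle: str) -> List[str]:
    """Build a `git bundle create` command that packs with one thread.

    `pack.threads=1` is passed as a one-off configuration override; that
    this matches the blog post's exact invocation is an assumption.
    """
    return [
        "git", "-C", repo,
        "-c", "pack.threads=1",
        "bundle", "create", bundle, "--all",
    ]


def file_sha256(path: str) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def bundles_match(repo: str, workdir: str) -> bool:
    """Bundle the same repository twice and compare the two results.

    Requires `git` on PATH and a repository with at least one commit,
    as Git refuses to create an empty bundle.
    """
    digests = []
    for name in ("first.bundle", "second.bundle"):
        target = str(Path(workdir).resolve() / name)
        subprocess.run(bundle_command(repo, target), check=True)
        digests.append(file_sha256(target))
    return digests[0] == digests[1]
```

If "bundles_match" returns True for a repository that previously produced
differing bundles, that would be consistent with the behaviour described
in the post, although it does not prove determinism in general.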

 [18] https://baecher.dev/
 [19] https://baecher.dev/stdout/reproducible-git-bundles/
 [20] https://news.ycombinator.com/item?id=38764452

                                    §


Output from "libxslt" now deterministic
---------------------------------------

libxslt is the XSLT [21] C library developed for the GNOME project
[22], where XSLT itself is an XML language to define transformations for
XML files. This month, it was revealed that the result of the
generate-id() XSLT function is now deterministic across multiple
transformations [23], fixing many issues with reproducible builds. As
the Git commit [24] by Nick Wellnhofer describes:

  Rework the generate-id() function to return deterministic values. We
  use a simple incrementing counter and store ids in the 'psvi' member
  of nodes which was freed up by previous commits. The presence of an
  id is indicated by a new "source node" flag.

  This fixes long-standing problems with reproducible builds, see
  https://bugzilla.gnome.org/show_bug.cgi?id=751621

  This also hardens security, as the old implementation leaked the
  difference between a heap and a global pointer, see
  https://bugs.chromium.org/p/chromium/issues/detail?id=1356211

  The old implementation could also generate the same id for
  dynamically created nodes which happened to reuse the same memory.
  Ids for namespace nodes were completely broken. They now use the id
  of the parent element together with the hex-encoded namespace
  prefix.
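
To make the approach in the commit message a little more concrete, here is
a minimal Python sketch of counter-based identifiers. It is not libxslt's
C implementation: the "Node" and "IdGenerator" classes, the "id" prefix
and the "ns" separator are all hypothetical, and the "psvi" attribute
merely stands in for the spare per-node slot mentioned above:

```python
import itertools


class Node:
    """A stand-in for an XML node with one spare slot for a cached id."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.psvi = None  # hypothetical analogue of the 'psvi' member


class IdGenerator:
    """Hands out ids from an incrementing counter, not memory addresses."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)

    def generate_id(self, node: Node) -> str:
        # Assign a number the first time a node is seen, then reuse it,
        # so the id depends only on the order in which nodes are visited.
        if node.psvi is None:
            node.psvi = next(self._counter)
        return f"id{node.psvi}"

    def namespace_id(self, parent: Node, prefix: str) -> str:
        # Namespace nodes: parent element id plus hex-encoded prefix.
        # The "ns" separator is an illustrative choice only.
        return f"{self.generate_id(parent)}ns{prefix.encode('utf-8').hex()}"
```

Because the identifiers depend only on the order in which nodes are
visited, two runs over the same input produce the same output, whereas
identifiers derived from pointer values can vary with memory layout.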

 [21] https://en.wikipedia.org/wiki/XSLT
 [22] https://www.gnome.org/
 [23] https://gitlab.gnome.org/GNOME/libxslt/-/blob/d679f4470df2c79443ff54dbc6bd95afaf4cd876/NEWS#L47-48
 [24] https://gitlab.gnome.org/GNOME/libxslt/-/commit/82f6cbf8ca61b1f9e00dc04aa3b15d563e7bbc6d

                                    §

Community updates
-----------------

A number of improvements were made to our website, including Chris Lamb
fixing the "generate-draft" script to not blow up if the input files
have been corrupted today or even in the past [25], Holger Levsen
updating the Hamburg 2023 summit [26] page to add a link to the farewell
post [27] and a picture of a Post-It note [28], and Pol Dellaiera
updating the paragraph about "tar" and the "--clamp-mtime" flag [29].

On our mailing list [30] this month, Bernhard M. Wiedemann posted an
interesting summary on some of the reasons why packages are still not
reproducible [31] in 2023.

diffoscope [32] is our in-depth and content-aware diff utility that can
locate and diagnose reproducibility issues. This month, Chris Lamb made
a number of changes, including processing "objdump" symbol comment
filter inputs as Python "bytes" (and not "str") instances [33]. In
addition, Vagrant Cascadian extended diffoscope support for GNU
Guix [34] [35] and updated the version in that distribution to
version 253 [36] [37].

 [25] https://salsa.debian.org/reproducible-builds/reproducible-website/commit/40c10ab9
 [26] https://reproducible-builds.org/events/hamburg2023/
 [27] https://salsa.debian.org/reproducible-builds/reproducible-website/commit/0a17754a
 [28] https://salsa.debian.org/reproducible-builds/reproducible-website/commit/d6f3fa6e
 [29] https://salsa.debian.org/reproducible-builds/reproducible-website/commit/37e7878f
 [30] https://lists.reproducible-builds.org/listinfo/rb-general/
 [31] https://lists.reproducible-builds.org/pipermail/rb-general/2023-December/003215.html
 [32] https://diffoscope.org
 [33] https://salsa.debian.org/reproducible-builds/diffoscope/commit/6d788d7d
 [34] https://guix.gnu.org/
 [35] https://salsa.debian.org/reproducible-builds/diffoscope/-/commit/f1822463eb39ba673b1037e105a5af59fd04262b
 [36] https://issues.guix.gnu.org/67980
 [37] https://git.savannah.gnu.org/cgit/guix.git/commit/?id=111d010921fea8c803427dc316086434e748e773

                                    §


Paper: "Challenges of Producing Software Bill Of Materials for Java"
--------------------------------------------------------------------

Musard Balliu, Benoit Baudry, Sofia Bobadilla, Mathias Ekstedt, Martin
Monperrus, Javier Ron, Aman Sharma, Gabriel Skoglund, César Soto-Valero
and Martin Wittlinger (!) of the KTH Royal Institute of Technology [38]
in Sweden, have published an article in which they:

> … deep-dive into 6 tools and the accuracy of the SBOMs [39] they
produce for complex open-source Java projects. Our novel insights reveal
some hard challenges regarding the accurate production and usage of
software bills of materials.

The paper is available [40] on arXiv [41].

 [38] https://www.kth.se/en
 [39] https://about.gitlab.com/blog/2022/10/25/the-ultimate-guide-to-sboms/
 [40] https://arxiv.org/abs/2303.11102
 [41] https://arxiv.org/

                                    §


Debian Non-Maintainer Upload (NMU) campaign
-------------------------------------------

As mentioned in previous [42] reports [43], the Reproducible Builds team
within Debian has been organising a series of online and offline sprints
in order to clear the huge backlog of submitted reproducible builds
patches by performing so-called NMUs (Non-Maintainer Uploads [44]).

During December, Vagrant Cascadian performed a number of such
uploads, including:

  * crack [45] [46] (#1021521 & #1021522)
  * dustmite [49] [50] (#1020878 & #1020879)
  * edid-decode [53] [54] (#1020877)
  * gentoo [56] [57] (#1024284)
  * haskell98-report [59] [60] (#1024007)
  * infinipath-psm [62] [63] (#990862)
  * lcm [65] [66] (#1024286)
  * libapache-mod-evasive [68] [69] (#1020800)
  * libccrtp [71] [72] (#860470)
  * libinput [74] [75] (#995809)
  * lirc [77] [78] (#979019, #979023 & #979024)
  * mm-common [82] [83] (#977177)
  * mpl-sphinx-theme [85] [86] (#1005826)
  * psi [88] [89] (#1017473)
  * python-parse-type [91] [92] (#1002671)
  * ruby-tioga [94] [95] (#1005727)
  * ucspi-proxy [97] [98] (#1024125)
  * ypserv [100] [101] (#983138)

In addition, Holger Levsen performed three "no-source-change" NMUs in
order to address the last packages without .buildinfo files in Debian
"trixie", specifically lorene (0.0.0~cvs20161116+dfsg-1.1), maria
(1.3.5-4.2) and ruby-rinku (1.7.3-2.1).

 [42] https://reproducible-builds.org/reports/2023-01/
 [43] https://reproducible-builds.org/reports/2022-12/
 [44] https://wiki.debian.org/NonMaintainerUpload
 [45] https://tracker.debian.org/crack
 [46] https://browse.dgit.debian.org/crack.git/commit/?id=4b45271101e0f3cf2ca8f5039e487d8931563011
 [49] https://tracker.debian.org/dustmite
 [50] https://browse.dgit.debian.org/dustmite.git/commit/?id=c2ec8a50eb2497ab8ea974ab4f4e5603691c7cd0
 [53] https://tracker.debian.org/edid-decode
 [54] https://salsa.debian.org/debian/edid-decode/-/commit/c9dcc3bdda30953543cf8ded821efe13f8269fc6
 [56] https://tracker.debian.org/gentoo
 [57] https://browse.dgit.debian.org/gentoo.git/commit/?id=6abbef83574c7028fa034a23471863e5107073e2
 [59] https://tracker.debian.org/haskell98-report
 [60] https://browse.dgit.debian.org/haskell98-report.git/commit/?id=3fa444643941eb7674d8a3fc6adbce447e4d5d55
 [62] https://tracker.debian.org/pkg/infinipath-psm
 [63] https://browse.dgit.debian.org/infinipath-psm.git/commit/?id=5fe0454be2b53b268b379a9c4c4e183b62c4397b
 [65] https://tracker.debian.org/lcm
 [66] https://salsa.debian.org/debian/lcm/-/commit/3ee1de8caa666836f5b92cb3a664633343b08615
 [68] https://tracker.debian.org/libapache-mod-evasive
 [69] https://browse.dgit.debian.org/libapache-mod-evasive.git/commit/?id=7ed970db575818b0742b276d1700faab75398d4b0
 [71] https://tracker.debian.org/libccrtp
 [72] https://browse.dgit.debian.org/libccrtp.git/commit/?id=3fc542e27e9b26b2c46dab2ee51f84306c441a80
 [74] https://tracker.debian.org/libinput
 [75] https://browse.dgit.debian.org/libinput.git/commit/?id=1d5d19721b4c215cb08d758b85301e20b2be1af9
 [77] https://tracker.debian.org/lirc
 [78] https://browse.dgit.debian.org/lirc.git/commit/?id=0d817194dde2519f6559b1b7a50516c46aac5b5b
 [82] https://tracker.debian.org/mm-common
 [83] https://browse.dgit.debian.org/mm-common.git/commit/?id=dc903218cc0b9bf783494f6d7280ba24e00092c5
 [85] https://tracker.debian.org/mpl-sphinx-theme
 [86] https://browse.dgit.debian.org/mpl-sphinx-theme.git/commit/?id=fd0cbbd24f534d6a35cab0ee7261b285fb3d8cfb
 [88] https://tracker.debian.org/psi
 [89] https://browse.dgit.debian.org/psi.git/commit/?id=e6f2b1f720dee8658170ce862cb290f019120d88
 [91] https://tracker.debian.org/python-parse-type
 [92] https://browse.dgit.debian.org/python-parse-type.git/commit/?id=6e17fbda135ce2987457a715e5fbc742796ada1b
 [94] https://tracker.debian.org/ruby-tioga
 [95] https://salsa.debian.org/ruby-team/ruby-tioga/-/commit/05c5ae82168cb054844aa2502c9b782976e3a93f
 [97] https://tracker.debian.org/ucspi-proxy
 [98] https://browse.dgit.debian.org/ucspi-proxy.git/commit/?id=3e0e53d6637f8e1e4600c81bf5eac632cc8c5644
 [100] https://tracker.debian.org/pkg/ypserv
 [101] https://browse.dgit.debian.org/ypserv.git/commit/?id=9c8c6a058c1c512047aa64a39338b184c0c2be9f

                                    §

Reproducibility testing framework
---------------------------------

The Reproducible Builds project operates a comprehensive testing
framework (available at tests.reproducible-builds.org [103]) in order to
check packages and other artifacts for reproducibility. In December, a
number of changes were made by Holger Levsen:

* Debian [104]-related changes:

    * Fix matching packages for the R programming language
      [105]. [106][107][108]
    * Add a Certbot [109] configuration for the Nginx web server.
    * Enable debugging for the "create-meta-pkgs" tool. [110][111]

* Arch Linux [112]-related changes:

    * The "asp" tool has been deprecated in favour of "pkgctl"; thanks
      to dvzrv for the pointer. [113]
    * Disable the Arch Linux builders for now. [114]
    * Stop referring to the "/trunk" branch / subdirectory. [115]
    * Use "--protocol https" when cloning repositories using the
      "pkgctl" tool. [116]

* Misc changes:

    * Install the "python3-setuptools" and "swig" packages, which are
      now needed to build OpenWrt. [117]
    * Install "pkg-config", needed to build Coreboot artifacts. [118]
    * Detect failures due to an issue where the "fakeroot" tool is
      implicitly required but not automatically installed [119]. [120]
    * Detect failures due to rename of the "vmlinuz" file. [121]
    * Improve the grammar of an error message. [122]
    * Document that "freebsd-jenkins.debian.net" has been updated to
      FreeBSD 14.0 [123]. [124]

In addition, node maintenance was performed by Holger Levsen [125] and
Vagrant Cascadian [126].

 [103] https://tests.reproducible-builds.org
 [104] https://debian.org/
 [105] https://en.wikipedia.org/wiki/R_(programming_language)
 [106] https://salsa.debian.org/qa/jenkins.debian.net/commit/8d62713bf
 [107] https://salsa.debian.org/qa/jenkins.debian.net/commit/5b6d7d332
 [108] https://salsa.debian.org/qa/jenkins.debian.net/commit/37f4c1c08
 [109] https://certbot.eff.org/
 [110] https://salsa.debian.org/qa/jenkins.debian.net/commit/f5ec1b5df
 [111] https://salsa.debian.org/qa/jenkins.debian.net/commit/b38717780
 [112] https://archlinux.org/
 [113] https://salsa.debian.org/qa/jenkins.debian.net/commit/a07e44fa6
 [114] https://salsa.debian.org/qa/jenkins.debian.net/commit/fb40264ab
 [115] https://salsa.debian.org/qa/jenkins.debian.net/commit/22ca4f991
 [116] https://salsa.debian.org/qa/jenkins.debian.net/commit/42913dcdb
 [117] https://salsa.debian.org/qa/jenkins.debian.net/commit/43ceb3a27
 [118] https://salsa.debian.org/qa/jenkins.debian.net/commit/b0e7b255f
 [119] https://bugs.debian.org/1058994
 [120] https://salsa.debian.org/qa/jenkins.debian.net/commit/74f36029a
 [121] https://salsa.debian.org/qa/jenkins.debian.net/commit/deb577757
 [122] https://salsa.debian.org/qa/jenkins.debian.net/commit/64ba2b857
 [123] https://www.freebsd.org/releases/14.0R/relnotes/
 [124] https://salsa.debian.org/qa/jenkins.debian.net/commit/516134d7a
 [125] https://salsa.debian.org/qa/jenkins.debian.net/commit/d09194e21
 [126] https://salsa.debian.org/qa/jenkins.debian.net/commit/708a936ce

                                    §

Upstream patches
----------------

The Reproducible Builds project detects, dissects and attempts to fix as
many currently-unreproducible packages as possible. We endeavour to send
all of our patches upstream where appropriate. This month, we wrote a
large number of such patches, including:

* Bernhard M. Wiedemann:

    * apr [127] (hostname issue)
    * dune [128] (parallelism)
    * epy [129] (time-based ".pyc" issue)
    * fpc [130] (Year 2038)
    * gap [131] (date)
    * gh [132] (FTBFS in 2024)
    * kubernetes [133] (fixed random build path)
    * libgda [134] (date)
    * libguestfs [135] (tar)
    * metamail [136] (date)
    * mpi-selector [137] (date)
    * neovim [138] (randomness in Lua)
    * nml [139] (time-based ".pyc")
    * pommed [140] (parallelism)
    * procmail [141] (benchmarking)
    * pysnmp [142] (FTBFS in 2038)
    * python-efl [143] (drop Sphinx doctrees)
    * python-pyface [144] (time)
    * python-pytest-salt-factories [145] (time-based ".pyc" issue)
    * python-quimb [146] (fails to build on single-CPU systems)
    * python-rdflib [147] (random)
    * python-yarl [148] (random path)
    * qt6-webengine [149] (parallelism issue in documentation)
    * texlive [150] (Gzip modification time issue)
    * waf [151] (time-based ".pyc")
    * warewulf [152] (CPIO modification time and inode issue)
    * xemacs [153] (toolchain hostname)

* Chris Lamb:

    * #1057710 [154] filed against "python-aiostream" [155].
    * #1057721 [156] filed against "openpyxl" [157].
    * #1058681 [158] filed against "python-multipletau" [159].
    * #1059013 [160] filed against "wxmplot" [161].
    * #1059014 [162] filed against "stunnel4" [163].

* James Addison:

    * #1059592 [164] & #1059631 [165] filed against
      "qttools-opensource-src" [166].
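
Several entries above mention a time-based ".pyc" issue. As general
background only (this is not a description of any of the specific patches
listed), CPython's default ".pyc" header records the modification time of
the source file, so rebuilding identical source at a different time can
change the output. The sketch below uses the standard "py_compile" module
to contrast this with hash-based invalidation; the helper names and the
example source are hypothetical:

```python
import os
import py_compile
import tempfile
from pathlib import Path
from py_compile import PycInvalidationMode


def pyc_header(source_text: str, mtime: int, mode: PycInvalidationMode) -> bytes:
    """Compile source with a forced mtime and return the 16-byte header."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "example.py"
        source.write_text(source_text)
        os.utime(source, (mtime, mtime))  # pretend the file was edited then
        target = Path(tmp) / "example.pyc"
        py_compile.compile(
            str(source), cfile=str(target), doraise=True,
            invalidation_mode=mode,
        )
        return target.read_bytes()[:16]


def recorded_mtime(header: bytes) -> int:
    """Read the little-endian mtime field of a timestamp-based header."""
    return int.from_bytes(header[8:12], "little")


def header_varies_with_mtime(mode: PycInvalidationMode) -> bool:
    """Report whether the header changes when only the mtime changes."""
    text = "ANSWER = 42\n"
    return pyc_header(text, 1_000_000_000, mode) != pyc_header(text, 1_700_000_000, mode)
```

With "PycInvalidationMode.TIMESTAMP" the header differs between the two
compilations, whereas with "PycInvalidationMode.CHECKED_HASH" it is
derived from the source contents instead and so stays the same.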

 [127] https://build.opensuse.org/request/show/1133854
 [128] https://github.com/ocaml/dune/issues/9507
 [129] https://build.opensuse.org/request/show/1134190
 [130] https://gitlab.com/freepascal.org/fpc/source/-/issues/40552
 [131] https://github.com/gap-system/gap/pull/5550
 [132] https://github.com/cli/cli/issues/8452
 [133] https://github.com/kubernetes/kubernetes/issues/110928
 [134] https://build.opensuse.org/request/show/1134513
 [135] https://build.opensuse.org/request/show/1133981
 [136] https://build.opensuse.org/request/show/1134199
 [137] https://build.opensuse.org/request/show/1133866
 [138] https://github.com/neovim/neovim/issues/26387
 [139] https://build.opensuse.org/request/show/1134364
 [140] https://salsa.debian.org/mactel-team/pommed/-/merge_requests/2
 [141] https://build.opensuse.org/request/show/1130552
 [142] https://github.com/lextudio/pysnmp/pull/35
 [143] https://build.opensuse.org/request/show/1130399
 [144] https://build.opensuse.org/request/show/1131679
 [145] https://build.opensuse.org/request/show/1134187
 [146] https://bugzilla.opensuse.org/show_bug.cgi?id=1217831
 [147] https://build.opensuse.org/request/show/1132811
 [148] https://build.opensuse.org/request/show/1132599
 [149] https://bugzilla.opensuse.org/show_bug.cgi?id=1217774
 [150] https://build.opensuse.org/request/show/1133928
 [151] https://build.opensuse.org/request/show/1133929
 [152] https://bugzilla.opensuse.org/show_bug.cgi?id=1217973
 [153] https://build.opensuse.org/request/show/1134362
 [154] https://bugs.debian.org/1057710
 [155] https://tracker.debian.org/pkg/python-aiostream
 [156] https://bugs.debian.org/1057721
 [157] https://tracker.debian.org/pkg/openpyxl
 [158] https://bugs.debian.org/1058681
 [159] https://tracker.debian.org/pkg/python-multipletau
 [160] https://bugs.debian.org/1059013
 [161] https://tracker.debian.org/pkg/wxmplot
 [162] https://bugs.debian.org/1059014
 [163] https://tracker.debian.org/pkg/stunnel4
 [164] https://bugs.debian.org/1059592
 [165] https://bugs.debian.org/1059631
 [166] https://tracker.debian.org/pkg/qttools-opensource-src

                                    §

And finally...
--------------

If you are interested in contributing to the Reproducible Builds
project, please visit our "Contribute" [167] page on our website.
Alternatively, you can get in touch with us via:

 * IRC: "#reproducible-builds" on "irc.oftc.net".
 * Mailing list: "rb-general at lists.reproducible-builds.org" [168]
 * Mastodon: @reproducible_builds [169]
 * Twitter: @ReproBuilds [170]

 [167] https://reproducible-builds.org/contribute/
 [168] https://lists.reproducible-builds.org/listinfo/rb-general
 [169] https://fosstodon.org/@reproducible_builds
 [170] https://twitter.com/ReproBuilds



-- 
      o
    ⬋   ⬊      Chris Lamb
   o     o     reproducible-builds.org 💠
    ⬊   ⬋
      o


