Reproducible Builds in May 2024
Chris Lamb
chris at reproducible-builds.org
Mon Jun 10 10:05:36 BST 2024
--------------------------------------------------------------------
o
⬋ ⬊ May 2024 in Reproducible Builds
o o
⬊ ⬋ https://reproducible-builds.org/reports/2024-05/
o
--------------------------------------------------------------------
Welcome to the May 2024 report from the Reproducible Builds [0] project!
In these reports, we try to outline what we have been up to over the
past month and highlight news items in software supply-chain security
more broadly.
As ever, if you are interested in contributing to the project, please
visit our "Contribute" [1] page on our website.
[0] https://reproducible-builds.org
[1] https://reproducible-builds.org/contribute/
§
TABLE OF CONTENTS
* "A peek into build provenance for Homebrew"
* Distribution news
* Mailing list news
* Miscellaneous news
* Two new academic papers
* "diffoscope"
* Website updates
* Upstream patches
* Reproducibility testing framework
§
"A peek into build provenance for Homebrew"
-------------------------------------------
Joe Sweeney and William Woodruff on the Trail of Bits [3] blog wrote an
extensive post about build provenance [4] for Homebrew [5], the third-
party package manager for macOS. Their post details how each "bottle"
(i.e. each release):
> […] built by Homebrew will come with a cryptographically verifiable
> statement binding the bottle’s content to the specific workflow and
> other build-time metadata that produced it. […] In effect, this
> injects greater transparency into the Homebrew build process, and
> diminishes the threat posed by a compromised or malicious insider by
> making it impossible to trick ordinary users into installing
> non-CI-built bottles.
The post also briefly touches on future work, including work on
source provenance:
> Homebrew’s formulae already hash-pin their source artifacts, but we
> can go a step further and additionally assert that source artifacts are
> produced by the repository (or other signing identity) that’s latent in
> their URL or otherwise embedded into the formula specification.
[3] https://www.trailofbits.com/
[4] https://blog.trailofbits.com/2024/05/14/a-peek-into-build-provenance-for-homebrew/
[5] https://brew.sh/
§
Distribution news
-----------------
In Debian this month, Johannes Schauer Marin Rodrigues (aka "josch")
noticed that the Debian binary package bash version 5.2.15-2+b3 was
"uploaded to the archive twice. Once to bookworm and once to sid but
with differing content." [6] This is a problem for reproducible builds
in Debian due to its assumption that the package name, version and
architecture triplet is unique. In addition, josch highlighted that:
> This example with bash is especially problematic since bash is
> Essential:yes, so there will now be a large portion of .buildinfo files
> where it is not possible to figure out with which of the two differing
> bash packages the sources were compiled.
In response to this, Holger Levsen performed an analysis of all
.buildinfo files and found that almost 1,500 binNMUs [7] are needed to
fix the fallout from this bug.
[6] https://bugs.debian.org/1072205
[7] https://wiki.debian.org/NonMaintainerUpload
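To illustrate why the uniqueness assumption matters, the following
minimal Python sketch groups binary packages by their (name, version,
architecture) triplet and reports any triplet that is associated with
more than one checksum. The record layout, the function name and the
checksums are hypothetical and purely illustrative; this is not how the
Debian archive or the analysis mentioned above is implemented.

```python
from collections import defaultdict


def find_ambiguous_triplets(records):
    """Return triplets that map to more than one distinct checksum.

    Each record is a hypothetical (name, version, arch, checksum) tuple.
    """
    seen = defaultdict(set)
    for name, version, arch, checksum in records:
        seen[(name, version, arch)].add(checksum)
    # Keep only the triplets for which the archive content is ambiguous.
    return {t: sorted(c) for t, c in seen.items() if len(c) > 1}


# Made-up data: the checksums are placeholders, not real archive values.
records = [
    ("bash", "5.2.15-2+b3", "amd64", "aaaa"),  # content seen in one suite
    ("bash", "5.2.15-2+b3", "amd64", "bbbb"),  # same triplet, other content
    ("coreutils", "9.1-1", "amd64", "cccc"),
]

ambiguous = find_ambiguous_triplets(records)
for triplet, checksums in ambiguous.items():
    print(triplet, checksums)
```

A .buildinfo file that records only the triplet cannot say which of the
two checksums was actually installed at build time, which is the
ambiguity josch describes.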
Elsewhere in Debian, Vagrant Cascadian posted about a Non-Maintainer
Upload (NMU) sprint [8] to take place during early June, and it was
announced that there is now a #debian-snapshot IRC channel on OFTC to
discuss the creation of a new source code archiving service to, perhaps,
replace snapshot.debian.org [9]. Lastly, 11 reviews of Debian packages
were added, 15 were updated and 48 were removed this month adding to our
extensive knowledge about identified issues [10]. A number of issue
types have been updated by Chris Lamb as well. [11][12]
[8] https://lists.reproducible-builds.org/pipermail/rb-general/2024-May/003404.html
[9] https://snapshot.debian.org/
[10] https://tests.reproducible-builds.org/debian/index_issues.html
[11] https://salsa.debian.org/reproducible-builds/reproducible-notes/commit/5fda7f6e
[12] https://salsa.debian.org/reproducible-builds/reproducible-notes/commit/cf46a837
Elsewhere in the world of distributions, deep within a larger
announcement from Colin Percival about the release of version
14.1-BETA2 [13], it was mentioned that the FreeBSD [14] kernels are
now built reproducibly.
[13] https://lists.freebsd.org/archives/freebsd-stable/2024-May/002133.html
[14] https://www.freebsd.org/
In Fedora, the change proposal mentioned in our report for April 2024
[15] was approved, so, per the ReproduciblePackageBuilds [16]
wiki page, the "add-determinism" [17] tool is now running in new builds
for Fedora 41 ('rawhide'). The "add-determinism" tool is a Rust program
which, as its name suggests, adds determinism to files that are given as
input by "attempting to standardize metadata contained in binary or
source files to ensure consistency and clamping to $SOURCE_DATE_EPOCH in
all instances". This is essentially the Fedora version of Debian's
strip-nondeterminism. However, strip-nondeterminism is written in
Perl, and Fedora did not want to pull Perl in the buildroot for every
package. The add-determinism tool eliminates many causes of non-
determinism and work is ongoing to expand the scope of packages it can
operate on.
[15] https://reproducible-builds.org/reports/2024-04/
[16] https://fedoraproject.org/wiki/Changes/ReproduciblePackageBuilds
[17] https://github.com/keszybz/add-determinism
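As a rough illustration of what "clamping to $SOURCE_DATE_EPOCH" means,
here is a minimal Python sketch that lowers a file's modification time
to the epoch value whenever it is newer, and leaves older timestamps
untouched. This is not the add-determinism implementation (which is
written in Rust and handles many file formats); the function name, the
file names and the fallback epoch value are assumptions made for the
example.

```python
import os
import tempfile


def clamp_mtime(path, source_date_epoch):
    """Clamp a file's mtime so it is never newer than source_date_epoch.

    Returns True if the timestamp was changed.
    """
    if os.stat(path).st_mtime > source_date_epoch:
        os.utime(path, (source_date_epoch, source_date_epoch))
        return True
    return False


# Use the environment variable if set; the fallback is an arbitrary
# value (2024-05-01 00:00:00 UTC) chosen only for this example.
epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "1714521600"))

with tempfile.TemporaryDirectory() as tmp:
    newer = os.path.join(tmp, "newer.txt")
    older = os.path.join(tmp, "older.txt")
    for path in (newer, older):
        with open(path, "w") as f:
            f.write("example\n")
    os.utime(newer, (epoch + 3600, epoch + 3600))  # one hour too new
    os.utime(older, (epoch - 3600, epoch - 3600))  # already old enough

    changed_newer = clamp_mtime(newer, epoch)
    changed_older = clamp_mtime(older, epoch)
    mtime_newer = int(os.stat(newer).st_mtime)
    mtime_older = int(os.stat(older).st_mtime)
```

Clamping, rather than overwriting every timestamp, preserves timestamps
that are already older than the epoch and therefore already stable
between builds.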
§
Mailing list news
-----------------
On our mailing list [18] this month, regular contributor "kpcyrd" wrote
to the list with an update on their source code indexing project,
whatsrc.org [19]. The whatsrc.org [20] project, which was launched
last month in response to the XZ Utils backdoor [21], now contains and
indexes almost 250,000 unique source code archives. In their post,
kpcyrd gives an example of its intended purpose, noting that it has
shown that whilst "there seems to be consensus about [the] source code
for zsh 5.9" in various Linux distributions, it "does not align with the
contents of the zsh Git repository".
[18] https://lists.reproducible-builds.org/listinfo/rb-general/
[19] https://lists.reproducible-builds.org/pipermail/rb-general/2024-May/003407.html
[20] https://whatsrc.org/
[21] https://en.wikipedia.org/wiki/XZ_Utils_backdoor
Holger Levsen also posted to the list with a 'pre-announcement' of sorts
for the 2024 Reproducible Builds summit [22]. In particular:
> [Whilst] the dates and location are not fixed yet, however if you don't
> help us with finding a suitable location *soon*, it is very likely that
> we'll meet again in Hamburg in the 2nd half of September 2024 […].
[22] https://lists.reproducible-builds.org/pipermail/rb-general/2024-May/003411.html
Lastly, Frederic-Emmanuel Picca wrote to the list asking for help
understanding the "non-reproducible status of the Debian silx package"
[23] and received replies from both Vagrant Cascadian [24] and Chris
Lamb [25].
[23] https://lists.reproducible-builds.org/pipermail/rb-general/2024-May/003393.html
[24] https://lists.reproducible-builds.org/pipermail/rb-general/2024-May/003394.html
[25] https://lists.reproducible-builds.org/pipermail/rb-general/2024-May/003396.html
§
Miscellaneous news
------------------
strip-nondeterminism [26] is our tool to remove specific non-
deterministic results from a completed build. This month strip-
nondeterminism version 1.14.0-1 was uploaded to Debian unstable [27] by
Chris Lamb chiefly to incorporate a change from Alex Muntada to avoid a
dependency on Sub::Override to perform monkey-patching and break
circular dependencies related to debhelper [28].
[26] https://tracker.debian.org/pkg/strip-nondeterminism
[27] https://tracker.debian.org/news/1532251/accepted-strip-nondeterminism-1140-1-source-into-unstable/
[28] https://salsa.debian.org/reproducible-builds/strip-nondeterminism/commit/4d235a6
Elsewhere in our tooling, Jelle van der Waa modified reprotest [29]
because the pipes [30] module will be removed in Python version 3.13
[31].
[29] https://tracker.debian.org/pkg/reprotest
[30] https://docs.python.org/3/library/pipes.html
[31] https://salsa.debian.org/reproducible-builds/reprotest/commit/37a13d9
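For readers facing the same removal, a common migration is sketched
below. It assumes that the affected code used the pipes.quote helper for
shell quoting, which may or may not match the linked reprotest commit;
the build_command function is hypothetical. The shlex.quote function
from the standard library can be used in its place:

```python
import shlex


def build_command(program, args):
    """Join a program and its arguments into a shell-safe command string."""
    # shlex.quote is the supported way to quote a single shell word.
    return " ".join(shlex.quote(part) for part in [program, *args])


command = build_command("echo", ["hello world", "it's"])
print(command)
```

Splitting the result again with shlex.split should yield the original
list of words, which is a convenient sanity check when migrating.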
It was also noticed that a new blog post by Daniel Stenberg [32]
detailing "How to verify a curl release [33]" mentions the
SOURCE_DATE_EPOCH environment variable [34]. This is because:
> The [curl] release tools document also contains another key
> component: the exact time stamp at which the release was done –
> using integer second resolution. In order to generate a correct
> tarball clone, you need to also generate the new version using the
> old version’s timestamp. Because the modification date of all files
> in the produced tarball will be set to this timestamp.
[32] https://daniel.haxx.se/
[33] https://daniel.haxx.se/blog/2024/05/23/how-to-verify-a-curl-release/
[34] https://reproducible-builds.org/docs/source-date-epoch/
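The effect of the timestamp can be sketched in Python. The example below
is not curl's release tooling; it is a minimal illustration, with
hypothetical file names, contents and timestamp, of how pinning the
modification time (along with ordering and ownership metadata) makes
two independently generated tarballs byte-for-byte identical:

```python
import gzip
import io
import tarfile


def build_tarball(files, timestamp):
    """Return gzip-compressed tar bytes for a {name: bytes} mapping.

    All metadata that could vary between runs is pinned.
    """
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed member order
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = timestamp  # same mtime for every member
            info.mode = 0o644
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(files[name]))
    out = io.BytesIO()
    # An empty filename and a fixed mtime keep the gzip header stable.
    with gzip.GzipFile(filename="", fileobj=out, mode="wb", mtime=timestamp) as gz:
        gz.write(raw.getvalue())
    return out.getvalue()


# Hypothetical inputs for illustration only.
files = {"README": b"hello\n", "src/main.c": b"int main(void) { return 0; }\n"}
timestamp = 1716422400

first = build_tarball(files, timestamp)
# Same content supplied in a different order, same timestamp.
second = build_tarball(dict(reversed(list(files.items()))), timestamp)
# Same content, timestamp off by a single second.
shifted = build_tarball(files, timestamp + 1)
```

Here first and second are identical, whereas shifted differs, which is
why a verifier needs the exact release timestamp to regenerate a
matching tarball.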
Furthermore, Fay Stegerman filed a bug [35] against the Signal messenger
app for Android [36] to report that their 'reproducible' builds cannot,
in fact, be reproduced. However, Fay is quick to note that she has:
> … found zero evidence of any kind of compromise. Some differences
> are yet unexplained but everything I found seems to be benign. I am
> disappointed that Reproducible Builds have been broken for months
> but I have zero reason to doubt Signal's security in any way.
[35] https://github.com/signalapp/Signal-Android/issues/13565
[36] https://github.com/signalapp/Signal-Android
Lastly, it was observed that there is a concise and diagrammatic
overview of "supply chain threats [37]" on the SLSA [38] website.
[37] https://slsa.dev/spec/v1.0/threats-overview
[38] https://slsa.dev/
§
Two new academic papers
-----------------------
Two new scholarly papers were published this month.
Firstly, Mathieu Acher, Benoît Combemale, Georges Aaron Randrianaina and
Jean-Marc Jézéquel of the University of Rennes [39] published
"Embracing Deep Variability For Reproducibility & Replicability" [40].
The authors describe their approach as follows:
> In this short [vision] paper we delve into the application of
> software engineering techniques, specifically variability
> management, to systematically identify and explicit points of
> variability that may give rise to reproducibility issues (e.g.,
> language, libraries, compiler, virtual machine, OS, environment
> variables, etc.). The primary objectives are: i) gaining insights
> into the variability layers and their possible interactions, ii)
> capturing and documenting configurations for the sake of
> reproducibility, and iii) exploring diverse configurations to
> replicate, and hence validate and ensure the robustness of results.
> By adopting these methodologies, we aim to address the complexities
> associated with reproducibility and replicability in modern software
> systems and environments, facilitating a more comprehensive and
> nuanced perspective on these critical aspects.
A PDF of this article is available [41].
[39] https://www.univ-rennes.fr/en/welcome-university-rennes
[40] https://hal.science/hal-04582287
[41] https://hal.science/hal-04582287/document
Secondly, Ludovic Courtès, Timothy Sample, Simon Tournier and Stefano
Zacchiroli have collaborated to publish a paper on "Source Code
Archiving to the Rescue of Reproducible Deployment" [42]. The authors
motivate their paper as follows:
> The ability to verify research results and to experiment with
> methodologies are core tenets of science. As research results are
> increasingly the outcome of computational processes, software plays
> a central role. GNU Guix [43] is a software deployment tool that
> supports reproducible software deployment, making it a foundation
> for computational research workflows. To achieve reproducibility, we
> must first ensure the source code of software packages Guix deploys
> remains available.
A PDF of this article [44] is also available.
[42] https://hal.science/hal-04586520
[43] https://guix.gnu.org/
[44] https://hal.science/hal-04586520/document
§
diffoscope
----------
diffoscope [46] is our in-depth and content-aware diff utility that can
locate and diagnose reproducibility issues. This month, Chris Lamb
uploaded versions 266, 267, 268 and 269 to Debian, which included the
following changes:
* New features:
    * Use xz --list to supplement output when comparing .xz archives;
      essential when metadata differs. (#1069329 [47])
    * Include xz --verbose --verbose (i.e. double) output.
      (#1069329 [48])
    * Strip the first line from the xz --list output. [49]
    * Only include xz --list --verbose output if the xz has no other
      differences. [50]
    * Actually append the xz --list after the container differences,
      as it simplifies a lot. [51]
* Testing improvements:
    * Allow Debian testing to fail right now. [52]
    * Drop apktool from Build-Depends; we can still test APK
      functionality via autopkgtests. (#1071410 [53])
    * Add a versioned dependency for at least version 5.4.5 for the xz
      tests as they fail under (at least) version 5.2.8. (#374 [54])
    * Fix tests for 7zip 24.05. [55][56]
    * Fix all tests after addition of xz --list. [57][58]
* Misc:
    * Update copyright years. [59]
[46] https://diffoscope.org
[47] https://bugs.debian.org/1069329
[48] https://bugs.debian.org/1069329
[49] https://salsa.debian.org/reproducible-builds/diffoscope/commit/ac8a5070
[50] https://salsa.debian.org/reproducible-builds/diffoscope/commit/52919364
[51] https://salsa.debian.org/reproducible-builds/diffoscope/commit/2acff705
[52] https://salsa.debian.org/reproducible-builds/diffoscope/commit/0a5a5cc3
[53] https://bugs.debian.org/1071410
[54] https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/374
[55] https://salsa.debian.org/reproducible-builds/diffoscope/commit/31a6a56a
[56] https://salsa.debian.org/reproducible-builds/diffoscope/commit/9b421991
[57] https://salsa.debian.org/reproducible-builds/diffoscope/commit/a6651ded
[58] https://salsa.debian.org/reproducible-builds/diffoscope/commit/8443cb8c
[59] https://salsa.debian.org/reproducible-builds/diffoscope/commit/1e782e18
In addition, James Addison fixed an issue where the HTML output showed
only the first difference in a file, while the text output shows all
differences [60][61][62], and Sergei Trofimovich amended the 7zip
version test for older 7z versions that include the string "[64]"
[63][64]. Vagrant Cascadian relaxed the versioned dependency to allow
version 5.4.1 for the xz tests [65], proposed updates to Guix for
versions 267 and 268 [66], and pushed version 269 to Guix [67].
[60] https://salsa.debian.org/reproducible-builds/diffoscope/commit/4a685bbb
[61] https://salsa.debian.org/reproducible-builds/diffoscope/commit/e976c352
[62] https://salsa.debian.org/reproducible-builds/diffoscope/commit/067a8d1c
[63] https://salsa.debian.org/reproducible-builds/diffoscope/commit/2a361d7d
[64] https://salsa.debian.org/reproducible-builds/diffoscope/commit/614c1b2c
[65] https://salsa.debian.org/reproducible-builds/diffoscope/commit/fd7eed75
[66] https://issues.guix.gnu.org/71024
[67] https://git.savannah.gnu.org/cgit/guix.git/commit/?id=c7888f5361fbdbe5182e7dbe90ccc12e2d95d3c3
Furthermore, Eli Schwartz updated the diffoscope.org website [68] in
order to explain how to install diffoscope on Gentoo [69].
[68] https://diffoscope.org/
[69] https://salsa.debian.org/reproducible-builds/diffoscope-website/commit/a58b28f
§
Website updates
---------------
There were a number of improvements made to our website this month,
including Chris Lamb making the "print" CSS stylesheet nicer [70]. Fay
Stegerman made a number of updates to the page about the
SOURCE_DATE_EPOCH environment variable [71][72][73][74] and Holger
Levsen added some of their presentations to the "Resources" page [75].
[70] https://salsa.debian.org/reproducible-builds/reproducible-website/commit/77b997d1
[71] https://reproducible-builds.org/docs/source-date-epoch/
[72] https://salsa.debian.org/reproducible-builds/reproducible-website/commit/db010718
[73] https://salsa.debian.org/reproducible-builds/reproducible-website/commit/c41667e6
[74] https://salsa.debian.org/reproducible-builds/reproducible-website/commit/aa5e9da9
[75] https://reproducible-builds.org/docs/resources/
Furthermore, IOhannes zmölnig noted the support for SOURCE_DATE_EPOCH
in clang [76] version 16.0.0+ [77], Jan Zerebecki expanded the "Formal
definition [78]" page and fixed a number of typos on the "Buy-in [79]"
page [80] and Simon Josefsson fixed the link to Trisquel GNU/Linux [81]
on the "Projects [82]" page [83].
[76] https://clang.llvm.org/
[77] https://salsa.debian.org/reproducible-builds/reproducible-website/commit/7489a013
[78] https://reproducible-builds.org/docs/formal-definition/
[79] https://reproducible-builds.org/docs/buy-in/
[80] https://salsa.debian.org/reproducible-builds/reproducible-website/commit/a5ab5d35
[81] https://trisquel.info/
[82] https://reproducible-builds.org/who/projects/
[83] https://salsa.debian.org/reproducible-builds/reproducible-website/commit/438f0ff9
§
Upstream patches
----------------
This month, we wrote a number of patches to fix specific reproducibility
issues, including:
* Bernhard M. Wiedemann:
    * nauty [84] (CPU-detection issue)
    * emacs [85] (ASLR)
[84] https://bugzilla.opensuse.org/show_bug.cgi?id=1225415
[85] https://mail.gnu.org/archive/html/emacs-devel/2024-05/msg01026.html
* Chris Lamb:
    * #1070754 [86] filed against gensio [87].
    * #1071064 [88] filed against tkgate [89].
    * #1072094 [90] filed against ruby-pgplot [91].
[86] https://bugs.debian.org/1070754
[87] https://tracker.debian.org/pkg/gensio
[88] https://bugs.debian.org/1071064
[89] https://tracker.debian.org/pkg/tkgate
[90] https://bugs.debian.org/1072094
[91] https://tracker.debian.org/pkg/ruby-pgplot
§
Reproducibility testing framework
---------------------------------
The Reproducible Builds project operates a comprehensive testing
framework running primarily at tests.reproducible-builds.org [92] in
order to check packages and other artifacts for reproducibility. In May,
a number of changes were made by Holger Levsen:
* Debian-related changes:
    * Enable the rebuilder-snapshot API on osuosl4. [94]
    * Schedule the i386 architecture a bit more often. [95]
    * Adapt cleanup_nodes.sh to the new way of running our build
      services. [96]
    * Add 8 more workers for the i386 architecture. [97]
    * Update configuration now that the infom07 and infom08 nodes have
      been reinstalled as "real" i386 systems. [98]
    * Make diffoscope [99] timeouts more visible on the
      #debian-reproducible-changes IRC channel. [100]
    * Mark the cbxi4a-armhf node as down. [101][102]
    * Only install the hdmi2usb-mode-switch package on Debian bookworm
      and earlier [103] and only install the haskell-platform package
      on Debian bullseye [104].
[92] https://tests.reproducible-builds.org
[94] https://salsa.debian.org/qa/jenkins.debian.net/commit/8cf39a1d7
[95] https://salsa.debian.org/qa/jenkins.debian.net/commit/3af751f20
[96] https://salsa.debian.org/qa/jenkins.debian.net/commit/127890236
[97] https://salsa.debian.org/qa/jenkins.debian.net/commit/33d7eca13
[98] https://salsa.debian.org/qa/jenkins.debian.net/commit/c9ebd1d46
[99] https://diffoscope.org
[100] https://salsa.debian.org/qa/jenkins.debian.net/commit/9f3bcea14
[101] https://salsa.debian.org/qa/jenkins.debian.net/commit/2fa8b2402
[102] https://salsa.debian.org/qa/jenkins.debian.net/commit/a94ec1db5
[103] https://salsa.debian.org/qa/jenkins.debian.net/commit/14dbf963a
[104] https://salsa.debian.org/qa/jenkins.debian.net/commit/1574a0fdf
* Misc:
    * Install the ntpdate utility as we need it later. [105]
    * Document the progress on the i386 architecture nodes at
      Infomaniak [106]. [107]
    * Drop an outdated and unnoticed notice. [108]
    * Add live_setup_schroot to the list of so-called "zombie"
      jobs. [109]
[105] https://salsa.debian.org/qa/jenkins.debian.net/commit/c1c3e6862
[106] https://www.infomaniak.com/en
[107] https://salsa.debian.org/qa/jenkins.debian.net/commit/d34fd5c04
[108] https://salsa.debian.org/qa/jenkins.debian.net/commit/1388c366c
[109] https://salsa.debian.org/qa/jenkins.debian.net/commit/6c048aa94
In addition, Mattia Rizzolo reinstalled the infom07 and infom08 nodes
[110] and Vagrant Cascadian marked the cbxi4a node as online [111].
[110] https://salsa.debian.org/qa/jenkins.debian.net/commit/f79acda69
[111] https://salsa.debian.org/qa/jenkins.debian.net/commit/29da0a918
§
If you are interested in contributing to the Reproducible Builds
project, please visit our "Contribute" [112] page on our website.
You can also get in touch with us via:
* IRC: #reproducible-builds on irc.oftc.net.
* Twitter: @ReproBuilds [113]
* Mastodon: @reproducible_builds at fosstodon.org [114]
* Mailing list: rb-general at lists.reproducible-builds.org [115]
[112] https://reproducible-builds.org/contribute/
[113] https://twitter.com/ReproBuilds
[114] https://fosstodon.org/@reproducible_builds
[115] https://lists.reproducible-builds.org/listinfo/rb-general
--
o
⬋ ⬊
o o reproducible-builds.org 💠
⬊ ⬋
o