Mapping Reproducibility Bug Reports to Commits

Vagrant Cascadian vagrant at
Sun Nov 14 18:48:33 GMT 2021

On 2021-11-14, Muhammad Hassan wrote:
> I am a researcher at the University of Waterloo, conducting a project
> to study reproducibility issues in Debian packages.

Great to hear!

> The first step for me is to link each Reproducibility-related bug at
> this link:
> to the corresponding commit that fixed the bug.
> However, I am unable to find an explicit way of doing so programmatically. Please assist.

This is, unfortunately, a non-trivial task; this information is not
directly exposed anywhere programmatically (to my knowledge).

Two approaches come to mind: mining the VCS history and parsing the bug
report mbox files.

Mining the VCS history:

Most packages in Debian are maintained in some sort of VCS, most of
which are git, and most of those are on, and most of
the time the VCS is updated in sync with the uploaded package...

I think your best bet is to parse the debian/control file to get the
Vcs-* fields and then search the commit logs of the corresponding
repositories for lines like:

Closes: #NNN

(Closes: #NNN)

Closes: NNN
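
A rough sketch of this approach in Python, under a few assumptions: the
package uses git, you already have a local clone, and the "Closes"
pattern below (modelled on the one commonly used for debian/changelog
entries) is close enough to what maintainers write in commit messages.
The function names are hypothetical and only for illustration.

```python
import re
import subprocess

# Matches fields such as "Vcs-Git: https://..." in debian/control.
VCS_FIELD = re.compile(r"^(Vcs-[A-Za-z]+):[ \t]*(\S+)", re.MULTILINE)

# Matches "Closes: #123", "(Closes: #123)", "Closes: 123" and "Closes: #1, #2".
CLOSES = re.compile(
    r"closes:\s*((?:bug)?#?\s?\d+(?:,\s*(?:bug)?#?\s?\d+)*)", re.IGNORECASE
)


def vcs_fields(control_text):
    """Return a dict of Vcs-* field names to their first token (the URL)."""
    return {name: url for name, url in VCS_FIELD.findall(control_text)}


def closed_bugs(message):
    """Return the bug numbers that a commit message claims to close."""
    bugs = []
    for match in CLOSES.finditer(message):
        bugs.extend(int(n) for n in re.findall(r"\d+", match.group(1)))
    return bugs


def commits_closing_bugs(repo_dir):
    """Yield (commit_hash, [bug numbers]) for a local git clone."""
    # %H = commit hash, %B = raw message; NUL and RS bytes separate the fields.
    log = subprocess.run(
        ["git", "-C", repo_dir, "log", "--format=%H%x00%B%x1e"],
        check=True, capture_output=True, encoding="utf-8", errors="replace",
    ).stdout
    for record in log.split("\x1e"):
        record = record.strip()
        if not record:
            continue
        commit, _, body = record.partition("\x00")
        bugs = closed_bugs(body)
        if bugs:
            yield commit, bugs
```

Packages that use another VCS, or whose maintainers only mention the bug
in debian/changelog, would need separate handling.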

For some packages this will not be possible, but it would be good to get
the data of those that do have a corresponding commit and those that
do not.
Parsing the bug report logs:

Another approach is to download the mbox file from each bug report, some
of which include auto-generated messages from with the
specific commit where it is marked as pending:

Note, though, that being marked as pending and being included in Debian
are two different things; depending on the nature of your study you may
need to confirm that the fix was actually uploaded!
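
As a minimal sketch of the mbox approach, the Python below scans a
downloaded mbox file for anything that looks like a commit URL. It
assumes the auto-generated messages contain a URL with "/commit/"
followed by a hexadecimal hash; that pattern and the function names are
my assumptions, not a documented format.

```python
import mailbox
import re

# Assumed shape of a commit link: any URL containing /commit/<hex hash>.
COMMIT_URL = re.compile(r"https?://\S+/commit/[0-9a-f]{7,40}")


def message_text(msg):
    """Concatenate the decoded text/plain parts of one message."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if not payload:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            parts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset name: fall back to UTF-8.
            parts.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def commit_urls_in_mbox(path):
    """Return the distinct commit URLs found in an mbox file, in order."""
    found = []
    for msg in mailbox.mbox(path, create=False):
        for url in COMMIT_URL.findall(message_text(msg)):
            if url not in found:
                found.append(url)
    return found
```

A URL found this way only shows that a commit was referenced in the bug
log; it says nothing about whether that commit reached the archive.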

live well,

More information about the Reproducible-builds mailing list