[Reproducible-builds] build paths and debug info for generated C objects
Daniel Kahn Gillmor
dkg at fifthhorseman.net
Sat Dec 5 02:38:54 UTC 2015
Hey all--
Niels, anthraxx and i were just commiserating about the fact that we're
punting on reproducibility of the build path. We think we might have
found a way to make progress on this.
Problem Statement
-----------------
One of the main concerns about the build path is that it gets included
by gcc in any generated DWARF [0] debugging symbols, specifically in the
DWARF attribute named DW_AT_comp_dir.
Background
----------
gcc already allows the user to tweak this attribute directly:
    -fdebug-prefix-map=old=new
        When compiling files in directory old, record debugging
        information describing them as in new instead.
So, for example, i can do:
    gcc -g -fdebug-prefix-map=$(pwd)=. -o test test.c
gdb still works for me when debugging code that is built this way.
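
To make the remapping concrete, here is a rough C sketch of what the option amounts to for a single path. It is not gcc's actual code: remap_prefix is a hypothetical helper, and the sketch assumes a plain string-prefix comparison with no special handling of path separators.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not taken from gcc): if `path` begins with
 * `old_prefix`, return a newly allocated string in which that prefix is
 * replaced by `new_prefix`. Otherwise return a newly allocated copy of
 * `path`. The caller frees the result; NULL means allocation failed. */
char *remap_prefix(const char *path, const char *old_prefix,
                   const char *new_prefix)
{
    size_t old_len = strlen(old_prefix);
    size_t new_len, rest_len;
    char *out;

    if (strncmp(path, old_prefix, old_len) != 0) {
        /* Prefix does not match: keep the path as it is. */
        size_t len = strlen(path);
        out = malloc(len + 1);
        if (out != NULL)
            memcpy(out, path, len + 1);
        return out;
    }

    /* Prefix matches: emit new_prefix followed by the rest of the path. */
    new_len = strlen(new_prefix);
    rest_len = strlen(path + old_len);
    out = malloc(new_len + rest_len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, new_prefix, new_len);
    memcpy(out + new_len, path + old_len, rest_len + 1); /* copies the NUL */
    return out;
}
```

With made-up paths, remap_prefix("/home/user/proj/sub", "/home/user/proj", ".") gives "./sub", and a path outside the old prefix comes back unchanged.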
However, gcc also stores all the switches used during the build in
DW_AT_producer, so doing this just moves the build path to a different
DWARF attribute: it is still encoded in the output. This doesn't solve
the reproducibility problem, but it does give us a way to demonstrate
that removing the data from DW_AT_comp_dir doesn't cripple our ability
to debug.
We also observed that DW_AT_name already stores the name of the compiled
file relative to the DW_AT_comp_dir -- this poses no reproducibility
problems on its own.
Proposed Solution
-----------------
A minor change to gcc:
* if the "old" parameter for -fdebug-prefix-map starts with a literal $
  character, make gcc treat the rest of it as an environment variable
  name. So (note the shell escaping):
    export SOURCE_BUILD_DIR=$(pwd)
    gcc -g -fdebug-prefix-map=\$SOURCE_BUILD_DIR=. -o test.o -c test.c
  should do what we need: the gcc flags are static, and the build path
  is stripped.
  - What to do if the chosen env var isn't set? Probably just skip the
    match entirely, and maybe raise a warning.
  - What about bizarre theoretical filesystems that might have $ as a
    leading character? We don't know of any. We're willing to
    sacrifice them for this feature. :)
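
The lookup rule described above could be sketched roughly as follows. This is an illustration, not the attached patch: effective_old_prefix is a hypothetical name, and treating an empty variable like an unset one is an assumption of this sketch (an empty "old" prefix would otherwise match every path).

```c
#include <stdlib.h>

/* Hypothetical sketch of the proposed rule for the "old" half of
 * -fdebug-prefix-map=old=new:
 *   - no leading '$': use `old` literally, as gcc does today;
 *   - leading '$':    use the value of the named environment variable.
 * Returns NULL when the variable is unset (or empty, an assumption of
 * this sketch), meaning the caller should skip the mapping and perhaps
 * emit a warning. */
const char *effective_old_prefix(const char *old)
{
    const char *value;

    if (old[0] != '$')
        return old;              /* literal path */

    value = getenv(old + 1);     /* name is everything after the '$' */
    if (value == NULL || value[0] == '\0')
        return NULL;             /* nothing usable to match against */
    return value;
}
```

The command line then stays identical across build directories, since only the environment carries the path.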
I've patched GCC to work this way successfully:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: debug-prefix-map-from-env.patch
Type: text/x-diff
Size: 1754 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/reproducible-builds/attachments/20151205/05fd1d61/attachment.patch>
-------------- next part --------------
What do r-b people think about this? I'm happy to try to push this
patch to the gcc upstream if folks here think this sounds reasonable and
would address a real future r-b issue.
Alternate Solutions
-------------------
We considered and discarded several other possible solutions, which i'm
noting below, along with the downsides that led us to select the one we
chose:
* ask gcc to not record -fdebug-prefix-map options in DW_AT_producer
  - it's weird that some options wouldn't be recorded and some
    would.
  - build systems would need to set dynamic CFLAGS to be able to
    use this approach. debian can do this in dpkg-buildpackage, but
    apparently it's tougher on Arch (though Arch can more easily
    set dynamic environment variables).
We also had three different ideas for new gcc flags, all of which share
the problem that adding a new gcc option means any attempt to apply this
prefix map would fail hard when using a non-updated gcc:
* gcc -fdebug-prefix-map-from-env=NEW
This evaluates a specific, fixed environment variable like
SOURCE_BUILD_ROOT as the "old" part of the prefix map.
- asking gcc to adopt a new magic environment variable seems
sketchy.
* gcc -fdebug-prefix-map-from-env=ENVVAR=NEW
  This does the same thing as the main proposal, but it uses a
  distinct flag and doesn't expect the leading $ prefix. e.g.
    gcc -fdebug-prefix-map-from-env=SOURCE_BUILD_ROOT=.
* gcc -fdebug-force-path-to=NEW
This approach just forces the value of all generated DW_AT_comp_dir
attributes, which might be overkill.
- this fails to record the paths relative to the build directory in
the event that a recursive descent build pattern (a tree of
Makefiles) is used. That is, if the top level Makefile does both
"make -C src1" and "make -C src2", then debug info from files
named foo.c in each directory will be indistinguishable, even
within the same project.
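
The collision described for the forced-path idea can be modelled in a few lines of C. This is only a toy model under stated assumptions: struct cu_id and its helpers are hypothetical, /build/src1 and /build/src2 are made-up directories, and the two fields merely stand in for DW_AT_comp_dir and DW_AT_name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical model of what identifies one compilation unit. */
struct cu_id {
    char comp_dir[256];  /* stands in for DW_AT_comp_dir */
    char name[256];      /* stands in for DW_AT_name */
};

/* Prefix-map model: a leading build_root in cwd is replaced by
 * new_prefix, so the subdirectory part of cwd survives. */
void cu_with_prefix_map(struct cu_id *cu, const char *cwd, const char *file,
                        const char *build_root, const char *new_prefix)
{
    size_t n = strlen(build_root);

    if (strncmp(cwd, build_root, n) == 0)
        snprintf(cu->comp_dir, sizeof cu->comp_dir, "%s%s",
                 new_prefix, cwd + n);
    else
        snprintf(cu->comp_dir, sizeof cu->comp_dir, "%s", cwd);
    snprintf(cu->name, sizeof cu->name, "%s", file);
}

/* Forced-path model: every unit gets the same comp_dir, whatever
 * directory the compiler was actually run in. */
void cu_with_forced_path(struct cu_id *cu, const char *cwd, const char *file,
                         const char *forced)
{
    (void)cwd;  /* deliberately ignored: this is the information lost */
    snprintf(cu->comp_dir, sizeof cu->comp_dir, "%s", forced);
    snprintf(cu->name, sizeof cu->name, "%s", file);
}

/* Two units are indistinguishable if both fields are equal. */
int cu_same(const struct cu_id *a, const struct cu_id *b)
{
    return strcmp(a->comp_dir, b->comp_dir) == 0
        && strcmp(a->name, b->name) == 0;
}
```

Under this model, foo.c built in /build/src1 and in /build/src2 with the prefix map /build=. ends up with comp_dir ./src1 and ./src2 respectively. With a forced path of "." both units carry the same pair and cu_same reports them as equal.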
feedback welcome,
--dkg
[0] http://dwarfstd.org/doc/DWARF4.pdf