[Reproducible-builds] GCC patch reviewed. Proposed mail for gcc-patches mailing list

Dhole dhole at openmailbox.org
Mon Nov 9 00:02:40 UTC 2015

Hi all!

Back in summer I worked on a patch for GCC to honor the
SOURCE_DATE_EPOCH env variable when embedding timestamps with the C/C++
macros __DATE__ and __TIME__. I sent the patch to the gcc-patches
mailing list, and after some replies with suggestions and opinions it
didn't reach further.
In this year's DebConf I got in touch with doko (Debian gcc package
maintainer) and he offered to help me with the patch. I have addressed a
comment made by a gcc contributor in the original patch email thread, I
have organized it better and improved the style based on doko advices
and reviews. Before sending the patch to the gcc mailing list I'm
copying the message I intend to write here, in case anyone has comments
and/or improvements. Finally I'll wait for doko's approval before
sending it to the gcc-patches ML.




A while back I submitted a patch to the gcc-patches mailing list
to allow the embedded timestamps by C/C++ macros to be set externally
[0]. The attached patch adresses a comment from Joseph Myers <joseph at
codesourcery dot com>: move the getenv calls into the gcc/ directory;
and it also provides better organization and style following the advice
and comments of the current Debian gcc package maintainer Matthias Klose
<doko at debian dot org> (I'm sending the patch with him as co-author).

As a reminder for the motivation behind this patch, we are working on
the reproducible builds project which aims to provide users with a way
to reproduce bit-for-bit identical binary packages from the source and
build environment. The project involves Debian as well as several other
free software projects. See <https://reproducible-builds.org/> for more

Quoting from the original email:

> In order to do this, we need to make the build processes deterministic.
> As you can imagine, gcc is quite involved in producing Debian packages.
> One issue we encounter in many packages that fail to build reproducibly
> is the use of the __DATE__, __TIME__ C macros [1], right now we have 456
> affected packages that would need patching (either removing the macros,
> or passing a known date externally).

> A solution for toolchain packages that embed timestamps during the build
> process has been proposed for anyone interested and it consists of the
> following:
> The build environment can export an environment variable called
> SOURCE_DATE_EPOCH with a known timestamp in Unix epoch format (In our
> case, we use the last date of the package's debian changelog). The
> toolchain package running during the build can check if the exported
> variable is set and if so, instead of embedding the local date/time,
> embed the date/time from SOURCE_DATE_EPOCH.

The proposal to use SOURCE_DATE_EPOCH has now been gathered in a more
formal specification [2], so that any project can adhere to it to
achieve reproducible builds when dealing with timestamps.

> It would be very beneficial to our project (and other free software
> projects working on reproducible builds) if gcc supported this feature.

I'm attaching a patch for gcc-5.2.0 that enables this feature: it
modifies the behavior of the macros __DATE__ and __TIME__ when
SOURCE_DATE_EPOCH is exported.

Any comments, suggestions or other ideas to improve this patch are
mostly welcomed.

I'm willing to extend the documentation if the patch feels appropriate.

Thanks for your attention!

PD: Implementation details:

The error output in gcc/c-family/c-common.c (get_source_date_epoch) is
handled by an fprintf() to stderr followed by an exit (EXIT_FAILURE). I
am not sure this is the right approach for error handling, as I have
found many usages of the error() function in the code. If implemented
with error(), the output looks like this:

$ SOURCE_DATE_EPOCH=123a123 g++ test_cpp.cpp -o test_cpp
In file included from <command-line>:0:0:
/usr/include/stdc-predef.h:1:0: error: Environment variable
$SOURCE_DATE_EPOCH: Trailing garbage: a123

 /* Copyright (C) 1991-2014 Free Software Foundation, Inc.

For this reason I used fprintf(stderr, "error message") to report the

[0] https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02210.html
[1] https://wiki.debian.org/ReproducibleBuilds/TimestampsFromCPPMacros
[2] https://reproducible-builds.org/specs/source-date-epoch/

Best regards,
-------------- next part --------------

2015-10-10  Matthias Klose  <doko at debian.org>
	    Eduard Sanou  <dhole at openmailbox.org>
	* gcc/c-family/c-common.c (get_source_date_epoch): New function, gets 
	the environment variable SOURCE_DATE_EPOCH and parses it as long long
	with error handling.
	* gcc/c-family/c-common.h (get_source_date_epoch): Prototype.
	* gcc/c-family/c-lex.c (c_lex_with_flags): set 


2015-10-10  Matthias Klose  <doko at debian.org>
	    Eduard Sanou  <dhole at openmailbox.org>
    	* libcpp/include/cpplib.h (cpp_init_source_date_epoch): Prototype.
	* libcpp/init.c (cpp_init_source_date_epoch): New function.
	* libcpp/internal.h: Added source_date_epoch variable to struct
	cpp_reader to store a reproducible date.
	* libcpp/macro.c (_cpp_builtin_macro_text): Set pfile->date timestamp
	from pfile->source_date_epoch instead of localtime if source_date_epoch
	is set, to be used for __DATE__ and __TIME__ macros to help
	reproducible builds.
-------------- next part --------------
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 7fe7fa6..762297d 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -12318,4 +12318,51 @@ pointer_to_zero_sized_aggr_p (tree t)
   return (TYPE_SIZE (t) && integer_zerop (TYPE_SIZE (t)));
+/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
+   timestamp to replace embedded current dates to get reproducible
+   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
+long long
+  char *source_date_epoch;
+  unsigned long long epoch;
+  char *endptr;
+  source_date_epoch = getenv ("SOURCE_DATE_EPOCH");
+  if (source_date_epoch)
+    {
+      errno = 0;
+      epoch = strtoull (source_date_epoch, &endptr, 10);
+      if ((errno == ERANGE && (epoch == ULLONG_MAX || epoch == 0))
+	  || (errno != 0 && epoch == 0))
+	{
+	  fprintf (stderr, "environment variable $SOURCE_DATE_EPOCH: strtoull: \
+%s\n", xstrerror(errno));
+	  exit (EXIT_FAILURE);
+	}
+      if (endptr == source_date_epoch)
+	{
+	  fprintf (stderr, "environment variable $SOURCE_DATE_EPOCH: \
+No digits were found: %s\n", endptr);
+	  exit (EXIT_FAILURE);
+	}
+      if (*endptr != '\0')
+	{
+	  fprintf (stderr, "environment variable $SOURCE_DATE_EPOCH: \
+Trailing garbage: %s\n", endptr);
+	  exit (EXIT_FAILURE);
+        }
+      if (epoch > ULONG_MAX)
+	{
+	  fprintf (stderr, "environment variable $SOURCE_DATE_EPOCH: \
+value must be smaller than or equal to: %lu but was found to be: \
+%llu \n", ULONG_MAX, epoch);
+	  exit (EXIT_FAILURE);
+	}
+      return (long long) epoch;
+    }
+  else
+    return -1;
 #include "gt-c-family-c-common.h"
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index cabf452..4b55277 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1437,4 +1437,10 @@ extern bool contains_cilk_spawn_stmt (tree);
 extern tree cilk_for_number_of_iterations (tree);
 extern bool check_no_cilk (tree, const char *, const char *,
 		           location_t loc = UNKNOWN_LOCATION);
+/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
+   timestamp to replace embedded current dates to get reproducible
+   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
+extern long long get_source_date_epoch();
 #endif /* ! GCC_C_COMMON_H */
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index bb55be8..9344f66 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -402,6 +402,10 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
   enum cpp_ttype type;
   unsigned char add_flags = 0;
   enum overflow_type overflow = OT_NONE;
+  long long source_date_epoch = -1;
+  source_date_epoch = get_source_date_epoch();
+  cpp_init_source_date_epoch(parse_in, source_date_epoch);
   timevar_push (TV_CPP);
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 5e08014..9eee80f 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -775,6 +775,9 @@ extern void cpp_init_special_builtins (cpp_reader *);
 /* Set up built-ins like __FILE__.  */
 extern void cpp_init_builtins (cpp_reader *, int);
+/* Initialize the source_date_epoch value.  */
+extern void cpp_init_source_date_epoch (cpp_reader *, long long);
 /* This is called after options have been parsed, and partially
    processed.  */
 extern void cpp_post_options (cpp_reader *);
diff --git a/libcpp/init.c b/libcpp/init.c
index 45a4d13..dfde0a1 100644
--- a/libcpp/init.c
+++ b/libcpp/init.c
@@ -530,6 +530,13 @@ cpp_init_builtins (cpp_reader *pfile, int hosted)
     _cpp_define_builtin (pfile, "__OBJC__ 1");
+/* Initialize the source_date_epoch value.  */
+cpp_init_source_date_epoch (cpp_reader *pfile, long long source_date_epoch)
+  pfile->source_date_epoch = source_date_epoch; 
 /* Sanity-checks are dependent on command-line options, so it is
    called as a subroutine of cpp_read_main_file ().  */
diff --git a/libcpp/internal.h b/libcpp/internal.h
index c2d0816..4363d96 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -502,6 +502,10 @@ struct cpp_reader
   const unsigned char *date;
   const unsigned char *time;
+  /* Externally set timestamp to replace current date and time useful for
+     reproducibility.  */
+  long long source_date_epoch;
   /* EOF token, and a token forcing paste avoidance.  */
   cpp_token avoid_paste;
   cpp_token eof;
diff --git a/libcpp/macro.c b/libcpp/macro.c
index 1e0a0b5..42beb9c 100644
--- a/libcpp/macro.c
+++ b/libcpp/macro.c
@@ -350,13 +350,20 @@ _cpp_builtin_macro_text (cpp_reader *pfile, cpp_hashnode *node)
 	  time_t tt;
 	  struct tm *tb = NULL;
-	  /* (time_t) -1 is a legitimate value for "number of seconds
-	     since the Epoch", so we have to do a little dance to
-	     distinguish that from a genuine error.  */
-	  errno = 0;
-	  tt = time(NULL);
-	  if (tt != (time_t)-1 || errno == 0)
-	    tb = localtime (&tt);
+	  /* Set a reproducible timestamp for __DATE__ and __TIME__ macro
+	     usage if SOURCE_DATE_EPOCH is defined.  */
+	  if (pfile->source_date_epoch != -1)
+	     tb = gmtime ((time_t*) &pfile->source_date_epoch);
+	  else
+	    {
+	      /* (time_t) -1 is a legitimate value for "number of seconds
+		 since the Epoch", so we have to do a little dance to
+		 distinguish that from a genuine error.  */
+	      errno = 0;
+	      tt = time (NULL);
+	      if (tt != (time_t)-1 || errno == 0)
+		tb = localtime (&tt);
+	    }
 	  if (tb)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: OpenPGP digital signature
URL: <http://lists.alioth.debian.org/pipermail/reproducible-builds/attachments/20151109/702fd4e0/attachment.sig>

More information about the Reproducible-builds mailing list