[Reproducible-builds] [PATCH] Allow embedded timestamps by C/C++ macros to be set externally (2)

Dhole dhole at openmailbox.org
Sun Nov 15 22:14:01 UTC 2015


Hi!

A while back I submitted a patch to the gcc-patches mailing list
to allow the embedded timestamps by C/C++ macros to be set externally
[0]. The attached patch addresses a comment from Joseph Myers <joseph at
codesourcery dot com>: move the getenv calls into the gcc/ directory;
and it also provides better organization and style following the advice
and comments of the current Debian gcc package maintainer Matthias Klose
<doko at debian dot org> (I'm sending the patch with him as co-author).

As a reminder for the motivation behind this patch, we are working on
the reproducible builds project which aims to provide users with a way
to reproduce bit-for-bit identical binary packages from the source and
build environment. The project involves Debian as well as several other
free software projects. See <https://reproducible-builds.org/> for more
information.

Quoting from the original email:

> In order to do this, we need to make the build processes deterministic.
> As you can imagine, gcc is quite involved in producing Debian packages.
> One issue we encounter in many packages that fail to build reproducibly
> is the use of the __DATE__, __TIME__ C macros [1], right now we have 456
> affected packages that would need patching (either removing the macros,
> or passing a known date externally).

> A solution for toolchain packages that embed timestamps during the build
> process has been proposed for anyone interested and it consists of the
> following:
> The build environment can export an environment variable called
> SOURCE_DATE_EPOCH with a known timestamp in Unix epoch format (In our
> case, we use the last date of the package's debian changelog). The
> toolchain package running during the build can check if the exported
> variable is set and if so, instead of embedding the local date/time,
> embed the date/time from SOURCE_DATE_EPOCH.

The proposal to use SOURCE_DATE_EPOCH has now been gathered in a more
formal specification [2], so that any project can adhere to it to
achieve reproducible builds when dealing with timestamps.

> It would be very beneficial to our project (and other free software
> projects working on reproducible builds) if gcc supported this feature.

I'm attaching a patch for gcc-5.2.0 that enables this feature: it
modifies the behavior of the macros __DATE__ and __TIME__ when
SOURCE_DATE_EPOCH is exported.

Any comments, suggestions or other ideas to improve this patch are
mostly welcomed.

I'm willing to extend the documentation if the patch feels appropriate.

Thanks for your attention!


PD: Implementation details:

The error output in gcc/c-family/c-common.c (get_source_date_epoch) is
handled by an fprintf() to stderr followed by an exit (EXIT_FAILURE). I
am not sure this is the right approach for error handling, as I have
found many usages of the error() function in the code. If implemented
with error(), the output looks like this:

"""
$ SOURCE_DATE_EPOCH=123a123 g++ test_cpp.cpp -o test_cpp
In file included from <command-line>:0:0:
/usr/include/stdc-predef.h:1:0: error: Environment variable
$SOURCE_DATE_EPOCH: Trailing garbage: a123

 /* Copyright (C) 1991-2014 Free Software Foundation, Inc.
 ^
"""

For this reason I used fprintf(stderr, "error message") to report the
errors.


[0] https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02210.html
[1] https://wiki.debian.org/ReproducibleBuilds/TimestampsFromCPPMacros
[2] https://reproducible-builds.org/specs/source-date-epoch/

Best regards,
Dhole
-------------- next part --------------
gcc/c-family/ChangeLog:

2015-10-10  Eduard Sanou  <dhole at openmailbox.org>
	    Matthias Klose  <doko at debian.org>
	* gcc/c-family/c-common.c (get_source_date_epoch): New function, gets 
	the environment variable SOURCE_DATE_EPOCH and parses it as long long
	with error handling.
	* gcc/c-family/c-common.h (get_source_date_epoch): Prototype.
	* gcc/c-family/c-lex.c (c_lex_with_flags): set 
	parse_in->source_date_epoch.

libcpp/ChangeLog:

2015-10-10  Eduard Sanou  <dhole at openmailbox.org>
	    Matthias Klose  <doko at debian.org>
	* libcpp/include/cpplib.h (cpp_init_source_date_epoch): Prototype.
	* libcpp/init.c (cpp_init_source_date_epoch): New function.
	* libcpp/internal.h: Added source_date_epoch variable to struct
	cpp_reader to store a reproducible date.
	* libcpp/macro.c (_cpp_builtin_macro_text): Set pfile->date timestamp
	from pfile->source_date_epoch instead of localtime if source_date_epoch
	is set, to be used for __DATE__ and __TIME__ macros to help
	reproducible builds.
-------------- next part --------------
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 7fe7fa6..f795871 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -12318,4 +12318,49 @@ pointer_to_zero_sized_aggr_p (tree t)
   return (TYPE_SIZE (t) && integer_zerop (TYPE_SIZE (t)));
 }
 
+/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
+   timestamp to replace embedded current dates to get reproducible
+   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
+long long
+get_source_date_epoch()
+{
+  char *source_date_epoch;
+  unsigned long long epoch;
+  char *endptr;
+
+  source_date_epoch = getenv ("SOURCE_DATE_EPOCH");
+  if (!source_date_epoch)
+    return -1
+
+  errno = 0;
+  epoch = strtoull (source_date_epoch, &endptr, 10);
+  if ((errno == ERANGE && (epoch == ULLONG_MAX || epoch == 0))
+      || (errno != 0 && epoch == 0))
+    {
+      fprintf (stderr, "environment variable $SOURCE_DATE_EPOCH: strtoull: \
+%s\n", xstrerror(errno));
+      exit (EXIT_FAILURE);
+    }
+  if (endptr == source_date_epoch)
+    {
+      fprintf (stderr, "environment variable $SOURCE_DATE_EPOCH: \
+No digits were found: %s\n", endptr);
+      exit (EXIT_FAILURE);
+    }
+  if (*endptr != '\0')
+    {
+      fprintf (stderr, "environment variable $SOURCE_DATE_EPOCH: \
+Trailing garbage: %s\n", endptr);
+      exit (EXIT_FAILURE);
+    }
+  if (epoch > ULONG_MAX)
+    {
+      fprintf (stderr, "environment variable $SOURCE_DATE_EPOCH: \
+value must be smaller than or equal to: %lu but was found to be: \
+%llu \n", ULONG_MAX, epoch);
+      exit (EXIT_FAILURE);
+    }
+  return (long long) epoch;
+}
+
 #include "gt-c-family-c-common.h"
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index cabf452..4b55277 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1437,4 +1437,10 @@ extern bool contains_cilk_spawn_stmt (tree);
 extern tree cilk_for_number_of_iterations (tree);
 extern bool check_no_cilk (tree, const char *, const char *,
 		           location_t loc = UNKNOWN_LOCATION);
+
+/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
+   timestamp to replace embedded current dates to get reproducible
+   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
+extern long long get_source_date_epoch();
+
 #endif /* ! GCC_C_COMMON_H */
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index bb55be8..9344f66 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -402,6 +402,10 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
   enum cpp_ttype type;
   unsigned char add_flags = 0;
   enum overflow_type overflow = OT_NONE;
+  long long source_date_epoch = -1;
+
+  source_date_epoch = get_source_date_epoch();
+  cpp_init_source_date_epoch(parse_in, source_date_epoch);
 
   timevar_push (TV_CPP);
  retry:
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 5e08014..9eee80f 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -775,6 +775,9 @@ extern void cpp_init_special_builtins (cpp_reader *);
 /* Set up built-ins like __FILE__.  */
 extern void cpp_init_builtins (cpp_reader *, int);
 
+/* Initialize the source_date_epoch value.  */
+extern void cpp_init_source_date_epoch (cpp_reader *, long long);
+
 /* This is called after options have been parsed, and partially
    processed.  */
 extern void cpp_post_options (cpp_reader *);
diff --git a/libcpp/init.c b/libcpp/init.c
index 45a4d13..dfde0a1 100644
--- a/libcpp/init.c
+++ b/libcpp/init.c
@@ -530,6 +530,13 @@ cpp_init_builtins (cpp_reader *pfile, int hosted)
     _cpp_define_builtin (pfile, "__OBJC__ 1");
 }
 
+/* Initialize the source_date_epoch value.  */
+void
+cpp_init_source_date_epoch (cpp_reader *pfile, long long source_date_epoch)
+{
+  pfile->source_date_epoch = source_date_epoch; 
+}
+
 /* Sanity-checks are dependent on command-line options, so it is
    called as a subroutine of cpp_read_main_file ().  */
 #if ENABLE_CHECKING
diff --git a/libcpp/internal.h b/libcpp/internal.h
index c2d0816..4363d96 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -502,6 +502,10 @@ struct cpp_reader
   const unsigned char *date;
   const unsigned char *time;
 
+  /* Externally set timestamp to replace current date and time useful for
+     reproducibility.  */
+  long long source_date_epoch;
+
   /* EOF token, and a token forcing paste avoidance.  */
   cpp_token avoid_paste;
   cpp_token eof;
diff --git a/libcpp/macro.c b/libcpp/macro.c
index 1e0a0b5..42beb9c 100644
--- a/libcpp/macro.c
+++ b/libcpp/macro.c
@@ -350,13 +350,20 @@ _cpp_builtin_macro_text (cpp_reader *pfile, cpp_hashnode *node)
 	  time_t tt;
 	  struct tm *tb = NULL;
 
-	  /* (time_t) -1 is a legitimate value for "number of seconds
-	     since the Epoch", so we have to do a little dance to
-	     distinguish that from a genuine error.  */
-	  errno = 0;
-	  tt = time(NULL);
-	  if (tt != (time_t)-1 || errno == 0)
-	    tb = localtime (&tt);
+	  /* Set a reproducible timestamp for __DATE__ and __TIME__ macro
+	     usage if SOURCE_DATE_EPOCH is defined.  */
+	  if (pfile->source_date_epoch != -1)
+	     tb = gmtime ((time_t*) &pfile->source_date_epoch);
+	  else
+	    {
+	      /* (time_t) -1 is a legitimate value for "number of seconds
+		 since the Epoch", so we have to do a little dance to
+		 distinguish that from a genuine error.  */
+	      errno = 0;
+	      tt = time (NULL);
+	      if (tt != (time_t)-1 || errno == 0)
+		tb = localtime (&tt);
+	    }
 
 	  if (tb)
 	    {
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: OpenPGP digital signature
URL: <http://lists.alioth.debian.org/pipermail/reproducible-builds/attachments/20151115/be2ddafe/attachment.sig>


More information about the Reproducible-builds mailing list