[Pkg-utopia-maintainers] Bug#932381: libdbus-1-3: lidbus _dbus_marshal_write_basic uses implementation defined behaviour (unaligned read)

Witold Baryluk witold.baryluk at gmail.com
Thu Jul 18 16:15:09 BST 2019


Package: libdbus-1-3
Version: 1.12.16-1
Severity: important


Hi,

in libdbus in function _dbus_marshal_write_basic
(in file dbus-marshal-basic.c around lines 859-881).

for cases INT16, INT32, INT64 and DOUBLE, the function uses
implementation defined behaviour of converting void pointer to
integer pointer of bigger size, and dereferencing it.

This is implementation defined and generally not correct.

See specifically section 6.3.2.3 subpoints 5 and 6,
of C11 standard (or other ones, including older ones).

In case of gcc, the result is actually undefined behaviour.

Example:

const void *value;

const dbus_uint64_t *u64_p;

case DOUBLE:
  u64_p = value;
  return f2(...., *u64_p, ...);




If the value itself doesn't have a proper alignment, then behaviour is
undefined, (or implementation defined). This makes it unportable.


In this case, value is often not aligned to native uint64 alignement,
and dereference does lead to a CPU trap on some architectures
and some compilers.

I suggest to change all dereferences to types bigger than 1 byte to
use memcpy, like this:

case DOUBLE:
  memcpy(&tmp_u64, value, sizeof(tmp_u64));
  return f2(...., tmp_u64, ...);


Compilers are smart to inline memcpy, and optimize it very well, when the
the length is known at compile time like here.
On most architectures compiler will actually convert it to a single instruction!
Or use additional infered or instructed information about alignment,
to pesimize or optimize the code.

Example, aarch64 gcc 8.2

unsigned long int s1(void* p) {
  unsigned long int tmp
  memcpy(&tmp, v, 8);
  return tmp;
}
unsigned long int s2(void* p) {
  return *((unsigned long int*)p);
}

Compilation result:

s1(void*):
  ldr x0, [x0]
  ret

s2(void*):
  ldr x0, [x0]
  ret


Same generated code to a byte.


Similar results can be seen on i386, amd64 and ppc64 (depending which
model of Power CPU is targeted exactly). Both gcc and clang, even older
versions.

On 32-bit arm, it will pesimize correctly to read and combined aligned
reads (which is in total 3 extra instructions). On architectures like
32-bit ARM, and Alpha (64-bit), MIPS (both 32 and 64 bit), 32-bit PowerPC
the resulting code with memcpy will in fact be faster than current code,
because it will not trap and emulate read in kernel, which is very slow.


Regards,
Witold



-- System Information:
Debian Release: bullseye/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: alpha

Kernel: Linux 4.19.0-5-alpha-generic
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages libdbus-1-3 depends on:
ii  libc6.1      2.28-4
ii  libsystemd0  241-6+b2

Versions of packages libdbus-1-3 recommends:
ii  dbus  1.12.16-1

libdbus-1-3 suggests no packages.

-- no debconf information



More information about the Pkg-utopia-maintainers mailing list