Bug#883407: libc6: getpwnam_r() leaks memory

Sat Dec 9 16:03:33 GMT 2017

Control: forwarded -1 https://github.com/systemd/systemd/issues/7596

On Fri, Dec 8, 2017 at 4:15 PM, Aurelien Jarno <aurelien at aurel32.net> wrote:
> control: reassign -1 systemd
> control: retitle -1 libnss-systemd: _nss_systemd_getpwnam_r leaks memory
>
> Hi,
>
> On 2017-12-08 16:05, Tim Rühsen wrote:
>> On Tue, 5 Dec 2017 19:17:42 +0100 Aurelien Jarno <aurelien at aurel32.net>
>> wrote:
>> > It's not something I can reproduce here, but getpwnam_r can behave very
>> > differently depending on the nss configuration your system. A small
>> > reproducer and the content of /etc/nsswitch.conf would definitely help.
>> >
>> > That said libc6 version 2.25-3 included security fixes and memory leak
>> > fixes for the glob function. Can you confirm the version you used, and
>> > if it's really 2.25-3 try with version 2.25-2 which is still in testing.
>> Here we have a reproducer (assuming the there is no user 'O' on system).
>>
>> #include <sys/types.h>
>> #include <pwd.h>
>> int main(void)
>> {
>>         struct passwd *p;
>>         char tmp[1024];
>>         struct passwd pw;
>>
>>         getpwnam_r("O", &pw, tmp, sizeof(tmp), &p);
>>         return 0;
>> }
>>
>> Build/compile/reproduce:
>> gcc -g x.c -o x
>> valgrind --leak-check=full ./x
>
> Thanks for the reproducer, I am able to reproduce it now. For that it's
> also mandatory to enable libnss-systemd in /etc/nsswitch.conf.
>
> As it only happens with libnss-systemd, this library was suspicious.
> It's possible to LD_PRELOAD it so that valgrind can resolve the symbol
> names. This gives:
>
> | ==1397== HEAP SUMMARY:
> | ==1397==     in use at exit: 4,096 bytes in 1 blocks
> | ==1397==   total heap usage: 118 allocs, 117 frees, 27,975 bytes allocated
> | ==1397==
> | ==1397== 4,096 bytes in 1 blocks are still reachable in loss record 1 of 1
> | ==1397==    at 0x4C2ABEF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> | ==1397==    by 0x4E4B927: mempool_alloc_tile (mempool.c:62)
> | ==1397==    by 0x4E4B927: mempool_alloc0_tile (mempool.c:81)
> | ==1397==    by 0x4E4B927: hashmap_base_new.lto_priv.150 (hashmap.c:732)
> | ==1397==    by 0x4E60A1D: hashmap_base_ensure_allocated (hashmap.c:786)
> | ==1397==    by 0x4E60A1D: internal_ordered_hashmap_ensure_allocated (hashmap.c:799)
> | ==1397==    by 0x4E60A1D: sd_bus_call_async (sd-bus.c:1736)
> | ==1397==    by 0x4E60A1D: bus_send_hello (sd-bus.c:428)
> | ==1397==    by 0x4E60A1D: sd_bus_start (sd-bus.c:1000)
> | ==1397==    by 0x4E60A1D: sd_bus_open_system (sd-bus.c:1094)
> | ==1397==    by 0x4E491E5: _nss_systemd_getpwnam_r (nss-systemd.c:155)
> | ==1397==    by 0x512A715: getpwnam_r@@GLIBC_2.2.5 (getXXbyYY_r.c:314)
> | ==1397==    by 0x10867D: main (x.c:9)
>
> In that case the memory is still reachable because the libnss_systemd is
> not unloaded by the call to dlclose(). Looking at the systemd code, one
> can see that the mempool_drop function to free the memory pool is only
> enabled when VALGRIND is defined. It therefore looks like the systemd
> people are aware of the issue and consider the mempool doesn't have to
> be freed by default.
>
> It might be possible to fix that by a suppression in valgrind. That said
> I prefer to leave the decision to the systemd maintainer. I am therefore
> reassigning the bug there.

Normally this is not a problem since the mempool will live (and be
reused) as long as the calling process. This is not true for nss
modules, and thus I've forwarded this upstream.

Unfortunately just defining VALGRIND also appears to disable some
optimizations in the journal, so I don't think we can just enable it.

-- 

Saludos,
Felipe Sateler