[Nut-upsuser] upsd crashes with a "broken pipe" error
Zach La Celle
lacelle at roboticresearch.com
Tue Feb 15 19:07:08 UTC 2011
Resurrecting this problem, because I finally caught it in the debugger...
Here's the trace, with some GDB prints. Please excuse the length.
---------------------------------------------
...
545815.397326 mainloop: polling 4 filedescriptors
*** glibc detected *** /sbin/upsd: malloc(): memory corruption: 0x000000000061f300 ***
======= Backtrace: =========
/lib/libc.so.6(+0x775b6)[0x7ffff76ac5b6]
/lib/libc.so.6(+0x7b6d8)[0x7ffff76b06d8]
/lib/libc.so.6(__libc_malloc+0x6e)[0x7ffff76b158e]
/sbin/upsd[0x408e91]
/sbin/upsd[0x4091d9]
/sbin/upsd[0x409431]
/sbin/upsd[0x4097e2]
/sbin/upsd[0x409bdb]
/sbin/upsd[0x402a26]
/sbin/upsd[0x403789]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7ffff7653c4d]
/sbin/upsd[0x402079]
======= Memory map: ========
00400000-0040e000 r-xp 00000000 fb:00 8806463 /sbin/upsd
0060d000-0060e000 r--p 0000d000 fb:00 8806463 /sbin/upsd
0060e000-0060f000 rw-p 0000e000 fb:00 8806463 /sbin/upsd
0060f000-00630000 rw-p 00000000 00:00 0 [heap]
7ffff0000000-7ffff0021000 rw-p 00000000 00:00 0
7ffff0021000-7ffff4000000 ---p 00000000 00:00 0
7ffff6dfd000-7ffff6e13000 r-xp 00000000 fb:00 9093248 /lib/libgcc_s.so.1
7ffff6e13000-7ffff7012000 ---p 00016000 fb:00 9093248 /lib/libgcc_s.so.1
7ffff7012000-7ffff7013000 r--p 00015000 fb:00 9093248 /lib/libgcc_s.so.1
7ffff7013000-7ffff7014000 rw-p 00016000 fb:00 9093248 /lib/libgcc_s.so.1
7ffff7014000-7ffff7020000 r-xp 00000000 fb:00 5104002 /lib/libnss_files-2.11.1.so
7ffff7020000-7ffff721f000 ---p 0000c000 fb:00 5104002 /lib/libnss_files-2.11.1.so
7ffff721f000-7ffff7220000 r--p 0000b000 fb:00 5104002 /lib/libnss_files-2.11.1.so
7ffff7220000-7ffff7221000 rw-p 0000c000 fb:00 5104002 /lib/libnss_files-2.11.1.so
7ffff7221000-7ffff722b000 r-xp 00000000 fb:00 5104004 /lib/libnss_nis-2.11.1.so
7ffff722b000-7ffff742a000 ---p 0000a000 fb:00 5104004 /lib/libnss_nis-2.11.1.so
7ffff742a000-7ffff742b000 r--p 00009000 fb:00 5104004 /lib/libnss_nis-2.11.1.so
7ffff742b000-7ffff742c000 rw-p 0000a000 fb:00 5104004 /lib/libnss_nis-2.11.1.so
7ffff742c000-7ffff7434000 r-xp 00000000 fb:00 5104000 /lib/libnss_compat-2.11.1.so
7ffff7434000-7ffff7633000 ---p 00008000 fb:00 5104000 /lib/libnss_compat-2.11.1.so
7ffff7633000-7ffff7634000 r--p 00007000 fb:00 5104000 /lib/libnss_compat-2.11.1.so
7ffff7634000-7ffff7635000 rw-p 00008000 fb:00 5104000 /lib/libnss_compat-2.11.1.so
7ffff7635000-7ffff77af000 r-xp 00000000 fb:00 5103992 /lib/libc-2.11.1.so
7ffff77af000-7ffff79ae000 ---p 0017a000 fb:00 5103992 /lib/libc-2.11.1.so
7ffff79ae000-7ffff79b2000 r--p 00179000 fb:00 5103992 /lib/libc-2.11.1.so
7ffff79b2000-7ffff79b3000 rw-p 0017d000 fb:00 5103992 /lib/libc-2.11.1.so
7ffff79b3000-7ffff79b8000 rw-p 00000000 00:00 0
7ffff79b8000-7ffff79c1000 r-xp 00000000 fb:00 9093205 /lib/libwrap.so.0.7.6
7ffff79c1000-7ffff7bc0000 ---p 00009000 fb:00 9093205 /lib/libwrap.so.0.7.6
7ffff7bc0000-7ffff7bc1000 r--p 00008000 fb:00 9093205 /lib/libwrap.so.0.7.6
7ffff7bc1000-7ffff7bc2000 rw-p 00009000 fb:00 9093205 /lib/libwrap.so.0.7.6
7ffff7bc2000-7ffff7bc3000 rw-p 00000000 00:00 0
7ffff7bc3000-7ffff7bda000 r-xp 00000000 fb:00 5103999 /lib/libnsl-2.11.1.so
7ffff7bda000-7ffff7dd9000 ---p 00017000 fb:00 5103999 /lib/libnsl-2.11.1.so
7ffff7dd9000-7ffff7dda000 r--p 00016000 fb:00 5103999 /lib/libnsl-2.11.1.so
7ffff7dda000-7ffff7ddb000 rw-p 00017000 fb:00 5103999 /lib/libnsl-2.11.1.so
7ffff7ddb000-7ffff7ddd000 rw-p 00000000 00:00 0
7ffff7ddd000-7ffff7dfd000 r-xp 00000000 fb:00 9093254 /lib/ld-2.11.1.so
7ffff7fee000-7ffff7ff1000 rw-p 00000000 00:00 0
7ffff7ff8000-7ffff7ffb000 rw-p 00000000 00:00 0
7ffff7ffb000-7ffff7ffc000 r-xp 00000000 00:00 0 [vdso]
7ffff7ffc000-7ffff7ffd000 r--p 0001f000 fb:00 9093254 /lib/ld-2.11.1.so
7ffff7ffd000-7ffff7ffe000 rw-p 00020000 fb:00 9093254 /lib/ld-2.11.1.so
7ffff7ffe000-7ffff7fff000 rw-p 00000000 00:00 0
7ffffffea000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Program received signal SIGABRT, Aborted.
0x00007ffff7668a75 in raise () from /lib/libc.so.6
(gdb) up
#1 0x00007ffff766c5c0 in abort () from /lib/libc.so.6
(gdb)
#2 0x00007ffff76a24fb in ?? () from /lib/libc.so.6
(gdb)
#3 0x00007ffff76ac5b6 in ?? () from /lib/libc.so.6
(gdb)
#4 0x00007ffff76b06d8 in ?? () from /lib/libc.so.6
(gdb)
#5 0x00007ffff76b158e in malloc () from /lib/libc.so.6
(gdb)
#6 0x0000000000408e91 in add_arg_word (ctx=0x61f150) at parseconf.c:125
125 ctx->arglist = realloc(ctx->arglist,
(gdb)
(gdb) p ctx->arglist
$2 = (char **) 0x0
(gdb) p ctx->numargs
$3 = 1
(gdb)
#7 0x00000000004091d9 in endofword (ctx=0x61f150) at parseconf.c:212
212 add_arg_word(ctx);
(gdb)
#8 0x0000000000409431 in collect (ctx=0x61f150) at parseconf.c:324
324 endofword(ctx);
(gdb)
#9 0x00000000004097e2 in parse_char (ctx=0x61f150) at parseconf.c:447
447 ctx->state = collect(ctx);
(gdb)
#10 0x0000000000409bdb in pconf_char (ctx=0x61f150, ch=9 '\t') at parseconf.c:607
607 parse_char(ctx);
(gdb)
#11 0x0000000000402a26 in client_readline (client=0x61f110) at upsd.c:587
587 switch (pconf_char(&client->ctx, buf[i]))
(gdb)
#12 0x0000000000403789 in mainloop (argc=<value optimized out>, argv=<value optimized out>) at upsd.c:854
854 client_readline((ctype_t *)handler[i].data);
(gdb) info line
Line 854 of "upsd.c" starts at address 0x403780 <main+1968> and ends at 0x403790 <main+1984>.
(gdb)
---------------------------------------------
The abort is raised from parseconf.c, line 125, at this call:
/* resize the lists */
ctx->arglist = realloc(ctx->arglist,
sizeof(char *) * ctx->numargs);
That code is in the function "static void add_arg_word(PCONF_CTX_t *ctx)".
Tracing back up the stack, the parser is entered from upsd.c, line 587:
switch (pconf_char(&client->ctx, buf[i]))
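A note on reading frames #5 and #6: line 125 calls realloc(), yet the
frame above it is malloc(). That is consistent with ctx->arglist being
0x0, since realloc() with a null pointer is specified to behave like
malloc(). glibc prints "malloc(): memory corruption" when the allocator
finds its own bookkeeping damaged, so this call may only be where earlier
heap damage was noticed, not where it happened. A minimal sketch of the
null-pointer case follows; grow_list is a hypothetical helper written for
illustration, not a function from parseconf.c:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from parseconf.c): resize a list of string
 * pointers so that it can hold `count` entries.
 *
 * When `list` is NULL, realloc() acts exactly like malloc(), which is why
 * a debugger can show malloc() as the innermost frame under a realloc()
 * call on a list that has not been allocated yet.
 */
char **grow_list(char **list, size_t count)
{
    assert(count > 0);

    /* Returns NULL on failure; in that case the old list is still valid. */
    return realloc(list, sizeof(char *) * count);
}
```

Calling grow_list(NULL, 1) takes the same allocator path as malloc(), so
any damage already present in the heap can surface there even though the
call itself is valid.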
This also might help:
(gdb) p *ctx
$4 = {f = 0x0, state = 5, ch = 9, arglist = 0x0, argsize = 0x0,
  numargs = 1, maxargs = 1, wordbuf = 0x61f2e0 "Z", wordptr = 0x61f2fd "",
  wordbufsize = 16, linenum = 0, error = 0,
  errmsg = '\000' <repeats 255 times>, errhandler = 0, magic = 7497264,
  arg_limit = 32, wordlen_limit = 512}
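One thing that may be worth checking in this dump is the distance between
wordptr and wordbuf compared with wordbufsize. Here is a small sketch of
that arithmetic, using the addresses printed above. The names word_offset
and past_end are hypothetical helpers, and treating wordbufsize as the
allocated size of wordbuf is an assumption:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes between the start of the word buffer and the write pointer. */
size_t word_offset(uintptr_t wordbuf, uintptr_t wordptr)
{
    assert(wordptr >= wordbuf);
    return (size_t)(wordptr - wordbuf);
}

/*
 * Returns 1 if the write pointer sits beyond the recorded buffer size,
 * 0 otherwise. Assumes wordbufsize is the number of bytes allocated.
 */
int past_end(uintptr_t wordbuf, uintptr_t wordptr, size_t wordbufsize)
{
    return word_offset(wordbuf, wordptr) > wordbufsize;
}
```

With wordbuf = 0x61f2e0 and wordptr = 0x61f2fd the offset is 29 (0x1d)
against a wordbufsize of 16, and the address in the glibc message,
0x61f300, is 0x20 bytes past wordbuf. That would be consistent with a
write past the end of the word buffer, but it is an inference from the
dump, not something confirmed in the source.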
If I go "up" in GDB to the pconf_char function, here is the character
that was being parsed when it died:
(gdb) p ch
$6 = 9 '\t'
That's all of the information I could think to grab for you. I've got
it running in the debugger again, so when it dies, if you have something
else you'd like to see, let me know.
Thank you. I really hope this can get solved!