[Teammetrics-discuss] How does commitstat inject data

Sukhbir Singh sukhbir.in at gmail.com
Sun Jan 15 08:00:49 UTC 2012


Hi,

> $ ./commitstat.py -u tille
> Segmentation fault

Well, that's new!

> Hmmm, I'm a bit concerned about things on blends.d.n as well:
>
> Jan 11 08:44:01 blends kernel: postgres[21917]: segfault at b85eb2aa ip b72d3776 sp bf84d6c0 error 4 in postgres[b7271000+4da000]
> Jan 11 08:44:02 blends kernel: python[14889]: segfault at 28 ip 080ed0a9 sp bfa10a9c error 4 in python2.6[8048000+1e0000]
> Jan 12 08:33:49 blends kernel: postgres[10556]: segfault at b88032aa ip b72d3776 sp bf84d6c0 error 4 in postgres[b7271000+4da000]
> Jan 13 08:29:15 blends kernel: postgres[30998]: segfault at b860f2aa ip b72d3776 sp bf84d6c0 error 4 in postgres[b7271000+4da000]
> Jan 13 19:55:07 blends kernel: python[11015]: segfault at 28 ip 080ed0a9 sp bfdbf6fc error 4 in python2.6[8048000+1e0000]

Hmm, interesting.

> Finally this was the reason why I wanted to look into your insert
> statements today - just to see whether I can find any explanation.  I

Ok.

> now have three coredumps from postgresql and have no idea what to do
> with these - hey, inspecting core dumps with gdb is something I had not
> entertained before.  I'm happy enough that I learned how to create those
> dumps, which is not the default behaviour of postgresql.

I don't have any experience with core dumps, but is it possible to
infer something meaningful from them? I think most of these errors
occur because of 'too much data'.

> The reason why I cared for those dumps is that in summer I several times
> observed these without any clue as to what might have caused this.  It might
> be a reasonable explanation that this was the time when tests on
> commitstat were run.  Later I upgraded to postgresql 9.1 and assumed the

Seeing the dumps and the segmentation fault this time, possible, yes.

> I have no idea whether this helps but I have an idea what might be a
> chance on one hand while, as a side effect, being faster and having
> shorter code.
> Just dive into the usage of COPY.  If I understand the format of your
> and you are done.  According to PostgreSQL docs this should be quite
> performant and saves you some parsing and inserting code.

Seems like this is supposed to be very efficient, because COPY reads
the data in bulk from a file instead of executing one INSERT per row.
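For reference, a minimal sketch of what COPY-based loading could look like from Python with psycopg2. The table and column names here are made up for illustration, and the real commitstat.py schema may differ; the point is only that the rows are serialized once into COPY's tab-separated text format and loaded in a single call:

```python
import io


def rows_to_copy_buffer(rows):
    """Serialize rows into the tab-separated text format that
    PostgreSQL's COPY ... FROM STDIN expects (one line per row)."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join(str(field) for field in row) + "\n")
    buf.seek(0)
    return buf


def bulk_insert(conn, table, columns, rows):
    """Load all rows in one COPY instead of many INSERT statements.
    `conn` is an open psycopg2 connection; `table`/`columns` are
    placeholders, not the actual commitstat schema."""
    buf = rows_to_copy_buffer(rows)
    with conn.cursor() as cur:
        cur.copy_from(buf, table, columns=columns)
    conn.commit()
```

Usage would be something like `bulk_insert(conn, "commitstat", ("author", "commits"), rows)`, replacing the per-row INSERT loop entirely.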

> temporary file over to blends.d.n, inject it via copy command and fetch
> the next team afterwards.  The overall performance should be the same
> and it might help to let vasks "relax" a bit.

Ok.
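The per-team loop described above could be sketched roughly as follows. The database name, table name, and row format are all assumptions for illustration; the real script would plug in its own values:

```python
import os
import tempfile


def copy_command(dbname, table, path):
    """Build the psql invocation that loads one team's temporary
    file via \\copy (dbname/table here are hypothetical)."""
    return ["psql", dbname, "-c",
            "\\copy %s FROM '%s'" % (table, path)]


def process_team(team, rows, dbname="teammetrics", table="commitstat"):
    """One iteration of the suggested loop: dump a team's rows to a
    temporary tab-separated file, then return the command that would
    load it in a single COPY before moving on to the next team."""
    fd, path = tempfile.mkstemp(prefix=team + "-", suffix=".tsv")
    with os.fdopen(fd, "w") as f:
        for name, count in rows:
            f.write("%s\t%d\n" % (name, count))
    return copy_command(dbname, table, path), path
```

Processing one team at a time this way keeps each transfer and COPY small, which is what should let vasks "relax" between teams.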

> What do you think about my suggestion?  Should I wait before starting
> again?

Yes, please wait! I would like to implement both changes first and then see.

--
Sukhbir



More information about the Teammetrics-discuss mailing list