[Teammetrics-discuss] How does commitstat inject data

Fri Jan 13 21:43:07 UTC 2012

On Sat, Jan 14, 2012 at 02:36:47AM +0530, Sukhbir Singh wrote:
> On checking vasks, I think commitstat.py has stopped. Can you please
> check the status at your end?

$ ./commitstat.py -u tille
Segmentation fault

> If this is the case, I have a very stupid solution but then there
> seems to be no other way -- run the script in parts during the first
> run. For x-y-z teams, then x-y teams and then x teams.

Well, we currently have:

teammetrics=# SELECT project, count(*) from commitstat group by project ;
     project     | count  
-----------------+--------
 debian-live     |    341
 pkg-scicomp     |    357
 pkg-openoffice  |   3913
 kernel          | 237710
 pet             |    346
 pkg-kde         |   1061
 d-i             |  82980
 pkg-osm         |   1051
 pkg-samba       |     16
 nm              |    219
 pkg-perl        |   2081
 debian-release  |   1079
 teammetrics     |    325
 perl            |    614
 pkg-common-lisp |   2005
 pkg-multimedia  |  16302
 pkg-java        |   4605
 debian-science  |   3985
 debian-med      |   1468
 debconf         |   1349
 pkg-phototools  |   1219
 debtags         |    704
 pkg-hurd        |    962
 pkg-games       |   6626
 demudi          |      8
 pkg-grass       |   2270
 pkg-postgresql  |     43
(27 rows)

I will just keep retrying it as is.  I also have seen

> Why it crashes
> is because there is *lots* of data to be processed and that keeps the
> CPU very busy for a long time and I think then ultimately the process
> is killed.

Hmmm, I'm a bit concerned about things on blends.d.n as well:

Jan 11 08:44:01 blends kernel: postgres[21917]: segfault at b85eb2aa ip b72d3776 sp bf84d6c0 error 4 in postgres[b7271000+4da000]
Jan 11 08:44:02 blends kernel: python[14889]: segfault at 28 ip 080ed0a9 sp bfa10a9c error 4 in python2.6[8048000+1e0000]
Jan 12 08:33:49 blends kernel: postgres[10556]: segfault at b88032aa ip b72d3776 sp bf84d6c0 error 4 in postgres[b7271000+4da000]
Jan 13 08:29:15 blends kernel: postgres[30998]: segfault at b860f2aa ip b72d3776 sp bf84d6c0 error 4 in postgres[b7271000+4da000]
Jan 13 19:55:07 blends kernel: python[11015]: segfault at 28 ip 080ed0a9 sp bfdbf6fc error 4 in python2.6[8048000+1e0000]

Ultimately this was the reason why I wanted to look into your insert
statements today - just to see whether I can find any explanation.  I
now have three core dumps from postgresql and have no idea what to do
with them - inspecting core dumps with gdb is something I have never
done before.  I'm happy enough that I learned how to create those
dumps, which is not the default behaviour of postgresql.

The reason why I cared about those dumps is that in summer I observed
such crashes several times without any clue what might have caused them.
A reasonable explanation might be that this was the time when tests on
commitstat were run.  Later I upgraded to postgresql 9.1 and assumed the
crashes were over because I did not observe them ... until 11 January.
So let's assume commitstat is responsible for the problem - just because
we do not have any better theory.

I have no idea whether this helps, but I have an idea that might be
worth a try and that, as a side effect, should be faster and need less
code.

teammetrics=# \help copy
Command:     COPY
Description: copy data between a file and a table
Syntax:
COPY table_name [ ( column [, ...] ) ]
    FROM { 'filename' | STDIN }
    [ [ WITH ] ( option [, ...] ) ]
...

Just dive into the usage of COPY.  If I understand the format of your
temporary file correctly you can just feed this file into a COPY command
and you are done.  According to the PostgreSQL docs this should perform
quite well, and it saves you some parsing and inserting code.
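
A minimal sketch of what feeding the file to COPY could look like from
Python, assuming the script uses psycopg2 and the temporary file is
tab-separated with one row per line (both are assumptions on my side).
The function name load_commit_file and its parameters are made up for
illustration; copy_from is the psycopg2 cursor method that wraps
COPY ... FROM STDIN.

```python
def load_commit_file(cursor, path, table="commitstat", columns=None, sep="\t"):
    """Bulk-load a delimited text file into `table` via COPY ... FROM STDIN.

    `cursor` is expected to behave like a psycopg2 cursor, i.e. to offer
    copy_from(file, table, sep=..., columns=...).  The tab-separated layout
    of the file is an assumption, not something taken from commitstat.py.
    """
    with open(path, "r") as handle:
        # One COPY replaces the per-row parsing and INSERT statements.
        cursor.copy_from(handle, table, sep=sep, columns=columns)
```

With a real connection this would be called with a cursor obtained from
psycopg2.connect(...).cursor(), followed by a commit on the connection.
If the file does not contain every column of the table, pass the column
names in file order via columns.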

Moreover, since you advise above splitting the job into per-project
chunks (I have no idea whether this helps or not), why not do it
straight in the code: run fetchrevisions.py per team, copy the
temporary file over to blends.d.n, inject it via the COPY command and
fetch the next team afterwards.  The overall performance should be the
same and it might help to let vasks "relax" a bit.
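
The per-team loop could be sketched roughly like this.  The names
process_teams, fetch and inject are hypothetical: fetch stands for
whatever runs fetchrevisions.py for one team and returns the path of the
temporary file, inject for whatever copies that file to blends.d.n and
loads it.  I do not know how the real scripts are structured, so take it
as an outline only.

```python
def process_teams(teams, fetch, inject):
    """Fetch and load one team at a time; return the teams that failed.

    `fetch(team)` must return the path of the file written for that team,
    `inject(team, path)` must load that file into the database.  Both are
    placeholders for the real steps.
    """
    failed = []
    for team in teams:
        try:
            path = fetch(team)
            inject(team, path)
        except Exception as exc:
            # Remember the failure and go on with the next team, so one
            # problematic team does not abort the whole run.
            failed.append((team, str(exc)))
    return failed
```

A side effect of this layout is that a crash can be attributed to a
single team, which would also answer the question whether pkg-samba on
its own is the problem.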

> Maybe they have set something up on vasks:
> 
>     if cpu_usage == 100% for $TIME, kill process.
> 
> Possible?

I don't think so, but you might like to use the UNIX command nice to
reduce the priority of your job compared to other jobs.
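
Besides starting the script through nice on the command line, the same
effect can be had from inside the script with os.nice from the Python
standard library (Unix only).  The function name lower_priority and the
increment of 10 are just an example, not something vasks requires.

```python
import os


def lower_priority(increment=10):
    """Raise this process's niceness by `increment` and return the new value.

    A higher niceness means a lower scheduling priority, so other jobs on
    the machine get the CPU first.  Unprivileged processes can only raise
    their niceness, never lower it again.
    """
    return os.nice(increment)
```

Calling lower_priority() once at the top of the script is enough; child
processes started afterwards inherit the niceness.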

> It shows that it stopped at pkg-samba. Now I suggest that you run it
> *only* for pkg-samba and then see? If it runs flawlessly, it will
> confirm my 'theory'. Else, I will check it again.

What do you think about my suggestion?  Should I wait before starting
again?

> (Before running commitstat.py from scratch, please execute refresh.sh on vasks)

I did so.

Kind regards

       Andreas.

-- 
http://fam-tille.de