[Teammetrics-discuss] NNTPStat completed successfully.
Andreas Tille
andreas at an3as.eu
Sat Nov 5 17:47:19 UTC 2011
On Sat, Nov 05, 2011 at 11:44:52AM +0530, Sukhbir Singh wrote:
> Hmm, ok so:
>
> teammetrics=> SELECT COUNT(*) FROM listarchives WHERE archive_date
> <'1995-01-01';
>
> count
> -------
> 9
> (1 row)
>
> So 9 such messages. Out of which three are spam.
Well, these are just the messages which are *obviously* wrong. There
are few chances to check whether a message that should be dated in, say,
2009 is wrongly placed in, for instance, 2002. Above we just have
proof that wrong dates exist. There is nearly no way to check how many
occurrences of this problem we have.
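A minimal sketch of how implausible header dates could be flagged, assuming
the year and month of the archive a message was filed under are known (as
they would be for the Alioth lists). The function name date_is_plausible and
the one-month tolerance are hypothetical and not part of the existing scripts.

```python
from email.utils import parsedate_to_datetime


def date_is_plausible(date_header, archive_year, archive_month, tolerance_months=1):
    """Return True if the Date header lies within `tolerance_months`
    of the archive month the message was filed under (hypothetical check)."""
    try:
        sent = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        # Unparseable header: treat it as not plausible.
        return False
    # Distance between the two dates, counted in whole months.
    diff = abs((sent.year - archive_year) * 12 + (sent.month - archive_month))
    return diff <= tolerance_months


# A message dated 2002 but filed in the March 2009 archive gets flagged.
print(date_is_plausible("Tue, 01 Jan 2002 00:00:00 +0000", 2009, 3))
```

Such a check cannot work for Gmane, where the header is the only source of
the date.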
> I understand your point about getting the message date from the
> archive date but don't you think nine special cases are not special
> enough from 1879288 messages for changing the standard way of getting
> the message date? :)
As I said: Your estimate of the number is most probably wrong, and we do
not even have a reasonable way to estimate how wrong it is.
> If we want to resort to that approach, I can change it for the lists
> on Alioth but it won't be possible for Gmane archives, because getting
> the date from the header is the only way.
I agree that Gmane probably gives few chances to detect this.
> > I somehow have the impression that updatenames.py fails in some
> > circumstances.
>
> Well, it does an exact match only and so it will only do what we tell
> it to! What kind of changes would you like to make in that?
Anything that maps all versions of "me" to "Andreas Tille". Perhaps
we need some "author like '%...'" in addition.
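A rough sketch of what such a fuzzy match could look like, assuming a
mapping from the canonical name to fragments that identify its variants. The
function canonical_name and the fragment table are hypothetical; this is not
how updatenames.py currently works, which does an exact match only.

```python
def canonical_name(author, patterns):
    """Map an author string to a canonical name if any known fragment
    occurs in it (case-insensitive), similar in spirit to SQL LIKE '%...%'."""
    lowered = author.casefold()
    for canonical, fragments in patterns.items():
        if any(fragment.casefold() in lowered for fragment in fragments):
            return canonical
    # No fragment matched: leave the name untouched.
    return author


# Hypothetical fragment table; real entries would need manual review.
patterns = {"Andreas Tille": ["tille"]}

print(canonical_name("Tille, Andreas", patterns))
print(canonical_name("Sukhbir Singh", patterns))
```

Substring matching is deliberately loose, so short fragments risk merging
different people and would need care.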
> We are missing archives for some years and that is indeed a cause of
> worry, but then, it's your call! If you feel that the web archives
> method is better, we will go with that. Or wait for the mbox
> archives.
Could you estimate the time needed to work on this (so that we can
compare what comes first: real mboxes or web archives)?
> (Though we did seem to be missing some authors from the web
> archives method IIRC)
Do you think so? I do not remember.
> Er, sorry but you might have to start it again :'( . Last night, I
> needed to run liststat.py but I gave the command for commitstat.py by
> mistake and then I later cleared the files on vasks because I didn't
> know you were running it. My mistake, please run it again. Sorry!
No problem. Wait a moment - I get a drink at the next DebConf when we
meet again! :-) That's the usual punishment for mistakes like this!
Kind regards
Andreas.
--
http://fam-tille.de