[Soc-coordination] Debian Teams Activity Metrics - Report IV [Update]
Olly Betts
olly at survex.com
Thu Aug 4 00:18:17 UTC 2011
On Thu, Aug 04, 2011 at 01:13:00AM +0530, Sukhbir Singh wrote:
> 1. Wrote a script that fetches messages for lists.debian.org from
> Gmane and then creates an mbox archive for them. This allows us to
> parse lists.debian.org as we don't have mbox archives for that as of
> now. So we fetch the messages from Gmane, create mbox archives from
> them and then throw that over to the Alioth parser (there is support
> for preventing redundant fetching of messages also, so we don't strain
> Gmane).
> This took some time because there were lots of things to handle
> (dates!) when creating mbox archives and the code had to be written in
> such a way that it would work with the Alioth parser without modifying
> that as initially we were not aware that we would follow this approach
> of getting the archives. Actually, we didn't know that we wouldn't get
> the mbox archives for lists.debian.org but our 'hack' works :)
Hmm, it sounds like you're fetching individual messages from Gmane via
NNTP (or worse, by scraping the web frontend) and then building mbox
archives yourself.
Gmane actually has an mbox export feature for this sort of thing:
http://gmane.org/export.php
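If an export has already been saved locally as an mbox file, reading it
back needs nothing beyond Python's standard library. The following is
only a rough sketch, not what the Alioth parser does: the function name
summarize_mbox and the file name in the usage note are hypothetical, and
it assumes each message carries ordinary From, Date and Subject headers.
Unparseable dates are recorded as None instead of aborting the run.

```python
import mailbox
from email.utils import parsedate_to_datetime


def summarize_mbox(path):
    """Return a (sender, date, subject) tuple for each message in an mbox file.

    Hypothetical helper for illustration only. The date is a datetime when
    the Date header parses, otherwise None.
    """
    rows = []
    # create=False: fail loudly if the file is missing instead of creating it.
    box = mailbox.mbox(path, create=False)
    try:
        for msg in box:
            date_header = msg.get("Date")
            try:
                when = parsedate_to_datetime(date_header) if date_header else None
            except (TypeError, ValueError):
                # Malformed Date headers do turn up in list archives.
                when = None
            rows.append((msg.get("From"), when, msg.get("Subject")))
    finally:
        box.close()
    return rows
```

Calling summarize_mbox("debian-devel.mbox") on a saved export (file name
made up here) would give one tuple per message, which is enough to check
that the dates survived before handing the file to another parser.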
Cheers,
Olly