[Soc-coordination] Status Report: ListArchive - week 8

Pali Rohár pali.rohar at gmail.com
Sat Jul 12 16:39:30 UTC 2014


Hello,

this week I worked on optimizing algorithms. One of them builds
the full tree of email threads from the (possibly incomplete)
transitive closure of a directed acyclic graph. I changed how
topological sorting is used, and it should now properly handle
email threads where some In-Reply-To headers are missing but
References headers are present. I changed the code to use arrays
instead of hashes where possible, which should speed up some
operations on bigger threads. There are also no more warnings
about the use of uninitialized values. I fixed problems with
email threads that contain loops (normally this should not
happen, but somebody can generate emails with a loop in their
In-Reply-To or References headers).

Next I optimized the SQL code in the CGI script that generates
the roots of the trees. Because the debian-user ML now contains
about 740 000 mails, the original SQL code, which used B-trees
for ordering, started to be slow. It was quite fast for a
database with 100 000 - 300 000 mails, but not for 800 000. To
fix this problem I created a new SQL table that caches the needed
information and has indexes on the ordering columns. This allowed
me to simplify and speed up the select statements for selecting
and ordering the roots of email trees, at the cost of
inserting/updating more rows when a new email is added to the
database.

Generating HTML pages (from the CGI script) for debian-user now
takes about 0.1 s; before it took more than one second. I think
this is good enough.
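
The thread-building step described above can be illustrated with a small sketch. It is not the ListArchive code and it does not reproduce the topological-sort approach; it uses a simpler technique (pick a parent per message, then walk the ancestors to reject links that would close a loop). The function name `build_threads` and the tuple layout are assumptions made only for this example.

```python
def build_threads(messages):
    """Link each message to a parent, refusing links that would form a loop.

    messages: list of (msgid, in_reply_to, references) tuples, where
    references is a list of message ids ordered oldest first.
    Returns (parent, roots): parent[i] is the index of the parent of
    message i (or None), roots is the list of indexes without a parent.
    """
    # Map message ids to integer positions so the tree itself can live
    # in a plain list instead of a dictionary.
    index = {msgid: i for i, (msgid, _, _) in enumerate(messages)}
    parent = [None] * len(messages)

    def creates_loop(child, candidate):
        # Walk up from the candidate parent; reaching the child means
        # that linking child -> candidate would close a cycle.
        node = candidate
        while node is not None:
            if node == child:
                return True
            node = parent[node]
        return False

    for i, (_, in_reply_to, references) in enumerate(messages):
        # Prefer In-Reply-To; fall back to References, newest first.
        candidates = ([in_reply_to] if in_reply_to else []) + references[::-1]
        for msgid in candidates:
            j = index.get(msgid)
            if j is None or creates_loop(i, j):
                continue  # unknown message or loop: try the next candidate
            parent[i] = j
            break

    roots = [i for i, p in enumerate(parent) if p is None]
    return parent, roots


if __name__ == "__main__":
    msgs = [
        ("a", None, []),
        ("b", "a", ["a"]),
        ("c", None, ["a", "b"]),  # In-Reply-To missing, References present
        ("d", "e", []),           # "d" and "e" claim to reply to each other
        ("e", "d", []),
    ]
    print(build_threads(msgs))
```

In this sketch a message whose only possible parents would create a loop simply becomes a root, so every message still appears in exactly one tree.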

-- 
Pali Rohár
pali.rohar at gmail.com