[Teammetrics-discuss] SPAM filter issue
Andreas Tille
andreas at an3as.eu
Wed Jan 8 14:10:33 UTC 2014
Hi Sukhbir,
I added 'ImageJ ports on ' to the SPAM filter. These messages are not
actually SPAM, but because of a broken wrapper in the imagej package we
get those peaks in the stats from time to time.
$ grep -R "ImageJ ports on "
spamfilter.py: 'ImageJ ports on ',
hacks/get-archive-pages: 'akavanagh', # strange "spammer" on Debian-med-packaging mailing list Subject: "ImageJ ports on kasilas" in February 2020
Unfortunately this does not work since if you try
teammetrics=# SELECT * from listarchives where project = 'debian-med-packaging' and name like 'Malte V%' limit 3;
project | domain | name | email_addr | subject | message_id | archive_date | today_date | msg_raw_len | msg_no_blank_len | msg_no_quotes_len | msg_no_sig_len | is_spam
----------------------+-------------------------+----------------+-----------------------+-----------------------------------------------------+----------------------------------------+--------------+------------+-------------+------------------+-------------------+----------------+---------
debian-med-packaging | lists.alioth.debian.org | Malte Vassholz | vassholz at mail.desy.de | [Debian-med-packaging] ImageJ ports on exflpcx18766 | E1VBRMa-0006ib-DW at exflpcx18766.desy.de | 2013-08-19 | 2013-12-03 | 37 | 1 | 1 | 1 | t
debian-med-packaging | lists.alioth.debian.org | Malte Vassholz | vassholz at mail.desy.de | [Debian-med-packaging] ImageJ ports on exflpcx18766 | E1VBRMX-0006br-Ft at exflpcx18766.desy.de | 2013-08-19 | 2013-12-03 | 37 | 1 | 1 | 1 | t
debian-med-packaging | lists.alioth.debian.org | Malte Vassholz | vassholz at mail.desy.de | [Debian-med-packaging] ImageJ ports on exflpcx18766 | E1VBRMX-0006bS-Fr at exflpcx18766.desy.de | 2013-08-19 | 2013-12-03 | 37 | 1 | 1 | 1 | t
(3 rows)
you see there are some occurrences left, and when trying
teammetrics=# SELECT count(*) from listarchives where project = 'debian-med-packaging' and name like 'Malte V%' ;
 count
-------
   603
(1 row)
this is exactly the number that ends up in the graph, which is pure rubbish.
I wonder why spamfilter.py did not delete these messages. I'll leave the
data in the database for your inspection (rather than just removing them
manually).
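
To make the expected behaviour concrete, here is a minimal sketch of a
substring check on the subject line. It is not the actual code in
spamfilter.py: the names SPAM_SUBJECT_MARKERS, is_spam_subject and
drop_spam are hypothetical, the row layout is assumed, and only the
marker string and the first sample subject come from this thread.

```python
# Hypothetical marker list; only 'ImageJ ports on ' is taken from the thread.
SPAM_SUBJECT_MARKERS = [
    'ImageJ ports on ',
]


def is_spam_subject(subject, markers=SPAM_SUBJECT_MARKERS):
    """Return True if any marker occurs anywhere in the subject line."""
    return any(marker in subject for marker in markers)


def drop_spam(rows):
    """Keep only the rows whose subject matches no marker.

    Each row is assumed to be a dict with at least a 'subject' key.
    """
    return [row for row in rows if not is_spam_subject(row['subject'])]


rows = [
    # Subject as it appears in the listarchives output above.
    {'name': 'Malte Vassholz',
     'subject': '[Debian-med-packaging] ImageJ ports on exflpcx18766'},
    # Made-up row for contrast; not real data.
    {'name': 'Example Sender',
     'subject': '[Debian-med-packaging] Hypothetical package upload'},
]

kept = drop_spam(rows)
print(len(kept), 'row(s) kept')
```

With a check like this, the list prefix "[Debian-med-packaging]" does not
prevent a match, because the marker is searched for anywhere in the
subject rather than only at its start.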
Kind regards
Andreas.
--
http://fam-tille.de