[Teammetrics-discuss] Name fixing (Was: NNTPStat completed successfully.)

Sukhbir Singh sukhbir.in at gmail.com
Wed Nov 9 11:54:28 UTC 2011


Hello!

> teammetrics=# SELECT name, count(*) from listarchives where name ilike '%tille%' group by name;
>               name                | count
> -----------------------------------+-------
>  Andreas Tille                     |  7678
>  Tille, Andreas                    |   157
>  <tille at debian.org                 |     6
>  'Andreas Tille'                   |     2
>  <tillea at rki.de                    |     1
>

Interesting find. To find other names, I did this:

    select count(*) from listarchives where name ilike '<%';
     count
    -------
      4681
    (1 row)


and

    select count(*) from listarchives where name ilike '''%';
     count
    -------
        77
    (1 row)

So clearly, there are other such names. Checking these messages in the
lists.d.o web interface, I see that they are the ones that have a
'From' header like:

    From: <foo at bar.com>

instead of

    From: Foo Bar <foo at bar.com>

... the name is missing. So there is nothing we can do in this case,
because constructing a name from an email address is not possible.
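
To illustrate the two header shapes, here is a minimal sketch using the
standard library's email.utils.parseaddr. The helper name
split_from_header is hypothetical and is not taken from liststat; the
addresses are the placeholder ones from above, written with a real '@'.

```python
from email.utils import parseaddr


def split_from_header(value):
    """Split a From header value into (display name, address).

    Hypothetical helper for illustration; not part of liststat.
    """
    name, addr = parseaddr(value)
    return name, addr


# A header with a display name yields both parts.
with_name = split_from_header("Foo Bar <foo@bar.com>")

# A bare address yields an empty name, so there is nothing to store
# in the name column except the address itself.
without_name = split_from_header("<foo@bar.com>")
```

When the returned name is empty, the only fallback is the address, which
is presumably how these rows ended up with an address in the name column.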

However, we can strip the `<` and `'` characters from the 'Name' field,
which will make 'Andreas Tille' and Andreas Tille equal.

For the cases of tillea at rki.de and tille at debian.org, the same
problem remains: there is no name to recover.

So, to summarize:

1. I will strip the `<` and `'` characters.
2. For cases where the name == the email address, there is nothing we
can do except add entries manually.
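
Step 1 could look roughly like the sketch below. The function name
clean_name is hypothetical, and also stripping `>` and `"` goes beyond
the two characters listed above; that part is an assumption, on the
grounds that those characters are likely to appear in the same way.

```python
def clean_name(raw_name):
    """Remove stray delimiters from the ends of a name.

    Hypothetical helper, not taken from liststat. Only the ends of the
    string are touched, so an apostrophe inside a name (O'Brien) survives.
    """
    # '<' and "'" come from the message; '>' and '"' are an assumption.
    return raw_name.strip().strip("<>'\"").strip()


quoted = clean_name("'Andreas Tille'")
bare_address = clean_name("<tille at debian.org")
```

Note that this only normalizes the quoting. A row such as
`<tille at debian.org` still ends up holding an address rather than a
name, which is case 2.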

We have another bug in our code, something that I feel stupid about!

While investigating the above problem, I noticed that we are storing
the email address in the form:

    name at domain.com

Stupid me! I don't know what came into my mind that I had this line in liststat:

    email_addr = email_raw.replace('@', ' at ')

:(

I will fix all these issues and then push them.
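
For addresses that have already been stored in the mangled form, one
possible way to restore them is sketched below. The function name
restore_email is hypothetical, and the sketch assumes every stored value
is either a normal address or one where the single '@' was replaced by
' at '. It is not the actual fix that was pushed.

```python
def restore_email(stored):
    """Turn 'name at domain.com' back into 'name@domain.com'.

    Hypothetical helper, not taken from liststat.
    """
    # Already a normal address: leave it alone.
    if '@' in stored:
        return stored
    # Split on the last ' at ', since the domain part cannot contain spaces.
    local, sep, domain = stored.rpartition(' at ')
    if not sep:
        # No ' at ' marker found; return the value unchanged.
        return stored
    return local + '@' + domain


fixed = restore_email('name at domain.com')
```

Going forward, the simpler change is to drop the replace() call and
store the raw address as is.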


