maxage causes loss of local email

Janna Martl janna.martl109 at gmail.com
Wed Mar 18 23:01:11 GMT 2015


On Wed, Mar 18, 2015 at 11:46:06AM +0100, Nicolas Sebrecht wrote:
> Could you share your WIP?

OK, just did. I made a couple of changes from the strategy we
discussed; the rationale for them is explained in
docs/doc-src/API.rst.

> So, I'm not surprised about the result you have. Take this (bigdata)
> example:
> 
> 0. The mail is received by SMTP at t0.
> 1. It is sent to the DATABASE at t1.
> 2. It is received by the DATABASE at t2.
> 3. It is persistently stored by the DATABASE at t3.
> 4. It is considered from the IMAP infra at t4.
> N. It is given an INTERNALDATE at some point in time, we don't know
> when exactly (at t0, t1, t2, somewhere else), what's the value of it (the
> reference time taken) and we don't know who is in charge of assigning
> it.

Shouldn't the INTERNALDATE be somewhere in the range [t0, t4], and
shouldn't t0 ... t4 be the dates in the header? In the cases I've
found where message1 has higher INTERNALDATE and lower UID than
message2, each message's INTERNALDATE fell within the range of the
dates in its headers, but that range for message1 was strictly higher
than the entire range for message2.
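
A rough sketch of the check I mean, in case it helps anyone reproduce it.
It assumes well-formed headers with explicit timezone offsets, and that
each Received header carries its timestamp after the last ';'. The
function names are made up for illustration; none of this is OfflineIMAP
code.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def header_date_range(raw_message):
    """Return (earliest, latest) of the Date and Received timestamps."""
    msg = message_from_string(raw_message)
    dates = []
    if msg["Date"]:
        dates.append(parsedate_to_datetime(msg["Date"]))
    for received in msg.get_all("Received", []):
        # Assumption: the timestamp follows the last ';' of the header.
        stamp = received.rsplit(";", 1)[-1].strip()
        dates.append(parsedate_to_datetime(stamp))
    return min(dates), max(dates)


def internaldate_in_header_range(internaldate, raw_message):
    """True if internaldate lies within the range of the header dates."""
    earliest, latest = header_date_range(raw_message)
    return earliest <= internaldate <= latest
```

Here `internaldate` is expected to be a timezone-aware datetime, for
example one parsed from the server's response; comparing it with a naive
datetime would raise a TypeError.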

> Because we highly rely on UIDs in OfflineIMAP, we should prefer working
> with UIDs for maxage too to avoid having to work around unexpected
> results like the one above.
 
The problem is that INTERNALDATEs are unavoidably used: the local
folder equivalent of SINCE(X) is getting all the messages with a date
newer than X, and that date originally came from the server as an
INTERNALDATE.
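
To make that concrete, here is a minimal sketch of a local SINCE. It
assumes the local folder can be reduced to a mapping from UID to the date
recorded for each message; `local_since`, `messages` and `today` are
hypothetical names, not what the WIP actually uses. The comparison is
inclusive because IMAP's SINCE matches dates within or later than the
given day.

```python
from datetime import date, timedelta


def local_since(messages, maxage, today):
    """Return the UIDs whose recorded date is within the last maxage days.

    messages: hypothetical mapping {uid: date}, where each date was
    originally the server's INTERNALDATE with time and timezone dropped.
    """
    cutoff = today - timedelta(days=maxage)
    # Inclusive comparison, mirroring the semantics of IMAP SINCE.
    return sorted(uid for uid, d in messages.items() if d >= cutoff)
```

Whatever the real storage looks like, the point stands: the answer
depends entirely on dates that the server assigned.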
 
> About what we care, since we are asking for messages up to (maxage + n)
> with n in days, few divergences in the UIDs order regarding the "real
> delivery date order" should have no impact to our algorithm. When the
> IMAP infra considers a new mail, it is assigned a new ascending UID.
>
> I wonder if there's something else that could hurt.

I only noticed this because I had a few messages where the gap was 5
days. That does mess things up; you get situations like this:

                              A           E
    (local)          |--------------|--------------->
                   -48              0
                              A  B*    C*      D
    (remote)  |--------------|--------------->
              -48            0

where A has UID higher than it should, and so our procedure picks C*
as having the smallest acceptable UID, but A doesn't get excluded because
UID(A) > UID(C*).
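
The failure can be shown with a toy example. This is a simplified reading
of the procedure (take the smallest UID among the messages inside the
date window, then keep everything at or above that UID), with made-up
UIDs and dates; it does not reproduce the diagram exactly, and
`names_kept` is a hypothetical helper.

```python
from datetime import date


def names_kept(messages, cutoff):
    """messages: {name: (uid, internaldate)}; returns the names kept."""
    in_window = [uid for uid, d in messages.values() if d >= cutoff]
    if not in_window:
        return []
    # Smallest acceptable UID: the lowest UID inside the date window.
    min_uid = min(in_window)
    return sorted(name for name, (uid, _) in messages.items()
                  if uid >= min_uid)


cutoff = date(2015, 3, 16)
messages = {
    "A": (105, date(2015, 3, 11)),  # 5 days older than C, but higher UID
    "C": (101, date(2015, 3, 16)),  # lowest UID inside the window
    "D": (110, date(2015, 3, 18)),
}
print(names_kept(messages, cutoff))  # ['A', 'C', 'D']
```

A is older than the cutoff and should be excluded, but it survives
because its UID is above the minimum chosen from C.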

I really want to believe that these anomalous messages are somehow the
fault of my own experimentation (even though they go back further in
time than I've been messing around with this stuff).

-- J.M.



