maxage causes loss of local email

Nicolas Sebrecht nicolas.s-dev at laposte.net
Thu Mar 19 11:32:40 GMT 2015


On Wed, Mar 18, 2015 at 07:01:11PM -0400, Janna Martl wrote:

> OK, just did. I made a couple of changes from the strategy we
> discussed; the rationale for them is explained in
> docs/doc-src/API.rst.

Thanks, will look at that.

> > So, I'm not surprised about the result you have. Take this (bigdata)
> > example:
> > 
> > 0. The mail is received by SMTP at t0.
> > 1. It is sent to the DATABASE at t1.
> > 2. It is received by the DATABASE at t2.
> > 3. It is persistently stored by the DATABASE at t3.
> > 4. It is considered by the IMAP infra at t4.
> > N. It is given an INTERNALDATE at some point in time. We don't know
> > when exactly (at t0, t1, t2, somewhere else), what its value is (the
> > reference time taken), or who is in charge of assigning it.
> 
> Shouldn't the INTERNALDATE be somewhere in the range [t0, t4], and
> shouldn't t0 ... t4 be the dates in the header?

We don't know. It depends on implementations.

The most important point is that what's written in the RFCs is already
subject to interpretation. So, building rules on top of what we COULD
indirectly infer from the RFCs is not reliable:

RFCs point X => rule X
RFCs point Y => rule Y

can be ambiguous sometimes, but

rule X + rule Y => rule Z (which we could be tempted to rely on, but which is not in the RFCs)

is even more dangerous.

> > Because we rely heavily on UIDs in OfflineIMAP, we should prefer working
> > with UIDs for maxage too, to avoid having to work around unexpected
> > results like the one above.
>  
> The problem is that internaldates are, unavoidably, used: the local
> folder equivalent of SINCE(X) is getting all the messages with date
> newer than X; that date originally came as an internaldate from the
> server.

Of course. The SINCE search is not very precise. This doesn't hurt in
day-to-day use because users who don't get the mails they want will simply
redo the SINCE search with a wider range.
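
For what it's worth, part of the imprecision is built into the search
key itself: SINCE takes a plain date and compares it against the
internal date with time and timezone disregarded. A minimal sketch of
building such a criterion from a maxage value follows. The helper name
since_criterion is made up for illustration and is not OfflineIMAP
code.

```python
from datetime import date, timedelta

# IMAP dates use fixed English month abbreviations, independent of locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def since_criterion(maxage_days, today):
    """Build a SEARCH criterion for messages at most maxage_days old.

    Hypothetical helper for illustration only.
    """
    cutoff = today - timedelta(days=maxage_days)
    # Day granularity only: there is no way to express a time of day here.
    return "SINCE %d-%s-%d" % (cutoff.day, MONTHS[cutoff.month - 1], cutoff.year)


print(since_criterion(2, date(2015, 3, 19)))  # SINCE 17-Mar-2015
```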

It should not hurt us either if our strategy is to use the SINCE
response as a /starting point/ and work further with UID sets.
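
To make the idea concrete, here is a rough sketch of "SINCE as a
starting point, then UIDs only". It is not the actual OfflineIMAP
implementation: the function uids_to_sync and the (uid, internaldate)
pairs are assumptions made up for the example, and the server is
replaced by a plain list.

```python
from datetime import date


def uids_to_sync(messages, cutoff):
    """Return the sorted UIDs to keep.

    messages: iterable of (uid, internaldate) pairs, standing in for
    what the server would report. Hypothetical helper.
    """
    # Step 1: the equivalent of the SINCE response.
    recent = [uid for uid, internaldate in messages if internaldate >= cutoff]
    if not recent:
        return []
    # Step 2: the lowest UID in that response becomes the threshold.
    min_uid = min(recent)
    # Step 3: from here on only UIDs matter; dates are not consulted again.
    return sorted(uid for uid, _ in messages if uid >= min_uid)


remote = [
    (10, date(2015, 3, 10)),
    (11, date(2015, 3, 17)),
    (12, date(2015, 3, 12)),  # old date, but UID above the threshold
    (13, date(2015, 3, 18)),
]
print(uids_to_sync(remote, date(2015, 3, 17)))  # [11, 12, 13]
```

Note that UID 12 is kept although its date is older than the cutoff:
once the threshold is chosen, any message with a higher UID is included
regardless of its date.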

> > As far as we are concerned, since we are asking for messages up to
> > (maxage + n) with n in days, a few divergences in the UID order relative
> > to the "real delivery date order" should have no impact on our algorithm.
> > When the IMAP infra considers a new mail, it assigns it a new ascending UID.
> >
> > I wonder if there's something else that could hurt.
> 
> I only noticed this because I had a few messages where the gap was 5
> days. That does mess things up: you get situations like this
> 
>                               A           E
>     (local)          |--------------|--------------->
>                    -48              0
>                               A  B*    C*      D
>     (remote)  |--------------|--------------->
>               -48            0
> 
> where A has UID higher than it should, and so our procedure picks C*
> as having the smallest acceptable UID, but A doesn't get excluded because
> UID(A) > UID(C*).
> 
> I really want to believe that these anomalous messages are somehow the
> fault of my own experimentation (even though they go back further in
> time than I've been messing around with this stuff).

I don't know. I don't understand this example, because C* (why '*'?)
isn't even there locally.

I have to look at the code.

Do you think you could redo some tests on a new account/folder to check
again with the same offending mails?

-- 
Nicolas Sebrecht



