[supersedes] Re: maxage causes loss of local email

Janna Martl janna.martl109 at gmail.com
Fri Mar 13 07:59:29 GMT 2015

Oops, my ASCII art got messed up. Read this email, not the previous one.


What about IMAP-to-IMAP sync? This is strictly harder than the
Maildir-to-IMAP case: if we have a Maildir, we know the internal
timezone, and it's less expensive to fetch messages.

>     for uid in local_messageslist:
>         if uid in server_messageslist:
>             lowest_common_uid = uid
>             break
>  ...
>   if lowest_common_uid != None:
>       for mlist in [local_messageslist, server_messageslist]:
>           for uid in mlist:
>               if uid > lowest_common_uid:
>                   continue
>               del mlist[uid]

In this situation:
                                E         X
        -24             0             24

                D     F    G H I          X
-24             0             24

we end up ignoring everything before X, which is wrong because G, H,
E, I should be deleted. In a previous incarnation of this, you only
looked for lowest_common_uid < "0" (where "0" is with respect to the
Maildir's timezone, which we know), which avoids this problem but
doesn't generalize to the IMAP/IMAP case where we don't know where 0
is on either list. INTERNALDATEs would be mighty helpful...
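
A minimal Python sketch of the situation above, for illustration only. The
UIDs are hypothetical, assigned in timeline order (D F G H I E X), and the
helper names are mine, not OfflineIMAP's. It follows the quoted logic of
dropping every UID up to and including lowest_common_uid, but builds new
dicts instead of deleting while iterating:

```python
def lowest_common_uid(list_a, list_b):
    """Return the smallest UID present in both mappings, or None."""
    for uid in sorted(list_a):
        if uid in list_b:
            return uid
    return None


def trim_up_to(messagelist, cutoff):
    """Return a copy keeping only UIDs strictly greater than cutoff."""
    return {uid: name for uid, name in messagelist.items() if uid > cutoff}


# Hypothetical UIDs in timeline order: D=1 F=2 G=3 H=4 I=5 E=6 X=7.
list1 = {6: "E", 7: "X"}                                   # upper timeline
list2 = {1: "D", 2: "F", 3: "G", 4: "H", 5: "I", 7: "X"}   # lower timeline

cutoff = lowest_common_uid(list1, list2)
trimmed1 = trim_up_to(list1, cutoff)
trimmed2 = trim_up_to(list2, cutoff)

# Names that no longer take part in the sync decision after trimming.
ignored = {
    name
    for original, trimmed in ((list1, trimmed1), (list2, trimmed2))
    for uid, name in original.items()
    if uid not in trimmed
}
print(cutoff, sorted(ignored))
```

With X as the only common UID, G, H, I and E all land in the ignored set,
even though each of them is missing from one side and should be deleted.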

>           # TODO: make a new local scan and a new fetch for UIDs within
>           # (maxage + (maxage_safety_interval * maxage_helper)),
>           # look for common UID again:
>           # - if found: proceed with it, print a hint of what was
>           # the working maxage_safety_interval/maxage.
>           # - else: recurse.

The maximum timezone difference (call this tzgap) is 26 hours (-1200
to +1400). A lot of this complexity would go away if we just assumed
that tzgap < 2 days, for example. Suppose we start by doing two
SINCE(maxage + 1) queries, and don't find a lowest_common_uid.
How far back do we have to worry about? Below, I've put A in the worst
possible place, i.e. a distance of 24 hours + tzgap < 3 days from 0. We don't
have to worry about things further out (e.g. B) because we want to do
the usual deletion procedure on the two (-24, now) messagelists after
having excluded problematic things like A, and B just doesn't enter
into that picture.

      A                                                      E         
                       -24              0             24

      A                D     F    G H I                
     -24            0              24


Some pseudocode:

# from before
messagelist1_small = do SINCE(maxage + 1) query on server1
messagelist2_small = do SINCE(maxage + 1) query on server2

# now do the first attempt at finding a lowest_common_uid

if lowest_common_uid == None:
   messagelist1_big = do SINCE(maxage + 3) query on server1
   messagelist2_big = do SINCE(maxage + 3) query on server2
   for uid in messagelist1_small:
       if uid is in messagelist2_big and uid is not in messagelist2_small:
           # found A
           remove uid from messagelist1_small
   for uid in messagelist2_small:
       if uid is in messagelist1_big and uid is not in messagelist1_small:
           # found A
           remove uid from messagelist2_small
# now do the usual deletion procedure using messagelist1_small and
# messagelist2_small
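
The filtering step in the pseudocode above could look roughly like this in
Python. It is a sketch under assumptions: each messagelist is a plain set of
UIDs that are comparable across both servers, the SINCE queries have already
been run, and exclude_out_of_window is a hypothetical helper name. The UIDs
are made up to match the second diagram (A=1 D=2 F=3 G=4 H=5 I=6 E=7):

```python
def exclude_out_of_window(small1, big1, small2, big2):
    """Drop messages like A from the small lists.

    A UID is dropped from one side's small list when the other side
    only sees it in its big (maxage + 3) list, i.e. it fell just
    outside the other side's (maxage + 1) window.
    """
    stale1 = {uid for uid in small1 if uid in big2 and uid not in small2}
    stale2 = {uid for uid in small2 if uid in big1 and uid not in small1}
    return small1 - stale1, small2 - stale2


# Upper timeline: A is outside the small window but inside the big one.
small1 = {7}
big1 = {1, 7}
# Lower timeline: A is already inside the small window.
small2 = {1, 2, 3, 4, 5, 6}
big2 = {1, 2, 3, 4, 5, 6}

clean1, clean2 = exclude_out_of_window(small1, big1, small2, big2)
print(sorted(clean1), sorted(clean2))
```

Here only A (UID 1) is removed, from the second small list; the usual
deletion procedure would then run on clean1 and clean2.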

Also, I'm still kind of confused about exactly which edge cases you want
this to take care of.

* Do you want to avoid touching IMAP INTERNALDATEs entirely, or are
there specific things you don't want to do with them?

* Earlier you mentioned timezone gaps of > 1 day being problematic,
but I think the above idea deals with this. It just doesn't deal with
arbitrary timezone gaps. Do you specifically want to treat the case of
a server that's responding to SINCE(yesterday) queries with messages
that are a week old? (And if the server is that broken, can we trust
it to give out strictly increasing UIDs?)

-- J.M.
