[supersedes] Re: maxage causes loss of local email

Janna Martl janna.martl109 at gmail.com
Fri Mar 13 07:46:44 GMT 2015


What about IMAP-to-IMAP sync? This is strictly harder than the
Maildir-to-IMAP case: if we have a Maildir, we know the internal
timezone, and it's less expensive to fetch messages.

>      for uid in local_messageslist:
>          if uid in server_messageslist:
>              lowest_common_uid = uid
>              break
>   ...
>
>    if lowest_common_uid != None:
>        for mlist in [local_messageslist, server_messageslist]:
>            for uid in mlist:
>                if uid > lowest_common_uid:
>                    continue
>                del mlist[uid]

In this situation
                                 E         X
          |--------------|--------------->
         -24             0             24

                 D     F    G H I          X
|--------------|--------------->
-24             0             24

we end up ignoring everything before X, which is wrong because G, H,
E, I should be deleted. In a previous incarnation of this, you only
looked for lowest_common_uid < "0" (where "0" is with respect to the
Maildir's timezone, which we know), which avoids this problem but
doesn't generalize to the IMAP/IMAP case where we don't know where 0
is on either list. INTERNALDATEs would be mighty helpful here.

>            # TODO: make a new local scan and a new fetch for UIDs within
>            # (maxage + (maxage_safety_interval * maxage_helper)),
>            # look for common UID again:
>            # - if found: proceed with it, print a hint of what was
>            # the working maxage_safety_interval/maxage.
>            # - else: recurse.

The maximum timezone difference (call this tzgap) is 26 hours (-1200
to +1400). A lot of this complexity would go away if we just assumed
that tzgap < 2 days, for example. Suppose we start by doing two
SINCE(maxage + 1) queries, and don't find a lowest_common_uid.
How far back do we have to worry about? In the picture below I've put
A in the worst possible place, i.e. at a distance of 24 hours + tzgap
(< 3 days) from 0. We don't have to worry about things further out
(e.g. B): we want to do the usual deletion procedure on the two
(-24, now) messagelists after having excluded problematic things like
A, and B just doesn't enter into that picture.
             
                                                  E         
B      A                   |--------------|--------------->
                          -24             0             24
             
B      A                D     F    G H I          
       |--------------|--------------->
       -24            0              24

       |______24______|_____tzgap_________|
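
To make the arithmetic concrete, here is a small Python sketch of the
bound. It only uses the standard datetime module; the names WESTMOST,
EASTMOST and extra_days_needed are made up for illustration and are not
anything in OfflineIMAP.

```python
from datetime import timedelta

# Extreme UTC offsets currently in use: -1200 and +1400.
WESTMOST = timedelta(hours=-12)
EASTMOST = timedelta(hours=14)

# Largest possible timezone difference between two parties.
tzgap = EASTMOST - WESTMOST

# Worst-case distance of a message like A from "0": one day of SINCE
# granularity plus the timezone gap.
worst_case = timedelta(hours=24) + tzgap


def extra_days_needed(gap):
    """Whole days a SINCE query must reach back to cover 24h + gap."""
    span = timedelta(hours=24) + gap
    # Ceiling division: timedelta // timedelta floors and returns an int.
    return -(-span // timedelta(days=1))


print(tzgap, worst_case, extra_days_needed(tzgap))
```

With tzgap at 26 hours the worst case is 50 hours, so reaching back 3
days (the SINCE(maxage + 3) queries in the pseudocode) is enough, and
the same holds for any tzgap under 2 days.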

Some pseudocode:

# from before
messagelist1_small = do SINCE(maxage + 1) query on server1
messagelist2_small = do SINCE(maxage + 1) query on server2

# now do the first attempt at finding a lowest_common_uid

if lowest_common_uid is None:
    messagelist1_big = do SINCE(maxage + 3) query on server1
    messagelist2_big = do SINCE(maxage + 3) query on server2

    for uid in messagelist1_small:
        if uid is in messagelist2_big and uid is not in messagelist2_small:
            # found A: server1 has it in the small window, server2
            # only has it in the big one
            remove uid from messagelist1_small

    for uid in messagelist2_small:
        if uid is in messagelist1_big and uid is not in messagelist1_small:
            # found A, with the roles of the servers swapped
            remove uid from messagelist2_small

    
# now do the usual deletion procedure using messagelist1_small and
# messagelist2_small
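
A rough Python sketch of the filtering step above, in case it helps.
It assumes the messagelists are dicts keyed by UID (as the del mlist[uid]
in your code suggests) and that UIDs on the two sides have already been
mapped so they can be compared directly. The function name
drop_boundary_uids and the sample data are hypothetical. It builds new
dicts instead of removing entries in place, since deleting from a dict
while iterating over it raises RuntimeError.

```python
def drop_boundary_uids(small1, small2, big1, big2):
    """Return pruned copies of small1 and small2.

    Each argument is a dict keyed by UID. A UID is dropped from one
    side's small list when the other side only reports it in its big
    (maxage + 3) window, i.e. it is a boundary message like A.
    """
    # UIDs each side sees only in the wider window.
    only_big1 = set(big1) - set(small1)
    only_big2 = set(big2) - set(small2)

    pruned1 = {uid: msg for uid, msg in small1.items() if uid not in only_big2}
    pruned2 = {uid: msg for uid, msg in small2.items() if uid not in only_big1}
    return pruned1, pruned2


# Made-up data: UID 10 plays the role of A (inside server1's small
# window, but only inside server2's big window).
small1 = {10: "A", 11: "E"}
small2 = {12: "G"}
big1 = {9: "B", 10: "A", 11: "E"}
big2 = {9: "B", 10: "A", 12: "G"}

pruned1, pruned2 = drop_boundary_uids(small1, small2, big1, big2)
print(sorted(pruned1), sorted(pruned2))
```

The usual deletion procedure would then run on pruned1 and pruned2 in
place of the original small lists.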

Also, I'm still kind of confused about exactly which edge cases you
want this to take care of.

* Do you want to avoid touching IMAP INTERNALDATEs entirely, or are
there specific things you don't want to do with them?

* Earlier you mentioned timezone gaps of > 1 day being problematic,
but I think the above idea deals with this. It just doesn't deal with
arbitrary timezone gaps. Do you specifically want to treat the case of
a server that's responding to SINCE(yesterday) queries with messages
that are a week old? (And if the server is that broken, can we trust
it to give out strictly increasing UIDs?)

-- J.M.



