[PATCH 5/4] Re: IMAP: revamp _msgs_to_fetch()

Nicolas Sebrecht nicolas.s-dev at laposte.net
Wed Mar 25 12:32:44 GMT 2015


On Wed, Mar 25, 2015 at 01:40:04AM -0400, Janna Martl wrote:

> I think this is necessary if we're doing an IMAP-IMAP sync, and right
> now self is the local folder. Per our most recent discussion, for the
> local folder, there are two steps: (1) find messages within maxage;
> (2) expand that list to also include all messages with UID > min(uid's
> within maxage). We've already done step (1) above; this is step (2),
> except it looks a little different -- step (1) gave us a list of
> MSN's, not UID's, so you can ask for msn's in the range min_msn:* if
> there are no other conditions to enforce but if you also have to
> restrict that to < maxsize, I think you need another query.

In the first SEARCH command we restrict with SINCE, which returns
messages whose internal date is within or later than the specified date

  https://tools.ietf.org/html/rfc3501#section-6.4.4

where "internal date" is the date and time of final delivery only for
messages delivered via SMTP:

  https://tools.ietf.org/html/rfc3501#section-2.3.3
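
For illustration only (this is not part of the patch), the SINCE criterion
derived from maxage could be built roughly as follows. The names MONTHS and
since_criterion are made up for this sketch, and it assumes maxage is a
number of days counted in UTC.

```python
import time

# Month abbreviations in the form IMAP date strings use (hypothetical
# helper table, index 0 is January).
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def since_criterion(maxage_days, now=None):
    """Return a 'SINCE dd-Mon-yyyy' search key for a maxage in days.

    'now' is seconds since the epoch; it defaults to the current time and
    is only a parameter so the function can be exercised deterministically.
    """
    if now is None:
        now = time.time()
    # Oldest date we are interested in, as a UTC struct_time.
    oldest = time.gmtime(now - 60 * 60 * 24 * maxage_days)
    return "SINCE %02d-%s-%d" % (
        oldest.tm_mday, MONTHS[oldest.tm_mon - 1], oldest.tm_year)
```

With now set to the epoch and a maxage of 0 this yields
"SINCE 01-Jan-1970". Note that SINCE only compares the date part, so the
time of day is dropped on purpose.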

When the message comes from the IMAP4rev1 COPY command, the "internal
date" of the source message SHOULD be used, and when it comes from the
APPEND command, the server SHOULD use the date and time specified in the
APPEND command description.

So the internal date may come from the message headers or from somewhere
else. Also, all these SHOULD keywords make a pure UID/MSN strategy less
reliable.
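
A toy model of why filtering on the internal date alone is not enough,
and what expanding to the lowest matching UID buys us. Everything here
(the Msg tuple, the day numbers, the function names) is invented for the
example; real code would talk to the server instead.

```python
from collections import namedtuple

# internal_date is an abstract "day number" here, not a real IMAP date.
Msg = namedtuple("Msg", "uid internal_date")


def within_maxage(msgs, oldest_day):
    """Step 1: what a SINCE-like filter on the internal date returns."""
    return [m for m in msgs if m.internal_date >= oldest_day]


def expand_from_min_uid(msgs, oldest_day):
    """Step 2: also keep every message whose UID is at or above the
    lowest UID found in step 1."""
    recent = within_maxage(msgs, oldest_day)
    if not recent:
        return []  # Nothing to sync.
    min_uid = min(m.uid for m in recent)
    return [m for m in msgs if m.uid >= min_uid]


mailbox = [
    Msg(uid=1, internal_date=90),
    Msg(uid=2, internal_date=100),
    # Arrived after UID 2 (e.g. via COPY or APPEND) but kept an old date.
    Msg(uid=3, internal_date=50),
    Msg(uid=4, internal_date=101),
]
```

With oldest_day=100, step 1 alone returns UIDs 2 and 4 and silently
skips UID 3; step 2 returns UIDs 2, 3 and 4.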

If the user does an IMAP/IMAP sync in one direction from a mailbox where
all new messages come from SMTP, the second SEARCH is not worth it.  But
handling that case is more an optimization than an edge case.

Hence, I'm re-adding the block with changes because it was buggy. If
someone wants to avoid the second SEARCH round, he will have to
introduce a new configuration option.

That sucks because we are re-introducing a forced second SEARCH.  OTOH,
IMAP/IMAP with maxage enabled should not be so common.
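
Roughly, the second round takes the MSNs returned by the first SEARCH and
turns them into a new search condition. A minimal sketch, assuming the
first-round result is a list of MSN strings; second_round_condition is a
hypothetical name, and any quoting the IMAP library needs is left out:

```python
def second_round_condition(first_round_msns, maxsize=None):
    """Build the condition for the second SEARCH from first-round MSNs.

    Returns None when the first round found nothing, i.e. there is
    nothing to sync.
    """
    msns = [int(s) for s in first_round_msns]
    if not msns:
        return None
    min_msn = min(msns)
    # Without maxsize, everything from min_msn onwards is wanted.
    cond = "%d:*" % min_msn
    if maxsize is not None:
        # Restrict the range min_msn:* to messages of acceptable size.
        cond = "(%s SMALLER %d)" % (cond, maxsize)
    return cond
```

The extra round trip is only paid when maxage is set; without maxage a
single SEARCH with the combined conditions is enough.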

-- >8 --
From: Nicolas Sebrecht <nicolas.s-dev at laposte.net>
Date: Wed, 25 Mar 2015 12:16:19 +0100
Subject: [PATCH] re-add worthy (fixed) second SEARCH command

---

Applies on top of PATCH v2. Full squashed commit will follow.


 offlineimap/folder/IMAP.py    | 66 +++++++++++++++++++++++++++++++++++--------
 offlineimap/folder/Maildir.py |  2 +-
 2 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/offlineimap/folder/IMAP.py b/offlineimap/folder/IMAP.py
index 834b446..f825104 100644
--- a/offlineimap/folder/IMAP.py
+++ b/offlineimap/folder/IMAP.py
@@ -161,17 +161,35 @@ class IMAPFolder(BaseFolder):
         Returns: range(s) for messages or None if no messages
         are to be fetched."""
 
+        def search(search_conditions):
+            """Actually request the server with the specified conditions.
+
+            Returns: range(s) for messages or None if no messages
+            are to be fetched."""
+            res_type, res_data = imapobj.search(None, search_conditions)
+            if res_type != 'OK':
+                raise OfflineImapError("SEARCH in folder [%s]%s failed. "
+                    "Search string was '%s'. Server responded '[%s] %s'"% (
+                    self.getrepository(), self, search_conditions, res_type, res_data),
+                    OfflineImapError.ERROR.FOLDER)
+            return res_data[0].split()
+
         res_type, imapdata = imapobj.select(self.getfullname(), True, True)
         if imapdata == [None] or imapdata[0] == '0':
             # Empty folder, no need to populate message list.
             return None
 
         conditions = []
-
+        # 1. min_uid condition.
         if min_uid != None:
             conditions.append("UID %d:*"% min_uid)
+        # 2. maxage condition.
         elif maxage != None:
             # Find out what the oldest message is that we should look at.
+            # FIXME: we are checking maxage validity way too late. Also, we are
+            # doing similar computing in MaildirFolder()._iswithinmaxage()
+            # WITHOUT this sanity check. We really want to work with
+            # oldest_struct as soon as possible.
             oldest_struct = time.gmtime(time.time() - (60*60*24*maxage))
             if oldest_struct[0] < 1900:
                 raise OfflineImapError("maxage setting led to year %d. "
@@ -181,23 +199,47 @@ class IMAPFolder(BaseFolder):
                 oldest_struct[2],
                 MonthNames[oldest_struct[1]],
                 oldest_struct[0])
-
+        # 3. maxsize condition.
         maxsize = self.getmaxsize()
         if maxsize != None:
             conditions.append("SMALLER %d"% maxsize)
 
         if len(conditions) > 1:
-            # Build SEARCH conditions.
-            search_cond = "(%s)"% ' '.join(conditions)
-            res_type, res_data = imapobj.search(None, search_cond)
-            if res_type != 'OK':
-                raise OfflineImapError("SEARCH in folder [%s]%s failed. "
-                    "Search string was '%s'. Server responded '[%s] %s'"% (
-                    self.getrepository(), self, search_cond, res_type, res_data),
-                    OfflineImapError.ERROR.FOLDER)
+            # Build SEARCH command.
+            if maxage == None:
+                search_cond = "(%s)"% ' '.join(conditions)
+                search_result = search(search_cond)
+            else:
+                # Getting the messages within maxage is not enough. We want all
+                # messages with UID > min_uid from these within-maxage messages.
+                # We can't rely on maxage only to get the proper min_uid because
+                # the internal date used by the SINCE command might slightly
+                # diverge from the date and time the message was assigned its
+                # UID. Same logic as applied for the Maildir, we have to
+                # re-include some messages.
+                #
+                # Ordering by UID is the same as ordering by MSN, so we get the
+                # messages with MSN > min_msn of the within-maxage messages.
+                msg_seq_numbers = map(lambda s : int(s), search_result)
+                if len(msg_seq_numbers) < 1:
+                    return None # Nothing to sync.
+                min_msn = min(msg_seq_numbers)
+                # If no maxsize, can just ask for all messages with MSN > min_msn.
+                search_cond = "%d:*"% min_msn
+                if maxsize != None:
+                    # Restrict the range min_msn:* to those with acceptable size.
+                    # Single-quotes prevent imaplib2 from quoting the sequence.
+                    search_cond = "'%s (SMALLER %d)'"% (search_cond, maxsize)
+                # Having to make a second query sucks but this is only for
+                # IMAP/IMAP configurations with maxage enabled. We assume this
+                # is not so common. The time overhead should be acceptable
+                # given how much syncing maxage lets us avoid in the first
+                # place.
+                search_result = search(search_cond)
             # Resulting MSN are separated by space, coalesce into ranges
-            return imaputil.uid_sequence(res_data[0].split())
-        # By default examine all messages in this folder.
+            return imaputil.uid_sequence(search_result)
+
+        # By default consider all messages in this folder.
         return '1:*'
 
     # Interface from BaseFolder
diff --git a/offlineimap/folder/Maildir.py b/offlineimap/folder/Maildir.py
index 308abf3..ddc66bf 100644
--- a/offlineimap/folder/Maildir.py
+++ b/offlineimap/folder/Maildir.py
@@ -94,7 +94,7 @@ class MaildirFolder(BaseFolder):
     # Checks to see if the given message is within the maximum age according
     # to the maildir name which should begin with a timestamp
     def _iswithinmaxage(self, messagename, maxage):
-        # In order to have the same behaviour as SINCE in an IMAP search
+        # In order to have similar behaviour as SINCE in an IMAP search
         # we must convert this to the oldest time and then strip off hrs/mins
         # from that day.
         oldest_time_utc = time.time() - (60*60*24*maxage)

-- 
Nicolas Sebrecht



