PARTIALLY REMOVING MAXAGE (was: [PATCH v4] make maxage use UIDs to avoid timezone issues)
Nicolas Sebrecht
nicolas.s-dev at laposte.net
Thu Apr 2 09:50:21 BST 2015
This is PATCH v8.
On Wed, Apr 01, 2015 at 05:11:08PM -0400, Janna Martl wrote:
> 1. When using maxage, local and remote messagelists are supposed to only
> contain messages from at most maxage days ago. But local and remote used
> different timezones to calculate what "maxage days ago" means, resulting
> in loss of mail.
s,loss of mail,mail removals on one side,
> Now, we ask the local folder for maxage days' worth of
> mail, find the lowest UID, and then ask the remote folder for all UID's
> starting with that lowest one.
>
> 2. maxage was fundamentally wrong in the IMAP-IMAP case: it assumed that
> remote messages have UIDs in the same order as their local counterparts,
> which could be false, e.g. when messages are copied in quick succession.
> So, remove support for maxage in the IMAP-IMAP case.
>
> 3. Add startdate option for IMAP-IMAP syncs: use messages from the given
> repository starting at startdate, and all messages from the other
> repository. In the first sync, the other repository must be empty.
>
> 4. Allow maxage to be specified either as number of days to sync (as
> previously) or as a fixed date.
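A minimal sketch of the selection logic described in point 1, in plain Python 3. The dicts mapping UID to epoch timestamp and the name `select_uids` are assumptions made for illustration, not OfflineIMAP structures. It ignores new local messages without a UID and the fallback used when no local message is within date.

```python
def select_uids(local, remote, cutoff):
    """Pick the UIDs to sync on each side.

    local, remote: dicts mapping UID -> epoch seconds (hypothetical
    stand-ins for the folder message lists).
    cutoff: epoch seconds; the date restriction is applied locally only.
    """
    within_date = [uid for uid, ts in local.items()
                   if uid > 0 and ts >= cutoff]
    if not within_date:
        # Simplification: the patch falls back to a date search here.
        return [], []
    min_uid = min(within_date)
    # Both sides are now restricted by UID rather than by date, so a
    # timezone difference cannot make the two lists disagree.
    local_uids = sorted(uid for uid in local if uid >= min_uid)
    remote_uids = sorted(uid for uid in remote if uid >= min_uid)
    return local_uids, remote_uids


# UID 3 is older than the cutoff but has a UID above the minimum, so it
# is still selected: this is the edge case documented in offlineimap.txt.
print(select_uids({1: 100, 2: 300, 3: 50}, {1: 0, 2: 0, 3: 0, 4: 0}, 200))
```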
> ---
> docs/offlineimap.txt | 18 ++++-
> offlineimap.conf | 46 ++++++++---
> offlineimap/accounts.py | 151 +++++++++++++++++++++++++++++--------
> offlineimap/folder/Base.py | 67 ++++++++++++++++
> offlineimap/folder/Gmail.py | 8 +-
> offlineimap/folder/GmailMaildir.py | 4 +-
> offlineimap/folder/IMAP.py | 95 ++++++++++++-----------
> offlineimap/folder/Maildir.py | 82 +++++++++++++-------
> 8 files changed, 345 insertions(+), 126 deletions(-)
>
> diff --git a/docs/offlineimap.txt b/docs/offlineimap.txt
> index 858fc0b..618f2ab 100644
> --- a/docs/offlineimap.txt
> +++ b/docs/offlineimap.txt
> @@ -135,7 +135,8 @@ Ignore any autorefresh setting in the configuration file.
> Run only quick synchronizations.
> +
> Ignore any flag updates on IMAP servers. If a flag on the remote IMAP changes,
> -and we have the message locally, it will be left untouched in a quick run.
> +and we have the message locally, it will be left untouched in a quick run. This
> +option is ignored if maxage is set.
>
>
> -u <UI>::
> @@ -400,8 +401,19 @@ If you then point your local mutt, or whatever MUA you use to `~/mail/`
> as root, it should still recognize all folders.
>
>
> -Authors
> --------
> +* Edge cases with maxage causing too many messages to be synced.
> ++
> +All messages from at most maxage days ago (+/- a few hours, depending on
> +timezones) are synced, but there are cases in which older messages can also be
> +synced. This happens when a message's UID is significantly higher than those of
> +other messages with similar dates, e.g. when messages are added to the local
> +folder behind offlineimap's back, causing them to get assigned a new UID, or
> +when offlineimap first syncs a pre-existing Maildir. In the latter case, it
> +could appear as if a noticeable and random subset of old messages are synced.
> +
> +
> +Main authors
> +------------
>
> John Goerzen, Sebastian Spaetz, Eygene Ryabinkin, Nicolas Sebrecht.
>
> diff --git a/offlineimap.conf b/offlineimap.conf
> index 5bc48a8..525e3f4 100644
> --- a/offlineimap.conf
> +++ b/offlineimap.conf
> @@ -260,6 +260,8 @@ remoterepository = RemoteExample
> # This option stands in the [Account Test] section.
> #
> # OfflineImap can replace a number of full updates by quick synchronizations.
> +# This option is ignored if maxage or startdate are used.
> +#
> # It only synchronizes a folder if
> #
> # 1) a Maildir folder has changed
> @@ -327,21 +329,26 @@ remoterepository = RemoteExample
>
> # This option stands in the [Account Test] section.
> #
> -# When you are starting to sync an already existing account you can tell
> -# OfflineIMAP to sync messages from only the last x days. When you do this,
> -# messages older than x days will be completely ignored. This can be useful for
> -# importing existing accounts when you do not want to download large amounts of
> -# archive email.
> +# maxage enables you to sync only recent messages. There are two ways to specify
> +# what "recent" means: if maxage is given as an integer, then only messages from
> +# the last maxage days will be synced. If maxage is given as a date, then only
> +# messages later than that date will be synced.
> +#
> +# Messages older than the cutoff will not be synced, their flags will not be
> +# changed, they will not be deleted, etc. For OfflineIMAP it will be like these
> +# messages do not exist. This will perform an IMAP search in the case of IMAP or
> +# Gmail and therefore requires that the server support server side searching.
> +#
> +# Known edge cases are described in offlineimap(1).
> #
> -# Messages older than maxage days will not be synced, their flags will not be
> -# changed, they will not be deleted, etc. For OfflineIMAP it will be like these
> -# messages do not exist. This will perform an IMAP search in the case of IMAP
> -# or Gmail and therefore requires that the server support server side searching.
> -# This will calculate the earliest day that would be included in the search and
> -# include all messages from that day until today. The maxage option expects an
> -# integer (for the number of days).
> +# maxage is allowed only when the local folder is of type Maildir. It can't be
> +# used with startdate.
> +#
> +# The maxage option expects an integer (for the number of days) or a date of the
> +# form yyyy-mm-dd.
> #
> #maxage = 3
> +#maxage = 2015-04-01
>
>
> # This option stands in the [Account Test] section.
> @@ -448,6 +455,21 @@ localfolders = ~/Test
>
> # This option stands in the [Repository LocalExample] section.
> #
> +# startdate syncs mails starting from a given date. It applies the date
> +# restriction to LocalExample only. The remote repository MUST be empty
> +# at the first sync where this option is used.
> +#
> +# Unlike maxage, this is supported for IMAP-IMAP sync.
> +#
> +# startdate can't be used with maxage.
> +#
> +# The startdate option expects a date in the format yyyy-mm-dd.
> +#
> +#startdate = 2015-04-01
> +
> +
> +# This option stands in the [Repository LocalExample] section.
> +#
> # Some users may not want the atime (last access time) of folders to be
> # modified by OfflineIMAP. If 'restoreatime' is set to yes, OfflineIMAP
> # will restore the atime of the "new" and "cur" folders in each maildir
> diff --git a/offlineimap/accounts.py b/offlineimap/accounts.py
> index cac4d88..b192eca 100644
> --- a/offlineimap/accounts.py
> +++ b/offlineimap/accounts.py
> @@ -17,10 +17,11 @@
> from subprocess import Popen, PIPE
> from threading import Event
> import os
> +import time
> from sys import exc_info
> import traceback
>
> -from offlineimap import mbnames, CustomConfig, OfflineImapError
> +from offlineimap import mbnames, CustomConfig, OfflineImapError, imaplibutil
> from offlineimap import globals
> from offlineimap.repository import Repository
> from offlineimap.ui import getglobalui
> @@ -402,6 +403,87 @@ def syncfolder(account, remotefolder, quick):
>
> Filtered folders on the remote side will not invoke this function."""
>
> + def check_uid_validity(localfolder, remotefolder, statusfolder):
> + # If either the local or the status folder has messages and
> + # there is a UID validity problem, warn and abort. If there are
> + # no messages, UW IMAPd loses UIDVALIDITY. But we don't really
> + # need it if both local folders are empty. So, in that case,
> + # just save it off.
> + if localfolder.getmessagecount() or statusfolder.getmessagecount():
Shouldn't it be:
if localfolder.getmessagecount() > 0 and statusfolder.getmessagecount() > 0:
                                     ^^^
?
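For reference, the two conditions differ only when exactly one of the two folders is empty. A standalone sketch, with plain integers standing in for the getmessagecount() results and made-up function names:

```python
def needs_check_or(local_count, status_count):
    # Condition as written in the patch.
    return bool(local_count or status_count)


def needs_check_and(local_count, status_count):
    # Condition as suggested above.
    return local_count > 0 and status_count > 0


for counts in [(0, 0), (3, 0), (0, 3), (3, 3)]:
    print(counts, needs_check_or(*counts), needs_check_and(*counts))
```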
> + if not localfolder.check_uidvalidity():
> + ui.validityproblem(localfolder)
> + localrepos.restore_atime()
> + return
> + if not remotefolder.check_uidvalidity():
> + ui.validityproblem(remotefolder)
> + localrepos.restore_atime()
> + return
> + else:
> + # Both folders empty, just save new UIDVALIDITY
> + localfolder.save_uidvalidity()
> + remotefolder.save_uidvalidity()
> +
> + def save_min_uid(folder, min_uid):
> + uidfile = folder.get_min_uid_file()
> + fd = open(uidfile, 'wt')
> + fd.write(str(min_uid) + "\n")
> + fd.close()
> +
> + def cachemessagelists_by_date(localfolder, remotefolder, date):
def cachemessagelists_upto_date(localfolder, remotefolder, date):
or
def cachemessagelists_within_date(localfolder, remotefolder, date):
"by" reads as sorting or creating a dict with dates for keys.
> + """ Returns messages with uid > min(uids of within-date
> + messages)."""
> +
> + localfolder.cachemessagelist(min_date=date)
> + check_uid_validity(localfolder, remotefolder, statusfolder)
> + # local messagelist had date restriction applied already. Restrict
> + # sync to messages with UIDs >= min_uid from this list.
> + #
> + # local messagelist might contain new messages (with uid's < 0).
> + positive_uids = filter(
> + lambda uid: uid > 0, localfolder.getmessageuidlist())
> + if len(positive_uids) > 0:
> + remotefolder.cachemessagelist(min_uid=min(positive_uids))
> + else:
> + # No messages with UID > 0 in range in localfolder.
> + # date restriction was applied with respect to local dates but
> + # remote folder timezone might be different from local, so be
> + # safe and make sure the range isn't bigger than in local.
> + remotefolder.cachemessagelist(
> + min_date=time.gmtime(time.mktime(date) + 24*60*60))
> +
> + def cachemessagelists_startdate(new, partial, date):
> + """ Retrieve messagelists when startdate has been set for
> + the folder 'partial'.
> +
> + Idea: suppose you want to clone the messages after date in one
> + account (partial) to a new one (new). If new is empty, then copy
> + messages in partial newer than date to new, and keep track of the
> + min uid. On subsequent syncs, sync all the messages in new against
> + those after that min uid in partial. This is a partial replacement
> + for maxage in the IMAP-IMAP sync case, where maxage doesn't work:
> + the UIDs of the messages in localfolder might not be in the same
> + order as those of corresponding messages in remotefolder, so if L in
> + local corresponds to R in remote, the ranges [L, ...] and [R, ...]
> + might not correspond. But, if we're cloning a folder into a new one,
> + [min_uid, ...] does correspond to [1, ...].
> +
> + This is just for IMAP-IMAP. For Maildir-IMAP, use maxage instead.
> + """
> +
> + new.cachemessagelist()
> + if not new.getmessageuidlist():
All those y.getmessageuidlist() checks should be in the form:
len(y.getmessageuidlist()) == 0
to make it clear what we are checking.
That being said, this check looks wrong. We'd rather check for the
existence of a cached min_uid.
This is because we want to remove [min_uid:*] on partial if all messages
on new have been removed. Syncing with an empty new.getmessageuidlist()
against partial is a valid use case.
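An untested sketch of such a check, in Python 3. `has_cached_min_uid` is a hypothetical helper, and the path is passed in directly to keep it standalone; in the patch it would come from get_min_uid_file().

```python
import os


def has_cached_min_uid(uidfile):
    """Return True if a previous sync saved a usable min UID.

    uidfile: path to the file written by save_min_uid() (assumed to
    hold a single integer on its first line).
    """
    try:
        with open(uidfile, 'rt') as fd:
            int(fd.readline().strip())
        return True
    except (OSError, ValueError):
        # Missing, unreadable or malformed file: treat as "no cache".
        return False
```

cachemessagelists_startdate() could then branch on this result instead of on new.getmessageuidlist(), so that an emptied `new` folder still syncs against [min_uid:*] on partial.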
> + partial.cachemessagelist(min_date=date)
> + uids = partial.getmessageuidlist()
> + if len(uids) > 0:
> + min_uid = min(uids)
> + else:
> + min_uid = 1
> + save_min_uid(partial, min_uid)
> + else:
> + min_uid = partial.retrieve_min_uid()
> + partial.cachemessagelist(min_uid=min_uid)
> +
> +
> remoterepos = account.remoterepos
> localrepos = account.localrepos
> statusrepos = account.statusrepos
> @@ -429,43 +511,46 @@ def syncfolder(account, remotefolder, quick):
>
> statusfolder.cachemessagelist()
>
> - if quick:
> - if (not localfolder.quickchanged(statusfolder) and
> - not remotefolder.quickchanged(statusfolder)):
> - ui.skippingfolder(remotefolder)
> - localrepos.restore_atime()
> - return
>
> # Load local folder.
> ui.syncingfolder(remoterepos, remotefolder, localrepos, localfolder)
> - ui.loadmessagelist(localrepos, localfolder)
> - localfolder.cachemessagelist()
> - ui.messagelistloaded(localrepos, localfolder, localfolder.getmessagecount())
>
> - # If either the local or the status folder has messages and
> - # there is a UID validity problem, warn and abort. If there are
> - # no messages, UW IMAPd loses UIDVALIDITY. But we don't really
> - # need it if both local folders are empty. So, in that case,
> - # just save it off.
> - if localfolder.getmessagecount() or statusfolder.getmessagecount():
> - if not localfolder.check_uidvalidity():
> - ui.validityproblem(localfolder)
> - localrepos.restore_atime()
> - return
> - if not remotefolder.check_uidvalidity():
> - ui.validityproblem(remotefolder)
> - localrepos.restore_atime()
> - return
> + # Retrieve messagelists, taking into account age-restriction
> + # options
> + maxage = localfolder.getmaxage()
> + localstart = localfolder.getstartdate()
> + remotestart = remotefolder.getstartdate()
> + if (maxage != None) + (localstart != None) + (remotestart != None) > 1:
> + raise OfflineImapError("You can set at most one of the "
> + "following: maxage, startdate (for the local folder), "
> + "startdate (for the remote folder)",
> + OfflineImapError.ERROR.REPO), None, exc_info()[2]
> + if (maxage != None or localstart or remotestart) and quick:
> + # IMAP quickchanged isn't compatible with options that
> + # involve restricting the messagelist, since the "quick"
> + # check can only retrieve a full list of UIDs in the folder.
> + ui.warn("Quick syncs (-q) not supported in conjunction "
> + "with maxage or startdate; ignoring -q.")
> + if maxage != None:
> + cachemessagelists_by_date(localfolder, remotefolder, maxage)
Shouldn't we check_uid_validity() here, too?
> + elif localstart != None:
> + cachemessagelists_startdate(remotefolder, localfolder,
> + localstart)
> + check_uid_validity(localfolder, remotefolder, statusfolder)
> + elif remotestart != None:
> + cachemessagelists_startdate(localfolder, remotefolder,
> + remotestart)
> + check_uid_validity(localfolder, remotefolder, statusfolder)
> else:
> - # Both folders empty, just save new UIDVALIDITY
> - localfolder.save_uidvalidity()
> - remotefolder.save_uidvalidity()
> -
> - # Load remote folder.
> - ui.loadmessagelist(remoterepos, remotefolder)
> - remotefolder.cachemessagelist()
> - ui.messagelistloaded(remoterepos, remotefolder,
> - remotefolder.getmessagecount())
> + localfolder.cachemessagelist()
> + if quick:
> + if (not localfolder.quickchanged(statusfolder) and
> + not remotefolder.quickchanged(statusfolder)):
> + ui.skippingfolder(remotefolder)
> + localrepos.restore_atime()
> + return
> + check_uid_validity(localfolder, remotefolder, statusfolder)
> + remotefolder.cachemessagelist()
>
> # Synchronize remote changes.
> if not localrepos.getconfboolean('readonly', False):
> diff --git a/offlineimap/folder/Base.py b/offlineimap/folder/Base.py
> index 16b5819..e52bec8 100644
> --- a/offlineimap/folder/Base.py
> +++ b/offlineimap/folder/Base.py
> @@ -17,6 +17,7 @@
>
> import os.path
> import re
> +import time
> from sys import exc_info
>
> from offlineimap import threadutil, emailutil
> @@ -298,6 +299,72 @@ class BaseFolder(object):
>
> raise NotImplementedError
>
> + def getmaxage(self):
> + """ maxage is allowed to be either an integer or a date of the
> + form YYYY-mm-dd. This returns a time_struct. """
Good!
> +
> + maxagestr = self.config.getdefault("Account %s"%
> + self.accountname, "maxage", None)
> + if not maxagestr:
if maxagestr == None:
> + return None
> + # is it a number?
> + try:
> + maxage = int(maxagestr)
Handle maxage < 1 (or perhaps < 2).
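A rough standalone sketch of that validation, in Python 3. `parse_maxage` is a made-up name, it raises ValueError where the patch would raise OfflineImapError, and the threshold of 1 day is an assumption.

```python
import time


def parse_maxage(maxagestr, now=None):
    """Parse maxage as a number of days or as yyyy-mm-dd.

    Returns a time.struct_time, or None when maxage is unset.
    """
    if maxagestr is None:
        return None
    if now is None:
        now = time.time()
    try:
        days = int(maxagestr)
    except ValueError:
        # Not an integer, so expect a date. strptime() itself raises
        # ValueError on malformed input.
        return time.strptime(maxagestr, "%Y-%m-%d")
    if days < 1:
        # Assumed lower bound; reject zero and negative values.
        raise ValueError("maxage must be at least 1 day, got %d" % days)
    return time.gmtime(now - 60 * 60 * 24 * days)
```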
> + return time.gmtime(time.time() - 60*60*24*maxage)
> + except ValueError:
> + pass
> + # is it a date string?
> + try:
> + date = time.strptime(maxagestr, "%Y-%m-%d")
> + if date[0] < 1900:
> + raise OfflineImapError("maxage led to year %d. "
> + "Abort syncing."% date[0],
> + OfflineImapError.ERROR.MESSAGE)
> + return date
> + except ValueError:
> + raise OfflineImapError("invalid maxage value %s",
> + OfflineImapError.ERROR.MESSAGE)
> +
> + def getmaxsize(self):
> + return self.config.getdefaultint("Account %s"%
> + self.accountname, "maxsize", None)
> +
> + def getstartdate(self):
> + """ Retrieve the value of the configuration option startdate """
> + datestr = self.config.getdefault("Repository " + self.repository.name,
> + 'startdate', None)
> + try:
> + if not datestr:
> + return None
> + date = time.strptime(datestr, "%Y-%m-%d")
> + if date[0] < 1900:
> + raise OfflineImapError("startdate led to year %d. "
> + "Abort syncing."% date[0],
> + OfflineImapError.ERROR.MESSAGE)
> + return date
> + except ValueError:
> + raise OfflineImapError("invalid startdate value %s",
> + OfflineImapError.ERROR.MESSAGE)
> +
> + def get_min_uid_file(self):
> + startuiddir = os.path.join(self.config.getmetadatadir(),
> + 'Repository-' + self.repository.name, 'StartUID')
> + if not os.path.exists(startuiddir):
> + os.mkdir(startuiddir, 0o700)
> + return os.path.join(startuiddir, self.getfolderbasename())
> +
> + def retrieve_min_uid(self):
> + uidfile = self.get_min_uid_file()
> + try:
> + fd = open(uidfile, 'rt')
> + min_uid = long(fd.readline().strip())
> + fd.close()
> + return min_uid
> + except:
> + raise IOError("Can't read %s. To start using startdate, "\
> + "folder must be empty"% uidfile)
> +
> +
> def savemessage(self, uid, content, flags, rtime):
> """Writes a new message, with the specified uid.
>
> diff --git a/offlineimap/folder/Gmail.py b/offlineimap/folder/Gmail.py
> index 1afbe47..354d544 100644
> --- a/offlineimap/folder/Gmail.py
> +++ b/offlineimap/folder/Gmail.py
> @@ -121,16 +121,18 @@ class GmailFolder(IMAPFolder):
>
> # TODO: merge this code with the parent's cachemessagelist:
> # TODO: they have too much common logics.
> - def cachemessagelist(self):
> + def cachemessagelist(self, min_date=None, min_uid=None):
> if not self.synclabels:
> - return super(GmailFolder, self).cachemessagelist()
> + return super(GmailFolder, self).cachemessagelist(min_date=min_date,
> + min_uid=min_uid)
>
> self.messagelist = {}
>
> self.ui.collectingdata(None, self)
> imapobj = self.imapserver.acquireconnection()
> try:
> - msgsToFetch = self._msgs_to_fetch(imapobj)
> + msgsToFetch = self._msgs_to_fetch(imapobj, min_date=min_date,
> + min_uid=min_uid)
> if not msgsToFetch:
> return # No messages to sync
>
> diff --git a/offlineimap/folder/GmailMaildir.py b/offlineimap/folder/GmailMaildir.py
> index 894792d..0ae00bf 100644
> --- a/offlineimap/folder/GmailMaildir.py
> +++ b/offlineimap/folder/GmailMaildir.py
> @@ -64,9 +64,9 @@ class GmailMaildirFolder(MaildirFolder):
> 'filename': '/no-dir/no-such-file/', 'mtime': 0}
>
>
> - def cachemessagelist(self):
> + def cachemessagelist(self, maxage=None, min_uid=None):
> if self.ismessagelistempty():
> - self.messagelist = self._scanfolder()
> + self.messagelist = self._scanfolder(maxage=maxage, min_uid=min_uid)
>
> # Get mtimes
> if self.synclabels:
> diff --git a/offlineimap/folder/IMAP.py b/offlineimap/folder/IMAP.py
> index 4b470a2..253ac97 100644
> --- a/offlineimap/folder/IMAP.py
> +++ b/offlineimap/folder/IMAP.py
> @@ -18,6 +18,7 @@
> import random
> import binascii
> import re
> +import os
> import time
> from sys import exc_info
>
> @@ -79,6 +80,12 @@ class IMAPFolder(BaseFolder):
> def waitforthread(self):
> self.imapserver.connectionwait()
>
> + def getmaxage(self):
> + if self.config.getdefault("Account %s"%
> + self.accountname, "maxage", None):
> + raise OfflineImapError("maxage is not supported on IMAP-IMAP sync",
> + OfflineImapError.ERROR.REPO), None, exc_info()[2]
> +
> # Interface from BaseFolder
> def getcopyinstancelimit(self):
> return 'MSGCOPY_' + self.repository.getname()
> @@ -143,8 +150,7 @@ class IMAPFolder(BaseFolder):
> return True
> return False
>
> -
> - def _msgs_to_fetch(self, imapobj):
> + def _msgs_to_fetch(self, imapobj, min_date=None, min_uid=None):
> """Determines sequence numbers of messages to be fetched.
>
> Message sequence numbers (MSNs) are more easily compacted
> @@ -152,57 +158,55 @@ class IMAPFolder(BaseFolder):
>
> Arguments:
> - imapobj: instance of IMAPlib
> + - min_date (optional): a time_struct; only fetch messages newer than this
> + - min_uid (optional): only fetch messages with UID >= min_uid
> +
> + This function should be called with at MOST one of min_date OR
> + min_uid set but not BOTH.
>
> Returns: range(s) for messages or None if no messages
> are to be fetched."""
>
> - res_type, imapdata = imapobj.select(self.getfullname(), True, True)
> - if imapdata == [None] or imapdata[0] == '0':
> - # Empty folder, no need to populate message list
> - return None
> + def search(search_conditions):
> + """Actually request the server with the specified conditions.
>
> - # By default examine all messages in this folder
> - msgsToFetch = '1:*'
> -
> - maxage = self.config.getdefaultint(
> - "Account %s"% self.accountname, "maxage", -1)
> - maxsize = self.config.getdefaultint(
> - "Account %s"% self.accountname, "maxsize", -1)
> -
> - # Build search condition
> - if (maxage != -1) | (maxsize != -1):
> - search_cond = "(";
> -
> - if(maxage != -1):
> - #find out what the oldest message is that we should look at
> - oldest_struct = time.gmtime(time.time() - (60*60*24*maxage))
> - if oldest_struct[0] < 1900:
> - raise OfflineImapError("maxage setting led to year %d. "
> - "Abort syncing."% oldest_struct[0],
> - OfflineImapError.ERROR.REPO)
> - search_cond += "SINCE %02d-%s-%d"% (
> - oldest_struct[2],
> - MonthNames[oldest_struct[1]],
> - oldest_struct[0])
> -
> - if(maxsize != -1):
> - if(maxage != -1): # There are two conditions, add space
> - search_cond += " "
> - search_cond += "SMALLER %d"% maxsize
> -
> - search_cond += ")"
> -
> - res_type, res_data = imapobj.search(None, search_cond)
> + Returns: range(s) for messages or None if no messages
> + are to be fetched."""
> + res_type, res_data = imapobj.search(None, search_conditions)
> if res_type != 'OK':
> raise OfflineImapError("SEARCH in folder [%s]%s failed. "
> "Search string was '%s'. Server responded '[%s] %s'"% (
> self.getrepository(), self, search_cond, res_type, res_data),
> OfflineImapError.ERROR.FOLDER)
> + return res_data[0].split()
>
> - # Resulting MSN are separated by space, coalesce into ranges
> - msgsToFetch = imaputil.uid_sequence(res_data[0].split())
> + res_type, imapdata = imapobj.select(self.getfullname(), True, True)
> + if imapdata == [None] or imapdata[0] == '0':
> + # Empty folder, no need to populate message list.
> + return None
>
> - return msgsToFetch
> + conditions = []
> + # 1. min_uid condition.
> + if min_uid != None:
> + conditions.append("UID %d:*"% min_uid)
> + # 2. date condition.
> + elif min_date != None:
> + # Find out what the oldest message is that we should look at.
> + conditions.append("SINCE %02d-%s-%d"% (
> + min_date[2], MonthNames[min_date[1]], min_date[0]))
> + # 3. maxsize condition.
> + maxsize = self.getmaxsize()
> + if maxsize != None:
> + conditions.append("SMALLER %d"% maxsize)
> +
> + if len(conditions) >= 1:
> + # Build SEARCH command.
> + search_cond = "(%s)"% ' '.join(conditions)
> + search_result = search(search_cond)
> + return imaputil.uid_sequence(search_result)
> +
> + # By default consider all messages in this folder.
> + return '1:*'
>
> # Interface from BaseFolder
> def msglist_item_initializer(self, uid):
> @@ -210,19 +214,21 @@ class IMAPFolder(BaseFolder):
>
>
> # Interface from BaseFolder
> - def cachemessagelist(self):
> + def cachemessagelist(self, min_date=None, min_uid=None):
> + self.ui.loadmessagelist(self.repository, self)
> self.messagelist = {}
>
> imapobj = self.imapserver.acquireconnection()
> try:
> - msgsToFetch = self._msgs_to_fetch(imapobj)
> + msgsToFetch = self._msgs_to_fetch(
> + imapobj, min_date=min_date, min_uid=min_uid)
> if not msgsToFetch:
> return # No messages to sync
>
> # Get the flags and UIDs for these. single-quotes prevent
> # imaplib2 from quoting the sequence.
> res_type, response = imapobj.fetch("'%s'"%
> - msgsToFetch, '(FLAGS UID)')
> + msgsToFetch, '(FLAGS UID INTERNALDATE)')
> if res_type != 'OK':
> raise OfflineImapError("FETCHING UIDs in folder [%s]%s failed. "
> "Server responded '[%s] %s'"% (self.getrepository(), self,
> @@ -247,6 +253,7 @@ class IMAPFolder(BaseFolder):
> flags = imaputil.flagsimap2maildir(options['FLAGS'])
> rtime = imaplibutil.Internaldate2epoch(messagestr)
> self.messagelist[uid] = {'uid': uid, 'flags': flags, 'time': rtime}
> + self.ui.messagelistloaded(self.repository, self, self.getmessagecount())
>
> def dropmessagelistcache(self):
> self.messagelist = {}
> diff --git a/offlineimap/folder/Maildir.py b/offlineimap/folder/Maildir.py
> index 79c34a7..d400a3f 100644
> --- a/offlineimap/folder/Maildir.py
> +++ b/offlineimap/folder/Maildir.py
> @@ -91,25 +91,17 @@ class MaildirFolder(BaseFolder):
> token."""
> return 42
>
> - # Checks to see if the given message is within the maximum age according
> - # to the maildir name which should begin with a timestamp
> - def _iswithinmaxage(self, messagename, maxage):
> - # In order to have the same behaviour as SINCE in an IMAP search
> - # we must convert this to the oldest time and then strip off hrs/mins
> - # from that day.
> - oldest_time_utc = time.time() - (60*60*24*maxage)
> - oldest_time_struct = time.gmtime(oldest_time_utc)
> - oldest_time_today_seconds = ((oldest_time_struct[3] * 3600) \
> - + (oldest_time_struct[4] * 60) \
> - + oldest_time_struct[5])
> - oldest_time_utc -= oldest_time_today_seconds
> + def _iswithintime(self, messagename, date):
> + """Check to see if the given message is newer than date (a
> + time_struct) according to the maildir name which should begin
> + with a timestamp."""
>
> timestampmatch = re_timestampmatch.search(messagename)
> if not timestampmatch:
> return True
> timestampstr = timestampmatch.group()
> timestamplong = long(timestampstr)
> - if(timestamplong < oldest_time_utc):
> + if(timestamplong < time.mktime(date)):
> return False
> else:
> return True
> @@ -150,18 +142,21 @@ class MaildirFolder(BaseFolder):
> flags = set((c for c in flagmatch.group(1) if not c.islower()))
> return prefix, uid, fmd5, flags
>
> - def _scanfolder(self):
> + def _scanfolder(self, min_date=None, min_uid=None):
> """Cache the message list from a Maildir.
>
> + If min_date is set, this finds the min UID of all messages newer than
> + min_date and uses it as the real cutoff for considering messages.
> + This handles the edge cases where the date is much earlier than messages
> + with similar UID's (e.g. the UID was reassigned much later).
> +
> Maildir flags are: R (replied) S (seen) T (trashed) D (draft) F
> (flagged).
> :returns: dict that can be used as self.messagelist.
> """
>
> - maxage = self.config.getdefaultint("Account " + self.accountname,
> - "maxage", None)
> - maxsize = self.config.getdefaultint("Account " + self.accountname,
> - "maxsize", None)
> + maxsize = self.getmaxsize()
> +
> retval = {}
> files = []
> nouidcounter = -1 # Messages without UIDs get negative UIDs.
> @@ -170,12 +165,11 @@ class MaildirFolder(BaseFolder):
> files.extend((dirannex, filename) for
> filename in os.listdir(fulldirname))
>
> + date_excludees = {}
> for dirannex, filename in files:
> # We store just dirannex and filename, ie 'cur/123...'
> filepath = os.path.join(dirannex, filename)
> - # Check maxage/maxsize if this message should be considered.
> - if maxage and not self._iswithinmaxage(filename, maxage):
> - continue
> + # Check maxsize if this message should be considered.
> if maxsize and (os.path.getsize(os.path.join(
> self.getfullname(), filepath)) > maxsize):
> continue
> @@ -192,16 +186,43 @@ class MaildirFolder(BaseFolder):
> nouidcounter -= 1
> else:
> uid = long(uidmatch.group(1))
> - # 'filename' is 'dirannex/filename', e.g. cur/123,U=1,FMD5=1:2,S
> - retval[uid] = self.msglist_item_initializer(uid)
> - retval[uid]['flags'] = flags
> - retval[uid]['filename'] = filepath
> + if min_uid != None and uid > 0 and uid < min_uid:
> + continue
> + if min_date != None and not self._iswithintime(filename, min_date):
> + # Keep track of messages outside of the time limit, because they
> + # still might have UID > min(UIDs of within-min_date). We hit
> + # this case for maxage if any message had a known/valid datetime
> + # and was re-uploaded because the UID in the filename got lost
> + # (e.g. local copy/move). On next sync, it was assigned a new
> + # UID from the server and will be included in the SEARCH
> + # condition. So, we must re-include them later in this method
> + # in order to avoid inconsistent lists of messages.
> + date_excludees[uid] = self.msglist_item_initializer(uid)
> + date_excludees[uid]['flags'] = flags
> + date_excludees[uid]['filename'] = filepath
> + else:
> + # 'filename' is 'dirannex/filename', e.g. cur/123,U=1,FMD5=1:2,S
> + retval[uid] = self.msglist_item_initializer(uid)
> + retval[uid]['flags'] = flags
> + retval[uid]['filename'] = filepath
> + if min_date != None:
> + # Re-include messages with high enough uid's.
> + positive_uids = filter(lambda uid: uid > 0, retval)
> + if positive_uids:
> + min_uid = min(positive_uids)
> + for uid in date_excludees.keys():
> + if uid > min_uid:
> + # This message was originally excluded because of
> + # its date. It is re-included now because we want all
> + # messages with UID > min_uid.
> + retval[uid] = date_excludees[uid]
> return retval
>
> # Interface from BaseFolder
> def quickchanged(self, statusfolder):
> - """Returns True if the Maildir has changed"""
> - self.cachemessagelist()
> + """Returns True if the Maildir has changed
> +
> + Assumes cachemessagelist() has already been called """
> # Folder has different uids than statusfolder => TRUE.
> if sorted(self.getmessageuidlist()) != \
> sorted(statusfolder.getmessageuidlist()):
> @@ -218,9 +239,12 @@ class MaildirFolder(BaseFolder):
> return {'flags': set(), 'filename': '/no-dir/no-such-file/'}
>
> # Interface from BaseFolder
> - def cachemessagelist(self):
> + def cachemessagelist(self, min_date=None, min_uid=None):
> if self.ismessagelistempty():
> - self.messagelist = self._scanfolder()
> + self.ui.loadmessagelist(self.repository, self)
> + self.messagelist = self._scanfolder(min_date=min_date,
> + min_uid=min_uid)
> + self.ui.messagelistloaded(self.repository, self, self.getmessagecount())
>
> # Interface from BaseFolder
> def getmessagelist(self):
--
Nicolas Sebrecht