<DKIM> maxage causes loss of local email

Nicolas Sebrecht nicolas.s-dev at laposte.net
Mon Mar 9 10:09:13 UTC 2015


On Mon, Mar 09, 2015 at 02:47:11AM -0400, Janna Martl wrote:
> On Sat, Mar 07, 2015 at 01:37:30PM +0100, Nicolas Sebrecht wrote:

> I found a couple more bugs: the reason I wasn't getting correct times
> out of the server before is that the IMAP fetch operation was asking
> for uid and flags, but not internaldate (but then attempting to parse
> the result as if it had a date). Also, Internaldate2epoch() was just
> wrong (it parsed the time zone but didn't use it).

Good work! My comments below.

> Yes, I've been using it for the past couple days, and tried to check
> various cases that were previously problematic.

> Signed-off-by: Janna Martl <janna.martl109 at gmail.com>

Thanks, applied.

> >There is a little downside I didn't think about before, though. The last
> >downloaded mails won't anymore be the last as sorted by ls, find and
> >friends. If users have any kind of script that operates on the last
> >downloaded mails (e.g. hackish way for anti-virus/filtering processing),
> >such scripts will be broken.
> 
> True. find -ctime still works, though. Maybe the commit message should say
> something to this effect? E.g.:

Forgot to add it to the commit message (which I made long enough, BTW).

> Make new messages' filenames start with the internal date instead of
> the retrieval date; this fixes a bug with the maxage feature, where
> local messages would get deleted if their internal date and retrieval
> dates were significantly different. (Note that ls | head might now
> fail to find the most recently retrieved messages, but find -ctime
> still works.)

Could you patch the Changelog.rst about that so it won't be left behind?

> diff -Nur offlineimap/folder/Base.py offlineimap-patched/folder/Base.py
> --- offlineimap/folder/Base.py	2015-02-28 02:18:19.685320880 -0500
> +++ offlineimap-patched/folder/Base.py	2015-03-09 00:26:33.588696271 -0400

Please, work on the current 'next' branch, just updated. This will avoid
merge conflicts. If it's too much hard for this patch, let's continue
like that, I'll do the conflicts solving.

The most important thing to my eyes is the documentation. I think the
whole timezone/maxage issues are quiet hard to understand. That's why I
think the whole thing should be documented in length somewhere.

There's no one method handling the whole strategy, so I propose to add a
new section in docs/doc-src/API.rst about the current and previous
patches:

"""
A note about maxage.

It's not obvious to know if the mail is within maxage or not. The reason
is that IMAP servers are using different timzones for the SINCE IMAP
command.

The global strategy is to...
...

It's the responsability of:
- ... to...
- ... to...
"""

> @@ -18,6 +18,7 @@
>  import os.path
>  import re
>  from sys import exc_info
> +import time
>  
>  from offlineimap import threadutil, emailutil
>  from offlineimap import globals
> @@ -247,6 +248,26 @@
>  
>          raise NotImplementedError
>  
> +
> +    def time_is_within_maxage(self, t, maxage):
> +        """ Checks if time t (expressed as seconds since the epoch) is later than
> +        00:00 UTC on (today's date) - (maxage - 1 days). We do this because
> +        SINCE in an IMAP search has maximum granularity of days. """

    def s_time_is_within_d_range(self, t, d):
        """Check if time t is later than (today - d days)."""

        :param t: expressed as seconds since the epoch
        :param d: expressed as number of days

        :returns: True of False

        We do this because SINCE in an IMAP search has maximum
        granularity of days."""

> +
> +        #We must convert this to the oldest time and then strip off hrs/mins
> +        #from that day
           ^^

Add one space after the #.

I think the implementation get it wrong. If you just strip off the
hrs/mins, you are only be reducing up to

  (maxage - X hours) with X within [0; 24],

not the (maxage - 24 hours) we want.

I would let the decision of what's the correct range up to the caller:

  s_time_is_within_d_range(t, maxage - 1)

> +        oldest_time_utc = time.time() - (60*60*24*maxage)
> +        oldest_time_struct = time.gmtime(oldest_time_utc)
> +        oldest_time_today_seconds = ((oldest_time_struct[3] * 3600) \
> +            + (oldest_time_struct[4] * 60) \
> +            + oldest_time_struct[5])
> +        oldest_time_utc -= oldest_time_today_seconds
> +        if(t < oldest_time_utc):
> +            return False
> +        else:
> +            return True
> +
> +
>      def dropmessagelistcache(self):
>          raise NotImplementedException
>  
> diff -Nur offlineimap/folder/Gmail.py offlineimap-patched/folder/Gmail.py
> --- offlineimap/folder/Gmail.py	2015-02-28 02:18:19.685320880 -0500
> +++ offlineimap-patched/folder/Gmail.py	2015-03-09 00:23:49.276532639 -0400
> @@ -140,7 +140,7 @@
>              #
>              # NB: msgsToFetch are sequential numbers, not UID's
>              res_type, response = imapobj.fetch("'%s'" % msgsToFetch,
> -              '(FLAGS X-GM-LABELS UID)')
> +              '(FLAGS X-GM-LABELS UID INTERNALDATE)')
>              if res_type != 'OK':
>                  raise OfflineImapError("FETCHING UIDs in folder [%s]%s failed. " % \
>                    (self.getrepository(), self) + \
> @@ -163,7 +163,6 @@
>                                            minor = 1)
>              else:
>                  uid = long(options['UID'])
> -                self.messagelist[uid] = self.msglist_item_initializer(uid)
>                  flags = imaputil.flagsimap2maildir(options['FLAGS'])
>                  m = re.search('\(([^\)]*)\)', options['X-GM-LABELS'])
>                  if m:
> @@ -172,6 +171,14 @@
>                      labels = set()
>                  labels = labels - self.ignorelabels
>                  rtime = imaplibutil.Internaldate2epoch(messagestr)
> +                maxage = self.config.getdefaultint("Account %s"% self.accountname,
> +                                           "maxage", -1)

Indent only 4 spaces more. Splitting the line after '%s' is good
practice.

> +                # We'd asked the server for (maxage + 1) days of messages to be
> +                # safe since we don't know what timezone it's using; now throw
> +                # away the ones that aren't in the desired range.
> +                if maxage and not self.time_is_within_maxage(rtime, maxage):
> +                    continue
> +                self.messagelist[uid] = self.msglist_item_initializer(uid)
>                  self.messagelist[uid] = {'uid': uid, 'flags': flags, 'labels': labels, 'time': rtime}
>  
>      def savemessage(self, uid, content, flags, rtime):

Good.

> diff -Nur offlineimap/folder/IMAP.py offlineimap-patched/folder/IMAP.py
> --- offlineimap/folder/IMAP.py	2015-02-28 02:18:19.685320880 -0500
> +++ offlineimap-patched/folder/IMAP.py	2015-03-08 23:52:42.884093854 -0400
> @@ -175,7 +175,10 @@
>  
>              if(maxage != -1):
>                  #find out what the oldest message is that we should look at
                   ^^

Fix indent while there. ,-)

> -                oldest_struct = time.gmtime(time.time() - (60*60*24*maxage))
> +                # We don't know which timezone the server will interpret our
> +                # "SINCE <day>" query in, so fetch too many messages and throw

                # "SINCE <day>" query in, so fetch too many messages up to (maxage + 1).
                # We will throw away [...] later in <method/function name>.

> +                # away the ones that are too old later.
> +                oldest_struct = time.gmtime(time.time() - (60*60*24*(maxage + 1)))
>                  if oldest_struct[0] < 1900:
>                      raise OfflineImapError("maxage setting led to year %d. "
>                                             "Abort syncing." % oldest_struct[0],
> @@ -222,7 +225,7 @@
>              # Get the flags and UIDs for these. single-quotes prevent
>              # imaplib2 from quoting the sequence.
>              res_type, response = imapobj.fetch("'%s'"%
> -                msgsToFetch, '(FLAGS UID)')
> +                msgsToFetch, '(FLAGS UID INTERNALDATE)')
>              if res_type != 'OK':
>                  raise OfflineImapError("FETCHING UIDs in folder [%s]%s failed. "
>                                         "Server responded '[%s] %s'"% (
> @@ -245,9 +248,16 @@
>                                            minor = 1)
>              else:
>                  uid = long(options['UID'])
> -                self.messagelist[uid] = self.msglist_item_initializer(uid)
>                  flags = imaputil.flagsimap2maildir(options['FLAGS'])
>                  rtime = imaplibutil.Internaldate2epoch(messagestr)
> +                maxage = self.config.getdefaultint("Account %s"% self.accountname,
> +                                           "maxage", -1)

Fix indent.

> +                # We'd asked the server for (maxage + 1) days of messages to be
> +                # safe since we don't know what timezone it's using; now throw
> +                # away the ones that aren't in the desired range.
> +                if maxage and not self.time_is_within_maxage(rtime, maxage):
> +                    continue
> +                self.messagelist[uid] = self.msglist_item_initializer(uid)
>                  self.messagelist[uid] = {'uid': uid, 'flags': flags, 'time': rtime}
>  
>      def dropmessagelistcache(self):
> diff -Nur offlineimap/folder/Maildir.py offlineimap-patched/folder/Maildir.py
> --- offlineimap/folder/Maildir.py	2015-03-09 00:15:11.472543230 -0400
> +++ offlineimap-patched/folder/Maildir.py	2015-03-09 00:16:19.024803136 -0400
> @@ -94,25 +94,13 @@
>      #Checks to see if the given message is within the maximum age according
>      #to the maildir name which should begin with a timestamp
>      def _iswithinmaxage(self, messagename, maxage):
> -        #In order to have the same behaviour as SINCE in an IMAP search
> -        #we must convert this to the oldest time and then strip off hrs/mins
> -        #from that day
> -        oldest_time_utc = time.time() - (60*60*24*maxage)
> -        oldest_time_struct = time.gmtime(oldest_time_utc)
> -        oldest_time_today_seconds = ((oldest_time_struct[3] * 3600) \
> -            + (oldest_time_struct[4] * 60) \
> -            + oldest_time_struct[5])
> -        oldest_time_utc -= oldest_time_today_seconds
> -
>          timestampmatch = re_timestampmatch.search(messagename)
>          if not timestampmatch:
>              return True
>          timestampstr = timestampmatch.group()
>          timestamplong = long(timestampstr)
> -        if(timestamplong < oldest_time_utc):
> -            return False
> -        else:
> -            return True
> +        return self.time_is_within_maxage(timestamplong, maxage)

        # <why we are asking for (maxage - 1) instead of plain
        # (maxage).>
        return self.time_is_within_maxage(timestamplong, maxage - 1)

> +
>  
>      def _parse_filename(self, filename):
>          """Returns a messages file name components
> diff -Nur offlineimap/imaplibutil.py offlineimap-patched/imaplibutil.py
> --- offlineimap/imaplibutil.py	2015-02-28 02:18:19.685320880 -0500
> +++ offlineimap-patched/imaplibutil.py	2015-03-08 01:13:52.860248989 -0500
> @@ -178,6 +178,8 @@
>  
>      Returns seconds since the epoch."""
>  
> +    import calendar

This is a big import while we only use one function.

    from calendar import timegm

Is calendar a standard CPython library?

> +
>      mo = InternalDate.match(resp)
>      if not mo:
>          return None
> @@ -201,4 +203,4 @@
>  
>      tt = (year, mon, day, hour, min, sec, -1, -1, -1)
>  
> -    return time.mktime(tt)
> +    return calendar.timegm(tt) - zone

-- 
Nicolas Sebrecht



More information about the OfflineIMAP-project mailing list