<DKIM> maxage causes loss of local email

Nicolas Sebrecht nicolas.s-dev at laposte.net
Sun Mar 1 00:18:16 GMT 2015


On Sat, Feb 28, 2015 at 02:33:07PM -0500, Janna Martl wrote:

> I use the maxage setting, and for a long time I've noticed that every
> day, a random subset of mail of age maxage + 1 gets deleted locally
> (but not on the remote server). I think I've found the cause:
> 
> Normally, offlineimap deletes mail that exists locally and in the
> statusfolder, but not in the remote folder. When maxage comes into
> play, this means it deletes UIDs whose local incarnation is younger
> than maxage, but whose remote incarnation is older than maxage. Age of
> the local copy is determined by the first string of numbers in the
> filename, and it seems this is just the current time when the mail is
> retrieved. So the trouble happens when something is retrieved locally
> much later than its remote timestamp.
> 
> (As another consequence of this problem, if I have very old mail on
> the remote server that isn't in my local folder for some reason, I
> can't save it locally, because the new local file gets marked with the
> current timestamp, and then gets deleted on the next pass of
> offlineimap.)
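
To make the scenario above concrete, here is a minimal sketch of the deletion rule as Janna describes it. It is a toy model, not OfflineIMAP's actual code: the function names, the dictionaries standing in for the local, remote and status folders, and the sample filenames are all hypothetical.

```python
import re

DAY = 24 * 60 * 60  # seconds


def filename_timestamp(filename):
    """Return the first run of digits in a Maildir filename as an int."""
    match = re.match(r"(\d+)", filename)
    if match is None:
        raise ValueError("no leading timestamp in %r" % filename)
    return int(match.group(1))


def within_maxage(timestamp, now, maxage_days):
    """True if the timestamp is not older than maxage_days before now."""
    return timestamp >= now - maxage_days * DAY


def uids_deleted_locally(local, remote, status, now, maxage_days):
    """Model the rule: delete what is local and in the status folder,
    but not in the remote folder, once maxage has filtered both sides.

    local:  uid -> Maildir filename (age taken from the filename)
    remote: uid -> internal date as epoch seconds (assumed representation)
    status: set of uids recorded in the status folder
    """
    seen_local = {uid for uid, name in local.items()
                  if within_maxage(filename_timestamp(name), now, maxage_days)}
    seen_remote = {uid for uid, date in remote.items()
                   if within_maxage(date, now, maxage_days)}
    return sorted(uid for uid in seen_local
                  if uid in status and uid not in seen_remote)


# UID 7 is 31 days old on the server but was fetched only 10 days ago,
# so its filename timestamp is 10 days old. UID 8 is 5 days old on both sides.
now = 1425168000
local = {
    7: "%d_0.1234.host,U=7,FMD5=abc:2,S" % (now - 10 * DAY),
    8: "%d_0.1234.host,U=8,FMD5=abc:2,S" % (now - 5 * DAY),
}
remote = {7: now - 31 * DAY, 8: now - 5 * DAY}
status = {7, 8}
doomed = uids_deleted_locally(local, remote, status, now, maxage_days=30)
```

In this model UID 7 still passes the local age check but has dropped out of the remote view, so it looks like a message that was deleted on the server.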

I have to admit I understood almost nothing, starting from the first
sentence, which doesn't make sense to me. This makes me uncomfortable
because you seem to have a clear overview of what you're talking about.
Could you rewrite your analysis and provide more context about the
current vs expected behaviours? Giving references to the code (methods,
etc.) might be a good starting point.

> One solution would be to check a local message's internal timestamp in
> addition to the filename, when doing the maxage check. But this seems
> slow. Another solution would be to include the email's internal time
> in the filename. I wrote a (probably extremely clumsy!) patch that
> makes new Maildir filenames
>     (internal date)_(retrieval date)_(uniqueness string)...
> which, at least, fixes this problem for me.
> 
> Thoughts?
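
As an illustration of what any reader of the proposed filenames would have to do, here is a rough sketch that tells the current prefix (retrieval time, sequence number) apart from the patched one (internal date, retrieval time, sequence number). The function name is hypothetical, and the field layout is inferred only from the format strings in the patch quoted in this message.

```python
def split_time_prefix(filename):
    """Split the numeric prefix of a Maildir filename written by either format.

    Returns (internal_date, retrieval_time, sequence); internal_date is None
    for filenames in the current two-field format.
    """
    # Everything before the first '.' is the underscore-separated prefix.
    prefix = filename.split(".", 1)[0]
    fields = [int(field) for field in prefix.split("_")]
    if len(fields) == 3:
        internal_date, retrieval_time, sequence = fields
    elif len(fields) == 2:
        internal_date = None
        retrieval_time, sequence = fields
    else:
        raise ValueError("unexpected time prefix in %r" % filename)
    return internal_date, retrieval_time, sequence
```

Code that only reads the first run of digits would treat the internal date of a patched filename as if it were the retrieval time, so the two formats cannot be mixed without a check like the one above.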

Theoretically speaking, at least, maxage doesn't require any additional
data to work. I have no doubt about the issue you raise, but since the
feature has been there for a long time, I'm surprised that it can delete
emails.

<complaining to myself>
I wonder what feature is fully working in OfflineIMAP without any
cumbersome edge-case.
<complaining to myself/>

Anyway, tuning the Maildir format tends to be error-prone. Maildir
readers, both MUAs and IMAP servers, usually have unexpected expectations
about it. So this is an "avoid changes as much as possible" area.

If storing additional information is actually necessary (which would
still need to be confirmed IMHO), we should either rely on headers (hard
to handle right) or the local database.
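
For the header route, a minimal sketch using only the Python standard library could look like the following. The function name is hypothetical, and this is not claimed to match what emailutil.get_message_date does in OfflineIMAP. It reads only the Date header, which is exactly the "hard to handle right" part: the header may be missing, malformed, or simply wrong.

```python
import email
from email.utils import mktime_tz, parsedate_tz


def header_date(content):
    """Return the Date header of a raw message as epoch seconds, or None.

    content is the raw message as a string. A missing or unparsable
    Date header yields None instead of raising.
    """
    message = email.message_from_string(content)
    raw_date = message.get("Date")
    if not raw_date:
        return None
    parsed = parsedate_tz(raw_date)
    if parsed is None:
        return None
    # mktime_tz applies the timezone offset and returns a UTC timestamp.
    return mktime_tz(parsed)
```

A caller would still have to decide what to do with None, for example falling back to the file's mtime or to a value kept in the local database.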

> --- /usr/lib/python2.7/site-packages/offlineimap/folder/Maildir.py	2015-02-28 14:12:37.959293351 -0500
> +++ Maildir.py	2015-02-28 14:10:40.768698235 -0500
> @@ -22,6 +22,7 @@
>  import tempfile
>  from .Base import BaseFolder
>  from threading import Lock
> +from offlineimap import emailutil
>  
>  try:
>      from hashlib import md5
> @@ -235,16 +236,21 @@
>          filepath = os.path.join(self.getfullname(), filename)
>          return os.path.getmtime(filepath)
>  
> -    def new_message_filename(self, uid, flags=set()):
> +    def new_message_filename(self, uid, flags=set(), rtime=None):
>          """Creates a new unique Maildir filename
>  
>          :param uid: The UID`None`, or a set of maildir flags
>          :param flags: A set of maildir flags
>          :returns: String containing unique message filename"""
>          timeval, timeseq = _gettimeseq()
> -        return '%d_%d.%d.%s,U=%d,FMD5=%s%s2,%s' % \
> -            (timeval, timeseq, os.getpid(), socket.gethostname(),
> -             uid, self._foldermd5, self.infosep, ''.join(sorted(flags)))
> +        if rtime is None:
> +            return '%d_%d.%d.%s,U=%d,FMD5=%s%s2,%s' % \
> +                (timeval, timeseq, os.getpid(), socket.gethostname(),
> +                 uid, self._foldermd5, self.infosep, ''.join(sorted(flags)))
> +        else:
> +            return '%d_%d_%d.%d.%s,U=%d,FMD5=%s%s2,%s' % \
> +                (rtime, timeval, timeseq, os.getpid(), socket.gethostname(),
> +                 uid, self._foldermd5, self.infosep, ''.join(sorted(flags)))
>  
>  
>      def save_to_tmp_file(self, filename, content):
> @@ -314,7 +320,9 @@
>          # Otherwise, save the message in tmp/ and then call savemessageflags()
>          # to give it a permanent home.
>          tmpdir = os.path.join(self.getfullname(), 'tmp')
> -        messagename = self.new_message_filename(uid, flags)
> +        if rtime is None:
> +            rtime = emailutil.get_message_date(content)
> +        messagename = self.new_message_filename(uid, flags, rtime=rtime)
>          tmpname = self.save_to_tmp_file(messagename, content)
>          if rtime != None:
>              os.utime(os.path.join(self.getfullname(), tmpname), (rtime, rtime))
> @@ -382,8 +390,10 @@
>          oldfilename = self.messagelist[uid]['filename']
>          dir_prefix, filename = os.path.split(oldfilename)
>          flags = self.getmessageflags(uid)
> +        content = self.getmessage(uid)
> +        rtime = emailutil.get_message_date(content)

I didn't check which method we are in, but I would expect the two lines
above to be the most time-consuming. I don't know whether that would have
a real impact, though.

>          newfilename = os.path.join(dir_prefix,
> -          self.new_message_filename(new_uid, flags))
> +          self.new_message_filename(new_uid, flags, rtime=rtime))
>          os.rename(os.path.join(self.getfullname(), oldfilename),
>                    os.path.join(self.getfullname(), newfilename))
>          self.messagelist[new_uid] = self.messagelist[uid]


-- 
Nicolas Sebrecht


