[PATCH] Avoid Fatal error: Word too long from Cyrus IMAP servers by chunking fetch.

Sebastian Spaeth Sebastian at SSpaeth.de
Thu Jan 20 09:26:06 UTC 2011


On Wed, 19 Jan 2011 08:41:27 -0500, "Edward Z. Yang" <ezyang at MIT.EDU> wrote:
>              # Now, get the flags and UIDs for these.
>              # We could conceivably get rid of maxmsgid and just say
>              # '1:*' here.
> -            response = imapobj.fetch(messagesToFetch, '(FLAGS UID)')[1]
> +            batchNum = 50
> +            response = []
> +            queueOfMessagesToFetch = messagesToFetch.split(',')
> +            while queueOfMessagesToFetch:
> +                batch = queueOfMessagesToFetch[0:batchNum]
> +                queueOfMessagesToFetch = queueOfMessagesToFetch[batchNum:]
> +                response += imapobj.fetch(','.join(batch), '(FLAGS 

Hold your horses, a grumpy man has a comment :-)!

It certainly makes sense to start chunking at some point, but I am very
scared of the potential performance implications of your patch.
We run this for every sync and every folder. Some of my folders hold
10,000 mails; with batches of 50, this patch would mean a 200-fold
increase in command roundtrips compared to what I have now.
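
For reference, the roundtrip count is just the folder size divided by the
batch size, rounded up. A tiny sketch (the helper name `roundtrips` is made
up for illustration and is not part of offlineimap):

```python
import math


def roundtrips(num_messages, batch_size=None):
    """Number of FETCH commands needed for one folder.

    batch_size=None models the current behaviour: a single unchunked FETCH.
    """
    if num_messages == 0:
        return 0
    if batch_size is None:
        return 1
    return math.ceil(num_messages / batch_size)


print(roundtrips(10000))        # unchunked: 1
print(roundtrips(10000, 50))    # batches of 50: 200
print(roundtrips(100000, 50))   # batches of 50: 2000
```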

My mail servers have coped fine without any chunking so far, so I am not
sure why we would want to always introduce it here. (Also, fetching just
50 UIDs and flags at a time seems like a very small data set.)

I don't think we should default to lowest-common-denominator behaviour.
If we do that for every server quirk, offlineimap will soon become
unusable.

I see the following strategies:

- Make the chunk limit configurable in offlineimap, but off by
  default. Catch the Cyrus server exception and tell the user to find
  a workable limit and set it in her configuration file.

- Don't chunk by default, and if offlineimap detects this error, it
  sets a limit itself, caching that value in the LocalStatus file or
  some other cache file (I can imagine many situations where such a
  status/configuration cache would come in handy anyway). We could even
  add some smartness to increase the limit adaptively until it hits the
  wall.
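
A rough sketch of the second strategy, under loud assumptions: `WordTooLong`,
`fetch_flags`, `chunked` and `fake_fetch` are hypothetical names, not
offlineimap or imaplib API, and the fake server rejects requests by ID count
whereas a real server more likely limits command length. For simplicity it
halves the chunk size on failure instead of probing upward until it hits the
wall.

```python
class WordTooLong(Exception):
    """Hypothetical stand-in for the server's 'Word too long' failure."""


def chunked(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_flags(fetch, msg_ids, limit=None):
    """Fetch all msg_ids, chunking only when the server forces it.

    fetch -- callable taking a comma-separated ID string, returning a list
    limit -- chunk size cached from an earlier run, or None (no chunking)
    Returns (responses, limit) so the caller can cache the limit that worked.
    """
    if not msg_ids:
        return [], limit
    size = limit or len(msg_ids)
    while True:
        try:
            responses = []
            for batch in chunked(msg_ids, size):
                responses += fetch(','.join(batch))
            # Only report a limit if chunking was actually needed.
            return responses, (size if size < len(msg_ids) else limit)
        except WordTooLong:
            if size == 1:
                raise  # even a single ID is rejected; give up
            size = max(1, size // 2)  # halve and retry from the start


# Fake server for demonstration: rejects more than 300 IDs per command.
calls = []


def fake_fetch(id_string):
    ids = id_string.split(',')
    if len(ids) > 300:
        raise WordTooLong()
    calls.append(len(ids))
    return ids


ids = [str(i) for i in range(1, 1001)]
responses, limit = fetch_flags(fake_fetch, ids)
print(len(responses), limit, len(calls))
```

Servers that never raise the error keep the single-roundtrip behaviour, and
only the affected servers pay for chunking. The returned limit is what would
go into the status cache for the next sync.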

In any case, don't force me to fetch my 100k mail UIDs in batches of 50
at a time. At the very least let us time the performance difference with
a real mail server.

What do you think?

Sebastian


