[PATCH 5/7] Re: Don't keep sqlite connections open

Sebastian Spaeth Sebastian at SSpaeth.de
Fri May 6 20:38:29 UTC 2011


On Fri, 6 May 2011 11:14:17 -0700 (PDT), chris coleman <christocoleman at yahoo.com> wrote:
> I want to put this out there and get the opinion of you and the list.
> 
> For performance (while multi-threading.. dealing with huge inboxes.. multiple accounts on multiple servers) and data-integrity reasons (crashes or other interruptions in the code that might damage the data stored previously in flat text files, typically 100KB in size, that were getting written to disk possibly 100 times in a single invocation of offlineimap, every 3 minutes for one week... Now I know why the disk light stays on solid for 1-2minutes when the script is running... yikes!).

Yes, OfflineImap has always played it super safe and writes out the
cache after every single change. It does so by first writing to a new
tmp file and then moving it into place over the old one to avoid
partially written content (often fsyncing in between). We basically
have to expect to be killed or to crash at any time, so playing safe is
good in general. It is safe, but it kills performance, especially for
those who have multi-million email boxes (yes, they do exist).
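
For illustration, the write-to-tmp-then-move pattern described above can
be sketched roughly as follows. This is a minimal sketch and not
OfflineIMAP's actual code: the function name atomic_write and the
".tmp-" prefix are made up for the example, and it assumes Python 3 for
os.replace.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write text to path so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the contents to disk
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Every call rewrites the whole file and waits for an fsync, which is
where the cost comes from when the cache is large and changes often.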

That's why the sqlite patches have been floating around since 2008 or
so. But due to development stagnation, they were never incorporated.

Using a database such as sqlite is very well suited for our purposes I
think, although people have pointed out the benefits of plain text
files when it comes to e.g. recovering from corruption.

This change is a major step in my opinion, in terms of performance and
generally being nicer to our hard disks. However, there is plenty of
stuff left to do in offlineimap on which I would rather focus than
implementing or even incorporating an abstract database backend. sqlite
is good, but *I* don't really see any benefit in being able to stuff
your cache into a postgres. After all, firefox doesn't offer you the
possibility to put your bookmarks and its cache into postgres either.

> There are some already existing frameworks , pre-packaged, tested and working, and they're available with a simple "apt-get python-sqlobject" or "apt-get python-sqlalchemy" for example.

Now that we have a 2nd LocalStatus backend implemented, it would be
rather trivial to implement more backends, including one that sits on
top of a db abstraction layer. There are only a few functions to
implement. Patches are welcome, but *I* am not gonna introduce another
level of abstraction for a smallish cache.
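
To give a feel for how small such a backend can be, here is a
hypothetical sqlite-backed status cache. The class name, method names,
and table layout are all invented for this sketch and are not the
LocalStatus interface from the patch series; it also assumes that flags
are single-character codes.

```python
import sqlite3


class SqliteStatusCache:
    """Hypothetical message-status cache keyed by UID (illustration only)."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS status "
            "(uid INTEGER PRIMARY KEY, flags TEXT NOT NULL)"
        )
        self.conn.commit()

    def save_message(self, uid, flags):
        # One small transaction per change instead of rewriting a whole file.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO status (uid, flags) VALUES (?, ?)",
                (uid, "".join(sorted(flags))),
            )

    def get_flags(self, uid):
        row = self.conn.execute(
            "SELECT flags FROM status WHERE uid = ?", (uid,)
        ).fetchone()
        # Assumes single-character flag codes, so a string splits into a set.
        return None if row is None else set(row[0])

    def delete_message(self, uid):
        with self.conn:
            self.conn.execute("DELETE FROM status WHERE uid = ?", (uid,))
```

A backend built on a db abstraction layer would have to provide the
same handful of operations, just through one more layer.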

> I think it would be really cool to let the user pick the database that they have available, with a setting in the .offlineimaprc, and the offlineimap python code using one of these persistence frameworks , would be unchanged.

Once we, and everyone who remembers that it can even do plaintext, are
convinced that sqlite works great, *I* would rather remove the current
option from offlineimap.conf again and just use the sensible
default. offlineimap.conf is a monster as it is, and each additional
code path means more paths to test (and conversely, fewer code paths
actively being used) which is bound to introduce more regressions and
failure opportunities. I'd rather try to keep offlineimap as simple as
possible.

> 1) the added performance and reliability would really be awesome!  

I am 100% sure that using a 3rd party db abstraction would not gain us
performance over just using sqlite. But I am willing to be convinced by
benchmarks :-).

> 2) no need to close the connection every time you go thru the loop because another thread will corrupt it.  
Fixed in the latest revision; sqlite3 has been multithreading-capable
since 3.3 after all (published in April 2008 or so). We don't close it
anymore.
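
As a rough illustration of keeping one connection open across threads
with Python's sqlite3 module, consider the sketch below. It is not
taken from the patch: the table, the record() helper, and the explicit
lock are assumptions made for the example. check_same_thread=False is
the documented switch that lets threads other than the creating one use
the connection.

```python
import sqlite3
import threading

# check_same_thread=False lets threads other than the creator use conn.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE status (uid INTEGER PRIMARY KEY, flags TEXT)")
write_lock = threading.Lock()


def record(uid, flags):
    # Serialize access so only one thread uses the connection at a time.
    with write_lock:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "INSERT OR REPLACE INTO status VALUES (?, ?)", (uid, flags)
            )


threads = [threading.Thread(target=record, args=(uid, "S")) for uid in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

count = conn.execute("SELECT COUNT(*) FROM status").fetchone()[0]
```

The lock is the conservative choice here; whether it is strictly needed
depends on how the underlying sqlite library was built.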

[SNIP lots of valid stuff]
> What's your opinion?

All nice and good. In the end, it comes down to someone getting their
hands dirty and implementing it. When it comes to developer time, *I*
would rather spend my time debugging IMAP hangs and improving our error
message handling than adding one more level of abstraction that
requires additional packages to install. This is just a smallish cache,
it's not like we are doing a db-based web app. :-)

But this being open source, the door is always open to contributions for
everyone to scratch their itches. ;)

Sorry if that is not what you wanted to hear from me, but you asked for
my opinion :)

Sebastian