Best alternative to maildir?
lists at xenhideout.nl
Fri Sep 19 22:28:13 BST 2014
I was not asking for such remarks.
I was asking for feedback on a creative proposal.
That is all. These remarks are useless. I have already made up my mind, and
you cannot decide what works best for me and what doesn't.
ps. a filesystem was not designed to "handle a large number of files". It
was designed to /handle files/ no matter their number. If you create an
ext3fs with 20 inodes, it will NOT handle a large number of files. The
number of files a fs can or will handle is up to the programmer, designer
or configurator. It is pretty pointless to go and increase your file
number "just because you can". Such remarks are utterly pointless and
idiotic and not something I would expect from this list/program. You are
now treating it as a "we just always do it this way, no reason really".
It's like saying I need to write a 2 meg text file and you say "well,
split it in 100 files, the filesystem was designed for that!".
What nonsense. Give someone else your crap, not me. And if no one here can
help me, I'll just do it on my own. I have no need for wise-guys who claim
to know better (and even best) how ANOTHER person should be doing his
projects. I was not asking or begging for your programming time
whatsoever, just a few simple perspectives on how best to do what I want.
"I want to build a house."
"You shouldn't build a house, houses are stupid."
ps2. "If you are worried about updating metadata too often, enable noatime
in fstab." kinda proves how moronic you are being here. You are now
tunnel-visioning onto a parameter of a freakin filesystem or mount option
whereas any decent design should be filesystem-agnostic except perhaps for
obvious requirements (as could be) such as symlink support. So you are now
basically suggesting to me performance alleviations in case your superb
design starts overloading the (metadata) system of that fs. THAT PROVES
that your design is flawed. Or you wouldn't immediately start suggesting
workarounds to the problems it may cause!!!!
And then you say "It all works just fine on unix." when you JUST TOLD ME
how to adjust freaking PARAMETERS in case it DOESN'T "just work fine"....
My god.. And your complaints about lack of interoperability obviously
wouldn't matter if you were to port a software to a failing OS like MS
Windows? That completely CHOKED (in Windows Explorer) on a new folder
worth 6000 .eml files. "Lock you into one toolset". Well, tell you what, I
have never used any other program that stored email as thousands of .eml
files. Nor have I ever wanted to. From the perspective of a Windows user,
it is nonsensical at best and utterly, utterly dumb at worst. The word I
want to use is not even in English. DEBIEL. Having folders with 6k+ files
makes them completely unmanageable in a manual way. Sure, shell scripts.
Fine. Everyone should be dependent on his or her own ability to write
fail-proof shell-scripts just in case he or she wants to use freaking
EMAIL. You are a commercial mind my friend. You should design the next OS,
and sell it to consumers, you will do excellent!
There is a good reason why no commercial (let's say, really popular)
email client would ever store email as individual files. 99.5% of users
would complain about it. People who go into Thunderbird see a neat list
of files that coincide with their IMAP folders. They are not burdened with
the details of what is inside those files, because they have a tool for
that: It's called their email program (Thunderbird itself).
When you write software (libraries) you also make reasonable judgements
about how you will distribute your containment. On the one extreme, you
could put every subroutine or function into its own file and then include
that. On the other extreme, you could not compartmentalise at all and dump
everything into a single monolithic file. It is clear any design needs to
find the middle ground.
Thunderbird stores it as some form of mbox (not really sure if it is what
they call "mbox"), but with a simple plugin (it should be standard, of
course) it is easy to export as .eml files. This plugin is also only
capable of importing .eml files, it seems. So where is my lack of
interoperability now? I use Thunderbird, it doesn't store as .eml, and I
am still capable of interoperating with anything I want.
Unless of course I want 30 different tools to operate on my email
collection at once. That would spell "safety". Having these files all out
in the open would invite anything and everyone to mess with it. Any badly
configurated tool could start deleting stuff and I wouldn't notice. If it
started deleting "mbox" files I would be SURE to notice. What about the
filenames of those emails. How are you going to make that meaningful? I
can think of no meaningful naming scheme whatsoever, since the filenames
themselves are not relevant to an email program. Are you now going to
encode metadata in the name??? That would introduce MORE opportunities for
disaster. Now the filesystem has to, for example, be able to deal with
e.g. Japanese kanji characters and the like, AND your own OS has to be
able to handle it as well.
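
For reference, this is roughly what "metadata in the name" looks like in
Maildir: a message in cur/ is named "unique:2,FLAGS", where each flag is a
single letter. A minimal sketch in Python; the function name
parse_maildir_name and the sample filename are made up for illustration.

```python
# Single-letter flags used in the ":2," info part of a Maildir filename.
MAILDIR_FLAGS = {
    "D": "draft",
    "F": "flagged",
    "P": "passed",
    "R": "replied",
    "S": "seen",
    "T": "trashed",
}

def parse_maildir_name(name):
    """Split a Maildir filename into (unique part, list of flag names)."""
    unique, sep, info = name.partition(":2,")
    if not sep:
        # No info suffix, e.g. a message still sitting in new/.
        return unique, []
    # Unknown letters are passed through unchanged.
    return unique, [MAILDIR_FLAGS.get(c, c) for c in info]

# Hypothetical filename, for illustration only:
# parse_maildir_name("1411158493.M1P2.host:2,RS")
```

So the read/replied/trashed state of every message lives in the directory
entry itself, and a flag change is a rename.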
You are creating requirements and dependencies on perfect operation left
and right. And you call this a good design??
You also did not really grasp the meaning of my design goals. I am not
intending to create a backup store without any containment (do you not
realize the complete contradiction in those words? Backup? No-containment?
Safe? Dispersed?) just because "other tools" will be able to operate on my
"backup". I HAVE IMAP FOR THAT.
My backups get TUCKED AWAY inside even encrypted rsyncable tarballs. Why
on EARTH do I want other tools to be able to operate on that? In general a
backup wants to be restored in its entirety and when that is not the case,
the backup tool itself (in this case OfflineIMAP) can arrange for its
With email at least, I rarely make manual mistakes because we have Trash
folders that provide a two-level delete. So my backup is meant for
safekeeping, not for operating on. In the rare event that I do need to
operate on it, it really won't be hard, I can promise you that.
In short, the filesystem is NO place for storing email metadata
whatsoever. All metadata should be done by the application.
The "interoperable" system I and we have is called IMAP. It is a
standardized protocol which has a huge variety of tools to work with.
Although it seems to be lacking in terms of backup abilities and
possibilities and alternatives, which is why I am here.
This is my "medium" through which I move to other tools. I don't need
another standardized protocol (such as the way I store my email locally)
to operate with "other tools". LEAST of all on my BACKUPS.
Introducing another common medium is pointless to begin with. At least for
my every purpose, it is. Completely. I have been using email for a million
years, so to speak, just like everyone else here, and I have never needed
any other "tools" to work on my email store. And failing proper backups,
even one failing tool that syncs with some IMAP store is enough to ruin
your data. Even for a mail spool, I would not depend on individual files.
You are putting too much emphasis and too much reliance and too much
dependence on the filesystem handling your operational tasks when it is
not even designed for that. It is abusing your fs.
So you are claiming it was designed for that. I am claiming you are
abusing it. There we have our difference in perspective.
"It all works just fine" or "It works for me" are more of those
meaningless remarks. Whether it "works just fine" for you or for some
system you claim to be expert on, does not mean anything at all. What does
"it works" mean? What goals, standards and requirements are you relating
that to when you say that? One could have very low goals and then it would
"work just fine" and another could have very high goals and it would fall
So these are subjective, relative statements with unexplained premises.
What works for you may not work for another. You fail to look outside your
narrow set of requirements.
If I thought the tools available would do the job, I would not have
written my initial email. So basically you are thinking I am a moron,
which is why I am calling you one straight out. If the tools did the job I
need them to do, I would not have written anything and I would have just
used them. PERIOD. My writing proves to you that the tools do not do the
job I want them to do. YOU KNOW THAT. But you are just offended because
you are apparently kind of ego-invested in these tools being so awesome.
And if someone comes out and says "these tools are not very awesome for
me" you apparently feel insulted because I seem to insult your choices
(or those of others you agree with). And then you start projecting YOUR
emotional needs onto ME. But I'm not dissing your personal choices. If you
think these tools are right for you, fine. They are not right for me. But
you can't live with that, apparently.
And so I am setting out to improve them. For my personal purposes. And why
should I not have the right to do that? It is open source, probably,
apparently. Every other tool I need is as well.
Everything is available: python, sqlite, a linux system, and offlineimap
itself. I do not even need more than that to completely achieve my current
goals. Why are you trying to think for me what would work best for me? You
are obviously not thinking in a creative direction, but in a reactive
pattern trying to dissuade me from proceeding as I am doing.
One tool I *could* use is having the ability to more easily sync stuff I am
currently working on. Saving this email only on my server is not good
enough for me as I write. So I am currently at least manually copying it
from my postponed-msgs folder each time but I am doing it so often that it
becomes an annoying thing to do. Presently I *could* use that IMAPSize to
sync my ENTIRE mailbox after postponing it (in Alpine). But ideally I
would use something else (like OfflineIMAP) with a script to just sync
that postponed-msgs and (currently) also the "drafts" folder on the press
of a button. Again, that doesn't require anything more than what I am
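
A "press of a button" sync of just those two folders could look something
like the sketch below. It only builds and runs an offlineimap command line
using the -o (run once), -a (account) and -f (folder list) options. The
account name "personal" is an assumption; the folder names are the ones
mentioned above and would have to match the real configuration.

```python
import subprocess

def sync_command(account, folders):
    """Build an offlineimap command that syncs only the given folders, once."""
    return ["offlineimap", "-o", "-a", account, "-f", ",".join(folders)]

def sync(account, folders):
    """Run the sync; raises CalledProcessError if offlineimap exits non-zero."""
    subprocess.run(sync_command(account, folders), check=True)

# Example call (account name is hypothetical):
# sync("personal", ["postponed-msgs", "drafts"])
```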
When I am writing something bigger or more important, I also make
intermediate printouts in case disaster strikes. I do not flirt with
disaster. Never. Except... after I started using IMAPSize. Coincidence? We
both know I am not speaking bull.
For very important things, I sometimes store it in 4 locations at once as
I am writing or working on it. And now you are blaming my workflow for
lack of safety? Sure I lack the tools presently to automate it. I have a
sync script set up to a USB stick and that works fine, I don't have it
loaded now. If I need a third location for this text of this email, I
would mount it and start syncing that. Currently I need a sync from IMAP
to local files more than anything else. But in principle, individual
drafts of individual emails are not a meaningful distinction or
separation. If I could sync the entire folder at once, that would be
exactly what I need. Why then store as individual files? What purpose does
The only purpose it has is for lazy programmers who use the filesystem to
provide for operational support they would otherwise need to encode into a
different data structure in some individual file (or set of files). Just imagine a
database storing every row of every table in a separate individual file!
It would be madness!!! For one, it would immensely clog up the
filesystem's (inode) tables. The number of inodes in an ext-fs is fixed at
creation and cannot be changed or increased. So now you are marrying the
number of emails you have to a prior design choice regarding the number of
inodes you want to have.
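
To see how close a given filesystem is to that limit, the inode counts can
be read with os.statvfs. A small sketch; the helper name inode_usage and the
default path "/" are assumptions, and some filesystems report 0 for these
fields.

```python
import os

def inode_usage(path="/"):
    """Return (total, free, used) inode counts for the filesystem holding path."""
    st = os.statvfs(path)
    total = st.f_files   # total inode slots the filesystem reports
    free = st.f_ffree    # inode slots still unused
    return total, free, total - free

# Example:
# total, free, used = inode_usage("/home")
```

With one file per message, "used" grows by one for every email stored.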
And if you had thousands of users with thousands of emails each, or more,
...you could easily see you would run into problems of scalability right
The only reason you can say mbox files are more prone to corruption is
because you are now depending on the functioning of the filesystem proper
which has seen much greater development than anything any individual tool
could probably allow or mandate. So you are abusing something that is just
well-tested and fully-developed even though it is not the proper place for
that functionality. And then you say that that is safer. It's only safer
if you mangle your own mbox files, or allow other tools to do so.
Any reasonable "folder" implementation should have indices as PART of the
container itself. And then your tool should have a sublayer that handles
ALL access into individual "sections" such as would be, for example,
emails. If that sublayer is developed well enough, there should be no
issues more than in a filesystem that is equally well-developed. SQLite
should be such a "sublayer". Perhaps it is not perfectly ideal but it
should do the job for now...
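
A minimal sketch of such a sublayer, assuming one SQLite file per mail
folder. The class name FolderStore and the table and column names
(messages, uid, flags, raw) are hypothetical and not part of OfflineIMAP.

```python
import sqlite3

class FolderStore:
    """One SQLite file per folder; all message access goes through this layer."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            " uid INTEGER PRIMARY KEY,"
            " flags TEXT NOT NULL DEFAULT '',"
            " raw BLOB NOT NULL)"
        )

    def add(self, uid, raw, flags=""):
        # The connection as context manager commits, or rolls back on error.
        with self.db:
            self.db.execute(
                "INSERT INTO messages (uid, flags, raw) VALUES (?, ?, ?)",
                (uid, flags, raw),
            )

    def get(self, uid):
        """Return the raw message bytes for uid, or None if it is not stored."""
        row = self.db.execute(
            "SELECT raw FROM messages WHERE uid = ?", (uid,)
        ).fetchone()
        return row[0] if row else None

    def uids(self):
        """Return all stored UIDs in ascending order."""
        rows = self.db.execute("SELECT uid FROM messages ORDER BY uid")
        return [r[0] for r in rows]
```

The folder is then a single file on disk, and the flags live in the
application's own table rather than in filenames.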
Right? Maybe I'm wrong about that... I was a bit hesitant about SQLite for
this from the beginning.... Any regular database stores indices separate
from the data itself, I believe. That may seem common sense, but it is
not, because you have now created a dependency that is, from the level of
the filesystem, external. Thunderbird just flags deleted emails as
"removed" in its index and/or folder file while not actually removing the
email itself so as to save on rewriting costs, which is not a bad
approach. It then offers to "compact" the folder once in a while, which,
again in principle, is not all that bad. What I would personally do, is
create an index at the beginning capable of storing an X amount of emails
(indices to emails). The last block would then be a pointer to the next
index-block further down. This is much like the ext-fs inode blocks. In
this way, if you grab parts of a large inbox file you can still make sense
of them as long as they contain these index-blocks which, of course, are
also easy to rebuild given uncorrupted data-segments. I would put X at
500. If every email requires about 1k worth of meta-data, you'd get a
block of 500k. But something like that needs a bit of development of
course. Ideally you would not have to write such a library yourself....
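
A rough sketch of that chained index-block layout, under simplifying
assumptions: each index entry here holds only an (offset, length) pair
instead of about 1k of metadata, messages are never empty, and the names
IndexedBox, ENTRY and NEXT are invented for illustration. The last field of
every index block points at the next block, with 0 meaning "none".

```python
import io
import struct

ENTRY = struct.Struct("<QQ")  # (offset, length) of one message; length 0 = free slot
NEXT = struct.Struct("<Q")    # offset of the next index block; 0 = no further block

class IndexedBox:
    """Single-file container: index blocks chained together, data in between."""

    def __init__(self, fileobj, entries_per_block=500):
        self.f = fileobj
        self.n = entries_per_block
        self.block_size = self.n * ENTRY.size + NEXT.size
        self.f.seek(0, io.SEEK_END)
        if self.f.tell() == 0:
            self._new_block()  # the first index block always sits at offset 0

    def _new_block(self):
        self.f.seek(0, io.SEEK_END)
        pos = self.f.tell()
        self.f.write(b"\0" * self.block_size)
        return pos

    def _blocks(self):
        """Yield the offset of every index block by following the chain."""
        pos = 0
        while True:
            yield pos
            self.f.seek(pos + self.n * ENTRY.size)
            (nxt,) = NEXT.unpack(self.f.read(NEXT.size))
            if nxt == 0:
                return
            pos = nxt

    def _entry(self, block, i):
        self.f.seek(block + i * ENTRY.size)
        return ENTRY.unpack(self.f.read(ENTRY.size))

    def append(self, raw):
        if not raw:
            raise ValueError("empty messages are not supported in this sketch")
        last = 0
        for last in self._blocks():
            pass  # walk to the final index block
        for i in range(self.n):
            if self._entry(last, i)[1] == 0:
                slot = last + i * ENTRY.size
                break
        else:
            # Final block is full: add a new one and link it from the old one.
            slot = self._new_block()
            self.f.seek(last + self.n * ENTRY.size)
            self.f.write(NEXT.pack(slot))
        self.f.seek(0, io.SEEK_END)
        data_off = self.f.tell()
        self.f.write(raw)
        self.f.seek(slot)
        self.f.write(ENTRY.pack(data_off, len(raw)))

    def messages(self):
        """Yield every stored message in insertion order."""
        for block in self._blocks():
            for i in range(self.n):
                off, length = self._entry(block, i)
                if length == 0:
                    return
                self.f.seek(off)
                yield self.f.read(length)
```

With the numbers above (X = 500, about 1k of metadata per email) each index
block would be about 500 x 1k = 500k; in this sketch it is only
500 x 16 + 8 bytes, because the entries carry no metadata.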
Regards, and BYE.
On Fri, 19 Sep 2014, Zak Smith wrote:
> Maildir is a robust and standardized way for multiple programs to
> (simultaneously) access mail messages on (mostly) unix systems. There
> is nothing wrong or unstable about it.
> You are projecting emotions about your loss of data. You should blame
> some combination of Thunderbird, IMAPSize, your workflow and/or your
> mistakes, and lack of viable backups for your data loss.
> There is nothing wrong with having a large number of files. That's
> what a filesystem is designed to do. If you are worried about
> updating metadata too often, enable noatime in fstab.
> Using the trivial example, mbox files are much less reliable with
> regard to data loss or corruption compared to maildir. Any other more
> consolidated method (ie database files) will dramatically reduce
> interoperability and then lock you into one tool set.
> In the case of some messages becoming corrupted and/or deleted in even
> a huge mailbox, restoring the damaged messages from a viable backup
> can be done with a simple shell script. It all works just fine on unix.
> On Fri, Sep 19, 2014 at 08:22:19PM +0200, Bart Schouten wrote:
>> I'll tell you a very short story. Believe it or not, but I lost
> # Zak Smith mobile 970-232-4468
More information about the OfflineIMAP-project