IDLE again

Fri Dec 4 08:26:38 GMT 2009

On Thu, Dec 3, 2009 at 10:32 PM,  <exarkun at twistedmatrix.com> wrote:
> On 05:45 am, dom.lobue at gmail.com wrote:
>>
>> Jean-Paul,
>>
>> I read over the IMAP IDLE RFC and went through the IMAP4 twisted
>> library and I've sketched out a rough outline of how to implement
>> IDLE. I've run into some things in Twisted however that I don't really
>> understand well, and I'm hoping you can point me in the correct
>> direction.
>
> Cool.  That was quick. :)  Before I get into things, since this thread is
> likely to go into a lot of Twisted-specific details which may not be
> generally interesting, if there's anyone who'd like off the cc list, please
> speak up. :)
>>
>> First, just to verify my understanding: the IMAP4Client class is a
>> Protocol class.
>
> Yep.  And to expand on that, instances of Protocol classes typically have a
> one-to-one relationship with a connection.
>>
>> All IMAP commands are represented by at least two
>> methods in the IMAP4Client class - one for what to do when the command
>> is received from the server, and one for when the command is sent to
>> the server.
>
> Generally, though there are some exceptions.  For example, AUTHENTICATE is
> implemented with one method that starts by possibly sending a CAPABILITY
> (IMAP4Client.authenticate) command, then another method which will actually
> send AUTHENTICATE (IMAP4Client.__cbAuthenticate), then two more methods for
> dealing with the response to the AUTHENTICATE (IMAP4Client.__cbContinueAuth
> and IMAP4Client.__cbAuthTLS).
>
> Another way to look at it is like this.  For each supported protocol action,
> there is at least one public method on IMAP4Client to initiate this action
> by sending some bytes to the server.  All bytes received by IMAP4Client from
> the server are parsed according to the state the client is in, what commands
> are outstanding, etc.  Depending on the state and the bytes, callbacks might
> be invoked as a result of this, possibly delivering the results of a
> protocol action initiated earlier to the calling application code.
>
> This doesn't disagree with what you said too much, it just re-states it in
> slightly more general terms.
>>
>> Assuming this assertion to be true, the broad strokes of
>> the IDLE implementation is as follows:
>>
>> -IDLE is engaged: command is sent to server turning on IDLE, schedule
>> an IDLE reset in 29 minutes, and an attribute ( _IDLE_Enabled for
>> example ) is set to True.
>
> Basically, yes.  One subtle point, though - since the server might reject
> the IDLE command, the client shouldn't assume it has entered the IDLE state
> until it receives a positive acknowledgement of the command from the server
> (eg the "+ idling" line from the RFC).
>>
>> -In the command dispatcher code: checks if IDLE is enabled or not. If
>> enabled, it appends IDLE to all incoming commands and sends a DONE to
>> the server before any new commands are sent. (On the incoming commands
>> part: what I mean is if the method originally to be called was
>> "incoming_exists", instead it would go to "incoming_existsIDLE".)
>
> This part could probably bear some elaboration.  I think the question here
> is how the unsolicited information should best be made available to the
> application code which caused the IDLE to be issued (if I've misunderstood
> what you were getting at here, let me know).  One possibility is the
> existing IMailboxListener interface - if you look at the very end of the
> implementation of IMAP4Client, you'll find three no-op methods,
> modeChanged, flagsChanged, and newMessages.  These are intended for
> subclasses to override and are already called by IMAP4Client when
> unsolicited information is given by the server in response to a command.  It
> may make sense to direct data provided during an IDLE to these callbacks, or
> others similar to them.
>>
>> -When IDLE is confirmed off by server: delete scheduled IDLE reset.
>
> Yep.
>>
>> And that's basically it I think. Fancy stuff like downloading the
>> messages that IDLE notifies you about are handled in the
>> ClientFactory, right?
>
> Perhaps by a factory, or perhaps by something else.  When I've said
> "application code" above, this is what I'm talking about - the code that
> someone else has written which uses IMAP4Client somehow in order to do
> something IMAP4 related.  In our case, offlineimap would be the application
> code. :)  It doesn't make much difference to the IMAP4Client implementation
> who or what is using it, so it could be a ClientFactory or another protocol
> or a GUI or any number of other things.
>>
>> Some things that I'm not all that clear on and could use your help to
>> understand:
>> How do you cancel a previously queued/scheduled callback?
>
> It would probably make sense to be more specific here.  "Callback" might
> mean a lot of things.
>
> Deferreds, the central callback-management API used in Twisted, don't
> directly support cancellation (though we consider adding such support from
> time to time).  Generally APIs which want to offer cancellation do it by
> some other means separate from the Deferred they return.  For example, a
> number of APIs accept a "timeout" parameter which is a form of cancellation.
> These APIs internally use the timed call features of the Twisted reactor to
> make the operation fail if it does not complete within the given time frame,
> resulting in an "errback" on the Deferred (just a callback for errors).
>
> Actually canceling an operation depends on what the operation is and how
> it's implemented.  For example, the IMAP4 protocol itself offers no
> mechanism for canceling a command which the server has already received and
> begun processing, aside from prematurely closing the connection.  So if you
> issue a FETCH, you don't have much of a way to avoid receiving the results.
>
> I'm not sure with what aim you bring up cancellation, so I don't think I can
> be any more specific than this now.  Let me know if I didn't actually answer
> your question.
>>
>> How are multiple connections to the same server handled? And more
>> importantly: how do you have a command in one session use a callback
>> on another already-open connection?
>
> This is simpler than it seems.  The answer is mostly what you'd expect if
> you asked it about a non-Twisted-based app.  If you want protocol instance A
> to do something to protocol instance B, you make sure A has a reference to B
> and then you have A call a method on B.  There are lots of approaches to
> making sure that reference is available, but you don't really need anything
> fancier than an attribute somewhere - using the factory is a common
> approach, since ClientFactory sets itself as the "factory" attribute on
> each protocol instance it creates.
>
> Hope that was helpful,
> Jean-Paul
>

Jean-Paul,

That was most helpful, thanks!

I was rushing out the door, so I forgot some of the questions I wanted
to ask, which is also why some of my explanations were so sparse.

To answer your questions:
>>
>> Some things that I'm not all that clear on and could use your help to
>> understand:
>> How do you cancel a previously queued/scheduled callback?
>
> It would probably make sense to be more specific here.  "Callback" might
> mean a lot of things.

Specifically I'm talking about cancelling the scheduled reset of IDLE here.

>>
>> -In the command dispatcher code: checks if IDLE is enabled or not. If
>> enabled, it appends IDLE to all incoming commands and sends a DONE to
>> the server before any new commands are sent. (On the incoming commands
>> part: what I mean is if the method originally to be called was
>> "incoming_exists", instead it would go to "incoming_existsIDLE".)
>
> This part could probably bear some elaboration.  I think the question here
> is how the unsolicited information should best be made available to the
> application code which caused the IDLE to be issued (if I've misunderstood
> what you were getting at here, let me know).  One possibility is the
> existing IMailboxListener interface - if you look at the very end of the
> implementation of IMAP4Client, you'll find three no-op methods,
> modeChanged, flagsChanged, and newMessages.  These are intended for
> subclasses to override and are already called by IMAP4Client when
> unsolicited information is given by the server in response to a command.  It
> may make sense to direct data provided during an IDLE to these callbacks, or
> others similar to them.

Abstraction is not my strong point :(
From what I read of imaplib2, it looks like that library would start
IDLE on all its connections, and as soon as it was notified of a new
incoming message, it would disengage IDLE and immediately start
downloading that message. I believe this method caused the stall bug
that finally got imaplib2 canned from OfflineIMAP (imaplib2 thought
it was no longer in IDLE when in fact it was, so all commands were
ignored by the server).

My planned use of IDLE is to have one session open and dedicated to
running IDLE all the time. Any updates that come in over IDLE are sent
to a threadpool which either downloads the new message or deletes the
local copy in order to stay in sync with the remote.
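
A rough sketch of that flow, using the standard library thread pool so it
stays self-contained (in a Twisted program, deferToThread from
twisted.internet.threads would be the more usual way to hand work to a
thread). The function names and the sample lines are made up for
illustration, and only EXISTS and EXPUNGE are handled:

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Matches untagged responses such as "* 23 EXISTS" or "* 5 EXPUNGE".
_UNTAGGED = re.compile(r"^\* (\d+) (EXISTS|EXPUNGE)$")


def fetch_new_messages(message_count):
    # Hypothetical worker: EXISTS reports the mailbox's new message count.
    return ("download", message_count)


def delete_local_copy(sequence_number):
    # Hypothetical worker: EXPUNGE reports the removed message's number.
    return ("delete", sequence_number)


def dispatch_idle_line(pool, line):
    """Submit work for one line seen during IDLE; return None if ignored."""
    match = _UNTAGGED.match(line.strip())
    if match is None:
        return None
    number, kind = int(match.group(1)), match.group(2)
    worker = fetch_new_messages if kind == "EXISTS" else delete_local_copy
    return pool.submit(worker, number)


with ThreadPoolExecutor(max_workers=2) as pool:
    lines = ["+ idling", "* 23 EXISTS", "* 5 EXPUNGE"]
    futures = [dispatch_idle_line(pool, line) for line in lines]
    results = [f.result() for f in futures if f is not None]
```

The IDLE session itself never leaves IDLE here; it only hands each
notification to a worker, which is the separation described above.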

Personally I think this is the only way to go as it keeps a clear
separation of concerns, and keeps everything just that much simpler.
But then again, I'm biased. :)

Abstraction aside, were I building this just for myself I'd refactor
IMAP4Client into three classes: IMAP4ClientBase(basic.LineReceiver,
policies.TimeoutMixin), IMAP4ClientIDLE(IMAP4ClientBase), and
IMAP4Client(IMAP4ClientBase). Whatever IMAP4Client and IMAP4ClientIDLE can
both use is put in IMAP4ClientBase, and the rest stays in IMAP4Client.

That, unfortunately, just forces everyone else to build their
applications my way though. Like I said, abstraction is not my strong
suit. :(

As I was typing this all out though, a possible solution occurred to
me: What if IMAP4Client and IMAP4ClientIDLE were mixins instead? For example:

    class myIDLEClient(IMAP4ClientIDLE, IMAP4ClientBase): pass
    class myFullIMAPClient(IMAP4ClientIDLE, IMAP4Client, IMAP4ClientBase): pass
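
To make the mixin idea concrete, here is a small self-contained sketch. The
three classes are bare stand-ins, not the real Twisted classes: the base
only records the lines it is asked to send, and the command tags are
hard-coded for illustration.

```python
class IMAP4ClientBase:
    """Stand-in for the shared core; just records outgoing lines."""

    def __init__(self):
        self.sent = []

    def sendLine(self, line):
        self.sent.append(line)


class IMAP4ClientIDLE:
    """Mixin adding IDLE support; relies on sendLine from the base."""

    def idle(self):
        self.sendLine("A001 IDLE")

    def done(self):
        self.sendLine("DONE")


class IMAP4Client:
    """Mixin adding ordinary commands; also relies on sendLine."""

    def noop(self):
        self.sendLine("A002 NOOP")


class myIDLEClient(IMAP4ClientIDLE, IMAP4ClientBase):
    pass


class myFullIMAPClient(IMAP4ClientIDLE, IMAP4Client, IMAP4ClientBase):
    pass


idle_only = myIDLEClient()
idle_only.idle()
idle_only.done()

full = myFullIMAPClient()
full.noop()
full.idle()
```

With this layout the IDLE-only client has no noop method at all, while the
full client gets both sets of commands from the same base.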

Thoughts?

Some questions I forgot to ask earlier: What exactly is the purpose of
a "Factory"? In some examples I've found, I saw methods that could
have gone in the Protocol subclass go instead in a ClientFactory. In
other examples there was no ClientFactory at all!
What is the exact purpose of an "Interface"?
Somewhat relatedly: As far as I can tell, the main "types" or "parts"
of a Twisted application are: reactor, Protocol, Interface (auth?),
and ClientFactory. Are there any I'm missing? And how exactly do all
these parts fit together?

Sorry for the basic questions, but I've looked all over the
documentation and through the Twisted Essentials book and I can't find
a clear-cut answer to these questions and it's driving me nuts!

Last question for the night (I swear!) - what kind of parsing and
dispatch system/method/technique did you have in mind to replace what
is currently in the library?

I looked over the commandDispatch method and I see what you were
talking about. But outside of using an actual parsing library like
pyparsing (which I think is disallowed in the Twisted coding
guidelines, right?), or providing a static list of all possible IMAP
commands, I'm not sure what the options are.
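
For reference, this is roughly what I mean by the static-list option. It is
only a sketch of the idea and not how commandDispatch works today: the class
name is invented, the handler names follow the "incoming_exists" example
from earlier, and it only understands untagged "* <number> <NAME>" lines.

```python
class UntaggedDispatcher:
    """Routes untagged responses such as '* 23 EXISTS' via a static table."""

    def __init__(self):
        self.seen = []
        # Static table: response name -> handler method (names hypothetical).
        self._handlers = {
            "EXISTS": self.incoming_exists,
            "RECENT": self.incoming_recent,
            "EXPUNGE": self.incoming_expunge,
        }

    def dispatch(self, line):
        """Return True if the line was handled, False otherwise."""
        parts = line.split()
        if len(parts) != 3 or parts[0] != "*" or not parts[1].isdigit():
            return False
        handler = self._handlers.get(parts[2].upper())
        if handler is None:
            return False
        handler(int(parts[1]))
        return True

    def incoming_exists(self, count):
        self.seen.append(("exists", count))

    def incoming_recent(self, count):
        self.seen.append(("recent", count))

    def incoming_expunge(self, number):
        self.seen.append(("expunge", number))


dispatcher = UntaggedDispatcher()
lines = ("* 23 EXISTS", "* 1 RECENT", "* 99 BOGUS", "A001 OK done")
handled = [dispatcher.dispatch(line) for line in lines]
```

Unknown names fall through with a False return instead of raising, so the
caller decides what to do with anything the table does not cover.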

Also, I don't follow your concern for backwards compatibility. Correct
me if I'm wrong, but method names are based upon the command that
triggers them. So even if you use a different method to parse the IMAP
command, the method names wouldn't change, right?

-- 
Dominic LoBue