[pymvpa] Emanuele? Re: Q about IterativeReliefOnline and more.......

Thu Oct 28 10:54:06 UTC 2010

Hi,

Sorry for the late answer, and thanks, Yarik, for replying before I could.
Patrik's message wasn't lost, just in my long queue ;-)
Comments below inline.

On 10/27/2010 04:22 PM, Yaroslav Halchenko wrote:
> ...
> Emanuele,
>
> looking at the code of IterativeReliefOnline, it seems that it is
> "online" but also wrapped into outside convergence loop which (as far as I see
> it) demolishes the notion of online training:
>
>          while change > self.threshold and iteration < self.max_iter:
>              if __debug__:
>                  debug('IRELIEF', "Iteration %d" % iteration)
>
>              for t in range(NS):
>                  counter += 1.0
>                  n = random_sequence[t]
>                  ...
>
>    

You are right, the current implementation does not provide much of the
"online" aspect of the algorithm. My original intent was just to have an
additional implementation of I-Relief that scaled better than the
non-online one. In other words, I added the online algorithm because I
needed something faster than the plain I-Relief.

Since my only interest was scalability, I did not spend time on the
actual "online" interface. Moreover, I saw no real need for that interface
within PyMVPA, despite the presence of a few other online algorithms.

But yes, it would be cool to have such an interface and to play with it, as
Patrik would like to do. Maybe it does not require much work (see below).

> Also as a side-question: what was original purpose of w_guess --
> shouldn't it be used as a starting point (allowing actually online
> training in terms of consecutive calls to irelief with new data) -- now
> it is not used (besides conditioning initialization of w, and would lead
> to failure if you set it to something) ?
>
>    

You are right. The I-Relief online algorithm is meant to have an initial
guess that can be used for the online mechanisms and not only for random
initialization. The implementation provides that feature but again this
is not a full/nice online interface to do online feature scoring. But it
is close :-)

So, as a first step, Patrik could pass a batch of data to the online
I-Relief, get w as the answer, then provide the next batch together with w
as w_guess (the initial guess), in order to have a preliminary online
interface that can solve his problem in the short term.
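
A minimal sketch of that batch-by-batch workflow follows. It does not use
the PyMVPA API: relief_batch_update is a hypothetical, simplified
Relief-style stand-in written in plain NumPy, and its signature, the lr
parameter, and the toy data are assumptions made only for illustration. The
point is the calling pattern, where the w returned for one batch is passed
as w_guess for the next.

```python
import numpy as np


def relief_batch_update(X, y, w_guess=None, lr=0.1):
    """Hypothetical stand-in for one call to an online feature scorer.

    This is a simplified Relief-style update, NOT PyMVPA's
    IterativeReliefOnline. It assumes every batch contains at least
    two samples of each class.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = X.shape

    # Warm start from the previous weights if given, else start uniform.
    if w_guess is None:
        w = np.full(n_features, 1.0 / np.sqrt(n_features))
    else:
        w = np.asarray(w_guess, dtype=float).copy()

    for i in range(n_samples):
        diff = np.abs(X - X[i])      # per-feature distances to sample i
        dist = diff @ w              # weighted L1 distance to sample i
        dist[i] = np.inf             # never pick the sample itself
        hit = np.argmin(np.where(y == y[i], dist, np.inf))
        miss = np.argmin(np.where(y != y[i], dist, np.inf))

        # Reward features that separate the nearest miss from the nearest
        # hit, then clip to non-negative values and renormalize.
        w = np.maximum(w + lr * (diff[miss] - diff[hit]), 0.0)
        norm = np.linalg.norm(w)
        if norm > 0:
            w /= norm
    return w


# Toy data: only feature 0 carries class information.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 5))
X[:, 0] += 4.0 * y

# Feed the data in four batches, reusing w as the next initial guess.
w = None
for start in range(0, 200, 50):
    batch = slice(start, start + 50)
    w = relief_batch_update(X[batch], y[batch], w_guess=w)
```

With the real implementation the loop would look the same, as long as the
w_guess argument is actually used as the starting point.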

Yarik, you are most probably the best person to talk about the design of
interfaces in PyMVPA. Are you considering a common "online" interface for
all the current and future algorithms that can be used in an online
fashion? What would be the best thing to do?
Given the very nice architecture of PyMVPA, I would not like something
specific just for I-Relief.

Regards,

Emanuele

P.S. About a bad initialization (w_guess) that could lead to failure: I do
not expect failure, since the original author of the I-Relief algorithm
proved it to be convex (see the reference in the docstring of the
implementation). So a bad initialization can slow down the convergence
rate, but I would not expect failure. I mean, excluding numerical
instability issues ;-)
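
To illustrate the convexity argument in general terms, here is a small
sketch using plain gradient descent on a convex quadratic. This is not the
I-Relief objective; the matrix A, the vector b, the step size, and the two
starting points are arbitrary assumptions. It only shows the behaviour
described above: a poor starting point costs more iterations, but both runs
reach the same minimizer.

```python
import numpy as np


def gradient_descent(w0, A, b, lr=0.1, tol=1e-6, max_iter=10000):
    """Minimize f(w) = 0.5 * w.T @ A @ w - b @ w for a positive definite A.

    Returns the final w and the number of update steps that were taken.
    """
    w = np.asarray(w0, dtype=float).copy()
    for iteration in range(max_iter):
        grad = A @ w - b
        if np.linalg.norm(grad) < tol:
            return w, iteration
        w -= lr * grad
    return w, max_iter


# Illustrative convex problem with unique minimizer w* = (1.0, 0.1).
A = np.diag([1.0, 10.0])
b = np.array([1.0, 1.0])

w_near, n_near = gradient_descent([0.9, 0.1], A, b)        # good guess
w_far, n_far = gradient_descent([1000.0, -1000.0], A, b)   # bad guess

print("good guess:", w_near, "after", n_near, "steps")
print("bad guess: ", w_far, "after", n_far, "steps")
```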