[pymvpa] time complexity of IterativeRelief feature selection
Emanuele Olivetti
emanuele at relativita.com
Wed Jan 12 15:33:31 UTC 2011
On 01/12/2011 01:42 PM, Brian Murphy wrote:
>
> I've been running an IterativeRelief feature selection for five days now, and it still
> hasn't completed. Does anyone have experience to help me get a ball-park estimate of how
> long it should take? My dataset is ~300 samples by ~25,000 features. I see on the API
> documentation that the algorithm has complexity "O(T*N^2*I), where T is the number of
> iterations, N the number of instances, I the number of features". Any idea what the
> number of iterations should/might be? I don't see a parameter to set this,
>
Hi Brian,
My guess is that at least one of the following situations occurs:
- The initial guess you are starting from is not good for your
problem. Did you normalize the data?
- The threshold you are using is too low, so it takes ages, maybe
without any real gain. Try increasing it. Again, data normalization
should play an important role.
- Your problem is not so friendly towards optimization :-) so a
stochastic gradient strategy like IterativeReliefOnline might help.
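
To make the normalization point concrete, here is a minimal sketch of
per-feature z-scoring in plain NumPy. It is not PyMVPA's own
normalization code; the function name zscore_features and the random
toy data are only illustrative assumptions.

```python
import numpy as np


def zscore_features(X):
    """Standardize each column (feature) to zero mean and unit variance.

    Hypothetical helper for illustration, not part of PyMVPA.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Constant features have std == 0; divide by 1 instead so they
    # end up as all zeros rather than NaN.
    safe_std = np.where(std > 0, std, 1.0)
    return (X - mean) / safe_std


# Toy data: 300 samples, 3 features on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=[1.0, 10.0, 100.0], size=(300, 3))
Z = zscore_features(X)
```

Without a step like this, features with a large numeric range dominate
the distances between samples, which is one plausible reason for slow
convergence.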
In any case, I strongly suggest that you enable the debug mode
and observe how the convergence statistics evolve. That will
tell you/us more about where the problem is.
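
As for a ball-park figure, the O(T*N^2*I) complexity quoted above can be
turned into a rough count of elementary terms. The sketch below does
that for N = 300 and I = 25,000. The function names and the throughput
figure ops_per_second are assumptions for illustration only; the
constant factor hidden by the big-O notation is unknown, so this gives
an order of magnitude at best.

```python
def irelief_terms(n_iterations, n_samples, n_features):
    """Number of terms implied by O(T * N^2 * I), ignoring constants."""
    return n_iterations * n_samples ** 2 * n_features


def estimate_seconds(n_iterations, n_samples, n_features, ops_per_second):
    """Rough runtime, assuming a hypothetical sustained throughput."""
    terms = irelief_terms(n_iterations, n_samples, n_features)
    return terms / ops_per_second


# One iteration on ~300 samples by ~25,000 features:
# 300**2 * 25000 = 2.25e9 terms.
per_iteration = irelief_terms(1, 300, 25000)

# Assumed throughput of 1e7 terms per second (a guess, not a measurement).
seconds_for_10_iterations = estimate_seconds(10, 300, 25000, 1e7)
```

Timing a single iteration with the debug output enabled and multiplying
by the number of iterations observed would give a far better estimate
than any assumed throughput.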
Emanuele