[pymvpa] time complexity of IterativeRelief feature selection
MS Al-Rawi
rawi707 at yahoo.com
Thu Jan 13 12:47:47 UTC 2011
Hi Brian,
Although I'm not familiar with how IterativeRelief works, but, if the
complexity has O(T*N^(2*L)) then it might need few hundred years (or maybe much
more) to finish since the complexity here is ~ O(2^50,000) ... If L is 40,
you'll need roughly 1099511627776 operations, now, suppose each operation takes
1E-7 second ... then, for L =40 our processor will take:
1099511627776 *1E-7 /3600 ~= 30.5 hours
Now the problem is that 2^n is neither linear nor polynomial, in fact, it has
exponential growth. So my guess is that the complexity of O(T*N^(2*L)) is the
for the direct implementation of the algorithm and without any optimization to
speedup the process.
Regards,
-Rawi
________________________________
From: Emanuele Olivetti <emanuele at relativita.com>
To:Development and support of PyMVPA <pkg-exppsy-pymvpa at lists.alioth.debian.org>
Sent:Wed, January 12, 2011 3:33:31 PM
Subject:Re: [pymvpa] time complexity of IterativeRelief feature selection
On 01/12/2011 01:42 PM, Brian Murphy wrote:
>
> I've been running an IterativeRelief feature selection for five days now, and
>it still hasn't completed. Does anyone have experience to help me get a
>ball-park estimate of how long it should take? My dataset is ~300 samples by
>~25,000 features. I see on the API documentation that the algorithm has
>complexity "O(T*N^2*I), where T is the number of iterations, N the number of
>instances, I the number of features". Any idea what the number of iterations
>should/might be? I don't see a parameter to set this,
>
Hi Brian,
My guess is that at least one of the following situations occurs:
- The initial guess you are starting from is not good for your
problem. Did you normalize data?
- The threshold you are using is too low and so it take ages, maybe
without any real gain. Try to increase it. Again data nomalization
should play an important role.
- Your problem is not so friendly towards optimization :-) so a
stochastic gradient strategy like IterativeReliefOnline might help.
In any case I strongly suggest you to enable the debug mode
and observe the evolution of the convergence statistics. It will
tell you/us more on where is the problem.
Emanuele
_______________________________________________
Pkg-ExpPsy-PyMVPA mailing list
Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.alioth.debian.org/pipermail/pkg-exppsy-pymvpa/attachments/20110113/b1482499/attachment.htm>
More information about the Pkg-ExpPsy-PyMVPA
mailing list