[Pkg-exppsy-pymvpa] model selection and other questions

Mon Mar 31 14:24:37 UTC 2008

Hi Emanuele,

On Mon, Mar 31, 2008 at 03:49:57PM +0200, Emanuele Olivetti wrote:
> Hi all,
> 
> I've just installed PyMVPA after reading Yaroslav's thread
> on SciPy-dev. I work in machine learning and started to work
> recently on fMRI data. Since I'd like to contribute to
> scikits.learn (or better, to the effort of building machine
> learning tools to NumPy/SciPy) I'm wondering which will be
> the main machine learning framework for that community. As
> far as I understand, there is still debate between learn
> and PyMVPA, but the latter is gaining momentum (i.e., there
> are people actually working on it).
> 
> So here are some questions:
> - Is this the right place to discuss PyMVPA? According to
> the mailing list archives there is more traffic on SciPy-dev
> than here.
Yes, this is definitely the place to discuss PyMVPA.

> - After the Paris sprint, is there evidence that PyMVPA
> will actually replace scikits.learn in the near future?
The current situation is that it will _not_ replace scikits.learn. We
intend to keep PyMVPA with its workflow separate from scikits.learn. In
Paris we agreed that scikits should rather be a collection of generic
algorithms. So in the future PyMVPA will be based on functionality in
scikits.learn, but the actual user interface with its workflow will not
necessarily be part of it.
One reason is that PyMVPA is somewhat focused on neuroimaging data and
scikits.learn has (or should have) a much wider focus.

> - Is there some model selection solution currently available
> in PyMVPA? I mean something to infer hyperparameters of the
> classifiers (like the sigma of the RBF kernel in SVMs) from
> data? Libsvm provides some extra tools for grid search; I
> work with optimization techniques that can be generalized to
> many classifiers etc. What about PyMVPA?
Having done the first release, this is now one of our main development targets.
We have plans for a generic optimization interface, i.e. an OptimizedClassifier
that can apply a number of model selection algorithms. However, to be
really flexible, we have to do some work on unifying the parameter
interface of our classifiers. This will be done during the final phase of
the integration of the shogun toolbox (http://www.shogun-toolbox.org/),
which is already somewhat usable.

So, there probably won't be a separate algorithm sitting on top of the
classifiers (like the libsvm grid search script), but another meta-classifier
that enables parameter optimization for every other type of classifier
and additionally can be used as a classifier in its own right.
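
To make the meta-classifier idea concrete, here is a rough sketch in plain
Python. It is not PyMVPA code: KNNClassifier, GridSearchClassifier and the
train()/predict() interface are hypothetical names chosen for illustration,
and exhaustive grid search with interleaved cross-validation folds merely
stands in for whatever model selection algorithm would eventually be used.

```python
import itertools


class KNNClassifier:
    """Toy k-nearest-neighbour classifier; `k` is its only hyperparameter."""

    def __init__(self, k=1):
        self.k = k

    def train(self, samples, labels):
        self._samples = list(samples)
        self._labels = list(labels)

    def predict(self, samples):
        return [self._predict_one(s) for s in samples]

    def _predict_one(self, sample):
        # Sort training samples by squared Euclidean distance to `sample`.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(sample, t)), label)
            for t, label in zip(self._samples, self._labels)
        )
        votes = [label for _, label in dists[: self.k]]
        # Majority vote among the k nearest neighbours.
        return max(sorted(set(votes)), key=votes.count)


class GridSearchClassifier:
    """Hypothetical meta-classifier: selects hyperparameters by
    cross-validation in train(), then behaves like a normal classifier."""

    def __init__(self, factory, grid, n_folds=4):
        self.factory = factory    # callable that builds a base classifier
        self.grid = grid          # dict: parameter name -> candidate values
        self.n_folds = n_folds

    def _cv_error(self, params, samples, labels):
        errors = 0
        for fold in range(self.n_folds):
            # Interleaved folds: every n_folds-th sample is held out.
            held_out = set(range(fold, len(samples), self.n_folds))
            clf = self.factory(**params)
            clf.train(
                [s for i, s in enumerate(samples) if i not in held_out],
                [l for i, l in enumerate(labels) if i not in held_out],
            )
            test_idx = sorted(held_out)
            preds = clf.predict([samples[i] for i in test_idx])
            errors += sum(p != labels[i] for p, i in zip(preds, test_idx))
        return errors / len(samples)

    def train(self, samples, labels):
        names = sorted(self.grid)
        candidates = [
            dict(zip(names, values))
            for values in itertools.product(*(self.grid[n] for n in names))
        ]
        # Keep the parameter set with the lowest cross-validated error.
        self.best_params = min(
            candidates, key=lambda p: self._cv_error(p, samples, labels)
        )
        # Retrain on all data with the selected parameters.
        self._clf = self.factory(**self.best_params)
        self._clf.train(samples, labels)

    def predict(self, samples):
        return self._clf.predict(samples)


# Usage: two well-separated clusters, three candidate values for k.
samples = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
           (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.3, 5.3)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

meta = GridSearchClassifier(KNNClassifier, {"k": [1, 3, 5]}, n_folds=4)
meta.train(samples, labels)
predictions = meta.predict([(0.1, 0.1), (5.1, 5.1)])
```

Because the wrapper exposes the same train()/predict() calls as the
classifier it wraps, it can be dropped in wherever a plain classifier is
expected. That only works if every base classifier accepts its parameters
in a uniform way, which is why the parameter interface needs unifying first.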

Michael

-- 
GPG key:  1024D/3144BE0F Michael Hanke
http://apsy.gse.uni-magdeburg.de/hanke
ICQ: 48230050