[pymvpa] Feature selection for multiple classifiers

Nick Oosterhof nikolaas.oosterhof at unitn.it
Wed Jul 30 11:35:34 UTC 2014


On Jul 30, 2014, at 12:24 AM, Richard Dinga <dinga92 at gmail.com> wrote:

> I have a question regarding feature selection when more than one classifier is involved, because there are more than two classes. If I understand correctly, in a multi-class problem PyMVPA trains a classifier for every possible pair of classes and decides the result by vote. So if I select the 100 best voxels beforehand by ANOVA, all the classifiers would be trained on them, and this subset might not contain informative voxels for every pair of classes. How can I set it up so that every classifier chooses the best voxels for its own pair of classes?

You could do nested cross-validation.

An example is here:

http://dev.pymvpa.org/examples/nested_cv.html

and a paper describing the importance of nesting is here:

http://nilab.cimec.unitn.it/people/olivetti/work/papers/olivetti2010brain.pdf
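
For the per-pair selection you describe, the general idea would be to put the feature selection inside the classifier that gets trained per pair, and to run the selection within the cross-validation so the voxels are re-chosen on every training partition. A rough, untested sketch (the ANOVA-based 100-voxel selector mirrors your description; MulticlassClassifier provides the pairwise voting scheme):

from mvpa2.suite import (LinearCSVMC, MulticlassClassifier,
                         FeatureSelectionClassifier,
                         SensitivityBasedFeatureSelection,
                         OneWayAnova, FixedNElementTailSelector,
                         CrossValidation, NFoldPartitioner)

# keep the 100 voxels with the highest ANOVA F-scores
fsel = SensitivityBasedFeatureSelection(
    OneWayAnova(),
    FixedNElementTailSelector(100, mode='select', tail='upper'))

# each pairwise classifier re-runs the selection on its own two classes
clf = MulticlassClassifier(FeatureSelectionClassifier(LinearCSVMC(), fsel))

# selection happens on each training partition only, never on test data
cv = CrossValidation(clf, NFoldPartitioner())
# results = cv(ds)   # ds: dataset with .targets and .chunks set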

> 
> And a related question: let's say I have 8 classes and I created a tree like this 
> clf = TreeClassifier(FeatureSelectionClassifier(LinearCSVMC(), fsel),
>                      {'a': ((1,2), LinearCSVMC()),
>                       'b': ((3,4), LinearCSVMC()),
>                       'c': ((5,6), LinearCSVMC()),
>                       'd': ((7,8), LinearCSVMC())})
> 
> Would the first classifier select the best voxels for dividing into 8 classes or into 4?

Into four classes: the first classifier is trained to distinguish the four groups 'a' through 'd', and only the secondary classifiers discriminate between the two classes within each group.

> And on which voxels would the secondary classifiers be trained?

I'm pretty sure it would use /all/ voxels (features).
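
If you want the secondary classifiers to pick their own voxels as well, you could wrap each of them in a FeatureSelectionClassifier too. An untested sketch (it assumes an ANOVA-based fsel like the one in your snippet, built fresh for every node so each classifier selects independently from the data it is trained on):

from mvpa2.suite import (LinearCSVMC, TreeClassifier,
                         FeatureSelectionClassifier,
                         SensitivityBasedFeatureSelection,
                         OneWayAnova, FixedNElementTailSelector)

def fs_svm():
    # a fresh ANOVA-based selector per classifier, so each node
    # selects its own 100 voxels from its own training data
    fsel = SensitivityBasedFeatureSelection(
        OneWayAnova(),
        FixedNElementTailSelector(100, mode='select', tail='upper'))
    return FeatureSelectionClassifier(LinearCSVMC(), fsel)

clf = TreeClassifier(fs_svm(),
                     {'a': ((1, 2), fs_svm()),
                      'b': ((3, 4), fs_svm()),
                      'c': ((5, 6), fs_svm()),
                      'd': ((7, 8), fs_svm())})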

