[pymvpa] Using FeatureSelectionClassifier for feature elimination
James M. Hughes
james.m.hughes at Dartmouth.EDU
Thu Jul 17 02:04:04 UTC 2008
On Jul 16, 2008, at 16:36 , Yaroslav Halchenko wrote:
> then the rest is easy: feature_selector is a helper which selects
> features for us (discarding 20% in the current example), and
> update_sensitivity says that we would like to reassess the
> sensitivity of rfesvm at each step of RFE. Otherwise we could simply
> compute the sensitivity once with all the features and start pruning
> without explicitly retraining the classifier for each selected subset
> of features to get a new sensitivity. (Implicitly we still need to
> retrain it, because we use ConfusionBasedError, so the error would
> change while we are altering the subset of features.)
> update_sensitivity = False is appropriate if we use a
> sensitivity_analyzer which is not classifier-based, i.e. something
> like Anova.
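If I follow, the loop you describe amounts to something like the
following. This is a library-free sketch of the idea, not the actual
PyMVPA code; rfe_sketch, sensitivity_fn, and discard_fraction are names
I made up for illustration:

```python
# Sketch of the RFE loop described above: at each step, rank features
# by a sensitivity measure, discard the lowest 20%, and -- when
# update_sensitivity is True -- recompute sensitivities on the
# surviving subset before the next pruning step (which, in the real
# RFE with ConfusionBasedError, implies retraining the classifier).

def rfe_sketch(features, sensitivity_fn, discard_fraction=0.2,
               update_sensitivity=True, min_features=1):
    """Return the surviving feature ids after repeated pruning.

    features       -- list of feature ids
    sensitivity_fn -- maps a list of feature ids to {id: score}
    """
    selected = list(features)
    scores = sensitivity_fn(selected)          # initial sensitivities
    while len(selected) > min_features:
        n_discard = int(len(selected) * discard_fraction)
        if n_discard == 0:
            break
        # prune the features with the lowest sensitivity
        ranked = sorted(selected, key=lambda f: scores[f])
        selected = ranked[n_discard:]
        if update_sensitivity:
            # reassess sensitivity on the reduced feature subset
            scores = sensitivity_fn(selected)
    return selected
```

With update_sensitivity=False the initial scores are reused for every
pruning step, which matches the non-classifier-based case (e.g. Anova)
where retraining would buy nothing.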
so when we call clf.train(dataset_1), where clf is a SplitClassifier,
it selects features and trains several classifiers using the splits?
When we then call clf.predict(dataset_test.samples), it uses only
those features selected during training? (This is the main point of
confusion for me -- i.e., whether calling 'predict' on the trained
classifier really restricts itself to the selected features.) This is
probably obvious, but I just want to be 100% sure about it.
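Concretely, here is a toy sketch of what I understand to happen -- not
the actual PyMVPA implementation; FeatureSelectingClassifier, Dummy
base classifiers, and select_fn are names I invented for illustration:

```python
# Toy sketch of my understanding: feature selection happens inside
# train(), and predict() transparently slices the test samples down to
# the feature ids that were selected during training.

class FeatureSelectingClassifier:
    def __init__(self, base_clf, select_fn):
        self.base_clf = base_clf    # any object with train()/predict()
        self.select_fn = select_fn  # (samples, labels) -> feature ids
        self.selected = None

    def train(self, samples, labels):
        # remember which features were chosen on the training data
        self.selected = self.select_fn(samples, labels)
        reduced = [[row[i] for i in self.selected] for row in samples]
        self.base_clf.train(reduced, labels)

    def predict(self, samples):
        # test samples are sliced to the *training-time* selection
        reduced = [[row[i] for i in self.selected] for row in samples]
        return self.base_clf.predict(reduced)
```

So the caller hands predict() full-width samples, and the wrapper takes
care of using only the surviving features.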
Thanks for the email -- it cleared up almost all of my questions. I'd
be happy to write up the docs for RFE based on what you wrote, once I
get the hang of using the package.
Cheers,
James.
More information about the Pkg-ExpPsy-PyMVPA mailing list