[pymvpa] RFE + SplitClassifier

Fri Jul 20 06:35:25 UTC 2012

Dear all,

I'm currently working on my master's thesis and using the PyMVPA toolbox for
the analysis of my fMRI data. My script for Recursive Feature Elimination
(RFE) works with a CrossValidation but unfortunately not with a
SplitClassifier. Could you please give me some advice on that?

In my script (see below) I use the RFE example from the documentation. If
I add a CrossValidation, I get an error value for each validation step. But
I'm also interested in the sensitivity maps of each step, and I couldn't
figure out whether that is possible with CrossValidation. Therefore, I tried
to use a SplitClassifier, but I always get the same error message in
self.train(ds).

Could someone tell me the difference between SplitClassifier and
CrossValidation? I assumed that the SplitClassifier also does a
cross-validation internally. What do I have to change in my code to make
it work?

Thank you very much in advance,
Matthias Hampel

    from mvpa2.suite import *

    rfesvm_split = SplitClassifier(LinearCSVMC(), OddEvenPartitioner())

    rfe = RFE(rfesvm_split.get_sensitivity_analyzer(
            # take sensitivities per each split, L2 norm, mean, abs them
            postproc=ChainMapper([ FxMapper('features', l2_normed),
                                   FxMapper('samples', np.mean),
                                   FxMapper('samples', np.abs)])),
                  # use the error stored in the confusion matrix of split classifier
                  ConfusionBasedError(rfesvm_split, confusion_state='stats'),
                  # we just extract error from confusion, so need to split dataset
                  Repeater(2),
                  # select 20% of the best on each step
                  fselector=FractionTailSelector(
                      0.20,
                      mode='select', tail='upper'),
                  # and stop whenever error didn't improve for up to 10 steps
                  stopping_criterion=NBackHistoryStopCrit(BestDetector(), 10),
                  # we just extract it from existing confusion
                  train_pmeasure=False,
                  # but we do want to update sensitivities on each step
                  update_sensitivity=True)

    clf = FeatureSelectionClassifier(
            LinearCSVMC(),
            # on features selected via RFE
            rfe,
            # custom description
            descr='LinSVM+RFE(splits_avg)' )

    sclf = SplitClassifier(clf, enable_ca=['stats'])
    cv_sensana = sclf.get_sensitivity_analyzer()
    sens = cv_sensana(dataset)
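
As far as I understand the postproc chain in the script above, it takes one sensitivity vector per split, L2-normalizes each of them, averages them across splits, and then takes the absolute value. Below is a minimal plain-Python sketch of that reading. It is not PyMVPA code: the function name combine_split_sensitivities and the toy numbers are made up for illustration, and it assumes that l2_normed divides each vector by its L2 norm.

```python
import math


def combine_split_sensitivities(per_split):
    """Combine per-split sensitivity vectors into a single map.

    per_split: list of lists, one sensitivity vector per split
    (hypothetical input format, chosen for this illustration only).
    """
    normed = []
    for sens in per_split:
        # L2-normalize the sensitivities of this split
        norm = math.sqrt(sum(v * v for v in sens))
        # an all-zero vector stays all-zero instead of dividing by zero
        normed.append([v / norm if norm else 0.0 for v in sens])

    # average each feature across splits
    n_splits = len(normed)
    mean = [sum(column) / n_splits for column in zip(*normed)]

    # absolute value, so that only the magnitude ranks the features
    return [abs(v) for v in mean]


# toy example: two splits, two features
combined = combine_split_sensitivities([[3.0, -4.0], [0.0, -2.0]])
```

In this reading, the combined vector is what the feature selector would rank when it keeps the upper 20% of features at each RFE step.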