[pymvpa] Sensitivity map with RFE?

marco tettamanti mrctttmnt at gmail.com
Fri Jul 19 11:44:07 UTC 2013


Great, thank you Yaroslav!
This works perfectly.

Regarding the more philosophical question of what I want to do with it: I am 
leaving for holidays now, so I will have plenty of time to think about this ;-)

My idea was, since I am doing cross-subject classification, to use RFE and 
obtain a sensitivity map for each fold (one subject left out per fold), as a 
sensible measure of cross-subject generalization. Then, following the ideas 
developed in the PyMVPA Manual for feature selection, I might also take the 
per-feature maximum of absolute sensitivities across the maps (or some other 
representative summary) to obtain a single map that is easier to inspect for the 
anatomical distribution of sensitivity.
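
Concretely, once the per-subject sensitivity maps are stacked into one dataset 
(the senses variable in your snippet below), I imagine something along these 
lines, with the output filename being just a placeholder:

    # per-feature maximum of absolute sensitivities across the leave-one-subject-out
    # folds; maxofabs_sample() collapses the samples axis, as it does in the analyzers
    summary = maxofabs_sample().forward(senses)
    # reverse-map the single-sample summary into brain space for anatomical inspection
    niftiresult = map2nifti(fds, summary)
    niftiresult.to_filename('maxofabs_sensitivity.nii.gz')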

Hope it makes sense...

All the best,
Marco

> Yaroslav Halchenko debian at onerussian.com
> Thu Jul 18 21:46:00 UTC 2013
>
> here would be the complete snippet I am pushing in as a "usecase" unittest to
> obtain both the sensitivities for each split and the generalization errors from
> the cross-validation...  I will also file a bug report so we do not forget to
> address this issue:
>
>     clfsvm = LinearCSVMC()
>
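>     # RFE driven by the SVM sensitivities, with the stopping point on each training
>     # set determined by the nested cross-validation below:
>     # - Repeater(2) feeds the same training data to both the sensitivity measure and
>     #   the nested CrossValidation (which does its own partitioning)
>     # - FractionTailSelector(0.70, ...) keeps the top 70% of features per elimination step
>     # - NBackHistoryStopCrit(BestDetector(), 10) stops after 10 steps without improvement
>     # - update_sensitivity=True recomputes sensitivities after each elimination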
>     rfesvm = RFE(clfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample()),
>                  CrossValidation(
>                      clfsvm,
>                      NFoldPartitioner(),
>                      errorfx=mean_mismatch_error, postproc=mean_sample()),
>                  Repeater(2),
>                  fselector=FractionTailSelector(0.70, mode='select', tail='upper'),
>                  stopping_criterion=NBackHistoryStopCrit(BestDetector(), 10),
>                  update_sensitivity=True)
>
>     fclfsvm = FeatureSelectionClassifier(clfsvm, rfesvm)
>
>     sensanasvm = fclfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample())
>
>
>     # manually repeating/splitting so we do both RFE sensitivity and classification
>     senses, errors = [], []
>     for i, pset in enumerate(NFoldPartitioner().generate(fds)):
>         # split partitioned dataset
>         split = [d for d in Splitter('partitions').generate(pset)]
>         # computing the sensitivity also trains fclfsvm, so we can ask it for the error next
>         senses.append(sensanasvm(split[0]))
>         errors.append(mean_mismatch_error(fclfsvm.predict(split[1]), split[1].targets))
>
>     senses = vstack(senses)
>     errors = vstack(errors)
>
> probably the same could have been accomplished via a callback to
> CrossValidation.
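
For completeness, everything used in the snippet (LinearCSVMC, RFE, CrossValidation,
NFoldPartitioner, Repeater, FractionTailSelector, NBackHistoryStopCrit, BestDetector,
FeatureSelectionClassifier, Splitter, maxofabs_sample, mean_mismatch_error, mean_sample,
vstack) is assumed to be already imported here, e.g. via the usual:

    # pulls in the full PyMVPA2 API, including all of the names used above
    from mvpa2.suite import *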
>
> Also -- this construct is targeting a "correct" RFE procedure to
> estimate generalization errors; that is why the stopping point for RFE in each
> training split is deduced from the nested cross-validation.  But if
> you do not care about generalization and just want some
> sensitivity map based on RFE -- you could probably just call sensanasvm(fds)
> to get it... so the question would be -- what are you going to do with
> these results? ;)
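
Indeed, for that simpler case I understand the minimal version (reusing sensanasvm
and fds from above) would just be:

    # one RFE-based sensitivity map computed on the full dataset, without any
    # nested estimate of generalization error
    sens_full = sensanasvm(fds)
    # reverse-map it into brain space for anatomical inspection
    niftiresult = map2nifti(fds, sens_full)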
>
>
> On 07/18/2013 04:49 PM, marco tettamanti wrote:
>> Sorry Yaroslav,
>> but that surpasses my coding skills :-(
>>
>>> Yaroslav Halchenko debian at onerussian.com
>>> Thu Jul 18 13:29:36 UTC 2013
>>>
>>>          senses = []
>>>          for i, pset in enumerate(NFoldPartitioner().generate(dataset)):
>>>              # split partitioned dataset
>>>              split = [d for d in Splitter('partitions').generate(pset)]
>>>
>>>              senses.append(senssvm(split[0]))
>>
>> Do you mean something like:
>>
>> ----------------------------------------------------------
>> clfsvm = SplitClassifier(LinearCSVMC(), NFoldPartitioner())
>>
>> rfesvm = RFE(clfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample()),
>>              ConfusionBasedError(clfsvm, confusion_state='stats'),
>>              Repeater(2),
>>              fselector=FractionTailSelector(0.30, mode='select', tail='upper'),
>>              stopping_criterion=NBackHistoryStopCrit(BestDetector(), 10),
>>              train_pmeasure=False, update_sensitivity=True)
>>
>> fclfsvm = FeatureSelectionClassifier(clfsvm, rfesvm)
>>
>> sensanasvm = fclfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample())
>>
>> cv_sensana_svm = RepeatedMeasure(sensanasvm, NFoldPartitioner())
>>
>> senses = []
>> for i, pset in enumerate(NFoldPartitioner().generate(fds)):
>>     # split partitioned dataset
>>     split = [d for d in Splitter('partitions').generate(pset)]
>>     senses.append(cv_sensana_svm(split[0]))
>> ----------------------------------------------------------
>>
>> This seems to run, and senses contains the 6 splits, but then I do not know how
>> to proceed further to get the sensitivity map, with something like:
>>
>>
>> print senses.samples    # or print senses[0].samples ?
>> ov = MapOverlap()
>> stabil_overlap_fraction_svm = ov(senses.samples > 0)
>> print stabil_overlap_fraction_svm
>> niftiresults = map2nifti(fds, senses)
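
For completeness, since senses here ends up as a plain Python list of per-split
sensitivity datasets, it would first need to be stacked before reverse-mapping,
for example:

    senses = vstack(senses)                  # stack the per-split sensitivity datasets
    print senses.samples                     # inspect the stacked sensitivities
    niftiresults = map2nifti(fds, senses)    # reverse-map them into brain space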
>>
>>
>> Thank you again!
>> Marco
>>
>>

-- 
Marco Tettamanti, Ph.D.
Nuclear Medicine Department & Division of Neuroscience
San Raffaele Scientific Institute
Via Olgettina 58
I-20132 Milano, Italy
Phone ++39-02-26434888
Fax ++39-02-26434892
Email: tettamanti.marco at hsr.it
Skype: mtettamanti


