[pymvpa] Sensitivity map with RFE?
Yaroslav Halchenko
debian at onerussian.com
Thu Jul 18 21:46:00 UTC 2013
Here is the complete snippet I am pushing in as a "usecase" unittest to
obtain both the sensitivities for each split and the generalization errors from
the cross-validation. I will also file a bug report so we do not forget to
address this issue:
clfsvm = LinearCSVMC()
rfesvm = RFE(clfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample()),
             CrossValidation(
                 clfsvm,
                 NFoldPartitioner(),
                 errorfx=mean_mismatch_error, postproc=mean_sample()),
             Repeater(2),
             fselector=FractionTailSelector(0.70, mode='select', tail='upper'),
             stopping_criterion=NBackHistoryStopCrit(BestDetector(), 10),
             update_sensitivity=True)
fclfsvm = FeatureSelectionClassifier(clfsvm, rfesvm)
sensanasvm = fclfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample())

# manually repeating/splitting so we do both RFE sensitivity and classification
senses, errors = [], []
for i, pset in enumerate(NFoldPartitioner().generate(fds)):
    # split partitioned dataset
    split = [d for d in Splitter('partitions').generate(pset)]
    # and it also should train the classifier so we would ask it about error
    senses.append(sensanasvm(split[0]))
    errors.append(mean_mismatch_error(fclfsvm.predict(split[1]),
                                      split[1].targets))

senses = vstack(senses)
errors = vstack(errors)
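
Once the per-split sensitivities are stacked, one possible way to reduce them to a single map plus a rough stability measure is sketched below in plain NumPy. This is only an illustration on a toy array: the values and the names `toy_senses`, `mean_map`, `selected_fraction` and `always_selected` are made up, and it assumes that features discarded by RFE in a split show up with a sensitivity of exactly 0, which should be checked against the real `senses.samples`.

```python
import numpy as np

# Toy stand-in for senses.samples: 3 splits x 5 features (made-up values).
# Assumption: a feature that RFE dropped in a split has sensitivity 0 there.
toy_senses = np.array([
    [0.0, 0.5, 0.0, 1.0, 0.5],
    [0.0, 1.0, 0.5, 1.0, 0.0],
    [0.0, 1.5, 0.0, 1.0, 0.0],
])

# Average sensitivity of each feature across splits.
mean_map = toy_senses.mean(axis=0)

# Fraction of splits in which each feature survived the selection.
selected_fraction = (toy_senses != 0).mean(axis=0)

# Indices of features that were selected in every split.
always_selected = np.flatnonzero(selected_fraction == 1.0)
```

With real data, `senses.samples` would take the place of `toy_senses`, and the resulting one-row map could then be projected back into the volume, for instance with `map2nifti` as attempted in the quoted message below.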
Probably the same could have been accomplished via a callback to
CrossValidation.
Also -- this construct is targeting a "correct" RFE procedure to
estimate generalization errors; that is why the stopping point for RFE in each
training split is determined by the nested cross-validation. But if
you do not care about generalization and just want some
sensitivity map based on RFE, you could probably just call sensanasvm(fds)
to get it. So the question would be -- what are you going to do with
these results? ;)
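
To make the nested scheme a bit more concrete, here is a minimal NumPy-only sketch of the idea; it is not PyMVPA code and not what the snippet above executes. Several things are assumptions made for illustration: a ridge least-squares classifier stands in for LinearCSVMC, the stopping rule is simplified to "keep the feature set with the lowest inner cross-validation error" rather than NBackHistoryStopCrit, and all function names and the random toy data are hypothetical.

```python
import numpy as np

def fit_linear(X, y, ridge=1e-3):
    """Ridge least-squares weights; a simple stand-in for a linear SVM."""
    A = X.T.dot(X) + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T.dot(y))

def error_rate(w, X, y):
    """Fraction of samples whose predicted sign differs from the target."""
    return np.mean(np.sign(X.dot(w)) != y)

def cv_error(X, y, chunks):
    """Leave-one-chunk-out error, averaged over folds."""
    errs = []
    for c in np.unique(chunks):
        train, test = chunks != c, chunks == c
        w = fit_linear(X[train], y[train])
        errs.append(error_rate(w, X[test], y[test]))
    return np.mean(errs)

def rfe(X, y, chunks, keep=0.7):
    """Return the feature indices with the lowest inner CV error."""
    feats = np.arange(X.shape[1])
    best_err, best_feats = np.inf, feats
    while True:
        err = cv_error(X[:, feats], y, chunks)
        if err <= best_err:  # on ties prefer the smaller feature set
            best_err, best_feats = err, feats
        n_keep = int(len(feats) * keep)
        if n_keep < 1:
            break
        # rank features by absolute weight and keep the upper tail
        w = fit_linear(X[:, feats], y)
        order = np.argsort(np.abs(w))[::-1]
        feats = np.sort(feats[order[:n_keep]])
    return best_feats

def nested_rfe(X, y, chunks):
    """Outer loop: RFE sees only the training chunks of each split."""
    senses, errors = [], []
    for c in np.unique(chunks):
        train, test = chunks != c, chunks == c
        feats = rfe(X[train], y[train], chunks[train])
        w = fit_linear(X[train][:, feats], y[train])
        sens = np.zeros(X.shape[1])  # dropped features stay at 0
        sens[feats] = np.abs(w)
        senses.append(sens)
        errors.append(error_rate(w, X[test][:, feats], y[test]))
    return np.vstack(senses), np.array(errors)

# Toy data: 4 chunks x 10 samples, 20 features, the first 2 informative.
rng = np.random.RandomState(0)
y = np.tile([-1.0, 1.0], 20)
chunks = np.repeat(np.arange(4), 10)
X = rng.randn(40, 20)
X[:, :2] += 2 * y[:, None]

senses, errors = nested_rfe(X, y, chunks)
```

Each row of `senses` and each entry of `errors` belongs to one outer split. Because the held-out chunk is never seen while features are being selected, the errors remain usable as generalization estimates; calling `rfe` once on the full data would give a map but no such estimate.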
On Thu, 18 Jul 2013, marco tettamanti wrote:
> Sorry Yaroslav,
> but that surpasses my coding skills :-(
> >Yaroslav Halchenko debian at onerussian.com
> >Thu Jul 18 13:29:36 UTC 2013
> > senses = []
> > for i, pset in enumerate(NFoldPartitioner().generate(dataset)):
> >     # split partitioned dataset
> >     split = [d for d in Splitter('partitions').generate(pset)]
> >     senses.append(senssvm(split[0]))
> Do you mean, something like:
> ----------------------------------------------------------
> clfsvm = SplitClassifier(LinearCSVMC(), NFoldPartitioner())
> rfesvm = RFE(clfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample()),
>              ConfusionBasedError(clfsvm, confusion_state='stats'),
>              Repeater(2),
>              fselector=FractionTailSelector(0.30, mode='select', tail='upper'),
>              stopping_criterion=NBackHistoryStopCrit(BestDetector(), 10),
>              train_pmeasure=False, update_sensitivity=True)
> fclfsvm = FeatureSelectionClassifier(clfsvm, rfesvm)
> sensanasvm = fclfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample())
> cv_sensana_svm = RepeatedMeasure(sensanasvm, NFoldPartitioner())
> senses = []
> for i, pset in enumerate(NFoldPartitioner().generate(fds)):
>     # split partitioned dataset
>     split = [d for d in Splitter('partitions').generate(pset)]
>     senses.append(cv_sensana_svm(split[0]))
> ----------------------------------------------------------
> This seems to run, senses incorporates 6 splits, but then I do not
> know how to proceed further to get the sensitivity map, with
> something like:
> print senses.samples # print senses[0].samples ?
> ov = MapOverlap()
> stabil_overlap_fraction_svm = ov(senses.samples > 0)
> print stabil_overlap_fraction_svm
> niftiresults = map2nifti(fds, senses)
> Thank you again!
> Marco
--
Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Senior Research Associate, Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik