[pymvpa] SVM + RFE
Yaroslav Halchenko
debian at onerussian.com
Mon Jan 27 17:44:34 UTC 2014
Hi Arman,
sorry about that -- we should figure it out, since your code indeed seems
to match our docs and the provided unit-test.
FWIW -- we will beef up our development to kick out a new release some
time soon; that one should include a helper class SplitRFE, which should
make it easier to specify a typical RFE procedure.
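To give a rough idea, a typical invocation could end up looking
something like the following sketch (anticipated API only -- names,
arguments, and defaults may still change before the release; the
partitioner and selector here are just common choices):

from mvpa2.suite import *

# sketch: SplitRFE would partition the training data, run RFE on each
# split, and keep the number of features that generalizes best across
# the splits (anticipated API -- may change before release)
rfe = SplitRFE(LinearCSVMC(),
               NFoldPartitioner(),
               fselector=FractionTailSelector(0.5, mode='discard',
                                              tail='lower'))
clf = FeatureSelectionClassifier(LinearCSVMC(), rfe, descr='SVM+SplitRFE')
cvte = CrossValidation(clf, HalfPartitioner(), enable_ca=['stats'])
results = cvte(MyData)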
Meanwhile -- would you be so kind as to let us know:
- the version of PyMVPA you are using?
- log output if you enable debugging for RFE:
export MVPA_DEBUG=RFE.*
or in python script
debug.active += ["RFE.*"]
- details on your dataset:
print MyData.summary()
OR, instead of the last two: just take a few non-degenerate features of
it (ds = MyData[:, :20]), h5save it, and share it with us together with
your code snippet (verify that it still runs and produces the same error
on the smaller ds) so we could reproduce/analyze -- see the sketch below.
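Altogether, a minimal snippet like this would do ('small_ds.hdf5' is
just a placeholder filename):

from mvpa2.suite import *      # provides debug, h5save, etc.

debug.active += ["RFE.*"]      # verbose progress output from RFE
print MyData.summary()         # overview of samples/targets/chunks

ds = MyData[:, :20]            # just a few non-degenerate features
h5save('small_ds.hdf5', ds)    # then share this file with your snippet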
Cheers!
On Mon, 27 Jan 2014, Arman Eshaghi wrote:
> Dear all,
> I'm struggling with recursive feature selection, a meta-classifier (clf),
> and the final cross-validation. Below is what I have done and the error I
> get. I would very much appreciate it if you could help me here.
> # from the RFE manual
> rfe = RFE(rfesvm_split.get_sensitivity_analyzer(
>               postproc=ChainMapper([FxMapper('features', l2_normed),
>                                     FxMapper('samples', np.mean),
>                                     FxMapper('samples', np.abs)])),
>           ConfusionBasedError(rfesvm_split, confusion_state='stats'),
>           Repeater(2),
>           fselector=FractionTailSelector(0.50, mode='select', tail='upper'),
>           stopping_criterion=NBackHistoryStopCrit(BestDetector(), 10),
>           train_pmeasure=False,
>           update_sensitivity=True)
>
> # meta-classifier with SVM as the final classifier and rfe as the
> # feature selector
> clf = FeatureSelectionClassifier(LinearCSVMC(), rfe, descr='SVM+RFE')
>
> # cross-validation
> cvte = CrossValidation(clf, HalfPartitioner(), enable_ca=['stats'])
>
> # running the analysis
> results = cvte(MyData)
> ERRORS:
> In [37]: results=cvte(gm_lt)
> ---------------------------------------------------------------------------
> IndexError                                Traceback (most recent call last)
> <ipython-input-37-a9da6f9cc192> in <module>()
> ----> 1 results=cvte(gm_lt)
>
> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self, ds)
>     257                                    "used and auto training is disabled."
>     258                                    % str(self))
> --> 259         return super(Learner, self).__call__(ds)
>     260
>     261
>
> /usr/lib64/python2.6/site-packages/mvpa2/base/node.pyc in __call__(self, ds)
>     109
>     110         self._precall(ds)
> --> 111         result = self._call(ds)
>     112         result = self._postcall(ds, result)
>     113
>
> /usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>     495         # always untrain to wipe out previous stats
>     496         self.untrain()
> --> 497         return super(CrossValidation, self)._call(ds)
>     498
>     499
>
> /usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>     324                 ca.datasets.append(sds)
>     325             # run the beast
> --> 326             result = node(sds)
>     327             # callback
>     328             if not self._callback is None:
>
> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self, ds)
>     257                                    "used and auto training is disabled."
>     258                                    % str(self))
> --> 259         return super(Learner, self).__call__(ds)
>     260
>     261
>
> /usr/lib64/python2.6/site-packages/mvpa2/base/node.pyc in __call__(self, ds)
>     109
>     110         self._precall(ds)
> --> 111         result = self._call(ds)
>     112         result = self._postcall(ds, result)
>     113
>
> /usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>     589                     for i in dstrain.get_attr(splitter.get_space())[0].unique])
>     590         # ask splitter for first part
> --> 591         measure.train(dstrain)
>     592         # cleanup to free memory
>     593         del dstrain
>
> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in train(self, ds)
>     130             # things might have happened during pretraining
>     131             if ds.nfeatures > 0:
> --> 132                 result = self._train(ds)
>     133             else:
>     134                 warning("Trying to train on dataset with no features present")
>
> /usr/lib64/python2.6/site-packages/mvpa2/clfs/meta.pyc in _train(self, dataset)
>    1346         # XXX: should training be done using whole dataset or just samples
>    1347         # YYY: in some cases labels might be needed, thus better full dataset
> -> 1348         self.__mapper.train(dataset)
>    1349
>    1350         # for train() we have to provide dataset -- not just samples to train!
>
> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in train(self, ds)
>     130             # things might have happened during pretraining
>     131             if ds.nfeatures > 0:
> --> 132                 result = self._train(ds)
>     133             else:
>     134                 warning("Trying to train on dataset with no features present")
>
> /usr/lib64/python2.6/site-packages/mvpa2/featsel/rfe.pyc in _train(self, ds)
>     246             # Compute sensitivity map
>     247             if self.__update_sensitivity or sensitivity == None:
> --> 248                 sensitivity = self._fmeasure(wdataset)
>     249                 if len(sensitivity) > 1:
>     250                     raise ValueError(
>
> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self, ds)
>     251                     debug('LRN', "Auto-training %s on %s",
>     252                           (self, ds))
> --> 253                 self.train(ds)
>     254             else:
>     255                 # we always have to have trained before using a learner
>
> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in train(self, ds)
>     130             # things might have happened during pretraining
>     131             if ds.nfeatures > 0:
> --> 132                 result = self._train(ds)
>     133             else:
>     134                 warning("Trying to train on dataset with no features present")
>
> /usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _train(self, dataset)
>     805                     True:  "although it was trained previously"}
>     806                    [clf.trained]))
> --> 807         return clf.train(dataset)
>     808
>     809
>
> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in train(self, ds)
>     130             # things might have happened during pretraining
>     131             if ds.nfeatures > 0:
> --> 132                 result = self._train(ds)
>     133             else:
>     134                 warning("Trying to train on dataset with no features present")
>
> /usr/lib64/python2.6/site-packages/mvpa2/clfs/meta.pyc in _train(self, dataset)
>    1268
>    1269             if ca.is_enabled("stats"):
> -> 1270                 predictions = clf.predict(split[1])
>    1271                 self.ca.stats.add(split[1].sa[targets_sa_name].value,
>    1272                                   predictions,
>
> IndexError: list index out of range
--
Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Senior Research Associate, Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik