[pymvpa] SVM + RFE

Arman Eshaghi arman.eshaghi at gmail.com
Mon Jan 27 18:32:26 UTC 2014


Hi,
Thanks a lot for your response. Here are the details of my data and the
analysis. My code is exactly what I sent in my first email. I tested it
on the Haxby data and it works perfectly fine. I am also happy to send
you the code along with a part of the data (the full dataset is too large).

All the best,
Arman

*My version of PyMVPA*

In [59]: mvpa2.suite.versions
Out[59]:
{'ctypes': SmartVersion ('1.1.0'),
 'ipython': SmartVersion ('1.1.0'),
 'lxml': SmartVersion ('2.2.3'),
 'matplotlib': SmartVersion ('0.99.1.1'),
 'nibabel': SmartVersion ('1.3.0'),
 'numpy': SmartVersion ('1.4.1'),
 'reportlab': SmartVersion ('2.3'),
 'scipy': SmartVersion ('0.7.2'),
 'skl': SmartVersion ('0.14.1')}
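
The listing above covers the external dependencies only. The PyMVPA release
itself can usually be read from the package. A minimal sketch, assuming the
installed release sets `mvpa2.__version__` (the helper name `pymvpa_version`
is only for illustration):

```python
def pymvpa_version():
    """Return the installed PyMVPA version string, or None if unavailable."""
    try:
        import mvpa2
    except ImportError:
        # PyMVPA is not importable in this environment
        return None
    # __version__ is assumed to be set by the package; fall back to None
    return getattr(mvpa2, '__version__', None)


print(pymvpa_version())
```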


*Debug RFE*
[RFEC] DBG:                    Initiating RFE with training on <Dataset:
55x857366 at float32, <sa:
chunks,partitions,targets,time_coords,time_indices>, <fa: voxel_indices>,
<a:
imghdr,imgtype,lastpartitionset,lastsplit,mapper,partitions_set,repetitons,voxel_dim...>
and testing using <Dataset: 55x857366 at float32, <sa:
chunks,partitions,targets,time_coords,time_indices>, <fa: voxel_indices>,
<a:
imghdr,imgtype,lastpartitionset,lastsplit,mapper,partitions_set,repetitons,voxel_dim...>
[RFEC] DBG:                    Step 0: nfeatures=857366
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-63-a9da6f9cc192> in <module>()
----> 1 results=cvte(gm_lt)

/usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self,
ds)
    257                                    "used and auto training is
disabled."
    258                                    % str(self))
--> 259         return super(Learner, self).__call__(ds)
    260
    261

/usr/lib64/python2.6/site-packages/mvpa2/base/node.pyc in __call__(self, ds)
    109
    110         self._precall(ds)
--> 111         result = self._call(ds)
    112         result = self._postcall(ds, result)
    113

/usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self,
ds)
    495         # always untrain to wipe out previous stats
    496         self.untrain()
--> 497         return super(CrossValidation, self)._call(ds)
    498
    499

/usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self,
ds)
    324                 ca.datasets.append(sds)
    325             # run the beast
--> 326             result = node(sds)
    327             # callback
    328             if not self._callback is None:

/usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self,
ds)
    257                                    "used and auto training is
disabled."
    258                                    % str(self))
--> 259         return super(Learner, self).__call__(ds)
    260
    261

/usr/lib64/python2.6/site-packages/mvpa2/base/node.pyc in __call__(self, ds)
    109
    110         self._precall(ds)
--> 111         result = self._call(ds)
    112         result = self._postcall(ds, result)
    113

/usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self,
ds)
    589                     for i in
dstrain.get_attr(splitter.get_space())[0].unique])
    590         # ask splitter for first part
--> 591         measure.train(dstrain)
    592         # cleanup to free memory
    593         del dstrain

/usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in train(self, ds)
    130             # things might have happened during pretraining
    131             if ds.nfeatures > 0:
--> 132                 result = self._train(ds)
    133             else:
    134                 warning("Trying to train on dataset with no
features present")

/usr/lib64/python2.6/site-packages/mvpa2/clfs/meta.pyc in _train(self,
dataset)
   1346         # XXX: should training be done using whole dataset or just
samples
   1347         # YYY: in some cases labels might be needed, thus better
full dataset
-> 1348         self.__mapper.train(dataset)
   1349
   1350         # for train() we have to provide dataset -- not just
samples to train!

/usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in train(self, ds)
    130             # things might have happened during pretraining
    131             if ds.nfeatures > 0:
--> 132                 result = self._train(ds)
    133             else:
    134                 warning("Trying to train on dataset with no
features present")

/usr/lib64/python2.6/site-packages/mvpa2/featsel/rfe.pyc in _train(self, ds)
    246             # Compute sensitivity map
    247             if self.__update_sensitivity or sensitivity == None:
--> 248                 sensitivity = self._fmeasure(wdataset)
    249                 if len(sensitivity) > 1:
    250                     raise ValueError(

/usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self,
ds)
    251                     debug('LRN', "Auto-training %s on %s",
    252                           (self, ds))
--> 253                 self.train(ds)
    254             else:
    255                 # we always have to have trained before using a
learner

/usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in train(self, ds)
    130             # things might have happened during pretraining
    131             if ds.nfeatures > 0:
--> 132                 result = self._train(ds)
    133             else:
    134                 warning("Trying to train on dataset with no
features present")

/usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _train(self,
dataset)
    805                     True:  "although it was trained previously"}
    806                    [clf.trained]))
--> 807         return clf.train(dataset)
    808
    809

/usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in train(self, ds)
    130             # things might have happened during pretraining
    131             if ds.nfeatures > 0:
--> 132                 result = self._train(ds)
    133             else:
    134                 warning("Trying to train on dataset with no
features present")

/usr/lib64/python2.6/site-packages/mvpa2/clfs/meta.pyc in _train(self,
dataset)
   1268
   1269             if ca.is_enabled("stats"):
-> 1270                 predictions = clf.predict(split[1])
   1271
self.ca.stats.add(split[1].sa[targets_sa_name].value,
   1272                                           predictions,

IndexError: list index out of range
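
The last frame indexes `split[1]`, so the error means that the list produced
for that split held fewer than two datasets. Below is a small stand-alone
illustration of that failure mode. It is not PyMVPA code:
`split_by_partition` is a hypothetical stand-in for a splitter that returns
one group per partition label that actually occurs in the data.

```python
def split_by_partition(samples, partitions, labels=(1, 2)):
    """Group samples by partition label, skipping labels that never occur."""
    groups = []
    for label in labels:
        group = [s for s, p in zip(samples, partitions) if p == label]
        if group:  # an absent label yields no group at all
            groups.append(group)
    return groups


# Both labels present: a training part and a testing part.
ok = split_by_partition(['a', 'b', 'c', 'd'], [1, 1, 2, 2])
print(len(ok))

# Only one label present: a single part, so index 1 does not exist.
bad = split_by_partition(['a', 'b', 'c', 'd'], [1, 1, 1, 1])
try:
    bad[1]
except IndexError as exc:
    print("IndexError: %s" % exc)
```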


*print  MyData.summary()*

In [52]: print gm_lt.summary()
Dataset: 97x857366 at float32, <sa: chunks,targets,time_coords,time_indices>,
<fa: voxel_indices>, <a: imghdr,imgtype,mapper,voxel_dim,voxel_eldim>
stats: mean=0.201735 std=0.44915 var=0.201735 min=0 max=1

Counts of targets in each chunk:
  chunks\targets  m   n
                 --- ---
       0.0        24  18
       1.0        25  30

Summary for targets across chunks
  targets mean std min max #chunks
    m     24.5 0.5  24  25    2
    n      24   6   18  30    2

Summary for chunks across targets
  chunks mean std min max #targets
    0     21   3   18  24     2
    1    27.5 2.5  25  30     2
Sequence statistics for 97 entries from set ['m', 'n']
Counter-balance table for orders up to 2:
Targets/Order O1     |  O2     |
      m:      47  2  |  45  4  |
      n:       1 46  |   2 44  |
Correlations: min=-0.73 max=0.92 mean=-0.01 sum(abs)=42
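
One possible reading of the numbers above, offered as a hypothesis rather
than a diagnosis: the data has only two chunks, and the 55 samples in the RFE
debug line equal the size of chunk 1 (25 + 30), so each training half handed
to RFE would contain a single chunk. If `rfesvm_split` partitions by chunks
again (an assumption, since its definition is not shown), it would have
nothing left to hold out. The counts can be checked with plain Python:

```python
# Per-chunk target counts copied from the summary above.
counts = {0.0: {'m': 24, 'n': 18},
          1.0: {'m': 25, 'n': 30}}

chunk_sizes = dict((chunk, sum(c.values())) for chunk, c in counts.items())
total = sum(chunk_sizes.values())

# Assumed behaviour of a half split over chunks: with two chunks,
# each half is exactly one chunk, so training sees the other one only.
train_sets = []
for held_out in sorted(chunk_sizes):
    training = [c for c in sorted(chunk_sizes) if c != held_out]
    n_train = sum(chunk_sizes[c] for c in training)
    train_sets.append((training, n_train))
    print("train on chunks %s: %d samples, %d unique chunk(s)"
          % (training, n_train, len(training)))
```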


On Mon, Jan 27, 2014 at 9:14 PM, Yaroslav Halchenko
<debian at onerussian.com> wrote:

> Hi Arman,
>
> sorry about that -- we should figure it out since it seems to indeed
> match our docs and provided unit-test.
>
> FWIW -- we will beef up our development to kick out a new release some
> time soon, that one should include a helper class SplitRFE which would
> make it easier to specify a typical RFE procedure.
>
> Meanwhile -- would you be so kind as to let us know:
> - which version of PyMVPA you are using?
>
> - log output if you enable debugging for RFE:
>   export MVPA_DEBUG=RFE.*
>
>   or in python script
>
>   debug.active += ["RFE.*"]
> - details on your dataset:
>
>   print  MyData.summary()
>
> OR instead of the last two:
>
> just take a few non-degenerate features of it (ds = MyData[:, :20]),
> h5save it and share it with us along with your code snippet (verify that
> it runs and produces the same error on the smaller ds) so we can
> reproduce/analyze it?
>
> Cheers!
>
> On Mon, 27 Jan 2014, Arman Eshaghi wrote:
>
> >    Dear all,
> >    I'm struggling with recursive feature selection, a meta classifier (clf),
> >    and the final cross validation. Below is what I have done and what I get
> >    as error. I would very much appreciate if you could help me here.
> >    #from rfe manual
> >    rfe = RFE(
> >        rfesvm_split.get_sensitivity_analyzer(
> >            postproc=ChainMapper([FxMapper('features', l2_normed),
> >                                  FxMapper('samples', np.mean),
> >                                  FxMapper('samples', np.abs)])),
> >        ConfusionBasedError(rfesvm_split, confusion_state='stats'),
> >        Repeater(2),
> >        fselector=FractionTailSelector(0.50, mode='select', tail='upper'),
> >        stopping_criterion=NBackHistoryStopCrit(BestDetector(), 10),
> >        train_pmeasure=False,
> >        update_sensitivity=True)
> >    #Meta-classifier with SVM as the final classifier and rfe as feature
> >    #selector
> >    clf = FeatureSelectionClassifier(LinearCSVMC(), rfe, descr='SVM+RFE')
> >    #cross-validation
> >    cvte = CrossValidation(clf, HalfPartitioner(), enable_ca=['stats'])
> >    #running the analysis
> >    results = cvte(MyData)
> >    ERRORS:
> >    In [37]: results=cvte(gm_lt)
> >
> >    IndexError: list index out of range
>
> > _______________________________________________
> > Pkg-ExpPsy-PyMVPA mailing list
> > Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
> >
> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
>
>
> --
> Yaroslav O. Halchenko, Ph.D.
> http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
> Senior Research Associate,     Psychological and Brain Sciences Dept.
> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
> Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
> WWW:   http://www.linkedin.com/in/yarik
>