[pymvpa] Sensitivity map with RFE?

marco tettamanti mrctttmnt at gmail.com
Thu Jul 18 08:57:51 UTC 2013


Dear Yaroslav,

On 07/17/2013 11:16 PM, marco tettamanti wrote:
> I think that what you are suggesting is that I should go to GitHub and build a
> new pymvpa version from the master. I never did that, but I can try to find out
> how. There should be a way to build a .deb package from git, right?

Oh no, I see, that requires building from source!
I think I have now managed to do it.
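
For anyone else trying this, a quick check along these lines (just a sketch on my
part) confirms that the freshly built PyMVPA is the one actually being imported,
rather than the old Debian 2.2.0 package under /usr/lib/pymodules:

import mvpa2
print mvpa2.__version__   # version string of the new build
print mvpa2.__file__      # should point to the new build, not /usr/lib/pymodules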

And indeed!!
The bug that caused the RFE sensitivity analysis (senssvm) to raise "RuntimeError:
Cannot reverse-map data since the original data shape is unknown. Either set `dshape`
in the constructor, or call train()." is now gone.

I can now obtain a sensitivity map with RFE and the results look meaningful.
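
In case it is useful to others, the maps can be written back out for visual
inspection with something like this (a sketch only, assuming `fds` is the fMRI
dataset and `senssvm` is the result of cv_sensana_svm below):

from mvpa2.suite import map2nifti
nimg = map2nifti(fds, senssvm)                    # reverse-map sensitivities into voxel space
nimg.to_filename('rfe_sensitivity_maps.nii.gz')   # one volume per fold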

Thanks a lot for helping me!

I also confirm that the problem of senssvm producing the same result for each
fold still persists:

In [92]: print senssvm.samples
[[ 0.04880246  0.06381383  0.0809152  ...,  0.          0.          0.        ]
  [ 0.04880246  0.06381383  0.0809152  ...,  0.          0.          0.        ]
  [ 0.04880246  0.06381383  0.0809152  ...,  0.          0.          0.        ]
  [ 0.04880246  0.06381383  0.0809152  ...,  0.          0.          0.        ]
  [ 0.04880246  0.06381383  0.0809152  ...,  0.          0.          0.        ]
  [ 0.04880246  0.06381383  0.0809152  ...,  0.          0.          0.        ]]

The 6 sensitivity maps, one per fold, are identical to each other.
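
A quick way to check this on one's own data (just a sketch, assuming `senssvm`
holds the per-fold sensitivities) is something like:

import numpy as np
print all(np.allclose(row, senssvm.samples[0]) for row in senssvm.samples)
# prints True here, i.e. the fold-wise maps are degenerate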

Best wishes,
Marco



> Anyway, below is the info about my system.
>
> Thank you again and all the best,
> Marco
>
> In [3]: mvpa2.wtf()
> Out[3]:
> Current date:   2013-07-17 23:01
> PyMVPA:
>    Version:       2.2.0
>    Hash:          ad955620e460965ce83c652bc690bea4dc2e21eb
>    Path:          /usr/lib/pymodules/python2.7/mvpa2/__init__.pyc
>    Version control (GIT):
>    GIT information could not be obtained due
> "/usr/lib/pymodules/python2.7/mvpa2/.. is not under GIT"
> SYSTEM:
>    OS:            posix Linux 3.9-1-amd64 #1 SMP Debian 3.9.8-1
>    Distribution:  debian/jessie/sid
>
> EXTERNALS:
>
>    Present:       atlas_fsl, cPickle, ctypes, good scipy.stats.rdist, good
> scipy.stats.rv_continuous._reduce_func(floc,fscale), good
> scipy.stats.rv_discrete.ppf, griddata, gzip, h5py, ipython, liblapack.so,
> libsvm, libsvm verbosity control, lxml, matplotlib, mdp, mdp ge 2.4, nibabel,
> nose, numpy, numpy_correct_unique, pprocess, pylab, pylab plottable, pywt, pywt
> wp reconstruct, reportlab, running ipython env, scipy, skl, weave
>
>    Absent:        atlas_pymvpa, cran-energy, elasticnet, glmnet, hcluster, lars,
> mass, nipy, nipy.neurospin, openopt, pywt wp reconstruct fixed, rpy2, sg ge
> 0.6.4, sg ge 0.6.5, sg_fixedcachesize, shogun, shogun.krr, shogun.lightsvm,
> shogun.mpd, shogun.svmocas, shogun.svrlight, statsmodels
>    Versions of critical externals:
>     reportlab   : 2.5
>     nibabel     : 1.3.0
>     matplotlib  : 1.1.1rc2
>     scipy       : 0.12.0
>     pprocess    : 0.5
>     ipython     : 0.13.2
>     skl         : 0.13.1
>     mdp         : 3.4
>     numpy       : 1.7.1
>     ctypes      : 1.1.0
>     matplotlib  : 1.1.1rc2
>     lxml        : 3.2.0
>     nifti       : failed to query due to "nifti is not a known dependency key."
>     numpy       : 1.7.1
>     pywt        : 0.2.0
>    Matplotlib backend: TkAgg
>
> RUNTIME:
>
>    PyMVPA Environment Variables:
>
>     PYTHONPATH          :
> ":/usr/lib/python2.7/lib-old:/home/marco/data/bll/abstrasomat_pymvpa/abstrasomat_pymvpa_36s:/usr/lib/python2.7/plat-x86_64-linux-gnu:/usr/lib/python2.7/lib-tk:/usr/lib/python2.7/lib-dynload:/usr/bin:.:/usr/lib/python2.7/dist-packages:/usr/lib/python2.7/dist-packages/PIL:/usr/lib/pymodules/python2.7:/usr/lib/python2.7/dist-packages/IPython/extensions:/usr/lib/python2.7:/usr/lib/python2.7/dist-packages/wx-2.8-gtk2-unicode:/home/marco/.python27_compiled:/usr/lib/python2.7/dist-packages/gtk-2.0:/usr/local/lib/python2.7/dist-packages"
>    PyMVPA Runtime Configuration:
>     [general]
>     verbose = 1
>
>     [externals]
>     have running ipython env = yes
>     have numpy = yes
>     have scipy = yes
>     have matplotlib = yes
>     have h5py = yes
>     have reportlab = yes
>     have weave = yes
>     have good scipy.stats.rdist = yes
>     have good scipy.stats.rv_discrete.ppf = yes
>     have good scipy.stats.rv_continuous._reduce_func(floc,fscale) = yes
>     have pylab = yes
>     have lars = no
>     have elasticnet = no
>     have glmnet = no
>     have skl = yes
>     have ctypes = yes
>     have libsvm = yes
>     have shogun = no
>     have openopt = no
>     have nibabel = yes
>     have mdp = yes
>     have mdp ge 2.4 = yes
>     have statsmodels = no
>     have pywt = yes
>     have cpickle = yes
>     have gzip = yes
>     have cran-energy = no
>     have griddata = yes
>     have nipy.neurospin = no
>     have lxml = yes
>     have atlas_fsl = yes
>     have atlas_pymvpa = no
>     have hcluster = no
>     have ipython = yes
>     have liblapack.so = yes
>     have libsvm verbosity control = yes
>     have mass = no
>     have nipy = no
>     have nose = yes
>     have numpy_correct_unique = yes
>     have pprocess = yes
>     have pylab plottable = yes
>     have pywt wp reconstruct = yes
>     have pywt wp reconstruct fixed = no
>     have rpy2 = no
>     have sg ge 0.6.4 = no
>     have sg ge 0.6.5 = no
>     have sg_fixedcachesize = no
>     have shogun.krr = no
>     have shogun.lightsvm = no
>     have shogun.mpd = no
>     have shogun.svmocas = no
>     have shogun.svrlight = no
>    Process Information:
>     Name: ipython
>     State:        R (running)
>     Tgid: 12965
>     Pid:  12965
>     PPid: 12923
>     TracerPid:    0
>     Uid:  1000    1000    1000    1000
>     Gid:  1000    1000    1000    1000
>     FDSize:       256
>     Groups:       6 20 24 25 27 29 30 44 46 100 104 113 114 116 1000 1002
>     VmPeak:         719900 kB
>     VmSize:         712644 kB
>     VmLck:               0 kB
>     VmPin:               0 kB
>     VmHWM:          101504 kB
>     VmRSS:          100268 kB
>     VmData:         233488 kB
>     VmStk:             136 kB
>     VmExe:            2280 kB
>     VmLib:           60212 kB
>     VmPTE:            1096 kB
>     VmSwap:              0 kB
>     Threads:      3
>     SigQ: 0/254508
>     SigPnd:       0000000000000000
>     ShdPnd:       0000000000000000
>     SigBlk:       0000000000000000
>     SigIgn:       0000000001001000
>     SigCgt:       0000000180000002
>     CapInh:       0000000000000000
>     CapPrm:       0000000000000000
>     CapEff:       0000000000000000
>     CapBnd:       0000001fffffffff
>     Seccomp:      0
>     Cpus_allowed: ff
>     Cpus_allowed_list:    0-7
>     Mems_allowed: 00000000,00000001
>     Mems_allowed_list:    0
>     voluntary_ctxt_switches:      1338
>     nonvoluntary_ctxt_switches:   171
>
>
>
>> [pymvpa] Sensitivity map with RFE?
>> Yaroslav Halchenko debian at onerussian.com
>> Wed Jul 17 18:49:41 UTC 2013
>>
>> sorry about the delay -- and thanks for buzzing back.
>>
>> I have tried your snippet with 2.2.0 version as available in Debian --
>> reproduced your failure. But it works with the current master of PyMVPA, so we
>> fixed something relevant since the release -- could you give it a try with your
>> data? let us know if you need instructions (but then tell us about your system
>> -- output of mvpa2.wtf() should be sufficient ;) )
>>
>> another BUT:
>> it seems that your construct hits previously reported and presumably
>> "fixed" issue:
>> https://github.com/PyMVPA/PyMVPA/pull/53
>>
>> so whenever you check your code, please check -- is your senssvm
>> "degenerate" in that all of its samples are the same, e.g. in my case on
>> a dummy dataset
>>
>> (Pydb) print senssvm.samples
>> [[ 2.01502448 2.77870027 0.73660355 2.43673417 0.83915333 0.30925885]
>> [ 2.01502448 2.77870027 0.73660355 2.43673417 0.83915333 0.30925885]
>> [ 2.01502448 2.77870027 0.73660355 2.43673417 0.83915333 0.30925885]
>> [ 2.01502448 2.77870027 0.73660355 2.43673417 0.83915333 0.30925885]]
>>
>>
>>
>> On Wed, 17 Jul 2013, marco tettamanti wrote:
>>
>>> Dear all, I apologize for the repost, but I didn't get any reply and I am
>>> pretty stuck with this issue. Is there anybody that could kindly provide any
>>> further advice?
>>
>>> Thank you and very best wishes, Marco
>> --
>> Yaroslav O. Halchenko, Ph.D.
>> http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
>> Senior Research Associate, Psychological and Brain Sciences Dept.
>> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
>> Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
>> WWW: http://www.linkedin.com/in/yarik
>>
>
>
>
> On 07/17/2013 10:28 AM, marco tettamanti wrote:
>> Dear all, I apologize for the repost, but I didn't get any reply and I am
>> pretty stuck with this issue.
>> Is there anybody that could kindly provide any further advice?
>>
>> Thank you and very best wishes,
>> Marco
>>
>> -------- Original Message --------
>> Subject: Re: Re: Sensitivity map with RFE?
>> Date: Fri, 12 Jul 2013 15:13:07 +0200
>> From: marco tettamanti<tettamanti.marco at hsr.it>
>> To: pkg-exppsy-pymvpa at lists.alioth.debian.org
>> <pkg-exppsy-pymvpa at lists.alioth.debian.org>
>>
>> Dear Roberto,
>> thank you very much for your reply!
>> I have tried adding a line as you suggested, but nothing changed.
>>
>> Actually, my understanding from the PyMVPA manual was that it is the
>> 'RepeatedMeasure(sensanasvm, NFoldPartitioner())' that takes care of the
>> cross-validation.
>>
>> I have now tried modifying my code to make it more similar to the example
>> in the RFE help documentation, with two different outcomes:
>>
>> 1) The following is basically equivalent to the snippet I sent previously and,
>> at least in my intention, modelled on the sensitivity analysis with feature
>> selection.
>> With this, I still get the error message "RuntimeError: Cannot reverse-map data
>> since the original data shape is unknown. Either set `dshape` in the
>> constructor, or call train()."
>>
>> #-----------------------------
>> clfsvm = SplitClassifier(LinearCSVMC(), NFoldPartitioner())
>>
>> rfesvm = RFE(clfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample()),
>>              ConfusionBasedError(clfsvm, confusion_state='stats'),
>>              Repeater(2),
>>              fselector=FractionTailSelector(0.30, mode='select', tail='upper'),
>>              stopping_criterion=NBackHistoryStopCrit(BestDetector(), 10),
>>              train_pmeasure=False, update_sensitivity=True)
>>
>> fclfsvm = FeatureSelectionClassifier(clfsvm, rfesvm)
>>
>> sensanasvm = fclfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample())
>>
>> cv_sensana_svm = RepeatedMeasure(sensanasvm, NFoldPartitioner())
>>
>> senssvm = cv_sensana_svm(fds)
>>
>> print senssvm.shape
>> #-----------------------------
>>
>>
>> 2) I have also tried a different solution, which however I do not think is
>> suited to producing a sensitivity map. The following does not yield any errors
>> and produces the correct map dimensionality. However, not surprisingly, the
>> result does not make any sense, as I get a uniform value spread across all
>> brain mask voxels.
>>
>> #-----------------------------
>> clfsvm = SplitClassifier(LinearCSVMC(), NFoldPartitioner())
>>
>> rfesvm = RFE(clfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample()),
>>              ConfusionBasedError(clfsvm, confusion_state='stats'),
>>              Repeater(2),
>>              fselector=FractionTailSelector(0.30, mode='select', tail='upper'),
>>              stopping_criterion=NBackHistoryStopCrit(BestDetector(), 10),
>>              train_pmeasure=False, update_sensitivity=True)
>>
>> fclfsvm = FeatureSelectionClassifier(clfsvm, rfesvm)
>>
>> cvtesvm = CrossValidation(fclfsvm, NFoldPartitioner(),
>>                           errorfx=lambda p, t: np.mean(p == t),
>>                           postproc=maxofabs_sample(),
>>                           enable_ca=['confusion', 'stats'])
>>
>> cv_sensana_svm = RepeatedMeasure(cvtesvm, NFoldPartitioner())
>>
>> senssvm = cv_sensana_svm(fds)
>> #-----------------------------
>>
>>
>>
>> Thank you all and very best wishes,
>> Marco
>>
>>> Date: Fri, 12 Jul 2013 11:36:21 +0200
>>> From: Roberto Guidotti<robbenson18 at gmail.com>
>>> To: Development and support of PyMVPA
>>> 	<pkg-exppsy-pymvpa at lists.alioth.debian.org>
>>> Subject: Re: [pymvpa] Sensitivity map with RFE?
>>> Message-ID:
>>> 	<CAGj93cHSQy-SqXE7H7YwP4tGnCEvBHW7ypSX8ENNF6FWLrk_jQ at mail.gmail.com>
>>> Content-Type: text/plain; charset="iso-8859-1"
>>>
>>> Dear Marco,
>>>
>>> From your snippet I notice that you get the sensitivity analyzer without
>>> running any CrossValidation/Classification.
>>>
>>> So try to run
>>> -------
>>> err = fclfsvm(fds)  # this lets you cross-validate/classify/select features
>>>                     # on your dataset
>>> -------
>>> This lets you train your object, and then you can get the sensitivity of
>>> your classifier.
>>>
>>> I've never used RFE, but this is what I gather from your snippet.
>>>
>>> Ciao
>>> Roberto
>>>
>>>>>    Dear all,
>>>>>    is there any manner to obtain a sensitivity map with RFE, similar to what
>>>>>    can be done with feature selection?
>>>>>
>>>>>    I am trying to use the following code:
>>>>>
>>>>>    #--------------------------------------------------
>>>>>    clfsvm = LinearCSVMC()
>>>>>
>>>>>    rfesvm = RFE(clfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample()),
>>>>>                 CrossValidation(clfsvm, NFoldPartitioner(),
>>>>>                                 errorfx=mean_mismatch_error,
>>>>>                                 postproc=mean_sample()),
>>>>>                 Repeater(2),
>>>>>                 fselector=FractionTailSelector(0.30, mode='select', tail='upper'),
>>>>>                 stopping_criterion=NBackHistoryStopCrit(BestDetector(), 10),
>>>>>                 update_sensitivity=True)
>>>>>
>>>>>    fclfsvm = FeatureSelectionClassifier(clfsvm, rfesvm)
>>>>>
>>>>>    sensanasvm = fclfsvm.get_sensitivity_analyzer(postproc=maxofabs_sample())
>>>>>
>>>>>    cv_sensana_svm = RepeatedMeasure(sensanasvm, NFoldPartitioner())
>>>>>
>>>>>    senssvm = cv_sensana_svm(fds)
>>>>>    #--------------------------------------------------
>>>>>
>>>>>    However, after a while I get the following error:
>>>>>
>>>>>    RuntimeError: Cannot reverse-map data since the original data shape is
>>>>>    unknown. Either set `dshape` in the constructor, or call train().
>>>>>
>>>>>
>>>>>    Thank you in advance for any help!
>>>>>    Best wishes,
>>>>>    Marco
>>
>

-- 
Marco Tettamanti, Ph.D.
Nuclear Medicine Department & Division of Neuroscience
San Raffaele Scientific Institute
Via Olgettina 58
I-20132 Milano, Italy
Phone ++39-02-26434888
Fax ++39-02-26434892
Email: tettamanti.marco at hsr.it
Skype: mtettamanti


