[pymvpa] regression analysis to predict subject-specific score

Yaroslav Halchenko debian at onerussian.com
Sat Mar 10 03:50:53 UTC 2012


Hi David, sorry about the delay.

You don't seem to be missing anything, and it should have worked.

Could you please verify that the following code works for you:

from mvpa2.suite import *
from mvpa2.testing.datasets import datasets as testing_datasets

regr = SVM(svm_impl='NU_SVR', kernel=RbfLSKernel())
cv = CrossValidation(regr, NFoldPartitioner(),
                     postproc=mean_sample(),   # average the per-fold errors
                     errorfx=corr_error,       # correlation-based error measure
                     enable_ca=['training_stats', 'stats'])
print np.asscalar(cv(testing_datasets['chirp_linear']))
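
If that runs, you can also have a look at the collected regression
statistics (enabled above via enable_ca; it is the same .stats shown in the
quoted example further down):

print cv.ca.stats   # RegressionStatistics summary (CCe, RMSE, ...)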

If it works on that testing dataset but not on yours, that would be
interesting, and I would appreciate it if you could just share the dataset
with me so I can figure out what is going on:

h5save('/tmp/send2yarik.hdf5', dataset_roi)
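
and on my side I would simply load it back with the matching h5load call:

dataset_roi = h5load('/tmp/send2yarik.hdf5')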

BTW, regarding
> No details due to large number of targets or chunks. Increase maxc and maxt if desired
> Number of unique targets > 20 thus no sequence statistics

you might want to group your samples into a smaller number of chunks, to
avoid a lengthy NFold cross-validation (there is no magic here to speed up
SVM retraining across that many folds). Moreover, if you actually have only
1 sample per chunk, the correlation error is meaningless: a correlation
cannot be computed between two variables with only a single sample each.
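
For example, a minimal sketch of such a regrouping (n_chunks = 5 is just a
hypothetical choice; keep samples that are not independent within the same
chunk):

n_chunks = 5   # hypothetical; pick whatever fits your design
# relabel the samples into n_chunks contiguous, roughly equal-sized groups
dataset_roi.sa['chunks'] = np.arange(dataset_roi.nsamples) * n_chunks // dataset_roi.nsamples
# and rerun the same cross-validation as above
print np.asscalar(cv(dataset_roi))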

Cheers

> Hi,

> I was trying to apply this code to one of my datasets, but it fails with a FailedToPredictError. I can get the code to work fine with the testing_dataset below, so I'm not sure what's wrong with my data. The data sets seem comparable, but I suspect I am missing something small...

> Thanks,
> David


> In [90]: print summary(dataset_roi)
> Dataset: 217x81 at float32, <sa: chunks,targets,time_coords,time_indices>, <fa: voxel_indices>, <a: imghdr,imgtype,mapper,voxel_dim,voxel_eldim>
> stats: mean=-0.0284916 std=0.949083 var=0.900759 min=-3.14012 max=3.75654
> No details due to large number of targets or chunks. Increase maxc and maxt if desired
> Number of unique targets > 20 thus no sequence statistics



> In [87]: clf = SVM(svm_impl='NU_SVR',kernel= RbfLSKernel())

> In [88]: cv = CrossValidation(clf, NFoldPartitioner(), postproc=mean_sample(), errorfx=corr_error, enable_ca=['training_stats', 'stats'])

> In [89]: print cv(dataset_roi)
> ERROR: An unexpected error occurred while tokenizing input
> The following traceback may be corrupted or invalid
> The error message is: ('EOF in multi-line statement', (114, 0))

> ERROR: An unexpected error occurred while tokenizing input
> The following traceback may be corrupted or invalid
> The error message is: ('EOF in multi-line statement', (7, 0))

> ---------------------------------------------------------------------------
> FailedToPredictError                      Traceback (most recent call last)
> /mnt/BIAC/.users/smith/munin.dhe.duke.edu/Huettel/Imagene.02/Analysis/Framing/MVPA/<ipython-input-89-fa2a8d3342c0> in <module>()
> ----> 1 print cv(dataset_roi)

> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self, ds)
>     235                                    "used and auto training is disabled."
>     236                                    % str(self))
> --> 237         return super(Learner, self).__call__(ds)
>     238 
>     239 

> /usr/lib64/python2.6/site-packages/mvpa2/base/node.pyc in __call__(self, ds)
>      74 
>      75         self._precall(ds)
> ---> 76         result = self._call(ds)
>      77         result = self._postcall(ds, result)
>      78 

> /usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>     470         # always untrain to wipe out previous stats

>     471         self.untrain()
> --> 472         return super(CrossValidation, self)._call(ds)
>     473 
>     474 

> /usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>     303                 ca.datasets.append(sds)
>     304             # run the beast

> --> 305             result = node(sds)
>     306             # callback

>     307             if not self._callback is None:

> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self, ds)
>     235                                    "used and auto training is disabled."
>     236                                    % str(self))
> --> 237         return super(Learner, self).__call__(ds)
>     238 
>     239 

> /usr/lib64/python2.6/site-packages/mvpa2/base/node.pyc in __call__(self, ds)
>      74 
>      75         self._precall(ds)
> ---> 76         result = self._call(ds)
>      77         result = self._postcall(ds, result)
>      78 

> /usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>     558                     for i in dstrain.get_attr(splitter.get_space())[0].unique])
>     559         # ask splitter for first part

> --> 560         measure.train(dstrain)
>     561         # cleanup to free memory

>     562         del dstrain

> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in train(self, ds)
>     135 
>     136         # and post-proc

> --> 137         result = self._posttrain(ds)
>     138 
>     139         # finally flag as trained


> /usr/lib64/python2.6/site-packages/mvpa2/clfs/base.pyc in _posttrain(self, dataset)
>     265                 # training_stats... sad

>     266                 self.__changedData_isset = False
> --> 267             predictions = self.predict(dataset)
>     268             self.ca.reset_changed_temporarily()
>     269             self.ca.training_stats = self.__summary_class__(

> /usr/lib64/python2.6/site-packages/mvpa2/clfs/base.pyc in wrap_samples(obj, data, *args, **kwargs)
>      46     def wrap_samples(obj, data, *args, **kwargs):
>      47         if is_datasetlike(data):
> ---> 48             return fx(obj, data, *args, **kwargs)
>      49         else:
>      50             return fx(obj, Dataset(data), *args, **kwargs)

> /usr/lib64/python2.6/site-packages/mvpa2/clfs/base.pyc in predict(self, dataset)
>     422                 raise FailedToPredictError, \
>     423                       "Failed to convert predictions from numeric into " \
> --> 424                       "literals: %s" % e
>     425 
>     426         self._postpredict(dataset, result)

> FailedToPredictError: Failed to convert predictions from numeric into literals: 89.333273242895899


> On Aug 11, 2011, at 12:00 PM, Yaroslav Halchenko wrote:

> > indeed we have no good tutorial/example for regressions yet besides one
> > for GPR

> > doc/examples/gpr.py

> > also we interface to SVR regressions (from libsvm and shogun) and started
> > to add interfaces to scikits.learn.  Some samples of them are available from
> > regrswh, so  on my system:

> > In [5]: print '\n'.join([str(x) for x in regrswh[:]])
> > <libsvm epsilon-SVR>
> > <libsvm nu-SVR>
> > <sg.LinSVMR()/libsvr>
> > <skl.PLSRegression_1d()>
> > <skl.LARS()>
> > <skl.LassoLARS()>
> > <GPR(kernel='linear')>
> > <GPR(kernel='sqexp')>

> > As for "howto" -- just the same way you use classifiers -- then ConfusionMatrix
> > in .stats would be replaced with RegressionStatistics. e.g.

> > In [5]: from mvpa.suite import *

> > In [6]: from mvpa.testing.datasets import datasets as testing_datasets

> > In [7]: cve = CrossValidation(regrswh[:][0], NFoldPartitioner(), postproc=mean_sample(), errorfx=corr_error, enable_ca=['training_stats', 'stats'])

> > In [8]: print cve(testing_datasets['chirp_linear'])
> > <Dataset: 1x1 at float64, <sa: cvfolds>>

> > In [9]: print cve.ca.stats
> > Statistics  Mean  Std   Min       Max
> > ---------- ----- ----- -----     -----
> > Data:
> >   RMP_t   0.668 0.015 0.639     0.681
> >   STD_t   0.661 0.015 0.631     0.675
> >   RMP_p   0.644 0.043 0.593     0.731
> >   STD_p   0.637 0.042 0.583     0.721
> > Results:
> >    CCe     0.06 0.016 0.036     0.084
> >   RMSE    0.232 0.027 0.184     0.266
> > RMSE/RMP_t 0.348 0.043  0.27      0.4
> > Summary:
> >    CCe     0.06         p=  3.65268e-137
> >   RMSE     0.23
> > RMSE/RMP_t  0.35
> > # of sets   6


> > On Mon, 08 Aug 2011, Zhen Zonglei wrote:

> >>   Hi Guys:



> >>   I am trying to do a multivariate regression analysis to predict a
> >>   subject-specific score with pymvpa 0.6, but I did not find any
> >>   examples of this in the manual. What regressions are implemented in
> >>   the toolbox? Could you please show me how to do a regression
> >>   analysis with it?


> >>   Best


> >>   Zonglei Zhen

-- 
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic



More information about the Pkg-ExpPsy-PyMVPA mailing list