[pymvpa] regression analysis to predict subject-specific score

Mon Mar 12 15:53:32 UTC 2012

Thanks for the help. The code you sent works on the 'chirp_linear' dataset, but it fails with the FailedToPredictError on my data. Also, thanks for the point about the NFoldPartitioner. I'll switch over to the HalfPartitioner.

I just sent you the data using the command you provided. Let me know if you don't get it.

Cheers,
David

On Mar 9, 2012, at 10:50 PM, Yaroslav Halchenko wrote:

> Hi David, sorry about the delay
> 
> you don't seem to be missing anything, and it should have worked.
> 
> Could you please verify that the following code works for you:
> 
> from mvpa2.suite import *
> from mvpa2.testing.datasets import datasets as testing_datasets
> 
> regr = SVM(svm_impl='NU_SVR', kernel= RbfLSKernel())
> cv = CrossValidation(regr, NFoldPartitioner(),
>                     postproc=mean_sample(),
>                     errorfx=corr_error,
>                     enable_ca=['training_stats', 'stats'])
> print np.asscalar(cv(testing_datasets['chirp_linear']))
> 
> If it works on that testing dataset but not on yours, that is
> interesting, and I would appreciate it if you shared your dataset with me
> so I can figure out what is going wrong:
> 
> h5save('/tmp/send2yarik.hdf5', dataset_roi)
> 
> BTW, regarding
>> No details due to large number of targets or chunks. Increase maxc and maxt if desired
>> Number of unique targets > 20 thus no sequence statistics
> 
> you might want to group your samples into a smaller number of chunks to
> avoid a lengthy NFold cross-validation, which is not optimized for SVMs
> here. Moreover, if you actually have 1 sample per chunk, the correlation
> error is pointless, since a correlation cannot be computed between two
> observables with only 1 sample each.
> 
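The two points above can be illustrated without PyMVPA. The sketch below is plain Python and only mimics the idea: `coarsen` and `pearson_r` are hypothetical helpers written for this illustration, not PyMVPA functions, and the 217 one-sample chunks are an assumption based on the dataset size in the summary.

```python
import math

def coarsen(chunks, n_groups):
    """Map each unique chunk label onto one of n_groups coarser labels."""
    unique = sorted(set(chunks))
    # Spread the unique labels as evenly as possible over the groups.
    lookup = {c: i * n_groups // len(unique) for i, c in enumerate(unique)}
    return [lookup[c] for c in chunks]

def pearson_r(x, y):
    """Pearson correlation, or NaN when it is undefined."""
    n = len(x)
    if n < 2:
        return float("nan")  # a single point has no variance
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # constant input, correlation undefined
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

# Assumed layout: one sample per chunk, so NFold would run 217 folds.
chunks = list(range(217))
coarse = coarsen(chunks, 5)
print(len(set(chunks)), "folds ->", len(set(coarse)), "folds")

# With a single test sample per fold the correlation is undefined.
print(pearson_r([89.3], [90.0]))
```

With coarser chunks each test fold holds many samples, so a correlation-based error becomes meaningful and far fewer models need to be trained.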
> Cheers
> 
>> Hi,
> 
>> I was trying to apply this code to one of my datasets, but it fails with a FailedToPredictError. I can get the code to work fine with the testing_dataset below, so I'm not sure what's wrong with my data. The data sets seem comparable, but I suspect I am missing something small...
> 
>> Thanks,
>> David
> 
> 
>> In [90]: print summary(dataset_roi)
>> Dataset: 217x81 at float32, <sa: chunks,targets,time_coords,time_indices>, <fa: voxel_indices>, <a: imghdr,imgtype,mapper,voxel_dim,voxel_eldim>
>> stats: mean=-0.0284916 std=0.949083 var=0.900759 min=-3.14012 max=3.75654
>> No details due to large number of targets or chunks. Increase maxc and maxt if desired
>> Number of unique targets > 20 thus no sequence statistics
> 
> 
> 
>> In [87]: clf = SVM(svm_impl='NU_SVR',kernel= RbfLSKernel())
> 
>> In [88]: cv = CrossValidation(clf, NFoldPartitioner(), postproc=mean_sample(), errorfx=corr_error, enable_ca=['training_stats', 'stats'])
> 
>> In [89]: print cv(dataset_roi)
>> ERROR: An unexpected error occurred while tokenizing input
>> The following traceback may be corrupted or invalid
>> The error message is: ('EOF in multi-line statement', (114, 0))
> 
>> ERROR: An unexpected error occurred while tokenizing input
>> The following traceback may be corrupted or invalid
>> The error message is: ('EOF in multi-line statement', (7, 0))
> 
>> ---------------------------------------------------------------------------
>> FailedToPredictError                      Traceback (most recent call last)
>> /mnt/BIAC/.users/smith/munin.dhe.duke.edu/Huettel/Imagene.02/Analysis/Framing/MVPA/<ipython-input-89-fa2a8d3342c0> in <module>()
>> ----> 1 print cv(dataset_roi)
> 
>> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self, ds)
>>    235                                    "used and auto training is disabled."
>>    236                                    % str(self))
>> --> 237         return super(Learner, self).__call__(ds)
>>    238 
>>    239 
> 
>> /usr/lib64/python2.6/site-packages/mvpa2/base/node.pyc in __call__(self, ds)
>>     74 
>>     75         self._precall(ds)
>> ---> 76         result = self._call(ds)
>>     77         result = self._postcall(ds, result)
>>     78 
> 
>> /usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>>    470         # always untrain to wipe out previous stats
> 
>>    471         self.untrain()
>> --> 472         return super(CrossValidation, self)._call(ds)
>>    473 
>>    474 
> 
>> /usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>>    303                 ca.datasets.append(sds)
>>    304             # run the beast
> 
>> --> 305             result = node(sds)
>>    306             # callback
> 
>>    307             if not self._callback is None:
> 
>> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self, ds)
>>    235                                    "used and auto training is disabled."
>>    236                                    % str(self))
>> --> 237         return super(Learner, self).__call__(ds)
>>    238 
>>    239 
> 
>> /usr/lib64/python2.6/site-packages/mvpa2/base/node.pyc in __call__(self, ds)
>>     74 
>>     75         self._precall(ds)
>> ---> 76         result = self._call(ds)
>>     77         result = self._postcall(ds, result)
>>     78 
> 
>> /usr/lib64/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>>    558                     for i in dstrain.get_attr(splitter.get_space())[0].unique])
>>    559         # ask splitter for first part
> 
>> --> 560         measure.train(dstrain)
>>    561         # cleanup to free memory
> 
>>    562         del dstrain
> 
>> /usr/lib64/python2.6/site-packages/mvpa2/base/learner.pyc in train(self, ds)
>>    135 
>>    136         # and post-proc
> 
>> --> 137         result = self._posttrain(ds)
>>    138 
>>    139         # finally flag as trained
> 
> 
>> /usr/lib64/python2.6/site-packages/mvpa2/clfs/base.pyc in _posttrain(self, dataset)
>>    265                 # training_stats... sad
> 
>>    266                 self.__changedData_isset = False
>> --> 267             predictions = self.predict(dataset)
>>    268             self.ca.reset_changed_temporarily()
>>    269             self.ca.training_stats = self.__summary_class__(
> 
>> /usr/lib64/python2.6/site-packages/mvpa2/clfs/base.pyc in wrap_samples(obj, data, *args, **kwargs)
>>     46     def wrap_samples(obj, data, *args, **kwargs):
>>     47         if is_datasetlike(data):
>> ---> 48             return fx(obj, data, *args, **kwargs)
>>     49         else:
>>     50             return fx(obj, Dataset(data), *args, **kwargs)
> 
>> /usr/lib64/python2.6/site-packages/mvpa2/clfs/base.pyc in predict(self, dataset)
>>    422                 raise FailedToPredictError, \
>>    423                       "Failed to convert predictions from numeric into " \
>> --> 424                       "literals: %s" % e
>>    425 
>>    426         self._postpredict(dataset, result)
> 
>> FailedToPredictError: Failed to convert predictions from numeric into literals: 89.333273242895899
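The message itself says that a numeric prediction could not be mapped back to a literal target. The sketch below shows that failure mode in plain Python. It is an assumption that something like this is happening for this dataset (for example, targets stored as strings), since the thread does not establish the cause, and `build_label_map` and `to_literal` are hypothetical helpers, not PyMVPA calls.

```python
def build_label_map(targets):
    """Map each distinct target to an integer code, as a classifier would."""
    return {label: code for code, label in enumerate(sorted(set(targets)))}

def to_literal(label_map, prediction):
    """Invert the map; raises KeyError for values that are not known codes."""
    inverse = {code: label for label, code in label_map.items()}
    return inverse[prediction]

# Hypothetical targets that were read from a text file as strings.
targets = ["87.5", "91.0", "102.25"]
label_map = build_label_map(targets)

# A class code maps back to one of the original literals.
print(to_literal(label_map, 1))

# A continuous regression output is not a known code.
try:
    to_literal(label_map, 89.333273242895899)
except KeyError as err:
    print("cannot convert prediction to a literal:", err)

# Casting to float up front leaves nothing to be mapped back.
numeric_targets = [float(t) for t in targets]
print(numeric_targets)
```

If the targets in `dataset_roi` turn out to be non-numeric, converting them to floats before training would be the first thing to try.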
> 
> 
>> On Aug 11, 2011, at 12:00 PM, Yaroslav Halchenko wrote:
> 
>>> indeed we have no good tutorial/example for regressions yet besides one
>>> for GPR
> 
>>> doc/examples/gpr.py
> 
>>> also we interface to SVR regressions (from libsvm and shogun) and started
>>> to add interfaces to scikits.learn.  Some samples of them are available from
>>> regrswh, so on my system:
> 
>>> In [5]: print '\n'.join([str(x) for x in regrswh[:]])
>>> <libsvm epsilon-SVR>
>>> <libsvm nu-SVR>
>>> <sg.LinSVMR()/libsvr>
>>> <skl.PLSRegression_1d()>
>>> <skl.LARS()>
>>> <skl.LassoLARS()>
>>> <GPR(kernel='linear')>
>>> <GPR(kernel='sqexp')>
> 
>>> As for "howto": use them the same way you use classifiers; the ConfusionMatrix
>>> in .stats is then replaced with RegressionStatistics, e.g.
> 
>>> In [5]: from mvpa.suite import *
> 
>>> In [6]: from mvpa.testing.datasets import datasets as testing_datasets
> 
>>> In [7]: cve = CrossValidation(regrswh[:][0], NFoldPartitioner(), postproc=mean_sample(), errorfx=corr_error, enable_ca=['training_stats', 'stats'])
> 
>>> In [8]: print cve(testing_datasets['chirp_linear'])
>>> <Dataset: 1x1 at float64, <sa: cvfolds>>
> 
>>> In [9]: print cve.ca.stats
>>> Statistics  Mean  Std   Min       Max
>>> ---------- ----- ----- -----     -----
>>> Data:
>>>  RMP_t   0.668 0.015 0.639     0.681
>>>  STD_t   0.661 0.015 0.631     0.675
>>>  RMP_p   0.644 0.043 0.593     0.731
>>>  STD_p   0.637 0.042 0.583     0.721
>>> Results:
>>>   CCe     0.06 0.016 0.036     0.084
>>>  RMSE    0.232 0.027 0.184     0.266
>>> RMSE/RMP_t 0.348 0.043  0.27      0.4
>>> Summary:
>>>   CCe     0.06         p=  3.65268e-137
>>>  RMSE     0.23
>>> RMSE/RMP_t  0.35
>>> # of sets   6
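For reading the table above, here is a minimal sketch of how such quantities can be computed for a single fold. It assumes that RMP means root mean power, that CCe is 1 minus the Pearson correlation (as the `corr_error` name suggests), and that RMSE is the usual root mean squared error. The function names and toy numbers are made up for illustration.

```python
import math

def rmp(values):
    """Root mean power: square root of the mean squared value."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def rmse(predicted, target):
    """Root mean squared error between predictions and targets."""
    return math.sqrt(
        sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
    )

def one_minus_r(predicted, target):
    """1 - Pearson r, so that 0 is a perfect linear relationship."""
    n = len(target)
    mp, mt = sum(predicted) / n, sum(target) / n
    spp = sum((p - mp) ** 2 for p in predicted)
    stt = sum((t - mt) ** 2 for t in target)
    spt = sum((p - mp) * (t - mt) for p, t in zip(predicted, target))
    return 1.0 - spt / math.sqrt(spp * stt)

# Toy fold: four targets and the corresponding predictions.
target = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]

print("RMP_t     ", rmp(target))
print("RMSE      ", rmse(predicted, target))
print("RMSE/RMP_t", rmse(predicted, target) / rmp(target))
print("CCe       ", one_minus_r(predicted, target))
```

The ratio RMSE/RMP_t puts the error on the scale of the targets, which makes it easier to compare across datasets with different score ranges.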
> 
> 
>>> On Mon, 08 Aug 2011, Zhen Zonglei wrote:
> 
>>>>  Hi Guys:
> 
> 
> 
>>>>  I am trying to do a multivariate regression analysis to predict
>>>>  subject-specific scores with PyMVPA 0.6, but I could not find any
>>>>  examples of this in the manual. Which regressions are implemented in
>>>>  the toolbox? Could you please show me how to do a regression analysis
>>>>  with it?
> 
> 
>>>>  Best
> 
> 
>>>>  Zonglei Zhen
> 
>>>> _______________________________________________
>>>> Pkg-ExpPsy-PyMVPA mailing list
>>>> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
>>>> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
> 
> 
>>> -- 
>>> =------------------------------------------------------------------=
>>> Keep in touch                                     www.onerussian.com
>>> Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
> 
> -- 
> =------------------------------------------------------------------=
> Keep in touch                                     www.onerussian.com
> Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic