[pymvpa] Individual measures for subjects

Fri Jan 31 17:52:28 UTC 2014

On Fri, 31 Jan 2014, Arman Eshaghi wrote:

>    MyData is structural MRI data coming from fmri_dataset function. There are
>    two chunks, and similar to clf.predictions (in tutorial), I'm wondering
>    whether I can get each predicted label, because I want to compare AUC in

so each sample is a subject, ok.
cvte.ca.stats.sets would have the sets of original targets and their
predictions for each cross-validation split (with enable_ca=['stats']).

also, if you set errorfx=None I guess you would also get the raw
predictions (and possibly the original targets) in your results... yep:

In [2]: cv = CrossValidation(kNN(), HalfPartitioner(attr='chunks'), errorfx=None, enable_ca=['stats'])

In [3]: from mvpa2.testing.datasets import datasets as tdatasets

In [4]: results = cv(tdatasets['uni2small'])

In [5]: results
Out[5]: <Dataset: 24x1@|S2, <sa: cvfolds,targets>>

In [6]: print results.targets, results.samples
['L0' 'L0' 'L0' 'L0' 'L0' 'L0' 'L1' 'L1' 'L1' 'L1' 'L1' 'L1' 'L0' 'L0' 'L0'
 'L0' 'L0' 'L0' 'L1' 'L1' 'L1' 'L1' 'L1' 'L1'] [['L0']
 ['L0']
 ['L0']
 ['L0']
 ['L0']
 ['L0']
 ['L1']
 ['L1']
 ['L1']
 ['L1']
 ['L1']
 ['L1']
 ['L0']
 ['L0']
 ['L0']
 ['L0']
 ['L0']
 ['L0']
 ['L0']
 ['L1']
 ['L1']
 ['L1']
 ['L1']
 ['L1']]
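
To see which individual subjects were misclassified, the two printed arrays can be compared element by element. Below is a minimal sketch in plain Python; the two lists are transcribed by hand from the output above, and the variable names are only illustrative. With a real PyMVPA result one would presumably take results.targets and the single column of results.samples instead.

```python
# Targets and predictions transcribed from the printed output above
# (24 samples, one per subject; these literals are a hand copy, not an API call).
targets = ['L0'] * 6 + ['L1'] * 6 + ['L0'] * 6 + ['L1'] * 6
predictions = ['L0'] * 6 + ['L1'] * 6 + ['L0'] * 7 + ['L1'] * 5

# Indices of the samples whose predicted label differs from the target.
wrong = [i for i, (t, p) in enumerate(zip(targets, predictions)) if t != p]

# Overall accuracy across both cross-validation folds.
accuracy = 1.0 - len(wrong) / float(len(targets))

print(wrong)
print(accuracy)
```

For the output shown here, only the sample at index 18 (counting from 0) is mislabeled, which matches the seventh 'L0' prediction in the second fold.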

In [8]: print cv.ca.stats.sets
[(array(['L0', 'L0', 'L0', 'L0', 'L0', 'L0', 'L1', 'L1', 'L1', 'L1', 'L1',
       'L1'], 
      dtype='|S2'), array(['L0', 'L0', 'L0', 'L0', 'L0', 'L0', 'L1', 'L1', 'L1', 'L1', 'L1',
       'L1'], 
      dtype='|S2'), [{'L0': 1.0, 'L1': 1.0}, {'L0': 1.0, 'L1': 1.0}, {'L0': 2.0, 'L1': 0.0}, {'L0': 2.0, 'L1': 0.0}, {'L0': 2.0, 'L1': 0.0}, {'L0': 2.0, 'L1': 0.0}, {'L0': 0.0, 'L1': 2.0}, {'L0': 0.0, 'L1': 2.0}, {'L0': 0.0, 'L1': 2.0}, {'L0': 0.0, 'L1': 2.0}, {'L0': 0.0, 'L1': 2.0}, {'L0': 0.0, 'L1': 2.0}]), (array(['L0', 'L0', 'L0', 'L0', 'L0', 'L0', 'L1', 'L1', 'L1', 'L1', 'L1',
       'L1'], 
      dtype='|S2'), array(['L0', 'L0', 'L0', 'L0', 'L0', 'L0', 'L0', 'L1', 'L1', 'L1', 'L1',
       'L1'], 
      dtype='|S2'), [{'L0': 2.0, 'L1': 0.0}, {'L0': 2.0, 'L1': 0.0}, {'L0': 2.0, 'L1': 0.0}, {'L0': 2.0, 'L1': 0.0}, {'L0': 2.0, 'L1': 0.0}, {'L0': 2.0, 'L1': 0.0}, {'L0': 2.0, 'L1': 0.0}, {'L0': 0.0, 'L1': 2.0}, {'L0': 0.0, 'L1': 2.0}, {'L0': 0.0, 'L1': 2.0}, {'L0': 0.0, 'L1': 2.0}, {'L0': 0.0, 'L1': 2.0}])]
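
As printed above, each entry of the sets list is a tuple of (targets, predictions, estimates) for one cross-validation split. A rough sketch of flattening such a structure into one row per subject follows; flatten_sets and toy_sets are hypothetical names, and the toy data is a shortened stand-in for the real output, so treat this as an illustration of the shape rather than PyMVPA code.

```python
def flatten_sets(sets):
    """Turn a list of per-fold (targets, predictions[, estimates]) tuples
    into a flat list of (fold, target, prediction, estimate) rows."""
    rows = []
    for fold, fold_set in enumerate(sets):
        targets, predictions = fold_set[0], fold_set[1]
        # Estimates are assumed optional; pad with None if a fold lacks them.
        if len(fold_set) > 2:
            estimates = fold_set[2]
        else:
            estimates = [None] * len(targets)
        for t, p, e in zip(targets, predictions, estimates):
            rows.append((fold, t, p, e))
    return rows


# Toy stand-in with the same layout as the printed cv.ca.stats.sets
# (two folds, two samples each; values are made up for illustration).
toy_sets = [
    (['L0', 'L1'], ['L0', 'L1'],
     [{'L0': 2.0, 'L1': 0.0}, {'L0': 0.0, 'L1': 2.0}]),
    (['L0', 'L1'], ['L0', 'L0'],
     [{'L0': 2.0, 'L1': 0.0}, {'L0': 2.0, 'L1': 0.0}]),
]

rows = flatten_sets(toy_sets)
misclassified = [r for r in rows if r[1] != r[2]]
print(misclassified)
```

The layout of the third element (a dict of votes per sample here, from kNN) may well differ for other classifiers, so it is passed through untouched.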

and here are some snippets for AUC (you need a classifier which
provides estimates, not just final decisions; with the kNN setup above
the AUC comes out as nan):

In [10]: print cv.ca.stats.stats['AUC']
[nan, nan]

In [11]: cv = CrossValidation(SMLR(enable_ca=['estimates']), HalfPartitioner(attr='chunks'), errorfx=None, enable_ca=['stats'])

In [12]: results = cv(tdatasets['uni2small'])

In [13]: print cv.ca.stats.stats['AUC']
[1.0, 1.0]

In [14]: tdatasets['uni2small'].samples += np.random.normal(size=tdatasets['uni2small'].shape)*0.5

In [15]: results = cv(tdatasets['uni2small'])

In [16]: print cv.ca.stats.stats['AUC']
[0.81944444444444442, 0.81944444444444442]

In [17]: results = cv(tdatasets['uni4small'])

In [18]: print cv.ca.stats.stats['AUC']
[1.0, 1.0, 1.0, 1.0]

In [19]: tdatasets['uni4small'].samples += np.random.normal(size=tdatasets['uni4small'].shape)*0.5

In [20]: results = cv(tdatasets['uni4small'])

In [21]: print cv.ca.stats.stats['AUC']
[0.64814814814814814, 0.68518518518518512, 0.76388888888888884, 0.55092592592592593]
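
To illustrate why continuous estimates are needed, here is a small sketch of AUC computed from per-sample scores by pairwise ranking (the Mann-Whitney formulation). This is a generic definition written for illustration, not PyMVPA's own implementation; the function name auc_from_scores and the example scores are made up.

```python
def auc_from_scores(labels, scores, positive):
    """AUC for one class: the fraction of (positive, negative) sample pairs
    in which the positive sample gets the higher score (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == positive]
    neg = [s for l, s in zip(labels, scores) if l != positive]
    if not pos or not neg:
        # AUC is undefined without samples from both groups.
        return float('nan')
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up estimates for four subjects; a higher score means "more like L1".
labels = ['L0', 'L0', 'L1', 'L1']
scores = [0.1, 0.6, 0.4, 0.9]
print(auc_from_scores(labels, scores, positive='L1'))  # 0.75
```

Three of the four (L1, L0) pairs are ranked correctly, hence 0.75. With only hard label decisions there is no ranking of subjects to work from, which is why a classifier exposing estimates is needed.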

-- 
Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Senior Research Associate,     Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik