[pymvpa] Cross Validation Output

Mon Apr 15 22:07:23 UTC 2013

Hi, Yaroslav.

As follows:

>>> print fds.summary()
Dataset: 39x172800 at float32, <sa: chunks,targets,time_coords,time_indices>, <fa: voxel_indices>, <a: imghdr,imgtype,mapper,voxel_dim,voxel_eldim>
stats: mean=-0.00263811 std=0.566497 var=0.320918 min=-5.24132 max=5.29731
No details due to large number of targets or chunks. Increase maxc and maxt if desired
Summary for targets across chunks
  targets  mean   std  min  max  #chunks
  Control  0.513  0.5   0    1     20
  Patient  0.487  0.5   0    1     19
Sequence statistics for 39 entries from set ['Control', 'Patient']
Counter-balance table for orders up to 2:
Targets/Order  O1      |  O2      |
   Control:    19   1  |  18   2  |
   Patient:     0  18  |   0  17  |
Correlations: min=-0.95 max=0.9 mean=-0.026 sum(abs)=19

Obviously, it's not doing a great job, but it would be nice to know
that it's at least doing what I expect!
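
For reference, here is a minimal sketch of why every fold could come out as exactly 0 or 1. It assumes one sample per chunk, which is what the summary above suggests (39 samples, 20 + 19 chunks). The labels and predictions below are made up for illustration and are not taken from the real dataset, and fold_accuracy is a hypothetical helper that mirrors the errorfx lambda in my script. Note that this measure is the fraction of correct predictions, so it is really an accuracy rather than an error.

```python
import numpy as np

def fold_accuracy(predictions, targets):
    """Fraction of predictions that match the targets (same as the errorfx lambda)."""
    return np.mean(np.asarray(predictions) == np.asarray(targets))

# Hypothetical labels for illustration only (not from the real dataset).
targets = np.array(['Control', 'Control', 'Patient', 'Patient', 'Control'])
predictions = np.array(['Control', 'Patient', 'Patient', 'Control', 'Control'])

# Leave-one-chunk-out with one sample per chunk: each fold tests a single sample,
# so the per-fold value can only be 0.0 (miss) or 1.0 (hit).
per_fold = np.array([fold_accuracy(predictions[i:i + 1], targets[i:i + 1])
                     for i in range(len(targets))])

print(per_fold)         # one value per fold, each either 0.0 or 1.0
print(per_fold.mean())  # fraction of folds predicted correctly
```

If that reading is right, the mean of cv_results.samples would be the overall fraction of subjects classified correctly.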

Hope this helps.

Thanks!
-Paul

On Mon, Apr 15, 2013 at 5:58 PM, Yaroslav Halchenko
<debian at onerussian.com> wrote:
> print fds.summary()
> ?
>
> On Mon, 15 Apr 2013, Paul Robinson wrote:
>
>> Hello!
>
>> I am just getting started with PyMVPA, and am a little concerned by
>> the output I am seeing from my classifier. Hopefully someone here can
>> tell me where I've gone astray. The example code I am executing is as
>> follows:
>
>> -------------------------------- Begin python --------------------------------
>> > import sys, os, math
>> > import numpy as np
>> > from mvpa2.tutorial_suite import *
>
>> # Setup the data directory
>> > datapath=os.path.join('/home', 'UserName', 'Subject_Data', 'Study')
>
>> # This is just a text file containing the group label for each subject:
>> > attr = SampleAttributes(os.path.join(datapath, 'attributes.txt'))
>
>> # Just to see that we do in fact have two unique targets and n
>> # independent samples (chunks should be independent)
>> > print np.unique(attr.targets)
>> > print np.unique(attr.chunks)
>
>> # Import data
>> > fds = fmri_dataset(samples=os.path.join(datapath, 'all_masked.nii.gz'), targets=attr.targets, chunks=attr.chunks)
>> # Wanna see how big this sucker is:
>> > print fds.shape
>
>> # Setup an SVM (or kNN) classifier:
>> > clf = LinearCSVMC()
>> # > clf = kNN(k=1, dfx=one_minus_correlation, voting='majority')
>
>> # Perform cross validation
>> > cvte = CrossValidation(clf, NFoldPartitioner(attr='chunks'), errorfx=lambda p, t: np.mean(p == t), enable_ca=['stats'])
>> > cv_results = cvte(fds)
>
>> # Get some output
>> > print cvte.ca.stats.as_string(description=True)
>
>
>> -------------------------------- End python ----------------------------------
>
>> My data are a 4D stack of statistical maps (t-maps, all
>> co-registered), and I have an attributes.txt file with the first
>> column being the target labels ('Control' or 'Patient') and the second
>> column denoting the chunks (each row is a separate subject, so the
>> second column is just 0...[# of subjects]). First, is this the
>> appropriate way to set up my attributes file? E.g.:
>
>> Control 0
>> Control 1
>> ...
>> Patient 13
>> Patient 14
>> ...
>
>
>> I thought I'd just start with this before adding volumetric or
>> behavioural data, but I noticed what appeared to be a binary output
>> for each iteration of cross validation; like this:
>
>
>> >>> print cv_results.samples
>> [[ 0.]
>>  [ 0.]
>>  [ 1.]
>>  [ 0.]
>>  [ 1.]
>>  [ 1.]
>>  [ 0.]
>> ...
>
>
>> From the tutorial I was expecting some number between 0 and 1 on each
>> pass, but maybe I shouldn't if each line corresponds to a fold...
>
>> Anyway, if anyone can see anything obviously wrong with this (either
>> in setup or approach) I'd be most grateful for pointers. Please let me
>> know, too, if there's other information that would be helpful for you
>> to know in order to address the question.
>
>
>
>> Thanks very much,
>> Paul
>
>> _______________________________________________
>> Pkg-ExpPsy-PyMVPA mailing list
>> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
>> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
>
>
> --
> Yaroslav O. Halchenko
> http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
> Senior Research Associate,     Psychological and Brain Sciences Dept.
> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
> Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
> WWW:   http://www.linkedin.com/in/yarik
>