[pymvpa] extracting individual sample predictions from CV

Fri Jun 3 14:31:52 UTC 2011

Hi,

thanks for those tips. The info from Francisco is very useful, and I'll 
read up on that now. And here is the short bit of code I now have that 
extracts the sample-wise predictions, assuming you are using a standard 
n-fold partitioning strategy with sequential chunks:

cv = CrossValidation(...
...)
results = cv(ds)

sPredictions = N.array([]) # sample-wise predictions
sTargets = N.array([]) # ... and targets, for validation
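for fold in range(numFolds):  # numFolds = number of CV folds
    # each entry of stats.sets holds the targets [0] and predictions [1] of one fold
    sPredictions = N.append(sPredictions, cv.ca.stats.sets[fold][1])
    sTargets = N.append(sTargets, cv.ca.stats.sets[fold][0])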

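print 'do both copies of targets correspond?', (sTargets == ds.targets).all()
cvAccuracy = 1 - N.mean(results)  # results holds the per-fold error
sampleAccuracy = N.sum(sPredictions == ds.targets) / float(ds.shape[0])
print 'do both versions of accuracy correspond?', cvAccuracy, 'vs', sampleAccuracy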
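
One caveat on the second check: the mean of the per-fold errors only
equals the pooled sample-wise error when every fold has the same number
of test samples. A small NumPy-only sketch with made-up fold data (not
PyMVPA output; the variable names are just for illustration) shows how
the two can differ:

```python
import numpy as N

# Made-up (targets, predictions) pairs for two folds of unequal size
folds = [
    (N.array([0, 1, 0, 1]), N.array([0, 1, 1, 1])),  # 1 error out of 4
    (N.array([0, 1]), N.array([1, 0])),              # 2 errors out of 2
]

# Average of the per-fold error rates: (0.25 + 1.0) / 2
fold_errors = N.array([N.mean(t != p) for t, p in folds])
mean_of_fold_errors = fold_errors.mean()

# Error rate over all samples pooled together: 3 errors out of 6
all_targets = N.concatenate([t for t, _ in folds])
all_predictions = N.concatenate([p for _, p in folds])
pooled_error = N.mean(all_targets != all_predictions)
```

With equal-sized folds the two numbers coincide, which is the case the
check above relies on.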

Is this stuff documented somewhere? I can't find "stats.sets" in the 
manual anywhere, and although the reference for CrossValidation does 
mention a conditional attribute called "stats", I can't find a link 
that describes the type/content of the stats member. If I could also 
find a structure that remembers where the samples in each fold come 
from, then it would be possible to write a more general version for 
other partitioning strategies (e.g. if you want to use interleaved folds).
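
To make the idea concrete, here is a rough NumPy-only sketch of what
the more general version might look like, assuming the test-sample
indices of each fold were available from somewhere. The function
reassemble_predictions and its arguments are hypothetical names of my
own, not part of PyMVPA:

```python
import numpy as N

def reassemble_predictions(fold_test_indices, fold_predictions, n_samples):
    """Put per-fold predictions back into the original sample order."""
    out = N.empty(n_samples, dtype=N.asarray(fold_predictions[0]).dtype)
    filled = N.zeros(n_samples, dtype=bool)
    for idx, preds in zip(fold_test_indices, fold_predictions):
        idx = N.asarray(idx)
        out[idx] = preds      # scatter this fold's predictions into place
        filled[idx] = True    # remember which samples have been covered
    if not filled.all():
        raise ValueError("some samples were never in a test fold")
    return out

# Interleaved 2-fold example with made-up predictions
fold_test_indices = [[0, 2, 4], [1, 3, 5]]
fold_predictions = [N.array([10, 12, 14]), N.array([11, 13, 15])]
sPredictions = reassemble_predictions(fold_test_indices, fold_predictions, 6)
```

Because the predictions are scattered by index rather than appended,
the result lines up with ds.targets whatever the fold layout is.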

best,

Brian

-- 
Brian Murphy
Post-Doctoral Researcher
Language, Interaction and Computation Lab
Centre for Mind/Brain Sciences
University of Trento
http://clic.cimec.unitn.it/brian/