[pymvpa] Biased estimates by leave-one-out cross-validations in PyMVPA 2

Ping-Hui Chiu chiupinghui at gmail.com
Fri Apr 20 18:17:19 UTC 2012


Dear PyMVPA experts,

Isn't leave-one-out cross-validation supposed to produce smaller bias but larger variance than N-fold cross-validation when N is less than the number of samples?

I ran a sanity check on binary classification of 200 random samples. 4-fold cross-validation produced unbiased estimates (~50% correct), whereas leave-one-out cross-validation consistently produced below-chance classification performance (~40% correct). Why?
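(Editorial note, not from the original thread: one plausible culprit is training-set imbalance under leave-one-out. With perfectly balanced classes, every leave-one-out training fold has one fewer sample of the held-out sample's class, which nudges a classifier toward the opposite class. The effect is extreme for a bare majority-vote rule, as this plain NumPy sketch, independent of PyMVPA, shows:)

```python
import numpy as np

def loo_majority_accuracy(labels):
    """Leave-one-out accuracy of a rule that always predicts the
    majority class of the training fold."""
    labels = np.asarray(labels)
    hits = 0
    for i in range(len(labels)):
        train = np.delete(labels, i)       # drop the held-out sample
        pred = np.argmax(np.bincount(train))  # majority class of the rest
        hits += (pred == labels[i])
    return hits / float(len(labels))

labels = np.remainder(range(200), 2)  # 100 of each class, as in the post
print(loo_majority_accuracy(labels))  # 0.0 -- every fold votes for the wrong class
```

Removing one class-0 sample leaves 99 zeros vs. 100 ones, so the majority vote is always class 1, and vice versa; the rule is wrong on every single fold.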

Any insight on this will be highly appreciated!

My code is listed below:

from mvpa2.suite import *

clf = LinearCSVMC()
# 4-fold CV: partition by 'chunks' (four chunks of 50 samples each)
cv_chunks = CrossValidation(clf, NFoldPartitioner(attr='chunks'))
# leave-one-out CV: partition by 'events' (one unique event per sample)
cv_events = CrossValidation(clf, NFoldPartitioner(attr='events'))

acc_chunks = []
acc_events = []
for i in range(200):
    print i
    ds = Dataset(np.random.rand(200))                # random data, no signal
    ds.sa['targets'] = np.remainder(range(200), 2)   # alternating binary labels
    ds.sa['events'] = range(200)                     # unique id -> leave-one-out
    ds.sa['chunks'] = np.concatenate((np.ones(50), np.ones(50) * 2,
                                      np.ones(50) * 3, np.ones(50) * 4))
    ds_chunks = cv_chunks(ds)
    acc_chunks.append(1 - np.mean(ds_chunks))        # error rate -> accuracy
    ds_events = cv_events(ds)
    acc_events.append(1 - np.mean(ds_events))

>>> print np.mean(acc_chunks), np.std(acc_chunks)
0.50025 0.0442542370853
>>> print np.mean(acc_events), np.std(acc_events)
0.40674 0.189247516232
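(Editorial note: by contrast, the 4-chunk partitioning keeps every training fold exactly balanced at 75/75, so the same majority-vote rule lands exactly at chance, consistent with the ~50% figure above. Again a plain NumPy sketch, independent of PyMVPA:)

```python
import numpy as np

labels = np.remainder(np.arange(200), 2)   # alternating 0/1, as in the post
chunks = np.repeat([1, 2, 3, 4], 50)       # four chunks of 50 samples

hits = 0
for c in (1, 2, 3, 4):
    train = labels[chunks != c]            # 150 samples, exactly 75 per class
    test = labels[chunks == c]             # 50 samples, 25 per class
    pred = np.argmax(np.bincount(train))   # majority class; a tie picks class 0
    hits += np.sum(pred == test)
accuracy = hits / float(len(labels))
print(accuracy)                            # 0.5 -- exactly chance
```

Each training fold is tied 75/75, so the rule predicts one fixed class and gets exactly half of every test chunk right.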

Thanks!
Dale
