[pymvpa] balancing leave-one-out
Ben Acland
benacland at gmail.com
Mon Aug 18 00:53:12 UTC 2014
... is probably not exactly the right subject for this question, but here goes.
I'm having some issues with leave-one-out cross-validation. Specifically, I've got two subject groups and am having trouble getting NFoldPartitioner to leave the same number of samples from each group in the training dataset.
A little more detail. My dataset has two subject groups and three trial types. I've already made sure that the dataset (ds) is balanced. Below, I make a dataset containing only one trial type (whose name is contained in the variable 'ename'), then try to set up cross-validation as described:
import mvpa2.suite as mp  # assuming 'mp' is an alias for mvpa2.suite

# keep one trial type and one event offset; classify subject group
clf_ds = ds[ds.sa.name == ename, ds.fa.event_offsetidx == offset]
clf_ds.sa["targets"] = clf_ds.sa.sub_group

rep = mp.Repeater(count=PERM_COUNT)
perm = mp.AttributePermutator('sub_group', limit={'partitions': 1}, count=1)
part = mp.NFoldPartitioner(attr="subject")
clf = mp.LinearCSVMC()

# cross-validation on permuted training data, used to build the null distribution
null_cv = mp.CrossValidation(
    clf,
    mp.ChainNode([part, perm],
                 space=part.get_space()),
    postproc=mp.mean_sample())
distr_est = mp.MCNullDist(rep,
                          tail='left',
                          measure=null_cv,
                          enable_ca=['dist_samples'])

# the actual cross-validation, tested against the null distribution
cvmcc = mp.CrossValidation(clf,
                           part,
                           postproc=mp.mean_sample(),
                           null_dist=distr_est,
                           enable_ca=['stats'])
result = cvmcc(clf_ds)
So, I'm clearly doing something wrong. If I build the generator myself, then look at its output, I get this:
gen = mp.ChainNode([part, perm], space=part.get_space())
asdf = gen.generate(clf_ds)
dd = asdf.next()
[(p, sg, len(np.where(np.logical_and(dd.sa.partitions==p, dd.sa.sub_group==sg))[0])) for p in np.unique(dd.sa.partitions) for sg in np.unique(ds.sa.sub_group)]
[(1, 'ctrl', 89), (1, 'scz', 83), (2, 'ctrl', 0), (2, 'scz', 6)]
Wah wahh, not what we're looking for.
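For anyone squinting at that one-liner: it just counts samples for every (partition, group) pair. Here is a minimal plain-Python sketch of the same tally, with no PyMVPA involved. The helper name `tally` and the toy data are made up for illustration; the only thing carried over is NFoldPartitioner's convention that partition 1 is training and partition 2 is testing.

```python
from collections import Counter


def tally(partitions, groups):
    """Count samples for every (partition, group) pair, zeros included."""
    counts = Counter(zip(partitions, groups))
    return [(p, g, counts[(p, g)])
            for p in sorted(set(partitions))
            for g in sorted(set(groups))]


# Hypothetical toy fold: 1 = training samples, 2 = testing samples
partitions = [1, 1, 1, 1, 1, 2, 2]
groups = ['ctrl', 'ctrl', 'ctrl', 'scz', 'scz', 'scz', 'scz']

result = tally(partitions, groups)
print(result)
```

A balanced fold would show equal counts for both groups within each partition; a zero in the testing partition means a whole group is missing from the left-out data.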
I tried another approach using Balancer, but got a number of other exceptions. I can go into that if needed, but there's gotta be a simple way to do what I'm trying to do. To be specific, for each fold I want to leave out one sample from group 'ctrl' and one from group 'scz'.
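To make the goal concrete, here is a rough plain-Python sketch of the fold structure I'm after. It is not PyMVPA code and doesn't claim to be how the library would do it. The function `paired_folds` and the subject IDs are hypothetical, and it assumes the left-out unit is a subject and that every ctrl/scz pairing is wanted as a fold.

```python
from itertools import product


def paired_folds(subject_groups):
    """Yield (train, test) subject lists, holding out one subject per group.

    subject_groups maps a subject id to its group label.
    """
    by_group = {}
    for subj, grp in sorted(subject_groups.items()):
        by_group.setdefault(grp, []).append(subj)
    group_names = sorted(by_group)
    # every combination that takes exactly one subject from each group
    for held_out in product(*(by_group[g] for g in group_names)):
        test = list(held_out)
        train = [s for s in sorted(subject_groups) if s not in held_out]
        yield train, test


# Hypothetical subjects, two per group
subjects = {'c1': 'ctrl', 'c2': 'ctrl', 's1': 'scz', 's2': 'scz'}
folds = list(paired_folds(subjects))
for train, test in folds:
    print(train, test)
```

With equal group sizes, every fold then trains on the same number of subjects from each group, which is the balance I can't get out of the plain leave-one-subject-out setup. The number of folds grows as the product of the group sizes, so with many subjects a random subset of pairings may be more practical.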
The answer is probably sitting right there staring me in the face, but if so then it's been staring me in the face for days now with no result. Please help!
Thanks,
Ben