[pymvpa] Train and test on different classes from a dataset

Mon Jan 14 17:59:52 UTC 2013

Dear Jan,

I was hoping to compile a working demo for you, but I haven't managed
to so far -- please see below for some pointers.

On Fri, Jan 11, 2013 at 01:58:51PM +0100, Jan Derrfuss wrote:
> I would like to train a classifier on classes A vs. B and test it on
> classes C vs. D from the same dataset. The dataset consists of 6
> chunks and my aim is to perform a crossvalidation searchlight
> analysis (i.e., train A vs. B on chunks 1 to 5, predict C vs. D for
> chunk 6; then train A vs. B for chunks 1-4 and 6, predict C vs. D
> for chunk 5; and so on). When testing, a prediction of class A
> should be considered correct if the label is C and incorrect if the
> label is D (and vice versa for class B).

My approach would be to generate a new sample attribute that assigns the
same label to A and C samples and a different one to B and D samples.
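
To illustrate the relabeling, here is a minimal sketch in plain Python. The
merged label names "AC" and "BD" and the helper merged_targets() are made
up for this example; in PyMVPA the resulting list could be stored as a
sample attribute (e.g. ds.sa['merged'] = merged, assuming a dataset ds).

```python
# Hypothetical mapping: class names A-D come from the question,
# the merged names "AC" and "BD" are invented for illustration.
GROUP_OF = {"A": "AC", "C": "AC", "B": "BD", "D": "BD"}


def merged_targets(targets):
    """Return one merged label per sample.

    A and C samples share one label, B and D samples share the other.
    """
    return [GROUP_OF[t] for t in targets]


# Toy list of original labels, one per sample.
targets = ["A", "B", "C", "D", "A", "C"]
merged = merged_targets(targets)
print(merged)
```

With the merged attribute as the classification target, a prediction of
"AC" for a C sample counts as correct and for a D sample as an error.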

Now you can run a cross-validation over the six original runs (nfold or
whatever you prefer). Combine the partitioner with a Sifter

http://pymvpa.org/generated/mvpa2.generators.base.Sifter.html

to filter out the samples you do not want to test or train on. Please
have a look at the source code of this class; for some reason the
examples are not rendered in the HTML version.
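
To make the intended folds concrete, here is a rough sketch of the
selection logic in plain Python. It does not use PyMVPA or the Sifter
class; the function cross_class_folds() and the toy data are hypothetical
and only show which samples should end up in the training and testing
portion of each fold.

```python
def cross_class_folds(targets, chunks,
                      train_classes=("A", "B"),
                      test_classes=("C", "D")):
    """Yield (held_out_chunk, train_indices, test_indices) per fold.

    Training uses only train_classes samples from all other chunks;
    testing uses only test_classes samples from the held-out chunk.
    """
    for held_out in sorted(set(chunks)):
        train = [i for i, (t, c) in enumerate(zip(targets, chunks))
                 if c != held_out and t in train_classes]
        test = [i for i, (t, c) in enumerate(zip(targets, chunks))
                if c == held_out and t in test_classes]
        yield held_out, train, test


# Toy data: 6 chunks, one sample of each class per chunk.
targets = ["A", "B", "C", "D"] * 6
chunks = [c for c in range(1, 7) for _ in range(4)]

folds = list(cross_class_folds(targets, chunks))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

Each fold trains on the A and B samples of five chunks and tests on the
C and D samples of the remaining one, which is the scheme described in
the question.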

Please let me know if you can't put the pieces together and I'll try
harder to create some running example code.

Cheers,

Michael

-- 
Michael Hanke
http://mih.voxindeserto.de