[pymvpa] custom cross-validation procedure: train on individual blocks, test on averaged blocks?

Yaroslav Halchenko debian at onerussian.com
Wed Mar 7 03:51:30 UTC 2012

interesting question...

quick answer: we don't have a pre-crafted one-liner solution, but I see a few
possible resolutions for you ;-)  You are hitting a small problem though
(recently brought up in M. Casey's email): the number of predictions output
by a classifier cannot differ from the number of samples in the input
data... so it can't be a MappedClassifier.  But if your goal is just to
assess such a cross-validation, then you could do it with just a bit of
coding... let's discuss what is imho the easiest approach

I. creating custom sample-attribute based on partitioning and targets
   followed by mean_group_sample

so here is the code for you to test (and report back) whether it
does what you want:

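# imports assume PyMVPA 2.x (the mvpa2 namespace)
import numpy as np
from mvpa2.suite import *

class TestTogetherTrainAlone(Mapper):
    """Adds sa.custom: unique per training sample, shared by testing samples."""
    def _forward_dataset(self, ds):
        out = ds.copy()
        out.sa['custom'] = ds.sa.partitions.copy()
        # partitions == 1 marks the "training" samples and 2 marks the
        # "testing" samples that we want to average; so give every training
        # sample its own value in 'custom' to keep it from being averaged
        partition1 = ds.sa.partitions == 1
        # offset 10 is just a large enough number > 2 ;)
        out.sa.custom[partition1] = 10 + np.arange(np.sum(partition1))
        return out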
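
To see what the relabeling buys you without needing PyMVPA at all, here is a minimal plain-Python sketch of the same idea. The helper names (custom_labels, mean_by_group) and the one-feature toy values are made up for illustration and are not part of PyMVPA; mean_by_group only imitates what averaging over unique ('targets', 'custom') pairs is expected to do.

```python
from collections import defaultdict

def custom_labels(partitions):
    """Mimic the mapper: give each training sample (partition 1) a unique label."""
    labels = list(partitions)
    next_label = 10  # any value that cannot collide with partition label 2
    for i, p in enumerate(partitions):
        if p == 1:
            labels[i] = next_label
            next_label += 1
    return labels

def mean_by_group(samples, targets, custom):
    """Average the samples that share the same (target, custom) pair."""
    groups = defaultdict(list)
    for s, t, c in zip(samples, targets, custom):
        groups[(t, c)].append(s)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}

# Hypothetical one-feature toy data: four training blocks, four testing blocks.
partitions = [1, 1, 1, 1, 2, 2, 2, 2]
targets = ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'B']
samples = [1.0, 5.0, 2.0, 6.0, 1.0, 3.0, 5.0, 7.0]

custom = custom_labels(partitions)
averaged = mean_by_group(samples, targets, custom)
```

Every training sample ends up in its own group and so survives untouched, while the testing samples collapse to one averaged sample per target.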

cv = CrossValidation(ChainMapper([
                           TestTogetherTrainAlone(),  # creates sa.custom used below
                           mean_group_sample(['targets', 'custom']),
                           CLASSIFIER], space='targets'),
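
For comparison, the fold logic itself (train on individual blocks, test on per-condition averages of the left-out run) can be sketched in plain Python. This is only an illustration under loud assumptions: the nearest-mean "classifier", the function names, and the one-feature values are all made up, and nothing here uses or stands in for the PyMVPA API; only the block orders come from the question.

```python
def nearest_mean_predict(train_x, train_y, x):
    """Toy classifier: predict the target whose training mean is closest to x."""
    means = {}
    for label in set(train_y):
        vals = [v for v, y in zip(train_x, train_y) if y == label]
        means[label] = sum(vals) / len(vals)
    return min(means, key=lambda label: abs(means[label] - x))

def cross_validate(runs):
    """runs: list of (targets, samples) per run. Returns accuracy per fold."""
    accuracies = []
    for left_out in range(len(runs)):
        # Train on every individual block from the remaining runs.
        train_x, train_y = [], []
        for i, (targets, samples) in enumerate(runs):
            if i != left_out:
                train_x.extend(samples)
                train_y.extend(targets)
        # Test on one averaged sample per condition from the left-out run.
        test_targets, test_samples = runs[left_out]
        conditions = sorted(set(test_targets))
        correct = 0
        for cond in conditions:
            vals = [v for v, y in zip(test_samples, test_targets) if y == cond]
            avg = sum(vals) / len(vals)
            correct += nearest_mean_predict(train_x, train_y, avg) == cond
        accuracies.append(correct / len(conditions))
    return accuracies

# Block orders from the question; the one-feature values are made up.
runs = [
    (list("AABABB"), [1.0, 1.2, 3.0, 0.8, 3.1, 2.9]),
    (list("AAABBB"), [0.9, 1.1, 1.0, 3.2, 2.8, 3.0]),
    (list("AABBAB"), [1.1, 0.9, 3.0, 3.1, 1.0, 2.9]),
]
fold_accuracies = cross_validate(runs)
```

Each fold trains on 12 individual blocks and makes only two predictions, one per averaged condition, which is the 2-instead-of-6 testing scheme described in the question.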

>    Hi all,
>    I would like to do the following cross-validation procedure in pymvpa.
>    Here is my toy example: Say I have 3 runs in a block-design experiment. I
>    have two conditions, A and B, and in each run I have 3 blocks of each
>    condition. E.g.:
>    Run 1: A A B A B B
>    Run 2: A A A B B B
>    Run 3: A A B B A B
>    I would like to do a leave-one-out classification, but on each fold, I
>    would like to train on individual blocks, and test on averaged blocks in
>    the left out run. So I feed individual blocks of 'A' and 'B' from two runs
>    to train the classifier, but on the left out run,  I average all the 'A's
>    and 'B's, and test the classifier on each of these. So I test the
>    classifier twice instead of 6 times on each fold.
>    How do I do this? Is this possible by just using the CrossValidation()
>    function? Or do I have to rewrite it...
>    Thanks!
>    -Edmund Chong

Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
