[pymvpa] how to train/test on a single partition?

Sun Jul 14 16:29:26 UTC 2013

Hello,

I'm trying to do MVPA where training and testing are done on
different parts of a dataset using a partitioner and a sifter. I
define the test chunk manually in a for loop (for reasons not
important for the present question), and given the value for test_run,
the partitioner is defined by:

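from mvpa2.suite import *  # Sifter, ChainNode, NFoldPartitioner, ...

# keep only the fold whose test partition (partitions == 2) is test_run
sifter = Sifter([('partitions', 2),
                 ('chunks', [test_run])])

par = ChainNode([NFoldPartitioner(attr='chunks'),
                 sifter])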
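
To make the intent of this chain concrete, here is a plain-NumPy sketch of what I understand it to do: the partitioner labels the training samples of each fold with 1 and the test samples with 2, and the sifter keeps only the fold whose test partition consists of test_run. The helper single_partition and the example chunks are made up for illustration and are not part of PyMVPA.

```python
import numpy as np

def single_partition(chunks, test_run):
    """Return the partitions vector (1 = train, 2 = test) for the one
    leave-one-chunk-out fold whose test chunk is test_run.
    Hypothetical helper, not a PyMVPA function."""
    chunks = np.asarray(chunks)
    kept = []
    for left_out in np.unique(chunks):               # one fold per chunk
        partitions = np.where(chunks == left_out, 2, 1)
        test_chunks = np.unique(chunks[partitions == 2])
        if set(test_chunks.tolist()) == {test_run}:  # the sifting step
            kept.append(partitions)
    if len(kept) != 1:
        raise ValueError("test_run must name exactly one existing chunk")
    return kept[0]

chunks = [0, 0, 1, 1, 2, 2]
partitions = single_partition(chunks, test_run=1)
```

With three chunks the partitioner alone would give three folds; after sifting only one remains, which is the situation described next.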

Also a classifier using feature selection "clf_featsel" is defined,
and I want to define a measure that can be passed into a searchlight.
However, the current partitioner only gives a single partition, and
CrossValidation does not like that as it uses a TransferMeasure that
requires at least two splits.

Currently I compute classification accuracy with the following code:

def accuracy(ds):
    # ds.samples holds the predictions, ds.sa.targets the true labels
    v = mean_match_accuracy(ds.samples.ravel(), ds.sa.targets)
    return AttrDataset(np.asarray([v]), fa=ds.fa, a=ds.a)

class TrainTestMeasure(Measure):
    is_trained = True

    def __init__(self, node, generator, postproc=None, **kwargs):
        Measure.__init__(self, **kwargs)
        self._node = node            # classifier to train and apply
        self._generator = generator  # yields datasets with sa.partitions
        self._postproc = postproc    # optional function applied to each result

    def _train(self, ds):
        Measure._train(self, ds)

    def _call(self, ds):
        rs = []
        for d in self._generator.generate(ds):
            # partitions == 1 is the training part, 2 the test part
            d_train = d[d.sa.partitions == 1]
            d_test = d[d.sa.partitions == 2]

            self._node.train(d_train)
            r = self._node(d_test)
            if self._postproc is not None:
                r = self._postproc(r)
            rs.append(r)
        return hstack(rs)

and the final measure is defined by

cv = TrainTestMeasure(clf_featsel, par, accuracy)

which is then fed to a searchlight.
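
For clarity, this is roughly the computation I expect the measure to perform on the features of each searchlight sphere, written as a NumPy-only sketch. NearestMean is a toy stand-in for clf_featsel and train_test_accuracy is a hypothetical helper; neither is PyMVPA code, and the data are invented.

```python
import numpy as np

class NearestMean:
    """Toy classifier standing in for clf_featsel (not a PyMVPA class)."""
    def train(self, X, y):
        self.labels = np.unique(y)
        self.means = np.array([X[y == l].mean(axis=0) for l in self.labels])

    def predict(self, X):
        # squared distance of every sample to every class mean
        d = ((X[:, None, :] - self.means[None, :, :]) ** 2).sum(axis=2)
        return self.labels[d.argmin(axis=1)]

def train_test_accuracy(clf, X, y, partitions):
    """Train where partitions == 1, test where partitions == 2."""
    train, test = partitions == 1, partitions == 2
    clf.train(X[train], y[train])
    return float(np.mean(clf.predict(X[test]) == y[test]))

# six samples, two features, one fixed train/test split
X = np.array([[0., 0.], [1., 1.], [0., .1], [1., .9], [.1, 0.], [.9, 1.]])
y = np.array([0, 1, 0, 1, 0, 1])
partitions = np.array([1, 1, 2, 2, 1, 1])
acc = train_test_accuracy(NearestMean(), X, y, partitions)
```

The point is only that a single train/test split gives one accuracy value per sphere, with no averaging across folds.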

This seems to work well; however, I assume I overlooked a way to
achieve the same using built-in classes and functions in PyMVPA. Are
there any suggestions on how to use built-in PyMVPA functionality to
achieve the same?

Thank you for your consideration,
best,
Nick