[pymvpa] Bad accuracy results for cross subject classification

Mon Jan 12 10:38:47 UTC 2015

On 12 Jan 2015, at 09:23, gal star <gal.star3051 at gmail.com> wrote:

> I am wondering how to perform classification over
> a cross subject nifti file i've created using FSL. […]
> 
> I've tried to use NFoldPartitioner with LinearCSVMC, though 
> the results are always 100% accuracy (which looks fishy).
> 
> I've assured there is no contamination in the data (no scan from trainset
> exists also in testset).
> 
> What is the correct way to perform cross subject classification
> over functional datasets? 

At the very least, you want different values of .sa.chunks for different subjects, for example one chunk value per subject. NFoldPartitioner then leaves out one chunk, and therefore one subject, in each fold, so that you never train and test on the same subject.
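
To illustrate the idea, here is a minimal pure-Python sketch of what leave-one-chunk-out partitioning does when the chunk value is the subject. It does not call PyMVPA; the subject names, targets and the helper leave_one_chunk_out are hypothetical and only there for illustration.

```python
# Hypothetical toy data: one subject label and one target label per sample.
subjects = ["s1", "s1", "s2", "s2", "s3", "s3"]
targets = ["face", "house", "face", "house", "face", "house"]

# Using the subject id as the chunk label means that a leave-one-chunk-out
# split is also a leave-one-subject-out split.
chunks = list(subjects)


def leave_one_chunk_out(chunks):
    """Yield (held_out_chunk, train_indices, test_indices) for each fold."""
    for held_out in sorted(set(chunks)):
        train = [i for i, c in enumerate(chunks) if c != held_out]
        test = [i for i, c in enumerate(chunks) if c == held_out]
        yield held_out, train, test


folds = list(leave_one_chunk_out(chunks))
for held_out, train, test in folds:
    # No subject contributes samples to both sides of a fold.
    assert not {chunks[i] for i in train} & {chunks[i] for i in test}
    print(held_out, "train:", train, "test:", test)
```

If all samples shared a single chunk value, or if chunks coded runs rather than subjects, the same subject would end up on both the training and the test side of a fold.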

A somewhat more advanced option is a partitioning scheme in which you test on data from one subject in one run, after training on the data from all other runs in all other subjects. This may reduce run-specific effects shared across subjects, if different subjects do exactly the same task during corresponding runs. For this you could use a Sifter [1]. Its use is a bit more complicated (hopefully less so in the future [2]).
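
The selection logic of that scheme can be sketched in plain Python as follows. This is only an illustration of which samples end up in the training and test sets, not the Sifter API; the subject and run labels and the function cross_subject_cross_run_folds are hypothetical.

```python
from itertools import product

# Hypothetical design: 3 subjects x 2 runs, one sample per combination
# to keep the example short.
subjects = ["s1", "s2", "s3"]
runs = [1, 2]
samples = list(product(subjects, runs))  # [(subject, run), ...]


def cross_subject_cross_run_folds(samples):
    """Build folds that test on one (subject, run) pair and train only on
    samples that differ from it in BOTH subject and run."""
    folds = []
    for test_subj, test_run in sorted(set(samples)):
        test = [i for i, (s, r) in enumerate(samples)
                if s == test_subj and r == test_run]
        train = [i for i, (s, r) in enumerate(samples)
                 if s != test_subj and r != test_run]
        folds.append(((test_subj, test_run), train, test))
    return folds


folds = cross_subject_cross_run_folds(samples)
for key, train, test in folds:
    print(key, "train:", [samples[i] for i in train],
          "test:", [samples[i] for i in test])
```

Samples that share either the subject or the run with the test set are used on neither side of that fold, which is why this scheme trains on less data per fold than plain leave-one-subject-out.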

If you need more help, it would be useful if you could post a minimal snippet of your script that shows how you load the data, assign .sa.targets and .sa.chunks, define the partitioning scheme, and run the cross-validation.

[1] http://www.pymvpa.org/generated/mvpa2.generators.base.Sifter.html
[2] https://github.com/PyMVPA/PyMVPA/issues/261