[pymvpa] Sanity check

Thu Sep 24 02:49:49 UTC 2015

On Wed, 23 Sep 2015, Raúl Hernández wrote:

> Hi, I’m trying to evaluate on trial by trial basis how well a region can
> predict the stimulus being presented to compare it with the participant’s
> judgment of the stimulus. So I’m training the classifier with data from all the
> trials on all the runs except for the one that I want to predict.

> I’m getting really good classifications, better than when I was predicting one
> run using all the others. Supposedly it should be a little better as I’m
> training with a little more data but I’m worried I’m doing something wrong.

> Could anyone let me know if I’m making some sort of mistake?

> I know that there should be a more efficient way to do it but I wanted
> something easy, this is my code:

> predictions = [] #this is a vector that will contain the predictions of the
> classifier

> for i,dsTest in enumerate(ds): #go through all the trials on ds and separate
> one to test

>     clf = LinearCSVMC()

>     fclf = FeatureSelectionClassifier(clf, fsel)

>     dsTrain = []

>     dsTrain.append(ds[0:i]) #separates the training data

>     dsTrain.append(ds[i:-1])

Minor, but note that a slice ending in :-1 selects everything except the last
element, so the final trial is silently dropped:

$> python -c 'print range(2)[:-1]'
[0]
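
To spell the slicing point out on a plain list, here is a minimal sketch.
The names `trials` and `i` are stand-ins, not taken from the quoted code,
and it assumes the two quoted slices are what ends up in the training set.
Besides dropping the last element, a slice that starts at i also contains
element i itself, i.e. the trial that was meant to be held out:

```python
trials = list(range(5))  # stand-in for five trial indices
i = 2                    # index of the trial held out for testing

# Slices as in the quoted loop: the second starts at i and stops before the last element.
quoted = trials[0:i] + trials[i:-1]

# Leave-one-out training set: everything before i plus everything after i.
intended = trials[:i] + trials[i + 1:]

print(quoted)    # [0, 1, 2, 3]
print(intended)  # [0, 1, 3, 4]
```

If that is what happened in the real loop, the test trial would have been
part of the training data, which on its own could make results look too good.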

You didn't have to split manually; you could simply have assigned a per-trial
sample attribute, e.g.

ds.sa['trials'] = np.arange(len(ds))

and then used NFoldPartitioner(attr='trials') together with CrossValidation,
which is all standard PyMVPA machinery.
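
As a rough illustration of what partitioning on such an attribute amounts
to, here is a pure-Python sketch. It is not PyMVPA's implementation; the
helper `leave_one_value_out` and the value of `n_trials` are made up for
the example. Each unique attribute value is held out once and all other
samples are used for training:

```python
def leave_one_value_out(values):
    """Yield (held_out, train_idx, test_idx) for each unique attribute value."""
    for held_out in sorted(set(values)):
        test_idx = [k for k, v in enumerate(values) if v == held_out]
        train_idx = [k for k, v in enumerate(values) if v != held_out]
        yield held_out, train_idx, test_idx


n_trials = 4                    # hypothetical number of samples
trials = list(range(n_trials))  # plays the role of ds.sa['trials']

folds = list(leave_one_value_out(trials))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

With a unique value per sample every fold tests exactly one trial; with a
run label per sample the same logic would give leave-one-run-out folds.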

Back to the more optimistic results: as Jo pointed out, for the most
trustworthy analysis you should train/cross-validate across runs.
Also, the last tables in the ds.summary() output could give you some related
information on trial order, which could also contribute to an "optimistic"
result (depending on the output, of course ;) ).
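
To make the difference between the two schemes concrete, here is a small
sketch with made-up run labels (`runs` and `test_trial` are hypothetical,
not from the original data). It only counts which training trials come
from the same run as the test trial. The assumption behind it, offered as
a possible explanation rather than a diagnosis, is that trials within a
run are not independent of each other:

```python
runs = [0, 0, 0, 1, 1, 1, 2, 2, 2]  # hypothetical run label for each trial
test_trial = 4                      # a trial belonging to run 1

# Leave-one-trial-out: every other trial is used for training.
loto_train = [k for k in range(len(runs)) if k != test_trial]

# Leave-one-run-out: the whole run containing the test trial is held out.
loro_train = [k for k in range(len(runs)) if runs[k] != runs[test_trial]]

# Training trials that share a run with the test trial under each scheme.
same_run_loto = [k for k in loto_train if runs[k] == runs[test_trial]]
same_run_loro = [k for k in loro_train if runs[k] == runs[test_trial]]

print(same_run_loto)  # [3, 5]
print(same_run_loro)  # []
```

Under leave-one-trial-out the neighbours of the test trial stay in the
training set, whereas holding out the whole run removes them.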
-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik