[pymvpa] How to build the null-distribution

Wed Aug 27 01:38:13 UTC 2014

Hi all,
I have two tasks including two same conditions in the fMRI experiment. Now I want to train a clf(LinearCSVMC) in task1 and then to classify task2. Similarly, I will also train a clf in task2 then to classify task 2.  I want to achieve and test the signficance of prediction accuracy of both classifiers. My questions in the final.
My script:
import npumpy as np
from mvpa2.suite import *
from mvpa2.mappers.detrend import poly_detrend
from mvpa2.datasets.mri import fmri_dataset
from mvpa2.datasets.miscfx import remove_invariant_features
data = [0,1,2,3,4,5]
for i in data:
     #input
     clf = LinearCSVMC()
     FMRIFile1  = 'data/task1.nii.gz'
     FMRIFile2  = 'data/task2.nii.gz'
     LabelFile1 = 'attributes/task1.txt'
     LabelFile2 = 'attributes/task2.txt'
     maskFile   = 'mask/mask.nii.gz' 
     MotionParameterFile1 = 'headmotion/task1.par' 
     MotionParameterFile2 = 'headmotion/task2.par' 

     #load data
     attrs1 = SampleAttributes(LabelFile1)
     ds1 = fmri_dataset(samples=FMRIFile1, targets=attrs1.targets, chunks=attrs1.chunks, mask = maskFile)
     attrs2 = SampleAttributes(LabelFile2)
     ds2 = fmri_dataset(samples=FMRIFile2, targets=attrs2.targets, chunks=attrs2.chunks, mask = maskFile)

     # Detrend with motion correction parameter
     mc1 = McFlirtParams(MotionParameterFile1)
     for param in mc1:
          ds1.sa['mc_' + param] = mc1[param]
     mc2 = McFlirtParams(MotionParameterFile2)
     for param in mc1:
          ds2.sa['mc_' + param] = mc2[param]

     # detrend some dataset with mc params as additonal regressors
     res = poly_detrend(ds1, opt_regs=['mc_x', 'mc_y', 'mc_z', 'mc_rot1', 'mc_rot2', 'mc_rot3'])
     res = poly_detrend(ds2, opt_regs=['mc_x', 'mc_y', 'mc_z', 'mc_rot1', 'mc_rot2', 'mc_rot3'])

     # do chunkswise linear detrending on dataset
     poly_detrend(ds1, polyord=1, chunks_attr='chunks')
     poly_detrend(ds2, polyord=1, chunks_attr='chunks')

     # do z-score data - zscore dataset relative to baseline ('rest') mean
     zscore(ds1, chunks_attr='chunks')
     zscore(ds2, chunks_attr='chunks')

     # select classes for mvpa analysis
     ds1 = ds1[np.array([l%6 == i for l in ds1.sa.time_indices], dtype='bool')]
     ds1 = remove_invariant_features(ds1)
     ds2 = ds2[np.array([l%6 == i for l in ds2.sa.time_indices], dtype='bool')]
     ds2 = remove_invariant_features(ds2) 

    #do classification
     clf.train(ds1)
     predictions1 = clf.predict(ds2.samples)
     acc1 = np.mean(predictions1 == ds2.sa.targets) 
     clf.train(ds2)
     predictions2 = clf.predict(ds1.samples)
     acc2 = np.mean(predictions2 == ds1.sa.targets)

My Questions:
(1)My second is to test the significance of prediction accuracy. So I must fristly estimate the null-distribution using permutation testing for these prediction accuacy. However, the PyMVPA manual about this thesis is appropriate to estimate the null-distribution under those conditions that there is only one task and train a clf on some runs and then to use this trained clf to predict the remaining run. However, now I have two independent tasks and two independent data. So I can not use the script in the manual directly. However, I am new  to machine learning and do not know how to write the appropriate scirpt. How should I modify and refresh my script to build the null-distribution using permutation testing for the current prediction accuracy?
(2)How should I do to conduct searchlight analysis?
Any advise should be appreciated!

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.alioth.debian.org/pipermail/pkg-exppsy-pymvpa/attachments/20140827/48a6ebdf/attachment.html>