[pymvpa] classification based on individual parameter estimates from FSL

Hanson, Gavin Keith ghanson0 at ku.edu
Wed Jul 16 03:24:09 UTC 2014

I don’t think your rig, as it’s set up, will do what you want.
I attempted to do something similar in my own work, and I think I have a solution for you.

First, merge your two tasks together into one BOLD image (fslmerge takes the output file first, then the inputs):
fslmerge -t bold_taskab.nii.gz bold_taska.nii.gz bold_taskb.nii.gz
Also double up your attr.txt, so it corresponds to the new double-length BOLD image.
Now you have a dataset with shape (608, whatever):
ds = fmri_dataset(samples='bold_taskab.nii.gz', targets=attr.targets, chunks=attr.chunks, mask='mask.nii.gz')
Now, do this:

ds.sa['task'] = np.repeat(['A', 'B'], 304)

which will label the first half of your data as "A" and the second half as "B".
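
To make the labelling concrete, here is a minimal NumPy-only sketch (no PyMVPA needed). The 304 samples per task are taken from the dataset shape quoted in the original post; the variable names are only illustrative.

```python
import numpy as np

n_per_task = 304  # samples per task, from the (304, 902629) shape in the original post

# np.repeat repeats each element in turn: 304 'A's followed by 304 'B's
task = np.repeat(['A', 'B'], n_per_task)

print(task.shape)           # (608,)
print(task[0], task[303])   # A A
print(task[304], task[-1])  # B B
```

This only works if the task A volumes really do come first in the merged image, which is why the order of the inputs to fslmerge matters.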

Now zscore, making sure you're conscious of your task assignment:
zscore(ds, chunks_attr='task')
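
Roughly, z-scoring per task means that each voxel is standardized using only the samples of its own task, so a global offset between the two tasks does not leak into the classifier. The sketch below is a plain NumPy illustration of that idea on toy data. The helper zscore_by_group is hypothetical, and PyMVPA's own zscore may differ in details (for example in how the standard deviation is estimated).

```python
import numpy as np

def zscore_by_group(samples, groups):
    """Z-score each feature (column) separately within each group of samples."""
    samples = np.asarray(samples, dtype=float)
    groups = np.asarray(groups)
    out = np.empty_like(samples)
    for g in np.unique(groups):
        rows = groups == g
        mean = samples[rows].mean(axis=0)
        std = samples[rows].std(axis=0)
        std[std == 0] = 1.0  # constant features become zero instead of dividing by zero
        out[rows] = (samples[rows] - mean) / std
    return out

rng = np.random.RandomState(0)
# Toy data: 6 samples per task, 3 features, with task B shifted by a large offset
data = np.vstack([rng.randn(6, 3), rng.randn(6, 3) + 10.0])
task = np.repeat(['A', 'B'], 6)

z = zscore_by_group(data, task)
print(z[task == 'A'].mean(axis=0))  # close to zero for every feature
print(z[task == 'B'].mean(axis=0))  # also close to zero, the offset is gone
```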

Now set up the searchlight (clf is your classifier, e.g. the LinearCSVMC() from your script):
partitioner = HalfPartitioner(count=2, selection_strategy='first', attr='task')
cv = CrossValidation(clf, partitioner)
sl = sphere_searchlight(cv, radius=3, postproc=mean_sample(), nproc=16)

The HalfPartitioner, as it's set up, will split your data into 2 chunks based on your new 'task' attribute. It'll train on task A and test on task B, then vice versa. Usually you want the average error across those two folds, but if you're really set on just training on A and testing on B, then omit the postproc=mean_sample() bit, and you'll get per-fold errors in the res dataset, and you can find the fold corresponding to what you want.
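
To illustrate the difference between the averaged error and the per-fold errors, here is a small NumPy-only sketch of the two-fold scheme on toy data. It is not PyMVPA code: the nearest-class-mean classifier is a stand-in for the SVM, the helper names are hypothetical, and the classes X and Y are the ones from the original post. It says nothing about which row of the searchlight result corresponds to which direction; check that on your own output.

```python
import numpy as np

def nearest_mean_error(train_x, train_y, test_x, test_y):
    """Error rate of a toy nearest-class-mean classifier (stand-in for the SVM)."""
    classes = np.unique(train_y)
    means = np.array([train_x[train_y == c].mean(axis=0) for c in classes])
    # Squared distance of every test sample to every class mean
    dists = ((test_x[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    predicted = classes[dists.argmin(axis=1)]
    return np.mean(predicted != test_y)

def cross_task_errors(samples, targets, task):
    """Return {(train_task, test_task): error} for both train/test directions."""
    errors = {}
    labels = np.unique(task)
    for train_label in labels:
        for test_label in labels:
            if train_label == test_label:
                continue
            tr, te = task == train_label, task == test_label
            errors[(str(train_label), str(test_label))] = nearest_mean_error(
                samples[tr], targets[tr], samples[te], targets[te])
    return errors

rng = np.random.RandomState(0)
n = 20                                               # toy number of samples per task
targets = np.tile(np.repeat(['X', 'Y'], n // 2), 2)  # X/Y labels within each task
task = np.repeat(['A', 'B'], n)                      # first half task A, second half task B
samples = rng.randn(2 * n, 5)
samples[targets == 'Y'] += 3.0                       # make the toy classes well separated

errors = cross_task_errors(samples, targets, task)
print(errors[('A', 'B')])               # train on A, test on B only
print(np.mean(list(errors.values())))   # average over both folds
```

Keeping the errors keyed by direction is the analogue of dropping postproc=mean_sample(): you keep one value per fold and pick the one you care about afterwards.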
Anyway, hope that helps.
- Gavin

On Jul 15, 2014, at 6:11 PM, David Soto <d.soto.b at gmail.com> wrote:

Hi, I hope you have enjoyed the World Cup :)

I am trying a searchlight pipeline for the first time now; it has been running
for some 6-8 hours and is still going, with little RAM and CPU used. To recapitulate, I am training an SVM on FSL copes from task A
regarding classes X & Y and then testing the model on FSL copes from task B regarding the same classes.
The shape of the training and testing datasets is (304, 902629).

My searchlight pipeline is the following, would you please let me know if this is OK?

from mvpa2.suite import *
attr = SampleAttributes(os.path.join(datapath1, 'attr.txt'))
ds = fmri_dataset(samples=os.path.join(datapath1, 'bold_taska.nii.gz'),
targets=attr.targets, chunks=attr.chunks)

ts = fmri_dataset(samples=os.path.join(datapath1, 'bold_taskb.nii.gz'),
targets=attr.targets, chunks=attr.chunks)


clf= LinearCSVMC()
predictions = clf.predict(ts.samples)
#validation= np.mean(predictions== ts.sa.targets)
sl = sphere_searchlight(predictions, radius=3, space='voxel_indices',
sl_map = sl(ds)

the ipython gui currently says

[SLC] DBG:                            Starting off 4 child processes for nblocks=4

On Fri, Jul 4, 2014 at 2:44 PM, David Soto <d.soto.b at gmail.com> wrote:
great thanks!

best of luck in the semifinals!


On Fri, Jul 4, 2014 at 2:33 PM, Michael Hanke <mih at debian.org> wrote:

On Tue, Jul 01, 2014 at 12:25:40AM +0100, David Soto wrote:
> Hi Michael, indeed ..well done for germany today! :).
> Thanks for the reply and the suggestion on KNN
> I should have been more clear that for each subject I have the
> following *block sequences*
> ababbaabbaabbaba in TASK 1
> ababbaabbaabbaba in TASK 2
> this explains that I have  8 a-betas and 8 b-betas for each task
> AND for each subject..so if i concatenate & normalize all the beta data
> across subjects I will have 8 x 19 (subjects)= 152 beta images for class a
> and the same for class b

Ah, I guess you model each task with two regressors (hrf + derivative?).
You can also use a basis function set and get even more betas...
> then could I use SVM searchlight trained to discriminate a from b in  task1
> betas and tested in the task2 betas?

yes, no problem.



PS: Off to enjoy the quarter finals ... ;-)

Michael Hanke

