[pymvpa] classification based on individual parameter estimates from FSL

David Soto d.soto.b at gmail.com
Thu Jul 31 20:49:11 UTC 2014


Hi, I keep plugging away at this pretty basic classification. To recap, the
inputs are FEAT parameter estimates for each subject (N=19), for each of two
conditions (the key classification targets), across 2 cognitive tasks. The
idea is to train the classifier on task 1 and then test it on task 2.


I run a whole-brain classification using an SVM, i.e.:

ds = fmri_dataset(samples=os.path.join(datapath1, 'data.nii.gz'),
                  targets=attr.targets, chunks=attr.chunks,
                  mask=os.path.join(datapath1, 'mask.nii.gz'))
zscore(ds, chunks_attr='task')
clf = LinearCSVMC()
partitioner = HalfPartitioner(count=2, selection_strategy='first', attr='task')
cv = CrossValidation(clf, partitioner)
res = cv(ds)
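
Note that by default CrossValidation scores with mean_mismatch_error, so res
holds per-fold error rates rather than accuracies (accuracy = 1 - error). A
minimal sketch to obtain accuracies directly, assuming the standard
errorfx(predictions, targets) signature:

cv = CrossValidation(clf, partitioner,
                     errorfx=lambda p, t: np.mean(p == t))  # fraction correct
res = cv(ds)  # each sample is now a per-fold accuracy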


Here I get a whole-brain classification accuracy of around 68%
(though I did not assess significance).
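
If significance were of interest, a permutation test is one option; a minimal
sketch assuming PyMVPA's AttributePermutator and MCNullDist (count kept small
here for illustration):

permutator = AttributePermutator('targets', count=200)
null_dist = MCNullDist(permutator, tail='left')  # lower error is better
cv_mc = CrossValidation(clf, partitioner, postproc=mean_sample(),
                        null_dist=null_dist)
res_mc = cv_mc(ds)
p = cv_mc.ca.null_prob  # p-value of the observed error under the null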

Then I ran a searchlight analysis, and looking at the classification accuracy
maps, the accuracies appear to follow a chance distribution with a mean of
50% and a maximum of around 56%. I wonder how it can be that none of the
searchlights reaches the level of the whole-brain classification? And if that
is the case, can the whole-brain classification be meaningful at all?


The same pattern emerges with searchlight spheres of radius 0, 1, 2, and 3
(radii are in voxels; radius 0 is a single voxel).

Below is the example searchlight code that I use:

ds = fmri_dataset(samples=os.path.join(datapath1, 'data.nii.gz'),
                  targets=attr.targets, chunks=attr.chunks,
                  mask=os.path.join(datapath1, 'mask.nii.gz'),
                  add_fa={'unmbral_glm': os.path.join(datapath1, 'mask.nii.gz')})
zscore(ds, chunks_attr='task')
clf = LinearCSVMC()
partitioner = HalfPartitioner(count=2, selection_strategy='first', attr='task')
cv = CrossValidation(clf, partitioner)
center_ids = ds.fa.unmbral_glm.nonzero()[0]
sl = sphere_searchlight(cv, radius=3, space='voxel_indices',
                        center_ids=center_ids, postproc=mean_sample(), nproc=16)
res = sl(ds)
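
To eyeball the resulting accuracy map, a minimal sketch for writing it back
out as NIfTI (assuming the searchlight covered every in-mask voxel, as here
where center_ids spans the whole mask; with a sparser set of centers the
feature mapping needs extra care):

nimg = map2nifti(ds, res.samples)  # ds supplies the voxel mapping, res the accuracies
nimg.to_filename('sl_accuracy.nii.gz')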


cheers

david


On Wed, Jul 16, 2014 at 4:24 AM, Hanson, Gavin Keith <ghanson0 at ku.edu>
wrote:

>  I don’t think your rig, as it’s set up, will do what you want.
> I attempted to do something similar in my own work, and I think I have a
> solution for you.
>
>  First, merge your two tasks together into 1 bold image (fslmerge takes
> the output file first):
> fslmerge -t bold_taskab.nii.gz bold_taska.nii.gz bold_taskb.nii.gz
> Also double up your attr.txt, so it corresponds to the new double-length
> bold image (a minimal way to do that is sketched below).
> Now you have a dataset with shape (608, whatever):
> ds = fmri_dataset(samples='bold_taskab.nii.gz', targets=attr.targets,
> chunks=attr.chunks, mask='mask.nii.gz')
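>
>  A minimal sketch of the doubling, assuming attr.txt is the plain text
> file that SampleAttributes reads:
>
> with open('attr.txt') as f:
>     txt = f.read()
> if not txt.endswith('\n'):
>     txt += '\n'
> with open('attr_ab.txt', 'w') as f:
>     f.write(txt * 2)  # same attributes, repeated for the doubled bold image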
> now, do this:
>
>  ds.sa['task'] = np.repeat(['A', 'B'], 304)
>
>  which will label the first half of your data as "A" and the second half
> as "B".
>
>  Now zscore, making sure you're conscious of your task assignment:
> zscore(ds, chunks_attr='task')
>
>  Now set up the SL
> clf=LinearCSVMC()
> partitioner=HalfPartitioner(count=2, selection_strategy='first',
> attr='task')
> cv=CrossValidation(clf, partitioner)
> sl=sphere_searchlight(cv, radius=3, postproc=mean_sample(), nproc=16)
> res=sl(ds)
>
>  The HalfPartitioner, as it's set up, will split your data into 2 chunks
> based on your new 'task' attribute. It'll train on task A and test on task
> B, then vice versa. Usually you want the average errors of that, but if
> you're really set on just training on A and testing on B, then omit the
> postproc=mean_sample() bit; you'll get per-fold error in the res
> dataset, and you can find the fold corresponding to what you want (see
> the sketch below).
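>
>  A minimal sketch of that fold selection, assuming the result dataset
> carries the usual cvfolds sample attribute (check res.sa to confirm which
> fold is which):
>
> res = sl(ds)                        # no postproc: one sample per fold
> fold_ab = res[res.sa.cvfolds == 0]  # e.g. the train-on-A/test-on-B fold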
> Anyway, hope that helps.
> - Gavin
>
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Gavin Hanson, B.S.
> Research Assistant
> Department of Psychology
> University of Kansas
> 1415 Jayhawk Blvd., 426 Fraser Hall
> Lawrence, KS 66045
>
>  On Jul 15, 2014, at 6:11 PM, David Soto <d.soto.b at gmail.com> wrote:
>
>   Hi, I hope you have enjoyed the world cup :)
>
>  I am trying a searchlight pipeline for the first time; it has been
> running for some 6-8 hours and is still going, with little RAM and CPU
> used. To recapitulate, I am training an SVM on FSL copes from task A
> for classes X & Y and then testing the model on FSL copes from task B
> for the same classes.
>  The shape of the training and testing datasets is (304, 902629).
>
> My searchlight pipeline is the following; would you please let me know if
> it is OK?
>  cheers,
> ds
>
> from mvpa2.suite import *
> datapath1='/home/dsoto/Documents/fmri/rawprepro_wmintrosp'
> attr = SampleAttributes(os.path.join(datapath1, 'attr.txt'))
> ds = fmri_dataset(samples=os.path.join(datapath1, 'bold_taska.nii.gz'),
> targets=attr.targets, chunks=attr.chunks)
>
> ts = fmri_dataset(samples=os.path.join(datapath1, 'bold_taskb.nii.gz'),
> targets=attr.targets, chunks=attr.chunks)
>
> zscore(ds)
> zscore(ts)
>
>  clf = LinearCSVMC()
> clf.train(ds)
> predictions = clf.predict(ts.samples)
> # validation = np.mean(predictions == ts.sa.targets)
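> # note: sphere_searchlight expects a measure (e.g. a CrossValidation
> # instance), not an array of predictions -- this is what the reply
> # above corrects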
> sl = sphere_searchlight(predictions, radius=3, space='voxel_indices',
> postproc=mean_sample())
> sl_map = sl(ds)
>
>
>  the IPython GUI currently says:
>
> [SLC] DBG: Starting off 4 child processes for nblocks=4
>
>
> On Fri, Jul 4, 2014 at 2:44 PM, David Soto <d.soto.b at gmail.com> wrote:
>
>>  great thanks!
>>
>>  best of luck in the semifinals!
>>
>> cheers
>> ds
>>
>>
>> On Fri, Jul 4, 2014 at 2:33 PM, Michael Hanke <mih at debian.org> wrote:
>>
>>> Hi,
>>>
>>> On Tue, Jul 01, 2014 at 12:25:40AM +0100, David Soto wrote:
>>> > Hi Michael, indeed ..well done for germany today! :).
>>> > Thanks for the reply and the suggestion on KNN
>>> > I should have been  more clear that for each subject I have the
>>>  > following *block
>>> > *sequences
>>> > ababbaabbaabbaba in TASK 1
>>> > ababbaabbaabbaba in TASK 2
>>> >
>>> > this explains that I have  8 a-betas and 8 b-betas for each task
>>> > AND for each subject..so if i concatenate & normalize all the beta data
>>> > across subjects I will have 8 x 19 (subjects)= 152 beta images for
>>> class a
>>> > and the same for class b
>>>
>>>  Ah, I guess you model each task with two regressors (hrf + derivative?).
>>> You can also use a basis function set and get even more betas...
>>> >
>>> > then could I use an SVM searchlight trained to discriminate a from b in
>>> > task1 betas and tested on the task2 betas?
>>>
>>>  yes, no problem.
>>>
>>> Cheers,
>>>
>>> Michael
>>>
>>> PS: Off to enjoy the quarter finals ... ;-)
>>>
>>>
>>> --
>>> Michael Hanke
>>> http://mih.voxindeserto.de
>>>
>>
>>
>>
>>  --
>> http://www1.imperial.ac.uk/medicine/people/d.soto/
>>
>
>
>
> --
> http://www1.imperial.ac.uk/medicine/people/d.soto/
>  _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
>
>
>
>



-- 
http://www1.imperial.ac.uk/medicine/people/d.soto/


More information about the Pkg-ExpPsy-PyMVPA mailing list