[pymvpa] hyperalignment inquiry

Swaroop Guntupalli swaroopgj at gmail.com
Wed Aug 3 16:33:51 UTC 2016


Hi David,

You have to train the query engine; here is a slightly modified version of
what you pasted.
However, remember that the datasets should be approximately similar in
coverage. I noticed that your first subject has ~80,000 non-invariant
features while some other subjects have only ~30,000. That's a huge
difference.

I am not sure how hyperalignment performs with so many missing regions.
Swaroop

    d[num] = remove_invariant_features(d[num])

    # You don't need this line below
    # voxel_indices = d[num].fa.voxel_indices

    testqe = IndexQueryEngine(voxel_indices=Sphere(3))
    # Train the testqe with that dataset
    testqe.train(d[num])

    # Append the trained query engine itself (a list of query engines is
    # what you will pass on), rather than its query_byid method
    qe.append(testqe)
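
For completeness, a rough sketch of how the trained query engines would then
be passed to searchlight hyperalignment (assuming d and qe are the lists
filled in your loop; once query engines are supplied, no radius is needed):

slhyper = SearchlightHyperalignment(queryengine=qe)
slhypmaps = slhyper(d)
# apply the returned mappers to bring each subject into the common space
d_hyper = [m.forward(sd) for m, sd in zip(slhypmaps, d)]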

On Tue, Aug 2, 2016 at 6:30 AM, David Soto <d.soto.b at gmail.com> wrote:

> Hi Swaroop, thanks for looking into it...
>
> There does not seem to be clear documentation on how to set up the query
> engines. From what I have seen, this is what I did:
>
> d=[]
> qe=[]
> num=-1
> for subj in range(1,5):
>     num=num+1
>     print(subj)
>     datapath='/Users/dsoto/Dropbox/4hyperal/subj%d' %(int(subj))
>     d.append(fmri_dataset(samples=os.path.join(datapath, 'reg4mm.nii.gz'),
>                           targets=attr.targets, chunks=attr.chunks,
>                           mask=os.path.join(path0, 'mni4mm.nii.gz')))
>
>     d[num]=remove_invariant_features(d[num])
>
>     voxel_indices=d[num].fa.voxel_indices
>
>     testqe=IndexQueryEngine(voxel_indices=Sphere(3))
>
>     qe.append(testqe.query_byid)
>
> qe_merged = [qe[0],qe[1],qe[2],qe[3]]
>
> ds_merged = [d[0],d[1],d[2],d[3]]
>
> which I was then hoping to pass later to
>     slhyper = SearchlightHyperalignment(queryengine=qe_merged)
>
> however, there does not appear to be an output for each of the 'qe' above,
> even though I can see the voxel_ids...
>
> any tips would be very useful!
>
> thanks!
>
>
> [image: ][image: ][image: ]
> -------------------
>
> On 1 August 2016 at 19:02, Swaroop Guntupalli <swaroopgj at gmail.com> wrote:
>
>> Hi David,
>>
>> Thanks for the data. The issue is that when using searchlight
>> hyperalignment without supplying queryengines but providing a radius (as in
>> your case), we compute a single spherical volume queryengine on the
>> reference subject, which means all the datasets need to be aligned and have
>> the exact same number of features. The problem in your script is that you
>> remove invariant features separately for each subject, which results in a
>> different number of features for each subject.
>>
>> There are two ways to resolve it:
>> 1) after removing invariant features, supply a list of spherical volume
>> queryengines, one per subject, or
>> 2) find the union of invariant features across all subjects, remove that
>> same set of features from every subject, and proceed with hyperalignment
>> as you do currently (see the sketch below).
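>>
>> As a rough sketch of option 2 (with ds_all standing for the list of
>> per-subject datasets, and assuming all subjects were loaded with the same
>> MNI mask so that features line up column-wise before any removal):
>>
>> import numpy as np
>> # a feature is invariant for a subject if it has zero variance across
>> # samples; keep only features that vary in every subject
>> varying = np.logical_and.reduce([sd.samples.std(axis=0) > 0 for sd in ds_all])
>> ds_all = [sd[:, varying] for sd in ds_all]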
>>
>> Hope that helps.
>> Best,
>> Swaroop
>>
>> On Mon, Aug 1, 2016 at 6:30 AM, David Soto <d.soto.b at gmail.com> wrote:
>>
>>> Hi Swaroop, I have run the analyses using the individual copes that were
>>> co-registered in FSL to the individual T1 (using BBR) and then to the
>>> MNI 2mm template using 12 DOF. The problem persists even after doing
>>> remove_invariant_features().
>>> I get a similar 'IndexError: index 226822 is out of bounds for axis 1
>>> with size 198603'
>>>
>>> I have uploaded all the data files to the link below, just in case you
>>> could have a look. If you open the zip you will find all the relevant
>>> files, including the PyMVPA script that I use to run the analyses
>>> ('hyperalscript.py'), the individual data within a folder, and also the
>>> attributes file and the MNI brain mask:
>>> https://drive.google.com/open?id=0B7-qbWx3g9KfUU1jOFVpQ05oRWM
>>>
>>> any help would be much appreciated...
>>>
>>> thanks!
>>> david
>>>
>>> On 29 July 2016 at 18:28, Swaroop Guntupalli <swaroopgj at gmail.com>
>>> wrote:
>>>
>>>> That is correct. But the searchlight centers across subjects are
>>>> assumed to be aligned.
>>>>
>>>>
>>>> On Fri, Jul 29, 2016 at 8:24 AM, David Soto <d.soto.b at gmail.com> wrote:
>>>>
>>>>> Hi, thanks, I will try that. I understand therefore that the number
>>>>> of features need not be equal across subjects for searchlight
>>>>> hyperalignment - but please correct me if I am wrong.
>>>>> best
>>>>> david
>>>>>
>>>>> On 29 July 2016 at 16:17, Swaroop Guntupalli <swaroopgj at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>> If you are using searchlight hyperalignment, it is advisable to align
>>>>>> the data across subjects using anatomy first. Simplest would be to align
>>>>>> them to an MNI template and then run searchlight hyperalignment.
>>>>>> Our tutorial dataset is affine-aligned to the MNI template.
>>>>>>
>>>>>> Best,
>>>>>> Swaroop
>>>>>>
>>>>>> On Thu, Jul 28, 2016 at 10:51 AM, David Soto <d.soto.b at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Swaroop, I managed to get the dataset in the right format as
>>>>>>> per the searchlight hyperalignment tutorial; however, when I run the
>>>>>>> hyperalignment I get the following error (IndexError: index 46268 is
>>>>>>> out of bounds for axis 1 with size 43506, see further below).
>>>>>>> To recap, the dataset is a concatenation of each subject's data, each
>>>>>>> in individual native space, so the number of features differs across
>>>>>>> subjects.
>>>>>>> The code I use is the same as in the tutorial, namely the following;
>>>>>>> any feedback would be great, thanks, david
>>>>>>> cv = CrossValidation(clf, NFoldPartitioner(attr='subject'),
>>>>>>>                      errorfx=mean_match_accuracy)
>>>>>>>
>>>>>>> for test_run in range(nruns):
>>>>>>>     ds_train = [sd[sd.sa.chunks != test_run, :] for sd in ds_all]
>>>>>>>     ds_test = [sd[sd.sa.chunks == test_run, :] for sd in ds_all]
>>>>>>>
>>>>>>>     slhyper = SearchlightHyperalignment(radius=3, featsel=0.4,
>>>>>>>                                         sparse_radius=3)
>>>>>>>     slhypmaps = slhyper(ds_train)
>>>>>>>     ds_hyper = [h.forward(sd) for h, sd in zip(slhypmaps, ds_test)]
>>>>>>>
>>>>>>>     ds_hyper = vstack(ds_hyper)
>>>>>>>     zscore(ds_hyper, chunks_attr='subject')
>>>>>>>     res_cv = cv(ds_hyper)
>>>>>>>     bsc_slhyper_results.append(res_cv)
>>>>>>>
>>>>>>> OUTPUT MESSAGE.........
>>>>>>> Performing classification analyses...
>>>>>>>   between-subject (searchlight hyperaligned)...
>>>>>>>
>>>>>>> ------------------------------------------------------------
>>>>>>> ---------------
>>>>>>> IndexError                                Traceback (most recent
>>>>>>> call last)
>>>>>>> <ipython-input-191-85bdb873d4f1> in <module>()
>>>>>>>      24     # Searchlight Hyperalignment returns a list of mappers
>>>>>>> corresponding to
>>>>>>>      25     # subjects in the same order as the list of datasets we
>>>>>>> passed in.
>>>>>>> ---> 26     slhypmaps = slhyper(ds_train)
>>>>>>>      27
>>>>>>>      28     # Applying hyperalignment parameters is similar to
>>>>>>> applying any mapper in
>>>>>>>
>>>>>>> /usr/local/lib/python2.7/site-packages/mvpa2/algorithms/searchlight_hyperalignment.pyc
>>>>>>> in __call__(self, datasets)
>>>>>>>     626             node_blocks = np.array_split(roi_ids,
>>>>>>> params.nblocks)
>>>>>>>     627             p_results = [self._proc_block(block, datasets,
>>>>>>> hmeasure, queryengines)
>>>>>>> --> 628                          for block in node_blocks]
>>>>>>>     629         results_ds = self.__handle_all_results(p_results)
>>>>>>>     630         # Dummy iterator for, you know, iteration
>>>>>>>
>>>>>>> /usr/local/lib/python2.7/site-packages/mvpa2/algorithms/searchlight_hyperalignment.pyc
>>>>>>> in _proc_block(self, block, datasets, featselhyper, queryengines, seed,
>>>>>>> iblock)
>>>>>>>     387                 continue
>>>>>>>     388             # selecting neighborhood for all subject for
>>>>>>> hyperalignment
>>>>>>> --> 389             ds_temp = [sd[:, ids] for sd, ids in
>>>>>>> zip(datasets, roi_feature_ids_all)]
>>>>>>>     390             if self.force_roi_seed:
>>>>>>>     391                 roi_seed = np.array(roi_feature_ids_all[self.params.ref_ds])
>>>>>>> == node_id
>>>>>>>
>>>>>>> /usr/local/lib/python2.7/site-packages/mvpa2/datasets/base.pyc in
>>>>>>> __getitem__(self, args)
>>>>>>>     139
>>>>>>>     140         # let the base do the work
>>>>>>> --> 141         ds = super(Dataset, self).__getitem__(args)
>>>>>>>     142
>>>>>>>     143         # and adjusting the mapper (if any)
>>>>>>>
>>>>>>> /usr/local/lib/python2.7/site-packages/mvpa2/base/dataset.pyc in
>>>>>>> __getitem__(self, args)
>>>>>>>     445         if isinstance(self.samples, np.ndarray):
>>>>>>>     446             if np.any([isinstance(a, slice) for a in args]):
>>>>>>> --> 447                 samples = self.samples[args[0], args[1]]
>>>>>>>     448             else:
>>>>>>>     449                 # works even with bool masks (although
>>>>>>> without
>>>>>>>
>>>>>>> IndexError: index 46268 is out of bounds for axis 1 with size 43506
>>>>>>>
>>>>>>> On 28 July 2016 at 00:25, Swaroop Guntupalli <swaroopgj at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> If you have limited data, you can use a part of it (however you
>>>>>>>> split the data for training and testing) to train hyperalignment, and
>>>>>>>> also use the same part to train the classifier, then apply
>>>>>>>> hyperalignment and test the classifier on the left-out part. Yes, you
>>>>>>>> can artificially create 2 chunks (or more if you prefer); a sketch
>>>>>>>> follows below.
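>>>>>>>>
>>>>>>>> As a rough sketch of splitting 56 betas into 2 artificial chunks
>>>>>>>> (ds is a placeholder for one subject's dataset; the simple alternation
>>>>>>>> below only balances the classes if A and B trials are interleaved, so
>>>>>>>> do check ds.sa.targets within each chunk):
>>>>>>>>
>>>>>>>> import numpy as np
>>>>>>>> # alternate trials between two pseudo-chunks: 0, 1, 0, 1, ...
>>>>>>>> ds.sa['chunks'] = np.arange(ds.nsamples) % 2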
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jul 27, 2016 at 3:17 PM, David Soto <d.soto.b at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> sounds great, thanks. A further thing: I have seen that, in order
>>>>>>>>> to preclude circularity issues, hyperalignment is trained on a
>>>>>>>>> subset of training chunks and the transformation is then applied to the
>>>>>>>>> full datasets prior to classification analyses. Given that I have no
>>>>>>>>> proper chunks/runs here, but only 56 betas across trials, would it be okay
>>>>>>>>> to train hyperalignment on just half of the 56 betas, e.g. artificially
>>>>>>>>> splitting the dataset into 2 chunks, each containing 14 betas of class A
>>>>>>>>> and 14 of class B? Or would it be OK to train hyperalignment on all 56
>>>>>>>>> betas in the first instance?
>>>>>>>>> thanks!
>>>>>>>>> david
>>>>>>>>>
>>>>>>>>> On 28 July 2016 at 00:00, Swaroop Guntupalli <swaroopgj at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> The hyperalignment example on PyMVPA uses one beta map for each
>>>>>>>>>> category per run.
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 27, 2016 at 2:57 PM, Swaroop Guntupalli <
>>>>>>>>>> swaroopgj at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> Beta maps should work fine for hyperalignment. The more maps (or
>>>>>>>>>>> TRs) there are, the better the estimate.
>>>>>>>>>>> We used within-subject hyperalignment in Haxby et al. 2011,
>>>>>>>>>>> which uses maps from 6 categories (we used 3 successive betas per
>>>>>>>>>>> condition, I think).
>>>>>>>>>>>
>>>>>>>>>>> vstack() merges multiple datasets into a single dataset, and if
>>>>>>>>>>> there is any voxel count (nfeatures) mismatch across subjects, it won't
>>>>>>>>>>> work (as evidenced by the error).
>>>>>>>>>>> Hyperalignment takes in a list of datasets, one per subject.
>>>>>>>>>>> So, you can make that a list, e.g.
>>>>>>>>>>> ds_all = [ds1, ds2, ..., ds16]
>>>>>>>>>>> and use it for Hyperalignment().
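>>>>>>>>>>>
>>>>>>>>>>> A rough sketch of that (assuming ds_all is the list of per-subject
>>>>>>>>>>> datasets with matching feature counts):
>>>>>>>>>>>
>>>>>>>>>>> from mvpa2.suite import Hyperalignment
>>>>>>>>>>> hyper = Hyperalignment()
>>>>>>>>>>> mappers = hyper(ds_all)  # one trained mapper per subject
>>>>>>>>>>> # project every subject into the common space
>>>>>>>>>>> ds_hyper = [m.forward(sd) for m, sd in zip(mappers, ds_all)]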
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Swaroop
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jul 27, 2016 at 2:28 PM, David Soto <d.soto.b at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> hi,
>>>>>>>>>>>>
>>>>>>>>>>>> in my experiment I have 28 betas (parameter estimate images) in
>>>>>>>>>>>> condition A and 28 in condition B for each subject (N=16 in
>>>>>>>>>>>> total).
>>>>>>>>>>>>
>>>>>>>>>>>> I have performed across-subjects SVM-based searchlight
>>>>>>>>>>>> classification using MNI-registered individual beta images, and I would
>>>>>>>>>>>> like to repeat and confirm my results using a searchlight based on
>>>>>>>>>>>> hyperaligned data.
>>>>>>>>>>>>
>>>>>>>>>>>> I am not aware of any paper using hyperalignment on beta images,
>>>>>>>>>>>> but I think this should be possible; any advice would be welcome.
>>>>>>>>>>>>
>>>>>>>>>>>> I've created individual datasets concatenating the 28 betas in
>>>>>>>>>>>> condition A and the 28 in condition B (in the actual experiment conditions
>>>>>>>>>>>> A and B can appear randomly on each trial). I have 16 NIfTI datasets, one
>>>>>>>>>>>> per subject, each in individual native anatomical space. In trying to get
>>>>>>>>>>>> a dataset in the same format as in the hyperalignment tutorial, I use
>>>>>>>>>>>> fmri_dataset on each individual whole-brain set of 56 betas and then try
>>>>>>>>>>>> to merge them all, i.e. ds_merged = vstack((d1, d2, d3, d4, d5, d6, d7,
>>>>>>>>>>>> d8, d9, d10, d11, d12, d13, d14, d15, d16)), but this gives the error
>>>>>>>>>>>> pasted at the end,
>>>>>>>>>>>> which I think is because the number of voxels is different
>>>>>>>>>>>> across subjects. This is one issue.
>>>>>>>>>>>>
>>>>>>>>>>>> Another is that the function vstack does not appear to produce the
>>>>>>>>>>>> list of individual datasets that is in the hyperalignment tutorial dataset,
>>>>>>>>>>>> but rather a list of individual betas; I would be grateful for some tips.
>>>>>>>>>>>>
>>>>>>>>>>>> thanks!
>>>>>>>>>>>> david
>>>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>>>> ---------------
>>>>>>>>>>>> ValueError                                Traceback (most
>>>>>>>>>>>> recent call last)
>>>>>>>>>>>> <ipython-input-64-2fef46542bfc> in <module>()
>>>>>>>>>>>>      19 h5save('/home/dsoto/dsoto/fmri/wmlearning/h5.hdf5',
>>>>>>>>>>>> [d1,d2])
>>>>>>>>>>>>      20 #ds_merged = vstack((d1, d2, d3, d4, d5, d6, d7,d8,d9,
>>>>>>>>>>>> d10, d11, d12, d13, d14, d15, d16))
>>>>>>>>>>>> ---> 21 ds_merged = vstack((d1, d2))
>>>>>>>>>>>>
>>>>>>>>>>>> /usr/local/lib/python2.7/site-packages/mvpa2/base/dataset.pyc
>>>>>>>>>>>> in vstack(datasets, a)
>>>>>>>>>>>>     687                              "datasets have varying
>>>>>>>>>>>> attributes.")
>>>>>>>>>>>>     688     # will puke if not equal number of features
>>>>>>>>>>>> --> 689     stacked_samp = np.concatenate([ds.samples for ds in
>>>>>>>>>>>> datasets], axis=0)
>>>>>>>>>>>>     690
>>>>>>>>>>>>     691     stacked_sa = {}
>>>>>>>>>>>>
>>>>>>>>>>>> ValueError: all the input array dimensions except for the
>>>>>>>>>>>> concatenation axis must match exactly
>>>>>>>>>>>>


More information about the Pkg-ExpPsy-PyMVPA mailing list