[pymvpa] searchlight

Fri May 15 17:20:53 UTC 2015

Thanks for all your answers.

The problem might be that I am using resampled data extracted on the
cortical surface as well as at coordinates in subcortical ROIs. This is
similar to grayordinates in HCP, but it is not HCP data, so I don't
really have a way to go back to voxel space.
Resampling occurs anyway during preprocessing, so I thought it would be OK.

I have not, per se, correlated the number of neighbors within the radius
with the searchlight results, but just displaying it raises concerns about
potential bias, as I know classifiers are sensitive to the dimension of
the feature space.
The distribution of the number of features is really widespread due to the
surface constraint: for example, a 6mm radius gives nfeat from 6 to 118,
which definitely has an influence on classification.
I will try to find a solution to that.
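
A minimal sketch of the check discussed here, in plain Python: count the
neighbors within a fixed radius of each node, then correlate that count
with a per-node searchlight score. The function names, the toy coordinates
and the random "accuracy" values are all assumptions for illustration;
they are not PyMVPA API and not real results.

```python
import math
import random


def neighbor_counts(coords, radius):
    """Number of points within `radius` of each point (the centre included)."""
    return [
        sum(1 for p in coords if math.dist(c, p) <= radius)
        for c in coords
    ]


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy, unevenly sampled coordinates in mm (hypothetical): one dense patch
# and one sparse patch, to mimic a variable sampling resolution.
rng = random.Random(0)
dense = [tuple(rng.uniform(0, 10) for _ in range(3)) for _ in range(150)]
sparse = [tuple(rng.uniform(0, 60) for _ in range(3)) for _ in range(150)]
coords = dense + sparse

counts = neighbor_counts(coords, radius=6.0)

# Placeholder per-node scores; replace with the actual searchlight map.
accuracy = [rng.gauss(0.5, 0.05) for _ in coords]

r = pearson_r(counts, accuracy)
print(min(counts), max(counts), round(r, 3))
```

The brute-force neighbor count is quadratic in the number of nodes, so for
a full surface a spatial index would be a better choice. A correlation far
from zero on real data would be the indication that searchlight size is
driving the results.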

cheers

basile

On Fri, May 15, 2015 at 10:55 AM, Christopher J Markiewicz <effigies at bu.edu>
wrote:

> On 05/15/2015 09:25 AM, basile pinsard wrote:
> > Hi MVPA experts,
> >
> > I have a theoretical question that arose from a recent analysis using
> > searchlight (either surface or voxel based):
> >
> > What is the most sensible feature selection strategy between:
> > - a radius with a variable number of features included, so that the
> > different classifiers are trained on different numbers of dimensions;
> > - a fixed number of closest voxels/surface_nodes that would represent a
> > different surface/volume/spatial_extent depending on the location.
>
> From my reading, the more common is the former. This is probably
> because, without evidence that your results particularly correlate with
> searchlight size, the more interpretable figure is one in which each
> voxel represents a statistic taken over a fixed spatial extent.
>
> A fixed number of voxels (I agree with Jo that one should always use
> voxels; even if you are using surface nodes to define a neighborhood,
> these should be mapped back to voxels to avoid smoothing and resampling)
> is beneficial if you are using an error metric that is sensitive to
> dimensionality, such as mean squared error.
>
> With a surface searchlight of radius 9mm, I get a distribution of
> searchlight sizes (in one subject) that's approximately normal(66, 8). I
> have not found that the cross-validation training error of classifiers
> (linear SVM, mostly) is particularly sensitive to searchlight size. On
> the other hand, attempting to use the same searchlights with regression
> problems produces results that correlate strongly (positively or
> negatively, depending on regression algorithm) with number of voxels.
>
> > I have examples with surfaces, for which I used spherical templates
> > (similar to the 32k surfaces in the HCP dataset) transformed into subject
> > space. I computed the number of neighbors for each node with a fixed
> > radius and noted a differential sampling resolution in the brain, which
> > somewhat overlaps with my network of interest (motor), hence my concerns.
>
> Do your preliminary results correlate with searchlight size across
> several regions? That would be my primary indication that this is a
> concern.
>
> > With a voxel-based searchlight, depending on the masking, voxels on the
> > borders of the mask will have fewer neighbors in a fixed-radius sphere.
>
> Could you smear the mask with your searchlight, i.e. extend it in all
> directions? You'll still be including (presumably) uninformative voxels,
> but at least it won't be dimensionality itself that gets you.
>
> > PyMVPA has only this strategy for now, but I have read many papers that
> > use a fixed number of features in the searchlight.
> >
> > What do you think?
> >
> > I did an ugly modification to have a temporary fixed feature number
> > (closest) on surface but it should be optimized:
> >
> https://github.com/bpinsard/PyMVPA/commit/1af58ea8a57882ed57059491c19d83bed43e0bce
>
> If I'm reading this right (I haven't dug into the PyMVPA surface
> searchlight implementation), this is selecting a maximum number of
> surface nodes, and then mapping that to voxels, and will still end up
> with variable numbers of voxels, depending on the density of surface nodes.
>
> What about sorting nodes based on distance, mapping to voxels, and then
> taking the first max_features voxels?
>
> --
> Christopher J Markiewicz
> Ph.D. Candidate, Quantitative Neuroscience Laboratory
> Boston University
>
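
The suggestion quoted above (sort nodes by distance, map to voxels, then
keep the first max_features voxels) could be sketched as follows. This is
an illustration only: `fixed_size_searchlight` and the `node_to_voxels`
mapping are hypothetical names and do not correspond to the PyMVPA surface
searchlight implementation.

```python
import math


def fixed_size_searchlight(center, node_coords, node_to_voxels, max_features):
    """Return up to `max_features` unique voxel ids, nearest nodes first.

    center         -- (x, y, z) of the searchlight centre
    node_coords    -- list of (x, y, z) surface node coordinates
    node_to_voxels -- node_to_voxels[i] lists the voxel ids hit by node i
                      (hypothetical mapping, assumed to be precomputed)
    """
    # Visit nodes in order of increasing distance from the centre.
    order = sorted(
        range(len(node_coords)),
        key=lambda i: math.dist(center, node_coords[i]),
    )
    selected = []
    seen = set()
    for i in order:
        for voxel in node_to_voxels[i]:
            if voxel not in seen:
                seen.add(voxel)
                selected.append(voxel)
                if len(selected) == max_features:
                    return selected
    # Fewer than max_features voxels were reachable.
    return selected


# Toy example: five nodes on a line, some of them sharing voxels.
node_coords = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (10, 0, 0)]
node_to_voxels = [[100], [100, 101], [102], [102, 103], [104]]

print(fixed_size_searchlight((0, 0, 0), node_coords, node_to_voxels, 3))
```

Because several nodes can map to the same voxel, the cut-off is applied
after deduplication, so every searchlight ends up with the same number of
voxels whenever enough are available. When the closest remaining node maps
to more voxels than there are free slots, which of its voxels are kept
depends on their order in the mapping.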

-- 
Basile Pinsard

PhD candidate,
Laboratoire d'Imagerie Biomédicale, UMR S 1146 / UMR 7371, Sorbonne
Universités, UPMC, INSERM, CNRS
Brain-Cognition-Behaviour Doctoral School, ED3C, UPMC, Sorbonne
Universités
Biomedical Sciences Doctoral School, Faculty of Medicine, Université de
Montréal
CRIUGM, Université de Montréal