[pymvpa] searchlight for data with different runs with different masks

Sat Jan 16 04:28:35 UTC 2016

Thanks again Yaroslav.

I agree that the classifier might end up giving 0 or very small balanced
accuracy (macro-averaged accuracy) values, but I think that's still a better
measure than overall accuracy (micro-averaged accuracy). There are a few
other measures that can be useful for imbalanced datasets:

1. A-mean: arithmetic mean of the class-wise accuracies, i.e. the same as
balanced (macro-averaged) accuracy
2. G-mean: geometric mean of the class-wise accuracies instead of the
arithmetic mean above
3. F-measure
4. Area under the ROC curve
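
To make the measures listed above concrete, here is a minimal sketch in plain
Python. It is not PyMVPA code: the function names (per_class_accuracy, a_mean,
g_mean, f_measure, auc) and the toy labels are made up for illustration, and
the AUC is computed with the simple pairwise-comparison definition rather than
any particular library routine.

```python
def per_class_accuracy(y_true, y_pred):
    """Map each label to the fraction of its samples predicted correctly."""
    result = {}
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        result[label] = hits / float(len(idx))
    return result


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly (ignores class sizes)."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / float(len(y_true))


def a_mean(y_true, y_pred):
    """Arithmetic mean of the class-wise accuracies (balanced accuracy)."""
    values = list(per_class_accuracy(y_true, y_pred).values())
    return sum(values) / len(values)


def g_mean(y_true, y_pred):
    """Geometric mean of the class-wise accuracies."""
    values = list(per_class_accuracy(y_true, y_pred).values())
    product = 1.0
    for v in values:
        product *= v
    return product ** (1.0 / len(values))


def f_measure(y_true, y_pred, positive):
    """F1 score for the class `positive`: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom else 0.0


def auc(scores_pos, scores_neg):
    """Probability that a positive sample scores above a negative one.

    Ties count as half a win. Needs continuous decision values, not labels.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy example: 8 samples of class 'a', 2 of class 'b', and a classifier
# that always answers with the majority label 'a'.
y_true = ['a'] * 8 + ['b'] * 2
y_pred = ['a'] * 10
```

With these toy labels the overall accuracy looks decent because 8 of 10
samples are correct, while a_mean, g_mean and the F-measure for class 'b' all
expose that the minority class is never predicted.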

Of course, a better solution would be to use a classifier that can handle
imbalanced datasets, as you suggested. I have previously used SVMperf, which
can optimize AU-ROC:
https://www.cs.cornell.edu/people/tj/svm_light/svm_perf.html

Not sure how easy it is to incorporate new classifiers in PyMVPA, but I
could give it a try with some guidance.

Best regards,
Kaustubh

On Fri, Jan 15, 2016 at 11:08 PM, Yaroslav Halchenko <debian at onerussian.com>
wrote:

>
> On Fri, 15 Jan 2016, Kaustubh Patil wrote:
>
> >    Thanks Yaroslav.
>
> >    I tried your solution and it seems to work for this particular dataset,
> >    but unfortunately not for other datasets, as the labels cannot be
> >    balanced easily.
>
> >    Maybe it's possible to directly calculate balanced measures in the CV?
> >    I guess I will have to change the code to do that. Any suggestions on
> >    where to start?
>
> some toolboxes compute the 'mean of within-class accuracies' (not the
> overall accuracy), which accounts for class imbalance.  I guess we
> could code it quite easily if you like
>
> BUT the problem really would remain:  with a small number of samples the
> classifier might just take the "majority" label, since that would minimize
> error more than a low-performance decision.  So you would hurt yourself
> more than help.
>
> another solution is to try a classifier which provides weighting
> to the classes, e.g. as GNB with default prior setting does.  you could
> try it and see how it goes.  It is not the greatest classifier but a
> start. then you could add similar class weighting to some other
> classifiers supporting that.
>
> >    Best regards
> >    On Sat, Dec 19, 2015 at 3:51 PM, Yaroslav Halchenko
> >    <debian at onerussian.com> wrote:
>
> >      On Sat, 19 Dec 2015, Kaustubh Patil wrote:
>
> >      > Hi,
>
> >      > I want to use PyMVPA for whole-brain searchlight analysis on some
> >      > existing data. The data has been already preprocessed (skull
> >      > stripping, motion correction etc.). Each subject data contains 10
> >      > runs and each run was processed separately, so there is a separate
> >      > full brain boolean mask for each run.
>
> >      > My question is: what is the recommended/correct way to use this
> >      > data to perform run-wise cross-validation searchlight?
>
> >      you have a problem here: since you have done per-run preprocessing,
> >      in particular motion correction, your volumes are misaligned across
> >      runs.  (used FSL, didn't you? )
>
> >      ideally, you redo preprocessing while motion correcting to the same
> >      volume across all the runs.  Alternatively, you reslice all the runs
> >      into the same space (could well be the common space your toolkit
> >      used for analysis across runs -- common anatomical or MNI) and then
> >      do analysis there, while again unifying your mask, which must be the
> >      same across all the runs.
>
> >      > As I understand, each run has to be in the same space (same number
> >      > of voxels) so that training and test can be performed, so the whole
> >      > brain masks have to be somehow aligned. How would you recommend
> >      > doing this?
>
> >      it is not a mere 'number of voxels' problem but rather that your
> >      volumes are misaligned across runs.  if it were just voxel number --
> >      choose the intersection of all masks.
> --
> Yaroslav O. Halchenko
> Center for Open Neuroscience     http://centerforopenneuroscience.org
> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
> Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
> WWW:   http://www.linkedin.com/in/yarik
>
> _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
>
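
For the "intersection of all masks" step mentioned in the quoted thread, a
minimal NumPy sketch could look like the following. It assumes the runs have
already been resliced into one common space, as the thread recommends;
intersecting masks does nothing to fix misaligned volumes. The helper name
intersect_masks and the tiny 2x3 toy masks are made up for illustration, and
loading real mask images is left out.

```python
import numpy as np


def intersect_masks(masks):
    """Return a boolean mask that is True only where every run's mask is True.

    All masks must already share the same grid (same shape and alignment).
    """
    masks = [np.asarray(m, dtype=bool) for m in masks]
    shapes = set(m.shape for m in masks)
    if len(shapes) != 1:
        raise ValueError("masks must share one shape; got %s" % sorted(shapes))
    # logical_and.reduce folds the list of arrays voxel by voxel
    return np.logical_and.reduce(masks)


# Toy stand-ins for three per-run brain masks on the same grid
run1 = [[1, 1, 0], [1, 1, 1]]
run2 = [[1, 0, 0], [1, 1, 1]]
run3 = [[1, 1, 0], [0, 1, 1]]

common = intersect_masks([run1, run2, run3])
```

The resulting common mask would then be used for every run, so that each
sample has the same set of voxels for training and testing.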