[pymvpa] searchlight for data with different runs with different masks

Yaroslav Halchenko debian at onerussian.com
Tue Jan 5 20:56:19 UTC 2016


On Tue, 05 Jan 2016, Kaustubh Patil wrote:

>    Hi Yaroslav,

>    I hope you had some good downtime during holidays.

>    I am wondering if there is any straightforward solution for getting
>    balanced accuracy using PyMVPA?

ah -- thanks for the buzz

looking at the dataset summary I would recommend

1. assign coarser chunks (e.g. group every two chunks), something like

ds.sa['runs'] = ds.sa.chunks  # so you still have a "record"
# chunks are numbered 1..10 here, so subtract 1 to pair 1-2, 3-4, ...
ds.sa.chunks  = (ds.sa.chunks - 1) // 2

this is so you don't end up with runs that have only 2 samples of a
category, which would widen the distribution of mean_error for that CV fold

2. use Balancer to balance those training/testing splits
so if you want to do n-fold, just use the following partitioner

    from mvpa2.suite import ChainNode, NFoldPartitioner, Balancer

    partitioner = ChainNode([NFoldPartitioner(cvtype=1),
                             Balancer(attr='targets',
                                      count=5, # 5 is arbitrarily "high"
                                      limit='partitions',
                                      apply_selection=True
                                      )],
                            space='partitions')

instead of a plain NFoldPartitioner
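
To make the two steps concrete, here is a minimal pure-Python sketch (not
PyMVPA code) that applies them to the per-run counts from the quoted summary.
The helper names coarse_counts and balanced_indices are hypothetical, and the
pairing (run - 1) // 2 assumes runs are numbered from 1 as in that summary.
Balancing is done by random subsampling down to the rarest target, which
only approximates the idea behind Balancer rather than reproducing it.

```python
import random
from collections import Counter, defaultdict

# Per-run counts of (target 0, target 1), copied from the quoted summary.
RUN_COUNTS = {1: (7, 11), 2: (16, 2), 3: (16, 2), 4: (13, 5), 5: (9, 9),
              6: (9, 9), 7: (13, 5), 8: (15, 3), 9: (13, 5), 10: (9, 9)}


def coarse_counts(run_counts, group):
    """Sum per-target counts over all runs mapped to the same coarse chunk."""
    out = defaultdict(lambda: [0, 0])
    for run, (n0, n1) in sorted(run_counts.items()):
        out[group(run)][0] += n0
        out[group(run)][1] += n1
    return dict((k, tuple(v)) for k, v in out.items())


def balanced_indices(targets, rng):
    """Randomly keep the same number of samples for every target."""
    by_target = defaultdict(list)
    for i, t in enumerate(targets):
        by_target[t].append(i)
    n = min(len(ix) for ix in by_target.values())
    keep = []
    for ix in by_target.values():
        keep.extend(rng.sample(ix, n))
    return sorted(keep)


# Step 1: pair runs 1-2, 3-4, ... (runs start at 1, hence the "- 1").
paired = coarse_counts(RUN_COUNTS, lambda run: (run - 1) // 2)

# Step 2: hold out coarse chunk 1 (runs 3 and 4) and balance the training set.
train_targets = []
for chunk, (n0, n1) in sorted(paired.items()):
    if chunk != 1:
        train_targets += [0] * n0 + [1] * n1
keep = balanced_indices(train_targets, random.Random(0))
balanced = Counter(train_targets[i] for i in keep)
```

With this pairing the rarest target in any coarse chunk has 7 samples instead
of 2, and the balanced training set for this fold keeps 53 samples per target.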

Let me know how it goes
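
For reference, balanced accuracy itself is the mean of per-class recall, so
it can also be computed directly from predictions. Below is a minimal
pure-Python sketch, not a PyMVPA API; the function names are hypothetical.
It uses the 120/60 target split from the quoted summary to show why plain
accuracy is misleading on this dataset.

```python
def accuracy(true, predicted):
    """Fraction of samples predicted correctly."""
    hits = sum(1 for t, p in zip(true, predicted) if t == p)
    return hits / float(len(true))


def balanced_accuracy(true, predicted):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for label in sorted(set(true)):
        idx = [i for i, t in enumerate(true) if t == label]
        hits = sum(1 for i in idx if predicted[i] == label)
        recalls.append(hits / float(len(idx)))
    return sum(recalls) / len(recalls)


# 120 samples of target 0 and 60 of target 1, as in the quoted summary.
true = [0] * 120 + [1] * 60
# A degenerate classifier that always predicts the majority target.
majority = [0] * len(true)

plain = accuracy(true, majority)              # 120/180, about 0.667
balanced = balanced_accuracy(true, majority)  # (1.0 + 0.0) / 2 = 0.5
```

A classifier that ignores the data entirely scores about 67% plain accuracy
here but only 50% balanced accuracy, which is chance for two targets.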

>    Best regards
>    Kaustubh

>    PS: Happy new year!!!
>    On Sat, Dec 19, 2015 at 11:50 PM, Kaustubh Patil
>    <kaustubh.patil at gmail.com> wrote:

>      Hi Yaroslav, thanks for help.

>      Here is summary for the dataset of one subject:

>      Dataset: 180x71039 at float64, <sa: chunks,regressors,targets,volumes>,
>      <fa: voxel_indices>, <a:
>      add_regs,imgaffine,imghdr,imgtype,mapper,voxel_dim,voxel_eldim>
>      stats: mean=3.55452e-15 std=1 var=1 min=-4.07657 max=3.95942

>      Counts of targets in each chunk:
>        chunks\targets   0    1
>                        ---  ---
>               1          7   11
>               2         16    2
>               3         16    2
>               4         13    5
>               5          9    9
>               6          9    9
>               7         13    5
>               8         15    3
>               9         13    5
>              10          9    9

>      Summary for targets across chunks
>        targets  mean  std  min  max  #chunks
>           0      12   3.1    7   16     10
>           1       6   3.1    2   11     10

>      Summary for chunks across targets
>        chunks  mean  std  min  max  #targets
>           1      9     2    7   11      2
>           2      9     7    2   16      2
>           3      9     7    2   16      2
>           4      9     4    5   13      2
>           5      9     0    9    9      2
>           6      9     0    9    9      2
>           7      9     4    5   13      2
>           8      9     6    3   15      2
>           9      9     4    5   13      2
>          10      9     0    9    9      2
>      Sequence statistics for 180 entries from set [0, 1]
>      Counter-balance table for orders up to 2:
>      Targets/Order   O1        |   O2        |
>            0:        119    1  |   118    2  |
>            1:          0   59  |     0   58  |
>      Correlations: min=-0.5 max=0.98 mean=-0.0056 sum(abs)=79

>      On Sat, Dec 19, 2015 at 11:00 PM, Yaroslav Halchenko
>      <debian at onerussian.com> wrote:

>        On Sat, 19 Dec 2015, Kaustubh Patil wrote:

>        > Thanks a lot Yaroslav. I am following a procedure as described
>        > below; please let me know if it has any clear or potential
>        > problems. I am also throwing in another question here but can
>        > start another thread if it's worth it.

>        > 1) Alignment procedure: Align all the runs to the middle volume of
>        > run1 (example_func from fsl). Use the mask that was generated by
>        > fsl from run1.

>        ok
>        > 2) MVPA: do the classifiers give balanced accuracy as my datasets
>        > are not balanced?

>        might need rebalancing.  post output of your dataset.summary() here
>        > Also, is it recommended to run searchlight on betamap (after fitting
>        > hrf) or zscored raw data?

>        whatever fits your bill.  usually betamaps, and possibly z-scored
>        (per run or across all)
>        > If betamaps after fitting hrf, then using the provided function
>        > I get only one parameter per target per run; is that how it's
>        > supposed to be?

>        ok if that is what you want to classify... sometimes you might
>        want to model each trial separately.  there is no universal answer.
-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        


