[pymvpa] Chunks for structural analysis

Tue Dec 16 16:42:53 UTC 2014

Ah, that's great, thanks for the extra info.

On Tue, Dec 16, 2014 at 2:19 PM, Yaroslav Halchenko <debian at onerussian.com>
wrote:
>
>
> On Tue, 16 Dec 2014, Nick Oosterhof wrote:
>
>
> > On 16 Dec 2014, at 14:37, Thomas Nickson <thomas.nickson at gmail.com> wrote:
>
> > > The chunks are supposed to define sets that are independent, but I'm
> > > not sure how to encode this. If I'm doing structural analysis, all of
> > > my scans are independent of each other, right? So my chunks should
> > > just be an array of distinct numbers, as long as the number of
> > > subjects I have?
>
> > Indeed.
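A minimal sketch of that chunk encoding, in plain Python. The subject count is made up for illustration; in PyMVPA the resulting list would typically end up in the dataset's .sa.chunks attribute.

```python
# Hypothetical example: six structural scans, one per subject.
n_subjects = 6

# One distinct chunk label per subject, since every scan is independent.
chunks = list(range(n_subjects))

# Every label is unique, so no two scans share a chunk.
assert len(set(chunks)) == n_subjects
print(chunks)
```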
>
> > > Or is independence the same as the diagnosis groupings?
>
> > If you have different groups (e.g. patients and healthy controls) and
> > you want to see if you can discriminate between these groups, you can
> > encode this in the .sa.targets attribute.
>
> The trick here, though, is still to "keep balance": in most scenarios
> you have a relatively small number of samples, so you would still like
> to have balanced groups (e.g. patients and controls) in both training
> and testing to avoid a possible "training winner takes it all"
> situation.
>
> Here is, e.g., a possible partitioning beast which would do something
> like that if you assign a different chunk to every sample (since they
> are independent):
>
> # imports assumed here; they were not part of the original snippet
> import mvpa2
> from mvpa2.suite import *
>
> mvpa2.seed(1)  # reproducible balancing
> partitioner = ChainNode([NFoldPartitioner(cvtype=2),
>                          Sifter([('partitions', 2),
>                                  ('targets',
>                                   dict(uvalues=['patient', 'control'],
>                                        balanced=True))]),
>                          Balancer(attr='targets',
>                                   # can set count > 1 if you don't have
>                                   # "enough"
>                                   count=1,
>                                   limit='partitions',
>                                   apply_selection=True)],
>                         space='partitions')
>
> It would grab all possible patient/control pairs for testing, ensure
> that each pair is balanced (1 patient, 1 control), and balance the
> number of samples across groups through subselection (needed if you
> have, e.g., more controls than patients; otherwise remove Balancer).
>
> A related issue, which I hope one of us will tackle soon (or you are
> welcome to contribute), is https://github.com/PyMVPA/PyMVPA/issues/261,
> which should provide a "merged" NFoldPartitioner + Sifter to avoid the
> inefficiency of the current setup and its cumbersome construct ;)
>
> --
> Yaroslav O. Halchenko, Ph.D.
> http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
> Research Scientist,            Psychological and Brain Sciences Dept.
> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
> Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
> WWW:   http://www.linkedin.com/in/yarik