[pymvpa] Dataset with multidimensional feature vector per voxel
Yaroslav Halchenko
debian at onerussian.com
Thu Nov 19 14:40:50 UTC 2015
On Thu, 19 Nov 2015, Ulrike Kuhl wrote:
> Dear Yaroslav, dear all,
> I might have solved the balancing problem using PyMVPA's 'Balancer' (duh!).
> I extended the code of the partitioner like this:
>     npart = ChainNode([
>         NFoldPartitioner(len(DS_noisy.sa['targets'].unique),
>                          attr='chunks'),
>         ## so it should select only those splits where we took 1 from
>         ## each of the targets categories, leaving things in balance
>         Sifter([('partitions', 2),
>                 ('targets',
>                  {'uvalues': DS_noisy.sa['targets'].unique,
>                   'balanced': True})]),
>         Balancer(attr='targets', count=1, limit='partitions',
>                  apply_selection=True)
>         ], space='partitions')
> The classification result on noisy data looks perfect even with imbalanced group sizes -- is it correct to do it like this?
It is correct, BUT: how imbalanced is your imbalance? In the example you
gave, the classes were balanced, so no Balancer was needed.
Since the Balancer randomly subsamples your samples, you might want to

1. mvpa2.seed(1)  # or any other number
   to get reproducible results
2. increase count to > 1 so you get a more stable estimate
   (see the sketch below)
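A minimal sketch of both points, assuming the 'npart' chain from your
snippet (count=10 is just an illustrative value):

    import mvpa2
    from mvpa2.generators.resampling import Balancer

    mvpa2.seed(1)  # seed the RNG once so the Balancer's random
                   # subsampling (and thus the CV result) is reproducible

    # count=10 draws ten balanced subsamples per partitioning instead of
    # one, so the cross-validation estimate averages over more selections
    bal = Balancer(attr='targets', count=10, limit='partitions',
                   apply_selection=True)

(then use 'bal' in place of the count=1 Balancer inside the ChainNode)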
> Also, I would still like to know what the individual partitions look like.
Something like

    [pds.sa.partitions for pds in partitioner.generate(ds)]

? But since apply_selection=True, you would see the datasets already
subsampled. If you use apply_selection=False, I think this should work:

    [pds.sa.balanced_set for pds in partitioner.generate(ds)]
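For instance, a quick sketch (assuming the Balancer inside 'npart' was
created with apply_selection=False, so each generated dataset carries a
'balanced_set' samples attribute):

    for i, pds in enumerate(npart.generate(DS_noisy)):
        # sa.partitions marks training (1) vs. testing (2) samples;
        # sa.balanced_set is the boolean mask of samples the Balancer
        # would have kept had apply_selection been True
        print("split %d" % i)
        print(pds.sa.partitions)
        print(pds.sa.balanced_set)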
--
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik