[pymvpa] custom partitioner
Yaroslav Halchenko
debian at onerussian.com
Wed May 18 02:26:11 UTC 2016
On Tue, 17 May 2016, Wolfgang Pauli wrote:
> Hi,
> I think I just had a shocking revelation. I tried to do cross validation
> with a custom partitioner like so:
> splt_rule = [([0,1],[6,7]),([1,2],[4,7]),([2,3],[4,5]),([3,0],[5,6])]
> partitioner = CustomPartitioner(splitrule=splt_rule, attr='chunks')
> For example, I thought the classifier would be trained on chunks 0 and 1,
> and tested on 6 and 7 during cross-validation. However, when I
> actually generated the partitions and looked at which chunks are in each
> partition, I found that the partitioner would actually create three
> partitions: two as specified by the split rule, and one containing the
> remaining chunks.
> I.e. instead of getting e.g. [0,1],[6,7], I would get ([2, 3, 4,
> 5],[0,1],[6,7]).
> Is this correct? How can I keep it from creating that third partition with
> the remaining items?
Well, in general partitioners are implemented to be fast and to have no
memory impact. To that end they only assign the partitioning as a new
sample attribute and do not split the dataset into partitions. That
job is done later by a Splitter, e.g. within CrossValidation. Within
CrossValidation, the splitter cares only about the partitions labeled 1
(for training) and 2 (for testing); it ignores the others.
If you really need to select partitions 1 and 2 right away, I guess you
could just use something like
partitioned_ds.select(partitions=[1, 2])
?
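
To make the labeling idea above concrete, here is a minimal sketch in plain
Python. It is not PyMVPA code: assign_partitions and select_chunks are
hypothetical helpers, and the label 0 for chunks that appear in neither list
is an assumption of this sketch. It only illustrates how every sample gets a
partition label while only labels 1 and 2 are picked up for training and
testing.

```python
# Illustrative sketch only; this does not use or reproduce PyMVPA internals.
# assign_partitions and select_chunks are hypothetical helper names.

def assign_partitions(chunks, train_chunks, test_chunks):
    """Label each sample by its chunk: 1 = training, 2 = testing, 0 = unused."""
    labels = []
    for chunk in chunks:
        if chunk in train_chunks:
            labels.append(1)
        elif chunk in test_chunks:
            labels.append(2)
        else:
            labels.append(0)  # assumed label for chunks in neither list
    return labels


def select_chunks(chunks, labels, wanted):
    """Return the sorted chunks whose partition label is in `wanted`."""
    return sorted({c for c, label in zip(chunks, labels) if label in wanted})


# One sample per chunk keeps the example small.
chunks = [0, 1, 2, 3, 4, 5, 6, 7]
split_rule = [([0, 1], [6, 7]), ([1, 2], [4, 7]),
              ([2, 3], [4, 5]), ([3, 0], [5, 6])]

folds = []
for train, test in split_rule:
    labels = assign_partitions(chunks, train, test)
    folds.append((
        select_chunks(chunks, labels, {1}),  # used for training
        select_chunks(chunks, labels, {2}),  # used for testing
        select_chunks(chunks, labels, {0}),  # labeled, but never selected
    ))

for training, testing, ignored in folds:
    print("train:", training, "test:", testing, "ignored:", ignored)
```

The third group in each fold corresponds to the "remaining chunks" from the
question: those samples carry a label, but a consumer that only asks for
labels 1 and 2 never touches them.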
--
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik