[pymvpa] zscore / ZScoreMapper: How to transform both datasets of a split equally

Thorsten Kranz thorstenkranz at googlemail.com
Fri Feb 25 21:26:20 UTC 2011


Hi Yarik,

actually your post showed me something new (as almost all your replies
do). I wasn't aware of the MappedClassifier class; very nice. But it
doesn't solve my problem.

My reason for looping manually is to be able to log progress
information and to get live feedback on whether the performance is
reasonable. Some callback mechanism would also help.
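
A minimal sketch of such a callback hook, in plain Python. It makes no
PyMVPA calls: run_folds, log_progress and majority_baseline are
hypothetical names, and the folds are toy label lists standing in for
whatever the splitter yields.

```python
def run_folds(folds, evaluate_fold, on_fold_done=None):
    """Evaluate each (train, test) fold; report progress via a callback."""
    folds = list(folds)
    accuracies = []
    for index, (train, test) in enumerate(folds, start=1):
        accuracies.append(evaluate_fold(train, test))
        if on_fold_done is not None:
            running_mean = sum(accuracies) / len(accuracies)
            on_fold_done(index, len(folds), accuracies[-1], running_mean)
    return accuracies


def log_progress(index, total, accuracy, running_mean):
    """Example callback: print per-fold and running accuracy."""
    print("fold %d/%d: accuracy=%.2f, running mean=%.2f"
          % (index, total, accuracy, running_mean))


def majority_baseline(train, test):
    """Toy stand-in for train/predict: guess the most common training label."""
    guess = max(set(train), key=train.count)
    return sum(1 for label in test if label == guess) / len(test)


# Toy folds: each is (training labels, test labels).
folds = [(["a", "a", "b"], ["a", "b"]),
         (["b", "b", "a"], ["b", "b"])]
accuracies = run_folds(folds, majority_baseline, on_fold_done=log_progress)
```

In a real loop, evaluate_fold would wrap the mapper training,
classifier training and prediction for one split.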

The ChainMapper approach doesn't help me, though, as I want z-scoring
per voxel. I haven't seen such a parameter for the ZScoreMapper.
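
What I am after can be sketched with plain NumPy, assuming the samples
are available as a (samples x voxels) array: estimate mean and std per
voxel on the training part only, then apply the same parameters to both
parts. The function names are hypothetical, and the use of the
population std (ddof=0) is an assumption, not necessarily what PyMVPA's
zscore does.

```python
import numpy as np


def fit_zscore_params(samples):
    """Return per-voxel (per-column) mean and std of a 2D samples array."""
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=0)
    std = samples.std(axis=0)  # assumption: population std (ddof=0)
    std[std == 0] = 1.0        # constant voxels: avoid division by zero
    return mean, std


def apply_zscore(samples, mean, std):
    """Z-score samples with previously estimated per-voxel parameters."""
    return (np.asarray(samples, dtype=float) - mean) / std


# Toy data: 3 training and 2 test samples with 2 voxels each.
train = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
test = np.array([[3.0, 10.0], [7.0, 70.0]])

mean, std = fit_zscore_params(train)      # estimated on training data only
train_z = apply_zscore(train, mean, std)
test_z = apply_zscore(test, mean, std)    # same transformation as train
```

The training part ends up with zero mean and unit std per voxel; the
test part is shifted and scaled by exactly the same values.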

Greetings, Thorsten

2011/2/25 Yaroslav Halchenko <debian at onerussian.com>:
>>     # Here the two mapped datasets should be zscored equally, but with
>
> by 'equally' do you mean that mean/std should be computed on
> concatenation of two datasets?
>
> if it would be ok with you if mean/std gets computed based on training
> portion of the dataset, then why not simply construct
> MappedClassifier chaining your two mappers together? e.g. something like
> (not tested)
>
> clfr = MappedClassifier(SVM(),
>                        mapper=ChainMapper([MyCustomMapper(...),
>                                            ZScoreMapper(...)]))
>
> and then simply doing
>
> cvte = CrossValidatedTransferError(TransferError(clfr),
>                            NFoldSplitter(nperlabel="equal"),
>                            enable_states = ['confusion'])
>
> error = cvte(dataset)
> print cvte.states.confusion
>
> ?
>
> On Fri, 25 Feb 2011, Thorsten Kranz wrote:
>
>> Hi all,
>
>> I think this should be trivial, but maybe I'm not seeing the forest
>> for the trees.
>
>> I want to do a cross-validation. I do this manually, also involving a
>> custom mapper, like this (simplified example):
>
>> cm = ConfusionMatrix()
>> clfr = SVM()
>> for d1, d2 in NFoldSplitter(nperlabel="equal")(dataset):
>>     mcm = MyCustomMapper(...)
>>     mcm.train(d1)
>>     d1_mapped = mcm.forward(d1)
>>     d2_mapped = mcm.forward(d2)
>>     # Here the two mapped datasets should be zscored equally, but with
>>     # perchunk=False, pervoxel=True
>>     ???
>>     clfr.train(d1_mapped)
>>     cm.add(clfr.predict(d2_mapped.samples),d2_mapped.labels)
>
>> I don't see how to zscore both mapped datasets equally AND
>> pervoxel=True, perchunk=False. I would love to use the ZScoreMapper
>> for that, but it doesn't provide the "per*" arguments. And the method
>> zscore does accept parameters mean and std, but doesn't return them
>> when called so I could reuse them in a second call.
>
>> I guess I will write my own zscore-method, just copy-pasting it from
>> your source, but returning mean and std.
>
>> Or do you have another / better proposal?
>
>> Greetings,
>> Thorsten
>
>> _______________________________________________
>> Pkg-ExpPsy-PyMVPA mailing list
>> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
>> http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa
>
>
> --
> =------------------------------------------------------------------=
> Keep in touch                                     www.onerussian.com
> Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
>


