[pymvpa] Altering the weights of classes in binary SVM classifier

Michael Browning michael.browning at ndcn.ox.ac.uk
Thu Aug 13 15:48:14 UTC 2015


Super—thanks for the advice,

Mike

From: Pkg-ExpPsy-PyMVPA [mailto:pkg-exppsy-pymvpa-bounces+michael.browning=psych.ox.ac.uk at lists.alioth.debian.org] On Behalf Of Richard Dinga
Sent: 13 August 2015 15:31
To: pkg-exppsy-pymvpa at lists.alioth.debian.org
Subject: Re: [pymvpa] Altering the weights of classes in binary SVM classifier

> My sample is not balanced (there happens to have been 22 responders
> and 13 non-responders) and is not particularly large. I would like,
> if possible, to use all the data and adjust the classifier to the
> unbalanced set rather than selecting a subset of the responders.

You don't have to downsample. You can upsample by repeating balanced sampling n times, so that all of your data are used. There is a Balancer feature for this in PyMVPA.
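
To make the idea of repeated balanced sampling concrete, here is a minimal sketch in plain Python. It is not PyMVPA's Balancer; the function name `balanced_draws` and its parameters are made up for illustration, and the class counts (22 responders, 13 non-responders) are taken from the question above.

```python
import random


def balanced_draws(labels, n_repeats, seed=0):
    """Yield n_repeats lists of sample indices with equal counts per class.

    Hypothetical helper for illustration only, not a PyMVPA function.
    """
    rng = random.Random(seed)
    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    # Every draw takes as many samples per class as the smallest class has.
    n_min = min(len(idxs) for idxs in by_class.values())
    for _ in range(n_repeats):
        draw = []
        for idxs in by_class.values():
            draw.extend(rng.sample(idxs, n_min))
        yield sorted(draw)


# 22 responders (label 1) and 13 non-responders (label 0), as in the question.
labels = [1] * 22 + [0] * 13
draws = list(balanced_draws(labels, n_repeats=20))
```

Each draw contains 13 samples of each class, and a different random subset of the responders is picked every time, so with enough repetitions every responder is very likely to be used at least once. Results would then be averaged over the draws.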

> I've seen recommendations for SVMs in unbalanced data suggesting that
> the weights of the outcome can be adjusted to reflect the sample size
> (essentially the weights of each class can be set as 1/(total number in class)).

Yes. For classifiers that output probabilities you can also move the decision threshold: instead of predicting class A when the probability of A is > 0.5, you predict it when the probability of A is > (number of A) / (total number). I know you can do this with the Gaussian process classifier, but I don't know about others.
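
The threshold rule above can be sketched as follows. This is only an illustration of the arithmetic; the function names are hypothetical, and it assumes class A is the responder class with the counts from the question (22 of 35).

```python
def prior_threshold(n_a, n_total):
    """Decision threshold equal to the observed frequency of class A."""
    return float(n_a) / n_total


def predict_a(prob_a, threshold):
    """Predict class A only if its estimated probability exceeds the threshold."""
    return prob_a > threshold


# Assumed counts from the question: 22 responders out of 35 subjects.
threshold = prior_threshold(22, 35)

# A sample with p(A) = 0.6 would be labelled A under the default 0.5 cut-off,
# but not under the frequency-based threshold.
label_default = predict_a(0.6, 0.5)
label_shifted = predict_a(0.6, threshold)
```

With these counts the threshold is 22/35, so the majority class has to be somewhat more probable than chance before it is predicted.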

> I've tried to do this in pyMVPA using the following code:
> wts=[ 1/numnonresp, 1/numresp]
> wts_labels=[0,1]
> clf = LinearCSVMC(weight=wts, weight_label=wts_labels)

> I then embed the classifier in a crossvalidation call which includes a feature selector.

> The code runs without error but the performance of the classifier
> does not alter (at all) regardless of the weights I use (e.g. using
> weights of [0 100000000000] or whatever). I'm concerned that I have
> not set this up correctly, and that the weights are not being
> incorporated into the SVM.

It didn't work for me either. You can try the implementation from scikit-learn through the PyMVPA sklearn adaptor. There you can just set class_weight to auto, and it should adjust the weights automatically, inversely proportional to the class frequencies.
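
For reference, here is a small sketch of how such weights can be computed by hand, in plain Python with hypothetical function names. The first function is the 1/(number in class) rule from the question. The second follows the formula scikit-learn documents for its "balanced" class_weight mode in later releases, n_samples / (n_classes * n_in_class); whether the older "auto" setting gives exactly the same values is not something this sketch assumes. Note also that if the counts are integers and the code runs under Python 2, an expression like `1/numnonresp` evaluates to 0, so explicit float division is used here. Whether that played any role in the behaviour described above is not known.

```python
def inverse_frequency_weights(counts):
    """Map class label -> 1 / (number of samples in that class)."""
    # 1.0 forces float division, also under Python 2.
    return {label: 1.0 / n for label, n in counts.items()}


def balanced_weights(counts):
    """Map class label -> n_samples / (n_classes * n_in_class)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: float(n_samples) / (n_classes * n)
            for label, n in counts.items()}


# Counts from the question: 13 non-responders (0) and 22 responders (1).
counts = {0: 13, 1: 22}
inv = inverse_frequency_weights(counts)
bal = balanced_weights(counts)
```

Both schemes give the smaller class the larger weight, and they differ only by a constant factor, so they express the same relative penalty between the two classes.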

Best wishes,
Richard