[pymvpa] Bad confusion matrix using RBF kernel SVM CrossValidation

Yaroslav Halchenko debian at onerussian.com
Sun Apr 5 16:29:05 UTC 2015


On Tue, 31 Mar 2015, gal star wrote:

> Hi all,
> I'm performing binary classification.
> I'm using SVM as classifier with RBF kernel using Balancer.
> Training stats get 100% accuracy.

> Though, the confusion matrix results for different C and gamma are either:
> [[248 216]
>  [  0  36]]
> Or:
> [[ 90 136]
>  [158 116]]

> I don't get how the second matrix could happen, and whether it's because of
> the data's nature or something is wrong with the classifier.

It is less likely to be a classifier problem -- so it is indeed more
likely due to the nature of the data, about which we know nothing (the
output of fds.summary() usually gives at least some hints).
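
As a rough stand-in for the kind of per-target, per-chunk counts such a summary reports, here is a minimal sketch in plain Python. The `targets` and `chunks` lists are made up for illustration, and `counts_per_chunk` is a hypothetical helper, not part of PyMVPA; substitute the real `attr.targets` and `attr.chunks`.

```python
from collections import Counter

# HYPOTHETICAL labels for illustration only; replace with the real
# attr.targets and attr.chunks of the dataset.
targets = ["A", "A", "B", "A", "baseline", "B", "A", "baseline"]
chunks = [0, 0, 0, 1, 1, 1, 1, 0]


def counts_per_chunk(targets, chunks):
    """Count how many samples of each target fall into each chunk."""
    table = {}
    for target, chunk in zip(targets, chunks):
        table.setdefault(chunk, Counter())[target] += 1
    return table


overall = Counter(targets)
per_chunk = counts_per_chunk(targets, chunks)

print("overall: %s" % dict(overall))
for chunk in sorted(per_chunk):
    print("chunk %s: %s" % (chunk, dict(per_chunk[chunk])))
```

Large differences between the counts of the two classes, overall or within single chunks, are the first thing to look for.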

> Do you know what's going on (what results in the second matrix)?
> Could it be that the results are backwards somehow?
> And how can I further understand if it's the data which is bad or something
> else?

Strong "anti-learning" can happen for a variety of reasons.  There have
been a number of discussions on the list in the past.  In your case I can
already see (since you use Balancer) that your conditions are not balanced
-- knowing more about the data would help to give a more informative
answer.
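
To put numbers on the two quoted matrices, here is a small sketch in plain Python that computes the overall accuracy and the per-class hit rate for each. It assumes that rows are predicted labels and columns are true labels; that layout is an assumption here, although the column totals being identical in both matrices (248 and 252) is consistent with columns holding the true labels. `summarize` is a hypothetical helper written for this illustration.

```python
# The two matrices quoted in the original post.
# ASSUMPTION: rows are predicted labels, columns are true labels.
matrices = {
    "first": [[248, 216], [0, 36]],
    "second": [[90, 136], [158, 116]],
}


def summarize(matrix):
    """Return (overall accuracy, per-class hit rates, true-class totals)."""
    n_classes = len(matrix)
    # Column j sums over all predictions made for samples of true class j.
    col_totals = [sum(row[j] for row in matrix) for j in range(n_classes)]
    total = sum(col_totals)
    correct = sum(matrix[i][i] for i in range(n_classes))
    hit_rates = [matrix[j][j] / float(col_totals[j]) for j in range(n_classes)]
    return correct / float(total), hit_rates, col_totals


results = {name: summarize(m) for name, m in matrices.items()}
for name in ("first", "second"):
    accuracy, hit_rates, totals = results[name]
    print("%s: accuracy=%.3f hit rates=%s true counts=%s"
          % (name, accuracy, [round(r, 3) for r in hit_rates], totals))
```

Under that assumption, the first matrix corresponds to a classifier that assigns nearly every sample to the first class (accuracy 0.568), while in the second both classes are labeled correctly less than half of the time (accuracy 0.412), which is the below-chance pattern referred to as anti-learning. The overall accuracy does not depend on the row/column convention, since it uses only the diagonal; the per-class rates do.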

> My code looks as follows:
> >> attr = SampleAttributes(os.path.join(source, map_name))
> >> fds = fmri_dataset(samples=os.path.join(source, img_name),
>                       targets=attr.targets, chunks=attr.chunks)
> >> zscore(fds, param_est=('targets', ['baseline']))
> >> sens = SensitivityBasedFeatureSelection(OneWayAnova(),
>              FixedNElementsTailSelector(1000, tail='upper', mode='select'))
> >> clf = FeatureSelectionClassifier(
>              SVM(kernel=RbfSVMKernel(gamma=0.001), svm_impl='C_SVC', C=10000),
>              sens)
> >> cv = CrossValidation(clf,
>              ChainNode([NFoldPartitioner(),
>                         Balancer(attr='targets', count=4, limit='partitions',
>                                  apply_selection=True)],
>                        space='partitions'),
>              enable_ca=['stats'])
> >> err = cv(fds)
> >> print cv.ca.stats.matrix

-- 
Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Research Scientist,            Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        


