[pymvpa] Question about classifiers

Yaroslav Halchenko debian at onerussian.com
Mon Mar 30 01:09:30 UTC 2015

On Sun, 29 Mar 2015, Serin Atiani, Dr wrote:

> Hello, 

> I am doing a first-pass analysis of my data using PyMVPA. When I use an SVM classifier, which I think theoretically makes more sense for my data, I get a strange cross-validation confusion matrix: one row has high numbers, and the rest are mostly zeros or ones. I train the classifier on 17 different classes, and this is an example of the cross-validation confusion matrix I get:

> [[ 2  0  0  0  0  0  0  0  1  0  1  0  1  0  0  0  0]
>  [16 17 17 16 16 16 17 17 17 17 17 15 16 16 16 16 16]

Multiclass SVM performs pair-wise classifications and then, for each new
sample, votes among all those pair-wise results.  Ties (two or more
classes with an equal number of "votes") are not broken randomly; they
all fall into a single class.  If there is no clear signal for the
classes, such ties are quite common.  It might also be that your 2nd
class is somewhat special here ... is it?  Maybe it is the only class
that differs from the others in any way?  It seems that your classes are
balanced, with 20 samples each, but is that also the case across runs?
What is the output of print dataset.summary() for your dataset?
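
To illustrate the voting and tie behavior described above, here is a
minimal sketch in plain Python.  It is not the actual SVM backend code:
ovo_predict and cyclic_winner are hypothetical names, and which class
absorbs the ties in a real implementation depends on that implementation
and on the order of the labels.

```python
from itertools import combinations


def ovo_predict(pairwise_winner, n_classes):
    """One-vs-one voting: every class pair casts one vote for its winner."""
    votes = [0] * n_classes
    for a, b in combinations(range(n_classes), 2):
        votes[pairwise_winner(a, b)] += 1
    # max() returns the first maximal element, so a tie is never broken
    # randomly: it always resolves to the lowest class index.
    best = max(range(n_classes), key=lambda c: votes[c])
    return best, votes


def cyclic_winner(a, b):
    # Hypothetical "no clear signal" case: 1 beats 0, 2 beats 1, 0 beats 2.
    return {(0, 1): 1, (0, 2): 0, (1, 2): 2}[(a, b)]


best, votes = ovo_predict(cyclic_winner, 3)
print(best, votes)  # every class gets one vote; the tie resolves to class 0
```

With many classes and weak signal, such ties occur for many samples, and
all of them end up in the same class, which shows up as one heavily
populated row in the confusion matrix.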

As Nick suggested, you might have "better" luck with a classifier that
doesn't exhibit this behavior, e.g. SMLR or GNB -- what does your
confusion matrix look like with those?

> Reducing the number of features makes things a bit better, but I still get one row with large numbers. I also tried grouping my classes and training the SVM classifier on the two most distinguishable ones. Nearest neighbour gives 80% accuracy; with SVM it is slightly above chance, again with a confusion matrix that looks like this:

> [[ 5  0]
>  [15 20]]

> It doesn't look right. Does anybody have any thoughts about this?
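
For reference, the two-class matrix quoted above can be summarized with a
few lines of plain Python.  This is only a sketch: it assumes the rows
are predictions and the columns are targets, and summarize is a made-up
helper name, not a PyMVPA function.

```python
def summarize(matrix):
    """Return (samples per target class, overall accuracy).

    Assumes a square matrix with predictions in rows and targets in columns.
    """
    n = len(matrix)
    # Column sums give the number of samples of each true class.
    per_target = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    # The diagonal holds the correctly classified samples.
    correct = sum(matrix[i][i] for i in range(n))
    return per_target, correct / sum(per_target)


per_target, accuracy = summarize([[5, 0], [15, 20]])
print(per_target, accuracy)  # [20, 20] 0.625
```

Under that assumption both classes have 20 samples, and 15 of the 20
samples of the first class are assigned to the second one, which is the
same "everything falls into one class" pattern as in the 17-class case.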

Again, knowing more about your analysis/paradigm and preprocessing (as
Nick followed up) would help us to help you.

Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Research Scientist,            Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        
