[pymvpa] help understanding 1-dimension classification differences
David V. Smith
david.v.smith at duke.edu
Fri Jan 27 21:38:41 UTC 2012
Hi,
As a simple test, I was curious to see how much better multivariate classification (two or more dimensions/features) would perform compared to univariate classification (one dimension/feature). In the univariate case, can someone help me understand why LinearNuSVMC would differ from RbfNuSVMC?
CV: 79.28% (RbfNuSVMC)
CV: 66.42% (LinearNuSVMC)
We know from a logistic regression that this particular feature can predict our two conditions with ~80% accuracy. If the SVM classifier has only a single dimension to work with, should the linear and RBF kernels differ this much? I was under the impression that, given a single dimension, both methods would only find the best point on that dimension that discriminates the classes.
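To make that intuition concrete, here is a minimal sketch in plain NumPy (synthetic data and a function name of my own, not our dataset or PyMVPA code): with one feature, any linear decision rule reduces to a single threshold, so its ceiling accuracy can be found by simply scanning cut points along that dimension.

```python
import numpy as np

def best_threshold_accuracy(x, y):
    """Best accuracy achievable by any 1-D threshold rule x > t (or x < t)."""
    order = np.argsort(x)
    x_sorted = x[order]
    # candidate thresholds: midpoints between consecutive sorted values
    cuts = (x_sorted[1:] + x_sorted[:-1]) / 2.0
    # trivial all-one-class rules set the floor (majority-class accuracy)
    best = max(np.mean(y), 1 - np.mean(y))
    for t in cuts:
        pred = (x > t).astype(int)
        acc = np.mean(pred == y)
        best = max(best, acc, 1 - acc)  # 1 - acc covers the flipped rule
    return best

# perfectly separable 1-D data -> the threshold rule is perfect
x = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
y = np.array([0, 0, 0, 1, 1, 1])
print(best_threshold_accuracy(x, y))  # -> 1.0
```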
Details on the dataset are printed below:
Dataset / float64 140 x 1
uniq: 140 chunks, 2 labels
stats: mean=0.256292 std=0.231866 var=0.0537616 min=0 max=1
No details due to large number of labels or chunks. Increase maxc and maxl if desired.
Summary per label across chunks:
label  mean   std    min  max  #chunks
  0    0.443  0.497   0    1     62
  1    0.557  0.497   0    1     78
To account for the unbalanced labels, I'm using nperlabel='equal' in my splitter.
cv = CrossValidatedTransferError(
    TransferError(clf),
    NFoldSplitter(nperlabel='equal'),
    enable_states=['confusion'])
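For reference, here is a plain-NumPy sketch of what I understand nperlabel='equal' to do (the function name and details are my own illustration, not PyMVPA internals): each class is subsampled down to the size of the smallest class, so chance level becomes an honest 50%.

```python
import numpy as np

def balance_labels(labels, rng=None):
    """Return indices selecting an equal number of samples per label.

    Illustrative only: subsamples each class, without replacement,
    down to the size of the smallest class.
    """
    rng = np.random.default_rng(rng)
    classes, counts = np.unique(labels, return_counts=True)
    n = counts.min()
    idx = np.concatenate([
        rng.choice(np.flatnonzero(labels == c), size=n, replace=False)
        for c in classes
    ])
    return np.sort(idx)

labels = np.array([0] * 62 + [1] * 78)   # label counts from the summary above
idx = balance_labels(labels, rng=0)
print(len(idx))                          # -> 124 (62 per class)
```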
Thanks!
David