[pymvpa] help understanding 1-dimension classification differences
David V. Smith
david.v.smith at duke.edu
Fri Jan 27 21:38:41 UTC 2012
Hi,
As a simple test, I was curious to see how much better multivariate classification (two or more dimensions/features) would perform compared to univariate classification (one dimension/feature). In the univariate case, can someone help me understand why LinearNuSVMC would differ from RbfNuSVMC?
CV: 79.28% (RbfNuSVMC)
CV: 66.42% (LinearNuSVMC)
We know from a logistic regression that this particular feature can predict our two conditions with ~80% accuracy. If the SVM classifier has only a single dimension to work with, should the linear and RBF kernels differ this much? I was under the impression that, given a single dimension, both methods would only find the best point on that dimension that discriminates the classes.
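To make that intuition concrete, here is a minimal sketch in plain NumPy (synthetic data and a function name of my own, not our dataset or PyMVPA code): with one feature, any linear decision rule reduces to a single threshold, so its ceiling accuracy can be found by simply scanning cut points along that dimension.

```python
import numpy as np

def best_threshold_accuracy(x, y):
    """Best accuracy achievable by any 1-D threshold rule x > t (or x < t)."""
    order = np.argsort(x)
    x_sorted = x[order]
    # candidate thresholds: midpoints between consecutive sorted values
    cuts = (x_sorted[1:] + x_sorted[:-1]) / 2.0
    # trivial all-one-class rules set the floor (majority-class accuracy)
    best = max(np.mean(y), 1 - np.mean(y))
    for t in cuts:
        pred = (x > t).astype(int)
        acc = np.mean(pred == y)
        best = max(best, acc, 1 - acc)  # 1 - acc covers the flipped rule
    return best

# perfectly separable 1-D data -> the threshold rule is perfect
x = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
y = np.array([0, 0, 0, 1, 1, 1])
print(best_threshold_accuracy(x, y))  # -> 1.0
```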
Details on the dataset are printed below:
Dataset / float64 140 x 1
uniq: 140 chunks, 2 labels
stats: mean=0.256292 std=0.231866 var=0.0537616 min=0 max=1
No details due to large number of labels or chunks. Increase maxc and maxl if desired.
Summary per label across chunks:
label  mean   std    min  max  #chunks
  0    0.443  0.497   0    1     62
  1    0.557  0.497   0    1     78
To account for the unbalanced labels, I'm using nperlabel='equal' in my splitter.
cv = CrossValidatedTransferError(
    TransferError(clf),
    NFoldSplitter(nperlabel='equal'),
    enable_states=['confusion'])
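For reference, here is a plain-NumPy sketch of what I understand nperlabel='equal' to do (the function name and details are my own illustration, not PyMVPA internals): each class is subsampled down to the size of the smallest class, so chance level becomes an honest 50%.

```python
import numpy as np

def balance_labels(labels, rng=None):
    """Return indices selecting an equal number of samples per label.

    Illustrative only: subsamples each class, without replacement,
    down to the size of the smallest class.
    """
    rng = np.random.default_rng(rng)
    classes, counts = np.unique(labels, return_counts=True)
    n = counts.min()
    idx = np.concatenate([
        rng.choice(np.flatnonzero(labels == c), size=n, replace=False)
        for c in classes
    ])
    return np.sort(idx)

labels = np.array([0] * 62 + [1] * 78)   # label counts from the summary above
idx = balance_labels(labels, rng=0)
print(len(idx))                          # -> 124 (62 per class)
```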
Thanks!
David