[pymvpa] regression

Fri Dec 3 03:57:40 UTC 2010

Hi

I am trying to find a relation between EEG and mental workload. I have
five levels of workload (1 to 5), so I was thinking of using a
regression: SMLR(regression=True).

Btw, is it called a regressor, a regresser, or some other name?

Does it make sense to do a regression when the labels are discrete? The
regression example uses a sinusoidal function, so its label values are
continuous. Would it help the regression to add a small amount of noise
to the labels to give a larger set of values?

Is it better to ensure that all labels have the same number of samples
in the training dataset? I manually coded a function to split my
dataset into training and testing sets, randomly picking 70% of the
samples for the training set while making sure that all labels have the
same number of samples in the training set. Is there a way to achieve
that with the existing splitters in PyMVPA?
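
To make the kind of split described above concrete, here is a minimal sketch in plain NumPy. It is not PyMVPA API and not necessarily the exact code used here: the function name balanced_split and its arguments are assumptions made for illustration, and the label counts in the usage lines are made up.

```python
import numpy as np

def balanced_split(labels, train_fraction=0.7, rng=None):
    """Return (train_idx, test_idx) with the same number of training
    samples for every label. Hypothetical helper, not part of PyMVPA."""
    rng = np.random.default_rng(rng)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    # Share the overall training budget equally between labels, but never
    # ask for more samples than the rarest label can provide.
    n_per_label = min(int(train_fraction * len(labels) / len(classes)),
                      counts.min())
    train_idx = []
    for c in classes:
        idx = np.flatnonzero(labels == c)
        train_idx.extend(rng.choice(idx, size=n_per_label, replace=False))
    train_idx = np.sort(np.array(train_idx))
    # Everything not drawn for training goes to the testing set.
    test_idx = np.setdiff1d(np.arange(len(labels)), train_idx)
    return train_idx, test_idx

# Made-up example: five workload levels with unequal sample counts.
labels = np.repeat([1, 2, 3, 4, 5], [30, 28, 26, 24, 22])
train_idx, test_idx = balanced_split(labels, rng=0)
```

Note that with strongly unbalanced labels the rarest label can end up with few or no samples in the testing set, and the testing set itself is not balanced.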

How should I measure performance? I am currently computing the RMSE.
For that I do a 5-fold pseudo cross-validation: for each fold I pick
at random 70% of the samples for the training set, with the
same number of samples per label, and the rest of the samples go to
the testing set. I then train, predict, and measure the RMSE. Can I
do that using PyMVPA's cross-validation facility? I know how to do it
with a classifier, but I am not sure what the result is with a
regressor. Besides, I could also use the correlation coefficient
rather than the RMSE. Any comments?
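
To illustrate the difference between the two measures, here is a small NumPy sketch. The helper names rmse and pearson_r are assumptions for illustration, not PyMVPA functions, and the numbers are toy values rather than real results.

```python
import numpy as np

def rmse(targets, predictions):
    """Root mean squared error between targets and predictions."""
    targets = np.asarray(targets, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    return float(np.sqrt(np.mean((predictions - targets) ** 2)))

def pearson_r(targets, predictions):
    """Pearson correlation coefficient between targets and predictions."""
    return float(np.corrcoef(targets, predictions)[0, 1])

# Toy predictions that follow the labels perfectly but are offset by 2.
targets = np.array([1, 2, 3, 4, 5])
shifted = targets + 2.0

error = rmse(targets, shifted)       # penalises the constant offset
corr = pearson_r(targets, shifted)   # insensitive to offset and scale
print(error, corr)
```

In this toy case the RMSE is 2 while the correlation is essentially 1, so the correlation coefficient rewards getting the ordering and linear trend right, whereas the RMSE also punishes a bias or wrong scale in the predicted workload level.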

As for SMLR, I use the standard parameters, lm=0.1, which yields
fairly little sparsification (a couple of the weights are zero, out of
80). Is there a way to automatically tune lm (and other parameters) to
achieve the best performance?
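
One generic way to tune such a penalty is a grid search scored by cross-validated RMSE. The sketch below shows the idea in plain NumPy under loud assumptions: it uses ridge regression as a stand-in (it is not SMLR and does not produce sparse weights), the names ridge_fit and tune_lambda are hypothetical, and the data are synthetic.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression weights (stand-in model, not SMLR)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def tune_lambda(X, y, candidates, n_folds=5, seed=0):
    """Return the candidate penalty with the lowest mean held-out RMSE."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    mean_rmse = []
    for lam in candidates:
        errors = []
        for k in range(n_folds):
            test = folds[k]
            train = np.concatenate(
                [folds[j] for j in range(n_folds) if j != k])
            w = ridge_fit(X[train], y[train], lam)
            errors.append(np.sqrt(np.mean((X[test] @ w - y[test]) ** 2)))
        mean_rmse.append(float(np.mean(errors)))
    best = int(np.argmin(mean_rmse))
    return candidates[best], mean_rmse

# Synthetic data: 200 samples, 80 features, only 5 of them informative.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 80))
true_w = np.zeros(80)
true_w[:5] = 1.0
y = X @ true_w + rng.normal(scale=0.5, size=200)

candidates = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam, scores = tune_lambda(X, y, candidates)
```

If the selected value is then used to report performance, the selection should happen inside the training portion of an outer cross-validation; otherwise the reported error is optimistically biased.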

That's a lot of questions...

Regards
Brice
