[pymvpa] SVM classification of data with temporal correlation

Thu Dec 3 21:17:45 UTC 2009

Vadim Axel wrote:
> Hi,
> ...
> What do you think about this issue? Does ignoring temporal correlation 
> may just decrease the prediction rate or it casts doubt in the results 
> in general?

An SVM will underperform on non-iid data because it does not exploit
temporal dependencies. It underperforms in the sense that a classifier
exploiting them could do better. As far as I remember, some
generalization bounds do not hold for SVMs when the data are not iid.
Nevertheless, it is pretty common to apply classifiers that assume iid
data to data that are not iid.

As far as I know, there are several schemes to minimize the impact of
the temporal dependencies between fMRI volumes. Averaging over blocks
is one of them. For example, in [0] they use the beta values of each
trial as regressors instead of the BOLD signal. Many other strategies
can be conceived.
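
To make the block-averaging idea concrete, here is a minimal sketch in
plain Python. It is not PyMVPA code: average_blocks and the
samples/labels/blocks layout are names I made up for illustration, and
I assume every volume in a block carries the same label.

```python
from collections import defaultdict


def average_blocks(samples, labels, blocks):
    """Collapse all volumes of one block into a single averaged sample.

    samples: list of feature vectors, one per volume
    labels:  condition label of each volume
    blocks:  block id of each volume (hypothetical bookkeeping)
    """
    grouped = defaultdict(list)
    block_label = {}
    for vec, lab, blk in zip(samples, labels, blocks):
        grouped[blk].append(vec)
        # Assumption: a block never mixes conditions.
        if block_label.setdefault(blk, lab) != lab:
            raise ValueError("block %r mixes labels" % (blk,))

    averaged, avg_labels, avg_blocks = [], [], []
    for blk in sorted(grouped):
        vecs = grouped[blk]
        # Feature-wise mean over the volumes of this block.
        averaged.append([sum(col) / len(vecs) for col in zip(*vecs)])
        avg_labels.append(block_label[blk])
        avg_blocks.append(blk)
    return averaged, avg_labels, avg_blocks


# Toy data: two blocks of two volumes each, two features per volume.
samples = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
labels = ["face", "face", "house", "house"]
blocks = [0, 0, 1, 1]
avg, avg_labels, avg_blocks = average_blocks(samples, labels, blocks)
```

After this step there is one sample per block, so the strongly
correlated volumes within a block no longer count as separate samples.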

As a basic rule, just make sure that highly temporally correlated
samples do not end up on both sides of the train/test split. In your
case that could mean never splitting samples from the same block
between the train and test sets. PyMVPA has the concept of a "chunk"
for that: during cross-validation, all samples from the same chunk go
either to the train set or to the test set. This helps, for example,
when you want to test the error rate of your binary classifier with
the binomial test.
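
The chunk rule and the binomial test can be sketched roughly as
follows. This is plain Python, not the PyMVPA API: leave_one_chunk_out
and binomial_p_value are hypothetical helpers, the chunk ids are toy
values, and the counts 15 and 20 are made up for the example.

```python
from math import comb


def leave_one_chunk_out(chunks):
    """Yield (train_idx, test_idx) pairs; no chunk is ever split."""
    for held_out in sorted(set(chunks)):
        train = [i for i, c in enumerate(chunks) if c != held_out]
        test = [i for i, c in enumerate(chunks) if c == held_out]
        yield train, test


def binomial_p_value(n_correct, n_total, chance=0.5):
    """One-sided P(X >= n_correct) for X ~ Binomial(n_total, chance)."""
    return sum(
        comb(n_total, k) * chance ** k * (1 - chance) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


# Toy example: six samples in three chunks of two samples each.
chunks = [0, 0, 1, 1, 2, 2]
splits = list(leave_one_chunk_out(chunks))

# Made-up outcome: 15 correct predictions out of 20 test samples.
p = binomial_p_value(15, 20)
```

Each fold holds out one whole chunk, so samples from the same block
never sit in both train and test. The p-value is the probability of
getting at least that many correct predictions at chance level.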

HTH,

E.

[0]: http://www.citeulike.org/user/librain/article/3140982