[pymvpa] cross-validation

Jonas Kaplan jtkaplan at usc.edu
Mon Mar 15 17:46:50 UTC 2010


Hello, 

I'd like to pick the group's brain about an issue related to cross-validation. 

I just read a paper in Current Biology (Chadwick et al., 2010, Decoding Individual Episodic Memory Traces in the Hippocampus) in which the following cross-validation procedure was used: 

"We used a standard k-fold cross- validation testing regime [10] wherein k equalled the number of experimental trials, with the data from each trial set aside in turn as the test data, and the remaining data used as the training set (on each fold, the feature selection step was performed using only data from this training set)."

In other words, it seems to me that on each cross-validation fold the classifier is trained on all trials except one and then tested on that single held-out trial. Does this sort of approach make sense? It seems to me that with only one test trial per fold you cannot adequately assess the performance of any individual classifier. I suppose the idea is that, across all of the folds, you get a measure of how the classifiers perform in general. I'd like to know whether this is considered a reasonable approach, since I have a dataset with a small number of trials that might benefit from maximizing the number of training trials.
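For concreteness, here is a minimal sketch of that leave-one-trial-out scheme with feature selection redone inside each fold. I've written it with scikit-learn rather than PyMVPA just to show the structure of the loop; the trials-by-features array X, the label vector y, the number of retained features, and the SelectKBest/f_classif/LinearSVC choices are all placeholders, not necessarily what the paper used.

    import numpy as np
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import LeaveOneOut
    from sklearn.svm import LinearSVC

    def loto_accuracy(X, y, n_features=500):
        correct = []
        for train_idx, test_idx in LeaveOneOut().split(X):
            # Select features using the training trials only, so no
            # information from the held-out trial leaks into the fold.
            selector = SelectKBest(f_classif, k=n_features)
            selector.fit(X[train_idx], y[train_idx])
            clf = LinearSVC()
            clf.fit(selector.transform(X[train_idx]), y[train_idx])
            pred = clf.predict(selector.transform(X[test_idx]))
            # Each fold yields a single right/wrong outcome ...
            correct.append(pred[0] == y[test_idx][0])
        # ... so only the mean over all folds is interpretable
        # as an accuracy estimate.
        return np.mean(correct)

As the comments note, each fold produces only one binary outcome, which is exactly why the per-fold result is uninformative on its own and only the average across folds is meaningful.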

Thanks!

Jonas

----
Jonas Kaplan, Ph.D.
Research Assistant Professor
Brain & Creativity Institute
University of Southern California
