[pymvpa] cross-validation

Francisco Pereira francisco.pereira at gmail.com
Mon Mar 15 18:58:23 UTC 2010


Hi Jonas,

You are absolutely right that the classifiers trained in each fold are
different, and that you only get to see each one's performance on a
single trial. However, given that all of them share almost their
entire training set, the assumption is that the classifiers will be
very similar, and the results are reported as if they had come from a
single classifier tested on all k examples.
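To make that concrete, here is a rough sketch of how the k single-trial
predictions get pooled into one accuracy figure. It uses scikit-learn and
made-up data rather than PyMVPA, so treat it as an illustration only:

# Minimal leave-one-out sketch (scikit-learn, synthetic data, not PyMVPA):
# each fold trains its own classifier on k-1 trials, tests it on the one
# held-out trial, and the k single-trial predictions are pooled into a
# single accuracy figure.
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
X = rng.randn(20, 50)              # 20 trials, 50 features (synthetic)
y = np.repeat([0, 1], 10)          # two conditions

predictions = np.empty_like(y)
for train_idx, test_idx in LeaveOneOut().split(X):
    clf = LinearSVC().fit(X[train_idx], y[train_idx])   # a new classifier per fold
    predictions[test_idx] = clf.predict(X[test_idx])    # one prediction per fold

# reported as if one classifier had been tested on all k trials
print("pooled leave-one-out accuracy: %.2f" % np.mean(predictions == y))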

The simplest scenario is one where you have a training set (K1
examples) and a separate test set (K2 examples). Then the result you
get is an estimate of the error (how precise an estimate depends on
how large K2 is) of a classifier trained on *that specific training
set*.
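As a back-of-the-envelope illustration of how the precision depends on K2:
using the usual normal-approximation 95% interval (an assumption on my part,
nothing specific to the paper), the half-width of the error estimate shrinks
like 1/sqrt(K2):

# Rough precision of a hold-out error estimate: with K2 independent test
# examples and true error p, the normal-approximation 95% interval has
# half-width about 1.96 * sqrt(p * (1 - p) / K2).
import math

p = 0.25                                  # assumed true error rate, illustrative
for K2 in (10, 50, 200, 1000):
    half_width = 1.96 * math.sqrt(p * (1 - p) / K2)
    print("K2 = %4d  ->  error = %.2f +/- %.3f" % (K2, p, half_width))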

If you wanted to make a more general claim and could produce any
number of training sets of size K1, then what you would have would be
an estimate of the expected error of a classifier trained on K1
examples, i.e. of the learning algorithm at that training set size.
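Here is a sketch of what that more general claim would look like if you
really could draw fresh training sets at will. Again scikit-learn and toy
data; the make_data function below is just a hypothetical stand-in for
sampling from the true distribution, and the sizes are arbitrary:

# Average the test error of classifiers trained on many independent
# training sets of the same size K1.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.RandomState(1)

def make_data(n, n_features=50):
    """Draw n labelled examples from a toy two-class distribution."""
    y = rng.randint(0, 2, size=n)
    X = rng.randn(n, n_features) + y[:, None] * 0.5   # class-dependent shift
    return X, y

K1, K2, n_repeats = 18, 200, 100
errors = []
for _ in range(n_repeats):
    X_train, y_train = make_data(K1)       # a fresh training set of size K1
    X_test, y_test = make_data(K2)         # a large independent test set
    clf = LinearSVC().fit(X_train, y_train)
    errors.append(np.mean(clf.predict(X_test) != y_test))

# estimate of the expected error of the *algorithm* at training size K1
print("mean error over %d training sets: %.2f" % (n_repeats, np.mean(errors)))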

The situation here is something in between: there are multiple
training sets, all with k-1 examples, but they are not independent.
In practice, people overlook this and report the pooled result as if
it came from a single classifier, as described above.
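Just to put a number on how non-independent they are: any two
leave-one-out training sets share k-2 of their k-1 trials, e.g.

# The leave-one-out training sets overlap almost completely: any two folds
# share k - 2 of their k - 1 training trials.
k = 20                                        # number of trials (illustrative)
folds = [set(range(k)) - {i} for i in range(k)]
overlap = len(folds[0] & folds[1])            # = k - 2 for any pair
print("two folds share %d of %d training trials (%.0f%%)"
      % (overlap, k - 1, 100.0 * overlap / (k - 1)))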

If you are interested in this issue, there are two good references:

"A Study of Cross-Validation and Bootstrap for Accuracy Estimation and
Model Selection"
by Ron Kohavi
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.529

"Approximate Statistical Tests for Comparing Supervised Classification
Learning Algorithms"
by Tom Dietterich
http://web.engr.oregonstate.edu/~tgd/publications/nc-stats.ps.gz

I was also going to give caveats about the trials being too close
together, but Yarik has already beaten me to that :)

Francisco


On Mon, Mar 15, 2010 at 12:46 PM, Jonas Kaplan <jtkaplan at usc.edu> wrote:
> Hello,
> I'd like to pick the group's brain about an issue related to
> cross-validation.
> I just read a paper in Current Biology (Chadwick et al., 2010, Decoding
> Individual Episodic Memory Traces in the Hippocampus) in which the following
> cross-validation procedure was used:
> "We used a standard k-fold cross- validation testing regime [10] wherein k
> equalled the number of experimental trials, with the data from each trial
> set aside in turn as the test data, and the remaining data used as the
> training set (on each fold, the feature selection step was performed using
> only data from this training set)."
> In other words, it seems to me that on each cross-validation fold, training
> was performed on all trials except one, and then the classifier was tested on
> a single trial.  Does this sort of approach make sense?  It seems to me
> that with only one test for each classifier you are not adequately assessing
> the performance of any of the classifiers.   I suppose the idea is that
> across all of the folds you get a measure of how the classifiers work in
> general.  I'd like to know if this is considered a reasonable approach since
> I have a dataset with a small number of trials that might benefit from
> maximizing the number of training trials.
> Thanks!
> Jonas
> ----
> Jonas Kaplan, Ph.D.
> Research Assistant Professor
> Brain & Creativity Institute
> University of Southern California
>
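For what it's worth, here is a bare-bones version of the procedure quoted
above, with the feature selection refit inside each fold. This is my own
sketch with scikit-learn and synthetic data, not the paper's pipeline, and
the number of selected features is arbitrary:

# Leave-one-out with per-fold feature selection: on each fold, SelectKBest
# is fitted on the training trials only and then applied to the held-out
# trial, so no information from the test trial leaks into the selection.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import LeaveOneOut
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

rng = np.random.RandomState(2)
X = rng.randn(24, 500)                 # 24 trials, 500 voxels (synthetic)
y = np.tile([0, 1], 12)                # two conditions, alternating

predictions = np.empty_like(y)
for train_idx, test_idx in LeaveOneOut().split(X):
    # the pipeline refits the feature selection on this fold's training trials
    model = make_pipeline(SelectKBest(f_classif, k=50), LinearSVC())
    model.fit(X[train_idx], y[train_idx])
    predictions[test_idx] = model.predict(X[test_idx])

print("pooled accuracy with per-fold feature selection: %.2f"
      % np.mean(predictions == y))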


