[pymvpa] What is the value of using errorfx when using Cross validation?

Nick Oosterhof n.n.oosterhof at googlemail.com
Wed Feb 25 15:15:46 UTC 2015


On 25 Feb 2015, at 15:49, gal star <gal.star3051 at gmail.com> wrote:

> I am doing a k-fold cross validation on data according to the following:
> 1. I'm partitioning the data myself - set a train ('0' chunk) and test chunks ('1' chunk).
> 2. Using clf.train() and then clf.predict()
> 3. print the accuracy result and confusion matrix.
> And I'm repeating this k times (by running the script attached k times) […]
> The standard deviation among the accuracy results produced when using the CrossValidation class, and the standard deviation among
> accuracy results obtained the way I described, are different.
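The manual procedure described above amounts to a single fixed train/test split per run. A minimal, library-agnostic sketch of that workflow (plain numpy with a toy nearest-class-mean classifier standing in for clf.train()/clf.predict(); the data here are made up for illustration):

```python
import numpy as np

# Hypothetical toy dataset: 20 samples, 5 features, 2 classes, 2 chunks.
rng = np.random.RandomState(0)
X = rng.randn(20, 5)
y = np.repeat([0, 1], 10)
chunks = np.tile([0, 1], 10)  # chunk 0 = train, chunk 1 = test

def train(X, y):
    # Nearest-class-mean stand-in for a real classifier's train().
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict(model, X):
    # Assign each sample to the class with the closest mean.
    classes = sorted(model)
    dists = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes])
    return np.array(classes)[dists.argmin(axis=0)]

model = train(X[chunks == 0], y[chunks == 0])
pred = predict(model, X[chunks == 1])
accuracy = np.mean(pred == y[chunks == 1])
print(accuracy)  # one accuracy value from one fixed train/test split
```

Note that every run of this script uses the same split, so the spread of accuracies across runs measures something different from the fold-to-fold spread inside a proper k-fold scheme.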

- your script really is quite messy (lots of commented-out code, no documentation), which does not invite others to read and understand it. Furthermore, it does not actually allow others to reproduce the issue. For future reference, it is helpful to provide a minimal running example so that others can reproduce what you report.

- from what I understand from the script, you pass 'fold' as a parameter, but that parameter is never actually used in your 'manual' cross-validation. In the manual version you seem to always use chunk 0 for training and chunk 1 for testing.

- the n-fold partitioner with k folds trains on (k-1) chunks per fold, which is more training data than the single chunk you train on whenever k > 2. Training on more data generally leads to more stable results and thus a lower standard deviation across fold accuracies.
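To illustrate that last point, here is a library-agnostic sketch of n-fold cross-validation (plain numpy standing in for NFoldPartitioner plus CrossValidation, not PyMVPA itself; classifier and data are the same hypothetical stand-ins as above), where each fold trains on the other k-1 chunks:

```python
import numpy as np

rng = np.random.RandomState(1)
k = 5
# Hypothetical toy data: two roughly separable classes, k balanced chunks.
X = rng.randn(100, 5) + np.repeat([0, 1], 50)[:, None]
y = np.repeat([0, 1], 50)
chunks = np.tile(np.arange(k), 20)

def train(X, y):
    # Nearest-class-mean stand-in for a real classifier.
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict(model, X):
    classes = sorted(model)
    dists = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes])
    return np.array(classes)[dists.argmin(axis=0)]

# n-fold scheme: hold out one chunk at a time, train on the remaining k-1.
accuracies = []
for test_chunk in range(k):
    tr, te = chunks != test_chunk, chunks == test_chunk
    model = train(X[tr], y[tr])
    accuracies.append(np.mean(predict(model, X[te]) == y[te]))

print(np.mean(accuracies), np.std(accuracies))
```

Because each fold's training set pools k-1 chunks, the per-fold models vary less than models trained on a single chunk, which is what pushes the standard deviation of the fold accuracies down.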


More information about the Pkg-ExpPsy-PyMVPA mailing list