[pymvpa] cross-validation

Jonas Kaplan jtkaplan at usc.edu
Mon Mar 15 18:37:09 UTC 2010


Thanks for the reply...

On Mar 15, 2010, at 11:11 AM, Yaroslav Halchenko wrote:

> 
> The only problem I could see (reading the quote you gave, not the
> paper) depends on whether by "trial" they meant just 1 sample (BOLD
> volume) or an independent run (a chunk in PyMVPA terms)... if it is
> just 1 sample of 1 category, they could get a quite biased estimate
> if they have a relatively small number of trials overall and noisy
> data... moreover, if the 'trials' come from within the same scanning
> session, then it might be hard to believe significance testing that
> requires independence of trials (even permutation testing, strictly
> speaking)
> 

All trials (recall events) are acquired in one continuous scan, although they do report some control analyses to test for independence of the trials.  I believe each 'trial' that goes into the classifier is either a single BOLD volume or a combination of volumes from one recall event, although I can't quite figure this out from the paper.  Aside from the independence issue, though, do you see a problem with testing each classifier on only one trial?


>>   I'd like to know
>>   if this is considered a reasonable approach since I have a dataset with
>>   a small number of trials that might benefit from maximizing the number
>>   of training trials.
> how small is small? number of trials/chunks(sessions)/labels?

About 15 trials in each of 5 categories (labels), spread across 4 scanning sessions.  This experiment was not designed with MVPA in mind.  The trials are slow and well spaced from each other, though, so the question is: could we treat each trial as a separate chunk, as I believe the Current Biology paper has done?
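For concreteness, the scheme under discussion amounts to leave-one-trial-out cross-validation. Below is a minimal sketch in plain NumPy (not the PyMVPA API) of what treating each trial as its own chunk would mean: train on all remaining trials, test on the single held-out trial. The dimensions (75 trials, 5 labels) mirror the dataset described above, but the data, the feature count, and the nearest-mean classifier are all illustrative assumptions, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_features, n_labels = 75, 20, 5       # 15 trials per label
X = rng.standard_normal((n_trials, n_features))  # one pattern per trial (fake data)
y = np.repeat(np.arange(n_labels), n_trials // n_labels)

def nearest_mean_predict(X_train, y_train, x_test):
    """Assign the label whose training-set mean pattern is closest."""
    labels = np.unique(y_train)
    means = np.stack([X_train[y_train == c].mean(axis=0) for c in labels])
    return labels[np.argmin(np.linalg.norm(means - x_test, axis=1))]

# Leave-one-trial-out: each fold trains on 74 trials, tests on 1.
hits = 0
for i in range(n_trials):
    train = np.arange(n_trials) != i
    hits += nearest_mean_predict(X[train], y[train], X[i]) == y[i]
accuracy = hits / n_trials
```

Note that each fold contributes only a single 0/1 outcome, so per-fold "accuracies" are meaningless on their own; only the estimate pooled over all folds is interpretable, and its variance grows as the total trial count shrinks.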


More information about the Pkg-ExpPsy-PyMVPA mailing list