[pymvpa] Number of data points per condition: what are your guidelines?

J.A. Etzel jetzel at artsci.wustl.edu
Wed May 2 19:21:01 UTC 2012


On 5/1/2012 11:27 AM, Vadim Axel wrote:
> Hi experts,
>
> I am talking about basic pattern classification (e.g. no feature
> selection etc). SVM algorithm (with built-in regularization).
>
> 1. A small number of data points with high dimensionality (ROI size) can
> cause overfitting, that is, high prediction accuracy on the training set
> and poor accuracy on the test set. Now, suppose I have above-chance
> classification on the test set, which was validated using a
> within-subject permutation test and an across-subjects t-test vs.
> chance. Can my results still be unreliable? If so, how can I test this?
Errors are always possible, from mislabelings during preprocessing to
logic errors in coding. Combining methods and levels of analysis (e.g.
most individual subjects significant in permutation tests and group
results significant with t-tests) can help. You can check sensitivity as
well (e.g. do the results change a lot with very small changes to
thresholds or parameters? Does deleting one subject change things
drastically?). There's no magic, one-size-fits-all solution.
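
As a rough illustration of the "delete one subject" check, here is a
minimal sketch in plain Python (not PyMVPA). The per-subject accuracies
are made up, and the function names are hypothetical, chosen only for
this example. It recomputes the group t statistic against chance with
each subject left out in turn:

```python
import math
import statistics


def one_sample_t(values, chance):
    """t statistic for the mean of `values` against a fixed chance level."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - chance) / standard_error


def leave_one_subject_out_t(accuracies, chance=0.5):
    """Recompute the group t statistic with each subject removed in turn."""
    return [
        one_sample_t(accuracies[:i] + accuracies[i + 1:], chance)
        for i in range(len(accuracies))
    ]


# Made-up per-subject classification accuracies, for illustration only.
subject_acc = [0.62, 0.58, 0.55, 0.60, 0.71, 0.53, 0.57, 0.64]

full_t = one_sample_t(subject_acc, 0.5)
loo_t = leave_one_subject_out_t(subject_acc)

print("t with all subjects: %.2f" % full_t)
for subject, t in enumerate(loo_t):
    print("t without subject %d: %.2f" % (subject, t))
```

If one of the leave-one-out values is far from the rest, the group
result probably leans heavily on that subject and deserves a closer
look.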

>
> 2. Practically, are 10 independent data points (averaged block values or
> beta values) with an ROI of 100 voxels safe enough?
I don't know about "safe", but this is in the range of reasonable things 
to try. I currently have a dataset that works well with a few hundred 
voxels and only 6 examples, and others that have more examples and fewer 
voxels.
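
One way to get a feel for a given design is to simulate it on pure
noise and see how widely cross-validated accuracy scatters around
chance. The sketch below is only that: it assumes 10 examples per
condition and 100 voxels (the numbers from the question), uses a simple
nearest-centroid classifier as a stand-in for the SVM, and holds out one
example of each class per fold. All names are hypothetical and nothing
here is PyMVPA code.

```python
import random
import statistics


def nearest_centroid_pair_cv(class_a, class_b):
    """Leave-one-pair-out cross-validation with a nearest-centroid classifier.

    class_a and class_b are equal-length lists of examples (lists of floats).
    Fold i holds out example i of each class. Returns the proportion correct.
    """
    n = len(class_a)
    correct = 0
    for i in range(n):
        train_a = class_a[:i] + class_a[i + 1:]
        train_b = class_b[:i] + class_b[i + 1:]
        # Class centroids: the per-voxel mean over the training examples.
        centroid_a = [statistics.fmean(col) for col in zip(*train_a)]
        centroid_b = [statistics.fmean(col) for col in zip(*train_b)]
        for example, is_a in ((class_a[i], True), (class_b[i], False)):
            dist_a = sum((x - c) ** 2 for x, c in zip(example, centroid_a))
            dist_b = sum((x - c) ** 2 for x, c in zip(example, centroid_b))
            correct += (dist_a < dist_b) == is_a
    return correct / (2 * n)


def null_accuracies(n_examples=10, n_voxels=100, n_sims=100, seed=0):
    """Cross-validated accuracies on datasets with no real class difference."""
    rng = random.Random(seed)

    def noise():
        return [[rng.gauss(0.0, 1.0) for _ in range(n_voxels)]
                for _ in range(n_examples)]

    return [nearest_centroid_pair_cv(noise(), noise()) for _ in range(n_sims)]


accs = null_accuracies()
print("mean null accuracy: %.3f" % statistics.fmean(accs))
print("range: %.2f to %.2f" % (min(accs), max(accs)))
```

The spread of these null accuracies shows how far a single accuracy can
wander from 0.5 by chance with this few examples. It is not a
substitute for a permutation test on the real data.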

>
> 3. Do you know about any imaging papers which tested / discussed this issue?
Mukherjee, S., Golland, P., Panchenko, D.: Permutation Tests for 
Classification. AI Memo 2003-019. Massachusetts Institute of Technology 
Computer Science and Artificial Intelligence Laboratory (2003)

Klement, S., Madany Mamlouk, A., Martinetz, T.: Reliability of
Cross-Validation for SVMs in High-Dimensional, Low Sample Size
Scenarios. In: Kurková, V., Neruda, R., Koutník, J. (eds.): Artificial
Neural Networks - ICANN 2008. Vol. 5163. Springer Berlin / Heidelberg
(2008) 41-50


Jo Etzel


> Thanks for ideas,
> Vadim