[pymvpa] effect size (in lieu of zscore)

Jonas Kaplan jtkaplan at usc.edu
Mon Jan 2 18:38:54 UTC 2012

> 1. For this one particular subject, I'm still seeing the strange negative peak to the chance distribution, even without any z-scoring. The shape looks remarkably similar with or without z-scoring (whether I use the raw values or the effect sizes as input). I think my confusion here is, even if I did several things wrong in my code, I'd expect no worse than a regular old-looking chance distribution (centered on 0). There are about 40,000 3.5 mm isotropic voxels in that subject's brain mask, so plenty of observations. Just eyeballing, the peak is centered at about -8%, and the bulk (95%) of observations fall between about -32% and +22% … so it's a notable shift.

I don't know exactly how these permutations are done nowadays, but couldn't something like this happen if the permutations resulted in unequal numbers of trials of each type within the new chunks? I got distributions like this before I enabled the perchunk option in the old v.4 permuteLabels function, which ensures that trials stay balanced within each chunk.
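
To make the concern concrete, here is a minimal sketch in plain NumPy of what I mean by permuting per chunk. It is not the PyMVPA implementation; the function names permute_labels_per_chunk and class_counts_per_chunk, the label names, and the toy design (9 chunks with 3 examples of each of 3 classes) are all made up for illustration.

```python
import numpy as np


def permute_labels_per_chunk(labels, chunks, rng):
    """Shuffle labels separately inside each chunk.

    Every chunk keeps exactly the class counts it started with, so a
    leave-one-chunk-out split stays balanced after permutation.
    """
    labels = np.asarray(labels)
    chunks = np.asarray(chunks)
    permuted = labels.copy()
    for chunk in np.unique(chunks):
        idx = np.flatnonzero(chunks == chunk)
        permuted[idx] = rng.permutation(labels[idx])
    return permuted


def class_counts_per_chunk(labels, chunks):
    """Return {chunk: {label: count}} (assumes integer chunk ids)."""
    labels = np.asarray(labels)
    chunks = np.asarray(chunks)
    counts = {}
    for chunk in np.unique(chunks):
        values, n = np.unique(labels[chunks == chunk], return_counts=True)
        counts[int(chunk)] = {str(v): int(k) for v, k in zip(values, n)}
    return counts


# Hypothetical toy design: 9 chunks, 3 examples of each of 3 classes per chunk.
labels = np.tile(np.repeat(["A", "B", "C"], 3), 9)
chunks = np.repeat(np.arange(9), 9)

rng = np.random.default_rng(0)
balanced = permute_labels_per_chunk(labels, chunks, rng)

# For contrast: one shuffle across the whole dataset, ignoring chunks.
# Per-chunk class counts are then no longer guaranteed to be equal.
unrestricted = rng.permutation(labels)

print(class_counts_per_chunk(balanced, chunks)[0])
print(class_counts_per_chunk(unrestricted, chunks)[0])
```

With the unrestricted shuffle, a given training/testing split can end up with more of one class than another, and that imbalance alone might be enough to pull the permutation distribution away from chance.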

> 3. I'm using a 3X3 design. So while I technically have 9 unique sounds, for this part of my study I collapse in one direction (fundamental frequency), giving me three classes of sounds (each class of which has a variable fundamental frequency). Each of my 9 runs has the 9 unique sounds, each repeated thrice. My current run-averaging scheme has me breaking up each run into 3 smaller portions ("chunklets" if you will…) by time (early part of the run, middle period, late period) and averaging within a chunklet. What this means is that a single example that is formed within a "chunklet" is made from 3 fMRI volumes, each following a sound from one experimental condition (but from different levels of the orthogonal condition: fundamental frequency). So, from my perspective, I'm feeding the SVM examples which, to my eye, are all "equivalent" - each example represents a single category/level in one dimension, and a "smear" of levels in the other dimension. 
> A second way I could be doing this, but haven't been, is to not define "chunklets" by time within a run (early, middle, late), but instead by level of the orthogonal condition (low fundamental frequency, mid FF, high FF). A single example for the SVM, after averaging, would then represent sounds from a single experimental condition and a single FF (so not smeared over either dimension). These averaged examples would presumably be cleaner (due to the combining of volumes from identical acoustic stimuli), but would each be less representative of the category-of-interest as a whole. I'm not sure what makes more sense for a support vector machine: to work on cleaner (but less representative) examples, or more variable (but more representative) examples. 

Or, alternatively, don't average across trials at all and just label each trial with its category of interest?
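
Here is a rough sketch, again in plain NumPy on random data, of how the three options differ in what the classifier ends up seeing. It assumes that each third of a run contains each of the 9 sounds exactly once, which is my reading of the description above; the helper average_within, the variable names, and the voxel count are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_runs, n_cat, n_ff, n_rep, n_vox = 9, 3, 3, 3, 50

# One row per trial. The repetition index doubles as the time period
# (0 = early, 1 = middle, 2 = late), which is an assumption of this sketch.
run, cat, ff, period = [
    a.ravel()
    for a in np.meshgrid(
        np.arange(n_runs), np.arange(n_cat), np.arange(n_ff),
        np.arange(n_rep), indexing="ij",
    )
]
data = rng.standard_normal((run.size, n_vox))  # stand-in for fMRI volumes


def average_within(data, *keys):
    """Average the rows of `data` that share the same combination of keys.

    Returns (averaged rows, the key combination for each averaged row).
    """
    stacked = np.column_stack(keys)
    groups, inverse = np.unique(stacked, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    sums = np.zeros((len(groups), data.shape[1]))
    np.add.at(sums, inverse, data)
    counts = np.bincount(inverse, minlength=len(groups))
    return sums / counts[:, None], groups


# Option 1: chunklets by time. Each example averages 3 trials of one
# category, one from each fundamental frequency.
by_time, by_time_keys = average_within(data, run, period, cat)

# Option 2: chunklets by fundamental frequency. Each example averages the
# 3 repetitions of one identical sound.
by_ff, by_ff_keys = average_within(data, run, ff, cat)

# Option 3: no averaging. Every trial is its own example, labeled by category.
single_trials, single_labels = data, cat

print(by_time.shape, by_ff.shape, single_trials.shape)
```

In each case the last column of the returned keys (or cat itself for option 3) is the category label, and run can serve as the chunk for cross-validation. Options 1 and 2 yield the same number of examples, so the choice between them only changes what gets averaged together, while option 3 gives three times as many, noisier examples.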


Jonas Kaplan, Ph.D.
Research Assistant Professor
Brain & Creativity Institute
University of Southern California
