[pymvpa] null classification performance in the presence of strong univariate signal??

Mon Sep 8 09:31:53 UTC 2014

On Sep 8, 2014, at 11:21 AM, David Soto <d.soto.b at gmail.com> wrote:

> There are two experimental conditions  *cued and uncued* and 19 subjects. 
> 
> We therefore have a 4D nii file in which volumes 1-19 are PEs for the *cued* classification target and volumes 20-38 are the PEs for the *uncued* classification target

If I understand correctly, you have two samples per subject (one for each condition), and each value for a chunk corresponds to one subject.
With those parameters you would be doing between-subject classification. Are you sure that is what you want?
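
To make that layout concrete, here is a rough sketch in plain Python (not PyMVPA code) of the labels I think you are describing. The variable names are made up for illustration, and it assumes the subjects appear in the same order in both halves of the file, which you would need to check.

```python
from collections import Counter

n_subjects = 19  # from the description above

# Volumes 1-19 are the cued PEs, volumes 20-38 the uncued PEs.
targets = ["cued"] * n_subjects + ["uncued"] * n_subjects

# Assumption: subject order is identical in both halves, so volume i and
# volume i + 19 come from the same subject and share a chunk label.
chunks = list(range(n_subjects)) * 2

samples_per_chunk = Counter(chunks)
print(len(targets), "samples in total")
print("samples per chunk:", set(samples_per_chunk.values()))
```

Every chunk (subject) holds exactly two samples, one per target, so any leave-one-chunk-out scheme trains on 18 subjects and tests on the one left out.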

I'm asking because /almost/ all MVPA studies (hyperalignment (TM) being an exception) do within-subject analysis. If you have not done so yet, I strongly suggest doing a within-subject analysis first, before trying a between-subject analysis.

For that you would need more than one sample for each chunk. In fMRI, people usually treat each run as a chunk, so with 6 runs you would have 6 chunks. Using the GLM to estimate the response to each condition in each run gives 2 samples per chunk (one per condition, labeled in .sa.targets) and 12 samples in total.
In this scenario you would analyze each subject separately.
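
As an illustration of the within-subject layout for one subject, here is a minimal sketch in plain Python (again not PyMVPA code; `leave_one_run_out` is a hypothetical helper written just for this example). It uses the 6-run example from above and the two condition names from your description.

```python
n_runs = 6
conditions = ["cued", "uncued"]

# One GLM estimate per condition per run: 6 runs x 2 conditions = 12 samples.
targets = conditions * n_runs
chunks = [run for run in range(n_runs) for _ in conditions]

def leave_one_run_out(chunks):
    """Yield (train_indices, test_indices), holding out one run at a time."""
    for held_out in sorted(set(chunks)):
        train = [i for i, c in enumerate(chunks) if c != held_out]
        test = [i for i, c in enumerate(chunks) if c == held_out]
        yield train, test

folds = list(leave_one_run_out(chunks))
for train, test in folds:
    print("train on", len(train), "samples; test on", [targets[i] for i in test])
```

Each of the 6 folds trains on the 10 samples from five runs and tests on the 2 samples from the held-out run. You would repeat this for every subject and then look at the accuracies across subjects.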

> 
> The above code gave a warning at the z-scoring stage because the number of datapoints per chunk was only 2 (in the example above the chunks were the subjects)

Yes, with only two values per chunk you cannot meaningfully z-score, because estimating the mean and the standard deviation uses up both degrees of freedom. Z-scoring is more suitable for fMRI time series. When using a GLM, it is a good idea to normalize the signal (by z-scoring, or by dividing the signal at each timepoint by the mean over all timepoints, separately for each run or chunk) /before/ the GLM.
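
A small sketch of both points, using only the Python standard library rather than PyMVPA's own z-scoring. The function names and the numbers are made up for illustration, and the sample standard deviation (n - 1 in the denominator) is my choice here; with the population standard deviation the two values would come out as -1 and +1 instead.

```python
import statistics

def zscore(values):
    """Z-score using the sample standard deviation (n - 1 in the denominator)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Two arbitrary pairs, standing in for the two PEs within one chunk.
for pair in ([0.1, 0.3], [-50.0, 1200.0]):
    print(pair, "->", [round(z, 4) for z in zscore(pair)])

def scale_by_run_mean(timeseries):
    """Divide each timepoint by the run's mean, so the run averages to 1."""
    run_mean = statistics.mean(timeseries)
    return [v / run_mean for v in timeseries]

# Made-up voxel time series for a single run.
run = [980.0, 1010.0, 1000.0, 1010.0]
print([round(v, 3) for v in scale_by_run_mean(run)])
```

Whatever two distinct values go in, z-scoring returns the same pair (about -0.707 and +0.707), so only the sign of the difference between the two conditions survives and its size is lost. The second function is the "divide by the mean over all timepoints" normalization, to be applied to each run's time series before fitting the GLM.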