[pymvpa] question about detrend and zscore

John Clithero john.clithero at gmail.com
Tue Oct 6 20:30:06 UTC 2009


Hi Michael,

On Mon, Oct 5, 2009 at 11:28 AM, Michael Hanke <michael.hanke at gmail.com> wrote:
> Hey,
>
> On Mon, Oct 05, 2009 at 11:05:34AM -0400, John Clithero wrote:
>> Hi all,
>>
>> I have a naive question about some of the preprocessing steps in PyMVPA.
>>
>> I am loading in my data and detrending as follows (similar to some
>> examples listed):
>>
>> ##Load Data##
>> # assumes PyMVPA 0.4.x, where the suite import provides these names
>> import os
>> from mvpa.suite import *
>>
>> dataset = NiftiDataset(wb_file,
>>                        labels=attr.labels,
>>                        chunks=attr.chunks,
>>                        mask=os.path.join(roidir, 'wb.nii.gz'))
>>
>> ##Detrend and Z-score Data##
>> detrend(dataset, perchunk=True, model='linear')
>> zscore(dataset, targetdtype='float32')
>>
>> I have two types of trials (A and B from the labels).
>> If I plot the average voxel value of A trials versus B trials, I get a
>> perfectly negatively correlated line.
>> In other words, if mean(sample voxel on A trials) = .5, then
>> mean(sample voxel on B trials) = -.5. This is true for all voxels.
>>
>> I have looked over miscfx.py, but I thought I would send an email to
>> see if (1) this is what "should happen"
>
> Your description indicates a substantial univariate effect in the signal
> of each voxel in your mask (if the above scenario is true for every
> voxel in the dataset). Z-scoring transforms the data to have zero mean,
> hence a baseline difference is removed and the class means end
> up being above and below zero. That can happen if there is such signal
> in the data, but it need not happen (e.g. if there is a more complex
> multivariate signal, or no signal at all).
>
> If you provide some more information about the nature of 'wb_file'
> (preprocessing done to it outside PyMVPA) and this ROI (e.g. size) it
> might be possible to figure out some more aspects of this problem.

The 'wb_file' is func data with the relevant timepoints... it has had
some preprocessing done in FSL (motion correction, slice timing, brain
extraction). The ROI mask is a whole brain mask. It seems (I hope)
that the NiftiDataset is being put together correctly.

This perfect negative correlation occurs even if I feed in arbitrary
labels to the NiftiDataset, so is there some sort of error in how
I'm using detrend and/or zscore? I am guessing this is my error, since
the raw feature data looks fine to me.

This perfect negative correlation also occurs after applying only
"zscore" or only "detrend", although obviously the values are different.
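
For reference, here is a minimal NumPy sketch (not PyMVPA code) of the
arithmetic behind this observation. It assumes every sample carries one of
two labels and that each voxel has zero mean over all samples, which is
what z-scoring produces; a least-squares linear detrend that includes an
intercept should leave zero-mean residuals as well. Under those
assumptions n_A * mean_A + n_B * mean_B = 0 for every voxel, so the two
class means are exactly opposite in sign whatever the labels are. The
array sizes, the random data, and all variable names are made up for
illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a dataset: 40 samples (rows) x 5 voxels (columns).
samples = rng.normal(loc=100.0, scale=10.0, size=(40, 5))

# Z-score each voxel (column) across all samples.
z = (samples - samples.mean(axis=0)) / samples.std(axis=0)

# Arbitrary labels that have nothing to do with the data: 15 "A", 25 "B".
labels = np.array(["A"] * 15 + ["B"] * 25)
rng.shuffle(labels)

is_a = labels == "A"
n_a, n_b = is_a.sum(), (~is_a).sum()
mean_a = z[is_a].mean(axis=0)   # per-voxel mean over A samples
mean_b = z[~is_a].mean(axis=0)  # per-voxel mean over B samples

# Each column of z sums to zero, so the weighted class means cancel.
weighted_sum = n_a * mean_a + n_b * mean_b

# Across voxels, mean_b is a negative multiple of mean_a.
corr = np.corrcoef(mean_a, mean_b)[0, 1]
```

With equal class sizes the means are exact mirror images (0.5 and -0.5);
with unequal sizes the points still fall on a straight line with slope
-n_A / n_B. So the pattern alone says nothing about whether the labels
carry signal.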

Best,
John

>
>> and (2) if so, what the idea is
>> for making such a split before running a classifier.
>
> If this question refers to why normalization is useful:
>
> Quote from:
>
>  Pereira, F., Mitchell, T. & Botvinick, M. (in press). Machine learning
>  classifiers and fMRI: A tutorial overview. Neuroimage.

Yes, this is a nice paper. I've read it more than once!

>
> | A final issue to consider in the construction of examples is that of
> | preprocessing. By this we do not mean the usual preprocessing of
> | neuroimaging data, e.g. motion correction or detrending, a topic covered
> | in great depth in Strother (2006). We mean, rather, that done on the
> | examples, considered as a matrix where each row is an example (or each
> | column is a feature). In the example study, we normalized each example
> | (row) to have mean 0 and standard deviation 1. The idea in this case is
> | to reduce the effect of large, image-wide signal changes. Another
> | possibility would be to normalize each feature (column) to have mean 0
> | and standard deviation 1, either across the entire experiment or within
> | examples coming from the same run. This is worth considering if there is
> | a chance that some voxels will have much wider variation in signal
> | amplitude than others. Although a linear classifier can in principle
> | compensate for this to some degree by scaling the coefficient for each
> | voxel, there are situations where it will not and thus this
> | normalization will help. Our own informal experience suggests either row
> | or column normalization is generally beneficial and should be tried in
> | turn.
>
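
The row and column normalizations described in the quoted passage can be
sketched in plain NumPy as follows. This is only an illustration under
assumptions, not the paper's code or PyMVPA's implementation: the function
names zscore_rows and zscore_columns, the "runs" array, and the random
data are all hypothetical.

```python
import numpy as np

def zscore_rows(x):
    """Normalize each example (row) to mean 0 and standard deviation 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)

def zscore_columns(x, runs=None):
    """Normalize each feature (column) to mean 0 and standard deviation 1.

    If `runs` is given (one run id per row), normalize within each run
    separately; otherwise normalize across all rows.
    """
    x = np.asarray(x, dtype=float)
    if runs is None:
        return (x - x.mean(axis=0)) / x.std(axis=0)
    runs = np.asarray(runs)
    out = np.empty_like(x)
    for run in np.unique(runs):
        sel = runs == run
        out[sel] = (x[sel] - x[sel].mean(axis=0)) / x[sel].std(axis=0)
    return out

rng = np.random.default_rng(1)
examples = rng.normal(50.0, 5.0, size=(12, 6))  # 12 examples x 6 voxels
runs = np.repeat([0, 1, 2], 4)                  # 3 runs of 4 examples each

by_row = zscore_rows(examples)
by_col = zscore_columns(examples)
by_col_run = zscore_columns(examples, runs)
```

Row normalization targets image-wide signal changes within one example,
while column normalization evens out voxels whose amplitudes differ
widely; the per-run variant corresponds to the "within examples coming
from the same run" option in the quote.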
>
>
> Cheers,
>
> Michael
>
> --
> GPG key:  1024D/3144BE0F Michael Hanke
> http://mih.voxindeserto.de
>
> _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
> http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa
>


