[pymvpa] zscore clarification and question

Wed Dec 1 14:46:38 UTC 2010

The thread-hijacking was inadvertent, I assure you! :)

It's still not clear to me that setting pervoxel=False will work.

From http://www.scipy.org/Numpy_Example_List#mean, the axis argument
determines whether the mean is calculated over all values in the array,
for each row, or for each column. myDataset.samples is a 2d array
(volumes x voxels). In the /mvpa/datasets/miscfx.py zscore method (which
I think is the code used when calling zscore(myDataset)), the axis
argument is set to {} when pervoxel is False, which calculates a single
mean for the entire array. When pervoxel is True, the axis argument is
set to 0, which calculates the mean for each column of
dataset.samples. I think for row-wise scaling the argument would need to
be 1, to calculate the mean and standard deviation for each row.
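
To make the axis behaviour concrete, here is a minimal NumPy sketch. It
does not use PyMVPA at all: the samples array is made-up toy data standing
in for myDataset.samples, and zscore_rows is a hypothetical helper written
only for illustration, not a function from the library. It also assumes the
population standard deviation (NumPy's default, ddof=0), which may not match
what zscore uses.

```python
import numpy as np

# Toy stand-in for myDataset.samples: 3 volumes (rows) x 4 voxels (columns).
samples = np.array([[1.0, 2.0, 3.0, 4.0],
                    [2.0, 4.0, 6.0, 8.0],
                    [0.0, 1.0, 0.0, 1.0]])

overall_mean = samples.mean()        # no axis: one mean for the whole array
column_means = samples.mean(axis=0)  # axis=0: one mean per voxel (column)
row_means = samples.mean(axis=1)     # axis=1: one mean per volume (row)


def zscore_rows(data):
    """Hypothetical helper: scale each row to mean 0 and std 1."""
    # keepdims=True keeps shape (n_rows, 1) so the result broadcasts per row.
    means = data.mean(axis=1, keepdims=True)
    stds = data.std(axis=1, keepdims=True)  # population std (ddof=0)
    return (data - means) / stds


row_scaled = zscore_rows(samples)
```

With axis=1 every volume ends up with a mean of zero and a standard
deviation of one across its voxels, which is the row-wise scaling I am
asking about. A row with zero variance would divide by zero here, so real
data would need a guard for that case.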

thanks,
Jo

On 11/30/2010 6:53 PM, Yaroslav Halchenko wrote:
> Hi J.A., The Great Thief of a Thread! ;)
>
> I am sorry for our scarce docstring of zscore :-/
>
> pervoxel=False is actually what you are looking for I believe --
> row-wise standardization.
> just be careful with it -- using it for generalization assessment is ok,
> but features are not 'voxels' any longer and you shouldn't look at their
> sensitivities from then on.
>
>
> On Tue, 30 Nov 2010, J.A. Etzel wrote:
>
>> Setting perchunk=False and pervoxel=True normalizes the entire
>> column: each voxel has a mean of zero and standard deviation of one
>> over all volumes in the dataset.
>
>> Setting perchunk=False and pervoxel=False normalizes over all
>> columns: all voxels together have a mean of zero and standard
>> deviation of one over all volumes in the dataset.
>
>> Setting perchunk=True and pervoxel=False normalizes over all columns
>> within each chunk: all voxels together have a mean of zero and
>> standard deviation of one over all volumes in each chunk.
>
>> This gives various options for normalizing column-wise, but is it
>> possible to normalize row-wise? In other words, normalize so that
>> the voxels in each sample (volume) have a mean of zero and standard
>> deviation one? Chunks are irrelevant for this case, since each
>> sample is normalized separately. If this is already in pyMVPA, can
>> you point me in the right direction?
>