[pymvpa] advice for constructing a dataset for use in pyMVPA

Thu Feb 12 09:38:20 UTC 2009

I have been working with an fMRI data set, using R for classification 
analysis. I’d like to try the pyMVPA package on the same data, so I want 
to use my already-preprocessed (in SPM2) files.

I have 15 subjects, each of whom did 3 runs. It was a block-design 
experiment, and I’ve preprocessed the data to have one Analyze image per 
block (for each subject and run). I also have anatomical masks, also as 
Analyze images. For each block I have several text labels ("color", 
"conf", etc.). I never need to classify on more than one text label at a 
time (e.g. just "color", not "color" and "conf"), though I do need to 
subset the data based on these labels prior to classification (e.g. 
classify "conf" for certain "colors" only).
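
To make the subsetting concrete, here is a minimal sketch in plain Python, independent of any pyMVPA call. Only the label names "color" and "conf" come from my data; the label values, the block records, and the helper name select_blocks are hypothetical.

```python
# Hypothetical block records: one dict per block, in volume order.
blocks = [
    {"volume": 0, "color": "red",   "conf": "high"},
    {"volume": 1, "color": "green", "conf": "low"},
    {"volume": 2, "color": "red",   "conf": "low"},
    {"volume": 3, "color": "blue",  "conf": "high"},
]


def select_blocks(blocks, subset_by, keep_values, classify_on):
    """Keep blocks whose `subset_by` label is in `keep_values`.

    Returns the volume indices of the kept blocks and the labels
    (taken from `classify_on`) that a classifier would be trained on.
    """
    kept = [b for b in blocks if b[subset_by] in keep_values]
    indices = [b["volume"] for b in kept]
    labels = [b[classify_on] for b in kept]
    return indices, labels


# Classify "conf", but only within blocks whose "color" is "red".
indices, labels = select_blocks(blocks, "color", {"red"}, "conf")
print(indices, labels)
```

The returned indices would then pick the matching volumes out of the full data set, so each labeling can stay in its own file while the subsetting is done once.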

I am trying to understand how to set up my data for pyMVPA and would 
appreciate your feedback on whether the strategy below is correct. Thank 
you for your patience with this long post.

In this case, I think that the "samples" are my blocks, and I have two 
levels of "chunks" - runs and subjects. My "labels" are my block types 
("color", etc).

Do I want all of my data in *one* NiftiDataset object or separate ones 
for each subject?

I think that the steps I need to perform to get my data converted for 
pyMVPA are as follows:

1 - use fslmerge to convert my (one-for-each-block) Analyze files into 
one large 4D nifti.gz file, containing the volumes for all subjects.

2 - make attributes_literal.txt files, one for each labeling I need (one 
for "color", one for "conf", etc). These will be used for the labels 
part of NiftiDataset, read by SampleAttributes. The labels in these 
files need to be in the same order as my volumes in the nifti.gz files.

3 - define arrays to label my files by chunks. I think I will need a 2D 
array: the first column giving the subject number and the second the 
run, with the rows in the same order as my nifti.gz files.

4 - write Python code to create my NiftiDataset object, using my Analyze 
mask image (0 for voxels to exclude, >0 for voxels to include) as the 
"mask" if I want to restrict my analysis to those voxels.
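
Steps 2 and 3 could be sketched roughly as follows, in plain Python. The number of blocks per run, the label values, and all helper names are hypothetical. Collapsing (subject, run) into a single integer per volume, and writing one "label chunk" pair per line, are my assumptions about what the chunks attribute and the attributes file expect; I have not confirmed either against the pyMVPA documentation.

```python
from itertools import product

N_SUBJECTS = 15      # from my data
N_RUNS = 3           # from my data
BLOCKS_PER_RUN = 4   # hypothetical; replace with the real count


def build_volume_table(n_subjects, n_runs, blocks_per_run):
    """List (subject, run, block) for every volume, in the order the
    volumes would be concatenated into the 4D file (subject varies
    slowest, block fastest)."""
    return list(product(range(1, n_subjects + 1),
                        range(1, n_runs + 1),
                        range(blocks_per_run)))


def chunk_id(subject, run, n_runs):
    """Collapse (subject, run) into one integer so that every run of
    every subject gets its own chunk value."""
    return (subject - 1) * n_runs + (run - 1)


def attribute_lines(table, labels, n_runs):
    """One 'label chunk' line per volume, in the same order as the volumes."""
    if len(labels) != len(table):
        raise ValueError("need exactly one label per volume")
    return ["%s %d" % (label, chunk_id(subj, run, n_runs))
            for (subj, run, _block), label in zip(table, labels)]


table = build_volume_table(N_SUBJECTS, N_RUNS, BLOCKS_PER_RUN)
# Hypothetical labels: alternate two made-up "color" values.
color_labels = ["red" if i % 2 == 0 else "green" for i in range(len(table))]
lines = attribute_lines(table, color_labels, N_RUNS)
print(len(lines), lines[0], lines[-1])
```

Keeping the (subject, run) table next to the collapsed chunk values would still let me treat whole subjects as chunks later, for example for leave-one-subject-out cross-validation.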

Would you advise this strategy?

Thank you so much for your help!

Jo