[pymvpa] optimal way of loading the whole-brain data

Nick Oosterhof nikolaas.oosterhof at unitn.it
Tue May 6 11:57:50 UTC 2014


On May 6, 2014, at 1:19 PM, Dmitry Smirnov <dmi.smirnov07 at gmail.com> wrote:

> I was wondering, what would be the most optimal solution for loading massive data in PyMVPA.
> [...]
> # Trim a number of slices in the end of the file
> def trimImage(filename,cutoff):
>     tmp = nib.load(filename)
>     return nib.Nifti1Image(tmp.get_data()[:,:,:,0:cutoff],tmp.get_affine())
> [...]
> fds = fmri_dataset(samples=[trimImage(('run%i/epi.nii' % (r+1)),346) for r in range(runs)], 
> targets=targets, 
> chunks=selector, 
> mask='/triton/becs/scratch/braindata/DSmirnov/HarvardOxford/MNI152_T1_2mm_brain_mask.nii')

I don't think you need the trimImage function, and it may slow things down noticeably: the list comprehension loads and trims every run into memory before fmri_dataset even starts. You can slice the dataset's samples to get the same effect:

run_ds = []
for r in range(runs):
    # load one run at a time; '...' stands for the remaining arguments (e.g. the mask)
    ds = fmri_dataset('run%i/epi.nii' % (r + 1), ...)
    # keep only the first `cutoff` volumes (samples), as trimImage did
    ds = ds[:cutoff, :]
    run_ds.append(ds)

# stack along samples: each run adds volumes over the same voxels
all_ds = vstack(run_ds, True)
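
As a quick sanity check, all_ds.shape (or all_ds.summary()) should then report runs * cutoff samples and one feature per voxel inside the mask.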

Also, are you sure you are not constrained by RAM? How big are the images?
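
On the RAM question, a quick way to get a ballpark figure is to read just the NIfTI header, without loading any data. A minimal sketch, using 'run1/epi.nii' as the example path from your snippet:

import numpy as np
import nibabel as nib

img = nib.load('run1/epi.nii')   # reads the header only; no image data is loaded yet
n_bytes = np.prod(img.shape) * img.get_data_dtype().itemsize
print('one run: %.2f GB uncompressed' % (n_bytes / 1024.0 ** 3))
# note: if the file stores scaled integers, nibabel returns floats on load,
# so the in-memory array can be several times larger than this figure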

