[pymvpa] data scaling and accounting for nuisance factors

Tue Sep 6 04:52:14 UTC 2011

Hello,

I have lesion data, and I am trying to test whether particular patterns of lesions distinguish two classes of patients. I have two questions:

1) What is the best way to scale the lesion data? Traditionally, these data are coded as 1 (lesion) and 0 (no lesion). I have tried several codings and get different (but replicable) results with the SMLR classifier in PyMVPA 0.4. In the table below, the first column is the leave-one-out CV result, the second is the value assigned to spared voxels, and the third is the value assigned to damaged voxels.
CV	NoLesion	Lesion
83.571	0	1
75.000	1	2
77.143	2	4
81.429	100	200
81.429	200	400
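
To make the codings concrete: each row of the table is an affine transform of the 0/1 map, with the NoLesion value acting as an offset and the difference between the Lesion and NoLesion values acting as a scale. The following is a minimal sketch in plain Python, not the PyMVPA code behind the numbers above; the function name `recode` and the five-voxel toy map are made up for illustration.

```python
# Illustrative only: the (NoLesion, Lesion) pairs are the ones from the table;
# `recode` and `toy_map` are hypothetical names invented for this sketch.
CODINGS = [(0, 1), (1, 2), (2, 4), (100, 200), (200, 400)]


def recode(binary_map, no_lesion, lesion):
    """Map 0 -> no_lesion and 1 -> lesion.

    Equivalent to x' = offset + scale * x with
    offset = no_lesion and scale = lesion - no_lesion.
    """
    offset = no_lesion
    scale = lesion - no_lesion
    return [offset + scale * x for x in binary_map]


toy_map = [0, 1, 1, 0, 1]  # hypothetical 5-voxel lesion map for one patient

for no_lesion, lesion in CODINGS:
    print(no_lesion, lesion, recode(toy_map, no_lesion, lesion))
```

Seen this way, the five rows differ only in offset (0, 1, 2, 100, 200) and scale (1, 1, 2, 100, 200), so the question is really how sensitive the classifier is to each of the two.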

2) What is the best way to control for a nuisance factor? An additional variable, lesion volume, can also distinguish my two patient groups, so I would like the resulting CV result and the heavily weighted voxels to be uncontaminated by it. Ideally, I would also like to know how much predictive power the lesion patterns add over and above lesion volume.
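
One way to frame the "over and above" comparison is to cross-validate a model that sees only the nuisance factor and compare it with a model that sees the nuisance factor plus the voxels. The sketch below is only meant to pin down that comparison. It is not PyMVPA or SMLR: it uses a simple nearest-centroid classifier as a stand-in, and the patients, labels, and all function names are hypothetical toy values.

```python
# Sketch only: toy data and a nearest-centroid stand-in classifier.
# Every name below is hypothetical; none of this is PyMVPA or SMLR.

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_accuracy(features, labels):
    """Leave-one-out accuracy (in percent) of a nearest-centroid classifier."""
    correct = 0
    for i in range(len(features)):
        # Train on every sample except the held-out one.
        train = [(f, l) for j, (f, l) in enumerate(zip(features, labels)) if j != i]
        centroids = {}
        for cls in sorted(set(l for _, l in train)):
            centroids[cls] = centroid([f for f, l in train if l == cls])
        # Predict the class whose centroid is closest to the held-out sample.
        predicted = min(centroids, key=lambda c: sq_dist(features[i], centroids[c]))
        correct += predicted == labels[i]
    return 100.0 * correct / len(features)


# Hypothetical patients: binary lesion maps over 4 voxels, two groups.
lesions = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]
labels = ["A", "A", "A", "B", "B", "B"]

# Nuisance factor: lesion volume, taken here as the number of damaged voxels.
volume = [[sum(m)] for m in lesions]
# Full model: lesion volume followed by the voxel values.
combined = [v + m for v, m in zip(volume, lesions)]

baseline = loo_accuracy(volume, labels)
full = loo_accuracy(combined, labels)
print("volume only:     %.1f%%" % baseline)
print("volume + voxels: %.1f%%" % full)
print("gain:            %.1f points" % (full - baseline))
```

In a real analysis the two models would presumably need the same classifier and the same CV scheme as the main analysis for the difference to be interpretable.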

Thanks,
David