[pymvpa] feature sensitivity in MVPA run on EEG data

Nick Oosterhof nikolaas.oosterhof at unitn.it
Fri Jan 17 09:57:12 UTC 2014


On Jan 16, 2014, at 10:22 PM, Marius 't Hart wrote:

> On 14-01-14 12:48 PM, Nick Oosterhof wrote:
>> On Jan 12, 2014, at 9:40 PM, Marius 't Hart wrote:
>> 
>>> My toy problem is to classify the two most extreme conditions (two targets) based on averaged Cz and Pz activity [...]
>> Did you provide an equal number of samples for each class (target)? Because if you didn't then 60% could, in principle, be due to chance. That is, if 60% of the samples are in one class in the training set, then a 'naive' classifier that is just guessing that class all the time will give 60% accuracy.
> 
> Yes, after manual artefact rejection the number of trials in each condition is different. I take the first N trials from each condition, with N being the number of trials in the condition with the smallest number of trials. This number is different for each participant.

That's fine then - an accuracy of 60% would be meaningful, provided it is consistently above 50% across participants.
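Incidentally, PyMVPA can do this kind of subsampling for you with its Balancer generator. A minimal sketch, assuming a dataset ds with a 'targets' sample attribute marking the conditions (the names here are illustrative):

    from mvpa2.suite import Balancer

    # Subsample so that every target occurs equally often. With
    # apply_selection=True the subsampled dataset is returned directly;
    # limit=None balances across the whole dataset rather than per chunk.
    balancer = Balancer(attr='targets', count=1, limit=None,
                        apply_selection=True)
    ds_balanced = list(balancer.generate(ds))[0]

I believe Balancer picks samples at random rather than taking the first N, which should avoid any systematic bias from trial order.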

>> 
>> Also: what preprocessing did you do? Any z-scoring, baseline correction etc?
> 
> I do baseline correction, but no Z-scoring. Should I do Z-scoring? If so, over all data, within electrode or within trial?

There are no *absolute* rules for this, and there are several different ways to do z-scoring.

In the fMRI world there are at least two common approaches, both voxel-wise:
1) use all data to compute the mean and std.
2) use only data from baseline periods to compute the mean and std.

Translating this to MEEG, you could try sensor-wise z-scoring. It may be best to use the pre-stimulus period to estimate the mean and std parameters, so that noisy channels (those with a large std) are scaled down more and thus contribute less to classification.

However, I don't know whether this is as easily done in PyMVPA for MEEG data as for fMRI data, since I would assume that your features are combinations of time-points and channels, whereas in fMRI the features are just voxels.
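That said, PyMVPA's zscore() can estimate the scaling parameters from a subset of samples through its param_est argument. A minimal sketch, assuming the pre-stimulus samples carry a (made-up) 'baseline' label in the 'targets' attribute:

    from mvpa2.suite import zscore

    # Estimate mean and std from the baseline samples only, then apply
    # them to all samples; operates in place, per chunk. The 'baseline'
    # label is hypothetical -- use whatever marks your pre-stimulus data.
    zscore(ds, param_est=('targets', ['baseline']), chunks_attr='chunks')

Keep in mind that this scales each feature (i.e. each channel/time-point combination) independently, so it is not quite the sensor-wise scheme above; for that you would have to pool the estimates over the time axis for each channel.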

>> 
>>> In the CNV plot it looked like the usefulness of Pz and Cz for the classifier(s) should flip at around 1 second in the preparation interval, so I wanted to look at sensitivity. [...] That doesn't look like what I expected - but I find it hard to judge if what I'm doing is actually correct. For example, on inspecting all the different sensitivity datasets, it looks like the sign that each feature gets is usually the same... but there are a few exceptions. Does the sign actually mean anything in terms of a feature's usefulness?
>> As far as I know the sign is not that important - it's more about the relative magnitude. If the sensitivity is further away from zero then that means the feature is more important for discriminating the classes.
> 
> OK, so basically, although it looks like there is difference in the usefulness of the electrodes for classifying the conditions, the classifiers don't reflect that. Would it make sense to try different classifiers, instead of Linear SVM?

You could try other classifiers and see whether you get consistent results, but linear SVM is generally a good classifier to start with. For now I would stay away from non-linear classifiers: their results are more difficult to interpret, and with the low number of trials in a typical experiment they are prone to overfitting.
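A minimal sketch of such a comparison, assuming the balanced dataset from before and chunk-wise cross-validation (the particular classifier choices are just examples):

    import numpy as np
    from mvpa2.suite import (CrossValidation, NFoldPartitioner,
                             LinearCSVMC, GNB, kNN, mean_match_accuracy)

    # Run the same leave-one-chunk-out cross-validation with a few
    # different classifiers and report the mean accuracy of each.
    clfs = {'linear SVM': LinearCSVMC(),
            'naive Bayes': GNB(),
            'kNN (k=5)': kNN(k=5)}
    for name, clf in clfs.items():
        cv = CrossValidation(clf, NFoldPartitioner(),
                             errorfx=mean_match_accuracy)
        res = cv(ds_balanced)
        print('%s: %.3f' % (name, res.samples.mean()))

And regarding the sensitivities: since the sign is hard to interpret, it may help to compare absolute magnitudes instead. A sketch, again with the linear SVM:

    import numpy as np
    from mvpa2.suite import LinearCSVMC

    # Train the classifier and extract its weights; larger absolute
    # values indicate features that matter more for the discrimination.
    sensana = LinearCSVMC().get_sensitivity_analyzer()
    sens = sensana(ds_balanced)
    importance = np.abs(sens.samples).mean(axis=0)  # one value per feature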




More information about the Pkg-ExpPsy-PyMVPA mailing list