[pymvpa] feature sensitivity in MVPA run on EEG data
Marius 't Hart
mariusthart at gmail.com
Fri Mar 7 23:04:27 UTC 2014
Hi Brian,
I've just tried PLR. I can get it to work with 2 categories but not with
4. Is that some error on my part or is it correct that PLR only works
with 2 categories? I'd like to have a classifier that can also handle 3
or 4 categories.
Thanks!
Marius
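
A minimal sketch of one generic way around a binary-only learner: train one binary model per category (that category against all others) and predict the category whose model scores highest. This is plain NumPy, not PyMVPA's PLR; the function names, the gradient-descent fit and the default `l2`, `lr` and `n_iter` values are all assumptions made for illustration.

```python
import numpy as np

def fit_binary_logreg(X, y, l2=1.0, lr=0.1, n_iter=500):
    """L2-penalised binary logistic regression fitted by gradient descent.

    X: (n_samples, n_features); y: array of 0/1 labels.
    Returns (weights, bias). Hypothetical helper, not a PyMVPA function.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(y = 1)
        w -= lr * (X.T @ (p - y) / n + l2 * w / n)
        b -= lr * np.mean(p - y)
    return w, b

def fit_one_vs_rest(X, labels, **kwargs):
    """Fit one binary model per category (category c versus everything else)."""
    classes = np.unique(labels)
    models = [fit_binary_logreg(X, (labels == c).astype(float), **kwargs)
              for c in classes]
    W = np.array([w for w, _ in models])          # (n_classes, n_features)
    B = np.array([b for _, b in models])          # (n_classes,)
    return classes, W, B

def predict_one_vs_rest(X, classes, W, B):
    """Predict the category whose binary model gives the highest score."""
    scores = X @ W.T + B                          # (n_samples, n_classes)
    return classes[np.argmax(scores, axis=1)]
```

Each row of `W` can then be read as a "this category versus the rest" weight map, which is a different question from a pairwise contrast between two categories.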
On 14-01-28 07:33 AM, Brian Murphy wrote:
> Hi,
>
> just jumping into this discussion a bit late...
>
>> Tying in to another discussion, could it be beneficial to first average
>> every 5 trials or so? In a way this reduces noise, so the performance
>> would most likely go up - as might the informativeness of feature
>> sensitivity. The downside is that you no longer have predictions on a
>> trial by trial basis.
> If you have enough data to get away with it (i.e. you will still have enough
> cases to train on), then yes, it is worth trying, with a very important
> caveat: that you are interested in time-domain signals. Obviously a
> straight trialwise averaging will wash out any interesting spectral
> activity which isn't phase-locked (and given your task, precise
> phase-locking seems unlikely). But anyway, averaging might clean up the
> sensitivity maps. Then again, from the paper-writing point of view,
> keeping things as simple as possible is always preferable.
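
A minimal sketch of the trial averaging discussed above, assuming the epochs are held in a NumPy array of shape (trials, channels, time points). The function name `average_trials` and the `group_size` parameter are hypothetical; trials are averaged only within a condition, and leftover trials that do not fill a group are dropped.

```python
import numpy as np

def average_trials(X, labels, group_size=5):
    """Average consecutive groups of trials within each condition.

    X: (n_trials, n_channels, n_times); labels: (n_trials,).
    Returns (X_avg, y_avg) with one averaged pseudo-trial per group.
    """
    X_avg, y_avg = [], []
    for c in np.unique(labels):
        Xc = X[labels == c]                       # trials of this condition
        n_groups = len(Xc) // group_size          # incomplete groups are dropped
        for g in range(n_groups):
            block = Xc[g * group_size:(g + 1) * group_size]
            X_avg.append(block.mean(axis=0))      # time-domain average
            y_avg.append(c)
    return np.array(X_avg), np.array(y_avg)
```

If the data are cross-validated, the groups should be formed inside each fold or run, so that no averaged pseudo-trial mixes training and test trials.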
>
>>>>> Also: what preprocessing did you do? Any z-scoring, baseline correction etc?
>>>> I do baseline correction, but no Z-scoring. Should I do Z-scoring? If so, over all data, within electrode or within trial?
> I'm not an SVM expert, so this might not be relevant - but for many
> classifiers, the weights are only interpretable as sensitivity measures
> if the underlying variables are on a similar scale. So, for the sake of
> argument, if your Cz was twice as loud as your Pz (unlikely, I know),
> then its weights would be scaled down, and not be directly comparable.
> So yes, for sensitivity analyses z-scoring of some kind would be
> advisable - there are several ways, e.g. ideally you would do this based
> on the *clean* rest periods (you've done manual artefact rejection - so
> that should be possible). But for EEG data you can often just z-score
> based on the whole signal time-course. [I see Nick O has made similar
> suggestions]
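
A minimal sketch of per-channel z-scoring, again assuming epochs in a NumPy array of shape (trials, channels, time points). The function name `zscore_per_channel` and the optional `ref` argument (for example, clean rest periods) are hypothetical; without `ref`, the mean and standard deviation are taken over the whole signal.

```python
import numpy as np

def zscore_per_channel(X, ref=None):
    """Z-score each channel using statistics pooled over trials and time.

    X:   (n_trials, n_channels, n_times) data to be scaled.
    ref: optional (n_ref, n_channels, n_ref_times) reference data, e.g. clean
         rest periods; if None, the statistics come from X itself.
    """
    src = X if ref is None else ref
    mean = src.mean(axis=(0, 2), keepdims=True)   # shape (1, n_channels, 1)
    std = src.std(axis=(0, 2), keepdims=True)
    std[std == 0] = 1.0                           # leave flat channels unscaled
    return (X - mean) / std
```

In a cross-validated analysis the mean and standard deviation would normally be estimated on the training portion only and then applied to the test portion.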
>
>
>> That doesn't look like what I expected - but I find it hard to judge if
>> what I'm doing is actually correct.
> There are a few reasons that could account for the differences you see between the ERPs and the sensitivity maps:
> - different scaling of the input signals (as above)
> - more/less variance in the signals (looking at the ERPs, it looks like particular periods have better or worse separation between the conditions, but it is not just the magnitude of this difference that matters, but rather its magnitude *relative to the variance* across trials)
> - models may also give weights to features that are good descriptions of noise, so that noise can be factored out of other condition-informative features. See this paper for details, also on how to normalise the sensitivity maps to compensate for this effect:
> http://www.citeulike.org/user/emanueleolivetti/article/12177881
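
To illustrate the second point above (a difference matters relative to the variance across trials), here is a minimal sketch that computes, for each feature, the difference between two condition means divided by the pooled standard deviation. It is a simple univariate effect size, not the normalisation described in the linked paper, and the function name `standardized_difference` is hypothetical.

```python
import numpy as np

def standardized_difference(X, labels, class_a, class_b):
    """Per-feature mean difference between two conditions in pooled-SD units.

    X: (n_trials, n_features), e.g. channels x time points flattened.
    Assumes every feature has non-zero variance in at least one condition.
    """
    Xa = X[labels == class_a]
    Xb = X[labels == class_b]
    diff = Xa.mean(axis=0) - Xb.mean(axis=0)      # what the ERP difference shows
    pooled_var = ((len(Xa) - 1) * Xa.var(axis=0, ddof=1) +
                  (len(Xb) - 1) * Xb.var(axis=0, ddof=1)) / (len(Xa) + len(Xb) - 2)
    return diff / np.sqrt(pooled_var)             # difference relative to trial noise
```

A feature with a small but very consistent difference can score higher here than one with a large but noisy difference, which is one reason an ERP plot and a sensitivity map need not look alike.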
>
> Regarding classifiers, LinSVM is good, but my preference would be a regularized logistic regression (e.g. PLR), as I've yet to find a situation in which any variety of SVM gives me a decisive performance advantage. Also, consider the idea behind SVMs, which is to find a hyperplane that best separates the boundary cases. If these boundary cases are representative of the conditions in general, that is just fine. But if they are outliers in some sense, then maybe not.
>
> Brian
>