[pymvpa] Suspicious results

Nynke van der Laan nynkevanderlaan at gmail.com
Mon Feb 28 16:58:00 UTC 2011


Dear Yaroslav and Francisko,

Many thanks for your quick responses.
I've repeated the same analysis with a radius of 1 mm for one subject.
Although there is more variation compared with the previous results
(radius 10 mm), the distribution still has a >0.5 bias (to be
specific, the peak is at approximately 0.6).
The output of dataset.summary() is:
Dataset / float64 72 x 27778
uniq: 36 chunks 2 labels
stats: mean=932.696 std=285.553 var=81540.5 min=9.4 max=1687
No details due to large number of labels or chunks. Increase maxc and
maxl if desired
Summary per label across chunks
  label mean std min max #chunks
   1      1   0   1   1     36
   2      1   0   1   1     36

So, each chunk has the labels balanced (which means chance level is 0.5).

To mention: all data stem from one scanning session (approximately 15
minutes). Each chunk is one trial (approx. 16 s in total). Trials were
separated by an inter-trial interval of random duration. Because of
this interval I would not expect contamination between chunks/trials,
would you? Functional scans were realigned and normalized to MNI space
(and block-averaged to reduce the number of samples).
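
For reference, a minimal sketch of such block-averaging with the
PyMVPA 2.x API (an assumption on my part -- it presumes 'targets' and
'chunks' sample attributes; older 0.x versions spell this differently):

    from mvpa2.suite import *

    # collapse all volumes sharing the same (target, chunk) pair into
    # one averaged sample, i.e. one sample per trial block
    ds_avg = ds.get_mapped(mean_group_sample(['targets', 'chunks']))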

Best regards,
Nynke

On Mon, 28 Feb 2011, Nynke van der Laan wrote:
> What I did is the following: I did a searchlight analysis (radius 10
> mm)

which makes it 20 mm in diameter, meaning that you could "legally" get
above-chance performance with the searchlight centered anywhere up to
1 cm away from the actually relevant activation point. That would be
one of the effects adding up to the heavy right tail in your resulting
distribution of performances. To see how much of an effect this is,
reduce the radius to 1 mm and run the same searchlight -- does the
distribution lose its heavy >0.5 bias?
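
A minimal sketch of that check with PyMVPA 2.x names (note that
sphere_searchlight takes its radius in voxel units, so adjust it to
match the intended 1 mm):

    from mvpa2.suite import *

    cv = CrossValidation(LinearCSVMC(), NFoldPartitioner())
    sl = sphere_searchlight(cv, radius=1)  # radius in voxel units
    res = sl(ds)

    # distribution of searchlight accuracies
    # (CrossValidation reports errors, hence 1 - error)
    import pylab as pl
    pl.hist(1.0 - res.samples.ravel(), bins=50)
    pl.axvline(0.5, color='r')  # chance level for two balanced classes
    pl.show()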

> brain mask). I used a NFoldCrossvalidation (no detrending or
> z-scoring).

Well, depending on the actual data and experimental design, the
absence of detrending might introduce confounds.
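
Per-chunk polynomial detrending, for instance, is a one-liner with the
2.x API (a sketch; it modifies the dataset in place):

    from mvpa2.suite import *

    # remove a linear trend within each chunk before classification
    poly_detrend(ds, polyord=1, chunks_attr='chunks')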

Also, although you mentioned that every chunk has balanced labels,
what is the output of

dataset.summary()
?


Also, because there is no z-scoring and the RBF (non-linear) SVM is
not tuned, I am not sure it trained correctly at all... What is the
"picture" if you use a linear SVM? What if you introduce z-scoring and
detrending?
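
A sketch of that comparison (again 2.x names; zscore also operates in
place):

    from mvpa2.suite import *

    # per-chunk z-scoring of each feature
    zscore(ds, chunks_attr='chunks')

    # replace the RBF SVM with a linear one
    cv = CrossValidation(LinearCSVMC(), NFoldPartitioner())
    err = cv(ds)
    print(err.samples.mean())  # mean cross-validation error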

> I use two stimulus categories. The task I used consisted of 38 chunks
> (38 trials), with two stimulus presentations in each chunk (one of
> each category). I have used block-averaging to reduce features.

Block-averaging reduces samples, not features...?

> Because I have two stimulus categories, the chance-level accuracy
> would thus be 0.5

Yes, unless samples are imbalanced across labels/chunks, in which case
the classifier might go for the overrepresented class.
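
This is easy to double-check directly on the sample attributes (a
sketch; 2.x calls them 'targets', 0.4.x 'labels'):

    import numpy as np

    # per-label sample counts -- skewed counts would favor one class
    labels, counts = np.unique(ds.sa.targets, return_counts=True)
    print(dict(zip(labels, counts)))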

> correctly classified) So this would mean that there is predictive
> information in all regions of the brain..

Well -- more precisely, "every voxel seems to find a relevant
diagnostic neighbor within a 10 mm radius", so a given voxel does not
necessarily carry predictive information itself.

> The highest peaks are located at the borders of the brain.

Was the data motion-corrected? Was motion correlated with the design?
(What accuracy would you obtain by using motion-correction
parameters/characteristics, such as displacement, as your features?)
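
That control analysis could look roughly like this (a sketch;
'rp_subject01.txt' is just a hypothetical SPM realignment-parameter
file, and the motion samples would have to be aligned/averaged the
same way as the fMRI samples):

    import numpy as np
    from mvpa2.suite import *

    # hypothetical: six realignment parameters per (block-averaged) sample
    motion = np.loadtxt('rp_subject01.txt')

    # classify motion alone, reusing the targets/chunks of the fMRI data
    ds_motion = dataset_wizard(samples=motion,
                               targets=ds.sa.targets,
                               chunks=ds.sa.chunks)
    cv = CrossValidation(LinearCSVMC(), NFoldPartitioner())
    err = cv(ds_motion)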

-- 
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic


