[pymvpa] variable chance levels and searchlights

Mike E. Klein michaeleklein at gmail.com
Thu May 8 03:09:40 UTC 2014


Hi all,

I’m running some searchlights where (due to task mis-performance and the
need to throw out some volumes) the categories are unbalanced: the exact
number of examples per target differs from subject to subject (and from
chunk to chunk within subjects!). There aren’t -too- many of these bad
volumes, so the split is generally still close to 50%:50%
category1:category2 (or 67/33 for a separate analysis).

My goal, as before, is to create for each subject an information map where
each voxel’s value represents an accuracy-above-chance value for a local
sphere. I plan on using these volumes in a top-level analysis in SPM.

I’m wondering:

1. Whether it’s a problem to have slight category-size mismatches between
subjects and between targets for any given subject. If so, I guess I could
artificially prune some of the “good” volumes from the categories that have
more examples. However, I’m worried about making my dataset too small.

2. If there is a fairly simple and logical way to calculate the
chance-level accuracy rate in a script.

At the moment I use something like this to convert the
searchlight-generated error numbers into accuracy maps:

# [10] convert into percent-wise accuracies
s1_map.samples *= -1   # error -> -error
s1_map.samples += 1    # -error -> accuracy (1 - error)
s1_map.samples *= 100  # proportion -> percent
# subtract chance: 50 for 2-category classification with equal numbers of
# examples in each category
s1_map.samples -= 50

Ideally, I’d like to automate whatever replaces the “50”. That said, I’m
not sure if the number would always be obvious. (e.g. if I have 100 of one
target and 50 of another, would the theoretical chance accuracy rate really
be 66.7%? Or some value lower than this?)
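
To make the question concrete, here is a minimal sketch of two candidate
baselines computed from a list of targets. It is plain Python, not PyMVPA
API, and the function names are made up for illustration. The first is the
accuracy of always guessing the most frequent target; the second is the
expected accuracy of guessing each target with probability equal to its
frequency (the sum of squared proportions). Which of these (if either) is
the right value to subtract is exactly what I'm unsure about.

```python
from collections import Counter


def majority_chance(targets):
    """Percent accuracy of always guessing the most frequent target."""
    counts = Counter(targets)
    return 100.0 * max(counts.values()) / float(len(targets))


def proportional_chance(targets):
    """Percent accuracy expected when each target is guessed at random
    with probability equal to its observed frequency: sum of p_i ** 2."""
    n = float(len(targets))
    return 100.0 * sum((c / n) ** 2 for c in Counter(targets).values())


def accuracy_above_chance(errors, chance_percent):
    """Convert error rates (0..1) into percent accuracy minus chance."""
    return [(1.0 - e) * 100.0 - chance_percent for e in errors]


# Hypothetical example: 100 examples of one target, 50 of the other.
targets = ["cat1"] * 100 + ["cat2"] * 50
print(majority_chance(targets))
print(proportional_chance(targets))
```

For the 100-vs-50 case the two baselines come out at about 66.7% and 55.6%
respectively, and both reduce to 50% when the categories are balanced. Note
that with cross-validation the proportions in each test fold can differ
from the overall proportions.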

*Completely separately*, I was wondering whether it is possible to add a
third column to the chunks/targets text file and have it taken into
account. One of my potential analyses involves training and testing on
completely separate data (i.e. not LOOCV). I was, at first, thinking that I
could hijack the “chunks” column for this purpose, giving dataA odd values
and dataB even values, and then using an oddevensplitter. However, I still
need the -real- chunks for linear detrending, etc.
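
As a rough sketch of what I have in mind (plain Python, not PyMVPA API; the
three-column layout, the "A"/"B" labels and the function names are all
assumptions for illustration), the extra column would mark which samples
are for training and which for testing, while the real chunks stay
untouched:

```python
def read_attributes(lines):
    """Parse 'target chunk partition' rows into three parallel lists."""
    targets, chunks, partitions = [], [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        target, chunk, partition = line.split()
        targets.append(target)
        chunks.append(int(chunk))      # real chunks, kept for detrending
        partitions.append(partition)   # hypothetical train/test label
    return targets, chunks, partitions


def split_indices(partitions, train_label="A", test_label="B"):
    """Return sample indices for the training and testing sets."""
    train = [i for i, p in enumerate(partitions) if p == train_label]
    test = [i for i, p in enumerate(partitions) if p == test_label]
    return train, test


# Hypothetical file contents: target, chunk, partition
example = ["cat1 0 A", "cat2 0 A", "cat1 1 B", "cat2 1 B"]
targets, chunks, partitions = read_attributes(example)
train_idx, test_idx = split_indices(partitions)
```

Here train_idx and test_idx would select dataA and dataB, and chunks would
still hold the original run labels.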

Thanks for your help!
Mike Klein