[pymvpa] Balancing strategy
J.A. Etzel
jetzel at artsci.wustl.edu
Sat Dec 7 02:26:55 UTC 2013
I'm not sure I fully understand the question, so to summarize: you want to
do leave-one-run-out cross-validation, but you don't want to include all
scans because of poor image quality, and omitting scans causes imbalance.
I think your concern is that you end up with unequal numbers of examples of
each task type in each run after getting rid of the bad images? If the
imbalance isn't too bad (e.g. 10 examples of one class in the run, 8 of the
other), my usual strategy is to subset the larger class (e.g. only using 8
of the 10 examples). Since there are many ways to do the subsetting, I
usually suggest doing 10 different random subsets (e.g. examples
c(1:6,9,10); 2:9) and averaging the results over the subsets. But if the
imbalance is quite bad (e.g. only 1 or 2 examples of a class left in a
run) I sometimes change the partitioning (e.g.
leave-two-sequentially-presented-runs-out) to get the balance a bit closer.
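
As a rough illustration of the two ideas above, here is a minimal sketch in
plain Python (standard library only, not PyMVPA's own API). The function
names balanced_subsets and pair_sequential_runs, the seed argument, and the
toy labels are all hypothetical, and pairing runs into non-overlapping
adjacent folds is only one possible reading of
"leave-two-sequentially-presented-runs-out".

```python
import random
from collections import defaultdict


def balanced_subsets(labels, runs, n_subsets=10, seed=0):
    """Return n_subsets lists of example indices, balanced within each run.

    Hypothetical helper: within every run, each class is randomly cut down
    to the size of the smallest class in that run.
    """
    rng = random.Random(seed)
    # Group example indices by (run, class label).
    groups = defaultdict(list)
    for idx, (run, label) in enumerate(zip(runs, labels)):
        groups[(run, label)].append(idx)
    classes = sorted(set(labels))
    subsets = []
    for _ in range(n_subsets):
        keep = []
        for run in sorted(set(runs)):
            # The smallest class in this run sets the number kept per class.
            # A run that lacks a class entirely contributes no examples.
            n_keep = min(len(groups[(run, c)]) for c in classes)
            for c in classes:
                keep.extend(rng.sample(groups[(run, c)], n_keep))
        subsets.append(sorted(keep))
    return subsets


def pair_sequential_runs(run_ids):
    """Group sorted run ids into adjacent pairs, one pair per test fold.

    Assumption: folds do not overlap; an odd run count leaves the last
    run as a fold on its own.
    """
    ordered = sorted(set(run_ids))
    return [ordered[i:i + 2] for i in range(0, len(ordered), 2)]


# Toy data: run 1 has 10 vs 8 examples, run 2 has 9 vs 9.
labels = ["a"] * 10 + ["b"] * 8 + ["a"] * 9 + ["b"] * 9
runs = [1] * 18 + [2] * 18
subsets = balanced_subsets(labels, runs)
folds = pair_sequential_runs([1, 2, 3, 4, 5])
```

Each entry of subsets would then go through the usual cross-validation, with
the results averaged across the 10 subsets. The entries of folds are the
groups of runs to leave out together.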
Not hard-and-fast rules, but I hope it helps,
Jo