[pymvpa] [mvpa-toolbox] Re: revisiting the issue of below-chance classification performance

Yaroslav Halchenko yarikoptic at gmail.com
Thu Apr 18 14:26:48 UTC 2013


Hi Hunar,

NB cross-posting to pymvpa mailing list as well, since it would be of
   interest there too, and I doubt we have a complete overlap between
   audiences

Thanks a bunch for attacking this issue and bringing the discussion back.
Unfortunately I have no conclusive answer myself to this phenomenon in the
analysis of fMRI -- I have tried a few ideas, but they did not seem
representative, at least on simulated data...  Since you have already
explored so much -- would you mind trying more? ;-)

E.g., for the dataset(s) where you have already observed consistent
anti-learning -- you mentioned that you have tried different classifiers,
but have you tried fundamentally different classification schemes?
Non-linear classifiers, for instance, can learn a combinatorial coding
instead of the simple linear separation you have explored so far.  The most
obvious one that comes to mind is an SVM with an RBF kernel.  You mentioned
a study with 70 samples per class; since training an RBF SVM requires
tuning hyper-parameters, it is better to have a "larger" training set so
that nested cross-validation can be used for that purpose (rough sketch
below).
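
Not PyMVPA code, but a minimal scikit-learn sketch just to make the scheme
concrete -- X, y and runs are placeholders for your samples-by-voxels
matrix, condition labels and run labels, and the parameter grid is
arbitrary:

    # Nested cross-validation with an RBF-kernel SVM: the inner loop tunes
    # C and gamma, the outer leave-one-run-out loop estimates accuracy.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import (GridSearchCV, LeaveOneGroupOut,
                                         cross_val_score)

    X = np.random.randn(140, 1000)     # placeholder: 2 x 70 samples, 1000 voxels
    y = np.repeat([0, 1], 70)          # placeholder condition labels
    runs = np.tile(np.arange(10), 14)  # placeholder run labels

    inner = GridSearchCV(SVC(kernel='rbf'),
                         param_grid={'C': [0.1, 1, 10, 100],
                                     'gamma': [1e-4, 1e-3, 1e-2]},
                         cv=5)
    accs = cross_val_score(inner, X, y, groups=runs, cv=LeaveOneGroupOut())
    # on real data: consistently below 0.5 would be the anti-learning pattern
    print(accs.mean())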
If you provide me with your dataset (e.g. in a .mat file, with a brief
README on what is what -- data, labels, run labels), I could try
re-analyzing it with PyMVPA and different classification schemes to see
whether the results still come out as anti-learning.
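
For the export, something along these lines would do (a sketch; the
variable and file names are just placeholders, reusing X, y, runs from the
sketch above):

    # Dump samples, labels and run labels into a single .mat file;
    # the README only needs to say which array is which.
    from scipy.io import savemat
    savemat('dataset_for_pymvpa.mat',
            {'data': X,        # n_samples x n_voxels
             'labels': y,      # condition label per sample
             'runs': runs})    # run/chunk label per sample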

On Thu, 18 Apr 2013, Hunar Ahmad wrote:

>    Thanks a lot for the reply, Rawi,

>    But anti-learning is a real phenomenon and there are many papers treating
>    it as a problem distinct from over-fitting!
>    If I change the labels, the below-chance performance becomes above-chance,
>    and if I add noise to the data the below-chance performance attenuates to
>    chance level.  Unfortunately not many people encounter this problem, as it
>    is very unusual and it mostly occurs in data with high dimensionality and
>    low sample numbers!

>    Regards
>    Hunar

>    On Thu, Apr 18, 2013 at 10:51 AM, MS Al-Rawi <[1]rawi707 at yahoo.com> wrote:

>      Since below-chance accuracy occurs in permutation testing experiments
>      with nearly the same probability as above-chance accuracy, this implies
>      that below chance and above chance could both be stochastic.  I would
>      say (and there is a high possibility that I could be wrong) that if
>      only one subject out of 10 is giving below-chance accuracy, then this
>      could be due to different sources of noise/degradation (e.g., scanner
>      noise, head motion during acquisition, the subject's attention, etc.).
>      In this case, maybe you can overlook that subject and talk about the
>      effect you are targeting in your study.  If, on the other hand, the
>      number of below-chance subjects is close to the number of above-chance
>      subjects, then there is no correlation between the fMRI and the stimuli.
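
FWIW, that symmetry is easy to see by simply cross-validating pure noise --
a small, self-contained scikit-learn sketch, not tied to any real dataset:

    # With label-free noise, cross-validated accuracy lands below chance
    # about as often as above it: the null distribution is centred on 0.5.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.RandomState(0)
    accs = []
    for _ in range(200):
        X_noise = rng.randn(40, 500)        # 40 noise samples, 500 "voxels"
        y_noise = np.repeat([0, 1], 20)
        accs.append(cross_val_score(SVC(kernel='linear'),
                                    X_noise, y_noise, cv=5).mean())
    accs = np.array(accs)
    print((accs < 0.5).mean(), (accs > 0.5).mean())   # roughly equal fractions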

>      Regards,
>      -Rawi

>      One anti-learning nonbeliever

>      >________________________________
>      > From: Hunar Ahmad <[2]hunar4321 at gmail.com>
>      >To: [3]mvpa-toolbox at googlegroups.com
>      >Sent: Wednesday, April 17, 2013 7:40 PM
>      >Subject: [mvpa-toolbox] Re: revisiting the issue of below-chance
>      classification performance



>      >Hi Jesse,

>      >Thanks for this post, even though it's a long time ago! But now I'm
>      running into the exact problem you have described!
>      >I'm wondering if you or anybody from this group has found a solution
>      or a theoretical interpretation of why this is happening.  I really
>      >don't know what to do with the subjects which have clearly below-chance
>      classification performance (30-40%, where chance level is 50%) while
>      >most of the other subjects have classification performance above 60%.
>      Should I discard the below-chance subjects from the analysis
>      >or consider them as low-performing subjects?  I'm comparing two groups,
>      older and younger adults, so being unbiased is really important for the
>      study!!
>      >Any help or suggestion is really appreciated...
>      >Thanks a lot in advance

>      >Hunar


>      >On Thursday, September 17, 2009 9:46:35 PM UTC+1, Jesse Rissman wrote:
>      >I know that I've brought up this issue before on the mailing list
>      ([4]http://groups.google.com/group/mvpa-toolbox/browse_thread/thread/3b16fd93f001adc0/986053f4ef0d4e1d?pli=1),
>      but this is still bothering me, so I want to bring it up again to see if
>      anyone has any insights.


>      >>Over the past two years I've literally run thousands of different
>      classification analyses on the fMRI data from dozens of subjects.  Most
>      of the time classification performance is well above chance,
>      occasionally it hovers at or near chance (for cognitive states that are
>      challenging to differentiate), but once in a rare while performance is
>      well below chance.  That is to say that the classifier reliably guesses
>      that the test examples come from the opposite class to the one they
>      actually belong to (typically yielding accuracy levels of 38-44%
>      correct).  Moreover, the more strongly the classifier believes an
>      example is from Class A (as measured by the scalar probability values
>      of the output nodes), the more likely it is to be from Class B.  When
>      this happens, it happens regardless of the classification algorithm I
>      use (regularized logistic regression or support vector machines), and
>      whether I use aggressive feature selection (1000 voxels) or no feature
>      selection at all (23,000 voxels).  The numbers of Class A and Class B
>      examples in my training set are always balanced, as are those in my
>      testing set, and the classifier does not develop any bias to guess one
>      class more than the other (i.e., the classifier guesses Class A and
>      Class B roughly an equal amount; the guesses just tend to be wrong more
>      often than not).  I use a leave-one-run-out cross-validation approach
>      with 10 runs, and performance is below chance on most of the test
>      iterations (and always below chance overall), so this doesn't seem to
>      be a fluke of scanner drift, cumulative subject motion, etc.  I see
>      below-chance performance most commonly when there are a small number
>      of training examples of each class (e.g., 20), but I've also observed
>      this when I have 70 examples of each class.  Importantly, when I
>      scramble the Class A and Class B labels prior to running the
>      classifier, performance settles at chance levels -- again indicating
>      that this isn't something wrong with my data or my analysis code.
>      From informal conversations with mvpa users from several labs, I know
>      that others have also encountered below-chance classification
>      performance in their data.  Below-chance performance is so frustrating
>      because it means that the classifier is actually able to extract
>      meaningful information about the neural signatures of the two classes
>      -- it just somehow learns precisely the opposite labeling from the one
>      it should.  Very puzzling...
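
As a side note, the leave-one-run-out scheme together with the
scrambled-labels control described here maps directly onto a
label-permutation test; a minimal scikit-learn sketch of that check, again
with X, y and runs as placeholder data, labels and run labels:

    # Leave-one-run-out cross-validation plus a label-permutation null:
    # if the true-label score sits in the lower tail of the permutation
    # distribution, that is the anti-learning pattern rather than noise.
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import LeaveOneGroupOut, permutation_test_score

    score, perm_scores, pval = permutation_test_score(
        LogisticRegression(C=1.0, max_iter=1000),   # regularized logistic regression
        X, y, groups=runs, cv=LeaveOneGroupOut(), n_permutations=500)
    print(score, perm_scores.mean())                # e.g. ~0.40 vs ~0.50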


>      >>Last time I brought this up on the mvpa list, Yaroslav Halchenko sent
>      me a link to an interesting article by Adam Kowalczyk that discusses the
>      "anti-learning" phenomenon in supervised machine learning:


>      >>Here's a link to the paper, for those who are interested:

>      >>>A. Kowalczyk and O. Chapelle.  An Analysis of the Anti-Learning
>      Phenomenon for the Class Symmetric Polyhedron.  In Sanjay Jain, Hans
>      Ulrich Simon, and Etsuji Tomita, eds., Algorithmic Learning Theory:
>      16th International Conference, ALT 2005, Springer, 2005.

>      >>>[5]http://kyb.mpg.de/publications/attachments/alt_05_%5B0%5D.pdf

>      >>>And a link to a video of a lecture by Dr. Kowalczyk about this issue:

>      >>>"In the talk we shall analyze and theoretically explain some
>      counter-intuitive experimental and theoretical findings that systematic
>      reversal of classifier decisions can occur when switching from training
>      to independent test data (the phenomenon of anti-learning). We
>      demonstrate this on both natural and synthetic data and show that it is
>      distinct from overfitting."

>      >>>[6]http://videolectures.net/mlss06au_kowalczyk_al/




>      >>While Dr. Kowalczyk's work is intriguing, indicating that below-chance
>      classification performance is a real phenomenon in need of further
>      study, I'm throwing this issue out to the list again because I still
>      can't conceptually wrap my mind around what below-chance classification
>      means, why it happens, and what one might do about it.  Any ideas or
>      anecdotes from your own analysis experiences would be greatly welcome.


>      >>Thanks,
>      >>Jesse


>      >>-------------------------------------------------
>      >>Jesse Rissman, Ph.D.
>      >>Dept. of Psychology
>      >>Stanford University
>      >>Jordan Hall, Bldg 420
>      >>Stanford, CA 94305-2130
>      >>(o) 650-724-9515
>      >>(f)  650-725-5699
>      >>[7]http://www.stanford.edu/~rissmanj
-- 
Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Senior Research Associate,     Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        


