[pymvpa] New evil change from me

Yaroslav Halchenko debian at onerussian.com
Fri Sep 5 17:34:55 UTC 2008


Hi All,

I got bored with numerical codes for the labels.  Since I am also a lazy
person ( ;-) ), I decided to solve the problem the easy way: instead of
providing some uniform mapping from literal to numeric values in each of the
classifiers (SMLR and sg.SVM pretty much do it anyway), I implemented it at
the Dataset level, so it has to be done pretty much only once.

My changes are backward compatible, so all existing code should run
without problems (although you never know ;-))

Now, if you would like to have literal labels, you are welcome to use them,
but also instruct Dataset to remap them, either with 'labels_map=True' or
with an explicit mapping such as "labels_map={'candy':1, 'shit':0}"

For example:
Dataset(samples=..., labels=['candy', 'candy', 'shit'], labels_map=True)
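
To illustrate the idea, here is a rough sketch in plain Python (not PyMVPA
code) of what such a remapping amounts to.  build_labels_map and
apply_labels_map are hypothetical helpers, and assigning codes in sorted
label order is an assumption, not necessarily what Dataset does:

```python
def build_labels_map(labels):
    """Assign a numeric code to each distinct literal label.

    Hypothetical helper: codes follow sorted label order, which is an
    assumption made for this sketch only.
    """
    return {label: code for code, label in enumerate(sorted(set(labels)))}


def apply_labels_map(labels, labels_map):
    """Translate every label through the mapping."""
    return [labels_map[label] for label in labels]


labels = ['face', 'house', 'face']
labels_map = build_labels_map(labels)                   # literal -> numeric
numeric_labels = apply_labels_map(labels, labels_map)   # what classifiers see
```

An explicit mapping would simply replace the result of build_labels_map.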

Not to mention that you can also remap numerical and literal labels to group them:

labels_map={0:1, 1:1, 2:1, 3:2}

Then, in the confusion matrix (as below), you will see 0,1,2 as a single
class name for target 1.
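
The grouping can be pictured by inverting the mapping.  This is only a
plain-Python sketch; group_by_target is a hypothetical helper and not part
of PyMVPA:

```python
from collections import defaultdict


def group_by_target(labels_map):
    """Collect the original labels that map onto each target value."""
    groups = defaultdict(list)
    for original, target in labels_map.items():
        groups[target].append(original)
    return dict(groups)


labels_map = {0: 1, 1: 1, 2: 1, 3: 2}
groups = group_by_target(labels_map)
# Original labels 0, 1 and 2 end up together under target 1,
# while label 3 is alone under target 2.
```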

Dataset.select.* functions should work fine and return new Dataset
instances that carry the full mapping.

ConfusionMatrix instances should be given the labels_map argument as well,
so that they can display something like:

    ----------.         bottle   cat   chair  face   house scissors scrambledpix  shoe
    predictions\targets    0      1      2      3      4       6          7         8
                `------  -----  -----  -----  -----  -----   -----      -----     -----  P'  N'  FP  FN  PPV  NPV  TPR  SPC  FDR  MCC
           bottle / 0      1      2      4      5      3       4          8         3    30 227  29 107 0.03 0.53 0.01 0.81 0.97 -0.13
              cat / 1      9      4      5      9      8      12          5         3    55 221  51 104 0.07 0.53 0.04  0.7 0.93 -0.15
            chair / 2     10     11      5      8      9       6         10         8    67 219  62 103 0.07 0.53 0.05 0.65 0.93 -0.17
             face / 3     15     20     14     17     17      13         20        25   141 195 124  91 0.12 0.53 0.16 0.46 0.88  -0.2
            house / 4     17     23     18     19     23      16         15        17   148 183 125  85 0.16 0.54 0.21 0.44 0.84 -0.18
         scissors / 6     11     14     13      7     17      13          6        19   100 203  87  95 0.13 0.53 0.12 0.55 0.87 -0.17
      scrambledpix / 7    30     18     31     22     16      25         38        13   193 153 155  70  0.2 0.54 0.35 0.35  0.8 -0.16
             shoe / 8     15     16     18     21     15      19          6        20   130 189 110  88 0.15 0.53 0.19 0.48 0.85 -0.17
        Per target:      -----  -----  -----  -----  -----   -----      -----     -----
             P            108    108    108    108    108     108        108       108
             N            756    756    756    756    756     756        756       756
             TP            1      4      5     17     23      13         38        20
             TN           120    117    116    104    98      108        83        101
          SUMMARY:       -----  -----  -----  -----  -----   -----      -----     -----
            ACC          0.14
            ACC%          14
         # of sets        12
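
As a quick sanity check on the table above: the ACC value appears
consistent with the sum of the TP row divided by the total number of
samples (the sum of the P row).  The snippet below assumes that this is how
ACC is defined here; the numbers are copied from the table:

```python
# TP and P rows copied from the confusion matrix above, one entry per target.
true_positives = [1, 4, 5, 17, 23, 13, 38, 20]
positives_per_target = [108] * 8

# Assumed definition: accuracy = correctly classified samples / all samples.
# float() keeps the division exact under both Python 2 and Python 3.
accuracy = float(sum(true_positives)) / sum(positives_per_target)
accuracy_percent = int(round(accuracy * 100))
```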

Unless you are creating a confusion matrix manually, you have nothing to
worry about: classifiers and CrossValidatedTransferError already know what
to do (where else do we have confusions?)


-- 
Yaroslav Halchenko
Research Assistant, Psychology Department, Rutgers-Newark
Student  Ph.D. @ CS Dept. NJIT
Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
        101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
WWW:     http://www.linkedin.com/in/yarik        


