[pymvpa] SMLR needs fixing?
Yaroslav Halchenko
debian at onerussian.com
Wed Jun 4 15:19:59 UTC 2008
SMLR confuses me a bit... it seems that classification (generalization, actually, though I don't think this would be a problem anywhere outside of SMLR, although it could be) depends heavily on the order of the labels. Sorry for being cryptic -- in a dataset with 5 labels, which I code as numbers from 0 to 4, here are two outcomes, where in the second one labels 0 and 4 are interchanged (as you can see from the total number of samples for that category in the P row); a small sketch of the label swap follows the two matrices.
----------.
predictions\targets
            0     1     2     3     4    P'    N'    FP    FN   PPV   NPV   TPR   SPC   FDR   MCC
      0   105     0     0     0     1   106   303     1     0  0.99     1     1     1  0.01  0.95
      1     0    82     0     0    22   104   328    22     2  0.79  0.99  0.98  0.94  0.21  0.83
      2     0     0    84     0     4    88   324     4     0  0.95     1     1  0.99  0.05  0.93
      3     0     0     0    84     4    88   324     4     0  0.95     1     1  0.99  0.05  0.93
      4     0     2     0     0    53    55   386     2    31  0.96  0.92  0.63  0.99  0.04  0.74
Per target:
      P   105    84    84    84    84
      N   336   357   357   357   357
     TP   105    82    84    84    53
     TN   303   326   324   324   355
SUMMARY:
  ACC        0.93
  ACC%      92.52
  # of sets     4
----------.
predictions\targets
            0     1     2     3     4    P'    N'    FP    FN   PPV   NPV   TPR   SPC   FDR   MCC
      0    84     1     0     0    28   113   256    29     0  0.74     1     1   0.9  0.26  0.73
      1     0    83     0     0    23   106   258    23     1  0.78     1  0.99  0.92  0.22  0.74
      2     0     0    84     0    21   105   256    21     0   0.8     1     1  0.92   0.2  0.76
      3     0     0     0    84    28   112   256    28     0  0.75     1     1   0.9  0.25  0.73
      4     0     0     0     0     5     5   435     0   100     1  0.77  0.05     1     0  0.19
Per target:
      P    84    84    84    84   105
      N   357   357   357   357   336
     TP    84    83    84    84     5
     TN   256   257   256   256   335
SUMMARY:
  ACC        0.77
  ACC%       77.1
  # of sets     4
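
(To make the "labels 0 and 4 interchanged" part concrete, here is a small numpy sketch of the relabeling -- the helper name and the generic fit/predict calls are only illustrative, not the actual PyMVPA code I ran:)

  import numpy as np

  def swap_labels(labels, a=0, b=4):
      """Return a copy of labels with numeric codes a and b interchanged."""
      labels = np.asarray(labels).copy()
      was_a = labels == a
      was_b = labels == b
      labels[was_a] = b
      labels[was_b] = a
      return labels

  # clf stands for any classifier with a generic fit/predict interface;
  # the actual runs used SMLR inside PyMVPA's cross-validation, so these
  # calls are illustrative only:
  #
  #   clf.fit(train_data, train_labels)
  #   acc_orig = np.mean(clf.predict(test_data) == test_labels)
  #
  #   clf.fit(train_data, swap_labels(train_labels))
  #   acc_swapped = np.mean(clf.predict(test_data) == swap_labels(test_labels))
  #
  # Nothing in the classifier should care about the coding, so acc_orig and
  # acc_swapped should match up to noise; above they differ by ~15%.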
And if fit_all_weights == True, then the results are consistent and much nicer:
----------.
predictions\targets
            0     1     2     3     4    P'    N'    FP    FN   PPV   NPV   TPR   SPC   FDR   MCC
      0   105     0     0     0     0   105   334     0     0     1     1     1     1     0     1
      1     0    83     0     0     0    83   357     0     1     1     1  0.99     1     0  0.99
      2     0     0    84     1     0    85   355     1     0  0.99     1     1     1  0.01  0.99
      3     0     0     0    83     0    83   357     0     1     1     1  0.99     1     0  0.99
      4     0     1     0     0    84    85   355     1     0  0.99     1     1     1  0.01  0.99
Per target:
      P   105    84    84    84    84
      N   336   357   357   357   357
     TP   105    83    84    83    84
     TN   334   356   355   356   355
SUMMARY:
  ACC           1
  ACC%      99.55
  # of sets     4
----------.
predictions\targets
            0     1     2     3     4    P'    N'    FP    FN   PPV   NPV   TPR   SPC   FDR   MCC
      0    84     1     0     0     0    85   355     1     0  0.99     1     1     1  0.01  0.99
      1     0    83     0     0     0    83   357     0     1     1     1  0.99     1     0  0.99
      2     0     0    84     1     0    85   355     1     0  0.99     1     1     1  0.01  0.99
      3     0     0     0    83     0    83   357     0     1     1     1  0.99     1     0  0.99
      4     0     0     0     0   105   105   334     0     0     1     1     1     1     0     1
Per target:
      P    84    84    84    84   105
      N   357   357   357   357   336
     TP    84    83    84    83   105
     TN   355   356   355   356   334
SUMMARY:
  ACC           1
  ACC%      99.55
  # of sets     4
I just wanted to check whether anyone has observed something like that with SMLR before (before I dive into figuring out what is going on), or maybe Per has a clue right away, because it seems to be just a small issue in computing the probability for 'the other label'..?
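
In case it helps to see what I mean by 'the other label': my (possibly wrong) reading is that with fit_all_weights == False SMLR fits weights for nclasses-1 labels and treats the last one as an implicit reference class, whereas fit_all_weights == True gives every label its own weights. A plain numpy sketch of those two prediction rules (made-up function names, not the actual SMLR code):

  import numpy as np

  def probs_with_reference_class(X, W):
      """Multinomial-logit probabilities when only K-1 weight vectors are fit.

      W has shape (nfeatures, nclasses-1); the last class gets an implicit
      all-zero weight vector, so its unnormalized score is exp(0) == 1.
      """
      scores = np.exp(np.dot(X, W))                    # (nsamples, K-1)
      denom = 1.0 + scores.sum(axis=1)[:, None]        # +1 for the reference class
      return np.hstack((scores / denom, 1.0 / denom))  # (nsamples, K)

  def probs_all_weights(X, W_full):
      """Multinomial-logit probabilities when every class has its own weights
      (the fit_all_weights == True situation)."""
      scores = np.exp(np.dot(X, W_full))               # (nsamples, K)
      return scores / scores.sum(axis=1)[:, None]

  # Ignoring the sparsity prior, the two parameterizations are equivalent
  # (shifting every weight vector by the same amount leaves the probabilities
  # unchanged), so predictions should not depend on which label happens to be
  # coded last.  If the reference-class term (the 1.0/denom column) were
  # mishandled, the label coded last would be systematically under- or
  # over-predicted, which would match the pattern above where label 4 is the
  # one that suffers.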
--
Yaroslav Halchenko
Research Assistant, Psychology Department, Rutgers-Newark
Student Ph.D. @ CS Dept. NJIT
Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
WWW: http://www.linkedin.com/in/yarik