[pymvpa] SMLR needs fixing?
Per B. Sederberg
persed at princeton.edu
Wed Jun 4 15:27:52 UTC 2008
Fascinating!
Is it the case that you have a different number of samples for each
label, and specifically for the two labels you are switching? If so,
then the models being fit are actually different for the two label
orderings, and I would expect the differences you see.
Perhaps the safe thing to do whenever the number of samples differs
across labels is to just train the full model. Although it is slightly
slower, the weights you get from training the full model are actually
meaningful. They are not meaningful for a multiclass classification
if you train the N-1 model (weights for all but one label).
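
A minimal sketch of why the choice of the left-out label can matter. It
assumes (this is a reading of the problem, not something confirmed in
this thread) that the N-1 model pins one label's weights to zero and
that the sparsity penalty acts like an L1 norm on the weights. The
weights, the sample x, and all function names below are made up for
illustration; none of this is PyMVPA code.

```python
import math

def softmax(scores):
    """Convert a list of class scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(weights, x):
    """Dot product of each class's weight vector with the sample x."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]

def pin_reference(weights, ref):
    """Re-express a full weight set so that class `ref` has zero weights."""
    w_ref = weights[ref]
    return [[a - b for a, b in zip(w, w_ref)] for w in weights]

def l1_norm(weights):
    """Sum of absolute weight values (the assumed sparsity penalty)."""
    return sum(abs(v) for w in weights for v in w)

# Hypothetical 3-class, 2-feature weights and one sample.
full = [[2.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
x = [0.5, -1.0]

pinned_last = pin_reference(full, ref=2)   # last label is the reference
pinned_first = pin_reference(full, ref=0)  # first label is the reference

# All three parametrizations give the same class probabilities ...
p_full = softmax(class_scores(full, x))
p_last = softmax(class_scores(pinned_last, x))
p_first = softmax(class_scores(pinned_first, x))

# ... but the penalty they pay differs with the reference label.
print(l1_norm(pinned_last), l1_norm(pinned_first))
```

Without a penalty the two N-1 forms describe the same classifier. With
a sparsity penalty they are penalized differently, so the fitted
solution can depend on which label ends up as the reference, and hence
on label order.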
How about we set the default to be to always train the full model?
Best,
Per
On Wed, Jun 4, 2008 at 11:19 AM, Yaroslav Halchenko
<debian at onerussian.com> wrote:
> SMLR confuses me a bit... it seems that classification (generalization, actually,
> but I don't think it is a problem anywhere outside of SMLR, although it could
> be) is heavily dependent on the order of labels... sorry for being cryptic --
> in a dataset with 5 labels, which I code as numbers from 0 to 4, here are 2
> outcomes; in the 2nd one, labels 0 and 4 are interchanged (as you can see from
> the total number of samples for that category in P)
>
> ----------.
> predictions\targets     0     1     2     3     4
>             `------ ----- ----- ----- ----- -----   P'   N'   FP   FN   PPV   NPV   TPR   SPC   FDR   MCC
>                   0   105     0     0     0     1  106  303    1    0  0.99     1     1     1  0.01  0.95
>                   1     0    82     0     0    22  104  328   22    2  0.79  0.99  0.98  0.94  0.21  0.83
>                   2     0     0    84     0     4   88  324    4    0  0.95     1     1  0.99  0.05  0.93
>                   3     0     0     0    84     4   88  324    4    0  0.95     1     1  0.99  0.05  0.93
>                   4     0     2     0     0    53   55  386    2   31  0.96  0.92  0.63  0.99  0.04  0.74
> Per target:         ----- ----- ----- ----- -----
>                   P   105    84    84    84    84
>                   N   336   357   357   357   357
>                  TP   105    82    84    84    53
>                  TN   303   326   324   324   355
> SUMMARY:            ----- ----- ----- ----- -----
>                 ACC  0.93
>                ACC% 92.52
>           # of sets     4
>
> ----------.
> predictions\targets     0     1     2     3     4
>             `------ ----- ----- ----- ----- -----   P'   N'   FP   FN   PPV   NPV   TPR   SPC   FDR   MCC
>                   0    84     1     0     0    28  113  256   29    0  0.74     1     1   0.9  0.26  0.73
>                   1     0    83     0     0    23  106  258   23    1  0.78     1  0.99  0.92  0.22  0.74
>                   2     0     0    84     0    21  105  256   21    0   0.8     1     1  0.92   0.2  0.76
>                   3     0     0     0    84    28  112  256   28    0  0.75     1     1   0.9  0.25  0.73
>                   4     0     0     0     0     5    5  435    0  100     1  0.77  0.05     1     0  0.19
> Per target:         ----- ----- ----- ----- -----
>                   P    84    84    84    84   105
>                   N   357   357   357   357   336
>                  TP    84    83    84    84     5
>                  TN   256   257   256   256   335
> SUMMARY:            ----- ----- ----- ----- -----
>                 ACC  0.77
>                ACC%  77.1
>           # of sets     4
>
> and if fit_all_weights == True, then results are consistent and much nicer:
>
> ----------.
> predictions\targets     0     1     2     3     4
>             `------ ----- ----- ----- ----- -----   P'   N'   FP   FN   PPV   NPV   TPR   SPC   FDR   MCC
>                   0   105     0     0     0     0  105  334    0    0     1     1     1     1     0     1
>                   1     0    83     0     0     0   83  357    0    1     1     1  0.99     1     0  0.99
>                   2     0     0    84     1     0   85  355    1    0  0.99     1     1     1  0.01  0.99
>                   3     0     0     0    83     0   83  357    0    1     1     1  0.99     1     0  0.99
>                   4     0     1     0     0    84   85  355    1    0  0.99     1     1     1  0.01  0.99
> Per target:         ----- ----- ----- ----- -----
>                   P   105    84    84    84    84
>                   N   336   357   357   357   357
>                  TP   105    83    84    83    84
>                  TN   334   356   355   356   355
> SUMMARY:            ----- ----- ----- ----- -----
>                 ACC     1
>                ACC% 99.55
>           # of sets     4
>
> ----------.
> predictions\targets     0     1     2     3     4
>             `------ ----- ----- ----- ----- -----   P'   N'   FP   FN   PPV   NPV   TPR   SPC   FDR   MCC
>                   0    84     1     0     0     0   85  355    1    0  0.99     1     1     1  0.01  0.99
>                   1     0    83     0     0     0   83  357    0    1     1     1  0.99     1     0  0.99
>                   2     0     0    84     1     0   85  355    1    0  0.99     1     1     1  0.01  0.99
>                   3     0     0     0    83     0   83  357    0    1     1     1  0.99     1     0  0.99
>                   4     0     0     0     0   105  105  334    0    0     1     1     1     1     0     1
> Per target:         ----- ----- ----- ----- -----
>                   P    84    84    84    84   105
>                   N   357   357   357   357   336
>                  TP    84    83    84    83   105
>                  TN   355   356   355   356   334
> SUMMARY:            ----- ----- ----- ----- -----
>                 ACC     1
>                ACC% 99.55
>           # of sets     4
>
>
>
> I just wanted to check if anyone has observed something like that before with SMLR
> (before I dive into figuring out wtf), or maybe Per has a clue right away,
> because it seems to be just a little issue in computing the probability for 'the
> other label'..?
>
> --
> Yaroslav Halchenko
> Research Assistant, Psychology Department, Rutgers-Newark
> Student Ph.D. @ CS Dept. NJIT
> Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
> 101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
> WWW: http://www.linkedin.com/in/yarik
>
> _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
> http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa
>