[pymvpa] top-n match lists
Scott Gorlin
gorlins at MIT.EDU
Thu Jan 7 18:28:29 UTC 2010
Let me ask a clarifying question - Tara, did you mean the top 10
PREDICTED CLASSES (how many do you have??) or the top 10 CLOSEST SAMPLES?
For CLASSES, as Yaroslav said, this depends on how the classification is
done (unified, a la SMLR/KNN, versus 1-vs-1 or 1-vs-Rest for SVMs, etc.),
so you need to decide exactly how you want to handle it.
LibSVM-based SVMs do weird things with multiclass values, but you can
use the meta-classifier MulticlassClassifier to easily create a suite of
1-vs-1 (or 1-vs-Rest in more recent development branches) binary machines
and extract each of their values. If you use the Shogun backend, there
are some SVMs which can spit out correct multiclass values in either
1-vs-1 or 1-vs-Rest mode, depending on your installed version. Though
perhaps you're not using SVMs...?
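Something along these lines, as an untested sketch against the 0.4-era
API (the enable_states handling and exactly where the per-pair values
end up may differ in your version, so treat those bits as assumptions
to verify):

from mvpa.clfs.svm import LinearCSVMC
from mvpa.clfs.meta import MulticlassClassifier

# Suite of 1-vs-1 binary machines built from a single SVM template;
# enable_states asks each machine to keep its decision values around
# (assumption: the kwarg is honored on the template and its clones).
mclf = MulticlassClassifier(LinearCSVMC(enable_states=['values']))

mclf.train(ds)                    # ds: your training Dataset
preds = mclf.predict(ds.samples)

# Assumption: mclf.clfs holds the trained binary machines after train();
# pull each one's 'values' and combine them into a class ranking however
# suits you (vote counts, summed margins, ...) -- that's the "decide how
# to handle it" part.
per_pair_values = [c.values for c in mclf.clfs]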
If you meant SAMPLES, then you can simply look at the prediction kernel
matrix for the rows with the highest values - or use a distance function
of your own.
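For the distance-function route, plain NumPy already gets you a top-n
list; top_n_closest below is just a made-up helper name, and Euclidean
distance stands in for whatever metric matches your kernel:

import numpy as np

def top_n_closest(test_samples, train_samples, train_labels, n=10):
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = ((test_samples[:, None, :] - train_samples[None, :, :]) ** 2).sum(-1)
    # Sort each row so the closest training samples come first.
    order = np.argsort(d2, axis=1)
    # Labels of the n nearest training samples per test sample.
    return train_labels[order[:, :n]]

# e.g.: top10 = top_n_closest(test_ds.samples, ds.samples, ds.labels)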
-Scott
Yaroslav Halchenko wrote:
> Hi Tara,
>
> Unfortunately ATM we simply do not have such a feature unified and
> exposed at all, but it sounds like a nice feature to have! Probably
> something like 'predictions_ranked' which, for each prediction, would
> store not only the winner but all labels ordered accordingly.
>
> I have filed the request under
> http://github.com/hanke/PyMVPA/issues#issue/8
>
> Meanwhile you can make use of the 'values' state variable. The
> difficulty might be that the content of values is not unified across
> classifiers, but it is fairly obvious for most (e.g. for SMLR it would
> be the probabilities from the logistic regressions for each label, in
> the order of your ds.uniquelabels, so you would just need to argsort
> each one of them and reverse with [::-1] ;) ). What classifier do you
> have in mind though?
>
>
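Concretely, the 'values' + argsort route Yaroslav describes would look
roughly like this for SMLR (a sketch assuming the 0.4-style enable_states
argument and 'values' content; test_ds is a hypothetical test dataset):

import numpy as np
from mvpa.clfs.smlr import SMLR

clf = SMLR(enable_states=['values'])
clf.train(ds)                     # ds: your training Dataset
clf.predict(test_ds.samples)      # test_ds: hypothetical test dataset

# One row of values per test sample, one column per label in the order
# of ds.uniquelabels; argsort and reverse to rank labels best-first.
ranked = [ds.uniquelabels[np.argsort(v)[::-1]] for v in clf.values]
top10 = [r[:10] for r in ranked]  # top-10 labels per sample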
More information about the Pkg-ExpPsy-PyMVPA
mailing list