[pymvpa] top-n match lists
Tara Gilliam
tg at cs.york.ac.uk
Thu Jan 7 18:35:28 UTC 2010
Hi Scott,
Sorry, just realised I forgot to include the list in my earlier reply! I did
mean the closest classes (I'm working with 25-200 classes, depending on the
dataset) rather than samples.
Thanks for pointing out the prediction matrix though - I did spot this in the
docs but it didn't seem to quite fit what I was looking for at the time. For
kNNs though it might be worth another look.
Thanks,
Tara
Scott Gorlin wrote:
> Let me ask a clarifying question - Tara, did you mean the top 10
> PREDICTED CLASSES (how many do you have??) or the top 10 CLOSEST SAMPLES?
>
> For CLASSES, as Yaroslav said, this will depend on how the
> classification is done (unified, a la SMLR/KNN; 1 vs 1 or 1 vs Rest for
> SVMs; etc.), so you need to decide exactly how you want to handle it.
> LibSVM-based SVMs do weird things with multiclass values, but you can
> use the meta-classifier MulticlassClassifier to easily create a suite of
> 1 vs 1 (or, in more recent development branches, 1 vs Rest) binary machines
> and extract each of their values. If you use the Shogun backend, there
> are some SVMs which can spit out correct multiclass values in either
> 1 vs 1 or 1 vs Rest mode, depending on your installed version. Though
> perhaps you're not using SVMs...?
>
> If you meant SAMPLES, then you can simply look at the prediction kernel
> matrix for the rows with the highest values - or use a distance function
> of your own.
>
> -Scott
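[Editor's note: the SAMPLES approach Scott describes can be sketched without PyMVPA at all. The snippet below assumes you already have one row of similarity scores for a test sample against the training samples (a stand-in for a row of the prediction kernel matrix); the labels and numbers are invented for illustration.]

```python
# Hypothetical similarity scores of one test sample against five
# training samples (a stand-in for one row of the prediction kernel
# matrix); labels and values are made up for illustration.
train_labels = ['A', 'B', 'A', 'C', 'B']
sims = [0.2, 0.9, 0.4, 0.1, 0.7]

# Indices of training samples sorted by similarity, highest first.
order = sorted(range(len(sims)), key=sims.__getitem__, reverse=True)

# Top-3 closest training samples with their labels and scores.
top3 = [(j, train_labels[j], sims[j]) for j in order[:3]]
print(top3)  # [(1, 'B', 0.9), (4, 'B', 0.7), (2, 'A', 0.4)]
```

Swapping in a distance function of your own just means replacing `sims` with your own per-sample scores (and sorting ascending for distances rather than descending for similarities).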
>
> Yaroslav Halchenko wrote:
>> Hi Tara,
>>
>> Unfortunately, at the moment we simply do not have such a feature
>> unified and exposed at all, but it sounds like a nice one to have!
>> Probably something like 'predictions_ranked', which for each prediction
>> would store not only the winning label but all labels ranked accordingly.
>>
>> I have filed the request under
>> http://github.com/hanke/PyMVPA/issues#issue/8
>>
>> Meanwhile you can make use of the 'values' state variable. The difficulty
>> is that the content of 'values' is not unified across the
>> classifiers, but it is fairly obvious for most (e.g. for SMLR it would
>> be the probabilities from the logistic regressions for each label, in the
>> order of your ds.uniquelabels, so you would just need to argsort each one
>> of them and reverse with [::-1] to get descending order ;) ). What
>> classifier do you have in mind though?
>>
>>
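[Editor's note: the argsort-and-reverse trick Yaroslav mentions can be sketched without PyMVPA. The snippet assumes one prediction's per-label values in the order of `ds.uniquelabels`; the labels and numbers below are invented, and plain `sorted` stands in for `numpy.argsort`.]

```python
# Invented per-label values for one prediction, in the order of
# ds.uniquelabels (for SMLR these would be the per-label logistic
# regression probabilities).
unique_labels = ['faces', 'houses', 'scissors', 'shoes']
values = [0.10, 0.55, 0.05, 0.30]

# Argsort ascending (mirrors numpy.argsort), then reverse with [::-1]
# so the highest-valued label comes first.
order = sorted(range(len(values)), key=values.__getitem__)[::-1]
ranked = [unique_labels[i] for i in order]

print(ranked)      # ['houses', 'shoes', 'faces', 'scissors']
print(ranked[:2])  # top-2 predicted classes: ['houses', 'shoes']
```

With numpy installed, the middle step collapses to `numpy.argsort(values)[::-1]`, which is exactly the `[::-1]` idiom from the reply above.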