[pymvpa] bug reporting

tg at cs.york.ac.uk
Tue Sep 1 17:25:27 UTC 2009


> if you are using Debian, then you are welcome to use
> reportbug python-mvpa
> but in general -- just complain to the mailing list -- we haven't had
> much need for a full-featured bug tracker yet (but things might change
> ;))

OK. I'm running Mint and don't seem to have reportbug installed, so I'll
just stick with posting to the list.
The problem I ran into is that using kNN with 'majority' voting causes a
crash. Adapting the dataset example from the docs, the following code
reproduces it:

import numpy as N
from mvpa.datasets import Dataset
from mvpa.clfs.knn import kNN

# 10 random samples with 5 features each, all given the label 1
data = Dataset(samples=N.random.normal(size=(10,5)), labels=1)
# a single random test sample
test = N.random.normal(size=(1,5))

k = kNN(k=1, voting='majority')  # other values of k cause the same result
k.train(data)
k.predict(test)  # this call raises the AttributeError

This gives me the following stack trace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/var/lib/python-support/python2.5/mvpa/clfs/base.py", line 436, in predict
    result = self._predict(data)
  File "/var/lib/python-support/python2.5/mvpa/clfs/knn.py", line 148, in _predict
    results = [vfx(knn) for knn in knns]
  File "/var/lib/python-support/python2.5/mvpa/clfs/knn.py", line 173, in getMajorityVote
    votes[self.__labels[nn]] += 1
  File "/var/lib/python-support/python2.5/mvpa/misc/state.py", line 1088, in __getattribute__
    return _object_getattribute(self, index)
AttributeError: 'kNN' object has no attribute '_kNN__labels'
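
As far as I can tell, the '_kNN__labels' in the error is just Python's name
mangling: inside a class body, an attribute written as self.__labels is
looked up as self._<ClassName>__labels, so the error means nothing ever
assigned self.__labels on the kNN object. A minimal sketch of the same
effect, using a made-up class called Example (nothing to do with the PyMVPA
source):

```python
class Example(object):
    """Hypothetical class, only here to show name mangling."""

    def __init__(self):
        # Inside the class body this is stored as _Example__data.
        self.__data = [1, 2, 3]

    def read_existing(self):
        # Looked up as self._Example__data, which was set in __init__.
        return self.__data

    def read_missing(self):
        # Looked up as self._Example__labels, which was never assigned.
        return self.__labels


obj = Example()
existing = obj.read_existing()

try:
    obj.read_missing()
    error_message = None
except AttributeError as err:
    # The message names the mangled attribute, as in the traceback above.
    error_message = str(err)
```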


I opened that file and, guessing, made the following change, which means it
now runs for me:

160c160
<             votes[self.__labels[nn]] += 1
---
>             votes[self.__data.labels[nn]] += 1

The thing is, the classification rate I get with this change on my test set
is only ~1%, whereas kNN with weighted voting gets over 70%. The discrepancy
seems odd (especially as I have the same number of samples for each
class), which makes me wonder whether I've 'corrected' this to the wrong
thing!
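
For comparison, here is a minimal standalone sketch of what I would expect
majority voting to do. It assumes plain Euclidean distance and breaks ties by
whichever label was counted first; the function names (majority_vote,
knn_predict) and the toy data are my own, and this is not the PyMVPA
implementation:

```python
import numpy as np


def majority_vote(train_labels, neighbour_idx):
    """Return the most common label among the given neighbours.

    train_labels  : 1-D array of labels, one per training sample
    neighbour_idx : indices (into train_labels) of the k nearest neighbours
    """
    votes = {}
    for nn in neighbour_idx:
        label = train_labels[nn]
        votes[label] = votes.get(label, 0) + 1
    # No special tie handling: max() keeps the first label with the top count.
    return max(votes, key=votes.get)


def knn_predict(train_samples, train_labels, test_samples, k=1):
    """Euclidean kNN with majority voting (illustration only)."""
    predictions = []
    for x in test_samples:
        # Distance from this test sample to every training sample
        dists = np.sqrt(((train_samples - x) ** 2).sum(axis=1))
        # Indices of the k closest training samples
        nearest = np.argsort(dists)[:k]
        predictions.append(majority_vote(train_labels, nearest))
    return np.array(predictions)


# Two well-separated clusters, made up for the example
train_samples = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                          [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
train_labels = np.array([0, 0, 0, 1, 1, 1])
test_samples = np.array([[0.05, 0.05], [4.9, 5.2]])

predicted = knn_predict(train_samples, train_labels, test_samples, k=3)
```

The point is that the votes are indexed by the label of each neighbour, and
the neighbour indices must refer to the same ordering as the training labels.
If the two get out of step, I'd expect a very low rate like the one I'm seeing.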

I'm running the hardy-backported version, 0.4.2 (rather than the 0.4.0
that I originally emailed about).

> heh -- sorry about that. The reason is simple -- our server (and us) has
> moved to a new location,
...
> 'Show source' which would reveal original ReST text) -- thanks in
> advance!

Np :) I don't get on too well with git, so I'll probably send you source
diffs instead, if that's OK.
I tend to start with examples rather than APIs when getting into something
new. My mistake was that, from the examples under the Datasets section and
the first part of the Dataset class description, I got the impression that
labels could be strings
(in particular, from the dataset.labels[1] += "_bad" text at
http://www.pymvpa.org/modref/mvpa.datasets.base.html#mvpa.datasets.base.Dataset).

This caused several different ValueErrors from the classifiers I tried
(GPR, BLR, RidgeReg); eventually an error from scipy, raised while trying
to convert a class label to a float, twigged things for me. Incidentally,
why *do* the labels get converted to floats? It seems counter-intuitive to
return things of a different type (floats instead of ints) when they are
just label markers.
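
In the meantime, the workaround I have in mind is to map string labels to
integer codes before building the dataset, and map predictions back
afterwards. A minimal sketch with plain numpy, using made-up label names; it
assumes the classifiers only need numeric labels, which I haven't checked in
the source:

```python
import numpy as np

# Made-up string labels, one per sample
string_labels = np.array(['face', 'house', 'face', 'scrambled', 'house'])

# names: sorted unique labels; codes: index into names for every sample
names, codes = np.unique(string_labels, return_inverse=True)

# codes can be used as the numeric labels; to get the strings back
# (e.g. from predicted codes), index into names again
recovered = names[codes]
```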

Anyway, I just wanted to add a note to the docs to mention this in case
anyone else makes the same mistake! I'll sort this out later and post the
results to look over.

Thanks again,
Tara




