[pymvpa] TreeClassifier and decision trees
Thorsten Kranz
thorstenkranz at googlemail.com
Mon Feb 14 23:23:10 UTC 2011
Hi all,
I am trying to reproduce rather complicated decision trees with your
TreeClassifier class. I'm still using the 0.4 branch.
If I have 4 labels in my data, the tree I want to use might look like:
     /\
    /  \
   /    \
  3     / \
       1  /\
         2  4
I now have two questions:
1) How do I define the corresponding TreeClassifier when some
branches contain only one label?
TreeClassifier(SVM(), {"g3": ([3], SVM()),
"g6": ([1, 2, 4], TreeClassifier(SVM(),
{"g1": ([1], SVM()), "g5": ([2, 4], SVM())}))})
It looks strange to me to give a classifier for "g3" if there is
only one label in g3. The same goes for g1.
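
To make the grouping explicit without relying on PyMVPA at all, the same
tree can be written down as a plain nested dictionary. This is only a
sketch: `tree`, `check_partition` and `single_label_groups` are
hypothetical names for illustration, not part of the PyMVPA API. It
checks that the groups at each node split the labels without overlap
and lists the groups that hold a single label.

```python
# Plain-Python description of the tree above (NOT the PyMVPA API).
# Each node maps a group name to (labels, subtree or None).
tree = {
    "g3": ([3], None),
    "g6": ([1, 2, 4], {
        "g1": ([1], None),
        "g5": ([2, 4], None),
    }),
}


def check_partition(node, labels):
    """Raise ValueError unless the groups of every node split `labels` exactly."""
    seen = []
    for name, (group_labels, subtree) in node.items():
        seen.extend(group_labels)
        if subtree is not None:
            check_partition(subtree, group_labels)
    if sorted(seen) != sorted(labels):
        raise ValueError("groups %r do not partition labels %r"
                         % (sorted(node), labels))
    return True


def single_label_groups(node):
    """Return the names of all groups that contain exactly one label."""
    found = []
    for name, (group_labels, subtree) in node.items():
        if len(group_labels) == 1:
            found.append(name)
        if subtree is not None:
            found.extend(single_label_groups(subtree))
    return found


check_partition(tree, [1, 2, 3, 4])
print(single_label_groups(tree))  # ['g3', 'g1']
```

These are exactly the groups for which passing a classifier looks
redundant to me.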
2) When I do so, with the same number of samples for every label, the
predictions are always one of (2, 4), so there seems to be a bias. For
the first decision, there are always more samples in "g6" than in "g3";
for the second decision, "g5" always outnumbers "g1". The training sets
are not balanced this way.
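
The imbalance can be counted by hand. The following is a small
plain-Python sketch, not PyMVPA code; `group_sizes` is a hypothetical
helper and the value of 10 samples per label is an assumption chosen
only for illustration.

```python
def group_sizes(groups, samples_per_label):
    """Return {group name: number of training samples} for one decision."""
    return {name: len(labels) * samples_per_label
            for name, labels in groups.items()}


n = 10  # assumed number of samples per label, for illustration only

# Decision between g3 and g6 (all four labels take part).
first = group_sizes({"g3": [3], "g6": [1, 2, 4]}, n)
# Decision between g1 and g5 (only labels 1, 2 and 4 take part).
second = group_sizes({"g1": [1], "g5": [2, 4]}, n)

print(first)   # {'g3': 10, 'g6': 30}
print(second)  # {'g1': 10, 'g5': 20}
```

So even with balanced labels, the group that leads towards (2, 4) has
three times, respectively twice, as many training samples as its
sibling.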
I hope my explanation was understandable.
Many greetings,
Thorsten