[pymvpa] _tent/parameters now is in yoh/master
Yaroslav Halchenko
debian at onerussian.com
Fri May 16 17:42:01 UTC 2008
Dear All,
Having gained an ACK from Michael, and getting bored with merges and
cherry-picking, I simply merged _tent/parameters into yoh/master.
It is quite an evil change -- i.e. it changes a lot and hopefully brings
useful pieces. Here is a quick summary of what I remember was gained in
that branch:
* The class Parameter (mvpa.misc.param) is intended to store parameters of a
  classifier and to be automagically picked up into the .params (or
  .kernel_params) collection of the given class. This is in line with how
  StateVariables are collected together into .states.
  TODO: work out the same for Dataset, so that samples, labels and chunks
  become part of a collection (what would be a nice name? we currently have
  dsattr internally, but that is not a good one imho, and samples_attributes
  is too long), so we could easily add/remove sample attributes at run-time.
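  To make the classifier part concrete, here is a minimal sketch of declaring
  such a parameter (the toy classifier, the exact Parameter() signature and
  the way the value is read back from .params are written from memory, so
  treat it as a sketch only):

    from mvpa.clfs.base import Classifier
    from mvpa.misc.param import Parameter

    class ToyClassifier(Classifier):
        """Made-up classifier, only to show how a Parameter is collected."""
        # declared at class level; should get picked up into .params
        shrinkage = Parameter(1.0, doc="made-up regularization strength")

    clf = ToyClassifier()
    print clf.params.shrinkage    # read back via the collection (the exact
                                  # access pattern may differ in detail)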
* We got the new clfs.warehouse. It provides instances of classifiers tagged
  by their _clf_internals, so you can easily get all linear SVMs which are
  not from shogun with
    clfs['linear', 'svm', '!sg']
  and the unittests now use that warehouse instead of the duplicate under
  tests/tests_warehouse_clfs.
  TODO: I think the warehouse itself should become iterable, and on
  __getitem__ it should return another warehouse with the selected subset of
  classifiers. That would make it more proper imho.
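  For example, something along these lines should work (iterating over the
  selection and relying on the new compact __repr__ is how I expect it to
  behave, not a guarantee):

    from mvpa.clfs.warehouse import clfs

    # all linear SVMs which are not from shogun
    for clf in clfs['linear', 'svm', '!sg']:
        print clf      # prints only non-default parameters now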
* Shogun and libsvm SVMs got a base class _SVM (mvpa.clfs._svmbase) which
  does lots of the house-keeping in its __init__. That also unified the
  parameters of the different SVMs, and unified and minimized their __repr__
  (now only parameters with non-default values are printed).
  TODO: imho one of the next steps is to either remove helper classes like
  RbfCSVMC, or to create them automagically. Why? Because they are somewhat
  non-orthogonal and polluting imho. We have nu- and C-SVMs, we have at
  least 4 types of kernel, and in shogun there are around 5 implementations
  of SVM, so if we encode all those parameters in class names we end up with
  CSVMLibSVM, CSVMSVMLight, or something like that...
  Now I need to work out nice __doc__ creation for those SVMs and unify the
  naming and a few left-out arguments a bit.
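  To sketch the direction I have in mind (the front-end class and argument
  names like kernel_type / svm_impl are placeholders here, not a settled
  API):

    from mvpa.clfs.svm import SVM    # assumed unified front-end

    # instead of RbfCSVMC, CSVMLibSVM, ... encode the choices as parameters
    clf1 = SVM(kernel_type='RBF', svm_impl='C_SVC', C=1.0)
    clf2 = SVM(kernel_type='linear', svm_impl='NU_SVC', nu=0.5)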
* Initial retraining/retesting support is done for shogun's SVMs. If a
  classifier is announced to be retrainable and, on train, it was previously
  trained on the same data, it won't recreate the kernel. That gives a great
  speed-up if you only change the labels (which is the case with all those
  permutation tests).
  There are still glitches to fix, and the code needs refactoring so it
  becomes more readable.
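  The intended usage looks roughly like this (the retrainable flag, the
  shogun class name/import path and the Dataset bits are written from
  memory, so take it as a sketch):

    import numpy as N
    from mvpa.datasets import Dataset
    from mvpa.clfs.sg import SVM_SG_Modular    # assumed import path/name

    dataset = Dataset(samples=N.random.normal(size=(20, 5)),
                      labels=[0, 1] * 10, chunks=range(20))

    clf = SVM_SG_Modular(retrainable=True)     # announce retrainability
    clf.train(dataset)                         # first train: kernel computed
    for i in xrange(100):
        dataset.permuteLabels(True)            # same samples, shuffled labels
        clf.train(dataset)                     # kernel should get reused here
    dataset.permuteLabels(False)               # restore the original labels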
Obviously there were some other changes which I can't recall/spot right now.
Major cons:
Access to attributes of Stateful classes got slower, since it now goes
through Stateful's __getattribute__ / __setattr__. But it can be optimized,
and will be at some point.
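Roughly, the overhead comes from a pattern like this (not the actual
Stateful code, just an illustration of why every attribute read now costs an
extra Python-level call):

    class Stateful(object):
        def __getattribute__(self, name):
            # the real class checks the states/params collections here,
            # so even plain attribute access pays for this detour
            return object.__getattribute__(self, name)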
So if you have some analysis which you ran using master or a someone/master
branch, could you give my yoh/master a try first before merging/committing,
so we know what got broken ;-)
--
Yaroslav Halchenko
Research Assistant, Psychology Department, Rutgers-Newark
Student Ph.D. @ CS Dept. NJIT
Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
WWW: http://www.linkedin.com/in/yarik