[Pkg-exppsy-pymvpa] model selection and other questions

Yaroslav Halchenko debian at onerussian.com
Thu Apr 3 16:17:13 UTC 2008

> GPR does a Bayesian linear regression, taking into account prior probability
> and predicting means and variances of the test data. SVM draws the line
> according to the support vectors (I use SVMs but I'm not an expert). But
> if you need a sound probabilistic tool I guess GPR (or GPC) should
> definitely be considered.
And since we aren't experts in GP{R,C} your participation is crucial in
that respect ;-)

> I usually use subversion but I'm going to move to DVCS in
> these weeks because of the many troubles experienced on
> svn. I know some basics of bzr and never tried git. But I guess
> it should not be painful.
git might behave weird from time to time, but generally it is quite
if you are familiar with bzr, then
might be of some use

> I use emacs and have just a rough idea of pylint. I'll try it.
BTW there is some limited support of git-status for emacs. It is
crippled in many respects but I do use it from time to time.

Depending on your linux distribution it might come bundled with git (but
in Debian it sits in examples and I never had chance to follow up on
and apparently it will be integral part of emacs23

> I'm reading the SciPy-dev threads on scikits.learn and read Anton's
> proposal when he posted it. I guess there is a strong need to coordinate
> efforts since part of Anton's proposal seems to duplicate part of PyMVPA.
Indeed... a good discussion should help. As I suggested, maybe we can
hold a voice conference some time next week? I am still waiting to see
whether Anton follows up on that thread on the scipy mailing list.

> > great indeed if you provided some input/ideas.
> That's a hard task. Anyway, I'll try to sketch something. As
> I already said, I'm really interested in this topic.
thanks in advance

> It was cross validation, as provided by libsvm.
> Once you have
> a function to evaluate the quality of your hyperparameters then
> it is just a matter of optimizing it.
Indeed, it is a function call which returns a point estimate of some value,
but it is not a function per se, i.e. you can't get a reliable estimate of
its gradient/Hessian for efficient optimization (or doing so is
computationally way too demanding). What would be cool is to have an
analytically derived criterion -- there are some upper-bound
generalization performance estimates for SVMs and SMLR. It would be
really cool to see whether optimization based on them would be fruitful.

But indeed, in the majority of cases we are left with crippled
optimization on point estimates returned by some function call. When
I was using lightsvm, I also set it up to do a simplex search for me
over C and the parameters of the kernel.
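
A minimal sketch of that kind of derivative-free simplex search, just to
make the idea concrete. Everything here is an assumption for illustration:
the objective is leave-one-out error of a toy Gaussian kernel smoother (a
stand-in for an SVM, with a bandwidth and a shrinkage term playing the
roles of the kernel parameter and C), and nelder_mead() is a simplified
hand-written variant, not the routine from any particular library.

```python
import math
import random


def loo_cv_error(log_params, xs, ys):
    """Leave-one-out mean squared error of a toy Gaussian kernel smoother.

    log_params = [log(bandwidth), log(shrinkage)]; both are hypothetical
    stand-ins for the kernel parameter and C of an SVM.
    """
    # Clamp the log-parameters so exp() stays in a safe numeric range.
    h = math.exp(max(-20.0, min(20.0, log_params[0])))
    lam = math.exp(max(-20.0, min(20.0, log_params[1])))
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num = den = 0.0
        for j, (xj, yj) in enumerate(zip(xs, ys)):
            if j == i:
                continue  # hold out sample i
            w = math.exp(-0.5 * ((xi - xj) / h) ** 2)
            num += w * yj
            den += w
        pred = num / (den + lam)  # lam shrinks the prediction toward 0
        total += (yi - pred) ** 2
    return total / len(xs)


def nelder_mead(f, x0, step=0.5, max_iter=200, tol=1e-8):
    """Simplified Nelder-Mead: uses only point estimates of f, no gradient."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex offset along each coordinate.
    simplex = [list(x0)]
    for i in range(n):
        v = list(x0)
        v[i] += step
        simplex.append(v)
    values = [f(v) for v in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        if values[-1] - values[0] < tol:
            break
        worst = simplex[-1]
        centroid = [sum(v[k] for v in simplex[:-1]) / n for k in range(n)]

        def along(t):
            # Point on the line through the centroid, away from the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        xr = along(1.0)  # reflection
        fr = f(xr)
        if fr < values[0]:
            xe = along(2.0)  # expansion
            fe = f(xe)
            if fe < fr:
                simplex[-1], values[-1] = xe, fe
            else:
                simplex[-1], values[-1] = xr, fr
        elif fr < values[-2]:
            simplex[-1], values[-1] = xr, fr
        else:
            xc = along(-0.5)  # inside contraction
            fc = f(xc)
            if fc < values[-1]:
                simplex[-1], values[-1] = xc, fc
            else:
                # Shrink every vertex halfway toward the best one.
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
                values = [values[0]] + [f(v) for v in simplex[1:]]
    k_best = min(range(n + 1), key=lambda k: values[k])
    return simplex[k_best], values[k_best]


# Toy data: noisy sine, fixed seed so runs are repeatable.
rng = random.Random(0)
xs = [0.3 * i for i in range(20)]
ys = [math.sin(x) + 0.1 * rng.gauss(0.0, 1.0) for x in xs]

start = [0.0, 0.0]  # log(bandwidth) = log(shrinkage) = 0
best_params, best_err = nelder_mead(lambda p: loo_cv_error(p, xs, ys), start)
print("log-params:", best_params, "LOO error:", best_err)
```

Searching in log space keeps both parameters positive without needing
constraints. Each evaluation is a full cross-validation run, which is
exactly why the number of function calls matters so much here.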

> At that time I used scipy.optimize but it seemed somewhat unstable on
> my datasets. Dmitrey (OpenOpt's author) suggested Shor's
> r-algorithm ("ralg") for my setting. I tried it on GPR (nice result!)
> and hope to test it on SVR when possible.
Thanks for sharing the experience -- we should give them a spin too, but
lots of changes should happen first to make that viable. For
instance, a considerable amount of SVM 'training' time is now taken by
transforming the data into 'library' space. In the shogun implementation,
considerable time is then spent precomputing kernel values (not sure what
the internals of libsvm are). If we are to retrain the SVM after altering
parameters, it is critical performance-wise not to go through
transform/precompute again -- thus we are still missing the 'retrain'
functionality of a classifier (SVM in particular).
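
To illustrate what such a 'retrain' could avoid, here is a rough sketch
for the RBF kernel case, under the assumption that the expensive,
parameter-independent part is the matrix of pairwise squared distances.
CachedRBFKernel is a hypothetical helper written for this example; it is
not part of PyMVPA, shogun or libsvm.

```python
import math


class CachedRBFKernel:
    """Hypothetical helper: caches what does not depend on gamma."""

    def __init__(self, samples):
        self.n_precomputes = 0
        self._sqdist = self._precompute(samples)

    def _precompute(self, samples):
        # The costly pass over all sample pairs happens only here.
        self.n_precomputes += 1
        return [
            [sum((a - b) ** 2 for a, b in zip(u, v)) for v in samples]
            for u in samples
        ]

    def kernel_matrix(self, gamma):
        # Cheap per-parameter step: K[i][j] = exp(-gamma * ||x_i - x_j||^2).
        return [[math.exp(-gamma * d) for d in row] for row in self._sqdist]


samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
cache = CachedRBFKernel(samples)

# "Retraining" with three kernel widths reuses the same cached distances.
matrices = {g: cache.kernel_matrix(g) for g in (0.1, 1.0, 10.0)}
print("precompute passes:", cache.n_precomputes)
```

A change of C alone would not even need the per-parameter step, since the
kernel matrix itself stays the same.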

doh -- I should stop talking and just work and implement this sooner
than later ;-)
> Bye,

> Emanuele


Yaroslav Halchenko
Research Assistant, Psychology Department, Rutgers-Newark
Student  Ph.D. @ CS Dept. NJIT
Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
        101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
WWW:     http://www.linkedin.com/in/yarik        
