Michael Hanke
Mon Apr 21 08:12:20 UTC 2008


[ cross posting this (with full quote) to pymvpa mailing list, as I think
  it is more appropriate for that list ]

On Sun, Apr 20, 2008 at 01:24:15PM -0400, Per B. Sederberg wrote:
> Hi Folks:
> So I was talking with a world-famous mathematician (Ingrid Daubechies)
> about SMLR on Friday and she suggested trying out the Least Angle
> Regression (LARS) technique instead.  Here's the relevant paper by
> some of the most famous folks in machine learning:
> Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani,
> Least Angle Regression, Annals of Statistics (with discussion) (2004)
> 32(2), 407-499. A new method for variable subset selection, with the
> lasso and "epsilon" forward stagewise methods as special cases.
> It is kind of like a smart boosted linear regression classifier where
> it adds features in one-by-one in a very intelligent fashion (those
> folks are smart.)
> She warned that implementing it may be a bit difficult due to a few
> tricks and suggested that I use some existing implementation from a
> person we can trust.  Well, it turns out that Trevor Hastie
> implemented it very nicely in R, so I figured let's make use of that
> PyMVPA framework and wrap it up!!!
Great work! Now we actually have a classifier that uses this possibility
and we can stop talking about being potentially able to do it ;-)

> The result is a new LARS classifier that makes use of RPy to wrap the
> R implementation.  It looks like it works great and we should
> eventually make a LARSWeights to go with the SMLRWeights and
> LinearSVMWeights.
Right -- I guess we can simply take the route we talked about in the last
VNC session and merge them with the classifiers themselves (with factory

> To make it work, you have to install R and RPy and then download the
> lars contributed package.  Below are my notes on doing this on Debian:
> Howto install and use the R version of lars (on Debian Lenny):
>  - First you have to install all the R you need:
> <example>
> sudo aptitude install python-rpy python-rpy-doc r-base-dev
> </example>
>  - Then you have to install the lars library (if you do this as root
>    you will install it globally):
> <example>
> R
> install.packages()
> </example>
>    Just pick your mirror, then pick lars from the list of packages.
>  - Finally this is how to use it with rpy:
> <example>
> ipython -pylab
> import rpy
> import numpy as N
> rpy.r.library('lars')
> x = N.random.randn(100,1000)
> x[:50,:5] = x[:50,:5] + 2
> x2 = N.random.randn(10,1000)
> x2[:5,:5] = x2[:5,:5] + 2
> y = N.zeros((100,1))
> y[:50,0] = 1
> res = rpy.r.lars(x,y,use_Gram=False)
> p = rpy.r.predict_lars(res,x2)
> </example>
Tried it -- worked fine. Could you please add this information to the
relevant pieces of the manual (with corresponding changelog entry).
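
For anyone who wants to look at the toy data from the quoted example
without R installed, the setup can be reproduced with NumPy alone. This is
only a sketch of the data construction and does not run LARS at all; the
fixed seed and the names `rng`, `n_informative` and `mean_diff` are
assumptions added for illustration and are not part of the original
example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed: an assumption, for repeatability

n_informative = 5  # only the first 5 of the 1000 features carry signal

# Training data: 100 samples x 1000 features of unit-variance noise.
x = rng.standard_normal((100, 1000))
# The first 50 samples (class 1) get a +2 shift in the informative features.
x[:50, :n_informative] += 2

# Labels as a column vector: 1 for the first 50 samples, 0 for the rest.
y = np.zeros((100, 1))
y[:50, 0] = 1

# Test data: 10 samples, the first 5 shifted in the same way.
x2 = rng.standard_normal((10, 1000))
x2[:5, :n_informative] += 2

# Per-feature difference between the two class means of the training set.
# It should be close to 2 for the informative features and close to 0
# for all others.
mean_diff = x[:50].mean(axis=0) - x[50:].mean(axis=0)
```

With 1000 features and only 5 informative ones, this is the kind of
problem where a method that adds features one at a time should have
something to find.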

> The current implementation passes the test_lars.py tests, but I was
> having a shogun error, so all the tests were not running on my machine.
> We should think about a graceful way for the code to error out if
> someone does not have the dependencies loaded correctly.
We have that already: look at mvpa/base/externals.py. Although I'm not
sure why it doesn't work for you. Currently all external deps are
supposed to be fully optional and at least the test battery should
automatically limit itself to those tests where all deps are available.
I have added the relevant test code snippet for lars in my branch.

Reading the above again, I checked it and had to make some improvements ;-)
Please try my branch and tell me if it works fine. I do not have shogun
installed and the tests run fine (with and without 'test_lars', with both
lars installed and not installed). However, I had to disable lars in the
full classifier sweep as it doesn't seem to like the test yet! I haven't
had a chance to track it down yet.

BTW: This snippet should be sufficient to make code optional when lars
is not there:

from mvpa.base import externals

if 'lars' in externals.present:
	<something optional>
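
The same idea can be sketched outside of PyMVPA with the standard library
only. This is a generic illustration, not how mvpa/base/externals.py is
implemented; `have_module` and `describe_backend` are hypothetical helper
names. Note that it only checks whether the Python module `rpy` can be
found. It says nothing about whether R itself or the lars package is
installed.

```python
import importlib.util


def have_module(name):
    """Return True if the top-level module `name` can be located.

    The module is only looked up, not imported.
    """
    return importlib.util.find_spec(name) is not None


def describe_backend():
    """Pick a code path depending on whether rpy is available."""
    if have_module('rpy'):
        return 'rpy found: the R-based LARS wrapper can be offered'
    return 'rpy not found: skipping the R-based LARS wrapper'


print(describe_backend())
```

Code that depends on the optional package then goes into the first branch,
and everything else keeps working when the package is missing.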

I will add some proposal about handling of external deps to the

Thanks for lars ;-),


Michael Hanke
ICQ: 48230050

