[pymvpa] libsvm dense arrays

Yaroslav Halchenko debian at onerussian.com
Mon Sep 29 17:48:15 UTC 2008


> Apologies if this doesn't end up in the right thread, there was
> a minor problem with my subscription settings.
no problem ;-)

> I played with the Shogun backend circa version 0.2.0.  If i remember
> correctly, this was when it was spitting out those nasty debug
> messages so I stopped using it pretty quickly.
yeah -- those are nasty ;-) but shogun has come a long way since 0.2.0,
and it is now as quiet as you like (i.e. you can enable the DEBUG_SG.*
debug targets if you like it noisy) ;-)

> My understanding of the pymvpa source is that it uses libsvm by
> default
right -- for the convenience SVM classes (e.g. LinearCSVMC),
libsvm is the default.

> 1) There was an explicit reason you guys chose libsvm as the
> default backend over Shogun, which I believed likely since libsvm is so
> popular but I had not heard of Shogun before, or,
that was pretty much the reason -- shogun was not on our radar when
pymvpa was started (quite a while ago). Later on we came across shogun
and decided to make use of it.

> 2) Shogun actually uses libsvm internally so there was no reason not
> to use it directly.  To this point, I am unclear if Shogun simply uses
> the system installation of libsvm or if it has its own implementation or
> even its own wrapper
shogun pretty much borrowed the libsvm code (solver etc.) and tuned it up
a bit to fit its framework (so it uses shogun's kernels). I am not sure
how far it has diverged from the original libsvm code, though, or how
many upstream libsvm changes propagate into shogun's copy. Since it uses
shogun's kernel implementations, and judging from a quick look at the
difference between stock and dense libsvm, I think shogun's libsvm is
whatever you feed it with. We are now feeding it dense structures, so
the implementation is effectively 'dense', and there is indeed close to
no reason to worry about the dense libsvm implementation.
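
To make the stock-vs-dense distinction concrete, here is a minimal sketch
in plain Python (it calls neither libsvm nor shogun). It assumes the usual
stock libsvm layout, where a sample is a list of (index, value) pairs for
its non-zero features, and compares it with a plain dense list. The helper
names are made up for illustration.

```python
def to_sparse(dense):
    """Stock-libsvm-style layout: 1-based (index, value) pairs, zeros skipped."""
    return [(i + 1, v) for i, v in enumerate(dense) if v != 0.0]


def dot_sparse(a, b):
    """Linear kernel on two sparse samples: walk both sorted index lists."""
    i = j = 0
    total = 0.0
    while i < len(a) and j < len(b):
        if a[i][0] == b[j][0]:
            total += a[i][1] * b[j][1]
            i += 1
            j += 1
        elif a[i][0] < b[j][0]:
            i += 1
        else:
            j += 1
    return total


def dot_dense(a, b):
    """Linear kernel on two dense samples: one pass, no index bookkeeping."""
    return sum(x * y for x, y in zip(a, b))


x = [0.5, 0.0, 2.0, 1.0]
y = [1.0, 3.0, 0.0, 4.0]

# Both layouts must give the same kernel value; only the storage differs.
assert dot_sparse(to_sparse(x), to_sparse(y)) == dot_dense(x, y)
```

With mostly non-zero data, as is typical for the dense arrays discussed
here, the sparse layout only adds index overhead, which is the motivation
for a dense variant.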

> Perhaps now is the time to clarify my assumptions - is libsvm not the
> better default choice?  If Shogun indeed uses a dense array
> implementation of libsvm, is there any reason not to set it as the
> default backend?
we stayed with libsvm as the default primarily for compatibility; also,
the shogun interface was/is more convoluted, and shogun still has memory
leak issues.

Whether shogun is a better default choice... not sure, since we ship a
copy of libsvm within pymvpa for those who have no libsvm installed.
There is also a licensing issue with shogun which we haven't finalized
yet -- shogun is GPLed, and we release under the expat (MIT) license. It
seems that we will have to dual-license pymvpa under the GPL as well for
those wishing to use it with shogun [1]

So for now I guess libsvm is the 'safer' default in several respects.

> I suppose I could also figure this out from the source, but
> while I have your attention, what's the best way (other than
> uninstalling libsvm) to transparently switch to the Shogun backend?
> ie so I can still use the classes LinearCSVMC etc without explicitly
> calling sg.SVM etc
heh... I guess we didn't make a clean interface for such a choice (we
should have something like an MVPA_SVM_BACKEND env variable which would
switch between libsvm and shogun)...

ok -- added quick support for that (look in the yoh/master branch). It
should be refactored whenever we get yet another SVM implementation from
some other package, or finally add those "simple" sparse linear SVMs
from shogun. Now, if you set MVPA_SVM_BACKEND=shogun, it will use
shogun's SVM by default.
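
A minimal sketch of setting the variable from within a script rather than
from the shell. It assumes that pymvpa reads MVPA_SVM_BACKEND when the SVM
module is first imported, so the variable has to be set beforehand. The
import path mvpa.clfs.svm is also an assumption and may differ between
versions.

```python
import os

# Must happen before the first mvpa import; the variable is assumed to be
# read at import time, so setting it later would have no effect.
os.environ["MVPA_SVM_BACKEND"] = "shogun"

try:
    # Assumed location of the convenience classes; adjust to your version.
    from mvpa.clfs.svm import LinearCSVMC
except ImportError:
    # pymvpa (or the selected backend) is not installed in this environment.
    LinearCSVMC = None

if LinearCSVMC is not None:
    print("LinearCSVMC comes from %s" % LinearCSVMC.__module__)
```

Printing the module of the class is a cheap way to check which backend was
actually picked up.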

if you are using a released version of pymvpa, then you can use a dirty
hack.

prior to any other mvpa imports, do

import mvpa.base.externals
mvpa.base.externals._VERIFIED['libsvm'] = False

this makes pymvpa report that libsvm is not available, so it should fall
back to shogun.

Please let us know how it works for you


[1] http://lists.debian.org/debian-legal/2008/09/msg00126.html
> Thanks again,
> Scott
-- 
Yaroslav Halchenko
Research Assistant, Psychology Department, Rutgers-Newark
Student  Ph.D. @ CS Dept. NJIT
Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
        101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
WWW:     http://www.linkedin.com/in/yarik        


