[pymvpa] _clf_internals question

Per B. Sederberg persed at princeton.edu
Sat Apr 25 14:37:53 UTC 2009


Howdy Everybody:

I've now committed GLMNET_Reg and GLMNET_Class to my branch.  They are
the gaussian regression and multinomial classification versions of
GLMNET.
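
As a rough illustration of the split (not the committed code), the two
variants can be thought of as thin subclasses that differ only in the
family they fit and in the tags they add to _clf_internals.  The class
names, the 'glmnet' and 'linear' tags, and the _family attribute below
are hypothetical stand-ins; only the 'regression' and 'multiclass' tags
come from the thread.

```python
class GLMNETSketch(object):
    """Hypothetical stand-in for a shared GLMNET wrapper base class."""

    # Tags common to both variants (illustrative names, not PyMVPA's).
    _clf_internals = ['glmnet', 'linear']
    # Model family handed to the backend; each subclass fixes it.
    _family = None

    def __init__(self, alpha=1.0, nlambda=100):
        self.alpha = alpha      # L1/L2 mixing parameter
        self.nlambda = nlambda  # number of lambda values on the path

    @property
    def family(self):
        return self._family


class GLMNETRegSketch(GLMNETSketch):
    """Gaussian regression variant: advertises 'regression'."""
    _clf_internals = GLMNETSketch._clf_internals + ['regression']
    _family = 'gaussian'


class GLMNETClassSketch(GLMNETSketch):
    """Multinomial classification variant: advertises 'multiclass'."""
    _clf_internals = GLMNETSketch._clf_internals + ['multiclass']
    _family = 'multinomial'
```

Keeping the tags on the class rather than on the instance means a
warehouse can filter classifiers by capability without constructing them.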

They are also added to the classifier warehouse and pass all tests,
though there are some convergence warnings on the tiny sample datasets
(I don't get these with actual data).

So, have fun!  The alpha parameter controls the L1/L2 norm trade-off.
Try it with alpha = .1 for a lot of L2, which should help to bring in
redundant features in typical elastic-net style.
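
To make the trade-off concrete, here is a small sketch of the
elastic-net penalty as given in the glmnet paper,
lambda * ((1 - alpha)/2 * ||beta||_2^2 + alpha * ||beta||_1), so that
alpha=1 is pure L1 and alpha=0 is pure L2.  The function name and
arguments are made up for illustration and are not part of the wrapper.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Elastic-net penalty for coefficients `beta`.

    alpha = 1 gives the pure L1 (lasso) penalty,
    alpha = 0 gives the pure L2 (ridge) penalty.
    """
    l1 = sum(abs(b) for b in beta)     # ||beta||_1
    l2_sq = sum(b * b for b in beta)   # ||beta||_2^2
    return lam * ((1.0 - alpha) * 0.5 * l2_sq + alpha * l1)


beta = [3.0, -4.0]
for alpha in (0.0, 0.1, 1.0):
    print(alpha, elastic_net_penalty(beta, lam=1.0, alpha=alpha))
```

With alpha = .1 the squared-L2 term carries a weight of 0.45 against
0.1 for the L1 term, which is why correlated features tend to be kept
together rather than pruned down to a single representative.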

Just so it doesn't take forever, I like to make this classifier part
of a FeatureSelection classifier that runs an ANOVA first to get it
down to around 10K features:

anova_keep = 10000  # number of features to keep after the ANOVA step
gnet = GLMNET_Class(alpha=.1, nlambda=100,
                    descr="GLMNET_Class(nlambda=100, alpha=.1)")
anova_gnet = FeatureSelectionClassifier(
    gnet,
    SensitivityBasedFeatureSelection(
        OneWayAnova(),
        FixedNElementTailSelector(anova_keep, mode='select', tail='upper')),
    descr="GLMNET on %d(ANOVA)" % (anova_keep))
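
For readers who want to see what the pre-selection step amounts to,
the following is a library-free sketch: score each feature with a
one-way ANOVA F-statistic and keep the indices of the top-scoring ones.
It is not how PyMVPA implements OneWayAnova or the tail selector; the
function names are hypothetical, and it assumes at least two classes
and more samples than classes.

```python
from collections import defaultdict


def one_way_anova_f(values, labels):
    """F-statistic of a single feature across the classes in `labels`."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / float(n)
    means = dict((l, sum(g) / float(len(g))) for l, g in groups.items())
    # Between-class and within-class sums of squares.
    ss_between = sum(len(g) * (means[l] - grand_mean) ** 2
                     for l, g in groups.items())
    ss_within = sum((v - means[l]) ** 2
                    for l, g in groups.items() for v in g)
    if ss_within == 0:
        # No within-class variance: perfectly separated or constant.
        return float('inf') if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def keep_top_features(samples, labels, n_keep):
    """Sorted column indices of the `n_keep` features with highest F."""
    n_features = len(samples[0])
    scores = [one_way_anova_f([row[j] for row in samples], labels)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j],
                    reverse=True)
    return sorted(ranked[:n_keep])


# Toy data: 4 samples, 3 features, 2 classes.
samples = [[1.0, 2.0, 1.0],
           [1.1, 3.0, 2.0],
           [5.0, 2.0, 2.0],
           [5.2, 3.0, 3.5]]
labels = [0, 0, 1, 1]
print(keep_top_features(samples, labels, 2))
```

The classifier then only ever sees the selected columns, which is what
keeps the GLMNET fit tractable on whole-brain data.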


Best,
Per


On Sat, Apr 25, 2009 at 8:31 AM, Per B. Sederberg <persed at princeton.edu> wrote:
> On Sat, Apr 25, 2009 at 8:13 AM, Michael Hanke <michael.hanke at gmail.com> wrote:
>> Hi Per,
>>
>> On Sat, Apr 25, 2009 at 07:25:33AM -0400, Per B. Sederberg wrote:
>>> Howdy folks:
>>>
>>> As you might have seen, I recently added preliminary support for the
>>> GLMNET regression/classification algorithm as seen in the following
>>> paper:
>>>
>>> http://www-stat.stanford.edu/~hastie/Papers/glmnet.pdf
>>
>> Thanks for adding this one.
>>
>
> No probs, it looks to be a great algorithm for our needs.
>
>>> In refining the implementation (which is wrapping the R code from the
>>> authors) I've run into a minor issue that I'm not sure how to resolve.
>>>  Depending on how you parameterize the algorithm, it can either
>>> perform a multinomial logistic regression style classification (and
>>> hence would want the 'multiclass' internal, and possibly 'binary',
>>> too) or a simple gaussian regression that is meant for continuous data
>>> (and hence would want the 'regression' internal set.)
>>>
>>> Should I make two other instances of the class, similar to what is
>>> done for SVMs, that separate this out: i.e., a GLMNET_Reg and
>>> GLMNET_Class for regression and classification?
>>
>> This seems to be the easiest way.
>>
>
> Coolio.  I'll give it a whirl.
>
>>> That way I can add them to the warehouse and have them take part in
>>> the unittest battery.  Speaking of unittests, is adding this
>>> classifier to the warehouse all that is needed to get basic unittests
>>> running on it?
>>
>> Yes, that should be it.
>>
>
> Word!  I'll do all the above and commit some other minor mods to it soon.
>
> Best,
> Per
>
>
>
>
>>
>> Cheers,
>>
>> Michael
>>
>> --
>> GPG key:  1024D/3144BE0F Michael Hanke
>> http://apsy.gse.uni-magdeburg.de/hanke
>> ICQ: 48230050
>>
>> _______________________________________________
>> Pkg-ExpPsy-PyMVPA mailing list
>> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
>> http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa
>>
>


