[pymvpa] Customize ANOVA Feature Selection

Mon Oct 3 17:38:47 UTC 2011

AHA -- that is a "cool" one ;)  If only we had refactored those selectors
into proper Nodes, they could have been chained :-/  For now you could
chain (afaik) the corresponding SensitivityBasedFeatureSelections (or
just use CombinedFeatureSelection), but that would be wasteful
since it would compute ANOVA twice...  Alternatively, we could take
advantage of the fact that feature selectors are simple callables and
craft a corresponding funct(ion|tor) which chains those two:

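        def custom_tail_selector(seq):
            # everything except the top 1% of the ANOVA scores
            seq1 = FractionTailSelector(0.01, mode='discard', tail='upper')(seq)
            # the top 5% of the ANOVA scores
            seq2 = FractionTailSelector(0.05, mode='select', tail='upper')(seq)
            # keep only features present in both: top 5% minus top 1%
            return list(set(seq1).intersection(seq2))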

It seems to work as desired (I pushed a rudimentary smoke unittest for it).
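
For anyone who wants to check the selection logic without PyMVPA at hand,
here is a minimal pure-Python sketch of the same idea.  tail_indices and
band_selector are hypothetical stand-ins written only for illustration (they
are not PyMVPA API), and the rounding of the tail size is an assumption.
Note that intersecting the two selections keeps the features ranked between
the top 1% and the top 5% of the full ranking (about 4% of all features),
which is slightly different from taking 5% of what remains after the drop.

```python
def tail_indices(scores, fraction, mode="select"):
    """Return sorted feature indices inside ('select') or outside
    ('discard') the upper tail of `scores`.

    Hypothetical stand-in for a fraction-based tail selector.
    """
    # Assumption: the tail size is the fraction of features, rounded.
    n_tail = int(round(fraction * len(scores)))
    # Indices ordered from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if mode == "select":
        return sorted(order[:n_tail])
    if mode == "discard":
        return sorted(order[n_tail:])
    raise ValueError("mode must be 'select' or 'discard'")


def band_selector(scores, drop=0.01, keep=0.05):
    """Keep features in the top `keep` fraction but not in the top `drop`."""
    without_top = tail_indices(scores, drop, mode="discard")
    top = tail_indices(scores, keep, mode="select")
    return sorted(set(without_top).intersection(top))


# Toy scores: feature i has score i, so feature 199 ranks first.
scores = list(range(200))
selected = band_selector(scores)
print(selected)
```

With 200 toy features the top 1% is 2 features and the top 5% is 10, so 8
features survive the intersection.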

On Mon, 03 Oct 2011, Roberto Guidotti wrote:

>    Thank you for the response,

>    I'm always too lazy to improve my problem description (and my English
>    doesn't help me!!!)

>    Theoretically, for every fold I have to select the important features,
>    so I perform ANOVA and the algorithm ranks my features, but I want to
>    drop the first 1% and use the 5% (or X%) of the new ranking after
>    dropping that 1%.  I found in my analysis that the most important
>    features are, in 90% of cases, on the brain edge (probably
>    motion-artefact voxels that still affect the dataset even after
>    preprocessing).

>    For this reason I would like to extract the ANOVA ranking stack, or to
>    find a method to use voxels from 1.01% to 5.01%!

>    So I want to perform ANOVA on the data, drop the first 1%, select from
>    the new ranking stack (5% of the residual voxels), and train the
>    classifier!

>    Thank you

>    RG

>    On 3 October 2011 18:14, Yaroslav Halchenko <debian at onerussian.com>
>    wrote:

>      I might be missing the problem since you seem to already know the
>      blocks you need ;)  so following the example from the clfs warehouse:
>         FeatureSelectionClassifier(
>             kNN(),
>             SensitivityBasedFeatureSelection(
>                OneWayAnova(),
>                FractionTailSelector(0.05, mode='select', tail='upper')),
>             descr="kNN on 5%(ANOVA)")
>      logical change would be to use 0.99 instead of 0.05 to select 99%
>      top voxels (or in other words to dump 1%)... or just change mode to
>      'discard' and specify 0.01 as the fraction

>    On Mon, 03 Oct 2011, Roberto Guidotti wrote:
>    >    Hi all,
>    >    I would like to know if there is a method to extract the feature
>    >    ranking from ANOVA feature selection, or if there is a method to
>    >    drop the first 1% of voxels included in the ranking and select the
>    >    other voxels as usual (with FractionTailSelector()).
>    >    Thank you
>    >    Roberto

-- 
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic