[pymvpa] retraining
Scott Gorlin
gorlins at MIT.EDU
Sun Apr 12 00:53:56 UTC 2009
Yaroslav Halchenko wrote:
> Another aspect you might benefit from in the case of SVM is the fact
> that some samples do not influence SVM performance (i.e. the
> non-support-vector ones). So, you can speed up n-fold (or pure
> leave-one-out) cross-validation considerably if there are only a few
> SVs -- the same strategy is used by SVMlight:
>
> 1. train SVM on all samples
>
> 2. in n-fold testing, check whether the testing set includes any of
> the support vectors. If not -- then you already know what the result
> would be (the same) if you trained the SVM without them ;)
>
> This strategy gives an especially large speed-up when the number of
> chunks is large (or each sample is its own chunk, as in leave-one-out)
> and the number of SVs is small.
>
>
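For concreteness, here is a minimal sketch of the shortcut described
above -- written against scikit-learn's SVC purely as an illustration,
not PyMVPA's own classifier API; fast_loo, X, and y are hypothetical
names:

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut

def fast_loo(X, y, C=1.0):
    """Leave-one-out CV that skips retraining whenever the held-out
    sample is not a support vector of the full-data model."""
    full = SVC(kernel='linear', C=C).fit(X, y)
    sv = set(full.support_)  # indices of the full-data SVs
    hits = []
    for train_idx, test_idx in LeaveOneOut().split(X):
        if sv.isdisjoint(test_idx):
            # No SV held out: retraining would yield the identical
            # solution, so reuse the full-data model's prediction.
            pred = full.predict(X[test_idx])
        else:
            clf = SVC(kernel='linear', C=C).fit(X[train_idx], y[train_idx])
            pred = clf.predict(X[test_idx])
        hits.append(pred[0] == y[test_idx][0])
    return float(np.mean(hits))

The saving comes entirely from the first branch: the fewer support
vectors the full-data model has, the more folds can skip retraining.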
Hmm... I'm not sure I understand your suggestion. Won't the support
vectors change with each new chunk in cross-validation? Or at least the
coefficients won't be identical. Is there a paper you know of that
describes this? It seems like training on the whole set would break the
assumption that the output errors on the folds are i.i.d.
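(One way to sanity-check the claim under discussion, again with a
scikit-learn SVC as a stand-in and synthetic data: drop a single
non-support-vector sample, retrain, and compare the two decision
functions -- if the claim holds, they should agree up to numerical
tolerance.)

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(20, 5)),         # class 0
               rng.normal(size=(20, 5)) + 2.0])  # class 1, shifted
y = np.repeat([0, 1], 20)

full = SVC(kernel='linear', C=1.0).fit(X, y)
non_sv = np.setdiff1d(np.arange(len(y)), full.support_)[0]
mask = np.arange(len(y)) != non_sv
reduced = SVC(kernel='linear', C=1.0).fit(X[mask], y[mask])

# True if removing the non-SV left the decision function unchanged
print(np.allclose(full.decision_function(X), reduced.decision_function(X)))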
> Just think, maybe, about accounting for such a scenario as well?
>
> sorry for not being quite on point with the reply, but I will get
> through your email/code some time later, whenever I get a chance ;)
>
>
No problem... VSS is looming anyway, so polished code isn't my priority
now either ;)
-S