[pymvpa] GPR woes
Scott Gorlin
gorlins at MIT.EDU
Sat Mar 28 15:26:59 UTC 2009
I haven't checked the Python syntax, but in Matlab the eps function can
take a number and return the smallest representable numerical difference
at that magnitude. If NumPy has an equivalent, then some multiple of it
(say 2x) would make a good alternative default epsilon, calculated at
train time.
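
Something along these lines, perhaps (untested sketch; eps_at and the 2x
factor are just illustrative, not anything already in the codebase):

import numpy as np

def eps_at(x):
    # Rough NumPy analogue of Matlab's eps(x): an estimate of the smallest
    # representable difference at the magnitude of x.  Newer NumPy also has
    # np.spacing(x), which is the direct equivalent.
    x = np.asarray(x, dtype=float)
    return np.finfo(x.dtype).eps * np.maximum(np.abs(x), 1.0)

# e.g. a data-dependent default jitter computed at train time:
# epsilon = 2 * eps_at(C.diagonal().max())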
Emanuele Olivetti wrote:
> Dear Per and Scott,
>
> You are both right, and I feel quite guilty since I did not spend
> time on GPR as I promised a long time ago (basically lack of time). The
> epsilon issue is exactly what Scott says: I discussed the same issue
> with Yarik some time ago but never worked out a principled solution.
> I'm fine if you patch the current code to increase epsilon up to some
> limit. Any ideas about alternative/principled solutions?
>
> Moreover, I'm (slowly) chasing another numerical instability, due I
> guess to underflow in a summation, which could affect the marginal
> likelihood computation (and is therefore highly relevant for GPR). This
> is more likely to happen when the dataset is large, so be prepared for
> that ;-)
>
> As far as I remember, I forced the kernel matrix to be double precision
> for exactly the reasons you mention, but verifying this more carefully
> would be highly desirable.
>
> Unfortunately I won't be able to work on GPR very soon, even though
> I'd love to, so any patch is warmly welcome, as are bug reports ;-)
>
> Best,
>
> Emanuele
>
>
> Per B. Sederberg wrote:
>> Hi Scott:
>>
>> You are correct on all counts. epsilon is currently hard-coded, and
>> either of your proposed solutions would help GPR run more
>> consistently. I'll try out some combination of both and
>> report back.
>>
>> Best,
>> Per
>>
>> On Sat, Mar 28, 2009 at 9:46 AM, Scott Gorlin <gorlins at mit.edu> wrote:
>>
>>> Epsilon is the Tikhonov regularization parameter, no? (I haven't read
>>> much about this classifier, but it looks similar to the RLS solution
>>> to me...) As such it should be adjustable so that overfit solutions
>>> can be penalized, so maybe it should actually be a formal parameter.
>>>
>>> If the default raises a LinAlgError, you could also put the Cholesky
>>> decomposition in a finite loop where epsilon is doubled on each
>>> failure (though not for too many iterations...)
>>>
>>> On a wild tangent, also check that your kernel matrix (here self._C)
>>> is double precision: with single floats the machine epsilon is about
>>> 1.2e-7, so adding 1e-20 to the diagonal would simply be lost.
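
A one-liner guard would do it (hypothetical helper, untested):

import numpy as np

def ensure_double(C):
    # Upcast the kernel matrix to float64 if needed; in float32 a 1e-20
    # jitter is far below the resolution and has no effect.
    return np.asarray(C, dtype=np.float64)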
>>>
>>> Per B. Sederberg wrote:
>>>
>>>> Hi Everybody (and Emanuele in particular):
>>>>
>>>> I've tried GPR on about 5 different datasets and I've never actually
>>>> gotten it to run through an entire cross validation process. I always
>>>> get the following error:
>>>>
>>>> /home/per/lib/python/mvpa/clfs/gpr.pyc in _train(self, data)
>>>>     308             except SLAError:
>>>>     309                 epsilon = 1.0e-20 * N.eye(self._C.shape[0])
>>>> --> 310                 self._L = SLcholesky(self._C + epsilon, lower=True)
>>>>     311                 self._LL = (self._L, True)
>>>>     312                 pass
>>>>
>>>> /usr/lib/python2.5/site-packages/scipy/linalg/decomp.pyc in cholesky(a, lower, overwrite_a)
>>>>     552     potrf, = get_lapack_funcs(('potrf',),(a1,))
>>>>     553     c,info = potrf(a1,lower=lower,overwrite_a=overwrite_a,clean=1)
>>>> --> 554     if info>0: raise LinAlgError, "matrix not positive definite"
>>>>     555     if info<0: raise ValueError,\
>>>>     556         'illegal value in %-th argument of internal potrf'%(-info)
>>>>
>>>> LinAlgError: matrix not positive definite
>>>>
>>>> I think Yarik mentioned changing epsilon or something like that, but
>>>> it seems to me that having to edit the source code just to keep a
>>>> classifier from crashing is not ideal.
>>>>
>>>> Do you have any suggestions for how to fix this? I've been tantalized
>>>> by some very good transfer errors in the folds that GPR was able to
>>>> complete before crashing. Is there a permanent change we could make
>>>> to the GPR code to make it more stable across datasets? Or could we
>>>> turn epsilon into a configurable variable when setting up the
>>>> classifier? If so, what should I change epsilon to in order to see if
>>>> I can get it to run?
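
For example, a configurable epsilon could look something like this
(purely hypothetical sketch, not the actual PyMVPA GPR interface):

class GPRWithJitter(object):
    # Hypothetical: shows epsilon taken at construction time rather than
    # hard-coded inside _train().
    def __init__(self, kernel, epsilon=1.0e-20):
        self.kernel = kernel
        self.epsilon = epsilon  # added as epsilon * I when Cholesky fails

# usage: clf = GPRWithJitter(my_kernel, epsilon=1.0e-10)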
>>>>
>>>> Thanks,
>>>> Per
>>>>