# [pymvpa] probabilities from GNB

Richard Dinga dinga92 at gmail.com
Sun Apr 29 08:52:15 BST 2018

Thanks, this seems to be the solution according to the documentation, but it
doesn't work for my data. Here is an example using the tutorial data:

from mvpa2.suite import *
datapath = '/usr/share/data/pymvpa2-tutorial/'
roi='vt',
'orig', 'vt.nii.gz')})
poly_detrend(haxby, polyord=1, chunks_attr='chunks')
haxby = haxby[np.array([l in ['rest', 'house', 'face']
                        for l in haxby.targets], dtype='bool')]
zscore(haxby, chunks_attr='chunks', param_est=('targets', ['rest']),
       dtype='float32')
haxby = haxby[haxby.sa.targets != 'rest']
haxby = remove_invariant_features(haxby)

clf = GNB(enable_ca='estimates', logprob=True, normalize=True)
cv = CrossValidation(clf, NFoldPartitioner(attr='chunks'), postproc=None)
cv_results = cv(haxby)
print clf.ca.estimates

[[          inf           inf]
 [-234.34792494    0.        ]
 [          inf           inf]
 ...,
 [          inf           inf]
 [          inf           inf]
 [          inf           inf]]
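
The pattern in this output (rows of inf, plus one row with a finite value and a 0.) is what a naive log-space normalization produces once every class likelihood in a row underflows to zero. The following is a minimal sketch of that failure mode in plain Python. It is an assumption about the cause, not something checked against the GNB source; `naive_log_normalize` is a hypothetical helper, not a PyMVPA function, and the input values are made up for illustration.

```python
import math

def naive_log_normalize(log_probs):
    """Normalize log probabilities by exponentiating them directly.

    Hypothetical helper for illustration only; not part of PyMVPA.
    """
    # exp() of a very negative number underflows to exactly 0.0
    total = sum(math.exp(v) for v in log_probs)
    # math.log(0.0) raises ValueError, so mimic the floating-point result (-inf)
    log_total = math.log(total) if total > 0.0 else float('-inf')
    return [v - log_total for v in log_probs]

# Both values underflow: the normalizer is log(0) = -inf, so every entry becomes +inf
print(naive_log_normalize([-1009.0, -1080.0]))

# Only the smaller value underflows: the result stays finite
print(naive_log_normalize([-934.0, -700.0]))
```

If this is indeed what happens, the classifier's predictions are unaffected (the class ranking is decided before normalization), which would match the good accuracy and AUC reported earlier in the thread.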

However, it works with the test data:

from mvpa2.testing.datasets import *
ds_test = datasets['uni2medium']
cv_results = cv(ds_test)
print np.round(np.exp(clf.ca.estimates), 3)

[[ 0.956  0.044]
 [ 1.     0.   ]
 [ 1.     0.   ]
 ...,
 [ 0.     1.   ]
 [ 0.168  0.832]
 [ 0.001  0.999]]
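
One possible workaround when the normalized estimates are unusable is to take the unnormalized log values and normalize them separately with the log-sum-exp trick, which subtracts the row maximum before exponentiating. Below is a minimal sketch in plain Python, assuming each row holds unnormalized per-class log probabilities for one sample. `log_softmax_row` is a hypothetical helper name, not a PyMVPA function; the input rows are values quoted from the first message in this thread. SciPy's `scipy.special.logsumexp` applies the same trick to arrays.

```python
import math

def log_softmax_row(log_probs):
    """Return normalized log probabilities for one sample.

    Hypothetical helper, not part of PyMVPA. Subtracting the maximum first
    guarantees that at least one term is exp(0) = 1, so the sum cannot
    underflow to zero.
    """
    m = max(log_probs)
    log_total = m + math.log(sum(math.exp(v - m) for v in log_probs))
    return [v - log_total for v in log_probs]

# Rows quoted from the first message in this thread
rows = [
    [-1009.22758728, -1079.77409491],
    [-795.59690176, -1038.32481958],
    [-1483.49338276, -856.61441132],
]

for row in rows:
    log_post = log_softmax_row(row)
    post = [math.exp(v) for v in log_post]  # probabilities; each row sums to 1
    print(post)
```

With log values this far apart, the resulting probabilities sit extremely close to 0 and 1. That is typical for naive Bayes over many features, because the independence assumption adds up many per-feature log likelihoods.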

On Thu, Apr 26, 2018 at 6:11 PM, Yaroslav Halchenko <debian at onerussian.com>
wrote:

>
> On Thu, 26 Apr 2018, Richard Dinga wrote:
>
> > Hi,
> > I am trying to get a probability prediction for each sample from
> > cross-validation. I used .ca.stats.sets to get those; however, for GNB
> > these all look like this:
>
> > print cvte.ca.stats.sets[0][2]
> > [[-1009.22758728 -1079.77409491]
> >  [ -795.59690176 -1038.32481958]
> >  [ -875.73917377 -1189.377741  ]
> >  ...,
> >  [-1483.49338276  -856.61441132]
> >  [-1308.29372328  -815.90664933]
> >  [-1169.79999768  -737.54291075]]
>
> > I thought these are log probabilities, but after exponentiation, they are
> > all 0, although based on accuracy and AUC the classifier works fine.
>
> > Any idea how to fix this or is this as good as it can get? My ultimate goal
> > is to get GNB probabilities from GNB searchlight. Trying the same thing
> > using SMLR seems to produce valid probabilities (in a sane range and rows
> > sum to 1).
>
> > Best regards,
> > Richard
>
>
> Try setting normalize=True for your GNB.  Here is the relevant part of
> the GNB? output in ipython; check the other parameters, which might
> also be relevant:
>
> normalize : bool, optional
>   Normalize (log)prob by P(data). Requires probabilities thus for
>   `logprob` case would require exponentiation of 'logprob's, thus
>   disabled by default since does not impact classification output.
>   Constraints: value must be convertible to type bool. [Default:
>   False]
>
>
>
> --
> Yaroslav O. Halchenko
> Center for Open Neuroscience     http://centerforopenneuroscience.org
> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
> Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419