# [pymvpa] BayesConfusionHypothesis

Emanuele Olivetti emanuele at relativita.com
Tue Jun 25 12:48:34 UTC 2013

```
Dear Marco,

Sorry for the late reply, I'm traveling these days.

BayesConfusionHypothesis, by default, computes the posterior probability of
each hypothesis tested on the confusion matrix. As you correctly report, there
is one hypothesis for each possible partition of the set of class labels. For
example, for three class labels (A,B,C) there are 5 possible partitions:
H_1=((A),(B),(C)), H_2=((A,B),(C)), H_3=((A,C),(B)), H_4=((A),(B,C)),
H_5=((A,B,C)).

The posterior probability of each hypothesis is computed in the usual way (let
CM be the confusion matrix):

p(H_i | CM) = p(CM | H_i) * p(H_i) / (sum_j p(CM | H_j) * p(H_j))

where p(H_i) is the prior probability of each hypothesis and p(CM | H_i) is
the (integrated) likelihood of each hypothesis.
The default value for p(H_i) is p(H_i) = 1/(number of hypotheses), i.e. no
hypothesis is preferred. You can specify a different prior through the
"prior_Hs" parameter of BayesConfusionHypothesis.

The values returned by BayesConfusionHypothesis, i.e. the posterior
probabilities of each hypothesis, quantify how likely each hypothesis is in
the light of the data and of the priors that you assumed. So those values
should be what you are looking for.

If you set "postprob=False" in BayesConfusionHypothesis, you will get the
likelihood of each model/hypothesis, i.e. p(CM | H_i), instead of the posterior
probabilities. This is a different quantity. Note that, unlike the p(H_i | CM),
the p(CM | H_i) do not sum to one. The likelihoods (which are "integrated
likelihoods", or Bayesian likelihoods) are useful to compare hypotheses in
pairs. For example, if you want to know how much evidence the data provide in
favor of discriminating all classes, i.e. H_1=((A),(B),(C)), compared to not
discriminating any class, i.e. H_5=((A,B,C)), then you can look at the ratio
B_15 = p(CM|H_1) / p(CM|H_5), which is called the Bayes factor (similar to the
likelihood ratio of the frequentist approach, but note that the likelihoods are
not frequentist likelihoods). If that number is >1, then the data support H_1
more than H_5. More detailed guidelines to interpret the value of the Bayes
factor can be found, for example, in Kass and Raftery (JASA 1995).

In the paper Olivetti et al (PRNI 2012) I presented the Bayes factor approach,
but I believe that looking at the posterior probabilities - which is the
PyMVPA default that I proposed - is simpler and clearer, especially in the
case of many hypotheses/partitions.
I am describing these things in an article in preparation.

The parameters "space" and "hypotheses" of BayesConfusionHypothesis have
the following meaning:

- "space" is the name of the dataset field in which the posterior probabilities
are stored. That dataset is the output of BayesConfusionHypothesis. You might
want to change the default name "hypothesis". Or not :).

- "hypotheses" may be useful if you want to define your own set of
hypotheses/partitions instead of relying on all possible partitions of the set
of classes. The default value "None" triggers the internal computation of all
possible partitions. If you do not have strong reasons to change this default
behavior, I guess you should stick with the default value.

Best,

Emanuele
Olivetti

On 06/21/2013 08:47 AM, marco tettamanti wrote:
> Dear all,
> first of all I take my first chance to thank the authors for making such a
> great software as pymvpa available!
>
> I have some (beginner) questions regarding the BayesConfusionHypothesis
> algorithm for multiclass pattern discrimination.
>
> If I understand it correctly, what the algorithm does is to compare all
> possible partitions of classes and it then reports the most likely
> partitioning hypothesis to explain the confusion matrix (i.e. highest log
> likelihood among those of all possible hypotheses, as stored in the .sample
> attribute).
>
> Apart from being happy to see confirmed my hypothesis of all classes being
> discriminable from each other, is there any way to obtain or calculate some
> measures of how likely it is that the most likely hypothesis is truly
> strongly/weakly superior to some or all of the alternative hypotheses?
> For instance, Olivetti et al (PRNI 2012) state that a BF>1 is sufficient to
> support H1 over H0 and report Bayes Factor and binomial tests in tables.
>
> I assume I should know the answer, so forgive me for my poor statistics.
>
> On a related matter: I see from the BayesConfusionHypothesis documentation,
> that there should be parameters to define a hypothesis space (space=) or some
> specific hypotheses (hypotheses=).
> Could anybody please provide some examples on how to fill in these parameters?
>
> Thank you and all the best,
> Marco
>

```
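
The quantities discussed in the reply above can be illustrated with a small, self-contained sketch. It does not call PyMVPA: the helper names (`set_partitions`, `posterior_from_log_likelihoods`, `bayes_factor`) are hypothetical, and the log-likelihood values are made up purely for illustration. It only shows how one hypothesis per partition arises, how posteriors follow from likelihoods and priors via Bayes' rule, and how a Bayes factor compares two hypotheses.

```python
import math


def set_partitions(labels):
    """Yield every partition of `labels` as a list of blocks (lists)."""
    labels = list(labels)
    if not labels:
        yield []
        return
    first, rest = labels[0], labels[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partition


def posterior_from_log_likelihoods(log_likelihoods, priors=None):
    """Return p(H_i | CM) given log p(CM | H_i) and strictly positive priors.

    With priors=None a uniform prior 1/(number of hypotheses) is assumed.
    """
    n = len(log_likelihoods)
    if priors is None:
        priors = [1.0 / n] * n
    log_joint = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_joint)
    weights = [math.exp(v - top) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


def bayes_factor(log_lik_a, log_lik_b):
    """Return p(CM | H_a) / p(CM | H_b) from the two log-likelihoods."""
    return math.exp(log_lik_a - log_lik_b)


# One hypothesis per partition of the class labels (5 for three labels).
# The enumeration order here is arbitrary and need not match H_1..H_5 above.
hypotheses = list(set_partitions("ABC"))

# Made-up log-likelihoods, one per hypothesis, for illustration only.
log_liks = [-20.0, -14.0, -15.0, -15.5, -12.0]
posteriors = posterior_from_log_likelihoods(log_liks)

# Compare "all classes discriminated" against "no class discriminated".
all_separate = next(i for i, h in enumerate(hypotheses) if len(h) == 3)
all_merged = next(i for i, h in enumerate(hypotheses) if len(h) == 1)
bf = bayes_factor(log_liks[all_separate], log_liks[all_merged])

for h, p in zip(hypotheses, posteriors):
    print(h, round(p, 4))
print("Bayes factor (all separate vs. all merged):", bf)
```

Working with log-likelihoods rather than raw likelihoods is a design choice of this sketch: integrated likelihoods of a confusion matrix can be extremely small numbers, so normalizing in log space keeps the posterior computation numerically stable.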