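Theorem

For a hypothesis $H$ and evidence $E$ with $P(E) > 0$,

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}.$$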

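Proof: Using conditional probability, $P(H \cap E) = P(H \mid E)\,P(E)$. Similarly, $P(H \cap E) = P(E \mid H)\,P(H)$. Setting them equal to each other we get $P(H \mid E)\,P(E) = P(E \mid H)\,P(H)$, and dividing through by $P(E)$ (which requires $P(E) > 0$) gives Bayes’ Theorem.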

Terminology

  1. $P(H)$ is called the Prior. This is our current belief in the hypothesis $H$ before we see the data/evidence $E$. It is our starting point.
  2. $P(E \mid H)$ is called the Likelihood. It is the probability of observing the evidence $E$ if our hypothesis $H$ were true. Effectively, it measures how compatible our hypothesis/current beliefs are with the evidence we just observed.
  3. $P(E)$ is called the Evidence or Marginal Likelihood. This is the probability of observing $E$ across all possible hypotheses. It serves as a normalisation constant.
  4. $P(H \mid E)$ is called the Posterior. This is our updated belief in our hypothesis $H$ after accounting for the new evidence $E$.
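
To make the four terms concrete, here is a minimal sketch of a Bayes update over a finite set of hypotheses. The function name `posterior` and the numbers (a 1% prior, a likelihood of 0.95 under the hypothesis and 0.05 under its complement) are hypothetical, chosen only for illustration. The evidence is computed by summing likelihood times prior over all hypotheses.

```python
def posterior(priors, likelihoods):
    """Return the posterior probability of each hypothesis.

    priors[i]      is the prior probability of hypothesis i (should sum to 1).
    likelihoods[i] is the probability of the observed evidence under hypothesis i.
    """
    # Unnormalised posterior: likelihood times prior for each hypothesis.
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    # Evidence (marginal likelihood): total over all hypotheses.
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("the evidence has zero probability under every hypothesis")
    # Dividing by the evidence normalises the posterior so it sums to 1.
    return [j / evidence for j in joint]


# Hypothetical example: hypothesis true vs. hypothesis false.
priors = [0.01, 0.99]        # prior belief in each hypothesis
likelihoods = [0.95, 0.05]   # probability of the evidence under each hypothesis
post = posterior(priors, likelihoods)
print(post)  # the first entry is roughly 0.16
```

Even with a likelihood of 0.95, the posterior stays modest here because the prior is small. This is the sense in which the prior is the starting point that the evidence updates.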

Note: Bayes’ Theorem is often written as $P(H \mid E) \propto P(E \mid H)\,P(H)$, leaving out the normalisation factor (the evidence). The evidence is often hard to compute, and we don’t have to compute it if we have conjugacy, that is, when the posterior PDF is in the same family as the prior PDF but with new parameter values.
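
As an illustration of conjugacy, the sketch below uses the Beta prior with a Binomial likelihood, a standard conjugate pair that is not taken from the text above. The prior parameters and the counts are hypothetical. With a Beta(a, b) prior and s successes and f failures observed, the posterior is Beta(a + s, b + f), so no evidence integral is needed. The grid computation is there only to check that shortcut against a brute-force normalisation.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x in (0, 1)."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


def conjugate_update(a, b, successes, failures):
    """Beta prior + Binomial data -> Beta posterior, by updating the parameters."""
    return a + successes, b + failures


# Hypothetical prior and data.
a, b = 2.0, 2.0
successes, failures = 7, 3
a_post, b_post = conjugate_update(a, b, successes, failures)

# Brute-force check: normalise likelihood * prior numerically on a grid.
n = 10_000
grid = [(i + 0.5) / n for i in range(n)]  # midpoints of n equal cells
unnorm = [x ** successes * (1 - x) ** failures * beta_pdf(x, a, b) for x in grid]
evidence = sum(unnorm) / n  # midpoint-rule estimate of the evidence integral

# Largest pointwise gap between the numerical and the conjugate posterior.
max_gap = max(
    abs(u / evidence - beta_pdf(x, a_post, b_post))
    for x, u in zip(grid, unnorm)
)
print(a_post, b_post, max_gap)
```

The binomial coefficient is omitted from the likelihood because it does not depend on the parameter and cancels in the normalisation. This is the proportional form of Bayes’ Theorem at work.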