1.2.1 Prior probability density
p(θ)
Our knowledge about θ is assumed to be captured by a known prior distribution p(θ), which expresses previous knowledge of θ (for example, from past experience) before any evidence has been observed.
1.2.2 Likelihood function
The form of p(z|θ) is assumed known, but the value of θ is not known exactly. The likelihood reads as "the probability of the observation, given that the hypothesis is true". It measures how strongly the hypothesis predicts the observation.
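As a concrete illustration (a hypothetical example, not one from the text), the likelihood for a sequence of coin flips under a Bernoulli model, where θ is the unknown probability of heads, can be sketched as:

```python
def likelihood(theta, data):
    """p(z | theta): probability of the observed flips (1 = heads) given theta."""
    p = 1.0
    for z in data:
        p *= theta if z == 1 else (1.0 - theta)
    return p

data = [1, 0, 1, 1]            # hypothetical observations: three heads, one tail
print(likelihood(0.50, data))  # 0.0625
print(likelihood(0.75, data))  # larger: this hypothesis predicts the data more strongly
```

A hypothesis that assigns higher probability to what was actually observed has a higher likelihood, which is exactly the sense in which it "predicts the observation".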
1.2.3 Normalization factor (evidence)
p(z) = ∫ p(z|θ) p(θ) dθ
The rest of our knowledge about θ is contained in a set D of n random samples x1, x2, …, xn drawn from p(z). In the context of Bayes' theorem, the evidence is the total probability of the observation, averaged over all hypotheses; it acts as the normalizing constant of the posterior.
1.2.4 Posterior probability density
p(θ|z) = p(z|θ) p(θ) / p(z)
The posterior probability is obtained by multiplying the prior probability by the likelihood and then dividing by the evidence. So let us rewrite Bayes' theorem in the following form:
posterior = (likelihood × prior) / evidence
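This update rule can be sketched numerically by discretizing θ on a grid, so that the integral in the evidence becomes a sum. The data and the flat prior below are hypothetical choices for illustration:

```python
import numpy as np

theta = np.linspace(0.001, 0.999, 999)   # grid over the parameter theta
prior = np.ones_like(theta)              # flat prior p(theta)
prior /= prior.sum()                     # normalize on the grid

data = [1, 0, 1, 1]                      # hypothetical coin flips (1 = heads)
heads = sum(data)
likelihood = theta**heads * (1 - theta)**(len(data) - heads)

evidence = np.sum(likelihood * prior)    # p(z): the normalization factor
posterior = likelihood * prior / evidence

print(posterior.sum())                   # sums to 1 (up to float error)
print(theta[np.argmax(posterior)])       # posterior mode, near 3/4 here
```

Dividing by the evidence is what turns the product likelihood × prior into a proper distribution over θ.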
1.2.5 Properties that might be used in the following sections
The posterior p(θ|D) is a distribution over θ and has all the usual properties of a distribution. In particular:
• The posterior distribution integrates to 1: ∫ p(θ|D) dθ = 1.
• We may compute the posterior probability that θ lies in a set A by P(θ ∈ A | D) = ∫_A p(θ|D) dθ.
• The posterior distribution has a mean and a variance, just like any other distribution. If we have to make a guess as to the exact value of θ, one commonly used guess is the posterior mean, E[θ|D] = ∫ θ p(θ|D) dθ.
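The three properties above can be checked on the same kind of discretized grid. The posterior shape below (proportional to θ³(1 − θ), i.e. a Beta(4, 2) density) and the set A = [0.5, 1] are hypothetical choices for illustration:

```python
import numpy as np

theta = np.linspace(0.001, 0.999, 999)
posterior = theta**3 * (1 - theta)       # unnormalized p(theta | D), Beta(4, 2) shape
posterior /= posterior.sum()             # property 1: sums to 1 on the grid

# property 2: P(theta in A | D) for the set A = [0.5, 1]
prob_A = posterior[theta >= 0.5].sum()

# property 3: posterior mean and variance
mean = np.sum(theta * posterior)
var = np.sum((theta - mean)**2 * posterior)

print(prob_A)   # close to 0.8125, the exact Beta(4, 2) tail probability
print(mean)     # close to 2/3, the exact Beta(4, 2) mean
print(var)      # close to 8/252, the exact Beta(4, 2) variance
```

The grid sums converge to the exact integrals as the grid is refined, so the same code serves as a cheap check of any posterior computed numerically.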