![[Pasted image 20240710005946.png]]
# Bayesian Statistics
**Bayesian statistics** is a [[Probability Theory|probabilistic]] framework that blends *prior beliefs* with *observed data* to update and refine our understanding of uncertainty.
Bayes’ Theorem, also known as Bayes’ Rule, is a simple equation to calculate the conditional probability
> [!abstract] **Definition:** Bayes’ Theorem
>
$P(A | B) = \frac{P(B|A) \cdot P(A)}{P(B)} \text{, where } P(B) \ne 0$
>
> $P(A)$ is the *prior probability*. It represents what we understand about $A$ before new evidence is taken into account.
> a
> $P(B)$
>
> $P(A|B)$ is the *posterior probability*. It represents the probability of event $A$ occuring given that event $B$ is true.
>
> $P(B|A)$ is called the
## The Bayesian Framework
![[Pasted image 20240709090337.png|450]]
Bayes’ Theorem provides the mathematical framework for the modeling used by Bayesian statistics.
Let $P(H)$ be the probability of some hypothesis and $P(e)$ be the probability of
$P(H|e) = \frac{P(e|H) \cdot P(H)}{P(e)} = \frac{P(e|H)P(H)}{P(e|H)P(H)+P(e|\neg H)P(\neg H)}$
$P(H | e)$ is the **posterior probability** of the parameters $\theta$ given the evidence $e$.
- The probability distribution of the parameter given the evidence, $e$, and a model/hypothesis, $H$.
- Sometimes referred to as the “joint a-posteriori probability distribution”
$P(e|H)$ is the **likelihood** of the data given the parameters.
-
$P(H)$ is the **prior** probability of the parameters.
-
$P(e)$ is the marginal likelihood or **evidence/data** (a normalizing constant)
-
### Bayesian Learning Methods
**Features of Bayesian learning methods include:**
- *Flexibility:* Each observed training example can incrementally decrease or increase the estimated probability that a hypothesis is correct, rather than completely eliminating it if its found to be inconsistent with a single example.
### Conjugate Priors
## Bayesian Inference
### Maximum A Posteriori (MAP) Hypothesis
> See also:
> - [[Expectation Maximization (EM)]]
In simple terms, the **maximum a posteriori (MAP)** is the mode of the computed posterior.
*a priori*