# Conjugate Distributions


# What is a conjugate distribution?

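In Bayesian inference, we start with a prior $p(\theta)$ over a parameter $\theta$ and update it to a posterior $p(\theta | x)$ after observing data $x$. If the posterior $p(\theta | x)$ belongs to the same family of distributions as the prior, then we say that $p(\theta)$ is a `conjugate prior` for the likelihood $p(x | \theta)$.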

This is a very neat property because it means that more data will only change the parameters of the distribution, not the type.
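
To see why only the parameters change, it may help to work through one case. The derivation below is a sketch that assumes a Bernoulli likelihood with success probability $\theta$, a single observation $x \in \{0, 1\}$, and a Beta prior with parameters $\alpha$ and $\beta$ (these symbols are chosen here for illustration):

$$
p(\theta | x) \propto p(x | \theta)\, p(\theta) \propto \theta^{x}(1-\theta)^{1-x} \cdot \theta^{\alpha-1}(1-\theta)^{\beta-1} = \theta^{(\alpha + x) - 1}(1-\theta)^{(\beta + 1 - x) - 1}
$$

The right-hand side has the functional form of a Beta density again, now with parameters $\alpha + x$ and $\beta + 1 - x$, so the observation moved the parameters but not the family.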

And often, with known conjugates, we don't have to apply Bayes' rule explicitly. Rather, we can just use the known closed-form update for the conjugate prior's parameters.
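
As a rough illustration of skipping the explicit Bayes' rule computation, the sketch below compares a closed-form conjugate update with a brute-force posterior computed on a grid. It assumes a Beta prior on a Bernoulli success probability; the function names (`beta_pdf`, `conjugate_update`, `grid_posterior`), the prior `Beta(2, 2)`, and the data are all made up for this example.

```python
import math

def beta_pdf(theta, a, b):
    """Density of Beta(a, b) at theta, computed via log-gamma for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta))

def conjugate_update(a, b, data):
    """Closed-form Beta posterior parameters after observing 0/1 outcomes."""
    successes = sum(data)
    return a + successes, b + len(data) - successes

def grid_posterior(a, b, data, n=2001):
    """Explicit Bayes' rule on a grid: prior times likelihood, then normalize."""
    step = 1.0 / n
    thetas = [(i + 0.5) * step for i in range(n)]  # midpoints avoid log(0)
    s = sum(data)
    f = len(data) - s
    unnorm = [beta_pdf(t, a, b) * t**s * (1 - t)**f for t in thetas]
    z = sum(unnorm) * step  # midpoint-rule estimate of the evidence
    return thetas, [u / z for u in unnorm]

# Hypothetical data: 6 successes and 2 failures, with an assumed Beta(2, 2) prior.
data = [1, 0, 1, 1, 0, 1, 1, 1]
a_post, b_post = conjugate_update(2.0, 2.0, data)
thetas, grid = grid_posterior(2.0, 2.0, data)

# Largest pointwise difference between the grid posterior and Beta(a_post, b_post).
max_gap = max(abs(g - beta_pdf(t, a_post, b_post)) for t, g in zip(thetas, grid))
print(a_post, b_post, max_gap)
```

The two routes should agree up to numerical integration error, but the conjugate route is just two additions, while the grid route needs a numerical integral to normalize.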

# Examples of Conjugates

- Bernoulli → Beta: the success probability $\phi$ can be modeled by a Beta distribution $p(\phi | \alpha, \beta)$, where $\alpha$ and $\beta$ act as counts of successes and failures (prior pseudo-counts plus observed counts). The binomial, negative binomial, and geometric likelihoods also have the Beta as their conjugate prior.

- Poisson → Gamma

- Categorical, Multinomial → Dirichlet

- Normal → Normal (i.e. the mean $\mu$ is modeled by a normal distribution itself, whose parameters change as more samples are added; this holds when the variance of the likelihood is known)
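
The remaining pairs in the list can be sketched the same way. The functions below are a minimal illustration, not a library API: the names are hypothetical, the Gamma prior is assumed to use the shape/rate parameterization, and the normal case assumes the likelihood variance `sigma2` is known.

```python
def gamma_poisson_update(shape, rate, counts):
    """Gamma(shape, rate) prior on a Poisson rate; counts are observed event counts."""
    return shape + sum(counts), rate + len(counts)

def dirichlet_categorical_update(alphas, labels):
    """Dirichlet(alphas) prior on category probabilities; labels are 0-based indices."""
    post = list(alphas)
    for k in labels:
        post[k] += 1  # each observation adds one count to its category
    return post

def normal_known_variance_update(mu0, var0, sigma2, xs):
    """Normal(mu0, var0) prior on the mean; sigma2 is the known likelihood variance."""
    n = len(xs)
    post_precision = 1.0 / var0 + n / sigma2  # precisions (inverse variances) add
    post_mean = (mu0 / var0 + sum(xs) / sigma2) / post_precision
    return post_mean, 1.0 / post_precision

# Hypothetical observations for each model.
gamma_post = gamma_poisson_update(2.0, 1.0, [3, 5, 4])
dirichlet_post = dirichlet_categorical_update([1.0, 1.0, 1.0], [0, 2, 2, 1, 2])
normal_post = normal_known_variance_update(0.0, 1.0, 1.0, [2.0, 2.0])
print(gamma_post, dirichlet_post, normal_post)
```

In each case the posterior is described by the same kind of parameters as the prior, so the update can be repeated as more samples arrive.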