# Prior probability

In Bayesian statistical inference, a prior probability distribution, often simply called the prior, of an uncertain quantity is the probability distribution that would express one's beliefs about this quantity before some evidence is taken into account. For example, the prior could be the probability distribution representing the relative proportions of voters who will vote for a particular politician in a future election. The unknown quantity may be a parameter of the model or a latent variable rather than an observable variable. Bayes' theorem calculates the renormalized pointwise product of the prior and the likelihood function, to produce the posterior probability distribution, which is the conditional distribution of the uncertain quantity given the data. Similarly, the prior probability of a random event or an uncertain proposition is the unconditional probability that is assigned before any relevant evidence is taken into account. Priors can be created using a number of methods. A prior can be determined from past information, such as previous experiments. A prior can be elicited from the purely subjective assessment of an experienced expert. An uninformative prior can be created to reflect a balance among outcomes when no information is available. Priors can also be chosen according to some principle, such as symmetry or maximizing entropy given constraints; examples are the Jeffreys prior or Bernardo's reference prior. When a family of conjugate priors exists, choosing a prior from that family simplifies calculation of the posterior distribution. Parameters of prior distributions are a kind of hyperparameter. For example, if one uses a beta distribution to model the distribution of the parameter p of a Bernoulli distribution, then: p is a parameter of the underlying system (Bernoulli distribution), and α and β are parameters of the prior distribution (beta distribution); hence hyperparameters. Hyperparameters themselves may have hyperprior distributions expressing beliefs about their values. A Bayesian model with more than one level of prior like this is called a hierarchical Bayes model.

## Words

This table shows the example usage of word lists for keywords extraction from the text above.

Word | Word Frequency | Number of Articles | Relevance |
---|---|---|---|

prior | 16 | 57529 | 0.217 |

distribution | 13 | 24048 | 0.21 |

probability | 8 | 3610 | 0.174 |

priors | 3 | 157 | 0.093 |

bayes | 3 | 163 | 0.093 |