In probability theory and statistics, the Dirichlet process (DP) is one of the most popular Bayesian nonparametric models. It was introduced by Thomas Ferguson[1] as a prior over probability distributions.
A Dirichlet process $\mathrm{DP}(s, G_0)$ is completely defined by its parameters: $G_0$ (the base distribution or base measure) is an arbitrary distribution and $s$ (the concentration parameter) is a positive real number (it is often denoted as $\alpha$). According to the Bayesian paradigm, these parameters should be chosen based on the available prior information on the domain.
The question is: how should we choose the prior parameters $(s, G_0)$ of the DP, in particular the infinite-dimensional one $G_0$, when prior information is lacking?
To address this issue, the only prior that has been proposed so far is the limiting DP obtained for $s \to 0$, which was introduced under the name of Bayesian bootstrap by Rubin;[2] in fact, it can be proven that the Bayesian bootstrap is asymptotically equivalent to the frequentist bootstrap introduced by Bradley Efron.[3] The limiting Dirichlet process $s \to 0$ has been criticized on diverse grounds. From an a priori point of view, the main criticism is that taking $s \to 0$ is far from leading to a noninformative prior.[4] Moreover, a posteriori, it assigns zero probability to any set that does not include the observations.[2]
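The a posteriori criticism can be illustrated with a minimal sketch of the Bayesian bootstrap. It assumes the usual formulation in which each posterior draw is a discrete distribution on the observed points with weights drawn from a Dirichlet(1, ..., 1) distribution. The data values and the function names (`bayesian_bootstrap_weights`, `posterior_mean_samples`, `posterior_mass`) are hypothetical and chosen only for illustration.

```python
import random


def bayesian_bootstrap_weights(n, rng):
    """Draw one weight vector from Dirichlet(1, ..., 1) over n observations."""
    # Independent Exp(1) draws, normalised to sum to one, are
    # Dirichlet(1, ..., 1) distributed.
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def posterior_mean_samples(data, num_draws=2000, seed=0):
    """Posterior draws of the mean under the Bayesian bootstrap."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_draws):
        weights = bayesian_bootstrap_weights(len(data), rng)
        samples.append(sum(w * x for w, x in zip(weights, data)))
    return samples


def posterior_mass(data, weights, event):
    """Mass that one posterior draw assigns to the set {x : event(x)}."""
    return sum(w for w, x in zip(weights, data) if event(x))


data = [2.1, 3.4, 1.9, 5.0, 4.2]  # illustrative observations (assumption)
means = posterior_mean_samples(data)
print(min(means), max(means))  # every draw lies between min(data) and max(data)

# No observation exceeds 5.0, so every posterior draw gives this set mass 0.
w = bayesian_bootstrap_weights(len(data), random.Random(1))
print(posterior_mass(data, w, lambda x: x > 5.0))
```

Because every draw places all of its mass on the observed values, any set containing no observation receives posterior probability zero, however plausible that set may be.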
The imprecise Dirichlet process[5] has been proposed to overcome these issues. The basic idea is to fix $s > 0$ but not to choose any precise base measure $G_0$.
More precisely, the imprecise Dirichlet process (IDP) is defined as follows:
$$\mathrm{IDP}: \left\{ \mathrm{DP}(s, G_0) : G_0 \in \mathbb{P} \right\}$$
where $\mathbb{P}$ is the set of all probability measures. In other words, the IDP is the set of all Dirichlet processes (with a fixed $s > 0$) obtained by letting the base measure $G_0$ span the set of all probability measures.
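As a brief illustration of what this set of priors implies, the following sketch writes $\mathrm{DP}(s, G_0)$ for the Dirichlet process with concentration parameter $s$ and base measure $G_0$, and $\mathbb{P}$ for the set of all probability measures. It relies on the standard property that the prior mean of a Dirichlet process is its base measure. The set $A$ and the notation $\underline{E}$, $\overline{E}$ for lower and upper expectations are introduced here only for illustration.

$$P \sim \mathrm{DP}(s, G_0) \;\Rightarrow\; E[P(A)] = G_0(A)$$

$$\underline{E}[P(A)] = \inf_{G_0 \in \mathbb{P}} G_0(A) = 0, \qquad \overline{E}[P(A)] = \sup_{G_0 \in \mathbb{P}} G_0(A) = 1$$

The bounds hold for any measurable set $A$ that is neither empty nor the whole space: the infimum is attained by a base measure concentrated on a point outside $A$, and the supremum by one concentrated on a point inside $A$. A priori, then, the IDP commits to no particular value of the expected probability of $A$.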