Bayesian estimation of discrete entropy with mixtures of stick-breaking priors
Evan Archer, Il Memming Park, & Jonathan W. Pillow (2012)
Advances in Neural Information Processing Systems 25,
eds. P. Bartlett and F.C.N. Pereira and C.J.C. Burges and
L. Bottou and K.Q. Weinberger, 2024-2032.
We consider the problem of estimating Shannon's entropy H in the undersampled regime, where the number of possible symbols may be unknown or countably infinite. Dirichlet and Pitman-Yor processes provide tractable prior distributions over the space of countably infinite discrete distributions, and have found major applications in Bayesian nonparametric statistics and machine learning. Here we show that they provide natural priors for Bayesian entropy estimation, due to the analytic tractability of the moments of the induced posterior distribution over entropy H. We derive formulas for the posterior mean and variance of H given data. However, we show that a fixed Dirichlet or Pitman-Yor process prior implies a narrow prior on H, meaning the prior strongly determines the estimate in the undersampled regime. We therefore define a family of continuous mixing measures such that the resulting mixture of Dirichlet or Pitman-Yor processes produces an approximately flat prior over H. We explore the theoretical properties of the resulting estimators and show that they perform well on data sampled from both exponential and power-law-tailed distributions.
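
The claim that a fixed Dirichlet process prior implies a narrow prior on H can be checked numerically. The sketch below is an illustration, not the authors' code: it draws distributions from DP(alpha) by truncated stick-breaking and summarizes the entropy of each draw in nats. The function names, the truncation tolerance `tol`, and the choice alpha = 10 are assumptions made for this example. For comparison, the prior mean of H under DP(alpha) should equal psi(alpha + 1) - psi(1), where psi is the digamma function. This follows from the Beta(1, alpha) law of a size-biased stick weight, and for integer alpha it reduces to the harmonic number 1 + 1/2 + ... + 1/alpha.

```python
import math
import random


def sample_dp_entropy(alpha, rng, tol=1e-12):
    """Entropy (in nats) of one truncated stick-breaking draw from DP(alpha).

    Sticks are broken until the unassigned mass drops below `tol`
    (a hypothetical truncation level chosen for this sketch).
    """
    remaining = 1.0  # stick mass not yet assigned to any symbol
    entropy = 0.0
    while remaining > tol:
        v = rng.betavariate(1.0, alpha)  # break fraction V_i ~ Beta(1, alpha)
        p = remaining * v                # probability of the next symbol
        if p > 0.0:
            entropy -= p * math.log(p)
        remaining *= 1.0 - v
    return entropy


def prior_entropy_summary(alpha, n_draws=500, seed=0):
    """Monte Carlo mean and standard deviation of H under a fixed DP(alpha)."""
    rng = random.Random(seed)
    draws = [sample_dp_entropy(alpha, rng) for _ in range(n_draws)]
    mean = sum(draws) / n_draws
    var = sum((h - mean) ** 2 for h in draws) / (n_draws - 1)
    return mean, math.sqrt(var)


if __name__ == "__main__":
    alpha = 10
    mean, sd = prior_entropy_summary(float(alpha))
    harmonic = sum(1.0 / k for k in range(1, alpha + 1))
    print(f"alpha={alpha}: mean H = {mean:.3f}, sd = {sd:.3f}, "
          f"harmonic number = {harmonic:.3f}")
```

With a single fixed alpha, the sampled entropies cluster around one value, which is why the prior dominates the estimate when data are scarce. Mixing over alpha, as the abstract describes, spreads that prior mass over a wide range of H.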

This paper is superseded by:
