Asymptotic Computations of MLEs

We apply the Fundamental Theorem of MLEs to a few computations.

Exercise

The relative entropy of the empirical distribution and distributions on a bit.

Recall the classical coordinates on probability distributions on \(\{0,1\}\):

\[\begin{split}\begin{align*} \mathrm{Prob}(\{0,1\}) & \overset{\simeq}{\longrightarrow} [0,1] \\ \rho &\longmapsto \rho(0) = p \end{align*}\end{split}\]

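Given data \(X \in \{0,1\}^{\times N}\) of size \(N\), drawn from a distribution \(\rho\), and a model distribution \(\rho_\theta\), the relative entropy of the empirical distribution \(\rho_X\) with respect to \(\rho_\theta\) is: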

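\[\begin{split}\begin{align*} \mathcal{D}(\rho_X || \rho_\theta) &\approx \mathcal{D}(\rho || \rho_\theta) \\ &= - \bigl(p \log p_\theta + (1-p)\log(1 - p_\theta) \bigr) - \mathcal{S}(\rho) \end{align*}\end{split}\]

where \(p = \rho(0)\) and \(p_\theta = \rho_\theta(0)\). The approximation holds for large \(N\), when \(\rho_X\) is close to \(\rho\). The last term, \(\mathcal{S}(\rho) = -p \log p - (1-p)\log(1-p)\), is the entropy of \(\rho\) (approximately that of the empirical distribution), and does not depend on \(\rho_\theta\).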
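The approximation \(\mathcal{D}(\rho_X || \rho_\theta) \approx \mathcal{D}(\rho || \rho_\theta)\) can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper `relative_entropy_bit`, the seed, and the values of `p_true`, `p_model` and `n_samples` are all illustrative assumptions.

```python
import numpy as np

def relative_entropy_bit(p, q):
    """Relative entropy D(rho || rho_theta) on {0,1}, with p = rho(0), q = rho_theta(0)."""
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:  # convention: a term with a = 0 contributes nothing
            total += a * np.log(a / b)
    return float(total)

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
p_true, p_model, n_samples = 0.3, 0.5, 100_000  # illustrative values

# X[i] == 0 with probability p_true, matching the coordinate p = rho(0).
X = (rng.random(n_samples) >= p_true).astype(int)
p_empirical = float(np.mean(X == 0))

d_empirical = relative_entropy_bit(p_empirical, p_model)  # D(rho_X || rho_theta)
d_true = relative_entropy_bit(p_true, p_model)            # D(rho || rho_theta)
print(d_empirical, d_true)
```

For data of this size the two printed values should agree to a few decimal places; the gap shrinks as `n_samples` grows.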

Example

We now compute the standard deviation of \(\hat{\rho}_\theta\), using the computation above and the fundamental theorem. As we are in dimension 1, the Hessian is just a second derivative, and its inverse is the variance per sample:

\[\begin{split}\begin{align*} \sigma_\theta^2 &= \Bigl( \partial^2_{p_\theta} \mathcal{D}(\rho || \rho_\theta) \bigl\lvert_{\rho_\theta = \rho} \Bigr)^{-1} \\ &= \bigl( \frac{1}{p} + \frac{1}{1-p} \bigr)^{-1} \\ &= p(1-p) \end{align*}\end{split}\]

For data of size \(N\), the standard deviation of the estimate is therefore \(\sqrt{p(1-p)/N}\).

This recovers a result which is commonly justified using the central limit theorem. I prefer this derivation.
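As a sanity check, the inverse of the second derivative can be compared with the spread of simulated estimates. This is a minimal sketch under stated assumptions: it takes the maximum likelihood estimate of \(p\) to be the empirical frequency of zeros, uses a central finite difference in place of the analytic derivative, and the names `p`, `h`, `n_samples` and `n_trials` are illustrative choices rather than anything fixed by the notes.

```python
import numpy as np

def relative_entropy_bit(p, q):
    """D(rho || rho_theta) on {0,1} for 0 < p, q < 1, with p = rho(0), q = rho_theta(0)."""
    return p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q))

p = 0.3   # illustrative true parameter
h = 1e-4  # finite-difference step

# Central finite difference for the second derivative in p_theta, at p_theta = p.
second_derivative = (
    relative_entropy_bit(p, p + h)
    - 2.0 * relative_entropy_bit(p, p)
    + relative_entropy_bit(p, p - h)
) / h**2
inverse_hessian = 1.0 / second_derivative  # should be close to p * (1 - p)

# Simulate many data sets of size n_samples; the number of zeros in each
# is Binomial(n_samples, p), and the estimate is the frequency of zeros.
rng = np.random.default_rng(0)
n_samples, n_trials = 1_000, 20_000
p_hat = rng.binomial(n_samples, p, size=n_trials) / n_samples
scaled_variance = n_samples * float(np.var(p_hat))

print(inverse_hessian, scaled_variance, p * (1.0 - p))
```

The three printed numbers should be close to one another, with the simulated value fluctuating by about a percent from run to run if the seed is changed.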

Exercise

The relative entropy of the empirical distribution and a multinomial distribution.

Example

Distributions on a finite set: compute the Hessian, find normal coordinates of the Hessian, say something interesting.

Exercise

Recall the standard coordinates on the space of normal distributions on \(\mathbb{R}\), given by the first two cumulants: the expected value \(\mu\) and the squared standard deviation \(\sigma^2\).

\[\begin{align*} \mathcal{D}(\rho_0 || \rho_1) &= \frac{1}{2} \Bigl( \log \bigl( \sigma_1^2 \bigr) + \frac{1 + \mu_1^2}{\sigma_1^2} - 1 \Bigr) \end{align*}\]

Here, we are working in units in which:

\[\mu_0 = 0, \sigma_0 = 1\]
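The following is a minimal sketch for this case, assuming the standard closed form for the relative entropy between two normal distributions, written in the units \(\mu_0 = 0, \sigma_0 = 1\). It compares the closed form against a direct numerical integration, then estimates the Hessian at \(\rho_1 = \rho_0\) in the coordinates \((\mu_1, \sigma_1^2)\) by finite differences. The function names, the integration grid, and the step size are illustrative assumptions.

```python
import numpy as np

def kl_from_standard_normal(mu1, var1):
    """Closed form D(rho_0 || rho_1) with rho_0 = N(0, 1) and rho_1 = N(mu1, var1)."""
    return 0.5 * (np.log(var1) + (1.0 + mu1**2) / var1 - 1.0)

def kl_numeric(mu1, var1):
    """Same quantity as a Riemann sum of rho_0 * log(rho_0 / rho_1) on a fine grid."""
    x = np.linspace(-12.0, 12.0, 200_001)
    dx = x[1] - x[0]
    log_rho0 = -0.5 * x**2 - 0.5 * np.log(2.0 * np.pi)
    log_rho1 = -0.5 * (x - mu1) ** 2 / var1 - 0.5 * np.log(2.0 * np.pi * var1)
    return float(np.sum(np.exp(log_rho0) * (log_rho0 - log_rho1)) * dx)

def hessian_at_reference(f, h=1e-3):
    """Finite-difference Hessian of f(mu1, var1) at the reference point (0, 1)."""
    x0 = np.array([0.0, 1.0])
    steps = np.eye(2) * h
    H = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            ei, ej = steps[i], steps[j]
            H[i, j] = (
                f(*(x0 + ei + ej)) - f(*(x0 + ei - ej))
                - f(*(x0 - ei + ej)) + f(*(x0 - ei - ej))
            ) / (4.0 * h**2)
    return H

closed = kl_from_standard_normal(0.7, 2.0)
numeric = kl_numeric(0.7, 2.0)
H = hessian_at_reference(kl_from_standard_normal)
print(closed, numeric)
print(H)
```

In these coordinates the estimated Hessian should come out close to diagonal, with entries near 1 for \(\mu_1\) and 1/2 for \(\sigma_1^2\).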

Example

Normal distribution with fixed standard deviation: compute the Hessian, find normal coordinates of the Hessian, say something interesting.

Exercise

The relative entropy of the empirical distribution and

Example

General normal distributions: compute the Hessian, find normal coordinates of the Hessian, say something interesting.

Exercise

The relative entropy of the empirical distribution and

Example

Exponential distribution: compute the Hessian, find normal coordinates of the Hessian, say something interesting.

Exercise

The relative entropy of the empirical distribution and

Example

Poisson distribution: compute the Hessian, find normal coordinates of the Hessian, say something interesting.

Note

These are spherical and hyperbolic metrics.