Stats 101 Examples with MCMC Part 2

Estimating the mean with unknown σ

In #math

This is another in the series of "Statistics 101" examples solved with MCMC. The previous post in the series can be found here. In all of these posts I'm going to use a Python library I made for my science-programming class, hosted on GitHub, together with the emcee library. Install them like:

pip install "git+git://github.com/bblais/sci378" --upgrade
pip install emcee

In this example, like the last one, there is some true value, which we call $\mu$. The $N$ data points we have, $\{x_i\}$, consist of that true value with added independent, normally-distributed noise of unknown scale, $\sigma$.

For a Bayesian solution, we need to specify the likelihood function -- how our model produces the data -- and the prior probability for the parameters. The likelihood function is determined exactly as before,

$$
\begin{aligned}
x_i &\sim \mu + \epsilon_i \\
P(x_i|\mu,\sigma) &\sim \text{Normal}(x_i-\mu,\sigma) \\
P(\{x_i\}|\mu,\sigma) &\sim \prod_i \text{Normal}(x_i-\mu,\sigma)
\end{aligned}
$$

but now with unknown scale, $\sigma$.

In the "Statistics 101" examples, the results are typically equivalent to uniform priors on the location parameters, but Jeffreys priors on scale parameters, so we'll assume uniform priors on μ\mu and Jeffreys priors on σ\sigma

Setting up the problem

As described above, there is some true value, which we call $\mu$, with normally-distributed noise, but this time with unknown scale, $\sigma$. The likelihood function is written exactly as before,

$$
\begin{aligned}
x_i &\sim \mu + \epsilon_i \\
P(x_i|\mu,\sigma) &\sim \text{Normal}(x_i-\mu,\sigma)
\end{aligned}
$$

but now with unknown $\sigma$,

and we'll assume uniform priors on $\mu$. The prior for $\sigma$, which we must now estimate, is taken to be a Jeffreys prior. This prior is uniform in $\log \sigma$ or, in other words, uniform in scale.
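Concretely, the Jeffreys prior for a scale parameter is

$$
p(\sigma) \propto \frac{1}{\sigma}
\qquad\Longleftrightarrow\qquad
p(\log\sigma) = p(\sigma)\left|\frac{d\sigma}{d\log\sigma}\right| = \frac{1}{\sigma}\cdot\sigma = \text{constant},
$$

so no scale for the noise is preferred a priori.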

def lnlike(data,μ,σ):
    # unknown σ -- it is now a parameter to be estimated along with μ
    x=data
    return lognormalpdf(x,μ,σ)

data=array([12.0,14,16])
model=MCMCModel(data,lnlike,
                μ=Uniform(-50,50),
                σ=Jeffreys(),
)
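Here MCMCModel, Uniform, Jeffreys, and lognormalpdf come from the sci378 library installed above. As a rough guide to what the likelihood is doing, the sketch below is a minimal stand-in for lognormalpdf, assuming (my assumption, not the library's documentation) that it returns the summed log of the normal pdf over the data points:

import numpy as np

def lognormalpdf_sketch(x, mu, sigma):
    # log of prod_i Normal(x_i - mu, sigma): the summed log normal pdf over the data
    x = np.asarray(x)
    return np.sum(-0.5*np.log(2*np.pi*sigma**2) - (x - mu)**2/(2*sigma**2))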

Running MCMC, looking at the results

Now we run MCMC, plot the chains (so we can see that the sampler has converged), and look at the distributions,

model.run_mcmc(800,repeat=3)
model.plot_chains()
Sampling Prior...
Done.
0.27 s
Running MCMC 1/3...
Done.
2.63 s
Running MCMC 2/3...
Done.
2.79 s
Running MCMC 3/3...
Done.
2.47 s

model.plot_distributions()

Comparing to textbook examples

We compare to the textbook solution for $\mu$ (i.e. the Student-t distribution),

x,y=model.get_distribution('μ')
plot(x,y,'-')

# sample mean, sample size, and standard error of the mean
μ̂=mean(data)
N=len(data)
σμ=std(data,ddof=1)/sqrt(N)

# textbook posterior: Student-t with N-1 degrees of freedom
y_pred=[exp(logtpdf(_,N-1,μ̂,σμ)) for _ in x]
plot(x,y_pred,'r--')
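For reference, the textbook marginal posterior being overlaid here is a Student-t centered on the sample mean, with scale equal to the standard error of the mean:

$$
\mu \,|\, \{x_i\} \sim \text{Student-}t_{N-1}\!\left(\hat{\mu},\,\frac{s}{\sqrt{N}}\right),
\qquad
\hat{\mu}=\frac{1}{N}\sum_i x_i,
\qquad
s^2=\frac{1}{N-1}\sum_i (x_i-\hat{\mu})^2 .
$$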

and to the textbook solution for $\sigma$ (i.e. the $\chi^2$ distribution),

x,y=model.get_distribution('σ',bins=800)
plot(x,y,'-')

# textbook posterior for σ: p(σ|data) ∝ σ**(-N) * exp(-V/(2σ²))
V=((data-data.mean())**2).sum()
logp=-N*log(x)-V/2/x**2
y_pred=exp(logp)

# normalize numerically so the curve integrates to 1
dx=x[1]-x[0]
y_pred=y_pred/y_pred.sum()/dx

plot(x,y_pred,'r--')
xlim([0,20])
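The textbook marginal posterior evaluated in the code above is

$$
p(\sigma \,|\, \{x_i\}) \propto \sigma^{-N}\exp\!\left(-\frac{V}{2\sigma^2}\right),
\qquad
V=\sum_i (x_i-\bar{x})^2,
$$

which is equivalent to saying that $V/\sigma^2$ follows a $\chi^2$ distribution with $N-1$ degrees of freedom.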

Tail-area Probabilities

We can easily find the tail-area probability, the Bayesian equivalent of the Student-t test, for $\mu$,

model.P("μ>15")
0.037066666666666664
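Under the hood, a tail-area probability like this is just the fraction of posterior samples that satisfy the condition. A minimal sketch, where mu_samples is a hypothetical 1-D array of flattened posterior samples for μ (the actual accessor in sci378 may differ):

import numpy as np

def tail_area(mu_samples, threshold=15.0):
    # estimate P(μ > threshold | data) as the fraction of samples above the threshold
    return np.mean(np.asarray(mu_samples) > threshold)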

and for σ\sigma, the Bayesian equivalent to Chi-squared test,

model.P("σ<1")
0.01985
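For comparison, the exact tail area implied by the textbook $\chi^2$ result above is $P(\chi^2_{N-1} > V/\sigma_0^2)$ with $\sigma_0=1$, which we can check with scipy (this check is mine, not part of the original analysis):

from scipy import stats

# data = [12, 14, 16] gives V = 8 with N - 1 = 2 degrees of freedom
print(stats.chi2.sf(8.0, df=2))   # ≈ 0.0183

which is in the same ballpark as the sampled estimate above.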