Deriving the Maximum Likelihood Estimators
Assume that we have $m$ random vectors, each of size $p$: $X^{(1)}, X^{(2)}, \ldots, X^{(m)}$, where each random vector can be interpreted as an observation (data point) across $p$ variables. Suppose the $X^{(i)}$ are i.i.d. multivariate Gaussian vectors:
$$X^{(i)} \sim \mathcal{N}_p(\mu, \Sigma)$$
where the parameters $\mu, \Sigma$ are unknown. To estimate them we can use the method of maximum likelihood and maximize the log-likelihood function.
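Note that by the independence of the random vectors, the joint density of the data $\{X^{(i)}, i = 1, 2, \ldots, m\}$ is the product of the individual densities, that is $\prod_{i=1}^{m} f_{X^{(i)}}(x^{(i)}; \mu, \Sigma)$. Taking the logarithm gives the log-likelihood function
$$
\begin{aligned}
l(\mu, \Sigma \mid x^{(i)}) &= \log \prod_{i=1}^{m} f_{X^{(i)}}(x^{(i)} \mid \mu, \Sigma) \\
&= \log \prod_{i=1}^{m} \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x^{(i)} - \mu)^T \Sigma^{-1} (x^{(i)} - \mu)\right) \\
&= \sum_{i=1}^{m} \left(-\frac{p}{2} \log(2\pi) - \frac{1}{2} \log|\Sigma| - \frac{1}{2} (x^{(i)} - \mu)^T \Sigma^{-1} (x^{(i)} - \mu)\right) \\
&= -\frac{mp}{2} \log(2\pi) - \frac{m}{2} \log|\Sigma| - \frac{1}{2} \sum_{i=1}^{m} (x^{(i)} - \mu)^T \Sigma^{-1} (x^{(i)} - \mu)
\end{aligned}
$$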
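To make the final expression concrete, here is a minimal NumPy sketch that evaluates it for a data matrix whose rows are the observations. The function name `gaussian_log_likelihood`, the row-wise layout of `X`, and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def gaussian_log_likelihood(X, mu, Sigma):
    """Log-likelihood of the rows of X under N_p(mu, Sigma).

    X has shape (m, p): m observations of p variables.
    """
    m, p = X.shape
    diff = X - mu  # row i is x^(i) - mu
    # slogdet returns (sign, log|det|); more stable than log(det(Sigma)).
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        # A non-positive determinant rules out positive definiteness.
        raise ValueError("Sigma must be positive definite")
    # Sum over i of (x^(i) - mu)^T Sigma^{-1} (x^(i) - mu),
    # using a linear solve instead of an explicit inverse.
    quad = np.sum(diff * np.linalg.solve(Sigma, diff.T).T)
    return -0.5 * m * p * np.log(2 * np.pi) - 0.5 * m * logdet - 0.5 * quad

# Illustrative data: 200 draws from a 2-dimensional Gaussian.
rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
X = rng.multivariate_normal(mean=mu, cov=Sigma, size=200)
print(gaussian_log_likelihood(X, mu, Sigma))
```

For a single observation at the origin with $\mu = 0$ and $\Sigma = I_2$, the formula reduces to $-\log(2\pi)$, which is a quick sanity check on the implementation.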
Deriving $\hat{\mu}$
To take the derivative with respect to μ and equate to zero we will make use of the following matrix calculus identity:
$$\frac{\partial\, w^T A w}{\partial w} = 2Aw$$
if $A$ does not depend on $w$ and $A$ is symmetric.
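Applying it with $w = x^{(i)} - \mu$ and $A = \Sigma^{-1}$, where the chain rule contributes a factor of $-1$, gives
$$
\begin{aligned}
\frac{\partial}{\partial \mu} l(\mu, \Sigma \mid x^{(i)}) &= \sum_{i=1}^{m} \Sigma^{-1} (x^{(i)} - \mu) = 0 \\
0 &= \sum_{i=1}^{m} x^{(i)} - m\mu && \text{since } \Sigma \text{ is positive definite} \\
\hat{\mu} &= \frac{1}{m} \sum_{i=1}^{m} x^{(i)} = \bar{x}
\end{aligned}
$$
which is often called the sample mean vector.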
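As a numerical sanity check on this result, the sketch below evaluates the gradient $\sum_{i} \Sigma^{-1}(x^{(i)} - \mu)$ at the sample mean and at a shifted point. The helper name `score_mu` and the simulated data are illustrative assumptions; the gradient should vanish (up to floating-point error) only at $\bar{x}$.

```python
import numpy as np

def score_mu(X, mu, Sigma):
    """Gradient of the log-likelihood w.r.t. mu: sum_i Sigma^{-1} (x^(i) - mu)."""
    return np.linalg.solve(Sigma, (X - mu).sum(axis=0))

# Illustrative data: 500 draws from a 2-dimensional Gaussian.
rng = np.random.default_rng(1)
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
X = rng.multivariate_normal(mean=[1.0, -2.0], cov=Sigma, size=500)

mu_hat = X.mean(axis=0)  # sample mean vector
grad_at_mle = score_mu(X, mu_hat, Sigma)           # numerically zero
grad_elsewhere = score_mu(X, mu_hat + 0.1, Sigma)  # clearly nonzero
print(mu_hat, grad_at_mle, grad_elsewhere)
```

Note that the check does not depend on which positive definite $\Sigma$ is plugged in, which mirrors the derivation: $\hat{\mu}$ does not involve $\Sigma$.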
Deriving $\hat{\Sigma}$
Deriving the MLE for the covariance matrix requires more work and the use of the following linear algebra and calculus properties:
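- The trace is invariant under cyclic permutations of matrix products: $\operatorname{tr}[ABC] = \operatorname{tr}[CAB] = \operatorname{tr}[BCA]$
- Since $x^T A x$ is a scalar, we can take its trace and obtain the same value: $x^T A x = \operatorname{tr}[x^T A x] = \operatorname{tr}[x x^T A]$
- $\frac{\partial}{\partial A} \operatorname{tr}[AB] = B^T$
- $\frac{\partial}{\partial A} \log|A| = A^{-T}$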
Combining these properties allows us to calculate
$$\frac{\partial}{\partial A} x^T A x = \frac{\partial}{\partial A} \operatorname{tr}[x x^T A] = [x x^T]^T = (x^T)^T x^T = x x^T$$
which is the outer product of the vector $x$ with itself.
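These matrix-derivative identities are easy to check numerically. The following sketch compares each one against a central finite-difference gradient, treating every entry of $A$ as an independent variable. The helper `numerical_gradient`, the step size, and the random test matrices are assumptions chosen for illustration.

```python
import numpy as np

def numerical_gradient(f, A, eps=1e-6):
    """Central finite-difference gradient of a scalar f w.r.t. each entry of A."""
    grad = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = eps
            grad[i, j] = (f(A + E) - f(A - E)) / (2 * eps)
    return grad

rng = np.random.default_rng(2)
p = 3
x = rng.normal(size=p)
B = rng.normal(size=(p, p))
M = rng.normal(size=(p, p))
A = M @ M.T + p * np.eye(p)  # positive definite, so log|A| is defined

# d/dA x^T A x = x x^T
quad_ok = np.allclose(
    numerical_gradient(lambda A_: x @ A_ @ x, A), np.outer(x, x), atol=1e-5)
# d/dA tr[AB] = B^T
trace_ok = np.allclose(
    numerical_gradient(lambda A_: np.trace(A_ @ B), A), B.T, atol=1e-5)
# d/dA log|A| = A^{-T}
logdet_ok = np.allclose(
    numerical_gradient(lambda A_: np.linalg.slogdet(A_)[1], A),
    np.linalg.inv(A).T, atol=1e-5)
print(quad_ok, trace_ok, logdet_ok)
```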
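We can now rewrite the log-likelihood function and compute the derivative w.r.t. $\Sigma^{-1}$ (note that $C$ is constant):
$$
\begin{aligned}
l(\mu, \Sigma \mid x^{(i)}) &= C - \frac{m}{2} \log|\Sigma| - \frac{1}{2} \sum_{i=1}^{m} (x^{(i)} - \mu)^T \Sigma^{-1} (x^{(i)} - \mu) \\
&= C + \frac{m}{2} \log|\Sigma^{-1}| - \frac{1}{2} \sum_{i=1}^{m} \operatorname{tr}\left[(x^{(i)} - \mu)(x^{(i)} - \mu)^T \Sigma^{-1}\right] \\
\frac{\partial}{\partial \Sigma^{-1}} l(\mu, \Sigma \mid x^{(i)}) &= \frac{m}{2} \Sigma - \frac{1}{2} \sum_{i=1}^{m} (x^{(i)} - \mu)(x^{(i)} - \mu)^T && \text{since } \Sigma^T = \Sigma
\end{aligned}
$$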
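Equating to zero and solving for $\Sigma$:
$$
\begin{aligned}
0 &= m\Sigma - \sum_{i=1}^{m} (x^{(i)} - \mu)(x^{(i)} - \mu)^T \\
\hat{\Sigma} &= \frac{1}{m} \sum_{i=1}^{m} (x^{(i)} - \hat{\mu})(x^{(i)} - \hat{\mu})^T
\end{aligned}
$$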
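Both estimators can be computed in a few lines. The sketch below is one possible NumPy implementation, with the function name `gaussian_mle` and the simulated parameters chosen purely for illustration. It also compares the result with `np.cov`, which divides by $m - 1$ by default and matches the MLE only when `bias=True` is passed.

```python
import numpy as np

def gaussian_mle(X):
    """MLEs (mu_hat, Sigma_hat) for i.i.d. rows of X ~ N_p(mu, Sigma)."""
    m = X.shape[0]
    mu_hat = X.mean(axis=0)
    centered = X - mu_hat
    # (1/m) * sum_i (x^(i) - mu_hat)(x^(i) - mu_hat)^T as one matrix product
    Sigma_hat = centered.T @ centered / m
    return mu_hat, Sigma_hat

# Illustrative data: 5000 draws from a 3-dimensional Gaussian.
rng = np.random.default_rng(3)
true_mu = np.array([1.0, -2.0, 0.5])
true_Sigma = np.array([[2.0, 0.5, 0.0],
                       [0.5, 1.0, 0.3],
                       [0.0, 0.3, 1.5]])
X = rng.multivariate_normal(true_mu, true_Sigma, size=5000)

mu_hat, Sigma_hat = gaussian_mle(X)
# bias=True switches np.cov from the 1/(m-1) to the 1/m normalization.
matches_numpy = np.allclose(Sigma_hat, np.cov(X, rowvar=False, bias=True))
print(mu_hat, Sigma_hat, matches_numpy, sep="\n")
```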
Sources