As kjetil b halvorsen pointed out, it is, in its own way, a miracle that linear regression admits an analytical solution. And this is so only by virtue of the linearity of the problem (with respect to the parameters). In OLS, you have
$$\sum_i (y_i - x_i'\beta)^2 \to \min_\beta,$$
which has the first order conditions
$$-2\sum_i (y_i - x_i'\beta)\, x_i = 0.$$
For a problem with $p$ variables (including the constant, if needed; there are some regression-through-the-origin problems, too), this is a system of $p$ equations in $p$ unknowns. Most importantly, it is a linear system, so you can find a solution using standard linear algebra theory and practice. This system will have a solution with probability 1 unless you have perfectly collinear variables.
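
To see how little work this is computationally, here is a minimal sketch in NumPy (simulated data; the variable names are mine): the first-order conditions above are the normal equations $X'X\beta = X'y$, and solving them is a single call to a linear solver.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3

# Simulated design matrix with an intercept column, plus a response
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(size=n)

# The first-order conditions X'(y - X beta) = 0 are a linear system,
# so the OLS estimate comes from one solve of X'X beta = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)
```

In practice one would use `np.linalg.lstsq` or a QR decomposition rather than forming $X'X$ explicitly, for numerical stability, but the point stands: the whole estimation is one linear solve.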
Now, with logistic regression, things aren't that easy anymore. Writing down the log-likelihood function,
$$\ell(y; x, \beta) = \sum_i \bigl[ y_i \ln p_i + (1 - y_i)\ln(1 - p_i) \bigr], \qquad p_i = \bigl(1 + \exp(-\theta_i)\bigr)^{-1}, \qquad \theta_i = x_i'\beta,$$
and taking its derivative to find the MLE, we get
$$\frac{\partial \ell}{\partial \beta'} = \sum_i \frac{dp_i}{d\theta_i}\left(\frac{y_i}{p_i} - \frac{1 - y_i}{1 - p_i}\right) x_i = \sum_i \left[ y_i - \frac{1}{1 + \exp(-x_i'\beta)} \right] x_i.$$
The parameters $\beta$ enter this in a very nonlinear way: for each $i$ there is a nonlinear function of $\beta$, and these are added together. There is no analytical solution (except probably in a trivial situation with two observations, or something like that), and you have to use nonlinear optimization methods to find the estimates $\hat\beta$.
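
For illustration, here is a minimal sketch of one such method, Newton-Raphson on this log-likelihood (the iteration underlying IRLS, which is what typical `glm`-style routines use); the simulated data and all names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([0.5, 1.0, -1.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))

# Newton-Raphson: the score sum_i (y_i - p_i) x_i has no closed-form root,
# so iterate beta <- beta + (X'WX)^{-1} X'(y - p), with W = diag(p_i (1 - p_i)).
beta = np.zeros(p)
for _ in range(25):
    p_i = 1.0 / (1.0 + np.exp(-X @ beta))               # current fitted probabilities
    score = X.T @ (y - p_i)                              # gradient of the log-likelihood
    hessian = X.T @ ((p_i * (1.0 - p_i))[:, None] * X)   # negative Hessian
    step = np.linalg.solve(hessian, score)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:                     # stop when updates are negligible
        break

print(beta)
```

Each iteration solves a linear system, but unlike OLS you need a whole sequence of them, starting from an initial guess and stopping once the updates become negligible.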
A somewhat deeper look into the problem (taking the second derivative) reveals that this is a convex optimization problem: we are looking for the maximum of a concave function (a glorified multivariate parabola), so either the maximum exists and any reasonable algorithm should find it rather quickly, or the objective keeps improving as the coefficients blow up to infinity. The latter does happen to logistic regression when $\operatorname{Prob}[Y_i = 1 \mid x_i'\beta > c] = 1$ for some $c$, i.e., when you have a perfect prediction (perfect separation). This is a rather unpleasant artifact: you would think that a perfect prediction means the model works perfectly, but curiously enough, it is the other way round: the maximum likelihood estimate does not exist, since the likelihood keeps increasing as the coefficients run off to infinity.
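
To see the blow-up concretely, here is a small sketch with made-up, perfectly separated data (a single slope $b$ and no intercept; the names are mine): the log-likelihood keeps creeping up toward its supremum of $0$ as the slope grows, so no finite maximizer exists.

```python
import numpy as np

# Perfectly separated toy data: y = 1 exactly when x > 0
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0, 0, 1, 1])

def loglik(b):
    """Log-likelihood of a no-intercept logistic model, p_i = 1/(1 + exp(-b*x_i))."""
    p = 1.0 / (1.0 + np.exp(-b * x))
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# The fit only gets better as b grows: the log-likelihood approaches 0
# (a perfect fit) in the limit b -> infinity, so there is no finite MLE.
for b in [1.0, 5.0, 10.0, 50.0]:
    print(b, loglik(b))
```

This is why, on separated data, software either fails to converge or reports huge coefficients with enormous standard errors.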