Biostat 200C Homework 1 Bonus Solution
1 Q2. Concavity of logistic regression log-likelihood
1.1 Q2.1
Write down the log-likelihood function of logistic regression for binomial responses.
For binary responses \(y_i \in \{0, 1\}\) with success probability \(p_i = \frac{e^{\eta_i}}{1 + e^{\eta_i}}\) and linear predictor \(\eta_i = \mathbf{x}_i^T \boldsymbol{\beta}\), the likelihood and log-likelihood are
\[ \begin{eqnarray} L(\boldsymbol{\beta}) &=& \prod_{i}\left[ p_i^{y_i}(1-p_i)^{1-y_i}\right]\\ \ell(\boldsymbol{\beta}) &=& \sum_i \log \left[p_i^{y_i} (1 - p_i)^{1 - y_i}\right] \\ &=& \sum_i \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right] \\ &=& \sum_i \left[ y_i \log \frac{e^{\eta_i}}{1 + e^{\eta_i}} + (1 - y_i) \log \frac{1}{1 + e^{\eta_i}} \right] \\ &=& \sum_i \left[ y_i \eta_i - \log (1 + e^{\eta_i}) \right] \\ &=& \sum_i \left[ y_i \cdot \mathbf{x}_i^T \boldsymbol{\beta} - \log (1 + e^{\mathbf{x}_i^T \boldsymbol{\beta}}) \right]. \end{eqnarray} \]
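The simplification from the Bernoulli form to \(\sum_i [y_i \eta_i - \log(1 + e^{\eta_i})]\) can be sanity-checked numerically. The sketch below is only an illustration: the data `X`, `y` and the coefficient vector `beta` are made-up toy values, and the function names are hypothetical, not part of the assignment.

```python
import math

def loglik_bernoulli(beta, X, y):
    """Sum of y_i*log(p_i) + (1 - y_i)*log(1 - p_i), p_i = e^eta_i / (1 + e^eta_i)."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, x_i))
        p = math.exp(eta) / (1.0 + math.exp(eta))
        total += y_i * math.log(p) + (1 - y_i) * math.log(1.0 - p)
    return total

def loglik_simplified(beta, X, y):
    """Sum of y_i*eta_i - log(1 + e^eta_i), the last line of the derivation."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, x_i))
        total += y_i * eta - math.log1p(math.exp(eta))
    return total

# Toy data (assumed for illustration); the first column is an intercept.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1]]
y = [1, 0, 1, 0]
beta = [0.3, -0.7]

a = loglik_bernoulli(beta, X, y)
b = loglik_simplified(beta, X, y)
print(a, b)  # the two values should agree up to rounding error
```

Both functions evaluate the same quantity, so any difference between them is floating-point rounding.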
1.2 Q2.2
Derive the gradient vector and Hessian matrix of the log-likelihood function with respect to the regression coefficients \(\boldsymbol{\beta}\).
\[ \begin{eqnarray} \nabla \ell(\boldsymbol{\beta}) &=& \sum_i \left[ y_i \mathbf{x}_i - \frac{e^{\mathbf{x}_i^T \boldsymbol{\beta}}}{1 + e^{\mathbf{x}_i^T \boldsymbol{\beta}}} \mathbf{x}_i \right] \\ &=& \sum_i \left[ \left(y_i - \frac{e^{\mathbf{x}_i^T \boldsymbol{\beta}}}{1 + e^{\mathbf{x}_i^T \boldsymbol{\beta}}}\right) \mathbf{x}_i \right] \\ &=& \sum_i \left[ \left(y_i - \frac{1}{1+e^{-\mathbf{x}_i^T\boldsymbol{\beta}}}\right) \mathbf{x}_i \right] \\ H(\boldsymbol{\beta}) &=& \frac{\partial \nabla \ell(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}^T}\\ &=&-\sum_i \left[ \frac{ e^{-\mathbf{x}_i^T\boldsymbol{\beta}}}{(1+e^{-\mathbf{x}_i^T\boldsymbol{\beta}})^2}\cdot\mathbf{x}_i\mathbf{x}_i^T\right]. \end{eqnarray} \]
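One way to gain confidence in the gradient and Hessian formulas is to compare them with central finite differences. The following is a minimal sketch under stated assumptions: the toy data, the step size `h`, and all function names are chosen here for illustration only.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def loglik(beta, X, y):
    """ell(beta) = sum_i [ y_i*eta_i - log(1 + e^eta_i) ]."""
    return sum(y_i * dot(x_i, beta) - math.log1p(math.exp(dot(x_i, beta)))
               for x_i, y_i in zip(X, y))

def gradient(beta, X, y):
    """sum_i (y_i - p_i) x_i with p_i = 1 / (1 + e^{-eta_i})."""
    g = [0.0] * len(beta)
    for x_i, y_i in zip(X, y):
        r = y_i - 1.0 / (1.0 + math.exp(-dot(x_i, beta)))
        for j, x_ij in enumerate(x_i):
            g[j] += r * x_ij
    return g

def hessian(beta, X):
    """-sum_i w_i x_i x_i^T with w_i = e^{-eta_i} / (1 + e^{-eta_i})^2."""
    k = len(beta)
    H = [[0.0] * k for _ in range(k)]
    for x_i in X:
        e = math.exp(-dot(x_i, beta))
        w = e / (1.0 + e) ** 2
        for j in range(k):
            for l in range(k):
                H[j][l] -= w * x_i[j] * x_i[l]
    return H

def shifted(beta, j, h):
    """Copy of beta with coordinate j moved by h."""
    b = list(beta)
    b[j] += h
    return b

# Toy data (assumed for illustration); the first column is an intercept.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1]]
y = [1, 0, 1, 0]
beta = [0.3, -0.7]
k, h = len(beta), 1e-5

# Central differences of ell approximate the gradient ...
num_grad = [(loglik(shifted(beta, j, h), X, y) - loglik(shifted(beta, j, -h), X, y)) / (2 * h)
            for j in range(k)]
# ... and central differences of the analytic gradient approximate the Hessian.
num_hess = [[(gradient(shifted(beta, l, h), X, y)[j] - gradient(shifted(beta, l, -h), X, y)[j]) / (2 * h)
             for l in range(k)] for j in range(k)]

g, H = gradient(beta, X, y), hessian(beta, X)
grad_err = max(abs(a - b) for a, b in zip(g, num_grad))
hess_err = max(abs(H[j][l] - num_hess[j][l]) for j in range(k) for l in range(k))
print(grad_err, hess_err)  # both should be close to zero
```

The weight \(e^{-\eta_i}/(1+e^{-\eta_i})^2\) equals \(p_i(1-p_i)\), so the Hessian does not depend on the responses \(y_i\); this is why `hessian` takes only `beta` and `X`.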
1.3 Q2.3
Show that the log-likelihood function of logistic regression is a concave function in regression coefficients \(\boldsymbol{\beta}\). (Hint: show that the negative Hessian is a positive semidefinite matrix.)
Since \(e^{-\mathbf{x}_i^T\boldsymbol{\beta}}>0\) and \((1+e^{-\mathbf{x}_i^T\boldsymbol{\beta}})^2>0\), every weight in the Hessian sum is strictly positive. Hence, for any vector \(\mathbf{v}\) (of the same length as \(\mathbf{x}_i\)), we have
\[ \begin{eqnarray} -\mathbf{v}^TH\mathbf{v} &=& \sum_i \left[ \frac{ e^{-\mathbf{x}_i^T\boldsymbol{\beta}}}{(1+e^{-\mathbf{x}_i^T\boldsymbol{\beta}})^2}\cdot\mathbf{v}^T\mathbf{x}_i\mathbf{x}_i^T\mathbf{v}\right]\\ &=& \sum_i \left[ \frac{ e^{-\mathbf{x}_i^T\boldsymbol{\beta}}}{(1+e^{-\mathbf{x}_i^T\boldsymbol{\beta}})^2}\cdot(\mathbf{x}_i^T\mathbf{v})^2\right]\\ &\geq& 0, \end{eqnarray} \] because \((\mathbf{x}_i^T\mathbf{v})^2 \geq 0\) for all \(i\). Therefore the negative Hessian is positive semidefinite, that is, the Hessian is negative semidefinite at every \(\boldsymbol{\beta}\), and the log-likelihood function is concave.
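The argument above can be illustrated numerically, though a finite number of random trials is of course not a proof. The sketch below assumes toy data and randomly drawn \(\boldsymbol{\beta}\) and \(\mathbf{v}\) (all names and values are hypothetical) and checks two consequences of concavity: the quadratic form \(-\mathbf{v}^TH\mathbf{v}\) is nonnegative, and the log-likelihood at a midpoint is at least the average of the endpoint values.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def loglik(beta, X, y):
    """ell(beta) = sum_i [ y_i*eta_i - log(1 + e^eta_i) ]."""
    return sum(y_i * dot(x_i, beta) - math.log1p(math.exp(dot(x_i, beta)))
               for x_i, y_i in zip(X, y))

def neg_hessian(beta, X):
    """-H = sum_i w_i x_i x_i^T with w_i = e^{-eta_i} / (1 + e^{-eta_i})^2."""
    k = len(beta)
    A = [[0.0] * k for _ in range(k)]
    for x_i in X:
        e = math.exp(-dot(x_i, beta))
        w = e / (1.0 + e) ** 2
        for j in range(k):
            for l in range(k):
                A[j][l] += w * x_i[j] * x_i[l]
    return A

def quad_form(A, v):
    """v^T A v for a square matrix stored as a list of rows."""
    return sum(v[j] * A[j][l] * v[l] for j in range(len(v)) for l in range(len(v)))

# Toy data (assumed for illustration); the first column is an intercept.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1]]
y = [1, 0, 1, 0]

rng = random.Random(0)  # fixed seed so the run is reproducible
min_quad = float("inf")
min_gap = float("inf")
for _ in range(200):
    beta_a = [rng.uniform(-3, 3) for _ in range(2)]
    beta_b = [rng.uniform(-3, 3) for _ in range(2)]
    v = [rng.gauss(0, 1) for _ in range(2)]
    # Quadratic form of the negative Hessian at beta_a in direction v.
    min_quad = min(min_quad, quad_form(neg_hessian(beta_a, X), v))
    # Midpoint concavity: ell((a + b)/2) - (ell(a) + ell(b))/2 should be >= 0.
    mid = [(a + b) / 2 for a, b in zip(beta_a, beta_b)]
    gap = loglik(mid, X, y) - 0.5 * (loglik(beta_a, X, y) + loglik(beta_b, X, y))
    min_gap = min(min_gap, gap)

print(min_quad, min_gap)  # both should be nonnegative up to rounding error
```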