To submit homework, please upload both Rmd and html files to Bruinlearn by the deadline.
Let \(Y_1,\ldots,Y_n\) be independent random variables with \(Y_i \sim \text{Poisson}(\mu_i)\) and \(\log \mu_i = \mathbf{x}_i^T \boldsymbol{\beta}\), \(i = 1,\ldots,n\).
Write down the log-likelihood function.
Derive the gradient vector and Hessian matrix of the log-likelihood function with respect to the regression coefficients \(\boldsymbol{\beta}\).
Show that the log-likelihood function of the log-linear model is a concave function of the regression coefficients \(\boldsymbol{\beta}\). (Hint: show that the negative Hessian is a positive semidefinite matrix.)
Show that, when the model includes an intercept, the fitted values \(\widehat{\mu}_i\) from the maximum likelihood estimates satisfy \[ \sum_i \widehat{\mu}_i = \sum_i y_i. \] Therefore the deviance reduces to \[ D = 2 \sum_i y_i \log \frac{y_i}{\widehat{\mu}_i}. \]
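Before attempting the concavity proof, it can help to check the claim numerically. The sketch below is not a proof and is not part of the assignment. It evaluates the Poisson log-likelihood as a sum of log-pmf values and tests the midpoint inequality \(f((a+b)/2) \ge (f(a)+f(b))/2\) on random pairs of coefficient vectors. The toy data `X`, `y` and the helper names `poisson_loglik` and `midpoint_concavity_gap` are hypothetical, chosen only for illustration.

```python
import math
import random


def poisson_loglik(beta, X, y):
    """Sum of Poisson log-pmf values with log(mu_i) = x_i^T beta."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, x_i))  # linear predictor
        mu = math.exp(eta)
        # log pmf of Poisson(mu) at y_i: y_i * log(mu) - mu - log(y_i!)
        total += y_i * eta - mu - math.lgamma(y_i + 1)
    return total


def midpoint_concavity_gap(X, y, trials=200, seed=0):
    """Smallest f((a+b)/2) - (f(a)+f(b))/2 over random pairs (a, b)."""
    rng = random.Random(seed)
    p = len(X[0])
    worst = float("inf")
    for _ in range(trials):
        a = [rng.uniform(-1, 1) for _ in range(p)]
        b = [rng.uniform(-1, 1) for _ in range(p)]
        mid = [(u + v) / 2 for u, v in zip(a, b)]
        gap = poisson_loglik(mid, X, y) - 0.5 * (
            poisson_loglik(a, X, y) + poisson_loglik(b, X, y)
        )
        worst = min(worst, gap)
    return worst


# Hypothetical toy data: an intercept column plus one covariate.
X = [[1.0, 0.0], [1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 2.0]]
y = [1, 0, 3, 4, 9]

# A concave function never has a negative midpoint gap (up to rounding).
print(midpoint_concavity_gap(X, y) >= -1e-9)
```

A non-negative gap on every sampled pair is consistent with concavity, but only the Hessian argument in the hint establishes it for all \(\boldsymbol{\beta}\).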
Recall that the probability mass function of the negative binomial distribution is \[ \mathbb{P}(Y = y) = \binom{y + r - 1}{r - 1} (1 - p)^r p^y, \quad y = 0, 1, \ldots \] Show that \(\mathbb{E}Y = \mu = rp / (1 - p)\) and \(\operatorname{Var} Y = r p / (1 - p)^2\).
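The two moment formulas can be sanity-checked numerically by summing the probability mass function over a long but finite range. This is a minimal sketch under stated assumptions: \(r\) is taken to be a positive integer so that `math.comb` applies, the cutoff `y_max` is arbitrary, and the values `r = 3`, `p = 0.4` are purely illustrative.

```python
import math


def nb_pmf(y, r, p):
    """P(Y = y) = C(y + r - 1, r - 1) * (1 - p)^r * p^y for integer r >= 1."""
    return math.comb(y + r - 1, r - 1) * (1 - p) ** r * p ** y


def nb_moments(r, p, y_max=500):
    """Total mass, mean, and variance from a truncated sum over y = 0..y_max."""
    probs = [nb_pmf(y, r, p) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * q for y, q in enumerate(probs))
    var = sum((y - mean) ** 2 * q for y, q in enumerate(probs))
    return total, mean, var


r, p = 3, 0.4  # illustrative values only
total, mean, var = nb_moments(r, p)

# Compare the truncated sums with the closed forms stated in the problem.
print(total)
print(mean, r * p / (1 - p))
print(var, r * p / (1 - p) ** 2)
```

Agreement for a few parameter choices does not replace the derivation, but it catches algebra slips such as swapping \(p\) and \(1 - p\).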
Consider the uniform association model, in which all two-way interactions are included, i.e., \[ \log \mathbb{E}Y_{ijk} = \log p_{ijk} = \log n + \log p_i + \log p_j + \log p_k + \log p_{ij} + \log p_{ik} + \log p_{jk}. \]
Prove that the odds ratio (or log odds ratio) \[ \log \frac{\mathbb{E}Y_{11k}\mathbb{E}Y_{22k}}{\mathbb{E}Y_{12k}\mathbb{E}Y_{21k}} \]
is constant across all strata \(k\), i.e., it equals the estimated effect of the interaction term “i:j” in the uniform association model.
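The claim can be illustrated numerically before proving it. The sketch below builds expected counts for a \(2 \times 2 \times K\) table from arbitrary main effects and two-way interaction terms (no three-way term) and computes the log odds ratio within each stratum. All parameter values are randomly generated and hypothetical. The names `lam0`, `a`, `b`, `c`, `ab`, `ac`, `bc` are illustrative stand-ins for the terms of the log-linear model, and the levels 1, 2 are stored as indices 0, 1.

```python
import math
import random


def uniform_association_means(K, seed=1):
    """Expected counts for a 2 x 2 x K table with all two-way interactions."""
    rng = random.Random(seed)
    lam0 = rng.uniform(1, 2)                      # overall level
    a = [rng.uniform(-1, 1) for _ in range(2)]    # main effect of i
    b = [rng.uniform(-1, 1) for _ in range(2)]    # main effect of j
    c = [rng.uniform(-1, 1) for _ in range(K)]    # main effect of k
    ab = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # i:j
    ac = [[rng.uniform(-1, 1) for _ in range(K)] for _ in range(2)]  # i:k
    bc = [[rng.uniform(-1, 1) for _ in range(K)] for _ in range(2)]  # j:k
    mu = {}
    for i in range(2):
        for j in range(2):
            for k in range(K):
                eta = lam0 + a[i] + b[j] + c[k] + ab[i][j] + ac[i][k] + bc[j][k]
                mu[i, j, k] = math.exp(eta)
    return mu, ab


def log_odds_ratio(mu, k):
    """log of (mu_11k * mu_22k) / (mu_12k * mu_21k), with 0-based indices."""
    return math.log(mu[0, 0, k] * mu[1, 1, k] / (mu[0, 1, k] * mu[1, 0, k]))


K = 4
mu, ab = uniform_association_means(K)
for k in range(K):
    print(k, log_odds_ratio(mu, k))

# The same contrast written using only the i:j interaction terms.
print(ab[0][0] + ab[1][1] - ab[0][1] - ab[1][0])
```

In this sketch the terms are unconstrained, so the common log odds ratio is a contrast of the four i:j terms. Under a reference-level (treatment) coding, where the terms involving level 1 are set to zero, that contrast reduces to the single i:j coefficient.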