psid (Panel Study of Income Dynamics) data records the income of 85 individuals, who were aged 25-39 in 1968 and had complete data for at least 11 of the years between 1968 and 1990.
psid <-as_tibble(psid) %>%print(n =30)
# A tibble: 1,661 × 6
age educ sex income year person
<int> <int> <fct> <int> <int> <int>
1 31 12 M 6000 68 1
2 31 12 M 5300 69 1
3 31 12 M 5200 70 1
4 31 12 M 6900 71 1
5 31 12 M 7500 72 1
6 31 12 M 8000 73 1
7 31 12 M 8000 74 1
8 31 12 M 9600 75 1
9 31 12 M 9000 76 1
10 31 12 M 9000 77 1
11 31 12 M 23000 78 1
12 31 12 M 22000 79 1
13 31 12 M 8000 80 1
14 31 12 M 10000 81 1
15 31 12 M 21800 82 1
16 31 12 M 19000 83 1
17 29 16 M 7500 68 2
18 29 16 M 7300 69 2
19 29 16 M 9250 70 2
20 29 16 M 10300 71 2
21 29 16 M 10900 72 2
22 29 16 M 11900 73 2
23 29 16 M 13000 74 2
24 29 16 M 12900 75 2
25 29 16 M 19900 76 2
26 29 16 M 18900 77 2
27 29 16 M 19100 78 2
28 29 16 M 20300 79 2
29 29 16 M 31600 80 2
30 29 16 M 16600 81 2
# ℹ 1,631 more rows
summary(psid)
age educ sex income year
Min. :25.00 Min. : 3.00 F:732 Min. : 3 Min. :68.00
1st Qu.:28.00 1st Qu.:10.00 M:929 1st Qu.: 4300 1st Qu.:73.00
Median :34.00 Median :12.00 Median : 9000 Median :78.00
Mean :32.19 Mean :11.84 Mean : 13575 Mean :78.61
3rd Qu.:36.00 3rd Qu.:13.00 3rd Qu.: 18050 3rd Qu.:84.00
Max. :39.00 Max. :16.00 Max. :180000 Max. :90.00
person
Min. : 1.00
1st Qu.:20.00
Median :42.00
Mean :42.44
3rd Qu.:63.00
Max. :85.00
Display trajectories of first 20 individuals.
psid %>%filter(person <=20) %>%ggplot() +geom_line(mapping =aes(x = year, y = income)) +facet_wrap(~ person) +labs(x ="Year", y ="Income")
Income trajectories stratified by sex. We observe men’s incomes are generally higher and less variable than women’s. Women’s income seems to increase quicker than men’s. How to test these hypotheses?
psid %>%filter(person <=20) %>%ggplot() +geom_line(mapping =aes(x = year, y = income, group = person)) +facet_wrap(~ sex) +scale_y_log10()
If we fit a seperate linear model for each individual, we see variability in the intercepts and slopes. We use log income as response and center year by the median 78.
We consider a linear mixed model with random intercepts and random slopes \[
\log (\text{income}_{ij}) = \mu + \text{year}_i \cdot \beta_{\text{year}} + \text{sex}_j \cdot \beta_{\text{sex}} + (\text{sex}_j \times \text{year}_i) \cdot \beta_{\text{sex} \times \text{year}} + \text{educ}_j \times \beta_{\text{educ}} + \text{age}_j \cdot \beta_{\text{age}} + \gamma_{0,j} + \text{year}_i \cdot \gamma_{1,j} + \epsilon_{ij}
\] where \(i\) indexes year and \(j\) indexes individual. The random intercepts and slopes are iid \[
\begin{pmatrix}
\gamma_{0,j} \\
\gamma_{1,j}
\end{pmatrix} \sim N(\mathbf{0}, \boldsymbol{\Sigma}).
\] and the noise terms \(\epsilon_{ij}\) are iid \(N(0, \sigma^2)\).
mmod <-lmer(log(income) ~ cyear * sex + age + educ + (cyear | person), data = psid)summary(mmod)
Linear mixed model fit by REML ['lmerMod']
Formula: log(income) ~ cyear * sex + age + educ + (cyear | person)
Data: psid
REML criterion at convergence: 3819.8
Scaled residuals:
Min 1Q Median 3Q Max
-10.2310 -0.2134 0.0795 0.4147 2.8254
Random effects:
Groups Name Variance Std.Dev. Corr
person (Intercept) 0.2817 0.53071
cyear 0.0024 0.04899 0.19
Residual 0.4673 0.68357
Number of obs: 1661, groups: person, 85
Fixed effects:
Estimate Std. Error t value
(Intercept) 6.674211 0.543323 12.284
cyear 0.085312 0.008999 9.480
sexM 1.150312 0.121292 9.484
age 0.010932 0.013524 0.808
educ 0.104209 0.021437 4.861
cyear:sexM -0.026306 0.012238 -2.150
Correlation of Fixed Effects:
(Intr) cyear sexM age educ
cyear 0.020
sexM -0.104 -0.098
age -0.874 0.002 -0.026
educ -0.597 0.000 0.008 0.167
cyear:sexM -0.003 -0.735 0.156 -0.010 -0.011
Exercise: interpretation of fixed effects.
To test the fixed effects, we can use the Kenward-Roger adjusted F-test. For example, to test the interaction term
library(pbkrtest)mmod <-lmer(log(income) ~ cyear * sex + age + educ + (cyear | person), data = psid, REML =TRUE)mmodr <-lmer(log(income) ~ cyear + sex + age + educ + (cyear | person), data = psid, REML =TRUE)KRmodcomp(mmod, mmodr)
large : log(income) ~ cyear * sex + age + educ + (cyear | person)
small : log(income) ~ cyear + sex + age + educ + (cyear | person)
stat ndf ddf F.scaling p.value
Ftest 4.6142 1.0000 81.3279 1 0.03468 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The interaction term is marginally significant.
We can test the significance of the variance components by the parameteric bootstrap.
Modelling: power is a fixed effect, subject is random effect, eye is nested within subjects. \[
y_{ijk} = \mu + p_j + s_i + e_{ik} + \epsilon_{ijk},
\] where \(i=1,\ldots,7\) indexes individuals, \(j=1,\ldots,4\) indexes power, and \(k=1,2\) indexes eyes. The random effect distributions are \[
s_i \sim_{\text{iid}} N(0,\sigma_s^2), \quad e_{ik} \sim_{\text{iid}} N(0,\sigma_e^2) \text{ for fixed } i, \quad \epsilon_{ijk} \sim_{\text{iid}} N(0, \sigma^2).
\]\(\sigma_e^2\) is interpreted as the variance of acuity between combinations of eyes and subject, since we don’t believe the eye difference is consistent across individuals.
mmod <-lmer(acuity ~ power + (1| subject) + (1| subject:eye), data = vision)summary(mmod)
Linear mixed model fit by REML ['lmerMod']
Formula: acuity ~ power + (1 | subject) + (1 | subject:eye)
Data: vision
REML criterion at convergence: 328.7
Scaled residuals:
Min 1Q Median 3Q Max
-3.4240 -0.3232 0.0109 0.4408 2.4662
Random effects:
Groups Name Variance Std.Dev.
subject:eye (Intercept) 10.27 3.205
subject (Intercept) 21.53 4.640
Residual 16.60 4.075
Number of obs: 56, groups: subject:eye, 14; subject, 7
Fixed effects:
Estimate Std. Error t value
(Intercept) 112.6429 2.2349 50.401
power6/18 0.7857 1.5400 0.510
power6/36 -1.0000 1.5400 -0.649
power6/60 3.2857 1.5400 2.134
Correlation of Fixed Effects:
(Intr) pw6/18 pw6/36
power6/18 -0.345
power6/36 -0.345 0.500
power6/60 -0.345 0.500 0.500
3.1 Test fixed effects
To test the fixed effect power, we can use the Kenward-Roger adjusted F-test. We find the power is not significant at the 0.05 level.
mmod <-lmer(acuity ~ power + (1| subject) + (1| subject:eye), data = vision, REML =TRUE)nmod <-lmer(acuity ~ (1| subject) + (1| subject:eye), data = vision, REML =TRUE)KRmodcomp(mmod, nmod)