8 Multi-Level Models and Repeated Measures Models

Models have both a fixed effects structure and an error structure. For example, in an inter-laboratory comparison there may be variation between laboratories, between observers within laboratories, and between multiple determinations made by the same observer on different samples. If we treat laboratories and observers as random, the only fixed effect is the mean.

The functions lme() and nlme(), from the Pinheiro and Bates nlme package, handle models in which a repeated measures error structure is superimposed on a linear (lme()) or non-linear (nlme()) model. The lme() function is broadly comparable to Proc Mixed in the widely used SAS statistical package. It also offers useful facilities for diagnostic checking and for gaining other insight into the fitted model.

There is a strong link between a wide class of repeated measures models and time series models. In the time series context there is usually just one realization of the series, which may however be observed at a large number of time points. In the repeated measures context there may be a large number of realizations of a series that is typically quite short.

8.1 Multi-level models – examples

The Kiwifruit Shading Data, Again

Refer back to Section 3.8 for details of the data. The fixed effects are block and treatment (shade). The random effects are block (though making block a random effect is optional, for purposes of comparing treatments), plot within block, and units within each block/plot combination. Here is the analysis:

```r
library(nlme)
kiwishade <- DAAG::kiwishade
kiwishade$plot <- factor(paste(kiwishade$block, kiwishade$shade,
                               sep="."))
kiwishade.lme <- lme(yield~shade, random=~1|block/plot, data=kiwishade)
summary(kiwishade.lme)
```

```
Linear mixed-effects model fit by REML
  Data: kiwishade
       AIC      BIC    logLik
  265.9663 278.4556 -125.9831

Random effects:
 Formula: ~1 | block
        (Intercept)
StdDev:    2.019373

 Formula: ~1 | plot %in% block
        (Intercept) Residual
StdDev:    1.478623 3.490381

Fixed effects: yield ~ shade
                  Value Std.Error DF  t-value p-value
(Intercept)   100.20250  1.761617 36 56.88098  0.0000
shadeAug2Dec    3.03083  1.867621  6  1.62283  0.1558
shadeDec2Feb  -10.28167  1.867621  6 -5.50522  0.0015
shadeFeb2May   -7.42833  1.867621  6 -3.97743  0.0073
 Correlation:
             (Intr) shdA2D shdD2F
shadeAug2Dec -0.53
shadeDec2Feb -0.53   0.50
shadeFeb2May -0.53   0.50   0.50

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-2.41538976 -0.59814252 -0.06899575  0.78046182  1.58909233

Number of Observations: 48
Number of Groups:
          block plot %in% block
              3              12
```

```r
anova(kiwishade.lme)
```

```
            numDF denDF  F-value p-value
(Intercept)     1    36 5190.560  <.0001
shade           3     6   22.211  0.0012
```

```r
intervals(kiwishade.lme)
```

```
Approximate 95% confidence intervals

 Fixed effects:
                  lower       est.      upper
(Intercept)   96.629775 100.202500 103.775225
shadeAug2Dec  -1.539072   3.030833   7.600738
shadeDec2Feb -14.851572 -10.281667  -5.711762
shadeFeb2May -11.998238  -7.428333  -2.858428

 Random Effects:
  Level: block
                    lower     est.    upper
sd((Intercept)) 0.5475859 2.019373 7.446993
  Level: plot
                    lower     est.    upper
sd((Intercept)) 0.3700762 1.478623 5.907772

 Within-group standard error:
   lower     est.    upper
2.770652 3.490381 4.397072
```
We are interested in the three sd estimates. Squaring these standard deviations converts them to variances, giving the information in the following table:
| | Variance component | Notes |
|---|---|---|
| block | 2.019\(^2\) = 4.076 | Three blocks |
| plot | 1.479\(^2\) = 2.186 | 4 plots per block |
| residual (within group) | 3.490\(^2\) = 12.180 | 4 vines (subplots) per plot |
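As a check on this arithmetic, the variance components can be read off the fitted model directly; the sketch below assumes the kiwishade.lme object fitted above.

```r
## Variances and standard deviations for the block, plot and residual terms;
## squaring the StdDev column reproduces the variances in the table above
VarCorr(kiwishade.lme)
```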
The above gives the information for an analysis of variance table. We have:
| | Variance component | Mean square for anova table | d.f. |
|---|---|---|---|
| block | 4.076 | 12.180 + 4 \(\times\) 2.186 + 16 \(\times\) 4.076 = 86.14 | 2 (3-1) |
| plot | 2.186 | 12.180 + 4 \(\times\) 2.186 = 20.92 | 6 (3-1)\(\times\)(4-1) |
| residual (within gp) | 12.180 | 12.18 | 36 (3 \(\times\) 4 \(\times\) (4-1)) |
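The mean squares in this table can be verified by direct arithmetic from the three variance components; each plot mean is based on 4 vines and each block mean on 16 vines (a quick check, using the estimates given above).

```r
## Rebuild the anova mean squares from the lme() variance components
sigma2       <- 3.490381^2   # within-plot (vine to vine) variance
sigma2.plot  <- 1.478623^2   # between-plot variance
sigma2.block <- 2.019373^2   # between-block variance
c(plot  = sigma2 + 4*sigma2.plot,                    # approx. 20.92
  block = sigma2 + 4*sigma2.plot + 16*sigma2.block)  # approx. 86.14
```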
Now see where these same pieces of information appeared in the analysis of variance table of Section 3.8:
```r
kiwishade.aov <- aov(yield ~ block + shade + Error(block:shade), data=kiwishade)
```

```
Warning in aov(yield ~ block + shade + Error(block:shade), data = kiwishade):
  Error() model is singular
```

```r
summary(kiwishade.aov)
```

```
Error: block:shade
          Df Sum Sq Mean Sq F value  Pr(>F)
block      2  172.3    86.2   4.118 0.07488
shade      3 1394.5   464.8  22.211 0.00119
Residuals  6  125.6    20.9

Error: Within
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals 36  438.6   12.18
```
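Going in the other direction, the variance components can be recovered from the anova mean squares; the sketch below uses the rounded values printed above.

```r
## Back out the variance components from the aov() mean squares
ms.within <- 12.18; ms.plot <- 20.9; ms.block <- 86.2
c(vine  = ms.within,                   # within-plot (residual) variance
  plot  = (ms.plot - ms.within)/4,     # approx. 2.19, cf. 1.479^2
  block = (ms.block - ms.plot)/16)     # approx. 4.08, cf. 2.019^2
```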
The Tinting of Car Windows
In Section 2.6 we encountered data from an experiment that aimed to model the effects of the tinting of car windows on visual performance. The authors are mainly interested in effects on side window vision, and hence in visual recognition tasks that would be performed when looking through side windows. Data are in the data frame tinting. In this data frame, csoa
(critical stimulus onset asynchrony, i.e. the time in milliseconds required to recognise an alphanumeric target), it
(inspection time, i.e. the time required for a simple discrimination task) and age are variables, while tint
(3 levels) and target
(2 levels) are ordered factors. The variable sex
is coded 1 for males
and 2 for females, while the variable agegp
is coded 1 for young people (all in their early 20s) and 2 for older participants (all in the early 70s).
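Before fitting models, it can be worth checking the variable types and factor levels; a minimal sketch, assuming the DAAG package is installed:

```r
## Inspect variable types and factor levels in the tinting data
str(DAAG::tinting)
levels(DAAG::tinting$agegp)
```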
We have two levels of variation – within individuals (who were each tested on each combination of tint
and target
), and between individuals. So we need to specify id
(identifying the individual) as a random effect. Plots such as we examined in Section 2.6 make it clear that, to get variances that are approximately homogeneous, we need to work with log(csoa)
and log(it)
. Here we examine the analysis for log(it)
. We start with a model that is likely to be more complex than we need (it has all possible interactions):
```r
tinting <- DAAG::tinting
itstar.lme <- lme(log(it) ~ tint*target*agegp*sex,
                  random=~1|id, data=tinting, method="ML")
```
A reasonable guess is that first order interactions may be all we need, i.e.
```r
it2.lme <- lme(log(it) ~ (tint+target+agegp+sex)^2,
               random=~1|id, data=tinting, method="ML")
```
Finally, there is the very simple model, allowing only for main effects:
```r
it1.lme <- lme(log(it) ~ (tint+target+agegp+sex),
               random=~1|id, data=tinting, method="ML")
```
Note that all these models have been fitted by maximum likelihood. This allows the equivalent of an analysis of variance comparison; REML likelihoods are not comparable between models that differ in their fixed effects.
Here is what we get:
```r
anova(itstar.lme, it2.lme, it1.lme)
```

```
           Model df       AIC      BIC    logLik   Test  L.Ratio p-value
itstar.lme     1 26  8.146187 91.45036 21.926906
it2.lme        2 17 -3.742883 50.72523 18.871441 1 vs 2  6.11093  0.7288
it1.lme        3  8  1.138171 26.77022  7.430915 2 vs 3 22.88105  0.0065
```
The model that limits attention to first order interactions appears adequate. As a preliminary to examining the first order interactions individually, we re-fit the model used for it2.lme, now with method="REML".
```r
it2.reml <- update(it2.lme, method="REML")
```
We now examine the estimated effects:
```r
options(digits=3)
summary(it2.reml)$tTable
```

```
                          Value Std.Error  DF t-value  p-value
(Intercept)             3.61907    0.1301 145  27.817 5.30e-60
tint.L                  0.16095    0.0442 145   3.638 3.81e-04
tint.Q                  0.02096    0.0452 145   0.464 6.44e-01
targethicon            -0.11807    0.0423 145  -2.789 5.99e-03
agegpolder              0.47121    0.2329  22   2.023 5.54e-02
sexm                    0.08213    0.2329  22   0.353 7.28e-01
tint.L:targethicon     -0.09193    0.0461 145  -1.996 4.78e-02
tint.Q:targethicon     -0.00722    0.0482 145  -0.150 8.81e-01
tint.L:agegpolder       0.13075    0.0492 145   2.658 8.74e-03
tint.Q:agegpolder       0.06972    0.0520 145   1.341 1.82e-01
tint.L:sexm            -0.09794    0.0492 145  -1.991 4.83e-02
tint.Q:sexm             0.00542    0.0520 145   0.104 9.17e-01
targethicon:agegpolder -0.13887    0.0584 145  -2.376 1.88e-02
targethicon:sexm        0.07785    0.0584 145   1.332 1.85e-01
agegpolder:sexm         0.33164    0.3261  22   1.017 3.20e-01
```
Because tint is an ordered factor, polynomial contrasts are used.
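The linear (.L) and quadratic (.Q) terms come from the orthogonal polynomial contrasts that R uses, by default, for ordered factors. To display the contrast matrix (a quick sketch, using the tinting data frame loaded above):

```r
## Polynomial contrasts for the 3-level ordered factor tint;
## the .L and .Q columns correspond to tint.L and tint.Q in the output above
contrasts(tinting$tint)
```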
The Michelson Speed of Light Data
The MASS::michelson data frame has columns Speed, Run, and Expt, for five experiments of 20 runs each. A plot of the data seems consistent with sequential dependence within each experiment, possibly with random variation between runs.
Code is:
```r
michelson <- MASS::michelson
lattice::xyplot(Speed~Run|factor(Expt), layout=c(5,1),
                data=michelson, type=c('p','r'),
                scales=list(x=list(at=seq(from=1, to=19, by=3))))
```
We try a model that allows the estimates to vary linearly with Run (from 1 to 20), with the slope varying randomly between experiments. We assume an autoregressive dependence structure of order 1. We allow the variance to change from one experiment to another.
To test whether this level of model complexity is justified statistically, one needs to compare models with and without these effects, setting method="ML"
in each case, and compare the likelihoods.
```r
michelson <- MASS::michelson
library(nlme)
michelson$Run <- as.numeric(michelson$Run)  # Run is a factor; make it a numeric variable
mich.lme1 <- lme(fixed = Speed ~ Run, data = michelson,
                 random = ~ Run | Expt, correlation = corAR1(form = ~ 1 | Expt),
                 weights = varIdent(form = ~ 1 | Expt), method = 'ML')
mich.lme0 <- lme(fixed = Speed ~ Run, data = michelson,
                 random = ~ 1 | Expt, correlation = corAR1(form = ~ 1 | Expt),
                 weights = varIdent(form = ~ 1 | Expt), method = 'ML')
anova(mich.lme0, mich.lme1)
```
```
          Model df  AIC  BIC logLik   Test  L.Ratio p-value
mich.lme0     1  9 1121 1144   -551
mich.lme1     2 11 1125 1153   -551 1 vs 2 2.63e-08       1
```
The simpler model is preferred. Can it be simplified further?
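One way to pursue that question (a sketch only; these further models and their names are not from the source) is to drop, in turn, the separate per-experiment variances and then the AR1 dependence, again comparing the maximum-likelihood fits:

```r
## Hypothetical further simplifications, for comparison with mich.lme0:
## first drop the per-experiment variances, then also the AR1 correlation
mich.lme2 <- lme(fixed = Speed ~ Run, data = michelson,
                 random = ~ 1 | Expt,
                 correlation = corAR1(form = ~ 1 | Expt), method = 'ML')
mich.lme3 <- lme(fixed = Speed ~ Run, data = michelson,
                 random = ~ 1 | Expt, method = 'ML')
anova(mich.lme0, mich.lme2, mich.lme3)
```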
8.2 References and reading
See the vignettes that accompany the lme4 package.
Maindonald and Braun (2010). Data Analysis and Graphics Using R – An Example-Based Approach. Cambridge University Press.
Maindonald, Braun, and Andrews (2024, forthcoming). A Practical Guide to Data Analysis Using R: An Example-Based Approach. Cambridge University Press.
Pinheiro and Bates (2000). Mixed-Effects Models in S and S-PLUS. Springer.