Day 26 Planning designed experiments

26.1 Announcements

  • Project deadlines:
    • December 12th: give your presentation by this date (it can be earlier than this also). You will need to schedule a 30min meeting with me.
    • December 19th: submit your final project and tutorial on canvas, including both your classmate’s and my feedback.

26.2 Designed experiments

  • On the importace of designed experiments
  • Defining the data generating process
  • Statistical power
  • Precision of parameters
Types of errors.

Figure 26.1: Types of errors.

26.2.0.1 Analytical solutions

We know Power is

\[Power = P(F>F_{c} \vert F \sim F(df_1, df_2, \phi)),\]

where:

  • \(F\) is the F test statistics,
  • \(F_c\) is the critical test F,
  • \(df_1\) are the degrees of freedom of the numerator,
  • \(df_2\) are the degrees of freedom of the denominator,
  • \(\phi\) is the noncentrality parameter.

Note that \(F\) depends on \(\sigma^2_\varepsilon\), and \(F_c\) depends on \(\alpha\).

Suppose we are trying to compare designs to find differences main effects and interactions of a \(3 \times 2\) factorial treatment structure. Suppose we have:

  • Treatment A: 3 levels,
  • Treatment B: 2 levels.

Typically, we would find means around \(10\) units, and \(\sigma^2 \approx 1\).

library(nlme)

This is how a dataframe coudl look like:

n_reps <- 3
data <- 
  purrr::map_dfr(
    seq_len(n_reps),
    ~expand.grid(tA = factor(1:3),
                tB = factor(1:2)) |>
      mutate(mu = c(10,12,12,10,10,12))) |> 
  mutate(rep = rep(1:n_reps, each = 6))
head(data)
##   tA tB mu rep
## 1  1  1 10   1
## 2  2  1 12   1
## 3  3  1 12   1
## 4  1  2 10   1
## 5  2  2 10   1
## 6  3  2 12   1
# Also:

sigma2_e <- 1
sigma2_block <- .8
sigma2_wp <- .2

How do different designs compare?

# create an RCBD model 
m_rcbd <- lme(mu ~ tA * tB, 
         random = list(rep = pdSymm(matrix(sqrt(sigma2_block)), ~ 1, nam = "(Intercept)")), 
         data = data,
         # this won't converge but ask to get an object anyways
         control = lmeControl(
           sigma = sqrt(sigma2_e),
           returnObject = TRUE))


# peek at the variance-covariance matrix 
VarCorr(m_rcbd)
## rep = pdSymm(1) 
##             Variance   StdDev    
## (Intercept) 0.00863157 0.09290624
## Residual    1.00000000 1.00000000
# fix variance components 
m_rcbd$modelStruct$reStruct$rep <-  pdSymm(matrix(sqrt(sigma2_block)), ~ 1, nam = "(Intercept)")

# peek at the variance-covariance matrix again
VarCorr(m_rcbd)
## rep = pdSymm(1) 
##             Variance  StdDev   
## (Intercept) 0.8944272 0.9457416
## Residual    1.0000000 1.0000000
results <-
  emmeans::joint_tests(m_rcbd) |> 
  as.data.frame() |> 
  transmute(source = `model term`, 
            numDF = df1, 
            denDF = df2, 
            F.ratio)

alpha <- 0.05

results |> 
  mutate(ncp = F.ratio* numDF, 
         Fcrit = qf(1-alpha, df1 = numDF, df2 = denDF, ncp = 0),
         Power = 1 - pf(Fcrit, df1 = numDF, df2 = denDF, ncp = ncp))
##   source numDF denDF F.ratio ncp    Fcrit     Power
## 1     tA     2    10       6  12 4.102821 0.7592131
## 3     tB     1    10       2   2 4.964603 0.2490518
## 2  tA:tB     2    10       2   4 4.102821 0.3183418
# create a split-plot model where tA was the whole plot  
m_sp <- lme(mu ~ tA * tB, 
         random = list(rep = pdSymm(matrix(sqrt(sigma2_block)), ~ 1, 
                                    nam = "(Intercept)"), 
                       tA = pdSymm(matrix(sqrt(sigma2_wp)), ~ 1)), 
         data = data,
         # this won't converge but ask to get an object anyways
         control = lmeControl(sigma = sqrt(sigma2_e), 
                              returnObject = TRUE, 
                              maxIter = 0))

# peek at the variance-covariance matrix 
VarCorr(m_sp)
##             Variance   StdDev   
## rep =       pdSymm(1)           
## (Intercept) 0.01216669 0.1103027
## tA =        pdSymm(1)           
## (Intercept) 0.03296807 0.1815711
## Residual    1.00000000 1.0000000
# fix variance components 
m_sp$modelStruct$reStruct$rep <-  pdSymm(matrix(sqrt(sigma2_block)), ~ 1, nam = "(Intercept)")
m_sp$modelStruct$reStruct$tA <-  pdSymm(matrix(sqrt(sigma2_wp)), ~ 1, nam = "(Intercept)")

# peek at the variance-covariance matrix again
VarCorr(m_sp)
##             Variance  StdDev   
## rep =       pdSymm(1)          
## (Intercept) 0.8944272 0.9457416
## tA =        pdSymm(1)          
## (Intercept) 0.4472136 0.6687403
## Residual    1.0000000 1.0000000
results <-
  emmeans::joint_tests(m_sp) |> 
  as.data.frame() |> 
  transmute(source = `model term`, 
            numDF = df1, 
            denDF = df2, 
            F.ratio)

alpha <- 0.05

results |> 
  mutate(ncp = F.ratio* numDF, 
         Fcrit = qf(1-alpha, df1 = numDF, df2 = denDF, ncp = 0),
         Power = 1 - pf(Fcrit, df1 = numDF, df2 = denDF, ncp = ncp))
##   source numDF denDF F.ratio    ncp    Fcrit     Power
## 1     tA     2     4   5.629 11.258 6.944272 0.5146767
## 3     tB     1     6   2.000  2.000 5.987378 0.2231880
## 2  tA:tB     2     6   2.000  4.000 5.143253 0.2682179

26.2.1 Precision analysis

res <- as.data.frame(emmeans::emmeans(m_rcbd, ~ tA, contr = list(c(1, -1, 0)))$contrasts)
## NOTE: Results may be misleading due to involvement in interactions
res$SE*(qt(p = .975, df = res$df) - qt(p = .025, df = res$df))
## [1] 2.572833
res <- as.data.frame(emmeans::emmeans(m_sp, ~ tA, contr = list(c(1, -1, 0)))$contrasts)
## NOTE: Results may be misleading due to involvement in interactions
res$SE*(qt(p = .975, df = res$df) - qt(p = .025, df = res$df))
## [1] 3.30997

26.3 Bayesian analysis of designed experiments

Bayes theorem:

\[[\boldsymbol{\theta} \vert \mathbf{y}] = \frac{[\mathbf{y} \vert \boldsymbol{\theta}] [\boldsymbol{\theta}] }{ [\mathbf{y}]}, \]

where:

  • \([\boldsymbol{\theta} \vert \mathbf{y}]\) is the posterior distribution,
  • \([\mathbf{y} \vert \boldsymbol{\theta}]\) is the likelihood,
  • \([\boldsymbol{\theta}]\) is the prior,
  • \([\mathbf{y}]\) is the data.

Note that whatever is defined in \([\cdot]\), it’s information.

26.4 Applied example

Suppose we designed an experiment with the rationale outlined above. What if the model doesn’t converge?

  • About dropping effects from the model
  • Bayesian solutions
library(glmmTMB)
library(ggpubr)
dat <- readxl::read_xlsx("../../data_confid/Behavior_Counts.xlsx")

# viz
dat |> 
  ggplot(aes(timepoint, behavior_count))+
  geom_line(aes(group = calfID, 
                color = breed))+
  geom_point(aes(group = breed, 
                 fill = breed),
             shape =21, size= 2.5)+
  facet_wrap(~trt)+
  scico::scale_fill_scico_d()+
  scico::scale_color_scico_d()+
  theme_pubclean()

m2 <- glmmTMB(behavior_count ~ timepoint * trt * breed +
                toep(timepoint + 0 | calfID),
              ziformula = ~1,
              family = poisson(link = "log"),
              data = dat |> mutate(timepoint_f=as.factor(timepoint)))

What to do?

Link to R code