Upstrap for estimating power and sample size in complex models

Power and sample size calculation are major components of statistical analyses. The upstrap resampling method was proposed as a general solution to this problem.

We evaluate power estimation properties of the upstrap and provide a series of “read, adapt and use” R code examples for power estimation in simple and complex settings (bioRxiv preprint).

Table of Contents

Scientific problem

We consider the following problem: given an observed data sample $\mathbf{x}$ of sample size $N$, given specific null and alternative hypothesis, and a test statistic, assuming significance level $\alpha$, estimate sample size $M$ required to achieve power $1 -\beta$ (i.e., to achieve probability $1 -\beta$ of rejecting the null hypothesis when the null is true).

Here, we consider the tasks to estimate the power to detect:

  1. an effect size observed in the data;
  2. an effect size chosen by a researcher.

We aim to ddress complex settings, including testing significance of model coefficients in: LM, GLM, LMM, GLMM, GEE, and others.

Challenges

For multilevel data settings, the existing approaches tend to fall into two categories.

  • First are theoretical results for estimating power in specific multilevel data setups; these often use assumptions about the intra-class correlation coefficient, and/or assume a particular study design (e.g., that the data are balanced).

  • Another class is based on simulations. Such may require specifying a population model for the data, simulating data from the assumed model, and estimating the power via Monte Carlo simulations.

The former arguably lacks flexibility; in addition, finding and then determining applicability of a theoretical result may be difficult. The latter is flexible but involves potentially complex programming task.

Proposed solution

Upstrap

The upstrap resampling method (Crainiceanu, C.M., Crainiceanu, A. (2020)) was proposed as a general solution to this problem.

Upstrap starts with a sample (observed data) and resamples with replacement either fewer or more samples than in the original data set. The below example shows difference between bootstrap and upstrap resample.

# simulate data 
x <- rnorm(n = 10)

# bootstrap resample
x_b <- sample(x, size = 10, replace = TRUE)

# upstrap resample (case: more samples than in original x) 
x_u <- sample(x, size = 20, replace = TRUE)

Upstrap for estimating power

Given observed data sample $\mathbf{x}$, to estimate power for a target sample size, we propose:

  1. Generate $B$ resamples of target sample size by sampling with replacement from $\mathbf{x}$.

  2. Perform hypothesis test on each resample.

  3. Estimate power as the proportion of $B$ resamples where the null hypothesis was rejected.

The above procedure estimates power corresponding to the effect size observed in the sample $\mathbf{x}$. For example, for one-sample t-test, the above procedure estimates:

power.t.test(delta = mean(x), sd = sd(x), ...)$power

Upstrap for estimating power: specific effect size

In practice, one is often is interested in estimating power for a specific effect size. For example, for one-sample t-test, a specific effect size could be set to 0.3:

power.t.test(delta = 0.3, sd = sd(x), ...)$power

To address such cases, we propose to update response variable values in the observed sample $\mathbf{x}$ to ensure the effect size in this updated data is our target effect size. Details are provided in the preprint.

Code example: testing for significance of LM coefficient

Below, we demonstrate the upstrap power estimation method for testing for significance of LM coefficient. In the preprint, we provide more R code examples, including testing for significance of coefficient in LM, GLM, LMM, GLMM.

We define simulation parameters and simulate a sample of size $N = 50$.

(Click to see setup definition and R code.)

Setup

Consider a random sample with $N = 50$ independent observations (e.g., 50 subjects, 1 observation per subject). Assume a continuous response variable $Y$ and two covariates: dichotomous $X_1$, continuous $X_2$. We are interested in estimating power of test for significance of the coefficient $\beta_{1}$ in linear model $Y_{i}=\beta_{0}+\beta_{1} X_{1 i}+\beta_{2} X_{2 i}+\varepsilon_{i}$, where $i=1, \ldots, N$ and $\varepsilon_{i} \sim_{\text {iid }} N\left(0, \sigma^{2}\right)$.

# simulation parameters
N <- 50
coef_x0 <- 0
coef_x1 <- 0.2
coef_x2 <- 0.1
sigma2  <- 1

# simulate sample
set.seed(1)
subjid_i <- 1 : N # subject ID unique in data set
x1_i  <- rbinom(n = N, size = 1, prob = 0.5)
x2_i  <- rbinom(n = N, size = 1, prob = 0.5)
eps_i <- rnorm(N, sd = sqrt(sigma2))
y_i   <- coef_x0 + (coef_x1 * x1_i) + (coef_x2 * x2_i)  + eps_i
dat   <- data.frame(y = y_i, x1 = x1_i, x2 = x2_i, subjid = subjid_i)
str(dat)
# 'data.frame':	50 obs. of  4 variables:
#  $ y     : num  0.398 -0.512 0.541 -0.929 1.433 ...
#  $ x1    : int  0 0 1 1 0 1 1 1 1 0 ...
#  $ x2    : int  0 1 0 0 0 0 0 1 1 0 ...
#  $ subjid: int  1 2 3 4 5 6 7 8 9 10 ...

We get the observed effect size.

fit <- lm(y ~ x1 + x2, data = dat)
coef(fit)["x1"]
# 0.5585879 

We estimate the test power via the upstrap.

Case: target sample size M = N = 50, target effect size as observed in the sample

# number of upstrap resamples
B_boot <- 1000
out <- rep(NA, B_boot)
for (rr in 1 : B_boot){
  dat_rr_idx <- sample(1 : nrow(dat), replace = TRUE)
  dat_rr <- dat[dat_rr_idx, ]
  fit_rr <- lm(y ~ x1 + x2, data = dat_rr)
  pval_rr <- summary(fit_rr)$coef["x1", 4]
  out[rr] <- (pval_rr < 0.05)
}
mean(out)
# [1] 0.493

Case: target sample size M = 100, target effect size as observed in the sample

out <- rep(NA, B_boot)
for (rr in 1 : B_boot){
  dat_rr_idx <- sample(1 : nrow(dat), size = 100, replace = TRUE)
  dat_rr <- dat[dat_rr_idx, ]
  fit_rr <- lm(y ~ x1 + x2, data = dat_rr)
  pval_rr <- summary(fit_rr)$coef["x1", 4]
  out[rr] <- (pval_rr < 0.05)
}
mean(out)
# [1] 0.773

Case: target sample size M = 100, target effect size set to 0.8

# update the outcome in the sample to represent the target effect size
dat_upd <- dat
dat_upd$y <- dat_upd$y + (0.8 - coef(fit)["x1"]) * dat_upd$x1
out <- rep(NA, B_boot)
for (rr in 1 : B_boot){
  dat_rr_idx <- sample(1 : nrow(dat_upd), size = 100, replace = TRUE)
  dat_rr <- dat_upd[dat_rr_idx, ]
  fit_rr <- lm(y ~ x1 + x2, data = dat_rr)
  pval_rr <- summary(fit_rr)$coef["x1", 4]
  out[rr] <- (pval_rr < 0.05)
}
mean(out)
# [1] 0.92

Contributions

  • Our preprint: Karas, M., Crainiceanu, C.M. (2021) Upstrap for estimating power and sample size in complex models is available on bioRxiv.
Marta Karas
Marta Karas
Associate Director, Statistics