Brain connectivity-informed regularization methods for regression

We propose to estimate the association between brain structure features and a scalar outcome in a regression model while using additional information about structural connectivity between brain regions.

Specifically, we propose a novel regularization method – riPEER (ridgified Partially Empirical Eigenvectors for Regression) – that defines a regularization penalty term based on the structural connectivity-derived Laplacian matrix.

Scientific problem

The motivation for the work was to quantify the association between alcohol abuse phenotypes (outcome) and cortical thickness of the brain (covariates) in a study sample of young social-to-heavy drinking males. The data included measurements of average cortical thickness estimated for 68 brain regions.

This image (see image credits below) visualizes the process of obtaining cortical thickness measurements from structural MRI images.


Common issues in such settings are:

  1. high dimensionality of the data - we typically parcel the brain into tens or hundreds of units from which we take measurements, and each unit may then correspond to a covariate in the data set,

  2. correlation of the covariates - measurements from spatially neighbouring or otherwise connected brain regions are likely to be correlated,

  3. small sample size - brain imaging studies often recruit a few tens of participants only.

Proposed solution

We propose a penalized regression method, riPEER, to estimate a linear model: $$y = Zb + X\beta + \varepsilon$$ where:

  • $y$ - response (here: alcohol abuse phenotypes),
  • $Z$ - input data matrix (here: cortical thickness measurements),
  • $X$ - input data matrix (here: demographics data),
  • $\beta$ - regression coefficients, not penalized in the estimation process,
  • $b$ - regression coefficients, penalized in the estimation process, for which a prior graph of similarity / graph of connections is available.

The riPEER estimation method uses a penalty that is a linear combination of a graph-based penalty term and a ridge penalty term: $$ \hat{\beta}, \hat{b} = \underset{\beta,b}{\text{arg min}} \left[ (y - X\beta - Zb)^T(y - X\beta - Zb) + \lambda_Qb^TQb + \lambda_Rb^Tb \right ] $$ where:


  • $Q$ - a graph-originated penalty matrix; typically: a graph Laplacian matrix (here: a graph Laplacian derived from structural connectivity of brain regions),
  • $\lambda_Q$ - regularization parameter for the graph-based penalty term,
  • $\lambda_R$ - regularization parameter for the ridge penalty term.
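
For fixed values of $\lambda_Q$ and $\lambda_R$, the objective above is quadratic in $(\beta, b)$, so its minimizer solves a linear system. The following is a minimal NumPy sketch of that closed-form solve, not the mdpeer implementation: the function name `ripeer_fixed_lambda`, the toy chain graph, and the hand-picked $\lambda$ values are assumptions made for illustration only (riPEER itself selects the regularization parameters automatically).

```python
import numpy as np

def ripeer_fixed_lambda(y, Z, X, Q, lambda_Q, lambda_R):
    """Minimize ||y - X beta - Z b||^2 + lambda_Q b'Qb + lambda_R b'b
    for FIXED lambda_Q, lambda_R (hypothetical helper, illustration only).

    Returns (beta_hat, b_hat). Only b is penalized; beta is not.
    """
    k, p = X.shape[1], Z.shape[1]
    W = np.hstack([X, Z])                 # joint design matrix [X, Z]
    P = np.zeros((k + p, k + p))          # zero block leaves beta unpenalized
    P[k:, k:] = lambda_Q * Q + lambda_R * np.eye(p)
    # Normal equations of the penalized least-squares problem.
    coef = np.linalg.solve(W.T @ W + P, W.T @ y)
    return coef[:k], coef[k:]

# Toy data (made up): 5 "regions" connected in a chain 0-1-2-3-4.
rng = np.random.default_rng(0)
n, p = 30, 5
A = np.diag(np.ones(p - 1), 1) + np.diag(np.ones(p - 1), -1)  # adjacency
Q = np.diag(A.sum(axis=1)) - A                                # graph Laplacian
X = np.column_stack([np.ones(n), rng.normal(size=n)])         # intercept + 1 covariate
Z = rng.normal(size=(n, p))
y = (X @ np.array([1.0, 0.5])
     + Z @ np.array([1.0, 1.0, 1.0, 0.0, 0.0])
     + rng.normal(scale=0.1, size=n))

beta_hat, b_hat = ripeer_fixed_lambda(y, Z, X, Q, lambda_Q=1.0, lambda_R=0.1)
resid = y - X @ beta_hat - Z @ b_hat
print("beta_hat:", beta_hat)
print("b_hat:   ", b_hat)
```

Setting the gradient of the objective to zero gives $X^T r = 0$ and $Z^T r = (\lambda_Q Q + \lambda_R I)\,b$ with $r = y - X\beta - Zb$; both conditions can be checked numerically on `resid` to confirm the solve.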

riPEER penalty term

In the riPEER penalty term $(\lambda_Qb^TQb + \lambda_Rb^Tb)$,

  • The graph-derived penalty matrix $Q$ encourages similar coefficients for variables that are connected in the graph.

  • The ridge penalty term, $\lambda_Rb^Tb$, adds an L2 regularization component; in addition, even with a very small $\lambda_R$, it eliminates computational issues arising from the singularity of $Q$.

  • The regularization parameters $\lambda_R$ and $\lambda_Q$ are estimated automatically as maximum likelihood (ML) estimators in an equivalent Linear Mixed Model formulation of the optimization problem.
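
The first two points can be illustrated numerically. The sketch below assumes an unnormalized graph Laplacian $Q = D - A$ built from a small, made-up weighted adjacency matrix $A$ (the post only says that $Q$ is typically a graph Laplacian, so the exact variant is an assumption here). For this choice, $b^TQb$ equals the sum over edges of $a_{ij}(b_i - b_j)^2$, which is why connected coefficients are pulled towards each other, and $Q$ is singular until a ridge term $\lambda_R I$ is added.

```python
import numpy as np

# Hypothetical weighted adjacency for 4 regions: edges 0-1 (weight 2) and
# 1-2 (weight 1); region 3 has no connections.
A = np.array([[0.0, 2.0, 0.0, 0.0],
              [2.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0]])
Q = np.diag(A.sum(axis=1)) - A   # unnormalized graph Laplacian D - A

def graph_penalty(b):
    """Quadratic form b'Qb."""
    return b @ Q @ b

def edge_sum(b):
    """Sum over edges i<j of a_ij * (b_i - b_j)^2."""
    m = len(b)
    return sum(A[i, j] * (b[i] - b[j]) ** 2
               for i in range(m) for j in range(i + 1, m))

b_smooth = np.array([1.0, 1.0, 1.0, 0.5])   # equal on connected regions
b_rough = np.array([1.0, -1.0, 1.0, 0.5])   # differs across both edges

print(graph_penalty(b_smooth))                      # 0.0: nothing to penalize
print(graph_penalty(b_rough), edge_sum(b_rough))    # both 12.0

# Q is singular: constant vectors lie in its null space.
print(np.linalg.matrix_rank(Q), "<", Q.shape[0])

# Adding even a tiny ridge term makes the penalty matrix positive definite.
lambda_R = 1e-3
min_eig = np.linalg.eigvalsh(Q + lambda_R * np.eye(4)).min()
print(min_eig > 0)
```

Note that the isolated region 3 contributes nothing to $b^TQb$, so without the ridge term its coefficient would not be regularized at all.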

Published work


We provide an open-source implementation of the proposed riPEER estimation method in the R package mdpeer (CRAN index). The package provides functions for graph-constrained regression methods in which the regularization parameters are selected automatically by estimating an equivalent Linear Mixed Model formulation.

The R package is accompanied by the "Intro and usage examples" vignette.

Images used in the post – credit/references

  • Featured image - top left component. Cortical thickness. Resources of Neurorecovery Laboratory at MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging. Accessed at: link (last accessed on Nov 20, 2020).

  • Featured image - top middle component. Diffusion MRI Tractography in the brain white matter. Xavier Gigandet et al. - Gigandet X, Hagmann P, Kurant M, Cammoun L, Meuli R, et al. (2008) Estimating the Confidence Level of White Matter Connections Obtained with MRI Tractography. PLoS ONE 3(12): e4006. doi:10.1371/journal.pone.0004006. Accessed at: link (last accessed on Nov 20, 2020).

  • Featured image - top right component. Databases of Statistical Information. Resources of Berkeley Advanced Media Institute Graduate School of Journalism. Accessed at: link (last accessed on Nov 20, 2020).

Marta Karas
Postdoctoral researcher