Generalized Linear Models for Spatial Transcriptomic Data

Kaishu Mason, University of Pennsylvania

Cells within a tissue continuously collaborate to respond to changing environments. This coordination generally relies on communication between neighboring cells, making knowledge of a cell's immediate surroundings a key factor in understanding its function.

In the last five years, Spatial Transcriptomic (ST) technologies have been developed that allow for precise measurements of both gene expression and spatial location within a tissue. This in turn allows scientists to ask the general question, “How does a cell’s environment affect it?” Unfortunately, the unit of observation is not a single cell but a “spot,” which can encompass as many as 10 cells. Therefore, to perform cell type-specific inference, we must first estimate the relative proportions of each cell type within a spot, a step called deconvolution. Even after deconvolution, cell type-specific inference remains challenging because gene expression is not measured for individual cells. In response, we investigate procedures for fitting generalized linear models in this setting, a framework we call SpotGLM.
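
To make the spot-level setting concrete, here is a minimal sketch of the simplest special case: a Gaussian model with identity link, where a spot's expression is the proportion-weighted sum of cell type-specific linear predictors. It is an illustration under stated assumptions, not the SpotGLM estimator itself. The function name `fit_spot_linear`, the simulated data, and the use of ordinary least squares are all hypothetical choices made for this example.

```python
import numpy as np

def fit_spot_linear(y, props, X):
    """Estimate cell type-specific coefficients from spot-level data.

    Assumed model (illustrative only): y[s] = sum_t props[s, t] * (X[s] @ beta[t]) + noise.
    y:     (n_spots,) expression of one gene per spot
    props: (n_spots, n_types) deconvolved cell type proportions
    X:     (n_spots, n_cov) spot-level covariates
    Returns beta with shape (n_types, n_cov).
    """
    n_spots, n_types = props.shape
    n_cov = X.shape[1]
    # The model is linear in beta: column t * n_cov + k equals props[:, t] * X[:, k].
    design = (props[:, :, None] * X[:, None, :]).reshape(n_spots, n_types * n_cov)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef.reshape(n_types, n_cov)

# Noise-free simulation with 3 cell types, an intercept, and one covariate.
rng = np.random.default_rng(0)
props = rng.dirichlet(np.ones(3), size=200)             # each row sums to 1
X = np.column_stack([np.ones(200), rng.normal(size=200)])
beta_true = np.array([[1.0, 0.5], [2.0, -1.0], [0.0, 3.0]])
y = np.einsum("st,sk,tk->s", props, X, beta_true)       # spot-level mixture mean
beta_hat = fit_spot_linear(y, props, X)
```

With a non-identity link (for example, a log link for count data) the spot-level mean is no longer linear in the coefficients, so an iterative fitting procedure would be needed in place of the single least-squares solve.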

We then demonstrate how the SpotGLM framework can aid scientific discovery by investigating the effect of a cell’s neighborhood on gene expression, which we call niche differential expression. I will showcase the model and show how false discovery rate control and computational efficiency are attained in settings where we perform over a million hypothesis tests.
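
For readers unfamiliar with false discovery rate control, the sketch below implements the standard Benjamini-Hochberg step-up procedure. The abstract does not say which procedure SpotGLM uses, so this is a generic illustration rather than the talk's method. The function name `benjamini_hochberg` and the example p-values are made up for this example.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level alpha."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                           # indices of p-values, ascending
    thresholds = alpha * np.arange(1, m + 1) / m    # i * alpha / m for rank i
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        # Step-up rule: reject every hypothesis up to the largest passing rank.
        k = np.nonzero(passed)[0].max()
        reject[order[: k + 1]] = True
    return reject

# Hypothetical p-values: only the third-smallest is below its own threshold,
# but the step-up rule then also rejects the two smaller ones.
reject = benjamini_hochberg([0.03, 0.02, 0.035, 0.5], alpha=0.05)
```

The procedure is a sort followed by vectorized comparisons, so it stays cheap when applied to more than a million tests.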