Structure-Aware Annotation of Leucine-Rich Repeat Domains

Presenter:

Boyan

Profile Link:

Boyan Xu

University:

University of California, Berkeley

Program:

CSGF

Year:

2023

Protein domains are typically annotated with predictive models, such as hidden Markov models (HMMs), trained on sequence motifs. However, sequence-based annotation is prone to error, particularly when calling domain boundaries and the motifs within them, because the models have no access to structural information. With the advent of deep learning-based protein structure prediction, we aim to use the geometry of predicted protein structures to assist domain annotation and improve on existing sequence-based annotation. We develop dimensionality reduction methods to annotate the repeat units of the leucine-rich repeat (LRR) solenoid domain. These methods correct mistakes made by existing machine learning-based annotation tools and enable automated detection of hairpin loops and structural anomalies in the solenoid. We apply and test them on 127 predicted structures of LRR-containing intracellular innate immune proteins from the model plant Arabidopsis thaliana.
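
To make the idea of geometry-based repeat annotation concrete, here is a minimal illustrative sketch in Python. It is not the presenter's implementation: the abstract does not specify the dimensionality reduction used, so this example assumes plain PCA, assumes an idealized straight solenoid built from synthetic coordinates, and uses a hypothetical helper named `winding_boundaries`. The backbone is projected onto the plane perpendicular to its longest axis, and each completed full turn around that axis is treated as one repeat unit.

```python
import numpy as np

def winding_boundaries(coords):
    """Hypothetical helper: count turns of a coil around its main axis.

    coords: (n_residues, 3) array of backbone (e.g. C-alpha) positions.
    Returns the cumulative turn count per residue and the residue indices
    at which each full turn is completed.
    """
    centered = coords - coords.mean(axis=0)
    # PCA via SVD: rows of vt are principal directions, largest variance first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    # Assumption: the coil is long and straight, so the first principal
    # direction approximates its axis. Project onto the other two.
    plane = centered @ vt[1:].T
    # Cumulative rotation angle around the axis, without 2*pi jumps.
    angle = np.unwrap(np.arctan2(plane[:, 1], plane[:, 0]))
    # The sign of the rotation depends on the SVD orientation, so use abs.
    turns = np.abs(angle - angle[0]) / (2 * np.pi)
    # A boundary is the first residue at which the whole-turn count increases.
    boundaries = np.flatnonzero(np.diff(np.floor(turns)) > 0) + 1
    return turns, boundaries

# Synthetic test coil (all values are illustrative assumptions):
# 24 residues per turn, 10.5 turns, 10 A radius, 5 A rise per turn.
residues_per_turn = 24
t = 2 * np.pi * np.arange(253) / residues_per_turn
coords = np.column_stack([10 * np.cos(t), 10 * np.sin(t), 5 * t / (2 * np.pi)])

turns, boundaries = winding_boundaries(coords)
print("full turns completed:", len(boundaries))
print("residues between boundaries:", np.diff(boundaries))
```

On a real predicted structure the solenoid axis is usually curved, so a single global projection like this would likely be too crude; the sketch only shows how a low-dimensional view of the backbone can expose repeat periodicity that is hard to read from sequence alone.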

Program Review:

2023 Annual Program Review

