Comprehensive RNA-binding protein analyses and deep learning uncover genetic constraints and disease associations in protein-RNA interfaces.
Abstract
RNA-binding proteins (RBPs) orchestrate post-transcriptional processes, including splicing, cleavage and polyadenylation, and translation. Our updated RBP resource integrates data from 92 additional RBPs (286 in total) profiled by enhanced CLIP (eCLIP), enabling comprehensive characterization of RNA elements within human K562 and HepG2 cells. To interrogate RBP-binding syntax, we trained deep-learning models on eCLIP profiles, allowing us to score genetic variants and quantify constraints on RBP-binding sites. We observed opposing selective-constraint profiles at splicing enhancers versus silencers, including an unexpected enrichment of strengthening mutations in ELAVL1- and HNRNPC-binding sites. Finally, our model prioritizes disease variants, exposing unexpected RBP-related mechanisms of pathogenesis, exemplified by the enrichment of weakening mutations in spliceosomal protein-binding sites among retinal disease variants. The complete eCLIP resource offers an integrated platform for exploring RBP-RNA interactomes.
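The variant-scoring idea in the abstract can be illustrated with a minimal sketch: score the reference and the alternate sequence with the same model, then call the variant strengthening or weakening by the sign of the difference. Everything named here is an assumption for illustration. `toy_binding_score` is a hypothetical stand-in that simply rewards U (written as T) content. It is not the authors' deep-learning model, and `variant_effect` and `classify` are illustrative helpers rather than part of any published pipeline.

```python
from typing import Callable


def toy_binding_score(seq: str) -> float:
    """Hypothetical stand-in for a trained model: fraction of T (U) bases."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    return seq.upper().count("T") / len(seq)


def variant_effect(score_fn: Callable[[str], float],
                   ref_seq: str, pos: int, alt: str) -> float:
    """Return score(alt sequence) - score(ref sequence) for a single-base change.

    pos is a 0-based index into ref_seq; alt is the substituted base.
    """
    if not 0 <= pos < len(ref_seq):
        raise ValueError("pos is outside the sequence")
    if len(alt) != 1:
        raise ValueError("only single-nucleotide substitutions are handled")
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return score_fn(alt_seq) - score_fn(ref_seq)


def classify(delta: float, tol: float = 1e-9) -> str:
    """Label a variant by the sign of its predicted change in binding."""
    if delta > tol:
        return "strengthening"
    if delta < -tol:
        return "weakening"
    return "neutral"


# Assumed example window around a U-rich site (made-up sequence).
ref = "ACGTTTTTGA"
loss = variant_effect(toy_binding_score, ref, 5, "C")  # T -> C inside the U tract
gain = variant_effect(toy_binding_score, ref, 1, "T")  # C -> T next to the tract
print(classify(loss), classify(gain))
```

In practice the scoring function would be replaced by a model trained on eCLIP profiles. The reference-versus-alternate comparison stays the same regardless of what produces the score.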
Linked Datasets (3)
Comprehensive RNA-binding protein analyses using enhanced CLIP (ENCORE) [dataset1]
Comprehensive RNA-binding protein analyses using enhanced CLIP (ENCORE) [dataset2]