Oral Presentation Australasian RNA Biology and Biotechnology Association 2025 Conference

RegLLM: An RNA Foundation Large Language Model for Decoding the Post-Transcriptional Regulatory Code (130025)

Ke Ding 1, Brian Parker 1, Jiayu Wen 1
  1. The Australian National University, CANBERRA, ACT, Australia

We introduce RegLLM, a foundation large language model built to decode the multilayered landscape of RNA regulation—from protein–RNA interactions and RNA modifications to stability, translation efficiency, and structure. Trained on ~1 billion curated RNA tokens and enriched with genome-wide SHAPE-seq and evolutionary conservation features, RegLLM captures both base-level precision and long-range regulatory context. Across diverse benchmarks—including binding site prediction, modification mapping, miRNA–target interaction detection, stability estimation, translation efficiency, and RNA structure prediction—RegLLM consistently outperforms existing genomic LLMs. Its architecture generalizes seamlessly across modalities and tasks, delivering accurate and interpretable predictions. RegLLM offers a versatile platform for decoding the post-transcriptional regulatory code and advancing functional RNA genomics.