Beyond Proteins: The Promising yet Challenging Realm of Targeting RNA
In my previous post, I promised a follow-up article about targeting non-protein biomolecules. Today, we'll focus on targeting RNA. If you work in drug discovery, you're likely familiar with the current buzz around targeting RNA with small molecules.
RNA is an appealing target because it is so abundant and diverse in human cells. While less than 2% of the human genome encodes proteins, over 70% is transcribed into RNA, most of it non-coding. Consequently, RNA is ubiquitous in human cells, existing in various forms such as mRNA, tRNA, lncRNA, pre-mRNA, and miRNA, each playing key roles in biological pathways. So why isn't everyone solely focused on targeting RNA? In this article, I'll address some of the key challenges in targeting RNA compared to the more common protein targets.
I spent over five years at the University of Cambridge leading a project on targeting G-quadruplexes, which are non-canonical secondary structures of DNA, the other nucleic acid. During that time, I spent many sleepless nights thinking about some of the key challenges in this field. While I plan to write about targeting DNA in the future, those in a hurry can refer to this review we published on G-quadruplex DNA. The parallels between targeting RNA and DNA motivated me to write this article. This overview is highly opinionated, and given the relative novelty of the field, my views may evolve. I'll update them in future articles on this website as necessary.
Small Molecules Targeting RNA: Balancing Potency and Pharmacokinetics
Firstly, let's focus on small molecules that target RNA. Due to RNA's negatively charged phosphate backbone and shallow binding surfaces, the literature is flooded with compounds featuring flat aromatic rings and positively charged or basic groups. Many in the field argue that these intrinsic structural differences necessitate thinking beyond the "Rule of 5". Several studies indeed show that RNA-targeting small molecules generally have lower lipophilicity, higher polar surface area, and more hydrogen bond donors than typical protein-targeting drugs.
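For readers less familiar with the "Rule of 5", here is a minimal Python sketch of the check. The four thresholds are Lipinski's published ones. The `Descriptors` container, the function name, and the example values are hypothetical and chosen only for illustration; they are not measured data for any real compound.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    """Precomputed physicochemical descriptors (hypothetical container)."""
    mol_weight: float       # molecular weight in Da
    logp: float             # calculated octanol-water partition coefficient
    h_bond_donors: int      # number of N-H and O-H groups
    h_bond_acceptors: int   # number of N and O atoms


def rule_of_5_violations(d: Descriptors) -> List[str]:
    """Return the names of the Rule of 5 criteria that the compound violates."""
    violations = []
    if d.mol_weight > 500:
        violations.append("mol_weight")
    if d.logp > 5:
        violations.append("logp")
    if d.h_bond_donors > 5:
        violations.append("h_bond_donors")
    if d.h_bond_acceptors > 10:
        violations.append("h_bond_acceptors")
    return violations


# Illustrative, made-up profiles: a polar, basic "RNA-binder-like" compound
# and a more conventional oral-drug-like compound.
polar_binder = Descriptors(mol_weight=620.0, logp=0.5,
                           h_bond_donors=8, h_bond_acceptors=12)
drug_like = Descriptors(mol_weight=350.0, logp=2.5,
                        h_bond_donors=2, h_bond_acceptors=5)

print(rule_of_5_violations(polar_binder))
print(rule_of_5_violations(drug_like))
```

In this toy comparison the polar profile fails on size and on both hydrogen bonding counts while passing on lipophilicity, which is the pattern the studies mentioned above describe for many RNA binders.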
While I agree that compounds targeting RNA will differ from those targeting proteins, the fundamental properties governing small molecule pharmacokinetics remain unchanged. Yes, the rules have evolved over time, and researchers often point to compounds beyond the "Rule of 5", but physicochemical properties like lipophilicity and hydrogen bond donor count remain crucial for achieving good pharmacokinetic (PK) properties.
I anticipate some readers may challenge this view, citing successes of large PROTAC compounds that far exceed the "Rule of 5" parameters. However, recent publications have highlighted the importance of hydrogen bond donors and effective polar surface area in achieving good PK properties, even for PROTAC compounds.
That said, it's not impossible to develop RNA-binding small molecules with favourable PK properties. The key lies in finding the optimal balance of hydrogen bond donors, basicity, and aromaticity to maintain both potency and desirable PK properties. I'm not in favour of designing RNA-specific screening libraries with very high polar surface area, large numbers of hydrogen bond donors, and high basicity. While such libraries may yield hits, improving the pharmacokinetic properties of these hits without a significant loss of potency against the target RNA will be extremely challenging.
Selectivity Challenges
Achieving a small molecule that selectively targets just one RNA is extremely challenging and may even be impossible. In many cases, such selectivity might not be necessary, as numerous blockbuster drugs aren't completely selective for their targets. If functional selectivity in cells for the mechanism of interest can be achieved, poor binding selectivity might be acceptable. However, it's generally advisable to aim for target-selective compounds to avoid off-target toxicity downstream. It's rare to initiate a drug discovery campaign seeking unselective compounds.
RNA binding surfaces don't vary as much as protein surfaces, where unique binding pockets are more common. If your compound has a flat aromatic surface with basic groups, it's likely non-selective. Potential off-target RNA and DNA structures far outnumber your target RNA. Even if 5-10 fold selectivity is achieved in vitro, the concentration of off-target RNAs with similar binding surfaces could be hundreds of times greater than that of your target RNA.
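To put rough numbers on this point, here is a back-of-the-envelope sketch in Python. It rests on a deliberately simple assumption that is mine, not a validated model: all RNA sites are far from saturation, so the amount of compound bound to each RNA scales with its concentration divided by its Kd, and all off-target RNAs are lumped into a single pool with one Kd. The function name and inputs are hypothetical.

```python
def fraction_on_target(selectivity_fold: float, off_target_excess: float) -> float:
    """Fraction of RNA-bound compound that sits on the intended target.

    selectivity_fold  : Kd(off-target) / Kd(target), e.g. 10 for 10-fold selectivity
    off_target_excess : [off-target RNA] / [target RNA], e.g. 100

    Simplifying assumption: binding is far from saturation, so the amount
    bound to each RNA is proportional to [RNA] / Kd.
    """
    if selectivity_fold <= 0:
        raise ValueError("selectivity_fold must be positive")
    if off_target_excess < 0:
        raise ValueError("off_target_excess must not be negative")
    target_weight = 1.0  # [target] / Kd(target), normalised to 1
    off_target_weight = off_target_excess / selectivity_fold
    return target_weight / (target_weight + off_target_weight)


# 5-fold and 10-fold in vitro selectivity against a 100-fold excess of off-target RNA
for fold in (5, 10):
    share = fraction_on_target(selectivity_fold=fold, off_target_excess=100)
    print(f"{fold}-fold selective: {share:.1%} of bound compound is on target")
```

Under these assumptions, a 10-fold selective compound facing a 100-fold excess of off-target RNA has only about 9% of its bound fraction on the intended target, and a 5-fold selective compound has under 5%.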
Higher selectivity has been achieved using larger compounds that bind to multiple regions of the same RNA. While excellent as chemical biology tools, these compounds face pharmacokinetic optimisation challenges similar to those of PROTACs, potentially to an even greater extent due to their more polar nature with numerous hydrogen bond donors. I'm not claiming it's impossible, just extremely challenging. I'll be the first to applaud when such large compounds targeting multiple sites on RNA reach the clinic.
Comparing RNA to Proteins: Structural Challenges
Let's shift our focus to compare RNA with our familiar protein targets. RNA, a nucleic acid, contains a negatively charged phosphate backbone. This contrasts with proteins, which feature neutral, positively charged, and negatively charged amino acids on their surface, so the overall surface charge varies among proteins. Proteins generally have buried surfaces and pockets that small molecules can target (there are exceptions, and we'll explore them in future articles). RNA, however, typically lacks such buried surfaces, with some exceptions like riboswitches. RNA is usually a single-stranded chain of nucleotides that folds back on itself and forms Watson-Crick base pairs, similar to those in DNA. However, various regions of the RNA strand can't form stable base pairs and instead adopt structures like bulges, hairpins, quadruplexes, triplexes, and knots.
RNA structures are highly sensitive to changes in ionic conditions, pH, and temperature. These structural changes are often more significant than those observed in proteins, although some protein classes are also quite flexible. During my time at the University of Cambridge, I observed these changes firsthand in circular dichroism experiments. G-quadruplex DNA and RNA folded differently in the presence of sodium versus potassium cations in the buffer. Most G-quadruplex-forming sequences didn't fold at all in buffers containing lithium instead of potassium or sodium.
I believe this structural flexibility of RNA is a major reason why most AI-based RNA structure prediction methods lag behind protein structure prediction methods like AlphaFold and RoseTTAFold. The primary challenge is that experimental structures can vary depending on buffer composition. Even if an algorithm predicts a correct conformation from many possible structures, RMSD calculations against an experimental structure solved under different conditions might suggest otherwise. The main issue is the lack of an ideal buffer for folding RNA, as the ionic, pH, and temperature environment around an RNA structure in the human body isn't constant. Finding the perfect assay conditions to fold RNA into its most disease-relevant conformation remains a significant challenge.
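The RMSD point can be illustrated with a toy Python sketch. The coordinates below are invented three-atom "structures", not real RNA data, and the function assumes the two structures are already superimposed with atoms listed in the same order; a real comparison would first perform an optimal superposition.

```python
import math
from typing import List, Tuple

Coords = List[Tuple[float, float, float]]


def rmsd(a: Coords, b: Coords) -> float:
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


# Made-up coordinates: the prediction matches the fold seen in one buffer
# (labelled "potassium" here) but not the fold seen in another ("sodium").
prediction = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
reference_potassium = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
reference_sodium = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 2.0, 0.0)]

print(f"RMSD vs potassium-buffer reference: {rmsd(prediction, reference_potassium):.2f}")
print(f"RMSD vs sodium-buffer reference:    {rmsd(prediction, reference_sodium):.2f}")
```

The same prediction scores as perfect against one reference and poor against the other, so the benchmark result depends on which experimental conformation happens to be used as ground truth.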
Hit Identification Challenges in Targeting RNA
Given this structural complexity, how does one initiate a hit identification campaign? It's undoubtedly challenging.
Before embarking on any screen, it's crucial to invest significant time in understanding the structure of your target RNA, given its structural flexibility. For known RNAs, investigate any information about the functionally relevant form. NMR and circular dichroism (CD) experiments are widely used to understand RNA structural dynamics. Once you've determined the optimal ionic and pH conditions for your buffer, you can plan your screen. Remember that the choice of screen also constrains buffer composition, so you have limited flexibility here.
Working with a novel RNA presents even greater challenges. To determine RNA structure, you can employ chemical probing methods such as DMS probing and SHAPE to elucidate secondary structure. Combine these data with results from NMR, CD, and computational calculations to deduce the RNA conformation. Even then, careful consideration of buffer compositions in these experiments is essential.
One might argue for bypassing structural concerns and screening via HTS with a functional reporter assay in cells. While possible, this approach introduces downstream complexities. How will you confirm hits without biophysical binding data? And how will you develop a biophysical binding assay without insight into structural diversity across various buffers? Conducting a medicinal chemistry campaign becomes challenging unless you design orthogonal functional reporter assays to confirm that cellular functional readouts stem from binding to your target RNA alone, and not from secondary effects. This is difficult, though not impossible.
The key takeaway is this: RNA structures are flexible and may fold differently under various buffer conditions. It's crucial to study this behaviour early in RNA-targeted drug discovery campaigns.
The field of targeting RNA with small molecules is relatively young in industrial settings. With advances in computational tools for RNA structure prediction, cryo-EM, and various new chemical biology techniques, combined with decades of industrial medicinal chemistry expertise in property optimisation, I'm optimistic that we'll soon see several new RNA-targeting small molecules in the clinic.
For a more comprehensive scientific overview on targeting RNA, here's an excellent review from Matthew Disney's lab.