Why CRISPR R&D Needs to Think About Data Workflows

Gene editing is growing up. Ten years after Science magazine named CRISPR its 2015 “Breakthrough of the Year,” this gene editing technology has become a workhorse of modern biology. In 2023, the FDA approved the first CRISPR-based therapy in the United States, a treatment for sickle cell anemia. Scientists continue to develop next-generation CRISPR technologies at breakneck speed.

Now that these technologies have cemented their place in the biological toolkit, what about the data workflows that support them? Like a good workbench, proper data systems can help keep these tools organized and accessible, providing the scaffold on which to build better science and better medicines.

What is CRISPR used for?

The applications of CRISPR are numerous, spanning basic research, drug discovery, and the clinic. These include, ordered roughly from the lab bench to the patient:

  • Genome-wide CRISPR screening: This is effective for identifying genes that drive a disease phenotype (useful for target identification) or that modulate a drug's effect. For example, scientists at AstraZeneca have used CRISPR screening to identify genes that enhance lipid nanoparticle (LNP) delivery. CRISPR screening can be performed in a pooled or arrayed (e.g., plate-based) format, mostly in vitro with cell lines or induced pluripotent stem cells (iPSCs).
  • Knockout and transgenic animal models: These are much faster to generate using targeted CRISPR technology than with traditional random integration or homologous recombination methods. In the context of drug discovery, such knockouts and transgenics can serve as in vivo preclinical models.
  • CRISPR diagnostics: These can rapidly detect infectious disease or even cancer at the point of care. The tests work by first recognizing and cutting a target sequence present in the virus or tumor cell of interest, which then triggers cleavage of a reporter molecule (e.g., an RNA tethered to a fluorophore). CRISPR-based diagnostic tests for COVID-19 built on two platforms, DETECTR and SHERLOCK, received emergency use authorization from the FDA during the pandemic.
  • Ex vivo CRISPR therapies are the first kind of CRISPR treatment to have reached patients, most notably with the approval of Casgevy for sickle cell anemia in late 2023. To treat the disease, patient-derived cells are genetically modified in a laboratory and then reintroduced into the patient. For some ex vivo CRISPR therapies, patients must undergo chemotherapy to remove the old disease-bearing cells and allow the CRISPR-modified cells to fully repopulate.
  • In vivo CRISPR therapies are, unlike ex vivo therapies, delivered directly to patients via viral or other means, as a one-time treatment. Because they do not require complex or arduous intermediate steps like stem cell extraction or chemotherapy, they are a primary goal of the field, but off-target effects remain a major challenge.
  • CRISPR-edited cell therapies such as CAR-T and TIL therapies are not CRISPR therapies per se as they do not directly modify a disease-causing gene. However, they employ CRISPR editing to increase the potency or reduce the immunogenicity of these immunotherapies by modifying both cellular pathways and surface receptors.

 

As these examples illustrate, CRISPR has become an established technology that is deeply embedded in the drug discovery process and has broken into the medical setting. While the applications listed above vary greatly, the fundamental CRISPR technology—whether used in cell lines, animal models, or patients—remains the same.

Research teams working on one application might quickly find themselves working on another. The ability to access, share, and apply knowledge gained from one application to another can be a significant competitive advantage for organizations, especially as priorities and goals shift, or when teams are trying to collaborate or merge.

CRISPR 2.0, 3.0, and AI

CRISPR is not only established, but racing ahead. These days, it feels like improvements and extensions to CRISPR are published every day. For example:

  • New Cas variants. There are many naturally occurring variants of the Cas nuclease, and they can be harnessed for different purposes. For example, while Cas9 makes blunt DNA cuts, its cousin Cas12 makes staggered cuts, which can make it more efficient at site-specific insertions. Cas12 and Cas13 are the nucleases used for CRISPR diagnostics, thanks to a distinctive mechanism in which recognition of the target nucleic acid (DNA for Cas12, RNA for Cas13) triggers a flurry of nonspecific single-stranded cutting. On top of this, variants can be engineered to increase editing efficiency and fidelity.
  • Base and prime editing. To increase precision and decrease off-target cutting, prime and base editing avoid double-strand DNA breaks altogether, merely nicking one strand of the DNA rather than cleaving both. In base editing, a deaminase chemically converts single bases. Prime editing instead relies on a reverse transcriptase, which supports a larger repertoire of base conversions as well as insertions and deletions, with minimal off-target effects. (The co-developer of base editing, David Liu, PhD, recently won the 2025 Breakthrough Prize.)
  • Epigenome editing. CRISPR can also edit the epigenome, using deactivated Cas9 plus a histone-modifying effector enzyme (or other transcriptional repression and activation domains) to modulate gene expression.
  • Targeted conditioning. Not all improvements in CRISPR-Cas technology concern the system itself. For example, CRISPR Therapeutics is pursuing antibody-drug conjugates to make the conditioning chemotherapy that patients undergo before receiving CRISPR-edited cells more targeted and less toxic.
  • AAV and LNP delivery. Given the goal of delivering CRISPR in vivo, delivery methods are a top priority for the field. Adeno-associated virus (AAV) is a favored method of delivery, thanks to its low immunogenicity and pathogenicity as well as stable expression. However, its drawbacks include a small packaging capacity and a natural tendency to target the liver. To overcome tropism issues, researchers are using antibody-targeted AAVs; to fit within the packaging limit, they are creating miniature CRISPR systems through engineering and the use of smaller Cas variants. As an alternative to AAVs, LNPs are an increasingly popular approach, with some therapies using this delivery method already in clinical trials.

Artificial intelligence (AI) is making its mark in the CRISPR field, as in every other: recent exciting work has used large language models to generate new Cas variants and guide RNAs (gRNAs, which direct Cas to the right DNA site) completely de novo.

With this veritable zoo of Cas variants, CRISPR system components, delivery modes, and other related technologies, keeping track of which elements are relevant or optimal for a given project can be confusing, time-consuming, and error-prone. Different systems also require different workflows; LNP and viral delivery are a case in point. LNP delivery involves Cas protein production and purification, gRNA registration, ribonucleoprotein (RNP) complex formation, and subsequent LNP encapsulation.

Virally delivered CRISPR involves the creation of plasmids for both CRISPR-Cas elements and accessory viral elements, followed by viral packaging and purification. The list of variations goes on, as researchers mix and match CRISPR components and delivery systems in increasingly intricate ways.
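
One way to make these differences explicit is to record each delivery route as an ordered list of steps, so that a tracking system knows which intermediates to expect next. The following is a minimal sketch in Python, not a description of any particular product. The step names paraphrase the two workflows described above, the strict linear ordering is a simplification, and `WORKFLOWS` and `next_step` are hypothetical names chosen for illustration.

```python
from typing import Optional

# Ordered steps per delivery route, paraphrased from the text above.
# The names and the strictly linear order are illustrative assumptions.
WORKFLOWS = {
    "lnp": [
        "cas_protein_production",
        "cas_protein_purification",
        "grna_registration",
        "rnp_complex_formation",
        "lnp_encapsulation",
    ],
    "viral": [
        "crispr_plasmid_construction",
        "accessory_viral_plasmid_construction",
        "viral_packaging",
        "viral_purification",
    ],
}


def next_step(delivery: str, completed: list[str]) -> Optional[str]:
    """Return the first step not yet completed, or None when the workflow is done.

    Raises ValueError if `completed` is not a prefix of the workflow,
    i.e. a step was skipped or belongs to another delivery route.
    """
    steps = WORKFLOWS[delivery]
    if completed != steps[: len(completed)]:
        raise ValueError(f"steps out of order for {delivery!r}: {completed}")
    return steps[len(completed)] if len(completed) < len(steps) else None
```

Even in this toy form, a step recorded against the wrong delivery route is rejected immediately rather than discovered later in a spreadsheet.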

More than a CRISPR design tool

The many CRISPR applications and available variations on CRISPR systems can create challenges when it comes to knowledge sharing and project tracking. Many organizations are still using only open-source point solutions, online design tools, or even Excel for these purposes. This approach is neither scalable nor sufficient in a pharmaceutical setting, for multiple reasons:

  • As noted above, not only must researchers track many CRISPR system components and the functional performance of their various permutations, but different systems require different workflows and intermediates. This is an enormous amount of complexity that is difficult to capture and follow using a scattering of Excel files.
  • Given that CRISPR is not only a research tool but a therapeutic modality in and of itself, it is subject to regulatory guidance that is still evolving. This makes traceability and data governance of the utmost importance, an impossible task with simple spreadsheets, scattered analysis files, or free online tools.
  • Though CRISPR is now reaching maturity, its fast growth also means that nomenclature is highly diverse and may differ across or within organizations. When sharing knowledge or integrating data, e.g., when companies work with external partners to bring CRISPR therapies to the clinic, a central repository or data management system with consistent terminology makes the process far more efficient.
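
To illustrate the nomenclature point, a controlled vocabulary can be as simple as a lookup table that maps every spelling in circulation to one canonical term. The sketch below is a hedged Python example. The `SYNONYMS` table and the `normalize` function are hypothetical, and the entries are a small illustrative sample rather than an authoritative vocabulary.

```python
# Hypothetical synonym table: each variant spelling (lowercased) maps to one
# canonical term. A real vocabulary would be curated by the organization.
SYNONYMS = {
    "spcas9": "SpCas9",
    "sp-cas9": "SpCas9",
    "s. pyogenes cas9": "SpCas9",
    "cas12a": "Cas12a",
    "cpf1": "Cas12a",  # Cpf1 is the earlier name for Cas12a
    "grna": "gRNA",
    "guide rna": "gRNA",
}


def normalize(term: str) -> str:
    """Map a free-text term to its canonical form, ignoring case and extra spaces."""
    key = " ".join(term.lower().split())
    try:
        return SYNONYMS[key]
    except KeyError:
        # Unknown terms are rejected so that a curator can review them,
        # instead of silently creating a new spelling.
        raise ValueError(f"unrecognized term {term!r}; add it to the vocabulary") from None
```

Rejecting unknown terms is a design choice in this sketch: it trades a little friction at data entry for a vocabulary that cannot drift unnoticed.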

It should be clear now that spreadsheets and widgets are not enough for enterprise-level research. Like a carpenter who wants to build something quickly yet correctly, a research team needs a sturdy and durable workbench that can withstand big projects and last a long time. It should have all the shelves and drawers needed to organize tools so that they are always within reach.

An ideal workbench is personalized and offers versatile clamping options and adjustable height, to accommodate diverse projects. It might even integrate power tools and rulers for added convenience and efficiency, as well as safety features to prevent injury.

By analogy, anyone leveraging CRISPR in biopharma R&D or embarking on CRISPR therapeutic development needs to adopt a purpose-built, thoroughly tested data management system that can weather organizational change and handle enterprise-level projects. These systems need advanced tracking capabilities, including molecule and batch registration, combined with consistency and uniqueness checks, so that the many CRISPR system “tools” are at the disposal of R&D teams and never confused.
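
As a rough illustration of what registration with consistency and uniqueness checks means in practice, consider the toy in-memory registry below. It is a sketch under stated assumptions, not a real system: the `Registry` class, its method names, and the `GRNA-0001` style identifiers are all hypothetical, and a production system would add persistence, audit trails, and access control.

```python
class Registry:
    """Toy in-memory registry for gRNA sequences and their batches (illustrative only)."""

    def __init__(self) -> None:
        self._by_sequence: dict[str, str] = {}    # sequence -> molecule ID
        self._batches: dict[str, list[str]] = {}  # molecule ID -> batch IDs

    def register_molecule(self, sequence: str) -> str:
        seq = sequence.strip().upper()
        # Consistency check: only nucleotide letters are accepted.
        if not seq or set(seq) - set("ACGTU"):
            raise ValueError(f"invalid sequence: {sequence!r}")
        # Uniqueness check: the same sequence always resolves to the same ID.
        if seq in self._by_sequence:
            return self._by_sequence[seq]
        mol_id = f"GRNA-{len(self._by_sequence) + 1:04d}"
        self._by_sequence[seq] = mol_id
        self._batches[mol_id] = []
        return mol_id

    def register_batch(self, mol_id: str, batch_id: str) -> None:
        if mol_id not in self._batches:
            raise KeyError(f"unknown molecule: {mol_id}")
        # Uniqueness check: a batch ID may be used only once across all molecules.
        if any(batch_id in ids for ids in self._batches.values()):
            raise ValueError(f"batch ID already used: {batch_id}")
        self._batches[mol_id].append(batch_id)

    def batches(self, mol_id: str) -> list[str]:
        return list(self._batches[mol_id])
```

Registering the same guide twice, even with different capitalization or stray whitespace, returns the existing identifier instead of creating a duplicate, which is the behavior a spreadsheet cannot enforce.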

Knowledgeable software and scientific support providers can collaborate with organizations to tailor data systems according to research and business needs, giving them flexibility in terms of workflows. Ideally, data management systems will also integrate capabilities for gRNA prediction and analytical characterization (the “power tools” and “measuring devices”) that are GMP-compliant and suitable for regulatory purposes (the “safety features”).
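
To give a sense of the simplest possible "power tool," the sketch below enumerates candidate gRNA target sites for SpCas9, which recognizes a 20-nucleotide protospacer followed by an NGG motif (the PAM). This is only the first step of guide design and is offered as an assumption-laden illustration: the `find_candidate_sites` function is hypothetical, it scans only the strand it is given, and it does no efficiency or off-target scoring, which real prediction tools do.

```python
def find_candidate_sites(dna: str, guide_len: int = 20) -> list[tuple[int, str, str]]:
    """List (start, protospacer, PAM) for every NGG PAM on the given strand.

    Only the supplied strand is scanned; the reverse complement is ignored
    to keep the example short.
    """
    dna = dna.upper()
    sites = []
    # The last valid start leaves room for the protospacer plus a 3-nt PAM.
    for start in range(len(dna) - guide_len - 2):
        protospacer = dna[start : start + guide_len]
        pam = dna[start + guide_len : start + guide_len + 3]
        if pam[1:] == "GG":  # "N" may be any base
            sites.append((start, protospacer, pam))
    return sites
```

In an integrated system, each candidate returned by a step like this would be registered and tracked like any other molecule, rather than copied by hand from an online tool.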

A full-fledged, mature software system forms an “information workbench” that positions CRISPR research and development teams to be more agile and make more informed decisions, ensuring a smooth passage of CRISPR into a long and productive adulthood.
