Broad Institute looks to Intel, Google Cloud to upgrade genomic data processing

After half a decade of working together to develop software to support genomic research, Intel and the Broad Institute of MIT and Harvard are taking their relationship to the next level.

In collaboration with Google, another of the Broad Institute’s long-standing partners, the duo has upgraded the institute’s genomic data processing software to run on Google Cloud’s second-generation, or N2, virtual machines. The N2 instances can store more data than their N1 predecessors and process it faster, thanks to the addition of Intel’s Xeon Scalable Processor technology to the platform.

In benchmarking tests comparing the run times and processing abilities of the Broad Institute’s genomic analyses on its N1 and N2 machines, the upgraded processors were able to perform an entire genetic variant identification workflow in just under 25 hours from start to finish, compared to almost 33 hours on the N1 machines. Additionally, the median cost of performing that workflow dropped from $6 per whole genome sequencing sample for the N1 instances to $5 per sample using the N2 machines.

The institute estimated that, in the case of some of its genomics workflows, the software and hardware upgrades are saving its researchers about 34% in data processing costs and cutting their time spent performing those analyses by around 25%.
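As a rough check on the benchmark figures, the percentage reductions can be worked out directly. The short Python sketch below uses the rounded numbers quoted in this article (about 33 and 25 hours, $6 and $5 per sample) as assumed inputs, since the exact benchmark values are not given here. The helper name `percent_reduction` is illustrative only and is not part of any Broad Institute, Intel or Google tool.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


# Rounded figures quoted in the article; exact benchmark values are assumed.
time_saved = percent_reduction(33.0, 25.0)  # hours, N1 vs. N2
cost_saved = percent_reduction(6.0, 5.0)    # dollars per sample, N1 vs. N2

print(f"Run time reduction: about {time_saved:.0f}%")
print(f"Cost reduction per sample: about {cost_saved:.0f}%")
```

On these rounded inputs, the run time falls by roughly a quarter, in line with the institute's estimate, while the median per-sample cost falls by about a sixth. The larger 34% cost figure is the institute's estimate for some of its workflows, so it need not match the median figure for this single benchmark.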

“Speeding up the execution of high-load workflows means researchers get their analysis results that much sooner,” Geraldine Van der Auwera, director of outreach and communications for the Broad Institute’s data sciences platform, told Fierce Medtech.

“That can have all sorts of benefits depending on what they’re working on: They can iterate more quickly on trying out new ideas if they’re in a phase of experimentation—for example, if they’re developing a new type of analysis method; they can tackle studies with larger numbers of participants, which in many cases is necessary to produce strong statistical confidence and uncover novel insights; and they can more quickly deliver answers to questions with clinical impact,” Van der Auwera said.

She added, “Lowering the cost of data processing and analysis is also very important for enabling larger studies, as well as making these types of analyses financially accessible to more researchers.”

The improvements are also available to the researchers around the world who use the Broad Institute’s Terra platform. Initially developed with the help of Google’s life science-focused sibling Verily, Terra is an open-source biomedical research platform currently used by almost 20,000 scientists for analysis and collaboration. Researchers with approved credentials can also access and analyze study data representing nearly 2 million participants.

Microsoft joined the initiative at the beginning of this year. It became an equal partner with Verily and the institute in Terra’s development, contributing its cloud computing services to help host what the platform’s makers estimate will soon be dozens of exabytes (tens of billions of gigabytes) of healthcare data.

“These upgrades empower researchers who use the Terra platform to take advantage of Intel’s hardware-driven accelerations and cost reductions with minimal effort while enjoying the advantages of using the cloud for accessing data, running analyses at scale and collaborating with others around the world,” Van der Auwera said.

“The most exciting part, in my opinion, is that through Terra, anyone can now take advantage of these hardware upgrades to boost their genomics research without having to invest effort into infrastructure development and maintenance,” she continued. “Our collaboration with Intel has always been predicated on the goal of empowering the entire genomics ecosystem, not just the Broad Institute’s own research teams. With this new set of upgrades, we’re taking that goal to the next level: We’re not only sharing the software, we’re also putting the actual capabilities directly in people’s hands through Terra.”

Intel and the institute originally teamed up in 2016, striking a $25 million, five-year partnership to build software platforms that help biopharmaceutical developers, academic researchers and healthcare providers share and process genomic data more easily.

Since then, they’ve worked together to integrate Intel’s library of genetic data processing workflows into the Broad Institute’s Genome Analysis Toolkit. They also launched the Intel-Broad Center for Genomic Data Engineering, through which biomedical researchers and software engineers can build new analytical tools and workflows for their peers to use.
