NVIDIA, Evozyne Create Generative AI Model for Proteins

Using a pretrained AI model from NVIDIA, startup Evozyne created two proteins with significant potential in healthcare and clean energy.

A joint paper released today describes the process and the biological building blocks it produced. One aims to cure a congenital disease; the other is designed to consume carbon dioxide to reduce global warming.

Initial results show a new way to accelerate drug discovery and more.

“It’s been really encouraging that even in this first round the AI model has produced synthetic proteins as good as naturally occurring ones,” said Andrew Ferguson, Evozyne’s co-founder and a co-author of the paper. “That tells us it’s learned nature’s design rules correctly.”

A Transformational AI Model

Evozyne used NVIDIA’s implementation of ProtT5, a transformer model that’s part of NVIDIA BioNeMo, a software framework and service for creating AI models for healthcare.

“BioNeMo really gave us everything we needed to support model training and then run jobs with the model very inexpensively — we could generate millions of sequences in just a few seconds,” said Ferguson, a molecular engineer working at the intersection of chemistry and machine learning.

The model lies at the heart of Evozyne’s process, called ProT-VAE. It’s a workflow that combines BioNeMo with a variational autoencoder that acts as a filter.

“Using large language models combined with variational autoencoders to design proteins was not on anybody’s radar just a few years ago,” he said.

Model Learns Nature’s Ways

Like a student reading a book, NVIDIA’s transformer model reads sequences of amino acids in millions of proteins. Using the same techniques neural networks employ to understand text, it learned how nature assembles these powerful building blocks of biology.

The model then predicted how to assemble new proteins suited for functions Evozyne wants to address.

“The technology is enabling us to do things that were pipe dreams 10 years ago,” he said.

A Sea of Possibilities

Machine learning helps navigate the astronomical number of possible protein sequences, then efficiently identifies the most useful ones.
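To give a rough sense of that scale, here is a minimal Python sketch. It assumes the 20 standard amino acids and a hypothetical protein length of 100 residues; neither number comes from the paper, and the function name is illustrative.

```python
# Assumption: proteins are built from the 20 standard amino acids.
NUM_AMINO_ACIDS = 20


def sequence_space_size(length: int) -> int:
    """Return the number of distinct amino acid sequences of a given length."""
    if length < 0:
        raise ValueError("length must be non-negative")
    # Each position can independently hold any of the 20 amino acids.
    return NUM_AMINO_ACIDS ** length


if __name__ == "__main__":
    # Hypothetical length chosen only for illustration.
    size = sequence_space_size(100)
    digits = len(str(size))
    print(f"A 100-residue protein has a {digits}-digit number of possible sequences")
```

Even at this modest length, exhaustively testing every sequence is out of reach, which is why a model that proposes promising candidates is useful.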

The traditional method of engineering proteins, called directed evolution, uses a slow, hit-or-miss approach. It typically changes only a few amino acids in a sequence at a time.

Image caption: Evozyne’s ProT-VAE process uses a powerful transformer model in NVIDIA BioNeMo to generate useful proteins for drug discovery and energy sustainability.

By contrast, Evozyne’s approach can alter half or more of the amino acids in a protein in a single round. That’s the equivalent of making hundreds of mutations.
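As a rough illustration of that arithmetic, the Python sketch below counts how many positions differ between two equal-length sequences. The sequences and function names are made up for illustration and are not from the paper; real designs may also differ in length, which this sketch does not handle.

```python
def count_substitutions(original: str, designed: str) -> int:
    """Count positions where two equal-length sequences differ."""
    if len(original) != len(designed):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(original, designed))


def fraction_changed(original: str, designed: str) -> float:
    """Return the share of positions that were altered."""
    if not original:
        raise ValueError("sequences must be non-empty")
    return count_substitutions(original, designed) / len(original)


if __name__ == "__main__":
    # Hypothetical 10-residue sequences in one-letter amino acid code.
    original = "MKTAYIAKQR"
    designed = "MKSAYLAEQK"
    changed = count_substitutions(original, designed)
    share = fraction_changed(original, designed)
    print(f"{changed} of {len(original)} residues changed ({share:.0%})")
```

By the same arithmetic, changing half the residues of a hypothetical 400-residue protein would mean 200 substitutions in a single round.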

“We’re taking huge jumps which allows us to explore proteins never seen before that have new and useful functions,” he said.

Using the new process, Evozyne plans to build a range of proteins to fight diseases and climate change.

Slashing Training Time, Scaling Models

“NVIDIA’s been an incredible partner on this work,” he said.

“They scaled jobs to multiple GPUs to speed up training,” said Joshua Moller, a data scientist at Evozyne. “We were getting through entire datasets every minute.”

That reduced the time to train large AI models from months to a week. “It allowed us to train models — some with billions of trainable parameters — that just would not be possible otherwise,” Ferguson said.

Much More to Come

The horizon for AI-accelerated protein engineering is wide.

“The field is moving incredibly quickly, and I’m really excited to see what comes next,” he said, noting the recent rise of diffusion models.

“Who knows where we will be in five years’ time.”

Sign up for early access to NVIDIA BioNeMo to see how it can accelerate your applications.
