Fold proteins with Chai-1
In biology, function follows form quite literally: the physical shapes of proteins dictate their behavior. Measuring those shapes directly is difficult, and first-principles physical simulation is prohibitively expensive.
And so predicting protein shape from content — determining how the one-dimensional chain of amino acids encoded by DNA folds into a 3D object — has emerged as a key application for machine learning and neural networks in biology.
In this example, we demonstrate how to run the open source Chai-1 protein structure prediction model on Modal’s flexible serverless infrastructure. For details on how the Chai-1 model works and what it can be used for, see the authors’ technical report on bioRxiv.
This simple script is meant as a starting point showing how to handle fiddly bits like installing dependencies, loading weights, and formatting outputs so that you can get on with the fun stuff. To experience the full power of Modal, try scaling inference up and running on hundreds or thousands of structures!
Setup
import hashlib
import json
from pathlib import Path
from uuid import uuid4
import modal
here = Path(__file__).parent # the directory of this file
MINUTES = 60 # seconds
app = modal.App(name="example-chai1-inference")
Fold a protein from the command line
The logic for running Chai-1 is encapsulated in the function below, which you can trigger from the command line by running
modal run chai1
This will set up the environment for running Chai-1 inference in Modal’s cloud, run it, and then save the results remotely and locally. The results are returned in the Crystallographic Information File format, which you can render with the online Molstar Viewer.
To see more options, run the command with the --help flag.
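As a rough illustration of the "save locally" step, the sketch below writes a batch of CIF results into an output folder, prefixing each file with a short run ID. The names `make_run_id` and `save_results_locally`, the `results` mapping of file names to bytes, and the `{run_id}-{name}` naming scheme are all hypothetical and not taken from the script. Deriving the ID from `uuid4` and `hashlib` is only one plausible convention.

```python
import hashlib
from pathlib import Path
from uuid import uuid4


def make_run_id() -> str:
    """Return a short random identifier (hypothetical naming convention)."""
    return hashlib.sha256(uuid4().bytes).hexdigest()[:8]


def save_results_locally(
    results: dict[str, bytes], output_dir: Path, run_id: str
) -> list[Path]:
    """Write each (file name, contents) pair into output_dir, prefixed by run_id."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, content in sorted(results.items()):
        path = output_dir / f"{run_id}-{name}"
        path.write_bytes(content)  # CIF files are plain text, stored here as bytes
        written.append(path)
    return written
```

Prefixing every file with the run ID keeps the outputs of repeated runs from overwriting each other, and the resulting `.cif` files can then be opened in a viewer such as Molstar.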
Installing Chai-1 Python dependencies on Modal
Code running on Modal runs inside containers built from container images that include that code’s dependencies.
Because Modal images include GPU drivers by default, installation of higher-level packages like chai_lab that require GPUs is painless.
Here, we do it with one line, using the uv package manager for extra speed.
image = modal.Image.debian_slim(python_version="3.12").run_commands(
    "uv pip install --system --compile-bytecode chai_lab==0.5.0 hf_transfer==0.1.8"
)
Storing Chai-1 model weights on Modal with Volumes
Not all “dependencies” belong in a container image. Chai-1, for example, depends on the weights of several models.
Rather than loading them dynamically at run-time (which would add several minutes of GPU time to each inference), or installing them into the image (which would require they be re-downloaded any time the other dependencies changed), we load them onto a Modal Volume. A Modal Volume is a file system that all of your code running on Modal (or elsewhere!) can access. For more on storing model weights on Modal, see this guide.
chai_model_volume = (
    modal.Volume.from_name(  # create distributed filesystem for model weights
        "chai1-models",
        create_if_missing=True,
    )
)
models_dir = Path("/models/chai1")
The details of how we handle the download here (e.g. running concurrently for extra speed) are in the Addenda.
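To give a feel for what a concurrent, download-once approach can look like, here is a minimal sketch using only the standard library. It is not the code from the Addenda: `download_missing`, its `fetch` callable, and the `max_workers` default are hypothetical names and choices made for illustration. The idea is to skip any file already present in the target directory and fetch the rest in parallel threads.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable


def download_missing(
    filenames: list[str],
    target_dir: Path,
    fetch: Callable[[str], bytes],  # hypothetical: returns the bytes for one file
    max_workers: int = 4,
) -> list[str]:
    """Fetch files absent from target_dir concurrently; return the names fetched."""
    target_dir.mkdir(parents=True, exist_ok=True)
    missing = [name for name in filenames if not (target_dir / name).exists()]

    def _download(name: str) -> str:
        dest = target_dir / name
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_bytes(fetch(name))
        return name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps input order; a worker's exception is re-raised
        # when its result is read
        return list(pool.map(_download, missing))
```

On Modal, `target_dir` would correspond to the path where the Volume is mounted (`models_dir` above) and `fetch` would wrap an HTTP download. Because files that already exist are skipped, repeated runs only pay the download cost once.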