# Installation and Running Your First Prediction You will need a machine running Linux; AlphaFold 3 does not support other operating systems. Full installation requires up to 1 TB of disk space to keep genetic databases (SSD storage is recommended) and an NVIDIA GPU with Compute Capability 8.0 or greater (GPUs with more memory can predict larger protein structures). We have verified that inputs with up to 5,120 tokens can fit on a single NVIDIA A100 80 GB, or a single NVIDIA H100 80 GB. We have verified numerical accuracy on both NVIDIA A100 and H100 GPUs. Especially for long targets, the genetic search stage can consume a lot of RAM – we recommend running with at least 64 GB of RAM. We provide installation instructions for a machine with an NVIDIA A100 80 GB GPU and a clean Ubuntu 22.04 LTS installation, and expect that these instructions should aid others with different setups. If you are installing locally outside of a Docker container, please ensure CUDA, cuDNN, and JAX are correctly installed; the [JAX installation documentation](https://jax.readthedocs.io/en/latest/installation.html#nvidia-gpu) is a useful reference for this case. Please note that the Docker container requires that the host machine has CUDA 12.6 installed. The instructions provided below describe how to: 1. Provision a machine on GCP. 1. Install Docker. 1. Install NVIDIA drivers for an A100. 1. Obtain genetic databases. 1. Obtain model parameters. 1. Build the AlphaFold 3 Docker container or Singularity image. ## Provisioning a Machine Clean Ubuntu images are available on Google Cloud, AWS, Azure, and other major platforms. Using an existing Google Cloud project, we provisioned a new machine: * We recommend using `--machine-type a2-ultragpu-1g` but feel free to use `--machine-type a2-highgpu-1g` for smaller predictions. * If desired, replace `--zone us-central1-a` with a zone that has quota for the machine you have selected. See [gpu-regions-zones](https://cloud.google.com/compute/docs/gpus/gpu-regions-zones). ```sh gcloud compute instances create alphafold3 \ --machine-type a2-ultragpu-1g \ --zone us-central1-a \ --image-family ubuntu-2204-lts \ --image-project ubuntu-os-cloud \ --maintenance-policy TERMINATE \ --boot-disk-size 1000 \ --boot-disk-type pd-balanced ``` This provisions a bare Ubuntu 22.04 LTS image on an [A2 Ultra](https://cloud.google.com/compute/docs/accelerator-optimized-machines#a2-vms) machine with 12 CPUs, 170 GB RAM, 1 TB disk and NVIDIA A100 80 GB GPU attached. We verified the following installation steps from this point. ## Installing Docker These instructions are for rootless Docker. ### Installing Docker on Host Note these instructions only apply to Ubuntu 22.04 LTS images, see above. Add Docker's official GPG key. Official Docker instructions are [here](https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository). The commands we ran are: ```sh sudo apt-get update sudo apt-get install ca-certificates curl sudo install -m 0755 -d /etc/apt/keyrings sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc sudo chmod a+r /etc/apt/keyrings/docker.asc ``` Add the repository to apt sources: ```sh echo \ "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \ $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \ sudo tee /etc/apt/sources.list.d/docker.list > /dev/null sudo apt-get update sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin sudo docker run hello-world ``` ### Enabling Rootless Docker Official Docker instructions are [here](https://docs.docker.com/engine/security/rootless/#distribution-specific-hint). The commands we ran are: ```sh sudo apt-get install -y uidmap systemd-container sudo machinectl shell $(whoami)@ /bin/bash -c 'dockerd-rootless-setuptool.sh install && sudo loginctl enable-linger $(whoami) && DOCKER_HOST=unix:///run/user/1001/docker.sock docker context use rootless' ``` ## Installing GPU Support ### Installing NVIDIA Drivers Official Ubuntu instructions are [here](https://documentation.ubuntu.com/server/how-to/graphics/install-nvidia-drivers/). The commands we ran are: ```sh sudo apt-get -y install alsa-utils ubuntu-drivers-common sudo ubuntu-drivers install sudo nvidia-smi --gpu-reset nvidia-smi # Check that the drivers are installed. ``` Accept the "Pending kernel upgrade" dialog if it appears. You will need to reboot the instance with `sudo reboot now` to reset the GPU if you see the following warning: ```text NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running. ``` Proceed only if `nvidia-smi` has a sensible output. ### Installing NVIDIA Support for Docker Official NVIDIA instructions are [here](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html). The commands we ran are: ```sh curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \ && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \ sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \ sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list sudo apt-get update sudo apt-get install -y nvidia-container-toolkit nvidia-ctk runtime configure --runtime=docker --config=$HOME/.config/docker/daemon.json systemctl --user restart docker sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place ``` Check that your container can see the GPU: ```sh docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu22.04 nvidia-smi ``` Example output: ```text Mon Nov 11 12:00:00 2024 +-----------------------------------------------------------------------------------------+ | NVIDIA-SMI 550.120 Driver Version: 550.120 CUDA Version: 12.6 | |-----------------------------------------+------------------------+----------------------+ | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | | | | MIG M. | |=========================================+========================+======================| | 0 NVIDIA A100-SXM4-80GB Off | 00000000:00:05.0 Off | 0 | | N/A 34C P0 51W / 400W | 1MiB / 81920MiB | 0% Default | | | | Disabled | +-----------------------------------------+------------------------+----------------------+ +-----------------------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=========================================================================================| | No running processes found | +-----------------------------------------------------------------------------------------+ ``` ## Obtaining AlphaFold 3 Source Code Install `git` and download the AlphaFold 3 repository: ```sh git clone https://github.com/google-deepmind/alphafold3.git ``` ## Obtaining Genetic Databases This step requires `wget` and `zstd` to be installed on your machine. On Debian-based systems install them by running `sudo apt install wget zstd`. AlphaFold 3 needs multiple genetic (sequence) protein and RNA databases to run: * [BFD small](https://bfd.mmseqs.com/) * [MGnify](https://www.ebi.ac.uk/metagenomics/) * [PDB](https://www.rcsb.org/) (structures in the mmCIF format) * [PDB seqres](https://www.rcsb.org/) * [UniProt](https://www.uniprot.org/uniprot/) * [UniRef90](https://www.uniprot.org/help/uniref) * [NT](https://www.ncbi.nlm.nih.gov/nucleotide/) * [RFam](https://rfam.org/) * [RNACentral](https://rnacentral.org/) We provide a bash script `fetch_databases.sh` that can be used to download and set up all of these databases. This process takes around 45 minutes when not installing on local SSD. We recommend running the following in a `screen` or `tmux` session as downloading and decompressing the databases takes some time. ```sh cd alphafold3 # Navigate to the directory with cloned AlphaFold 3 repository. ./fetch_databases.sh [] ``` This script downloads the databases from a mirror hosted on GCS, with all versions being the same as used in the AlphaFold 3 paper, to the directory ``. If not specified, the default `` is `$HOME/public_databases`. :ledger: **Note: The download directory `` should *not* be a subdirectory in the AlphaFold 3 repository directory.** If it is, the Docker build will be slow as the large databases will be copied during the image creation. :ledger: **Note: The total download size for the full databases is around 252 GB and the total size when unzipped is 630 GB. Please make sure you have sufficient hard drive space, bandwidth, and time to download. We recommend using an SSD for better genetic search performance.** :ledger: **Note: If the download directory and datasets don't have full read and write permissions, it can cause errors with the MSA tools, with opaque (external) error messages. Please ensure the required permissions are applied, e.g. with the `sudo chmod 755 --recursive ` command.** Once the script has finished, you should have the following directory structure: ```sh mmcif_files/ # Directory containing ~200k PDB mmCIF files. bfd-first_non_consensus_sequences.fasta mgy_clusters_2022_05.fa nt_rna_2023_02_23_clust_seq_id_90_cov_80_rep_seq.fasta pdb_seqres_2022_09_28.fasta rfam_14_9_clust_seq_id_90_cov_80_rep_seq.fasta rnacentral_active_seq_id_90_cov_80_linclust.fasta uniprot_all_2021_04.fa uniref90_2022_05.fa ``` Optionally, after the script finishes, you may want copy databases to an SSD. You can use theses two scripts: * `src/scripts/gcp_mount_ssd.sh []` Mounts and formats an unmounted GCP SSD drive to the specified path. It will skip the either step if the disk is either already formatted or already mounted. The default `` is `/mnt/disks/ssd`. * `src/scripts/copy_to_ssd.sh [] []` this will copy as many files that it can fit on to the SSD. The default `` is `$HOME/public_databases`, and must match the path used in the `fetch_databases.sh` command above, and the default `` is `/mnt/disks/ssd/public_databases`. ## Obtaining Model Parameters To request access to the AlphaFold 3 model parameters, please complete [this form](https://forms.gle/svvpY4u2jsHEwWYS6). Access will be granted at Google DeepMind’s sole discretion. We will aim to respond to requests within 2–3 business days. You may only use AlphaFold 3 model parameters if received directly from Google. Use is subject to these [terms of use](https://github.com/google-deepmind/alphafold3/blob/main/WEIGHTS_TERMS_OF_USE.md). Once access has been granted, download the model parameters to a directory of your choosing, referred to as `` in the following instructions. As with the databases, this should *not* be a subdirectory in the AlphaFold 3 repository directory. ## Building the Docker Container That Will Run AlphaFold 3 Then, build the Docker container. This builds a container with all the right python dependencies: ```sh docker build -t alphafold3 -f docker/Dockerfile . ``` Create an input JSON file, using either the example in the [README](https://github.com/google-deepmind/alphafold3?tab=readme-ov-file#installation-and-running-your-first-prediction) or a [custom input](https://github.com/google-deepmind/alphafold3/blob/main/docs/input.md), and place it in a directory, e.g. `$HOME/af_input`. You can now run AlphaFold 3! ```sh docker run -it \ --volume $HOME/af_input:/root/af_input \ --volume $HOME/af_output:/root/af_output \ --volume :/root/models \ --volume :/root/public_databases \ --gpus all \ alphafold3 \ python run_alphafold.py \ --json_path=/root/af_input/fold_input.json \ --model_dir=/root/models \ --output_dir=/root/af_output ``` where `$HOME/af_input` is the directory containing the input JSON file; `$HOME/af_output` is the directory where the output will be written to; and `` and `` are the directories containing the databases and model parameters. The values of these directories must match the directories used in previous steps for downloading databases and model weights, and for the input file. :ledger: Note: You may also need to create the output directory, `$HOME/af_output` directory before running the `docker` command and make it and the input directory writable from the docker container, e.g. by running `chmod 755 $HOME/af_input $HOME/af_output`. In most cases `docker` and `run_alphafold.py` will create the output directory if it does not exist. :ledger: **Note: In the example above the databases have been placed on the persistent disk, which is slow.** If you want better genetic and template search performance, make sure all databases are placed on a local SSD. If you have some databases on an SSD in the `` directory and some databases on a slower disk in the `` directory, you can mount both directories and specify `db_dir` multiple times. This will enable the fast access to databases with a fallback to the larger, slower disk: ```sh docker run -it \ --volume $HOME/af_input:/root/af_input \ --volume $HOME/af_output:/root/af_output \ --volume :/root/models \ --volume :/root/public_databases \ --volume :/root/public_databases_fallback \ --gpus all \ alphafold3 \ python run_alphafold.py \ --json_path=/root/af_input/fold_input.json \ --model_dir=/root/models \ --db_dir=/root/public_databases \ --db_dir=/root/public_databases_fallback \ --output_dir=/root/af_output ``` If you get an error like the following, make sure the models and data are in the paths (flags named `--volume` above) in the correct locations. ``` docker: Error response from daemon: error while creating mount source path '/srv/alphafold3_data/models': mkdir /srv/alphafold3_data/models: permission denied. ``` `run_alphafold.py` supports many flags for controlling performance, running on multiple input files, specifying external binary paths, and more. See ```sh docker run alphafold3 python run_alphafold.py --help ``` for more information. ## Running Using Singularity Instead of Docker You may prefer to run AlphaFold 3 within Singularity. You'll still need to *build* the Singularity image from the Docker container. Afterwards, you will not have to depend on Docker (at structure prediction time). ### Install Singularity Official Singularity instructions are [here](https://docs.sylabs.io/guides/3.3/user-guide/installation.html). The commands we ran are: ```sh wget https://github.com/sylabs/singularity/releases/download/v4.2.1/singularity-ce_4.2.1-jammy_amd64.deb sudo dpkg --install singularity-ce_4.2.1-jammy_amd64.deb sudo apt-get install -f ``` ### Build the Singularity Container From the Docker Image After building the *Docker* container above with `docker build -t`, start a local Docker registry and upload your image `alphafold3` to it. Singularity's instructions are [here](https://github.com/apptainer/singularity/issues/1537). The commands we ran are: ```sh docker run -d -p 5000:5000 --restart=always --name registry registry:2 docker tag alphafold3 localhost:5000/alphafold3 docker push localhost:5000/alphafold3 ``` Then build the Singularity container: ```sh SINGULARITY_NOHTTPS=1 singularity build alphafold3.sif docker://localhost:5000/alphafold3:latest ``` You can confirm your build by starting a shell and inspecting the environment. For example, you may want to ensure the Singularity image can access your GPU. You may want to restart your computer if you have issues with this. ```sh singularity exec --nv alphafold3.sif sh -c 'nvidia-smi' ``` You can now run AlphaFold 3! ```sh singularity exec --nv alphafold3.sif <> ``` For example: ```sh singularity exec \ --nv \ --bind $HOME/af_input:/root/af_input \ --bind $HOME/af_output:/root/af_output \ --bind :/root/models \ --bind :/root/public_databases \ alphafold3.sif \ python run_alphafold.py \ --json_path=/root/af_input/fold_input.json \ --model_dir=/root/models \ --db_dir=/root/public_databases \ --output_dir=/root/af_output ``` Or with some databases on SSD in location ``: ```sh singularity exec \ --nv \ --bind $HOME/af_input:/root/af_input \ --bind $HOME/af_output:/root/af_output \ --bind :/root/models \ --bind :/root/public_databases \ --bind :/root/public_databases_fallback \ alphafold3.sif \ python run_alphafold.py \ --json_path=/root/af_input/fold_input.json \ --model_dir=/root/models \ --db_dir=/root/public_databases \ --db_dir=/root/public_databases_fallback \ --output_dir=/root/af_output ```