Tutorial: optimizing an antibody against a ligand
Sometimes our starting docked structure is not very good, and the binder may lose the target after a few nanoseconds. This is more frequent when simulating docked ligands. locuaz supports the addition of positional restraints so users can get started with their optimizations, until a better binder is found.
In this tutorial we will use what we learnt in previous tutorials, plus some new tricks, to optimize a nanobody against a tyrosol molecule, like the one in Figure 1.
As usual, activate your locuaz environment and get the necessary files.
Necessary files
As always, we’re going to need a starting PDB and, as in Tutorial: using Tleap topologies, we’ll also need a set of Tleap-related files in order to rebuild the topology of our system after each mutation.
tir.pdb
: the PDB file of the pre-equilibrated complex. As usual, target chains go first. Also, remember that since we are using Tleap, residues should be numbered in a continuous progression.
tleap
: the Tleap directory, with the script to build the topology of the system each time a mutation is performed. Remember to avoid solvating and creating a box in this file, since the solvent will already be present. Another thing to notice is the usage of addions. We keep these commands since Tleap will be responsible for keeping the system neutral. Avoid using addions2, since we need Tleap to replace water molecules each time it adds ions, to keep the N of the system constant. You’ll also find lig.frcmod and lig.prep, the auxiliary tyrosol parameters.
config_ligand.yaml
: the input file to run the protocol.
mdp
: directory with the minimization, NVT and NPT GROMACS input files.
If you are finding it hard to get a PDB of your system with chainID information, check the FAQ.
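The continuous-numbering requirement on the starting PDB can be checked before launching a run. The following is a minimal sketch, not part of locuaz: the function names, the `skip_resnames` parameter and the choice to ignore water residues are assumptions made for illustration. It relies only on the fixed-column layout of PDB `ATOM`/`HETATM` records.

```python
# Hypothetical helper, not part of locuaz: checks residue numbering in a PDB.
WATER_NAMES = frozenset({"WAT", "HOH", "SOL"})  # assumed solvent residue names


def residue_numbers(pdb_lines, skip_resnames=WATER_NAMES):
    """Return the residue numbers in order of appearance, one entry per residue."""
    numbers = []
    last_key = None
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        res_name = line[17:20].strip()
        if res_name in skip_resnames:
            continue
        # Fixed PDB columns: chainID is column 22, resSeq is columns 23-26.
        key = (line[21], line[22:26])
        if key != last_key:
            numbers.append(int(line[22:26]))
            last_key = key
    return numbers


def is_continuous(numbers):
    """True when every residue number is exactly one more than the previous one."""
    return all(b == a + 1 for a, b in zip(numbers, numbers[1:]))
```

For example, `is_continuous(residue_numbers(open("tir.pdb")))` should return `True` if numbering does not restart when moving from the target chain to the binder chain. Ions or other trailing residues may need to be added to `skip_resnames`, depending on how your file is laid out.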
The configuration file
We will focus on the new options that didn’t show up in the previous tutorials.
protocol
protocol:
    epochs: 10
    new_branches: 3
    constant_width: false
    memory_size: 4
    failed_memory_size: 6
constant_width
: when this value is set to false, new_branches means the number of branches (new mutations) that are obtained from each previous branch. Check Platform DAGs for more info.
We’re setting new_branches to 3 because we will input 4 bins in the next section.
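To get a feeling for how quickly a variable-width DAG can grow, here is a small sketch. It assumes a single starting branch and that every branch survives pruning, so it is only an upper bound; the function name is hypothetical and not part of locuaz.

```python
def max_branches_per_epoch(new_branches, epochs):
    """Upper bound on the number of branches at each epoch when every
    branch spawns `new_branches` children and none is pruned."""
    return [new_branches ** epoch for epoch in range(1, epochs + 1)]


print(max_branches_per_epoch(3, 4))  # [3, 9, 27, 81]
```

In practice the pruner discards most branches at every epoch, so the real numbers are much lower, but the bound shows why running several simulations per GPU becomes useful with this kind of protocol.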
creation
creation:
    sites: 2
    sites_interfacing: true
    sites_interfacing_probe_radius: 3.0
    sites_probability: uniform
    aa_bins: ["CDEST", "AGIMLV", "PFWY", "RNQHK"]
    aa_bins_criteria: without
    aa_probability: uniform
As we said, we’re dealing with a rather small system, so we’re going to mutate 2 sites at the same time. On each branch we’re going to try 3 different amino acids from 3 different bins, so we can sample the solution space quickly.
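The bin selection described above can be sketched as follows. This is an illustration under an assumption, not locuaz’s implementation: it reads `aa_bins_criteria: without` as "exclude the bin that contains the current residue", which leaves 3 of the 4 bins available. The function names are hypothetical.

```python
import random

AA_BINS = ["CDEST", "AGIMLV", "PFWY", "RNQHK"]


def candidate_bins(current_aa, bins=AA_BINS):
    """Bins that do not contain the current amino acid (assumed 'without' reading)."""
    return [b for b in bins if current_aa not in b]


def pick_mutations(current_aa, n, rng):
    """Pick one amino acid, uniformly, from each of `n` distinct candidate bins."""
    chosen_bins = rng.sample(candidate_bins(current_aa), n)
    return [rng.choice(b) for b in chosen_bins]


# A serine (bin "CDEST") can be mutated to residues from the other 3 bins.
print(candidate_bins("S"))  # ['AGIMLV', 'PFWY', 'RNQHK']
print(pick_mutations("S", 3, random.Random(0)))
```

Under this reading, 4 bins and `new_branches: 3` match: each new branch draws its amino acid from a different remaining bin.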
mutation
mutation:
    mutator: dlpr
    allowed_nonstandard_residues: [UNL]
allowed_nonstandard_residues
: locuaz cleans up the PDBs that go into the mutator program, so they don’t have any non-protein residues that may cause the mutator to error. We have to add our target ligand as an exception to this rule so that it’s included in the side-chain optimization that’s performed after inserting a mutation.
pruning
pruning:
    pruner: metropolis
    kT: 0.593
pruner: metropolis
: uses the Metropolis-Hastings criterion to decide whether a new branch passes to the next epoch. Does not work with multiple scorers.
kT
: product of the Boltzmann constant and a temperature. The current value, in kcal/mol, corresponds to a temperature of approximately 300 K.
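As a reference for what a Metropolis-style acceptance test looks like, here is a minimal sketch. It is not locuaz’s code: the function names are hypothetical, and it assumes a single score per branch where lower (more negative) values are better, with scores and kT both in kcal/mol.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def thermal_energy(temperature_kelvin):
    """kT in kcal/mol for a given temperature in kelvin."""
    return K_B * temperature_kelvin


def metropolis_accept(old_score, new_score, kt, rng=random):
    """Always accept an improvement; accept a worse score with
    probability exp(-delta / kT). Lower scores are assumed to be better."""
    delta = new_score - old_score
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / kt)
```

`thermal_energy(300)` gives about 0.596 kcal/mol, close to the 0.593 used in the config. Raising kT makes the pruner more permissive towards branches that score slightly worse than their parent; lowering it makes the search greedier.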
md
md:
    gmx_mdrun: "gmx mdrun"
    mdp_names:
        min_mdp: min.mdp
        nvt_mdp: short_nvt.mdp
        npt_mdp: short_npt_posres.mdp
    mps: true
    numa_regions: 1
    use_tleap: true
    maxwarn: 2
    box_type: octahedron
    npt_restraints:
        posres: 50
        posres_water: 50
mps
: when set to true, locuaz will use the NVIDIA Multi-Process Service (MPS) to run multiple MD simulations per GPU. This usually decreases the speed of each run, but considerably increases the total throughput. It is useful with a variable-width DAG protocol, which may make the number of branches explode. Check this blog post for more info.
numa_regions
: when using MPS, locuaz will automatically set these options: ngpus, mpi_procs, omp_procs and pinoffsets. To do this effectively, it needs to know the CPU affinity of each GPU, which should follow the NUMA layout. Check the FAQ if you don’t know how many regions you have.
npt_restraints
: this is where we set the value for our positional restraints. Remember also to define the -DPOSRES and -DPOSRES_WATER flags in your NPT mdp file so these take effect.
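Since a missing `define` line silently disables the restraints, it can be worth checking the NPT mdp file before starting. The sketch below is a hypothetical helper, not part of locuaz; it only assumes the standard GROMACS mdp syntax (`key = value` pairs, `;` for comments) and the `define` option.

```python
def mdp_defines(mdp_text):
    """Return the set of preprocessor flags found on `define` lines of an mdp file."""
    flags = set()
    for raw in mdp_text.splitlines():
        line = raw.split(";", 1)[0]  # drop mdp comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().lower() == "define":
            flags.update(value.split())
    return flags


# Example mdp fragment, written here only for illustration.
example = """
; short NPT run with positional restraints
define      = -DPOSRES -DPOSRES_WATER
integrator  = md
"""
missing = {"-DPOSRES", "-DPOSRES_WATER"} - mdp_defines(example)
print(missing)  # set()
```

To check the real file, pass `open("mdp/short_npt_posres.mdp").read()` instead of `example`, adjusting the path to wherever your mdp directory lives.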
scoring
scoring:
    scorers: [autodockvina]
    allowed_nonstandard_residues: [UNL]
    nthreads: 6
    mpi_procs: 1
allowed_nonstandard_residues
: when scoring, the NPT trajectory is split into “sanitized” PDB frames. That is, they are treated to make sure the scorers don’t error out when meeting unexpected artifacts, like non-standard residues. All scorers but gmxmmpbsa use these PDBs. Since we want to score the interaction between our nanobody and a tyrosol molecule, we need to add it to this list of residue names, so locuaz doesn’t remove it from the PDB frames.
Running the protocol
Nothing new here; we just run the protocol with our config file:
mamba activate locuaz
python /home/user/locuaz/locuaz/protocol.py config_ligand.yaml
It’s educational to look at the DAG with the branch names that locuaz draws. Given the high degree of branching, it’s difficult to see the whole DAG, but check Figure 2 for a part of it.