Restraints
Best practice guide
As you probably saw in the previous step on structure preparation, there are many ways to obtain structures of the molecules you want to dock. The next step is to define how you expect these molecules to interact. HADDOCK is an information-driven tool, which means that the more information about binding you have, the more meaningful your results will be. Based on the available information we distinguish between the following options:
- What information about binding is available?
- Complementary software related to restraints for HADDOCK
What information about binding is available?

1.) Information about the interface is available
Unambiguous Interaction restraints
If your predictions are highly reliable and you wish to have all of them applied during docking, define them as unambiguous restraints (using the unambig_fname parameter).
Unambiguous restraints are not subject to random removal, so all of them are applied in every docking trial.
These can be for example:
- chain-break restraints generated by the haddock3-restraints restrain_bodies command
- template-derived pairwise distance restraints (tutorial)
- MS crosslink data (tutorial)
- cryo-EM connectivity data (tutorial)
Ambiguous Interaction Restraints (AIRs)
Nevertheless, as in life, in science one also needs to be somewhat critical of the data one works with.
If you are not 100% sure about the interaction information and want to be cautious when incorporating it into your docking, use ambiguous interaction restraints (using the ambig_fname parameter).
Here, for each docking trial, a fraction of these restraints is randomly removed, so that each trial satisfies a different subset of the predefined restraints and the sampling is wider.
Thus, if some of the restraints are artifacts, they can be filtered out when the complex satisfying them is unfavorable.
You can tune this random removal by modifying the npart parameter, or turn it off by setting randremoval=false.
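The effect of random removal can be illustrated with a short sketch. It assumes that each docking trial discards roughly one out of npart partitions of the ambiguous restraints (so npart = 2 would drop about half). The helper random_removal is hypothetical and only mimics the idea; it is not HADDOCK's implementation.

```python
import random


def random_removal(restraints, npart=2, rng=None):
    """Return the restraints kept for one trial after dropping ~1/npart of them.

    Assumption (not taken from the HADDOCK code): one of `npart` equally
    sized partitions of the restraints is discarded in each trial.
    """
    if npart < 2:
        raise ValueError("npart must be at least 2 in this sketch")
    rng = rng or random.Random()
    n_drop = len(restraints) // npart
    # Pick the positions to discard, then keep the rest in original order.
    dropped = set(rng.sample(range(len(restraints)), n_drop))
    return [r for i, r in enumerate(restraints) if i not in dropped]


# Placeholder restraint labels; each trial keeps a different subset.
airs = [f"AIR_{i}" for i in range(1, 9)]
for trial in range(3):
    kept = random_removal(airs, npart=2, rng=random.Random(trial))
    print(f"trial {trial}: kept {kept}")
```

Because every trial sees a different subset, a wrong restraint is absent from a sizeable share of the trials, and models built without it can still score well.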
For AIRs, it is important to define the residues at the interface for each molecule based on experimental data that provides information on the interaction interface.
In the definition of those residues, one distinguishes between "active" and "passive" residues.
- The "active" residues are of central importance for the interaction between the two molecules AND are solvent accessible. Either the main chain or the side chain relative accessibility should typically be > 40%. A lower cutoff is sometimes used as well; for example, the HADDOCK server uses 15% by default. Throughout the simulation, these active residues are restrained to be part of the interface if possible, otherwise incurring a scoring penalty.
- The "passive" residues are all solvent-accessible surface neighbors of active residues (<6.5Å). They contribute to the interaction but are deemed of less importance. If such a residue does not belong to the interface, there is no scoring penalty.
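The accessibility criterion above can be sketched in a few lines. This is only an illustration: the function accessible_residues and the accessibility values are made up, and in practice the relative accessibilities would come from a solvent-accessibility calculation on your structure.

```python
def accessible_residues(rel_acc, cutoff=40.0):
    """Return residue numbers whose main-chain or side-chain relative
    accessibility (in %) exceeds `cutoff`."""
    return sorted(
        resid
        for resid, (main_chain, side_chain) in rel_acc.items()
        if max(main_chain, side_chain) > cutoff
    )


# Made-up accessibilities: residue number -> (main chain %, side chain %)
rel_acc = {10: (5.0, 12.0), 11: (20.0, 55.0), 12: (45.0, 30.0), 13: (16.0, 8.0)}

strict = accessible_residues(rel_acc)                # 40% cutoff
lenient = accessible_residues(rel_acc, cutoff=15.0)  # 15% cutoff
print("40% cutoff:", strict)
print("15% cutoff:", lenient)
```

Only residues that pass this filter and are also supported by interaction data should be considered as active.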
In general, an AIR is defined as an ambiguous intermolecular distance between any atom of an active residue of molecule A and any atom of both active and passive residues of molecule B (and inversely for molecule B). This procedure can be performed:
- locally using the haddock3-restraints active_passive_to_ambig command
- online using the GenTBL server
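To make the AIR definition concrete, the sketch below builds one restraint per active residue, pairing it with all active and passive residues of the partner molecule. The CNS-style assign syntax and the 2.0 2.0 0.0 distance bounds are assumptions made for illustration, and the helper names are hypothetical. Compare the result with a file produced by haddock3-restraints active_passive_to_ambig or GenTBL before relying on it.

```python
def selection(resid, segid):
    """Format one residue selection."""
    return f"(resid {resid} and segid {segid})"


def air_block(active_res, active_seg, partner_res, partner_seg):
    """One ambiguous restraint: an active residue vs. all partner residues."""
    partners = "\n        or\n".join(
        f"       {selection(r, partner_seg)}" for r in partner_res
    )
    return (
        f"assign {selection(active_res, active_seg)}\n"
        f"(\n{partners}\n) 2.0 2.0 0.0\n"
    )


def build_airs(active_a, passive_a, active_b, passive_b, seg_a="A", seg_b="B"):
    """Build AIRs in both directions (A -> B, then B -> A)."""
    partners_a = sorted(set(active_a) | set(passive_a))
    partners_b = sorted(set(active_b) | set(passive_b))
    blocks = [air_block(r, seg_a, partners_b, seg_b) for r in active_a]
    blocks += [air_block(r, seg_b, partners_a, seg_a) for r in active_b]
    return blocks


# Made-up residue numbers for two molecules with segids A and B.
for block in build_airs([1, 2], [3], [10], [11, 12]):
    print(block)
```

Note that passive residues never start a restraint of their own; they only appear as possible partners, which is why leaving them out of the interface carries no penalty.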
Using ambiguous restraints for docking is described in several tutorials:
Other kinds of restraints
HADDOCK can use many kinds of experimental information. Other types of restraints supported by HADDOCK include:
- Hydrogen bond restraints: another type of restraint not subject to random removal (set using the hbond_fname parameter).
- DNA/RNA restraints: automatically generated base-pair restraints, enabled using the dnarest = true parameter.

2.) Information about the interface is not available
If there is no direct information about the interacting residues available, one can still browse through the available literature or employ bioinformatic prediction tools to gain some information about the potential complex. HADDOCK offers several options for these scenarios.
Information about the quaternary structure of proteins (symmetry)
Symmetry restraints
HADDOCK offers the possibility to define multiple symmetry relationships within or in between molecules. This is done by using symmetry distance restraints. By defining multiple pairs of distances between the CA atoms of two chains, various symmetries can be enforced. Symmetry restraints are described in the manual here.
Ab-initio multi-body docking with symmetry restraints is described in this ab-initio tutorial (HADDOCK2.4).
Non-crystallographic symmetry restraints (NCS)
The NCS option imposes non-crystallographic symmetry restraints: It enforces that two molecules, a fraction thereof or even two sub-domains within the same molecule, should be identical without defining any symmetry operation between them. Non-crystallographic symmetry restraints are described in the manual here.
Ab-initio multi-body docking with NCS restraints is described here.
Membrane Z-positioning restraints
These restraints do not deal with symmetry. They keep segments within or outside of a defined Z-coordinate range, which is useful for guiding the docking of membrane proteins, although they can be used generically as well.
They are described in the HADDOCK manual here.
Ab-initio docking
Random interaction restraints
When no experimental information is available, the haddock3 [rigidbody] module can define random AIRs from solvent-accessible residues (>20% relative accessibility). This is turned on with the ranair = true parameter.
The sampling will be done from the defined segments.
This can be useful for ab-initio docking to sample the entire protein surface.
To ensure a thorough sampling of the surface, the number of structures generated at the rigid-body stage ([rigidbody]) should be increased (e.g. sampling=10000), depending on the extent of the surface to be sampled.
These random restraints are described here.
Random interaction restraints are used in the binding site tutorial.
Center of mass restraints
Center of mass (COM) restraints are distance restraints that ensure close proximity of two molecules. Such restraints can be useful in multi-body (N>2) docking to ensure that all molecules are in contact and thus promote compactness of the docking solutions. Similarly to the contact surface restraints, they can be useful in combination with random interaction restraints definition (see above) or in the refinement of molecular complexes.
COM restraints are mentioned in multiple tutorials, for example:
- Refining the interface of the cryo-EM fitted models with HADDOCK
- HADDOCK 2.4 CASP-CAPRI T70 Ab-initio docking tutorial
- Modelling a homo-oligomeric complex from MS cross-links.
Surface contact restraints
Surface contact restraints can be useful in multi-body (N>2) docking to ensure that all molecules are in contact and thus promote compactness of the docking solutions.
As with random AIRs, surface contact restraints can be used in ab-initio docking; in that case it is important to sample enough random starting orientations, which significantly increases the number of structures needed for rigid-body docking.
They can be useful in combination with random interaction restraints (see above) or in the refinement of molecular complexes.
They can be turned on by setting the contact_airs = true parameter.
Optimal settings for docking using bioinformatics predictions
When we are less certain about the interacting residues, it is better to enhance sampling by increasing the number of structures generated in each phase of docking.
This can be done by:
- Increasing the number of generated complexes with the sampling parameter in the [rigidbody] module.
- Selecting more complexes to be refined, e.g. with the select = 400 parameter in the [seletop] module.
- Splitting the predicted AIRs into smaller subsets and packing them into a .tgz archive.
Parameter | Module/parameter | Default value | Optimal value |
---|---|---|---|
Number of generated structures for rigid-body docking | [rigidbody] sampling | 1000 | 10000 |
Provide multiple AIRs as a tar gz archive | ambig_fname | .tbl | .tbl.tgz |
Number of trials for rigid-body minimisation | | 5 | 1 |
Number of structures selected for later refinements | [seletop] select | 200 | 400 |
IMPORTANT NOTE: Splitting very ambiguous interaction restraints into multiple files can allow further de-noising (in addition to randremoval = true). To do this, generate multiple restraint files, combine them in a single .tgz archive, and pass that archive to the ambig_fname parameter.
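Splitting and packing the restraints can be scripted. The sketch below distributes a list of restraint blocks over a few files and writes them into one gzipped tar archive using Python's standard tarfile module. The function name, the member file names (ambig_1.tbl, ...) and the round-robin split are illustrative assumptions; check the docking-multiple-ambig example for the layout haddock3 actually expects.

```python
import io
import os
import tarfile
import tempfile


def pack_air_subsets(blocks, n_subsets, archive_path):
    """Split restraint blocks round-robin into n_subsets .tbl files
    and store them in a single gzipped tar archive."""
    if n_subsets < 1:
        raise ValueError("n_subsets must be at least 1")
    subsets = [blocks[i::n_subsets] for i in range(n_subsets)]
    with tarfile.open(archive_path, "w:gz") as tar:
        for idx, subset in enumerate(subsets, start=1):
            if not subset:
                continue  # skip empty subsets when there are few blocks
            data = "".join(subset).encode()
            info = tarfile.TarInfo(name=f"ambig_{idx}.tbl")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return archive_path


# Placeholder text stands in for real restraint blocks.
blocks = [f"placeholder restraint {i}\n" for i in range(1, 7)]
with tempfile.TemporaryDirectory() as tmp:
    archive = pack_air_subsets(blocks, 3, os.path.join(tmp, "noisy_ambigs.tbl.tgz"))
    with tarfile.open(archive, "r:gz") as tar:
        print(tar.getnames())
```

How the restraints are grouped matters more than the packing itself: grouping restraints that come from the same prediction or the same surface patch is one reasonable choice, but that decision is left to the user here.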
Have a look at the examples using multiple ambiguous restraints:
- In your haddock3 local installation:
examples/docking-multiple-ambig
- Online
Here is an example:
# General parameters
#####################
# ...
# Workflow / Modules
#####################
# ...
[rigidbody]
sampling = 10000
ambig_fname = "noisy_ambigs.tbl.tgz"
[seletop]
select = 400
# ... refinements steps ...
More about optimal settings for different docking scenarios can be found here.
Getting restraints HADDOCK-ready
There are several ways to generate restraints for haddock3:
- locally, using the haddock3-restraints command line interface, which holds multiple subcommands that should cover the majority of use cases
- online, using the GenTBL server
Dos and Don'ts
Don't | Do instead |
---|---|
define the entire protein as active | define only key interacting residues as active; if they are not known, define the surface of one molecule as passive |
Complementary software related to restraints for HADDOCK
In BonvinLab, a number of complementary web servers have been developed to help users to reevaluate restraints.
ARCTIC-3D
ARCTIC-3D, standing for Automatic Retrieval and Clustering of Interfaces in Complexes, is a data mining algorithm that searches for experimental interfaces in the PDB and clusters interaction sites together. It can also directly generate AIRs for haddock3.
CPORT
CPORT is an algorithm for the prediction of protein-protein interface residues. It combines six interface prediction methods into a consensus predictor.
Tutorials using CPORT:
DISVIS
DISVIS visualizes and quantifies the information content of distance restraints between macromolecular complexes.
Tutorials describing DisVis:
- DisVis tutorial
- HADDOCK2.4 tutorial for the use of MS crosslinks
- Integrative modelling of the RNA polymerase III apo complex
Any more questions about restraints for HADDOCK?
Have a look at: