RMSD Clustering module
Submodules
Module contents
RMSD clustering module.
This module takes as input the RMSD matrix calculated in the previous step and performs hierarchical clustering on it, using scipy routines for this purpose.
Essentially, the procedure amounts to grouping the input models into a progressively coarser hierarchy of clusters, called the dendrogram.
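As an illustration of the dendrogram construction described above, here is a minimal sketch using scipy directly. The RMSD values are made up, and the choice of the "average" method is an assumption for the example; it is not meant to reproduce the module's actual implementation.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

# Toy symmetric RMSD matrix for four models (values are invented).
rmsd = np.array([
    [0.0, 1.0, 6.0, 7.0],
    [1.0, 0.0, 6.5, 7.5],
    [6.0, 6.5, 0.0, 1.5],
    [7.0, 7.5, 1.5, 0.0],
])

# scipy expects the condensed form: the upper triangle as a 1-D array.
condensed = squareform(rmsd)

# Each row of Z records one merge: the two clusters joined, the distance
# at which they were joined, and the size of the resulting cluster.
Z = linkage(condensed, method="average")
print(Z)
```

Reading Z from top to bottom follows the hierarchy from the finest level (single models) to the coarsest (one cluster containing every model).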
Five parameters can be defined in this context:
linkage: governs the way clusters are merged together in the creation of the dendrogram
criterion: defines the prescription to cut the dendrogram and obtain the desired clusters
n_clusters: number of desired clusters (if criterion is maxclust).
clust_cutoff: value of distance that separates distinct clusters (if criterion is distance)
min_population: analogously to the clustfcc module, it is the minimum number of models that should be present in a cluster to consider it. If criterion is maxclust, the value is ignored.
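The interplay of these parameters can be sketched with scipy's fcluster. The helper names cut_dendrogram and filter_by_population are hypothetical, as are the RMSD values; the keyword names simply mirror the parameters listed above. This is a sketch under those assumptions, not the module's own code.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Toy RMSD matrix (invented): models 0-2 are similar, 3 and 4 are outliers.
rmsd = np.array([
    [0.0, 1.0, 1.2, 8.0, 9.0],
    [1.0, 0.0, 0.8, 8.2, 9.1],
    [1.2, 0.8, 0.0, 8.1, 9.3],
    [8.0, 8.2, 8.1, 0.0, 5.0],
    [9.0, 9.1, 9.3, 5.0, 0.0],
])
Z = linkage(squareform(rmsd), method="average")


def cut_dendrogram(Z, criterion, n_clusters=2, clust_cutoff=2.5):
    """Hypothetical helper: cut the dendrogram by count or by distance."""
    t = n_clusters if criterion == "maxclust" else clust_cutoff
    return fcluster(Z, t=t, criterion=criterion)


def filter_by_population(labels, min_population):
    """Hypothetical helper: keep clusters with at least min_population models."""
    members = {}
    for model, label in enumerate(labels):
        members.setdefault(int(label), []).append(model)
    return {k: v for k, v in members.items() if len(v) >= min_population}


# criterion = maxclust: ask for two clusters; min_population is not applied.
labels_max = cut_dendrogram(Z, "maxclust", n_clusters=2)

# criterion = distance: cut at 2.5, then drop clusters with fewer than 2 models.
labels_dist = cut_dendrogram(Z, "distance", clust_cutoff=2.5)
kept = filter_by_population(labels_dist, min_population=2)
print(labels_max, labels_dist, kept)
```

With this toy matrix, the maxclust cut groups the two outliers together, whereas the distance cut leaves each of them as a singleton that the population filter then discards.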
This module passes the path to the RMSD matrix to the next step of the workflow through the rmsd_matrix.json file, which allows several clustrmsd modules (possibly with different parameters) to be executed on the same RMSD matrix.
- class haddock.modules.analysis.clustrmsd.HaddockModule(order: int, path: Path, initial_params: Path | str = PosixPath('/opt/hostedtoolcache/Python/3.10.16/x64/lib/python3.10/site-packages/haddock/modules/analysis/clustrmsd/defaults.yaml'))
Bases: BaseHaddockModule
HADDOCK3 module for clustering with RMSD.