Selection of top clusters module

Module contents

Select models from the top clusters.

This module selects a given number of models from a given number of top-ranked clusters. Within each cluster, models are selected according to their score.

In the standard HADDOCK analysis, the top 4 models of the top 10 clusters are shown. When seletopclusts is run after a sampling module, a few models from every cluster can be kept instead, to provide more diversity at the refinement stage(s).

class haddock.modules.analysis.seletopclusts.HaddockModule(order: int, path: Path, *ignore: Any, init_params: str | Path = PosixPath('/opt/hostedtoolcache/Python/3.10.16/x64/lib/python3.10/site-packages/haddock/modules/analysis/seletopclusts/defaults.yaml'), **everything: Any)

Bases: BaseHaddockModule

Haddock Module for ‘seletopclusts’.

classmethod confirm_installation() → None

Confirm if module is installed.

name: str = 'seletopclusts'

Default Parameters

Easy

sortby

default: ‘score’
type: string
title: Method used to define best cluster.
short description: The best cluster can be defined either by the scores of its models or by its size. By default, ‘score’ is selected.
long description: If the selection is done by ‘score’, the average score of the top (4) models of each cluster is used to define the cluster rank. When sorting by ‘size’, a larger cluster corresponds to a higher rank. By default, ‘score’ is selected.
group: analysis
explevel: easy
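
The two sortby modes described above can be illustrated with a minimal sketch. This is not the module's actual implementation: the function name rank_clusters, the dictionary layout, and the example scores are hypothetical, and the sketch assumes that a lower (more negative) score is better.

```python
from statistics import mean


def rank_clusters(clusters, sortby="score", n_top=4):
    """Return cluster ids ordered from best to worst.

    clusters: dict mapping a cluster id to the list of its model scores
    (hypothetical layout, chosen only for this illustration).
    """
    if sortby == "score":
        # Average the n_top best scores of each cluster.
        # Assumption: lower scores are better, so the lowest mean ranks first.
        def key(cluster_id):
            return mean(sorted(clusters[cluster_id])[:n_top])
    elif sortby == "size":
        # Larger clusters rank first, hence the negated size.
        def key(cluster_id):
            return -len(clusters[cluster_id])
    else:
        raise ValueError(f"unknown sortby value: {sortby!r}")
    return sorted(clusters, key=key)


# Made-up scores for three clusters.
clusters = {
    1: [-120.0, -110.0, -100.0, -90.0, -20.0],
    2: [-130.0, -125.0, -60.0, -50.0],
    3: [-95.0, -94.0, -93.0, -92.0, -91.0, -90.0],
}

by_score = rank_clusters(clusters, sortby="score")
by_size = rank_clusters(clusters, sortby="size")
```

With these made-up numbers, cluster 1 ranks first by score (its top 4 models average -105.0), while cluster 3 ranks first by size (6 models), which shows how the two modes can disagree.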

top_cluster

default: 1000
type: integer
title: Number of clusters to consider
min: 1
max: 99999
short description: Number of clusters to consider (ranked by score)
long description: Number of clusters to consider (ranked by score)
group: analysis
explevel: easy

top_models

default: 10
type: integer
title: Number of best-ranked models to select per cluster
min: 1
max: 99999
short description: Number of best-ranked models to select per cluster. By default, 10 models are selected.
long description: Number of best-ranked models to select per cluster. By default, 10 models are selected. If many clusters are expected (e.g., in the case of vague restraints), this number can be reduced to speed up the protocol. Conversely, if few clusters are expected (very specific and localized restraints), this number can be increased.
group: analysis
explevel: easy
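
How top_cluster and top_models combine can be sketched as follows. This is an illustration only, not the module's code: select_models and the (cluster id, scores) layout are hypothetical, the clusters are assumed to be already ranked from best to worst, and lower scores are assumed to be better.

```python
def select_models(ranked_clusters, top_cluster=1000, top_models=10):
    """Keep the best top_models models from the first top_cluster clusters.

    ranked_clusters: list of (cluster_id, scores) tuples, already ordered
    from best to worst cluster (hypothetical layout for this illustration).
    """
    selection = []
    for cluster_id, scores in ranked_clusters[:top_cluster]:
        # Assumption: lower scores are better, so sort ascending.
        best_scores = sorted(scores)[:top_models]
        selection.extend((cluster_id, score) for score in best_scores)
    return selection


# Made-up ranked clusters with their model scores.
ranked = [
    (1, [-90.0, -120.0, -100.0]),
    (3, [-95.0, -93.0]),
    (2, [-50.0, -130.0]),
]

# Keep the 2 best models of the 2 best clusters.
picked = select_models(ranked, top_cluster=2, top_models=2)

# With the default values, every model of every cluster is kept here.
everything = select_models(ranked)
```

With the defaults (top_cluster = 1000, top_models = 10), small runs such as this one keep all clusters and all models; lowering either value trims the selection passed on to later stages.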