haddock.modules.analysis.contactmap.contmap module
Module computing contact maps of complexes, alone or grouped by cluster.
Chord diagram functions were adapted from: https://plotly.com/python/v3/filled-chord-diagram/
- class haddock.modules.analysis.contactmap.contmap.ClusteredContactMap(models: list[Path], output: Path, params: dict)[source]
Bases:
object
ContactMap analysis for a set of clustered structures.
- static aggregate_contacts(contacts_holder: dict, contact_keys: list[str], contacts: list[dict], key1: str, key2: str) None [source]
Aggregate single-model data belonging to a cluster.
- Parameters:
contacts_holder (dict) – Dictionary holding list of contact data
contact_keys (list[str]) – Order of the keys to access the dictionary
contacts (list[dict]) – Single model contact data.
key1 (str) – Name of the key to access first entry in data.
key2 (str) – Name of the key to access second entry in data.
- class haddock.modules.analysis.contactmap.contmap.ContactsMap(model: Path, output: Path, params: dict)[source]
Bases:
object
ContactMap analysis for a single structure.
- generate_output(res_res_contacts: list[dict], all_heavy_interchain_contacts: list[dict]) None [source]
Generate several outputs based on contacts.
- Parameters:
res_res_contacts (list[dict]) – List of residue-residue contacts
all_heavy_interchain_contacts (list[dict]) – List of heavy atoms interchain contacts
- class haddock.modules.analysis.contactmap.contmap.ContactsMapJob(output, params, name, contact_obj)[source]
Bases:
SupportsRun
A Job dedicated to the running of contact maps objects.
- haddock.modules.analysis.contactmap.contmap.add_chordchart_legends(fig: Figure) None [source]
Add custom legend to chordchart.
- Parameters:
fig (go.Figure) – A plotly figure.
- haddock.modules.analysis.contactmap.contmap.check_square_matrix(data_matrix: ndarray[tuple[int, ...], dtype[_ScalarType_co]]) int [source]
Check if the matrix is a square one.
- Parameters:
data_matrix (NDArray (2DArray)) – The matrix to be checked.
- Returns:
nb_rows (int) – Number of rows in this matrix.
- haddock.modules.analysis.contactmap.contmap.compute_distance_matrix(all_atm_coords: list[list[float]]) ndarray[tuple[int, ...], dtype[float64]] [source]
Compute all vs all distance matrix.
- Parameters:
all_atm_coords (list[list[float]]) – List of atomic coordinates.
- Returns:
dist_matrix (NDFloat) – N*N distance matrix between all coordinates.
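As an illustration of what an all-vs-all distance matrix contains, here is a minimal NumPy sketch. The function name `distance_matrix_sketch` is hypothetical and the broadcasting approach is an assumption; it is not necessarily how `compute_distance_matrix` is implemented.

```python
import numpy as np


def distance_matrix_sketch(coords: list[list[float]]) -> np.ndarray:
    """Return the N*N Euclidean distance matrix for N 3D points (hypothetical helper)."""
    xyz = np.asarray(coords, dtype=np.float64)
    # Broadcasting (N, 1, 3) - (1, N, 3) gives all pairwise difference vectors
    diff = xyz[:, None, :] - xyz[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Two atoms forming a 3-4-5 triangle with the origin
dist = distance_matrix_sketch([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]])
```

The result is symmetric with zeros on the diagonal, which is why the other functions on this page can work on half matrices.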
- haddock.modules.analysis.contactmap.contmap.contacts_to_connect_matrix(matrix: ndarray[tuple[int, ...], dtype[float64]], labels: list[str]) list[list[int]] [source]
Convert a square contact matrix into a connectivity matrix without self contacts.
- Parameters:
matrix (NDFloat) – A square contact matrix.
labels (list[str]) – List of labels corresponding to row and column entries.
- Returns:
connect_matrix (list[list[Union[str, int]]]) – The connectivity matrix without self contacts.
- haddock.modules.analysis.contactmap.contmap.control_pts(angle: list[float], radius: float) list[tuple[float, float]] [source]
Generate control points to draw an SVG path.
- Parameters:
angle (list[float]) – A list containing angular coordinates of the control points b0, b1, b2.
radius (float) – The distance from b1 to the origin O(0,0)
- Returns:
control_points (list[tuple[float, float]]) – The set of control points.
- Raises:
ValueError – Raised if the number of angular coordinates is not equal to 3.
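The conversion from angular coordinates to control points can be sketched as follows. This is a hedged illustration: the name `control_points_sketch` is hypothetical, and placing b0 and b2 on a circle of radius 1 is an assumption not stated on this page.

```python
import math


def control_points_sketch(
    angle: list[float], radius: float
) -> list[tuple[float, float]]:
    """Convert three angular coordinates into Cartesian control points."""
    if len(angle) != 3:
        raise ValueError("Exactly 3 angular coordinates are expected.")
    # Assumption: b0 and b2 sit on the unit circle, b1 at `radius` from the origin
    radii = (1.0, radius, 1.0)
    return [(r * math.cos(a), r * math.sin(a)) for a, r in zip(angle, radii)]


pts = control_points_sketch([0.0, math.pi / 4, math.pi / 2], 0.2)
```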
- haddock.modules.analysis.contactmap.contmap.ctrl_rib_chords(side1: tuple[float, float], side2: tuple[float, float], radius: float) list[list[tuple[float, float]]] [source]
Generate polygon points used to draw ribbons.
- Parameters:
side1 (tuple[float, float]) – List of angular variables of the ribbon arc ends defining the ribbon starting (ending) arc.
side2 (tuple[float, float]) – List of angular variables of the ribbon arc ends defining the ribbon starting (ending) arc.
radius (float, optional) – Circle radius size
- Returns:
list[list[tuple[float, float]]] – Polygon points used to draw the ribbon.
- haddock.modules.analysis.contactmap.contmap.datakey_to_colorscale(data_key: str, color_scale: str = 'Greys') str [source]
Reverse the color scale if the data type requires it.
- Parameters:
data_key (str) – A dictionary key pointing to data type.
color_scale (str) – Name of a base plotly color_scale.
- Returns:
color_scale (str) – Possibly the reverse name of the color_scale.
- haddock.modules.analysis.contactmap.contmap.extract_heavyatom_contacts(matrix: ndarray[tuple[int, ...], dtype[float64]], resdt: dict, res1_key: str, res2_key: str, contact_distance: float = 4.5) list[dict[str, float | str]] [source]
Generate contacts data.
- Parameters:
matrix (NDFloat) – The distance matrix.
resdt (dict) – Residues data with atom indices as returned by get_ordered_coords().
res1_key (str) – First residue of interest.
res2_key (str) – Second residue of interest.
contact_distance (float) – Distance defining a contact.
- Returns:
all_contacts (list[dict[str, Union[float, str]]]) – List holding contact data
- haddock.modules.analysis.contactmap.contmap.extract_pdb_coords(line: str) list[float] [source]
Extract coordinates from a PDB line.
- Parameters:
line (str) – A standard ATOM/HETATM pdb record.
- Returns:
coords (list[float]) – List of the X, Y and Z coordinate of this atom.
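In the PDB fixed-width format, the X, Y and Z coordinates of an ATOM/HETATM record occupy columns 31-38, 39-46 and 47-54. A minimal parsing sketch under that convention is shown below; the name `pdb_coords_sketch` and the example record are hypothetical, and the module's own function may differ.

```python
def pdb_coords_sketch(line: str) -> list[float]:
    """Slice the X, Y and Z fields out of a fixed-width PDB record."""
    # 1-based columns 31-38, 39-46 and 47-54 map to these 0-based slices
    return [float(line[30:38]), float(line[38:46]), float(line[46:54])]


# Build a record with explicit field widths so the columns are unambiguous
record = (
    f"{'ATOM':<6}{1:>5} {'N':^4} {'MET':>3} {'A'}{1:>4}    "
    f"{27.340:8.3f}{24.430:8.3f}{2.614:8.3f}"
)
coords = pdb_coords_sketch(record)
```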
- haddock.modules.analysis.contactmap.contmap.extract_pdb_dt(path: Path) dict [source]
Read and extract ATOM/HETATM records from a pdb file.
- Parameters:
path (Path) – Path to a pdb file.
- Returns:
pdb_chains (dict) – A dictionary of the pdb file accessible using chains as keys.
- haddock.modules.analysis.contactmap.contmap.extract_submatrix(matrix: ndarray[tuple[int, ...], dtype[float64]], indices: list[int], indices2: list[int] | None = None) ndarray[tuple[int, ...], dtype[float64]] [source]
Extract submatrix based on desired indices.
- Parameters:
matrix (NDFloat) – An N*N matrix.
indices (list[int]) – List of row indices to extract from this matrix.
indices2 (list[int], optional) – List of column indices to extract from this matrix. If unspecified, indices2 == indices and a symmetric matrix is extracted.
- Returns:
submat (NDFloat) – The extracted submatrix.
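The row/column selection described above can be illustrated with `numpy.ix_`. This is a sketch only: `submatrix_sketch` is a hypothetical name, and using `np.ix_` is an assumption about one possible approach.

```python
import numpy as np


def submatrix_sketch(matrix, rows, cols=None):
    """Select the given rows and columns; columns default to the rows."""
    cols = rows if cols is None else cols
    # np.ix_ builds an open mesh so every (row, col) combination is selected
    return matrix[np.ix_(rows, cols)]


full = np.arange(16, dtype=np.float64).reshape(4, 4)
sym = submatrix_sketch(full, [0, 2])           # rows and columns 0 and 2
rect = submatrix_sketch(full, [0, 2], [1, 3])  # rows 0 and 2, columns 1 and 3
```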
- haddock.modules.analysis.contactmap.contmap.gen_contact_dt(matrix: ndarray[tuple[int, ...], dtype[float64]], resdt: dict, res1_key: str, res2_key: str) dict [source]
Generate contacts data.
- Parameters:
matrix (NDFloat) – The distance matrix.
resdt (dict) – Residues data with atom indices as returned by get_ordered_coords().
res1_key (str) – First residue of interest.
res2_key (str) – Second residue of interest
- Returns:
cont_dt (dict) – Dictionary holding contact data
- haddock.modules.analysis.contactmap.contmap.get_all_ideograms_ends(chains: dict, gap: float = 0.031415926535897934) tuple[list[tuple[float, float]], list[tuple[float, float]]] [source]
Generate both chain and residues ideograms ends.
- Parameters:
chains (dict) – Dictionary mapping to list of residues labels.
gap (float, optional) – Gap distance used to separate two ideograms, by default 2*PI*0.005
- Returns:
tuple[ideo_ends, chain_ideo_ends] – A tuple containing residues ideo ends and chains ideo ends.
ideo_ends (list[tuple[float, float]]) – List of residues ideograms start and ending positions.
chain_ideo_ends (list[tuple[float, float]]) – List of chain ideograms start and ending positions.
- haddock.modules.analysis.contactmap.contmap.get_chains_ideograms_ends(chains: dict[str, list[str]], gap: float = 0.031415926535897934) tuple[list[tuple[float, float]], ndarray[tuple[int, ...], dtype[float64]]] [source]
Build ideogram ends to represent protein chains.
- Parameters:
chains (dict[str, list[str]]) – Dictionary mapping chains with their respective set of residues labels.
gap (float, optional) – Gap between two ideograms, by default 2*PI*0.005
- Returns:
chain_ideo_ends (list[tuple[float, float]]) – Ideogram ends to represent protein chains.
chain_ideogram_length (NDFloat)
- haddock.modules.analysis.contactmap.contmap.get_clusters_sets(models: list[PDBFile]) dict [source]
Split models by clusters ids.
- Parameters:
models (list) – List of pdb models/complexes.
- Returns:
clusters_sets (dict) – Dictionary of models accessible by their cluster ids as keys.
- haddock.modules.analysis.contactmap.contmap.get_cont_type(resn1: str, resn2: str) str [source]
Generate polarity key between two residues.
- Parameters:
resn1 (str) – 3 letters code of first residue.
resn2 (str) – 3 letters code of second residue.
- Returns:
pol_key (str) – Combined residues polarities
- haddock.modules.analysis.contactmap.contmap.get_ideogram_ends(ideogram_len: ndarray[tuple[int, ...], dtype[float64]], gap: float) list[tuple[float, float]] [source]
Generate ideogram ends.
- Parameters:
ideogram_len (NDArray) – Length of each ideogram.
gap (float) – Gap to add in between each ideogram.
- Returns:
ideo_ends (list[tuple[float]]) – List of start and end positions for each ideogram.
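One plausible way to lay ideograms end to end with a gap between them is sketched below. The name `ideogram_ends_sketch` is hypothetical, and starting the first ideogram at angle 0 is an assumption.

```python
import math


def ideogram_ends_sketch(
    lengths: list[float], gap: float
) -> list[tuple[float, float]]:
    """Place ideograms one after another, separated by `gap` radians."""
    ends = []
    start = 0.0
    for length in lengths:
        end = start + length
        ends.append((start, end))
        start = end + gap  # the next ideogram begins after the gap
    return ends


gap = 2 * math.pi * 0.005
# Three equally sized ideograms that, with their gaps, fill the full circle
lengths = [(2 * math.pi) / 3 - gap] * 3
ends = ideogram_ends_sketch(lengths, gap)
```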
- haddock.modules.analysis.contactmap.contmap.get_ordered_coords(pdb_chains: dict) tuple[list[list[float]], list[str], dict] [source]
Generate list of all atom coordinates.
- Parameters:
pdb_chains (dict) –
A dictionary of the pdb file accessible using chains as keys, as provided by the extract_pdb_dt() function.
- Returns:
all_coords (list[list[float]]) – All atomic coordinates in a single list.
resid_keys (list[str]) – Ordered list of residues keys.
resid_dt (dict) – Dictionary of coordinates indices for each residue.
- haddock.modules.analysis.contactmap.contmap.invPerm(perm: list[int]) list[int] [source]
Generate the inverse of a permutation.
- Parameters:
perm (list[int]) – A permutation.
- Returns:
inv (list[int]) – Inverse of a permutation.
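For reference, the inverse of a permutation given as a list of indices can be computed as in this short sketch (the function name is hypothetical, not the module's):

```python
def inverse_permutation_sketch(perm: list[int]) -> list[int]:
    """Return inv such that inv[perm[i]] == i for every index i."""
    inv = [0] * len(perm)
    for position, value in enumerate(perm):
        inv[value] = position
    return inv


perm = [2, 0, 1]
inv = inverse_permutation_sketch(perm)
# Applying perm to the inverse restores the identity ordering
restored = [perm[i] for i in inv]
```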
- haddock.modules.analysis.contactmap.contmap.make_chordchart(_contact_matrix: list[list[int]], _dist_matrix: list[list[float]], _interttype_matrix: list[list[str]], _labels: list[str], gap: float = 0.031415926535897934, output_fpath: str | Path = 'chordchart.html', title: str = 'Chord diagram', offline: bool = False) str | Path [source]
Generate a plotly chordchart graph.
- Parameters:
_contact_matrix (list[list[int]]) – The contact matrix
_dist_matrix (list[list[float]]) – The distance matrix
_interttype_matrix (list[list[str]]) – The interaction type matrix
_labels (list[str]) – Labels of each matrix row (and column, as the matrix is expected to be symmetric)
gap (float, optional) – Gap between two ideograms, by default 2*PI*0.005
output_fpath (Union[str, Path], optional) – Path to the output file, by default ‘chordchart.html’
title (str, optional) – Title to give to the diagram, by default ‘Chord diagram’
- Returns:
output_fpath (Union[str, Path]) – Path to the generated output file.
- haddock.modules.analysis.contactmap.contmap.make_contactmap_report(contactmap_jobs: list[ContactsMapJob], outputpath: str | Path) str | Path [source]
Generate an HTML navigation page holding all generated files.
- Parameters:
contactmap_jobs (list[ContactsMapJob]) – All the terminated jobs
outputpath (Union[str, Path]) – Output filepath where to write the report.
- Returns:
outputpath (Union[str, Path]) – Path to the generated report.
- haddock.modules.analysis.contactmap.contmap.make_ideo_shape(path: str, line_color: str, fill_color: str) dict [source]
Generate data to draw an ideogram shape.
- Parameters:
path (str) – An SVG path to be drawn.
line_color (str) – Color of the shape boundary.
fill_color (str) – Shape filling color for the ideogram shape.
- Returns:
dict – Data enabling to draw an ideogram shape in layout.
- haddock.modules.analysis.contactmap.contmap.make_ideogram_arc(radius: float, _phi: tuple[float, float], nb_points: float = 50) ndarray[tuple[int, ...], dtype[float64]] [source]
Generate ideogram arc.
- Parameters:
radius (float) – The circle radius.
_phi (tuple[float, float]) – Tuple of ends angle coordinates of an arc.
nb_points (float) – Parameter that controls the number of points to be evaluated on an arc
- Returns:
arc_positions (NDArray) – Array of 2D coordinates defining an arc.
- haddock.modules.analysis.contactmap.contmap.make_layout(title: str, plot_size: float, layout_shapes: list[dict]) Layout [source]
Generate the chart layout.
- Parameters:
title (str) – Title to be given to the chart.
plot_size (float) – Size of the chart.
layout_shapes (list[dict]) – Shapes to be drawn.
- Returns:
layout (go.Layout) – The plotly layout.
- haddock.modules.analysis.contactmap.contmap.make_q_bezier(control_points: list[tuple[float, float]]) str [source]
Define the Plotly SVG path for a quadratic Bezier curve defined by the list of its control points.
- Parameters:
control_points (list[tuple[float, float]]) – List of control points
- Returns:
svgpath (str) – An SVG path
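In SVG path syntax, a quadratic Bezier curve is written as a move-to command (`M`) followed by a `Q` command carrying the control point and the end point. A minimal sketch of building such a string follows; `q_bezier_path_sketch` is a hypothetical name, and passing the middle point directly as the SVG control point is an assumption.

```python
def q_bezier_path_sketch(control_points: list[tuple[float, float]]) -> str:
    """Build an SVG path string 'M x0,y0 Q x1,y1 x2,y2' from three points."""
    if len(control_points) != 3:
        raise ValueError("A quadratic Bezier curve needs exactly 3 control points.")
    (x0, y0), (x1, y1), (x2, y2) = control_points
    return f"M {x0},{y0} Q {x1},{y1} {x2},{y2}"


svgpath = q_bezier_path_sketch([(1.0, 0.0), (0.0, 0.0), (0.0, 1.0)])
```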
- haddock.modules.analysis.contactmap.contmap.make_ribbon(side1: tuple[float, float], side2: tuple[float, float], line_color: str, fill_color: str, radius: float = 0.2) dict [source]
Generate data to draw a ribbon.
- Parameters:
side1 (list[float]) – List of angular variables of first ribbon arc ends defining the ribbon starting (ending) arc.
side2 (list[float]) – List of angular variables of the other ribbon arc ends defining the ribbon starting (ending) arc.
line_color (str) – Color of the shape boundary.
fill_color (str) – Shape filling color for the ribbon shape.
radius (float, optional) – Circle radius size, by default 0.2.
- Returns:
dict – Data enabling to draw a ribbon in layout.
- haddock.modules.analysis.contactmap.contmap.make_ribbon_arc(theta0: float, theta1: float) str [source]
Generate an SVG path to draw a ribbon arc.
- Parameters:
theta0 (float) – Starting angle value
theta1 (float) – Ending angle value
- Returns:
string_arc (str) – A string representing the SVGpath of the ribbon arc.
- Raises:
ValueError – If provided theta0 and theta1 angles are incorrect for a ribbon.
ValueError – If the angle coordinates for an arc side of a ribbon are not in the appropriate range [0, 2*pi]
- haddock.modules.analysis.contactmap.contmap.make_ribbon_ends(matrix: ndarray[tuple[int, ...], dtype[_ScalarType_co]], row_sum: list[int], ideo_ends: list[tuple[float, float]], L: int) list[list[tuple[float, float]]] [source]
Generate all connecting ribbons coordinates.
- Parameters:
matrix (NDArray) – The data matrix.
row_sum (list[int]) – Number of connections in each row.
ideo_ends (list[tuple[float, float]]) – List of start and end positions for each ideogram.
- Returns:
ribbon_boundary (list[list[tuple[float, float]]]) – Matrix of per residue ribbons start and end positions.
- haddock.modules.analysis.contactmap.contmap.min_dist(matrix: ndarray[tuple[int, ...], dtype[float64]]) float [source]
Find minimum value in a matrix.
- haddock.modules.analysis.contactmap.contmap.moduloAB(val: float, lb: float, ub: float) float [source]
Map a real number onto the unit circle.
The unit circle is identified with the interval [lb, ub), ub-lb=2*PI.
- Parameters:
val (float) – The value to be mapped into the unit circle.
lb (float) – The lower boundary.
ub (float) – The upper boundary
- Returns:
moduloab (float) – The modulo of val between lb and ub
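Mapping a value into the half-open interval [lb, ub) can be sketched with Python's modulo operator. The name `modulo_ab_sketch` is hypothetical, and the `ValueError` on invalid bounds is an assumption rather than documented behavior.

```python
import math


def modulo_ab_sketch(val: float, lb: float, ub: float) -> float:
    """Wrap `val` into the interval [lb, ub)."""
    if lb >= ub:
        raise ValueError("lb must be strictly lower than ub.")
    # Python's % returns a non-negative result for a positive divisor
    return (val - lb) % (ub - lb) + lb


over = modulo_ab_sketch(2.5 * math.pi, 0.0, 2 * math.pi)    # one full turn too many
under = modulo_ab_sketch(-0.5 * math.pi, 0.0, 2 * math.pi)  # negative angle
```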
- haddock.modules.analysis.contactmap.contmap.split_labels_by_chains(labels: list[str]) dict[str, list[str]] [source]
Map each label to its chain.
- Parameters:
labels (list[str]) – List of residues keys. e.g.: A-SER-123 (chain A, serine 123)
- Returns:
chains (dict[str, list[str]]) – Dictionary mapping chains with their respective set of residues labels.
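Given labels in the `A-SER-123` form shown above, grouping by chain can be sketched as follows. The function name is hypothetical, and taking the chain identifier as the text before the first hyphen is an assumption based on that example.

```python
def split_labels_sketch(labels: list[str]) -> dict[str, list[str]]:
    """Group residue labels by the chain identifier that prefixes them."""
    chains: dict[str, list[str]] = {}
    for label in labels:
        # Assumption: the chain id is everything before the first hyphen
        chain_id = label.split("-", 1)[0]
        chains.setdefault(chain_id, []).append(label)
    return chains


chains = split_labels_sketch(["A-SER-123", "A-GLY-124", "B-LYS-7"])
```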
- haddock.modules.analysis.contactmap.contmap.to_color_weight(distance: float, max_dist: float, min_dist: float = 2.0, min_weight: float = 0.2, max_weight: float = 0.9) float [source]
Compute color weight based on distance.
- Parameters:
distance (float) – The distance to weight.
max_dist (float) – The max distance observed in the dataset.
min_dist (float, optional) – The minimum distance observed in the dataset, by default 2.
min_weight (float, optional) – Color weight for the maximum distance, by default 0.2
max_weight (float, optional) – Color weight for the minimum distance, by default 0.90
- Returns:
weight (float) – The color weight, in range [min_weight, max_weight]
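The parameter descriptions indicate that the shortest distance receives `max_weight` and the longest receives `min_weight`. One way to realise this is sketched below, assuming a linear mapping and clamping of out-of-range distances; neither assumption is confirmed by this page, and `color_weight_sketch` is a hypothetical name.

```python
def color_weight_sketch(
    distance: float,
    max_dist: float,
    min_dist: float = 2.0,
    min_weight: float = 0.2,
    max_weight: float = 0.9,
) -> float:
    """Linearly map a distance to a weight: short distances weigh more."""
    if max_dist <= min_dist:
        return max_weight  # degenerate range, nothing to interpolate
    # Assumption: distances outside [min_dist, max_dist] are clamped
    clamped = min(max(distance, min_dist), max_dist)
    fraction = (clamped - min_dist) / (max_dist - min_dist)
    return max_weight - fraction * (max_weight - min_weight)


closest = color_weight_sketch(2.0, 7.5)
farthest = color_weight_sketch(7.5, 7.5)
midway = color_weight_sketch(4.75, 7.5)
```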
- haddock.modules.analysis.contactmap.contmap.to_full_matrix(half_matrix: list[int | float | str], diag_val: int | float | str) ndarray[tuple[int, ...], dtype[_ScalarType_co]] [source]
Generate a full matrix from a half matrix.
- Parameters:
half_matrix (list[Any]) – Values of the N*(N-1)/2 half matrix.
diag_val (Any) – Value to be placed in diagonal of the full matrix.
- Returns:
matrix (NDArray) – The reconstituted full matrix.
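Rebuilding a symmetric matrix from its N*(N-1)/2 off-diagonal values can be sketched as below. Two assumptions apply: the half matrix is taken to be ordered row by row over the upper triangle, and the sketch returns nested lists rather than a NumPy array. The name `full_matrix_sketch` is hypothetical.

```python
import math


def full_matrix_sketch(half_matrix: list, diag_val) -> list[list]:
    """Expand N*(N-1)/2 upper-triangle values into a symmetric N*N matrix."""
    # Solve N*(N-1)/2 == len(half_matrix) for N
    n = (1 + math.isqrt(1 + 8 * len(half_matrix))) // 2
    if n * (n - 1) // 2 != len(half_matrix):
        raise ValueError("Length of half_matrix is not of the form N*(N-1)/2.")
    full = [[diag_val] * n for _ in range(n)]
    values = iter(half_matrix)
    for i in range(n):
        for j in range(i + 1, n):
            # Mirror each value across the diagonal
            full[i][j] = full[j][i] = next(values)
    return full


full = full_matrix_sketch([1, 2, 3], 0)
```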
- haddock.modules.analysis.contactmap.contmap.to_nice_label(label: str) str [source]
Convert a label into a user friendly label.
- Parameters:
label (str) – Label name as found in csv
- Returns:
nicelabel (str) – User friendly description of the label.
- haddock.modules.analysis.contactmap.contmap.to_rgba_color_string(connect_color: tuple[int, int, int], alpha: float) str [source]
Generate a rgba string from list of colors and alpha.
- Parameters:
connect_color (list[int]) – A 3-values list of integers defining the red, green and blue colors.
alpha (float) – color_weight
- Returns:
rgba_color (str) – The html like rgba colors. e.g.: ‘rgba(123, 123, 123, 0.5)’
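The formatting described above amounts to simple string interpolation, as in this minimal sketch (the function name is hypothetical):

```python
def rgba_string_sketch(connect_color: tuple[int, int, int], alpha: float) -> str:
    """Format an RGB triplet and an alpha value as an HTML-like rgba string."""
    red, green, blue = connect_color
    return f"rgba({red}, {green}, {blue}, {alpha})"


rgba = rgba_string_sketch((123, 123, 123), 0.5)
```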
- haddock.modules.analysis.contactmap.contmap.topX_models(models: list[PDBFile], topX: int = 10) list[Any] [source]
Sort and return subset of top X best models.
- Parameters:
models (list) – List of pdb models/complexes.
topX (int) – Number of models to return after sorting.
- Returns:
subset_bests (list) – List of top X best models.
- haddock.modules.analysis.contactmap.contmap.tsv_to_chordchart(tsv_path: Path, sep: str = '\t', data_key: str = 'ca-ca-dist', contact_threshold: float = 7.5, filter_intermolecular_contacts: bool = True, output_fname: Path | str = 'contacts_chordchart.html', title: str = 'Chord diagram', offline: bool = False) Path | str [source]
Read a tsv file and generate a chord diagram from it.
- Parameters:
tsv_path (Path) – Path to the .tsv file containing contact data.
sep (str) – Separator character used to split data in each line.
data_key (str) – Data key used to draw the plot.
contact_threshold (float) – Upper boundary of maximum value to be plotted. Any value above it will be set to this value.
output_fname (Union[Path, str]) – Path where to generate the graph.
title (str) – Title to give to the chord diagram.
- Returns:
chord_chart_fpath (Union[Path, str]) – Path to the generated graph.
- haddock.modules.analysis.contactmap.contmap.tsv_to_heatmap(tsv_path: Path, sep: str = '\t', data_key: str = 'ca-ca-dist', contact_threshold: float = 7.5, colorscale: str = 'Greys', output_fname: Path | str = 'contacts.html', offline: bool = False) Path | str [source]
Read a tsv file and generate a heatmap from it.
- Parameters:
tsv_path (Path) – Path to the .tsv file containing contact data.
sep (str) – Separator character used to split data in each line.
data_key (str) – Data key used to draw the plot.
contact_threshold (float) – Upper boundary of maximum value to be plotted. Any value above it will be set to this value.
output_fname (Path) – Path to the generated graph.
- Returns:
output_filepath (Union[Path, str]) – Path to the generated file.
- haddock.modules.analysis.contactmap.contmap.within_2PI(val: float) bool [source]
Check if float value is within unit circle value range.
- Parameters:
val (float) – The value to be tested.
- haddock.modules.analysis.contactmap.contmap.write_res_contacts(res_res_contacts: list[dict], header: list[str], path: Path | str, sep: str = '\t', interchain_data: bool | dict | None = None) Path [source]
Write a tsv file based on residues-residues contacts data.
- Parameters:
res_res_contacts (list[dict]) – List of dict holding data for each residue-residue contacts.
header (list[str]) – Ordered list of keys to access in the dicts.
path (Path) – Path to the output file to generate.
sep (str) – Character used to separate data within a line.
- Returns:
path (Path) – Path to the generated file.
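Writing a list of contact dictionaries as a tab-separated file with an ordered header can be sketched with the standard `csv` module. This is an illustration only: `write_contacts_sketch` and the `res1`/`res2` keys are hypothetical, and the `interchain_data` argument of the real function is not covered.

```python
import csv
import tempfile
from pathlib import Path


def write_contacts_sketch(contacts, header, path, sep="\t") -> Path:
    """Write one header line, then one line per contact, in header order."""
    path = Path(path)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle, delimiter=sep)
        writer.writerow(header)
        for contact in contacts:
            writer.writerow([contact[key] for key in header])
    return path


header = ["res1", "res2", "ca-ca-dist"]  # hypothetical column names
rows = [{"res1": "A-SER-123", "res2": "B-LYS-7", "ca-ca-dist": 6.2}]
with tempfile.TemporaryDirectory() as tmp:
    out = write_contacts_sketch(rows, header, Path(tmp) / "contacts.tsv")
    lines = out.read_text().splitlines()
```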