libplots: plotting functionalities
Plotting functionalities.
- haddock.libs.libplots.ClRank
A dict representing clusters’ rank.
key (int): cluster’s id
value (int): cluster’s rank
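As an illustration of this mapping, here is a minimal sketch. Only the `dict[int, int]` shape comes from the signatures in this module; the `ranked_cluster_ids` list, its values, and the choice to start ranks at 1 are assumptions made for the example.

```python
# Type alias mirroring the documented shape: {cluster_id: cluster_rank}.
ClRank = dict[int, int]

# Hypothetical cluster ids, ordered from best to worst.
ranked_cluster_ids = [7, 2, 5]

# Ranks start at 1 in this sketch (an assumption, not taken from the source).
cl_rank: ClRank = {
    cl_id: rank for rank, cl_id in enumerate(ranked_cluster_ids, start=1)
}
print(cl_rank)  # {7: 1, 2: 2, 5: 3}
```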
- haddock.libs.libplots.box_plot_data(capri_df: DataFrame, cl_rank: dict[int, int]) DataFrame [source]
Retrieve box plot data.
- Parameters:
capri_df (pandas DataFrame) – capri table dataframe
cl_rank (dict) – {cluster_id : cluster_rank} dictionary
- Returns:
gb_full (pandas DataFrame) – DataFrame of all the clusters to be plotted
- haddock.libs.libplots.box_plot_handler(capri_filename: str | Path, cl_rank: dict[int, int], format: Literal['png', 'pdf', 'svg', 'jpeg', 'webp'] | None, scale: float | None, offline: bool = False) list[Figure] [source]
Create box plots.
The idea is that for each of the top X-ranked clusters we create a box plot showing how the basic statistics are distributed within each model.
- Parameters:
capri_filename (str or Path) – capri single structure filename
cl_rank (dict) – {cluster_id : cluster_rank} dictionary
format (str) – Produce images in the selected format.
scale (float) – scale for images.
- haddock.libs.libplots.box_plot_plotly(gb_full: DataFrame, y_ax: str, cl_rank: dict[int, int], format: Literal['png', 'pdf', 'svg', 'jpeg', 'webp'] | None, scale: float | None, offline: bool = False) Figure [source]
Create a box plot in plotly.
- Parameters:
gb_full (pandas DataFrame) – data to box plot
y_ax (str) – variable to plot
cl_rank (dict) – {cluster_id : cluster_rank} dictionary
format (str) – Produce images in the selected format.
scale (float) – scale of image
- Returns:
fig – an instance of plotly.graph_objects.Figure
- haddock.libs.libplots.clean_capri_table(df: DataFrame) DataFrame [source]
Create a tidy capri table for the report.
It also combines the mean and std values into a single column and drops the columns that are not needed in the report.
The dataframe is modified in place.
- Parameters:
df (pandas DataFrame) – dataframe of capri values
- Returns:
pandas DataFrame – DataFrame of capri table with new column names
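The mean/std merge described for `clean_capri_table` could look roughly like the sketch below. It works on a plain dict rather than a DataFrame. The helper name `combine_mean_std`, the `score`/`score_std` column names, and the `mean ± std` display format are assumptions for illustration, not the library's actual output.

```python
def combine_mean_std(row: dict[str, float], name: str, ndigits: int = 2) -> str:
    """Merge `<name>` and `<name>_std` into a single display string."""
    mean = row[name]
    std = row[f"{name}_std"]
    # Hypothetical display format; the real report may format values differently.
    return f"{mean:.{ndigits}f} ± {std:.{ndigits}f}"


# Hypothetical cluster-level row with a mean and its standard deviation.
row = {"score": -120.456, "score_std": 3.2}
combined = combine_mean_std(row, "score")
print(combined)  # -120.46 ± 3.20
```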
- haddock.libs.libplots.clt_table_handler(clt_file, ss_file, is_cleaned=False)[source]
Create a dataframe including data for tables.
The idea is to create tidy tables that report statistics available in capri_clt.tsv and capri_ss.tsv files.
- Parameters:
clt_file (str or Path) – path to capri_clt.tsv file
ss_file (str or Path) – path to capri_ss.tsv file
is_cleaned (bool) – is the run going to be cleaned?
- Returns:
df_merged (pandas DataFrame) – a data frame including data for tables
- haddock.libs.libplots.create_html(json_content: str, plot_id: int = 1, plotly_js_import: str | None = None, figure_height: int = 800, figure_width: int = 1000) str [source]
Create html content given a plotly json.
- Parameters:
json_content (str) – plotly json content
plot_id (int) – plot id to be used in the html content
figure_height (int) – figure height (in pixels)
figure_width (int) – figure width (in pixels)
- Returns:
html_content (str) – html content
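To make the role of `json_content`, `plot_id` and the figure dimensions concrete, the sketch below builds an HTML fragment by hand. It is not the template used by `create_html`: the helper name `build_plot_div` and the `plot<N>` id scheme are assumptions, and the fragment presumes the plotly JavaScript library is loaded elsewhere in the page.

```python
import json


def build_plot_div(
    json_content: str,
    plot_id: int = 1,
    figure_height: int = 800,
    figure_width: int = 1000,
) -> str:
    """Return an HTML fragment that renders a plotly figure from its JSON."""
    div_id = f"plot{plot_id}"  # hypothetical id scheme
    return (
        f'<div id="{div_id}" '
        f'style="height:{figure_height}px; width:{figure_width}px;"></div>\n'
        "<script>\n"
        # The JSON string is embedded as a JavaScript object literal.
        f"const fig{plot_id} = {json_content};\n"
        f'Plotly.newPlot("{div_id}", fig{plot_id}.data, fig{plot_id}.layout);\n'
        "</script>"
    )


# Hand-written figure JSON standing in for the output of a plotly figure.
figure_json = json.dumps(
    {"data": [{"type": "scatter", "x": [1, 2], "y": [3, 4]}], "layout": {}}
)
html_fragment = build_plot_div(figure_json, plot_id=2)
```

Giving each figure its own `plot_id` keeps the `div` ids unique when several plots are written to the same page.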
- haddock.libs.libplots.create_other_cluster(clusters_df: DataFrame, structs_df: DataFrame, max_clusters: int) tuple[DataFrame, DataFrame] [source]
Combine all clusters with rank >= max_clusters into an “Other” cluster.
- Parameters:
clusters_df (pandas DataFrame) – DataFrame of clusters
structs_df (pandas DataFrame) – DataFrame of structures
max_clusters (int) – From which cluster rank to consider as “Other”
- Returns:
tuple with clusters_df and structs_df
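The `rank >= max_clusters` rule can be sketched on a plain `{cluster_id: cluster_rank}` mapping. The helper name `relabel_clusters` and the sample ranks are hypothetical; the real function operates on the two DataFrames and returns them.

```python
from typing import Union


def relabel_clusters(
    cluster_ranks: dict[int, int], max_clusters: int
) -> dict[int, Union[int, str]]:
    """Map each cluster id to itself, or to "Other" when its rank is too low."""
    return {
        cl_id: ("Other" if rank >= max_clusters else cl_id)
        for cl_id, rank in cluster_ranks.items()
    }


# Hypothetical {cluster_id: cluster_rank} mapping.
cluster_ranks = {10: 1, 4: 2, 8: 3, 3: 4}
labels = relabel_clusters(cluster_ranks, max_clusters=3)
print(labels)  # {10: 10, 4: 4, 8: 'Other', 3: 'Other'}
```

With `max_clusters=3`, the clusters ranked 3 and 4 both fall into "Other", so only two clusters keep their own label.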
- haddock.libs.libplots.export_plotly_figure(fig: Figure, output_fname: str | Path, figure_height: int = 1000, figure_width: int = 1000, offline: bool = False) None [source]
Write a plotly figure.
- Parameters:
fig (Figure) – The plotly Figure object
output_fname (Union[str, Path]) – Where to write it
figure_height (int, optional) – Height of the figure (in pixels), by default 1000
figure_width (int, optional) – Width of the figure (in pixels), by default 1000
offline (bool, optional) – If True add the plotly js library to the file, by default False
- haddock.libs.libplots.fig_to_html(fig: Figure, fpath: str | Path, plot_id: int = 1, figure_height: int = 800, figure_width: int = 1000, offline: bool = False) None [source]
Work around plotly HTML file generation.
- Parameters:
fig (Figure) – A Figure object created by Plotly
fpath (Union[str, Path]) – Where to write the content
plot_id (int) – plot id to be used in the html content
figure_height (int) – figure height (in pixels)
figure_width (int) – figure width (in pixels)
offline (bool) – If set to False, use the cdn url to obtain the javascript content for the rendering.
- haddock.libs.libplots.find_best_struct(df: DataFrame, max_best_structs: int = 4) DataFrame [source]
Find best structures for each cluster.
- Parameters:
df (pd.DataFrame) – The loaded capri_ss.tsv dataframe
max_best_structs (int) – The maximum number of best structures to return.
- Returns:
best_df (pd.DataFrame) – DataFrame of best structures with cluster_id and best<model-cluster_ranking> columns and empty strings for missing values.
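The shape of the returned table (one row per cluster, `best1` … `bestN` columns, empty strings for missing entries) is illustrated below on a list of dicts. The helper name `best_models_per_cluster`, the `model` column name, and the sample rows are assumptions; this is not the pandas-based implementation.

```python
from collections import defaultdict


def best_models_per_cluster(rows, max_best_structs=4):
    """Collect up to `max_best_structs` best models for each cluster."""
    by_cluster = defaultdict(dict)
    for row in rows:
        rank = row["model-cluster_ranking"]
        if rank <= max_best_structs:
            by_cluster[row["cluster_id"]][rank] = row["model"]
    # One output row per cluster; missing ranks become empty strings.
    return [
        {
            "cluster_id": cl_id,
            **{f"best{i}": models.get(i, "") for i in range(1, max_best_structs + 1)},
        }
        for cl_id, models in sorted(by_cluster.items())
    ]


# Hypothetical rows standing in for a loaded capri_ss.tsv table.
rows = [
    {"model": "a.pdb", "cluster_id": 1, "model-cluster_ranking": 1},
    {"model": "b.pdb", "cluster_id": 1, "model-cluster_ranking": 2},
    {"model": "c.pdb", "cluster_id": 2, "model-cluster_ranking": 1},
]
best = best_models_per_cluster(rows, max_best_structs=2)
```

Here cluster 2 has a single model, so its `best2` entry is an empty string.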
- haddock.libs.libplots.heatmap_plotly(matrix: ndarray[tuple[int, ...], dtype[float64]], labels: dict | None = None, xlabels: list | None = None, ylabels: list | None = None, color_scale: str = 'Greys_r', title: str | None = None, output_fname: Path = PosixPath('contacts.html'), offline: bool = False, hovertemplate: str | None = None, customdata: list[list[Any]] | None = None, delineation_traces: list[dict[str, float]] | None = None) Path [source]
Generate a plotly heatmap based on matrix content.
- Parameters:
matrix (NDFloat) – The 2D matrix containing data to be shown.
labels (dict) – Labels of the horizontal (x), vertical (y) and colorscale (color) axis.
xlabels (list) – List of column names.
ylabels (list) – List of row names.
color_scale (str) – Color scale to use.
title (str) – Title of the figure.
output_fname (Path) – Path to the output filename to generate.
hovertemplate (Optional[str]) – Custom string used to format data for hover annotation in plotly.
customdata (Optional[list[list[list[int]]]]) – A matrix of cluster ids, used for extra hover annotation in plotly.
delineation_traces (Optional[list[dict[str, float]]]) – A list of dict enabling to draw lines separating cluster ids.
- Returns:
output_fname (Path) – Path to the generated filename
- haddock.libs.libplots.in_capri(column: str, df_columns: Index) bool [source]
Check if the selected column is in the set of available columns.
- Parameters:
column (str) – column name
df_columns (pandas.DataFrame.columns) – columns of a pandas.DataFrame
- Returns:
resp (bool) – if True, the column is present
- haddock.libs.libplots.make_alascan_plot(df: DataFrame, clt_id: int, scan_res: str = 'ALA', offline: bool = False) None [source]
Make a plotly interactive plot.
Score components are here weighted by their respective contribution to the total score.
- Parameters:
df (pandas.DataFrame) – DataFrame containing the results of the alanine scan.
clt_id (int) – Cluster ID.
scan_res (str, optional) – Residue name used for the scan, by default “ALA”
- haddock.libs.libplots.make_traceback_plot(tr_subset, plot_filename, offline=False)[source]
Create a traceback barplot with the 40 best ranked models.
- Parameters:
tr_subset (pandas.DataFrame) – DataFrame containing the top traceback results
plot_filename (Path) – Path to the output filename to generate
- haddock.libs.libplots.offline_js_manager(fpath: str | Path, offline: bool) str [source]
Build string to access plotly javascript content.
- Parameters:
fpath (FilePath) – Path to the figure about to be written.
offline (bool) – if True use the offline approach.
- Returns:
plotly_js_import (str) – HTML solution for the importation of the plotly javascript content.
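The online/offline switch can be pictured with the sketch below. It is a simplified stand-in, not the actual `offline_js_manager`: the helper name `js_import_tag`, the extra `cdn_url` and `local_js` parameters, and the paths and URL in the example are all hypothetical.

```python
import os
from pathlib import Path


def js_import_tag(fpath, offline, cdn_url, local_js):
    """Build a <script> tag that loads the plotly javascript library."""
    if offline:
        # Point to a local copy, relative to the directory holding the figure.
        rel = os.path.relpath(local_js, start=Path(fpath).parent)
        src = Path(rel).as_posix()
    else:
        # Fetch the library from a CDN when the page is opened.
        src = cdn_url
    return f'<script src="{src}"></script>'


# Placeholder locations, chosen only for the example.
figure_path = "run/analysis/plot.html"
cdn = "https://example.org/plotly.min.js"
local_copy = "run/data/plotly.min.js"

offline_tag = js_import_tag(figure_path, True, cdn, local_copy)
online_tag = js_import_tag(figure_path, False, cdn, local_copy)
```

A relative `src` in offline mode keeps the report viewable after the run directory is moved or archived, provided the local copy moves with it.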
- haddock.libs.libplots.read_capri_table(capri_filename: str | Path, comment: str = '#') DataFrame [source]
Read capri table with pandas.
- Parameters:
capri_filename (str or Path) – capri single structure filename
comment (str) – the string used to denote a commented line in capri tables
- Returns:
capri_df (pandas DataFrame) – dataframe of capri values
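The effect of the `comment` argument can be shown with a standard-library sketch that reads a tab-separated table and skips commented lines. The helper name `read_tsv_rows` and the sample content are hypothetical. Unlike a pandas-based reader, this version returns every value as a string and only skips lines that start with the comment string.

```python
import csv
import io


def read_tsv_rows(handle, comment="#"):
    """Read a tab-separated table, ignoring lines that start with `comment`."""
    lines = (line for line in handle if not line.startswith(comment))
    return list(csv.DictReader(lines, delimiter="\t"))


# In-memory stand-in for a capri table file.
sample = io.StringIO(
    "# produced by an analysis step\n"
    "model\tscore\n"
    "a.pdb\t-10.5\n"
)
rows = read_tsv_rows(sample)
print(rows)  # [{'model': 'a.pdb', 'score': '-10.5'}]
```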
- haddock.libs.libplots.report_generator(boxes: list[Figure], scatters: list[Figure], tables: list, step: str, directory: str | Path = '.', offline: bool = False) None [source]
Create a figure including plots and tables.
The idea is to create a report.html file that includes all the plots and tables generated by the command analyse.
- Parameters:
boxes (list) – list of box plots generated by box_plot_handler
scatters (list) – list of scatter plots generated by scatter_plot_handler
tables (list) – a list including tables generated by clt_table_handler
directory (Path) – path to the output folder
offline (bool) – If True, the HTML will be generated for offline use.
- haddock.libs.libplots.report_plots_handler(plots, shared_xaxes=False, shared_yaxes=False)[source]
Create a figure that holds subplots.
The idea is that for each type (scatters or boxes), the individual plots are considered subplots. In the report, some of the axes are shared. The settings for sharing axes depend on the type (scatters or boxes).
- Parameters:
plots (list) – list of plots generated by analyse command
shared_xaxes (boolean or str (default False)) – a parameter of plotly.subplots.make_subplots
shared_yaxes (boolean or str (default False)) – a parameter of plotly.subplots.make_subplots
- Returns:
fig – an instance of plotly.graph_objects.Figure
- haddock.libs.libplots.scatter_plot_data(capri_df: DataFrame, cl_rank: dict[int, int]) tuple[DataFrameGroupBy, DataFrame] [source]
Retrieve scatter plot data.
- Parameters:
capri_df (pandas DataFrame) – capri table dataframe
cl_rank (dict) – {cluster_id : cluster_rank} dictionary
- Returns:
gb_cluster (pandas DataFrameGroupBy) – capri DataFrame grouped by cluster_id
gb_other (pandas DataFrame) – DataFrame of clusters not in the top cluster ranking
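The two return values can be pictured with the sketch below, which uses lists of dicts instead of pandas objects. The helper name `split_for_scatter` and the sample rows are hypothetical, and the grouping details of the real function may differ.

```python
from collections import defaultdict


def split_for_scatter(rows, cl_rank):
    """Group rows by cluster_id and collect the rows of unranked clusters."""
    grouped = defaultdict(list)
    other = []
    for row in rows:
        grouped[row["cluster_id"]].append(row)
        # Clusters missing from cl_rank are not in the top ranking.
        if row["cluster_id"] not in cl_rank:
            other.append(row)
    return dict(grouped), other


# Hypothetical rows and a {cluster_id: cluster_rank} mapping with one cluster.
rows = [
    {"model": "a.pdb", "cluster_id": 1},
    {"model": "b.pdb", "cluster_id": 2},
    {"model": "c.pdb", "cluster_id": 1},
]
grouped, other = split_for_scatter(rows, cl_rank={1: 1})
```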
- haddock.libs.libplots.scatter_plot_handler(capri_filename: str | Path, cl_rank: dict[int, int], format: Literal['png', 'pdf', 'svg', 'jpeg', 'webp'] | None, scale: float | None, offline: bool = False) list[Figure] [source]
Create scatter plots.
The idea is that for each pair of variables of interest (SCATTER_PAIRS, declared as global) we create a scatter plot.
If available, each scatter plot contains cluster information.
- Parameters:
capri_filename (str or Path) – capri single structure filename
cl_rank (dict) – {cluster_id : cluster_rank} dictionary
format (str) – Produce images in the selected format.
scale (float) – scale for images.
- Returns:
fig_list (list) – a list of figures
- haddock.libs.libplots.scatter_plot_plotly(gb_cluster: DataFrameGroupBy, gb_other: DataFrame, cl_rank: dict[int, int], x_ax: str, y_ax: str, colors: list[str], format: Literal['png', 'pdf', 'svg', 'jpeg', 'webp'] | None, scale: float | None, offline: bool = False) Figure [source]
Create a scatter plot in plotly.
- Parameters:
gb_cluster (pandas DataFrameGroupBy) – capri DataFrame grouped by cluster_id
gb_other (pandas DataFrame) – DataFrame of clusters not in the top cluster ranking
cl_rank (dict) – {cluster_id : cluster_rank} dictionary
x_ax (str) – name of the x column
y_ax (str) – name of the y column
colors (list) – list of colors to be used
format (str) – Produce images in the selected format.
scale (float) – scale for images.
- Returns:
fig – an instance of plotly.graph_objects.Figure