visualization

Visualization class

class dpet.visualization.Visualization(analysis: EnsembleAnalysis)

Bases: object

Visualization class for ensemble analysis.

Parameters:

analysis (EnsembleAnalysis) – An instance of EnsembleAnalysis providing data for visualization.

alpha_angles(bins: int = 50, save: bool = False, ax: Axes = None) Axes

Plot the distribution of alpha angles.

Parameters:
  • bins (int) – The number of bins for the histogram. Default is 50.

  • save (bool, optional) – If True, the plot will be saved as an image file. Default is False.

  • ax (plt.Axes, optional) – The axes on which to plot. Default is None, which creates a new figure and axes.

Returns:

The Axes object containing the plot.

Return type:

plt.Axes

asphericity(bins: int = 50, hist_range: Tuple = None, violin_plot: bool = True, location: str = 'mean', save: bool = False, color: str = 'blue', multiple_hist_ax: bool = False, ax: Union[None, Axes, ndarray, List[Axes]] = None) Axes

Plot asphericity distribution in each ensemble. Asphericity is calculated based on the gyration tensor.

Parameters:
  • bins (int, optional) – The number of bins for the histogram. Default is 50.

  • hist_range (Tuple, optional) – A tuple with a min and max value for the histogram. Default is None, which corresponds to using the min and max values across all data.

  • violin_plot (bool, optional) – If True, a violin plot is visualized. Default is True.

  • location (str, optional) – Select between “median” or “mean” or “both” to show in violin plot. Default value is “mean”.

  • save (bool, optional) – If True, the plot will be saved as an image file. Default is False.

  • color (str, optional) – Color of the violin plot. Default is blue.

  • multiple_hist_ax (bool, optional) – If True, each histogram will be plotted on separate axes. Default is False.

  • ax (Union[None, plt.Axes, np.ndarray, List[plt.Axes]], optional) – The axes on which to plot. Default is None, which creates a new figure and axes.

Returns:

The Axes object containing the plot.

Return type:

plt.Axes
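
As an illustration of the gyration tensor mentioned in the description above, the following pure-Python sketch builds the tensor for a single conformation. It is not dpet's implementation: the function name gyration_tensor, the toy coordinates, and the equal-mass assumption are introduced here for illustration only. The trace of the tensor equals the squared radius of gyration, which gives a quick sanity check.

```python
import math


def gyration_tensor(coords):
    """Return the 3x3 gyration tensor of a list of (x, y, z) points.

    Assumes equal masses for all points (an illustrative simplification).
    """
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    centered = [[p[k] - center[k] for k in range(3)] for p in coords]
    return [
        [sum(q[i] * q[j] for q in centered) / n for j in range(3)]
        for i in range(3)
    ]


# Toy conformation (hypothetical coordinates, arbitrary units).
points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, -2.0, 0.0)]
tensor = gyration_tensor(points)

# The trace of the gyration tensor is the squared radius of gyration.
rg = math.sqrt(tensor[0][0] + tensor[1][1] + tensor[2][2])
```

Shape descriptors such as asphericity are then functions of the eigenvalues of this tensor; the exact definition used by the library is not stated in this reference.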

average_distance_maps(ticks_fontsize: int = 14, cbar_fontsize: int = 14, title_fontsize: int = 14, dpi: int = 96, use_ylabel: bool = True, save: bool = False, ax: Union[None, List[List[Axes]], List[Axes]] = None) List[Axes]

Plot the average distance maps for selected ensembles.

Parameters:
  • ticks_fontsize (int, optional) – Font size for tick labels on the plot axes. Default is 14.

  • cbar_fontsize (int, optional) – Font size for labels on the color bar. Default is 14.

  • title_fontsize (int, optional) – Font size for titles of individual subplots. Default is 14.

  • dpi (int, optional) – Dots per inch (resolution) of the output figure. Default is 96.

  • use_ylabel (bool, optional) – If True, y-axis labels are displayed on the subplots. Default is True.

  • save (bool, optional) – If True, the plot will be saved as an image file. Default is False.

  • ax (Union[None, List[List[plt.Axes]], List[plt.Axes]], optional) – A list or 2D list of Axes objects to plot on. Default is None, which creates new axes.

Returns:

Returns a 1D list of Axes objects representing the subplot grid.

Return type:

List[plt.Axes]

Notes

This method plots the average distance maps for selected ensembles, where each distance map represents the average pairwise distances between residues in a protein structure.
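
The averaging described in the note above can be sketched in pure Python as follows. This is a minimal illustration rather than the library's code: the function name average_distance_map and the input layout (one coordinate per residue, one list per conformation) are assumptions made for the example.

```python
import math


def average_distance_map(conformations):
    """Average the pairwise distance matrix over all conformations.

    `conformations` is a list of conformations, each a list of (x, y, z)
    coordinates with one point per residue (hypothetical input layout).
    """
    n_res = len(conformations[0])
    total = [[0.0] * n_res for _ in range(n_res)]
    for conf in conformations:
        for i in range(n_res):
            for j in range(n_res):
                total[i][j] += math.dist(conf[i], conf[j])
    n_conf = len(conformations)
    return [[value / n_conf for value in row] for row in total]


# Two toy conformations of a three-residue chain (hypothetical values).
ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
]
dmap = average_distance_map(ensemble)
```

The resulting matrix is symmetric with zeros on the diagonal, which is what a heatmap of an average distance map displays.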

ca_com_distances(min_sep: int = 2, max_sep: Optional[int] = None, get_names: bool = True, inverse: bool = False, save: bool = False, ax: Union[None, Axes, ndarray, List[Axes]] = None) List[Axes]

Plot the distance maps comparing the center of mass (COM) and alpha-carbon (CA) distances within each ensemble.

Parameters:
  • min_sep (int, optional) – Minimum separation distance between atoms to consider. Default is 2.

  • max_sep (int or None, optional) – Maximum separation distance between atoms to consider. Default is None, which means no maximum separation.

  • get_names (bool, optional) – Whether to get the residue names for the features. Default is True.

  • inverse (bool, optional) – Whether to compute the inverse distances. Default is False.

  • figsize (tuple, optional) – Figure size in inches (width, height). Default is (6, 2.5).

  • save (bool, optional) – If True, save the plot as an image file. Default is False.

Returns:

A list containing Axes objects corresponding to the plots for CA and COM distances.

Return type:

List[plt.Axes]

Notes

This method plots the average distance maps for the center of mass (COM) and alpha-carbon (CA) distances within each ensemble. It computes the distance matrices for COM and CA atoms and then calculates their mean values to generate the distance maps. The plots include color bars indicating the distance range.

comparison_matrix(score: str, featurization_params: dict = {}, bootstrap_iters: int = None, bootstrap_frac: float = 1.0, bootstrap_replace: bool = True, confidence_level: float = 0.95, significance_level: float = 0.05, bins: Union[int, str] = 50, random_seed: int = None, verbose: bool = False, ax: Union[None, Axes] = None, figsize: Tuple[int] = (6.0, 5.0), dpi: int = 100, cmap: str = 'viridis_r', title: str = None, cbar_label: str = None, textcolors: Union[str, tuple] = ('black', 'white')) dict

Generates and visualizes the pairwise comparison matrix for the ensembles. This function computes the comparison matrix using the specified score type and feature. It then visualizes the matrix using a heatmap.

Parameters:
  • score, featurization_params, bootstrap_iters, bootstrap_frac, bootstrap_replace, bins, random_seed, verbose – See the documentation of EnsembleAnalysis.comparison_scores for more information about these arguments.

  • ax (Union[None, plt.Axes], optional) – Axes object on which to plot the comparison heatmap. If None (the default), a new Figure is created.

  • figsize (Tuple[int], optional) – The size of the figure for the heatmap. Default is (6.0, 5.0). Only takes effect if ax is None, that is, when a new figure is created.

  • dpi (int, optional) – DPI of the figure for the heatmap. Default is 100. Only takes effect if ax is None, that is, when a new figure is created.

  • confidence_level, significance_level, cmap, title, cbar_label, textcolors – See the documentation of dpet.visualization.plot_comparison_matrix for more information about these arguments.

Returns:

results – A dictionary containing the following keys:
  • ax: the Axes object with the comparison matrix heatmap.

  • scores: comparison matrix. See EnsembleAnalysis.comparison_scores for more information.

  • codes: codes of the ensembles that were compared.

  • fig: Figure object, only returned when a new figure is created inside this function.

Return type:

dict

Notes

The comparison matrix is annotated with the scores, and the axes are labeled with the ensemble labels.

contact_prob_maps(log_scale: bool = True, avoid_zero_count: bool = False, threshold: float = 0.8, dpi: int = 96, save: bool = False, cmap_color: str = 'Blues', ax: Union[None, List[Axes], ndarray] = None) Union[List[Axes], ndarray]

dimensionality_reduction_scatter(color_by: str = 'rg', save: bool = False, ax: Union[None, List[Axes]] = None, kde_by_ensemble: bool = False, size: int = 10, plotly=False, n_comp=2) List[Axes]

Plot the results of dimensionality reduction using the method specified in the analysis.

Parameters:
  • color_by (str, optional) – The feature extraction method used for coloring points in the scatter plot. Options are “rg”, “prolateness”, “asphericity”, “sasa”, and “end_to_end”. Default is “rg”.

  • save (bool, optional) – If True, the plot will be saved in the data directory. Default is False.

  • ax (Union[None, List[plt.Axes]], optional) – A list of Axes objects to plot on. Default is None, which creates new axes.

  • kde_by_ensemble (bool, optional) – If True, the KDE plot will be generated for each ensemble separately. If False, a single KDE plot will be generated for the concatenated ensembles. Default is False.

Returns:

List containing Axes objects for the scatter plot of original labels, clustering labels, and feature-colored labels, respectively.

Return type:

List[plt.Axes]

Raises:

NotImplementedError – If the dimensionality reduction method specified in the analysis is not supported.

end_to_end_distances(rg_norm: bool = False, bins: int = 50, hist_range: Tuple = None, violin_plot: bool = True, location: str = 'mean', dpi=96, save: bool = False, color: str = 'blue', multiple_hist_ax=False, ax: Union[None, Axes, ndarray, List[Axes]] = None) Union[Axes, List[Axes]]

Plot end-to-end distance distributions.

Parameters:
  • rg_norm (bool, optional) – If True, normalize end-to-end distances by the average radius of gyration. Default is False.

  • bins (int, optional) – The number of bins for the histogram. Default is 50.

  • hist_range (Tuple, optional) – A tuple with a min and max value for the histogram. Default is None, which corresponds to using the min and max values across all data.

  • violin_plot (bool, optional) – If True, a violin plot is visualized. Default is True.

  • location (str, optional) – Select between “median”, “mean”, or “both” to show in violin plot. Default is “mean”.

  • save (bool, optional) – If True, the plot will be saved as an image file. Default is False.

  • ax (Union[None, plt.Axes, np.ndarray, List[plt.Axes]], optional) – The axes on which to plot. Default is None, which creates a new figure and axes.

  • color (str, optional) – Color of the violin plot. Default is blue.

  • multiple_hist_ax (bool, optional) – If True, each histogram will be plotted on separate axes. Default is False.

Returns:

The Axes object or a list of Axes objects containing the plot(s).

Return type:

Union[plt.Axes, List[plt.Axes]]

global_sasa(bins: int = 50, hist_range: Tuple = None, violin_plot: bool = True, location: str = 'mean', save: bool = False, dpi=96, color: str = 'blue', multiple_hist_ax: bool = False, ax: Union[None, Axes, ndarray, List[Axes]] = None) Axes

Plot the distribution of SASA for each conformation within the ensembles.

Parameters:
  • bins (int, optional) – The number of bins for the histogram. Default is 50.

  • hist_range (Tuple, optional) – A tuple with a min and max value for the histogram. Default is None, which corresponds to using the min and max values across all data.

  • violin_plot (bool, optional) – If True, a violin plot is visualized. Default is True.

  • location (str, optional) – Select between “median” or “mean” or “both” to show in violin plot. Default is “mean”.

  • save (bool, optional) – If True, the plot will be saved in the data directory. Default is False.

  • color (str, optional) – Color of the violin plot. Default is blue.

  • multiple_hist_ax (bool, optional) – If True, each histogram will be plotted on separate axes. Default is False.

  • ax (Union[None, plt.Axes, np.ndarray, List[plt.Axes]], optional) – The matplotlib Axes object on which to plot. If None, a new Axes object will be created. Default is None.

Returns:

The Axes object containing the plot.

Return type:

plt.Axes

pca_1d_histograms(save: bool = False, sel_dim=1, ax: Union[None, List[Axes]] = None) List[Axes]

Plot 1D histogram when the dimensionality reduction method is “pca” or “kpca”.

Parameters:
  • save (bool, optional) – If True, the plot will be saved in the data directory. Default is False.

  • ax (Union[None, List[plt.Axes]], optional) – A list of Axes objects to plot on. Default is None, which creates new axes.

  • sel_dim (int, optional) – The component (dimension) whose histogram distribution is visualized. Default is 1.

Returns:

A list of plt.Axes objects representing the subplots created.

Return type:

List[plt.Axes]

pca_2d_landscapes(save: bool = False, ax: Union[None, List[Axes]] = None) List[Axes]

Plot 2D landscapes when the dimensionality reduction method is “pca” or “kpca”.

Parameters:
  • save (bool, optional) – If True, the plot will be saved in the data directory. Default is False.

  • ax (Union[None, List[plt.Axes]], optional) – A list of Axes objects to plot on. Default is None, which creates new axes.

Returns:

A list of plt.Axes objects representing the subplots created.

Return type:

List[plt.Axes]

pca_cumulative_explained_variance(save: bool = False, ax: Union[None, Axes] = None) Axes

Plot the cumulative explained variance. Only applicable when the dimensionality reduction method is “pca”.

Parameters:
  • save (bool, optional) – If True, the plot will be saved in the data directory. Default is False.

  • ax (Union[None, plt.Axes], optional) – An Axes object to plot on. Default is None, which creates a new axes.

Returns:

The Axes object for the cumulative explained variance plot.

Return type:

plt.Axes

pca_residue_correlation(sel_dims: List[int], save: bool = False, ax: Union[None, List[Axes]] = None) List[Axes]

Plot the correlation between residues based on PCA weights.

Parameters:
  • sel_dims (List[int]) – A list of indices specifying the PCA dimensions to include in the plot.

  • save (bool, optional) – If True, the plot will be saved as an image file. Default is False.

  • ax (Union[None, List[plt.Axes]], optional) – A list of Axes objects to plot on. Default is None, which creates new axes.

Returns:

A list of plt.Axes objects representing the subplots created.

Return type:

List[plt.Axes]

Notes

This method generates a correlation plot showing the weights of pairwise residue distances for selected PCA dimensions. The plot visualizes the correlation between residues based on the PCA weights.

The analysis is only valid on PCA and kernel PCA dimensionality reduction with ‘ca_dist’ feature extraction.

pca_rg_correlation(save: bool = False, ax: Union[None, List[Axes]] = None) List[Axes]

Examine and plot the correlation between PC dimension 1 and the radius of gyration (Rg). A high correlation is typically observed for IDPs/IDRs.

Parameters:
  • save (bool, optional) – If True, the plot will be saved in the data directory. Default is False.

  • ax (Union[None, List[plt.Axes]], optional) – A list of Axes objects to plot on. Default is None, which creates new axes.

Returns:

A list of plt.Axes objects representing the subplots created.

Return type:

List[plt.Axes]

per_residue_mean_sasa(figsize: Tuple[int, int] = (15, 5), pointer: List[int] = None, save: bool = False, ax: Union[None, Axes] = None) Axes

Plot the average solvent-accessible surface area (SASA) for each residue among all conformations in an ensemble.

Parameters:
  • figsize (Tuple[int, int], optional) – Tuple specifying the size of the figure. Default is (15, 5).

  • pointer (List[int], optional) – List of desired residues to highlight with vertical dashed lines. Default is None.

  • save (bool, optional) – If True, the plot will be saved as an image file. Default is False.

  • ax (Union[None, plt.Axes], optional) – The matplotlib Axes object on which to plot. If None, a new Axes object will be created. Default is None.

Returns:

Axes object containing the plot.

Return type:

plt.Axes

plot_histogram_grid(feature: str = 'ca_dist', ids: Union[ndarray, List[list]] = None, n_rows: int = 2, n_cols: int = 3, subplot_width: int = 2.0, subplot_height: int = 2.2, bins: Union[str, int] = None, dpi: int = 90) Axes

Plot a grid of histograms for distance or angular features. Can only be used when analyzing ensembles of proteins with the same number of residues. The function will create a new matplotlib figure for the histogram grid.

Parameters:
  • feature (str, optional) – Feature to analyze. Must be one of ca_dist (Ca-Ca distances), a_angle (alpha angles), phi or psi (phi or psi backbone angles).

  • ids (Union[list, List[list]], optional) – Residue indices (integers starting from zero) to define the residues to analyze. For angular features it must be a 1d list with N indices of the residues. For distance features it must be a 2d list/array of shape (N, 2), in which N is the number of residue pairs to analyze and 2 is the number of indices in each pair. Each of the N indices (or pairs of indices) will be plotted in a histogram of the grid. If this argument is not provided, random indices will be sampled, which is useful for quickly comparing the distance or angle distributions of multiple ensembles.

  • n_rows (int, optional) – Number of rows in the histogram grid.

  • n_cols (int, optional) – Number of columns in the histogram grid.

  • subplot_width (int, optional) – Width of each subplot, used to set the Matplotlib figure size. The size of the figure will be calculated as: figsize = (n_cols*subplot_width, n_rows*subplot_height).

  • subplot_height (int, optional) – See the subplot_width argument.

  • bins (Union[str, int], optional) – Number of bins in all the histograms.

  • dpi (int, optional) – DPI of the figure.

Returns:

ax – The Axes object for the histogram grid.

Return type:

plt.Axes
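
To make the figsize formula and the expected layout of ids more concrete, here is a small pure-Python sketch. The helper names grid_figsize and sample_pair_ids are hypothetical and are not part of dpet; the random sampling shown is only a stand-in for whatever scheme the library uses when ids is omitted.

```python
import random


def grid_figsize(n_rows=2, n_cols=3, subplot_width=2.0, subplot_height=2.2):
    """Figure size following the formula given for subplot_width above."""
    return (n_cols * subplot_width, n_rows * subplot_height)


def sample_pair_ids(n_residues, n_pairs, seed=0):
    """Draw distinct zero-based residue pairs (i, j) with i < j.

    Illustrative stand-in for the random sampling used when `ids` is
    omitted; the library's actual sampling scheme may differ.
    """
    rng = random.Random(seed)
    all_pairs = [
        (i, j) for i in range(n_residues) for j in range(i + 1, n_residues)
    ]
    return rng.sample(all_pairs, n_pairs)


figsize = grid_figsize()  # default 2 x 3 grid
ids = sample_pair_ids(n_residues=20, n_pairs=2 * 3)  # shape (N, 2) with N = 6
```

For a distance feature such as ca_dist, a list like ids has the (N, 2) shape described above, with one pair per histogram in the grid. For angular features a flat list of N residue indices would be used instead.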

plot_rama_grid(ids: Union[ndarray, List[list]] = None, n_rows: int = 2, n_cols: int = 3, subplot_width: int = 2.0, subplot_height: int = 2.2, dpi: int = 90) Axes

Plot a grid of Ramachandran plots for different residues. Can only be used when analyzing ensembles of proteins with the same number of residues. The function will create a new matplotlib figure for the scatter plot grid.

Parameters:
  • ids (Union[list, List[list]], optional) – Residue indices (integers starting from zero) to define the residues to analyze. For angular features it must be a 1d list with N indices of the residues. Each of the N indices will be plotted in a scatter plot in the grid. If this argument is not provided, random indices will be sampled, which is useful for quickly comparing features of multiple ensembles.

  • n_rows (int, optional) – Number of rows in the scatter grid.

  • n_cols (int, optional) – Number of columns in the scatter grid.

  • subplot_width (int, optional) – Width of each subplot, used to set the Matplotlib figure size. The size of the figure will be calculated as: figsize = (n_cols*subplot_width, n_rows*subplot_height).

  • subplot_height (int, optional) – See the subplot_width argument.

  • dpi (int, optional) – DPI of the figure.

Returns:

ax – The Axes object for the scatter plot grid.

Return type:

plt.Axes

prolateness(bins: int = 50, hist_range: Tuple = None, violin_plot: bool = True, location: str = 'mean', save: bool = False, color: str = 'blue', multiple_hist_ax: bool = False, ax: Union[None, Axes, ndarray, List[Axes]] = None) Axes

Plot prolateness distribution in each ensemble. Prolateness is calculated based on the gyration tensor.

Parameters:
  • bins (int, optional) – The number of bins for the histogram. Default is 50.

  • hist_range (Tuple, optional) – A tuple with a min and max value for the histogram. Default is None, which corresponds to using the min and max values across all data.

  • violin_plot (bool, optional) – If True, a violin plot is visualized. Default is True.

  • location (str, optional) – Select between “median”, “mean”, or “both” to show in violin plot. Default is “mean”.

  • save (bool, optional) – If True, the plot will be saved as an image file. Default is False.

  • color (str, optional) – Color of the violin plot. Default is blue.

  • multiple_hist_ax (bool, optional) – If True, each histogram will be plotted on separate axes. Default is False.

  • ax (Union[None, plt.Axes, np.ndarray, List[plt.Axes]], optional) – The axes on which to plot. Default is None, which creates a new figure and axes.

Returns:

The Axes object containing the plot.

Return type:

plt.Axes

radius_of_gyration(bins: int = 50, hist_range: Tuple = None, multiple_hist_ax: bool = False, violin_plot: bool = False, location: str = 'mean', dpi: int = 96, save: bool = False, ax: Union[None, Axes, ndarray, List[Axes]] = None, color: str = 'blue') Union[Axes, List[Axes]]

Plot the distribution of the radius of gyration (Rg) within each ensemble.

Parameters:
  • bins (int, optional) – The number of bins for the histogram. Default is 50.

  • hist_range (Tuple, optional) – A tuple with a min and max value for the histogram. Default is None, which corresponds to using the min and max value across all data.

  • multiple_hist_ax (bool, optional) – If True, each histogram will be plotted on separate axes. Default is False.

  • violin_plot (bool, optional) – If True, a violin plot is visualized. Default is False.

  • location (str, optional) – Select between “median”, “mean”, or “both” to show in violin plot. Default is “mean”.

  • dpi (int, optional) – The DPI (dots per inch) of the output figure. Default is 96.

  • save (bool, optional) – If True, the plot will be saved as an image file. Default is False.

  • ax (Union[None, plt.Axes, np.ndarray, List[plt.Axes]], optional) – The axes on which to plot. If None, new axes will be created. Default is None.

Returns:

Returns a single Axes object or a list of Axes objects containing the plot(s).

Return type:

Union[plt.Axes, List[plt.Axes]]

Notes

This method plots the distribution of the radius of gyration (Rg) within each ensemble in the analysis.

The Rg values are binned according to the specified number of bins (bins) and range (hist_range) and displayed as histograms. Additionally, dashed lines representing the mean and median Rg values are overlaid on each histogram.
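
The binning described in the notes above can be illustrated with a short pure-Python sketch. It is not the plotting code itself: the helper names shared_range and histogram_counts and the toy Rg values are assumptions for the example. It shows how a common range across ensembles (the behavior documented for hist_range=None) keeps the bins comparable, and how the mean and median that are drawn as dashed lines can be obtained.

```python
import statistics


def shared_range(datasets):
    """Min and max across all datasets, as documented for hist_range=None."""
    flat = [x for data in datasets for x in data]
    return min(flat), max(flat)


def histogram_counts(data, bins, value_range):
    """Count values in `bins` equal-width bins spanning `value_range`."""
    low, high = value_range
    width = (high - low) / bins
    counts = [0] * bins
    for x in data:
        if low <= x <= high:
            # The maximum value is placed in the last bin.
            index = min(int((x - low) / width), bins - 1)
            counts[index] += 1
    return counts


# Toy Rg values for two ensembles (hypothetical codes and numbers).
rg_by_ensemble = {"ens_a": [1.0, 1.5, 2.0, 2.5], "ens_b": [2.0, 3.0, 5.0]}
value_range = shared_range(rg_by_ensemble.values())

for code, values in rg_by_ensemble.items():
    counts = histogram_counts(values, bins=4, value_range=value_range)
    print(code, counts, statistics.mean(values), statistics.median(values))
```

Passing an explicit hist_range instead would simply replace the computed value_range in this sketch.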

ramachandran_plots(two_d_hist: bool = True, linespaces: Tuple = (- 180, 180, 80), save: bool = False, ax: Union[None, Axes, ndarray, List[Axes]] = None) Union[List[Axes], Axes]

Ramachandran plot. If two_d_hist=True it returns a 2D histogram for each ensemble. If two_d_hist=False it returns a simple scatter plot for all ensembles in one plot.

Parameters:
  • two_d_hist (bool, optional) – If True, it returns a 2D histogram for each ensemble. Default is True.

  • linespaces (tuple, optional) – You can customize the bins for 2D histogram. Default is (-180, 180, 80).

  • save (bool, optional) – If True, the plot will be saved as an image file. Default is False.

  • ax (Union[None, plt.Axes, np.ndarray, List[plt.Axes]], optional) – The axes on which to plot. If None, new axes will be created. Default is None.

Returns:

If two_d_hist=True, returns a list of Axes objects representing the subplot grid for each ensemble. If two_d_hist=False, returns a single Axes object representing the scatter plot for all ensembles.

Return type:

Union[List[plt.Axes], plt.Axes]
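
The default linespaces value of (-180, 180, 80) can be read as a (start, stop, num) triple. The sketch below assumes, without confirmation from this reference, that the tuple is interpreted in the manner of numpy.linspace to produce bin edges for both the phi and psi axes. The helper linspace is a pure-Python stand-in written for this example.

```python
def linspace(start, stop, num):
    """Pure-Python equivalent of numpy.linspace with the endpoint included."""
    step = (stop - start) / (num - 1)
    values = [start + i * step for i in range(num - 1)]
    values.append(float(stop))  # set the endpoint exactly
    return values


# Assumption: the tuple is unpacked as (start, stop, num) to build bin edges.
edges = linspace(-180, 180, 80)
n_bins = len(edges) - 1
```

Under this assumption, 80 edges yield 79 bins per axis, so increasing the third element gives a finer 2D histogram.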

relative_dssp_content(dssp_code='H', save: bool = False, ax: Axes = None) Axes

Plot the relative secondary structure (SS) content in each ensemble for each residue.

Parameters:
  • save (bool, optional) – If True, the plot will be saved in the data directory. Default is False.

  • ax (plt.Axes, optional) – The axes on which to plot. Default is None, which creates a new figure and axes.

  • dssp_code (str, optional) – The DSSP code to plot: ‘H’ for helix, ‘C’ for coil, or ‘E’ for strand, following the simplified DSSP codes. Default is ‘H’.

Returns:

The Axes object containing the plot.

Return type:

plt.Axes

rg_vs_asphericity(save: bool = False, ax: Axes = None) Axes

Plot the Rg versus Asphericity and get the Pearson correlation coefficient to evaluate the correlation between Rg and Asphericity.

Parameters:
  • save (bool, optional) – If True, the plot will be saved in the data directory. Default is False.

  • ax (plt.Axes, optional) – The axes on which to plot. Default is None, which creates a new figure and axes.

Returns:

The Axes object containing the plot.

Return type:

plt.Axes
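
For reference, the Pearson correlation coefficient reported by this method can be computed as in the following pure-Python sketch. The function name pearson_r and the toy values are illustrative only; the library may rely on a different routine to obtain the same quantity.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy per-conformation values (hypothetical numbers).
rg_values = [1.0, 2.0, 3.0, 4.0]
asphericity_values = [0.2, 0.4, 0.6, 0.8]
r = pearson_r(rg_values, asphericity_values)
```

A value of r close to 1 indicates that more extended conformations (larger Rg) also tend to be more aspherical, while values near 0 indicate no linear relationship.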

rg_vs_prolateness(save: bool = False, ax: Axes = None) Axes

Plot the Rg versus Prolateness and get the Pearson correlation coefficient to evaluate the correlation between Rg and Prolateness.

Parameters:
  • save (bool, optional) – If True, the plot will be saved in the data directory. Default is False.

  • ax (plt.Axes, optional) – The axes on which to plot. Default is None, which creates a new figure and axes.

Returns:

The Axes object containing the plot.

Return type:

plt.Axes

ss_flexibility(pointer: List[int] = None, figsize: Tuple[int, int] = (15, 5), save: bool = False, ax: Union[None, Axes] = None) Axes

Generate a plot of the site-specific flexibility parameter.

This plot shows the site-specific measure of disorder, which is sensitive to local flexibility based on the circular variance of the Ramachandran angles φ and ψ for each residue in the ensemble. The score ranges from 0 for identical dihedral angles for all conformers at the residue i to 1 for a uniform distribution of dihedral angles at the residue i.

Parameters:
  • pointer (List[int], optional) – A list of desired residues. Vertical dashed lines will be added to point to these residues. Default is None.

  • figsize (Tuple[int, int], optional) – The size of the figure. Default is (15, 5).

  • save (bool, optional) – If True, save the plot as an image file. Default is False.

  • ax (Union[None, plt.Axes], optional) – The matplotlib Axes object on which to plot. If None, a new Axes object will be created. Default is None.

Returns:

The matplotlib Axes object containing the plot.

Return type:

plt.Axes
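
The circular variance that underlies the flexibility parameter can be sketched for a single dihedral angle as follows. This is a generic textbook definition given under stated assumptions: the function name circular_variance is introduced here, and how the library combines φ and ψ into one score is not specified in this reference.

```python
import math


def circular_variance(angles_deg):
    """Circular variance 1 - R of angles in degrees.

    R is the length of the mean of the unit vectors (cos a, sin a).
    """
    n = len(angles_deg)
    mean_cos = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    mean_sin = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    return 1.0 - math.hypot(mean_cos, mean_sin)


# Identical angles across conformers: no spread.
rigid = circular_variance([-60.0] * 5)

# Angles spread evenly around the circle: maximal spread.
disordered = circular_variance([0.0, 90.0, 180.0, 270.0])
```

The two toy cases reproduce the limits stated in the description: a value near 0 when all conformers share the same angle and a value near 1 when the angles are uniformly distributed.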

ss_order(pointer: List[int] = None, figsize: Tuple[int, int] = (15, 5), save: bool = False, ax: Union[None, Axes] = None) Axes

Generate a plot of the site-specific order parameter.

This plot shows the site-specific order parameter, which abstracts from local chain flexibility. The parameter is still site-specific, as orientation correlations in IDRs and IDPs decrease with increasing sequence distance.

Parameters:
  • pointer (List[int], optional) – A list of desired residues. Vertical dashed lines will be added to point to these residues. Default is None.

  • figsize (Tuple[int, int], optional) – The size of the figure. Default is (15, 5).

  • save (bool, optional) – If True, the plot will be saved as an image file. Default is False.

  • ax (Union[None, plt.Axes], optional) – The matplotlib Axes object on which to plot. If None, a new Axes object will be created. Default is None.

Returns:

The matplotlib Axes object containing the plot.

Return type:

plt.Axes

dpet.visualization.plot_comparison_matrix(ax: Axes, comparison_out: ndarray, codes: List[str], confidence_level: float = 0.95, significance_level: float = 0.05, cmap: str = 'viridis_r', title: str = 'New Comparison', cbar_label: str = 'score', textcolors: Union[str, tuple] = ('black', 'white'))

Plot a matrix with all-vs-all comparison scores of M ensembles as a heatmap. If plotting the results of a regular all-vs-all analysis (no bootstrapping involved), it will just plot the M x M comparison scores, with empty values on the diagonal. If plotting the results of an all-vs-all analysis with bootstrapping, it will plot the M x M confidence intervals for the scores. The intervals are obtained by using the ‘percentile’ method. Additionally, it will plot an asterisk for those non-diagonal entries for which the inter-ensemble scores are significantly higher than the intra-ensemble scores according to a Mann–Whitney U test.

Parameters:
  • ax (plt.Axes) – Axes object where the heatmap should be created.

  • comparison_out (dict) – A dictionary containing the output of the comparison_scores method of the dpet.ensemble_analysis.EnsembleAnalysis class. It must contain the following key-value pairs: scores: NumPy array with shape (M, M, B) containing the comparison scores for M ensembles and B bootstrap iterations. If no bootstrap analysis was performed, B = 1; otherwise B > 1. p_values (optional): used only when a bootstrap analysis was performed. An (M, M) NumPy array storing the p-values obtained by comparing the inter-ensemble and intra-ensemble comparison scores with a statistical test.

  • codes (List[str]) – List of strings with the codes of the ensembles.

  • confidence_level (float, optional) – Confidence level for the bootstrap intervals of the comparison scores.

  • significance_level (float, optional) – Significance level for the statistical test used to compare inter and intra-ensemble comparison scores.

  • cmap (str, optional) – Matplotlib colormap name to use in the heatmap.

  • title (str, optional) – Title of the heatmap.

  • cbar_label (str, optional) – Label of the colorbar.

  • textcolors (Union[str, tuple], optional) – Color of the text for each cell of the heatmap, specified as a string. By providing a tuple with two elements, the two colors will be applied to cells with color intensity above/below a certain threshold, so that lighter text can be plotted in darker cells and darker text can be plotted in lighter cells.

Returns:

ax – The same updated Axes object from the input. The comparison_out will be updated to store confidence intervals if performing a bootstrap analysis.

Return type:

plt.Axes

Notes

The comparison matrix is annotated with the scores, and the axes are labeled with the ensemble labels.
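
The ‘percentile’ method mentioned in the description of this function can be sketched in pure Python as follows. This is an illustrative reconstruction, not the library's code: the helper names percentile and percentile_interval, the linear interpolation rule, and the toy bootstrap scores are assumptions made for the example.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def percentile_interval(scores, confidence_level=0.95):
    """Percentile bootstrap interval for a list of bootstrap scores."""
    alpha = 1.0 - confidence_level
    ordered = sorted(scores)
    return (
        percentile(ordered, 100 * alpha / 2),
        percentile(ordered, 100 * (1 - alpha / 2)),
    )


# Toy bootstrap scores for one pair of ensembles (hypothetical values 0..100).
bootstrap_scores = [float(i) for i in range(101)]
low, high = percentile_interval(bootstrap_scores, confidence_level=0.95)
```

With confidence_level=0.95 the interval spans the 2.5th to the 97.5th percentile of the B bootstrap scores stored along the last axis of the scores array for each pair of ensembles.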

dpet.visualization.plot_histogram(ax: Axes, data: List[ndarray], labels: List[str], bins: Union[int, List] = 50, range: Tuple = None, title: str = 'Histogram', xlabel: str = 'x', ylabel: str = 'Density')

Plot a histogram for different features.

Parameters:
  • ax (plt.Axes) – Matplotlib axis object where the histograms will be plotted.

  • data (List[np.array]) – List of NumPy arrays storing the data to be plotted.

  • labels (List[str]) – List of strings with the labels of the arrays.

  • bins – Number of bins.

  • range (Tuple, optional) – A tuple with a min and max value for the histogram. Default is None, which corresponds to using the min and max values across all data.

  • title (str, optional) – Title of the axis object.

  • xlabel (str, optional) – Label of the horizontal axis.

  • ylabel (str, optional) – Label of the vertical axis.

Returns:

Axis objects for the histogram plot of original labels.

Return type:

plt.Axes

dpet.visualization.plot_violins(ax: Axes, data: List[ndarray], labels: List[str], location: str = 'mean', title: str = 'Histogram', xlabel: str = 'x', color: str = 'blue')

Make a violin plot.

Parameters:
  • ax (plt.Axes) – Matplotlib axis object where the violins will be plotted.

  • data (List[np.array]) – List of NumPy arrays storing the data to be plotted.

  • labels (List[str]) – List of strings with the labels of the arrays.

  • location (str, optional) – Select between “median” or “mean” to show in violin plot. Default is “mean”.

  • title (str, optional) – Title of the axis object.

  • xlabel (str, optional) – Label of the horizontal axis.

Returns:

Axis object for the violin plot.

Return type:

plt.Axes