dpet.featurization package

Submodules

dpet.featurization.angles module

dpet.featurization.angles.featurize_a_angle(traj: Trajectory, get_names: bool = True, atom_selector: str = 'protein and name CA') Union[ndarray, Tuple[ndarray, List[str]]]

Calculate Cα-based torsion angles from a given MD trajectory.

Parameters:
  • traj (mdtraj.Trajectory) – The MDTraj trajectory containing the protein structure.

  • get_names (bool, optional) – If True, returns feature names along with the alpha angles. Default is True.

  • atom_selector (str, optional) – The atom selection string for C-alpha atoms. Default is “protein and name CA”.

Returns:

If get_names is True, returns a tuple with alpha angle values and feature names. If get_names is False, returns an array of alpha angle values.

Return type:

Union[np.ndarray, Tuple[np.ndarray, List[str]]]

Notes

This function calculates Cα-based torsion angles from a given MD trajectory.
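To illustrate what a Cα-based torsion is, the sketch below computes the dihedral defined by each window of four consecutive Cα atoms using NumPy only. It assumes the Cα coordinates have already been extracted into an array of shape (n_frames, n_ca, 3). The helper names `dihedral` and `ca_torsions` are hypothetical and are not part of dpet, whose internal implementation may differ.

```python
import numpy as np


def dihedral(p0, p1, p2, p3):
    """Signed dihedral (radians) for point arrays of shape (..., 3)."""
    b1 = p1 - p0
    b2 = p2 - p1
    b3 = p3 - p2
    n1 = np.cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = np.cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = np.linalg.norm(b2, axis=-1) * np.sum(b1 * n2, axis=-1)
    x = np.sum(n1 * n2, axis=-1)
    return np.arctan2(y, x)


def ca_torsions(ca_xyz):
    """Hypothetical helper: (n_frames, n_ca, 3) -> (n_frames, n_ca - 3)."""
    # Slide a window of four consecutive atoms along the chain.
    return dihedral(ca_xyz[:, :-3], ca_xyz[:, 1:-2], ca_xyz[:, 2:-1], ca_xyz[:, 3:])


# Random coordinates stand in for real C-alpha positions.
rng = np.random.default_rng(0)
ca_xyz = rng.normal(size=(5, 10, 3))
angles = ca_torsions(ca_xyz)
print(angles.shape)  # (5, 7)
```

A chain of L Cα atoms yields L - 3 such torsions per frame, which is why the feature array is shorter than the sequence.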

dpet.featurization.angles.featurize_phi_psi(traj: Trajectory, get_names: bool = True, ravel: bool = True) Union[ndarray, Tuple[ndarray, List[str]]]

Calculate phi (ϕ) and psi (ψ) angles from a given MD trajectory.

Parameters:
  • traj (mdtraj.Trajectory) – The MDTraj trajectory containing the protein structure.

  • get_names (bool, optional) – If True, returns feature names along with the phi and psi angles. Default is True.

  • ravel (bool, optional) – If True, returns the angles in a flattened array. Default is True.

Returns:

If get_names is True, returns a tuple with phi and psi angles along with feature names. If get_names is False, returns an array of phi and psi angles.

Return type:

Union[np.ndarray, Tuple[np.ndarray, List[str]]]

Notes

This function calculates phi (ϕ) and psi (ψ) angles from a given MD trajectory.

dpet.featurization.angles.featurize_tr_angle(traj: Trajectory, type: str, min_sep: int = 2, max_sep: Union[None, int, float] = None, ravel: bool = True, get_names: bool = True) array

Calculate trRosetta angles between pairs of residues.

Parameters:
  • traj (mdtraj.Trajectory) – The MDTraj trajectory object.

  • type (str) – The type of angle to calculate. Supported options: ‘omega’ or ‘phi’.

  • min_sep (int, optional) – The minimum sequence separation for angle calculations. Default is 2.

  • max_sep (Union[None, int, float], optional) – The maximum sequence separation for angle calculations. Default is None.

  • ravel (bool, optional) – Whether to flatten the output array. Default is True.

  • get_names (bool, optional) – Whether to return the names of the calculated features. Default is True.

Returns:

angles – A numpy array storing the angle values.

Return type:

numpy.ndarray

Notes

This function calculates trRosetta angles between pairs of residues. For more information, refer to the original trRosetta paper (https://pubmed.ncbi.nlm.nih.gov/31896580/).

dpet.featurization.angles.featurize_tr_omega(traj: Trajectory, min_sep: int = 2, max_sep: int = None, ravel: bool = True, get_names: bool = True) Union[ndarray, Tuple[ndarray, List[str]]]

Calculate omega angles from trRosetta. These angles are torsion angles defined between a pair of residues i and j and involving the following atoms:

Ca(i) – Cb(i) – Cb(j) – Ca(j)

If a residue does not have a Cb atom, a pseudo-Cb will be added automatically.

Parameters:
  • traj (mdtraj.Trajectory) – The MDTraj trajectory containing the protein structure.

  • min_sep (int, optional) – The minimum separation between residues for computing omega angles. Default is 2.

  • max_sep (int, optional) – The maximum separation between residues for computing omega angles. Default is None.

  • ravel (bool, optional) – If True, returns a flattened array of omega angle values. Default is True.

  • get_names (bool, optional) – If True, returns feature names along with the omega angles. Default is True.

Returns:

If ravel is True, returns a flattened array of omega angle values. If ravel is False, returns a 3D array of omega angle values. If get_names is True, returns a tuple with feature names along with the omega angles.

Return type:

Union[np.ndarray, Tuple[np.ndarray, List[str]]]

Notes

This function calculates omega angles from trRosetta for a given MD trajectory.
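The pseudo-Cb mentioned above can be placed from the backbone N, Cα and C atoms. The sketch below uses the ideal-geometry construction commonly associated with trRosetta. Whether dpet uses exactly these coefficients is an assumption, and the function name `pseudo_cb` is hypothetical. The coefficient on the cross-product term depends on the length unit: the values here assume coordinates in ångström, whereas MDTraj stores coordinates in nanometers.

```python
import numpy as np


def pseudo_cb(n, ca, c):
    """Place a virtual C-beta from backbone atoms; inputs of shape (..., 3), in angstrom."""
    b = ca - n
    c_vec = c - ca
    a = np.cross(b, c_vec)  # perpendicular to the N-CA-C plane
    return -0.58273431 * a + 0.56802827 * b - 0.54067466 * c_vec + ca


# Idealized backbone geometry for one residue (angstrom), used only as a demo.
n_xyz = np.array([-0.525, 1.363, 0.0])
ca_xyz = np.array([0.0, 0.0, 0.0])
c_xyz = np.array([1.526, 0.0, 0.0])

cb_xyz = pseudo_cb(n_xyz, ca_xyz, c_xyz)
ca_cb_distance = np.linalg.norm(cb_xyz - ca_xyz)  # roughly a C-C single bond length
```

This is mainly relevant for glycine, which has no Cb atom of its own.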

dpet.featurization.angles.featurize_tr_phi(traj: Trajectory, min_sep: int = 2, max_sep: int = None, ravel: bool = True, get_names: bool = True) Union[ndarray, Tuple[ndarray, List[str]]]

Calculate phi angles from trRosetta. These angles are defined between a pair of residues i and j and involve the following atoms:

Ca(i) – Cb(i) – Cb(j)

If a residue does not have a Cb atom, a pseudo-Cb will be added automatically.

Parameters:
  • traj (mdtraj.Trajectory) – The MDTraj trajectory containing the protein structure.

  • min_sep (int, optional) – The minimum separation between residues for computing phi angles. Default is 2.

  • max_sep (int, optional) – The maximum separation between residues for computing phi angles. Default is None.

  • ravel (bool, optional) – If True, returns a flattened array of phi angle values. Default is True.

  • get_names (bool, optional) – If True, returns feature names along with the phi angles. Default is True.

Returns:

If ravel is True, returns a flattened array of phi angle values. If ravel is False, returns a 3D array of phi angle values. If get_names is True, returns a tuple with feature names along with the phi angles.

Return type:

Union[np.ndarray, Tuple[np.ndarray, List[str]]]

Notes

This function calculates phi angles from trRosetta for a given MD trajectory.

dpet.featurization.angles.get_angles(a, b, c)

Calculate planar angles defined by 3 sets of points.
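A minimal NumPy sketch of a planar angle computed for many point triplets at once. It assumes the angle is measured at the middle point b, which matches the Ca(i) – Cb(i) – Cb(j) definition used for the trRosetta phi angle, but the actual convention of get_angles is not stated here. The name `planar_angle` is hypothetical.

```python
import numpy as np


def planar_angle(a, b, c):
    """Angle (radians) at vertex b between rays b->a and b->c; inputs of shape (..., 3)."""
    v = a - b
    w = c - b
    v = v / np.linalg.norm(v, axis=-1, keepdims=True)
    w = w / np.linalg.norm(w, axis=-1, keepdims=True)
    # Clip guards against rounding errors pushing the cosine outside [-1, 1].
    cos = np.clip(np.sum(v * w, axis=-1), -1.0, 1.0)
    return np.arccos(cos)


# Two triplets: a right angle, then a straight angle.
a = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.zeros((2, 3))
c = np.array([[0.0, 2.0, 0.0], [-3.0, 0.0, 0.0]])
angles = planar_angle(a, b, c)
```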

dpet.featurization.angles.get_dihedrals(a, b, c, d)

Calculate dihedral angles defined by 4 sets of points.
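A dihedral over batched point sets can be sketched by projecting the two outer bonds onto the plane perpendicular to the central bond. This is one standard formulation and not necessarily the one used by get_dihedrals. The name `dihedral_angles` is hypothetical.

```python
import numpy as np


def dihedral_angles(a, b, c, d):
    """Signed dihedral (radians) around the b->c axis; inputs of shape (..., 3)."""
    b0 = a - b
    b1 = c - b
    b2 = d - c
    b1 = b1 / np.linalg.norm(b1, axis=-1, keepdims=True)
    # Remove the components along the central bond.
    v = b0 - np.sum(b0 * b1, axis=-1, keepdims=True) * b1
    w = b2 - np.sum(b2 * b1, axis=-1, keepdims=True) * b1
    x = np.sum(v * w, axis=-1)
    y = np.sum(np.cross(b1, v) * w, axis=-1)
    return np.arctan2(y, x)


# Two quadruplets that are mirror images of each other.
a = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.zeros((2, 3))
c = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
d = np.array([[0.0, 1.0, 1.0], [0.0, -1.0, 1.0]])
torsions = dihedral_angles(a, b, c, d)
```

Mirroring the geometry flips the sign of the result, which is what distinguishes a dihedral from a planar angle.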

dpet.featurization.distances module

dpet.featurization.distances.calc_ca_dmap(traj: Trajectory) ndarray

Calculate the (N, L, L) distance maps between C-alpha atoms for visualization.

Parameters:

traj (mdtraj.Trajectory) – The MDTraj trajectory object.

Returns:

dmap – The distance maps of shape (N, L, L), where N is the number of frames and L is the number of C-alpha atoms.

Return type:

numpy.ndarray

Notes

This function calculates the distance maps between C-alpha atoms for visualization purposes.
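The (N, L, L) shape can be reproduced with plain NumPy broadcasting. The sketch below assumes the C-alpha coordinates are available as an array of shape (N, L, 3), for example a slice of a trajectory's coordinate array. The name `distance_maps` is hypothetical and the package may compute the maps differently.

```python
import numpy as np


def distance_maps(xyz):
    """Pairwise distance maps: (n_frames, L, 3) -> (n_frames, L, L)."""
    # Broadcasting builds all L x L displacement vectors for every frame.
    diff = xyz[:, :, None, :] - xyz[:, None, :, :]
    return np.linalg.norm(diff, axis=-1)


rng = np.random.default_rng(1)
xyz = rng.normal(size=(3, 6, 3))  # 3 frames, 6 atoms
dmap = distance_maps(xyz)
print(dmap.shape)  # (3, 6, 6)
```

Each map is symmetric with a zero diagonal, so only the upper triangle carries independent information.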

dpet.featurization.distances.calc_com_dmap(traj: Trajectory) ndarray

Calculate the (N, L, L) distance maps between residue centers of mass (COM) for visualization.

Parameters:

traj (mdtraj.Trajectory) – The MDTraj trajectory object.

Returns:

dmap – The distance maps of shape (N, L, L), where N is the number of frames and L is the number of residue centers of mass (COM).

Return type:

numpy.ndarray

Notes

This function calculates the distance maps between residue centers of mass (COM) for visualization purposes.

dpet.featurization.distances.featurize_ca_dist(traj: Trajectory, get_names: bool = True, atom_selector: str = 'name CA', *args, **kwargs) Union[ndarray, Tuple[ndarray, List[str]]]

Calculate C-alpha distances between pairs of residues.

Parameters:
  • traj (mdtraj.Trajectory) – The MDTraj trajectory object.

  • min_sep (int, optional) – The minimum sequence separation for distance calculations. Default is 2.

  • max_sep (int or None, optional) – The maximum sequence separation for distance calculations. Default is None.

  • inverse (bool, optional) – Whether to calculate inverse distances. Default is False.

  • get_names (bool, optional) – Whether to return the names of the calculated features. Default is True.

  • atom_selector (str, optional) – The atom selection string. Default is “name CA”.

Returns:

distances – The calculated C-alpha distances. If get_names is True, returns a tuple containing distances and corresponding feature names.

Return type:

numpy.ndarray or Tuple

Notes

This function calculates C-alpha distances between pairs of residues.
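How min_sep, max_sep and inverse might combine can be sketched with NumPy alone. The example assumes coordinates of shape (n_frames, L, 3), treats max_sep as an inclusive upper bound on the sequence separation, and uses an invented "i-j" naming scheme. All three are assumptions for illustration, and `pair_distance_features` is a hypothetical name.

```python
import numpy as np


def pair_distance_features(xyz, min_sep=2, max_sep=None, inverse=False):
    """Flattened pairwise distances for residue pairs with min_sep <= j - i <= max_sep."""
    n_res = xyz.shape[1]
    i, j = np.triu_indices(n_res, k=min_sep)  # pairs with j - i >= min_sep
    if max_sep is not None:
        keep = (j - i) <= max_sep
        i, j = i[keep], j[keep]
    dist = np.linalg.norm(xyz[:, i] - xyz[:, j], axis=-1)  # (n_frames, n_pairs)
    if inverse:
        dist = 1.0 / dist
    names = [f"{a}-{b}" for a, b in zip(i, j)]  # hypothetical naming scheme
    return dist, names


# One frame with five atoms spaced one unit apart on a line.
xyz = np.zeros((1, 5, 3))
xyz[0, :, 0] = np.arange(5)

dist, names = pair_distance_features(xyz)
print(names)  # ['0-2', '0-3', '0-4', '1-3', '1-4', '2-4']
```

Excluding near-neighbor pairs with min_sep removes distances that are nearly constant because of chain connectivity.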

dpet.featurization.distances.featurize_com_dist(traj: Trajectory, min_sep: int = 2, max_sep: int = None, inverse: bool = False, get_names: bool = True, atom_selector: str = 'name == CA') Union[ndarray, Tuple[ndarray, List[str]]]

Calculate center of mass (COM) distances between pairs of residues.

Parameters:
  • traj (mdtraj.Trajectory) – The MDTraj trajectory object.

  • min_sep (int, optional) – The minimum sequence separation for distance calculations. Default is 2.

  • max_sep (int or None, optional) – The maximum sequence separation for distance calculations. Default is None.

  • inverse (bool, optional) – Whether to calculate inverse distances. Default is False.

  • get_names (bool, optional) – Whether to return the names of the calculated features. Default is True.

  • atom_selector (str, optional) – The atom selection string. Default is “name == CA”.

Returns:

distances – The calculated center of mass (COM) distances. If get_names is True, returns a tuple containing distances and corresponding feature names.

Return type:

numpy.ndarray or Tuple

Notes

This function calculates center of mass (COM) distances between pairs of residues.

dpet.featurization.distances.rmsd(traj: Trajectory)

dpet.featurization.ensemble_level module

Calculate features at the ensemble level.

dpet.featurization.ensemble_level.calc_flory_scaling_exponent(traj: Trajectory) Tuple[float]

Calculate the apparent Flory scaling exponent in an ensemble. Code adapted from:

Parameters:

traj (mdtraj.Trajectory) – Input trajectory object.

Returns:

results – Tuple containing the nu (Flory scaling exponent), error on nu, R0 and error on R0. All values are calculated by fitting the internal scaling profiles. For more information see https://pubmed.ncbi.nlm.nih.gov/38297118/ and https://pubs.acs.org/doi/full/10.1021/acs.jpcb.3c01619.

Return type:

tuple
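Fitting an internal scaling profile can be sketched as follows: compute the root-mean-square distance for every sequence separation s, then fit R(s) = R0 · s^nu. The sketch uses a straight-line fit in log-log space with NumPy and returns only nu and R0, without error estimates. The fitting procedure, the range of separations included in the fit, and the names `internal_scaling_profile` and `fit_flory` are assumptions for illustration and may differ from what the package does.

```python
import numpy as np


def internal_scaling_profile(xyz):
    """RMS distance per sequence separation; xyz has shape (n_frames, L, 3)."""
    n_res = xyz.shape[1]
    seps = np.arange(1, n_res)
    rms = np.empty(n_res - 1)
    for k, s in enumerate(seps):
        # Squared distances of all pairs (i, i + s), averaged over pairs and frames.
        d2 = np.sum((xyz[:, s:] - xyz[:, :-s]) ** 2, axis=-1)
        rms[k] = np.sqrt(d2.mean())
    return seps, rms


def fit_flory(seps, rms, min_sep=1):
    """Fit rms = R0 * seps**nu by linear regression on log-log values."""
    mask = seps >= min_sep  # min_sep is a hypothetical knob for the fit range
    nu, log_r0 = np.polyfit(np.log(seps[mask]), np.log(rms[mask]), 1)
    return nu, np.exp(log_r0)


# A rigid rod with 0.38 spacing: distances grow linearly with separation.
rod = np.zeros((3, 20, 3))
rod[:, :, 0] = 0.38 * np.arange(20)

seps, rms = internal_scaling_profile(rod)
nu, r0 = fit_flory(seps, rms)
```

For the rod, the fit recovers an exponent of 1 and a prefactor equal to the spacing, which is a convenient sanity check before applying the same procedure to a real ensemble.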

dpet.featurization.ensemble_level.calc_flory_scaling_exponent_cg(traj: Trajectory) Tuple[float]

Calculate the apparent Flory scaling exponent in an ensemble. Code adapted from:

Parameters:

traj (mdtraj.Trajectory) – Input trajectory object.

Returns:

results – Tuple containing the nu (Flory scaling exponent), error on nu, R0 and error on R0. All values are calculated by fitting the internal scaling profiles. For more information see https://pubmed.ncbi.nlm.nih.gov/38297118/ and https://pubs.acs.org/doi/full/10.1021/acs.jpcb.3c01619.

Return type:

tuple

dpet.featurization.glob module

dpet.featurization.glob.compute_asphericity(trajectory: Trajectory)
dpet.featurization.glob.compute_end_to_end_distances(trajectory: Trajectory, atom_selector: str, rg_norm: bool = False)
dpet.featurization.glob.compute_ensemble_sasa(trajectory: Trajectory)
dpet.featurization.glob.compute_prolateness(trajectory: Trajectory)

dpet.featurization.utils module

dpet.featurization.utils.get_max_sep(L: int, max_sep: Union[None, int, float]) int

Get the maximum separation between indices.

Parameters:
  • L (int) – The size of the matrix.

  • max_sep (Union[None, int, float]) – The maximum separation between indices.

Returns:

The maximum separation between indices.

Return type:

int

Notes

This function calculates the maximum separation between indices based on the size of the matrix and the provided maximum separation value.

dpet.featurization.utils.get_triu_indices(L: int, min_sep: int = 1, max_sep: Union[None, int, float] = None) List[list]

Get the upper triangle indices of a square matrix with specified minimum and maximum separations.

Parameters:
  • L (int) – The size of the square matrix.

  • min_sep (int, optional) – The minimum separation between indices. Default is 1.

  • max_sep (Union[None, int, float], optional) – The maximum separation between indices. Default is None.

Returns:

A list of lists containing the upper triangle indices of the square matrix.

Return type:

List[list]

Notes

This function returns the upper triangle indices of a square matrix with the specified minimum and maximum separations.
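The selection of upper-triangle index pairs can be sketched in pure Python. This version assumes that both bounds are inclusive, that the result is laid out as [rows, cols], and that max_sep is an integer or None (a float max_sep is not handled). The name `triu_index_pairs` is hypothetical and the actual function may differ on each of these points.

```python
def triu_index_pairs(size, min_sep=1, max_sep=None):
    """Index pairs (i, j) with min_sep <= j - i <= max_sep, returned as [rows, cols]."""
    upper = size - 1 if max_sep is None else int(max_sep)
    rows, cols = [], []
    for i in range(size):
        # j runs from i + min_sep up to i + upper, but never past the last index.
        for j in range(i + min_sep, min(i + upper, size - 1) + 1):
            rows.append(i)
            cols.append(j)
    return [rows, cols]


all_pairs = triu_index_pairs(4)
print(all_pairs)  # [[0, 0, 0, 1, 1, 2], [1, 2, 3, 2, 3, 3]]

neighbors = triu_index_pairs(4, max_sep=1)
print(neighbors)  # [[0, 1, 2], [1, 2, 3]]
```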

Module contents