libpdb: PDB helper functions

Parse molecular structures in PDB format.

haddock.libs.libpdb.add_TER_on_chain_breaks(input_pdb: str | Path, output_pdb: str | Path) → None

Detect chain breaks and add TER statements between them.

Parameters:

input_pdb (FilePath) – Input PDB filepath with potential chain breaks.
output_pdb (FilePath) – Output PDB filepath with added TER statements between chain breaks.

haddock.libs.libpdb.check_combination_chains(combination: list[PDBFile]) → list[str]

Check if chain IDs are unique for each pdb in combination.

haddock.libs.libpdb.format_atom_name(atom: str, element: str) → str

Format PDB atom name.

Further Reading:

https://www.cgl.ucsf.edu/chimera/docs/UsersGuide/tutorials/pdbintro.html

Parameters:

atom (str) – The atom name.
element (str) – The atom element code.

Returns:

str – Formatted atom name.
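The alignment convention described in the linked Chimera guide can be sketched as follows. This is a minimal illustration, not the haddock implementation: the function name `format_atom_name_sketch` is hypothetical, and it assumes the usual rule that the atom-name field is four characters wide (columns 13-16), with names of one-letter elements starting in column 14 unless the name is four characters long.

```python
def format_atom_name_sketch(atom: str, element: str) -> str:
    """Return a 4-character PDB atom-name field (columns 13-16).

    Hypothetical helper illustrating the common alignment convention;
    it is not the haddock implementation.
    """
    atom = atom.strip()
    element = element.strip()
    if not 1 <= len(atom) <= 4:
        raise ValueError(f"atom name must have 1-4 characters: {atom!r}")
    # Four-character names fill the whole field, and names of two-letter
    # elements (e.g. CA for calcium, FE for iron) start in column 13.
    if len(atom) == 4 or len(element) == 2:
        return f"{atom:<4s}"
    # Names of one-letter elements start in column 14; column 13 is blank.
    return f" {atom:<3s}"


print(repr(format_atom_name_sketch("CA", "C")))   # alpha carbon
print(repr(format_atom_name_sketch("CA", "CA")))  # calcium ion
```

Under this convention the alpha carbon and a calcium ion, both named CA, end up in different columns, which is why the element code is needed in addition to the atom name.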

haddock.libs.libpdb.get_new_models(pdb_file_path: str | Path) → list[Path]

Get new PDB models if they exist.

If no new models are found, return the original path within a list.

haddock.libs.libpdb.get_pdb_file_suffix_variations(file_name: str | Path, sep: str = '_') → list[Path]

List suffix variations of a PDB file in the current path.

If file.pdb is given and the files file_1.pdb and file_2.pdb exist in the folder, those files are listed.

Parameters:

file_name (str or Path) – The name of the file with extension.
sep (str) – The separation between the file base name and the suffix. Defaults to “_”.

Returns:

list – List of Paths with the identified PDB files. If no files are found, an empty list is returned.
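The lookup described above can be approximated with a glob pattern built from the file stem, the separator, and the extension. The sketch below is an assumption about how such a search could work, not the library's code; `list_suffix_variations` is a hypothetical name, and the results are sorted only to make the demonstration deterministic.

```python
import tempfile
from pathlib import Path


def list_suffix_variations(file_name, sep="_"):
    """List files named <stem><sep><anything><ext> next to file_name.

    Hypothetical helper; the matching rule is an assumption.
    """
    path = Path(file_name)
    # For "file.pdb" and sep "_" the pattern is "file_*.pdb".
    pattern = f"{path.stem}{sep}*{path.suffix}"
    return sorted(path.parent.glob(pattern))


# Demonstration in a throwaway folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ("file.pdb", "file_1.pdb", "file_2.pdb", "other_1.pdb"):
        (folder / name).touch()
    found = list_suffix_variations(folder / "file.pdb")
    print([p.name for p in found])
```

Here file.pdb itself and other_1.pdb are not matched: the first lacks the separator and the second has a different base name.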

haddock.libs.libpdb.get_supported_residues(haddock_topology: str | Path) → list[str]

Read the topology file and identify which data is supported.

haddock.libs.libpdb.identify_chainseg(pdb_file_path: str | Path, sort: bool = True) → tuple[list[str], list[str]]

Return segID OR chainID.

haddock.libs.libpdb.read_RECORD_section(lines: Iterable[str], section_slice: slice, func: Callable[[Iterable[str]], Iterable[str]] = set) → Iterable[str]

Create a set of observations from a section of the ATOM line.

Returns:

set – A set of the observations.

haddock.libs.libpdb.read_chainids(lines: Iterable[str], *, section_slice: slice = slice(21, 22), func: Callable[[Iterable[str]], Iterable[str]] = list) → Iterable[str]

Collect the observations found in the chain ID column of the ATOM lines (section_slice defaults to slice(21, 22)). With the default func=list, the result is a list rather than a set.

Returns:

Iterable[str] – The observations, in the container type produced by func.

haddock.libs.libpdb.read_segids(lines: Iterable[str], *, section_slice: slice = slice(72, 76), func: Callable[[Iterable[str]], Iterable[str]] = list) → Iterable[str]

Collect the observations found in the segID columns of the ATOM lines (section_slice defaults to slice(72, 76)). With the default func=list, the result is a list rather than a set.

Returns:

Iterable[str] – The observations, in the container type produced by func.
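The three readers above share one idea: cut the same fixed-width slice out of every coordinate line and hand the pieces to func. The sketch below illustrates that idea under stated assumptions. The names `read_record_section`, `read_chainids_sketch`, `read_segids_sketch`, and `make_atom_line` are hypothetical, the filter on ATOM and HETATM records is a guess, and building the specialised readers with functools.partial is one plausible reading of the signatures rather than a confirmed detail.

```python
from functools import partial


def read_record_section(lines, section_slice, func=set):
    """Apply func to the given slice of every coordinate line.

    Hypothetical stand-in; which record types are read is an assumption.
    """
    return func(
        line[section_slice]
        for line in lines
        if line.startswith(("ATOM", "HETATM"))
    )


# Specialised readers with the default slices shown in the signatures above.
read_chainids_sketch = partial(
    read_record_section, section_slice=slice(21, 22), func=list
)
read_segids_sketch = partial(
    read_record_section, section_slice=slice(72, 76), func=list
)


def make_atom_line(serial, name, resname, chain, resseq, segid):
    """Build a fixed-width ATOM line with dummy coordinates (demo only)."""
    x = y = z = 0.0
    return (
        f"ATOM  {serial:>5d} {name:<4s} {resname:>3s} {chain}{resseq:>4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}      {segid:<4s}"
    )


lines = [
    make_atom_line(1, " N  ", "ALA", "A", 1, "A"),
    make_atom_line(2, " CA ", "ALA", "A", 1, "A"),
    make_atom_line(3, " N  ", "GLY", "B", 1, "B"),
    "TER",
]

print(read_record_section(lines, slice(21, 22)))  # unique chain IDs as a set
print(read_chainids_sketch(lines))                # one chain ID per atom
print(read_segids_sketch(lines))                  # 4-character segID fields
```

Note that the segID slice returns the raw four-character field, padding included, so callers may want to strip it before comparing values.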

haddock.libs.libpdb.sanitize(pdb_file_path: FilePathT, overwrite: bool = True, custom_topology: str | Path | None = None) → FilePathT | Path

Sanitize a PDB file.

haddock.libs.libpdb.split_by_chain(pdb_file_path: str | Path) → list[Path]

Split a PDB file into multiple structures for each chain.

haddock.libs.libpdb.split_ensemble(pdb_file_path: Path, dest: str | Path | None = None) → list[Path]

Split a multimodel PDB file into different structures.

Parameters:

dest (str or pathlib.Path) – Destination folder.
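Splitting a multimodel file amounts to collecting the lines between each MODEL and ENDMDL record and writing them to separate files. The following is a rough sketch of that idea only. `split_models` is a hypothetical name, and the output naming (`<stem>_<n><ext>`), the fallback to the input file's folder when dest is None, and the trailing END record are all assumptions rather than documented behaviour.

```python
import tempfile
from pathlib import Path


def split_models(pdb_file_path, dest=None):
    """Write each MODEL ... ENDMDL block to its own file.

    Hypothetical sketch; naming and defaults are assumptions.
    """
    pdb_file_path = Path(pdb_file_path)
    dest = Path(dest) if dest is not None else pdb_file_path.parent
    dest.mkdir(parents=True, exist_ok=True)

    written = []
    current = None  # lines of the model being read, or None outside a model
    for line in pdb_file_path.read_text().splitlines():
        if line.startswith("MODEL"):
            current = []
        elif line.startswith("ENDMDL"):
            if current is not None:
                name = f"{pdb_file_path.stem}_{len(written) + 1}{pdb_file_path.suffix}"
                out = dest / name
                out.write_text("\n".join(current + ["END"]) + "\n")
                written.append(out)
            current = None
        elif current is not None:
            current.append(line)
    return written


# Demonstration with a tiny two-model file (placeholder record content).
with tempfile.TemporaryDirectory() as tmp:
    ensemble = Path(tmp) / "ensemble.pdb"
    ensemble.write_text(
        "MODEL        1\nATOM first\nENDMDL\n"
        "MODEL        2\nATOM second\nENDMDL\nEND\n"
    )
    models = split_models(ensemble, dest=Path(tmp) / "models")
    print([p.name for p in models])
```

Lines outside any MODEL block, such as the final END record of the ensemble, are ignored in this sketch.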

haddock.libs.libpdb.swap_segid_chain(pdb_file_path: str | Path, new_pdb_file_path: str | Path) → None

Write the segID found in the file into the chain ID column.

haddock.libs.libpdb.tidy(pdb_file_path: str | Path, new_pdb_file_path: str | Path) → None

Tidy PDB structure.