Welcome to PBxplore’s documentation!

PBxplore is a suite of tools dedicated to Protein Block analysis.

Protein Blocks (PBs) are structural prototypes defined by de Brevern et al [1]. The 3-dimensional local structure of a protein backbone can be modelized as an 1-dimensional sequence of PBs. In principle, any conformation of any amino acid could be represented by one of the sixteen available Protein Blocks.

_images/PBs.jpg

Schematic representation of the sixteen protein blocks, labeled from a to p (Creative commons 4.0 CC-BY).

PBxplore provides both a Python library and command-line tools (Utilization). For some features (reading trajectory, plots), PBxplore requires some optional dependencies (Installation). Basically, PBxplore can:

  • assign PBs from either a PDB or either a molecular dynamics trajectory (PB assignation).
  • use analysis tools to perform statistical analysis on PBs (Statistics).
  • use analysis tools to study protein flexibility and deformability (Analyzis).

Protein Blocks

Protein Blocks (PBs) are structural prototypes defined by de Brevern et al [1]. The 3-dimensional local structure of a protein backbone can be modelized as an 1-dimensional sequence of PBs. In principle, any conformation of any amino acid could be represented by one of the sixteen available Protein Blocks.

PBs are labeled from a to p (see Figure 1). PBs m and d can be roughly described as prototypes for alpha-helix and central beta-strand, respectively. PBs a to c primarily represent beta-strand N-caps and PBs e and f, beta-strand C-caps; PBs a to j are specific to coils, PBs k and l to alpha-helix N-caps, and PBs n to p to alpha-helix C-caps.

_images/PBs.jpg

Figure 1. Schematic representation of the sixteen protein blocks, labeled from a to p (Creative commons CC-BY).

_images/1AY7_B.jpg

Figure 2. 3D representation of the barstar protein (PDB ID 1AY7, chain B) (Creative commons CC-BY).

For instance, the 3D-structure of the barstar protein represented in Figure 2 can be translated into a 1D-sequence of PBs:

ZZdddfklpcbfklmmmmmmmmnopafklgoiaklmmmmmmmmpacddddddehkllmmmmnnommmmmmmmmmmmmmnopacddddZZ

The conformations of the 89 residues of the barstar are translated into a sequence of 89 PBs. Note that “Z” corresponds to amino acids for which a PB cannot be assigned. As a matter of fact, the assignment of a given residue n requires is based on the conformations of residues n-2, n-1, n, n+1 and n+2. Therefore, no PB can be assigned to the two first (N-termini) and two last (C-termini) residues of a polypeptide chain.

Neq

Neq is calculated as follows:

_images/Neq.jpg

Where fx is the probability of PB x. Neq quantifies the average number of PBs at a given position in the protein sequence. A Neq value of 1 indicates that only one type of PB is observed. On the opposite, a Neq value of 16 is equivalent to a fully random distribution of PBs.

[1]
    1. de Brevern, C. Etchebest, and S. Hazout. Bayesian Probabilistic Approach for Predicting Backbone Structures in Terms of Protein Blocks. Proteins 41:271-87 (2000).

Installation

Although it’s not possible to cover all possible ways to install PBxplore on any operating system, we try in this document to provide few guidelines regarding PBxplore setup.

Supported Platforms

Currently, PBxplore run with Python >=3.6 on Linux and Mac OS X.

Dependencies

PBxplore requires the following libraries (which will be installed through ` pip`).

NumPy >= 1.6.0
NumPy is the base package for numerical computing in python.
matplotlib [1] >= 1.4.0
All ploting functions use the matplotlib package.
MDAnalysis [2] >= 0.11
We use MDAnalysis to read Gromacs molecular dynamics trajectories. Many other trajectory files are also supported, see list here.

Optionally, PBxplore can use the following packages:

Weblogo3 [3]
Weblogo3 is required to create logo from PB sequences. It has to be installed by the user.

Installing PBxplore

The most straightforward way is to use pip. It will also install the required dependencies:

$ pip install --user pbxplore

If none of the dependencies were previously installed, the installation process can take up to several minutes.

Note

The former command will install the PBxplore command-line scripts in:

  • $HOME/.local/bin on Linux.
  • ~/Library/Python/X.Y/bin on Mac OSX, where X.Y stands for the Python version (for instance 3.8 for Python 3.8).

Do not forget to update your $PATH environment variable to make all PBxplore tools accessible. As an example, with the Bash shell, this gives:

$ # for Linux
$ export PATH=$PATH:$HOME/.local/bin

$ # for Mac OSX
$ export PATH=$PATH:~/Library/Python/X.Y/bin

See the documentation of pip for more information. You may also want to look at virtualenv.

Upgrading PBxplore

Be sure to always have the latest version of PBxplore with:

$ pip install --user --upgrade pbxplore

Testing PBxplore

PBxplore comes with unit tests and regression tests. It requires the package pytest. You can run the tests within Python:

import pbxplore
pbxplore.test()

Uninstalling PBxplore

Run the simple command:

$ pip uninstall pbxplore

PBxplore for advanced users

You can clone PBxplore from GitHub:

$ git clone --depth 1 https://github.com/pierrepo/PBxplore.git

Once in the PBxplore directory, we advise you to create a virtual environment:

$ pip3 install --user virtualenv
$ virtualenv -p python3 venv
$ source venv/bin/activate

You can then install the latest version of PBxplore as a Python module:

$ pip install -e .

You can also run unit tests and regression tests:

$ pip install pytest
$ pytest -v pbxplore/tests

or

$ pip install pytest
$ python setup.py test
[1]J. D. Hunter. Matplotlib: A 2D graphics environment. Computing In Science and Engineering 9 (2007), 90-95. doi:10.1109/MCSE.2007.55
[2]N. Michaud-Agrawal, E. J. Denning, T. B. Woolf, and O. Beckstein. MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics Simulations. J. Comput. Chem. 32 (2011), 2319–2327. doi:10.1002/jcc.21787
[3]G. E. Crooks, G. Hon, J.-M. Chandonia, and S. E. Brenner. WebLogo: A Sequence Logo Generator. Genome Research 14: 1188–90 (2004) doi:10.1101/gr.849004

Utilization

There is 2 ways to use PBxplore:
  • command-line scripts
  • API

Command-line scripts

PBxplore is bundle with a suite of command-line tools. Once installed, they should be available in your $PATH. Here the list:

  • PBassign assign Protein Blocks (PBs) from a set of protein structures.
  • PBcount computes the frequency of PBs at each position along the amino acid sequence.
  • PBstat generates frequency and logo plots, and estimates the Neq.

API

PBxplore is also a python library and provides an API to interfacing with Python applications. You will find here a list of notebooks to help you use the API.

PBassign

PBassign assigns a PB sequence to a protein structure.

Example

First download data:

$ wget https://files.rcsb.org/view/3ICH.pdb

Then perform PBs assignment:

$ PBassign -p 3ICH.pdb -o 3ICH
1 PDB file(s) to process
Read 1 chain(s) in 3ICH.pdb
wrote 3ICH.PB.fasta

Content of 3ICH.PB.fasta :

>3ICH.pdb | chain A
ZZccdfbdcdddddehjbdebjcdddddfklmmmlmmmmmmmmnopnopajeopacfbdc
ehibacehiamnonopgocdfkbjbdcdfblmbccfbghiacdddebehiafkbccddfb
dcfklgokaccfbdcfbhklmmmmmmmpccdfkopafbacddfbgcddddfbacddddZZ

Note that Protein Blocs assignment is only possible for proteins (as its name suggests). As a consequence, processed PDB files must contain protein structures only (please remove any other molecule). In addition, the PDB parser implemented here is pretty straightforward. Be sure your PDB files complies with the ATOM field of the PDB format.

Usage

Here’s the PBassign help text.

Usage: PBassign [options] -p file.pdb|dir [-p file2.pdb] -o output_root_name -g gro_file -x xtc_file

optional arguments:
  -h, --help     show this help message and exit
  -p P           name of a pdb file or name of a directory containing pdb
                 files
  -o O           name for results
  -v, --version  show program's version number and exit

other options to handle molecular dynamics trajectories:
  -x X           name of the topology file
  -g G           name of the trajectory file

-p option

can be used several times. For instance:

$ wget https://files.rcsb.org/view/3ICH.pdb
$ wget https://files.rcsb.org/view/1BTA.pdb
$ wget https://files.rcsb.org/view/1AY7.pdb
$ PBassign -p 3ICH.pdb -p 1BTA.pdb -p 1AY7.pdb -o test1
3 PDB file(s) to process
Read 1 chain(s) in 3ICH.pdb
Read 1 chain(s) in 1BTA.pdb
Read 2 chain(s) in 1AY7.pdb
wrote test1.PB.fasta

All PB assignments are written in the same output file. If a PDB file contains several chains and/or models, PBs assignments are also written in a single output file. From the previous example, the ouput of test1.PB.fasta is:

>3ICH.pdb | chain A
ZZccdfbdcdddddehjbdebjcdddddfklmmmlmmmmmmmmnopnopajeopacfbdc
ehibacehiamnonopgocdfkbjbdcdfblmbccfbghiacdddebehiafkbccddfb
dcfklgokaccfbdcfbhklmmmmmmmpccdfkopafbacddfbgcddddfbacddddZZ
>1BTA.pdb | chain A
ZZdddfklonbfklmmmmmmmmnopafklnoiaklmmmmmnoopacddddddehkllmmm
mngoilmmmmmmmmmmmmnopacdcddZZ
>1AY7.pdb | chain A
ZZbjadfklmcfklmmmmmmmmnnpaafbfkgopacehlnomaccddehjaccdddddeh
klpnbjadcdddfbehiacddfegolaccdddfkZZ
>1AY7.pdb | chain B
ZZcddfklpcbfklmmmmmmmmnopafklgoiaklmmmmmmmmpacddddddehkllmmm
mnnommmmmmmmmmmmmmnopacddddZZ

One can also use the -p option to provide a directory containing PDB files as an input. PBassign will process all PDB files located in the PBdata directory:

$ wget https://files.rcsb.org/view/1AY7.pdb -P demo
$ wget https://files.rcsb.org/view/2LFU.pdb -P demo
$ wget https://files.rcsb.org/view/3ICH.pdb -P demo
$ wget https://files.rcsb.org/view/1BTA.pdb -P demo
$ PBassign -p demo/ -o test2
4 PDB file(s) to process
Read 1 chain(s) in demo/3ICH.pdb
Read 2 chain(s) in demo/1AY7.pdb
Read 1 chain(s) in demo/1BTA.pdb
Read 10 chain(s) in demo/2LFU.pdb
wrote test2.PB.fasta

-x and -g options

Warning

These options use the MDAnalysis library which is installed by PBxplore.

Instead using the -p option, protein structures could come from a molecular dynamics simulation trajectory file. For this, you have to specify a trajectory file with the -x option and a topology file with the -g option. It will accept any trajectory file format handled by the MDAnalysis library. See their [table of supported formats](https://pythonhosted.org/MDAnalysis/documentation_pages/coordinates/init.html#id1) for the full list. Here an example with GROMACS files.

$ wget https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj.gro
$ wget https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj.xtc
$ PBassign -x psi_md_traj.xtc -g psi_md_traj.gro -o psi_md_traj
Frame 1/225.
Frame 100/225.
Frame 200/225.
Frame 225/225.
wrote psi_md_traj.PB.fasta

If needed, you can download psi_md_traj.PB.fasta [here](https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj.PB.fasta).

Tips’n tricks

To flatten the PB sequences obtained in FASTA format, i.e. get PB sequences in a single line each, one solution could be:

$ wget https://files.rcsb.org/view/1AY7.pdb
$ PBassign -p 1AY7.pdb -o 1AY7
$ cat 1AY7.PB.fasta | sed "s/^>.*/\t/" | tr -d "\n" | tr "\t" "\n" > 1AY7.PB.flat

Content of 1AY7.PB.flat :

ZZbjadfklmcfklmmmmmmmmnnpaafbfkgopacehlnomaccddehjaccdddddehklpnbjadcdddfbehiacddfegolaccdddfkZZ
ZZcddfklpcbfklmmmmmmmmnopafklgoiaklmmmmmmmmpacddddddehkllmmmmnnommmmmmmmmmmmmmnopacddddZZ

PBcount

PBcount computes the frequency of PBs at each position along the amino acid sequence.

Note

The following examples require psi_md_traj.PB.fasta and
psi_md_traj2.PB.fasta obtained with PBassign:
$ wget https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_1.gro
$ wget https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_1.xtc
$ PBassign -x psi_md_traj_1.xtc -g psi_md_traj_1.gro -o psi_md_traj_1
Frame 1/225.
Frame 100/225.
Frame 200/225.
Frame 225/225.
wrote psi_md_traj_1.PB.fasta

$ wget https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_2.gro
$ wget https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_2.xtc
$ PBassign -x psi_md_traj_2.xtc -g psi_md_traj_2.gro -o psi_md_traj_2
Frame 1/225.
Frame 100/225.
Frame 200/225.
Frame 225/225.
wrote psi_md_traj_2.PB.fasta

If needed, you can download [psi_md_traj_1.PB.fasta](https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_1.PB.fasta) and [psi_md_traj_2.PB.fasta](https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_2.PB.fasta).

Example

$ PBcount -f psi_md_traj_1.PB.fasta -o psi_md_traj_1
read 225 sequences in psi_md_traj_1.PB.fasta
wrote psi_md_traj_1.PB.count

Content of psi_md_traj_1.PB.count:

         a     b     c     d     e     f     g     h     i     j     k     l     m     n     o     p
1        0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
2        0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
3        0     0     0     4     0   221     0     0     0     0     0     0     0     0     0     0
4        0     0     0     0     0     5     0     0     0     0   220     0     0     0     0     0
[snip]
51       0     0     0     0     0    56     0    98     0     0    71     0     0     0     0     0
52       0    56     0     0     0     0     0     0    94     3     3    69     0     0     0     0
53     144     0    60     2     0     0     0     0     0     0     0     0     1     0     0    18
54       0     0   225     0     0     0     0     0     0     0     0     0     0     0     0     0
55       0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
56       0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0

Note that residues 1, 2, 55 and 56 have a null count of all PBs. These residues are the first and last residues of the structure and no PB can be assigned to them.

If needed, you can download [psi_md_traj_1.PB.count](https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_1.PB.count).

Usage

Here’s the PBcount help text.

usage: PBcount [-h] -f F -o O [--first-residue FIRST_RESIDUE]

Compute PB frequency along protein sequence.

optional arguments:
  -h, --help            show this help message and exit
  -f F                  name(s) of the PBs file (in fasta format)
  -o O                  name for results
  --first-residue FIRST_RESIDUE
                        define first residue number (1 by default)
  -v, --version         show program's version number and exit

-f option

can be used several times:

$ PBcount -f psi_md_traj_1.PB.fasta -f psi_md_traj_2.PB.fasta -o psi_md_traj_all
read 225 sequences in psi_md_traj_1.PB.fasta
read 225 sequences in psi_md_traj_2.PB.fasta
wrote psi_md_traj_all.PB.count

If needed, you can download [psi_md_traj_all.PB.count](https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_all.PB.count).

–first-residue option

By default, the number of the first residue is 1, this option allows to adjust the number associated to the first residue (and to the followings automaticaly).

$ PBcount --first-residue 5 -f psi_md_traj_1.PB.fasta -o psi_md_traj_1_shifted
read 225 sequences in psi_md_traj_1.PB.fasta
wrote psi_md_traj_1_shifted.PB.count

Content of psi_md_traj_1_shifted.PB.count:

         a     b     c     d     e     f     g     h     i     j     k     l     m     n     o     p
5        0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
6        0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
7        0     0     0     4     0   221     0     0     0     0     0     0     0     0     0     0
8        0     0     0     0     0     5     0     0     0     0   220     0     0     0     0     0
9        0   222     0     0     0     0     0     0     0     0     0     3     0     0     0     0
10       6     0   201     0     0     5     0     0     0     0     0     0    12     0     1     0
[snip]

If needed, you can download [psi_md_traj_1_shifted.PB.count](https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_1_shifted.PB.count).

PBstat

PBstat generates frequency and logo plots, and estimates something similar to entropy called the equivalent number of PBs (Neq).

Note

The following examples require psi_md_traj_all.PB.count obtained with PBassign and then PBcount:

$ wget https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_1.gro
$ wget https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_1.xtc
$ PBassign -x psi_md_traj_1.xtc -g psi_md_traj_1.gro -o psi_md_traj_1
Frame 1/225.
Frame 100/225.
Frame 200/225.
Frame 225/225.
wrote psi_md_traj_1.PB.fasta

$ wget https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_2.gro
$ wget https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_2.xtc
$ PBassign -x psi_md_traj_2.xtc -g psi_md_traj_2.gro -o psi_md_traj_2
Frame 1/225.
Frame 100/225.
Frame 200/225.
Frame 225/225.
wrote psi_md_traj_2.PB.fasta

$ PBcount -f psi_md_traj_1.PB.fasta -f psi_md_traj_2.PB.fasta -o psi_md_traj_all
read 225 sequences in psi_md_traj_1.PB.fasta
read 225 sequences in psi_md_traj_2.PB.fasta
wrote psi_md_traj_all.PB.count

Warning

To generate Weblogo-like representations, Weblogo3 is required.

If needed, you can download [psi_md_traj_all.PB.count](https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj_all.PB.count).

Example

$ PBstat -f psi_md_traj_all.PB.count --map --neq --logo -o psi_md_traj_all
Index of first residue in psi_md_traj_all.PB.count is 1
First residue in the output file(s) is 1
wrote psi_md_traj_all.PB.map.png
wrote psi_md_traj_all.PB.Neq
wrote psi_md_traj_all.PB.Neq.png
wrote psi_md_traj_all.PB.logo.png

Usage

Here’s the PBstat help text.

usage: PBstat [-h] -f F -o O [--map] [--neq] [--logo]
              [--image-format {pdf,png,jpg}] [--residue-min RESIDUE_MIN]
              [--residue-max RESIDUE_MAX]

Statistical analysis and graphical representations of PBs.

optional arguments:
  -h, --help            show this help message and exit
  -f F                  name of file that contains PBs frequency (count)
  -o O                  name for results
  --map                 generate map of the distribution of PBs along protein
                        sequence
  --neq                 compute Neq and generate Neq plot along protein
                        sequence
  --logo                generate logo representation of PBs frequency along
                        protein sequence
  --image-format {pdf,png,jpg}
                        File format for all image output.
  --residue-min RESIDUE_MIN
                        defines lower bound of residue frame
  --residue-max RESIDUE_MAX
                        defines upper bound of residue frame
  -v, --version         show program's version number and exit

–map option

generates map of the distribution of PBs along protein sequence.

$ PBstat -f psi_md_traj_all.PB.count --map -o psi_md_traj_all
Index of first residue in psi_md_traj_all.PB.count is 1
First residue in the output file(s) is 1
wrote psi_md_traj_all.PB.map.png
_images/psi_md_traj_all.PB.map.jpg

Distribution of PBs

The color range goes from red to blue. For a given position in the protein sequence, blue corresponds to a null frequency (meaning the particular PB is never met a this position) and red corresponds to a frequency of 1 (meaning the particular PB is always found at this position).

–neq option

computes Neq and generates Neq plot along protein sequence.

$ PBstat -f psi_md_traj_all.PB.count --neq -o psi_md_traj_all
Index of first residue in psi_md_traj_all.PB.count is 1
First residue in the output file(s) is 1
wrote psi_md_traj_all.PB.Neq
wrote psi_md_traj_all.PB.Neq.png

Content of psi_md_traj_all.PB.Neq:

resid       Neq
1          1.00
2          1.00
3          1.90
4          1.91
5          2.87
6          2.30
[snip]
_images/psi_md_traj_all.PB.Neq.jpg

Neq versus residue number

–logo option

generates WebLogo-like representation of PBs frequency along protein sequence.

Warning

This option requires Weblogo3.

$ PBstat -f psi_md_traj_all.PB.count --logo -o psi_md_traj_all
Index of first residue is: 1
wrote psi_md_traj_all.PB.logo.png
_images/psi_md_traj_all.PB.logo.jpg

Logo representation of PBs frequency

–residue-min and –residue-max options

These options define the lower and upper bound of residue frame.

$ PBstat -f psi_md_traj_all.PB.count --map --neq --logo -o psi_md_traj_all_frame --residue-min 15 --residue-max 42
Index of first residue in psi_md_traj_all.PB.count is 1
First residue in the output file(s) is 15
wrote psi_md_traj_all_frame.PB.map.15-42.png
wrote psi_md_traj_all_frame.PB.Neq.15-42
wrote psi_md_traj_all_frame.PB.Neq.15-42.png
wrote psi_md_traj_all_frame.PB.logo.15-42.png
_images/psi_md_traj_all_frame.PB.map.15-42.jpg

PBs distribution with residue frame

_images/psi_md_traj_all_frame.PB.Neq.15-42.jpg

Neq versus residue number with residue frame

_images/psi_md_traj_all_frame.PB.logo.15-42.jpg

Logo representation of PBs frequency

–image-format option

All figure can be produced in either PNG, PDF, or JPEG format. The –image-format option allows to control the file format of the image outputs.

API Cookbook

This page provides some examples, tutorials as notebooks to help you use the API of PBxplore.

Basically, PBxplore is structured around 3 modules:

PBxplore.structure

This module handles the reading of PDB files and trajectory files. Its 2 major functions chain_from_files() and chain_from_trajectory are direclty exposed at the package level.

Look at the notebook PB assignation for further information.

PBxplore.io

This module is about the I/O of all files other than PDB. It handles the reading/writing of fasta files and the writing of text-like analysis files (Neq files).

Look at the notebook Writing PB in file for further information.

PBxplore.analysis

This module handle all analysis functions and ploting functions. You can:

  • generate map of the distribution of PBs along protein sequence with plot_map().
  • compute Neq with compute_neq() and generate the plot with plot_neq().
  • generate WebLogo-like representation of PBs frequency along protein sequence with generate_weblogo().

Look at the notebook Visualize protein deformability for further information.

Note

generate_weblogo() requires Weblogo3.

PB assignation

We hereby demonstrate how to use the API to assign PB sequences.

[1]:
from pprint import pprint
import urllib.request
import os

# print date & versions
import datetime
print("Date & time:",datetime.datetime.now())
import sys
print("Python version:", sys.version)
Date & time: 2021-01-27 13:55:14.789953
Python version: 3.7.9 (default, Oct 19 2020, 15:13:17)
[GCC 7.5.0]
[2]:
import pbxplore as pbx
print("PBxplore version:", pbx.__version__)
PBxplore version: 1.4.0
Use the built-in structure parser
Assign PB for a single structure

The pbxplore.chains_from_files() function is the prefered way to read PDB and PDBx/mmCIF files using PBxplore. This function takes a list of file path as argument, and yield each chain it can read from these files. It provides a single interface to read PDB and PDBx/mmCIF files, to read single model and multimodel files, and to read a single file of a collection of files.

Here we want to read a single file with a single model and a single chain. Therefore, we need the first and only record that is yield by pbxplore.chains_from_files(). This record contains a name for the chain, and the chain itself as a pbxplore.structure.structure.Chain object. Note that, even if we want to read a single file, we need to provide it as a list to pbxplore.chains_from_files().

[3]:
pdb_name, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/1BTA.pdb', '1BTA.pdb')

structure_reader = pbx.chains_from_files([pdb_name])
chain_name, chain = next(structure_reader)
print(chain_name)
print(chain)
1BTA.pdb | chain A
Chain A / model : 1434 atoms

Protein Blocks are assigned based on the dihedral angles of the backbone. So we need to calculate them. The pbxplore.structure.structure.Chain.get_phi_psi_angles() methods calculate these angles and return them in a form that can be directly provided to the assignement function.

The dihedral angles are returned as a dictionnary. Each key of this dictionary is a residue number, and each value is a dictionary with the phi and psi angles.

[4]:
dihedrals = chain.get_phi_psi_angles()
pprint(dihedrals)
{1: {'phi': None, 'psi': -171.65563134448553},
 2: {'phi': -133.80467711845586, 'psi': 153.74322760775027},
 3: {'phi': -134.6617568892695, 'psi': 157.30476083095581},
 4: {'phi': -144.49159910635183, 'psi': 118.59706956501033},
 5: {'phi': -100.12866913978127, 'psi': 92.98634825528089},
 6: {'phi': -83.48980457968895, 'psi': 104.23730726195485},
 7: {'phi': -64.77163869310709, 'psi': -43.25159835828049},
 8: {'phi': -44.47885842536954, 'psi': -25.89184262616925},
 9: {'phi': -94.90790101955957, 'psi': -47.182577907117775},
 10: {'phi': -41.312671692330014, 'psi': 133.73743399231304},
 11: {'phi': -119.15122785547305, 'psi': -11.82789586402356},
 12: {'phi': -174.2119655293399, 'psi': 175.87239770676175},
 13: {'phi': -56.61341695443224, 'psi': -45.74767617535588},
 14: {'phi': -50.78226415072095, 'psi': -45.3742585970337},
 15: {'phi': -57.93584481869442, 'psi': -43.329444361460844},
 16: {'phi': -55.20960354113049, 'psi': -56.47559202715399},
 17: {'phi': -64.51979885245254, 'psi': -18.577118068149446},
 18: {'phi': -70.24273354141474, 'psi': -55.153744337676926},
 19: {'phi': -65.20648546633555, 'psi': -41.28370221159946},
 20: {'phi': -58.98821952110768, 'psi': -35.78957701447905},
 21: {'phi': -66.8659714296852, 'psi': -42.14634696303375},
 22: {'phi': -67.34201665142825, 'psi': -57.40438549689628},
 23: {'phi': -52.297936091413874, 'psi': -66.09120830346023},
 24: {'phi': -61.19010445362886, 'psi': -14.807316930892625},
 25: {'phi': 54.95158694420636, 'psi': 47.59528477656777},
 26: {'phi': -69.51531755580697, 'psi': 161.10806531443862},
 27: {'phi': -57.36300935545188, 'psi': -179.66365615297644},
 28: {'phi': -79.91369005407893, 'psi': -18.494472196394668},
 29: {'phi': -93.51717329199727, 'psi': 13.802530546559774},
 30: {'phi': -38.40653214238887, 'psi': 105.85297788366393},
 31: {'phi': -64.01559307951965, 'psi': -5.507357757886723},
 32: {'phi': 50.06519606710965, 'psi': 3.6047302867544038},
 33: {'phi': -84.83560576923662, 'psi': -176.048770127013},
 34: {'phi': -76.65985981150652, 'psi': -36.89428882663367},
 35: {'phi': -66.20745817863622, 'psi': -36.19018119951471},
 36: {'phi': -80.76844188891471, 'psi': -55.88509876949212},
 37: {'phi': -45.0995601497454, 'psi': -50.82304368319501},
 38: {'phi': -58.512419169182465, 'psi': -56.4318511704347},
 39: {'phi': -44.00775783983471, 'psi': -26.06209153795419},
 40: {'phi': -79.6799641005731, 'psi': -51.3827703817916},
 41: {'phi': -58.80532943671335, 'psi': -49.46425322450557},
 42: {'phi': -75.73059711071141, 'psi': 3.9162670655637033},
 43: {'phi': -177.14613562249534, 'psi': 60.46495675947553},
 44: {'phi': 177.1265816932884, 'psi': -66.62887199130637},
 45: {'phi': -58.436100708193806, 'psi': 149.59997847317612},
 46: {'phi': -102.66050573267097, 'psi': 132.43212727859543},
 47: {'phi': -114.5213275524662, 'psi': 169.33012343233455},
 48: {'phi': -61.39150617820462, 'psi': 136.7035538929314},
 49: {'phi': -113.17589693608565, 'psi': 156.54195412530404},
 50: {'phi': -117.26440335376822, 'psi': 138.51305036902693},
 51: {'phi': -120.03410170277817, 'psi': 81.75707989178757},
 52: {'phi': -77.60981590398825, 'psi': 83.18451037698443},
 53: {'phi': -79.65858964180552, 'psi': 111.40143302647459},
 54: {'phi': -100.37011629225776, 'psi': 150.03395825502497},
 55: {'phi': 49.87330458406236, 'psi': 68.7419980340502},
 56: {'phi': -73.87938409722335, 'psi': -66.7355521840301},
 57: {'phi': -56.20534388077749, 'psi': -35.207843043514686},
 58: {'phi': -66.38284564180043, 'psi': -32.21866387324769},
 59: {'phi': -94.6778115344365, 'psi': 17.6861402216656},
 60: {'phi': -111.48538994784963, 'psi': -38.097764578613976},
 61: {'phi': -70.64502750557983, 'psi': -62.8582975880629},
 62: {'phi': -33.50588994671665, 'psi': -32.02546270762565},
 63: {'phi': -128.57384077349852, 'psi': 62.57927537310066},
 64: {'phi': -12.365761900396421, 'psi': 106.99327496259977},
 65: {'phi': 73.68588813063126, 'psi': 32.13155886020173},
 66: {'phi': -89.05260862755028, 'psi': -69.16778908477181},
 67: {'phi': -77.83088301001709, 'psi': -21.564910924673597},
 68: {'phi': -71.32122280651765, 'psi': -21.859413182600065},
 69: {'phi': -81.4118653034867, 'psi': -55.2935117883826},
 70: {'phi': -52.047970110313145, 'psi': -43.22593946145588},
 71: {'phi': -59.215594114973726, 'psi': -45.283196644537554},
 72: {'phi': -52.67186926130671, 'psi': -38.127901315075064},
 73: {'phi': -75.00963018964649, 'psi': -30.83999517691734},
 74: {'phi': -74.69878930178584, 'psi': -35.042954979175136},
 75: {'phi': -80.22740138668189, 'psi': -37.2721834868002},
 76: {'phi': -63.3253002341084, 'psi': -46.736848174955014},
 77: {'phi': -62.577975558265166, 'psi': -38.836376804396195},
 78: {'phi': -58.4371262613883, 'psi': -30.932534133630554},
 79: {'phi': -77.25603045197096, 'psi': -28.810984281581455},
 80: {'phi': -65.77402807318447, 'psi': -6.587861693755372},
 81: {'phi': 113.27162201541088, 'psi': -14.067924223417435},
 82: {'phi': -63.856071155072016, 'psi': 160.46313493362328},
 83: {'phi': -109.29442965951228, 'psi': 65.33016925110071},
 84: {'phi': -94.29902268445335, 'psi': 87.93029438989075},
 85: {'phi': -52.91938395571083, 'psi': 98.897475962567},
 86: {'phi': -73.44769372512917, 'psi': 114.64881254410926},
 87: {'phi': -114.16119204550665, 'psi': 101.24805765454327},
 88: {'phi': -96.78933556699712, 'psi': 106.74340425527281},
 89: {'phi': -109.02775603395978, 'psi': None}}

The dihedral angles can be provided to the pbxplore.assign() function that assigns a Protein Block to each residue, and that returns the PB sequence as a string. Note that the first and last two residues are assigned to the Z jocker block as some dihedral angles cannot be calculated.

[5]:
pb_seq = pbx.assign(dihedrals)
print(pb_seq)
ZZdddfklonbfklmmmmmmmmnopafklnoiaklmmmmmnoopacddddddehkllmmmmngoilmmmmmmmmmmmmnopacdcddZZ
Assign PB for several models of a single file

A single PDB file can contain several models. Then, we do not want to read only the first chain. Instead, we want to iterate over all the chains.

[6]:
pdb_name, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/2LFU.pdb', '2LFU.pdb')

for chain_name, chain in pbx.chains_from_files([pdb_name]):
    dihedrals = chain.get_phi_psi_angles()
    pb_seq = pbx.assign(dihedrals)
    print('* {}'.format(chain_name))
    print('  {}'.format(pb_seq))
* 2LFU.pdb | model 1 | chain A
  ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcddddddddddfklopadddddfhpamlnopcddddddehjadddddehjacbddddddddfklmaccddddddfbgniaghiapaddddddfklnoambZZ
* 2LFU.pdb | model 2 | chain A
  ZZpcfblcffbccdddddeehjacdddddddddfklggcddddddddddddddfblghiadddddddddfklopadddddehpmmmnopcddddddeehiacdddfblopadcddddddfklpaccdddddfklmlmgcdehiaddddddfklmmgopZZ
* 2LFU.pdb | model 3 | chain A
  ZZmgghiafbbccdddddehjbdcdddddddddfklggcddddddddddddddfbfghpacddddddddfklopadddddehiaklmmmgcdddddeehiaddddfkbgciacdddddefklpaccddddddfkgojbdfehpaddddddfkbccfbgZZ
* 2LFU.pdb | model 4 | chain A
  ZZcghiacfkbacdddddfbhpacdddddddddfklmcfdddddddddddddehiacddddddddddddfknopadddddfkpamlnopaddddddehjaccdddfklnopacddddddfklmpccdddddddehiabghehiaddddddfklpccfkZZ
* 2LFU.pdb | model 5 | chain A
  ZZpaehiehkaccdddddehjbccdddddddddfklggcddddddddddddddfbhpadddddddddddfklopadddddehiamlmmpccdddddeehiadddddfbacddcddddddfklmaccddddddfbgghiafehiadddddddfklpacfZZ
* 2LFU.pdb | model 6 | chain A
  ZZmghbacfkbccdddddeehpacdddddddddfklggcdddddddddddddehiacadddddddddddfklopadddddehiaklnopcddddddeehiadddehjlnopacddddddfklmaccddddehiaehbgcdehiadddddddfehjlpcZZ
* 2LFU.pdb | model 7 | chain A
  ZZcchbacfkbccdddddfehpacdddddddddfklggcdddddddddddddddehjapadddddddddfknopadddddfklmmmnopcddddddehjiddddddfknopacddddddfklpaccdddddfklmaacdfehpadddddehjblckknZZ
* 2LFU.pdb | model 8 | chain A
  ZZcehjdeehiacdjdddedjbdcdddddddddfklggcdddddddddddddddbfblbacddddddddfklopacddddehiamlnopaddddddehjacddddfehpaaccdddddefklpaccdddddfklmbfbehehiaddddddffkgoiehZZ
* 2LFU.pdb | model 9 | chain A
  ZZpccdjdfkbccdddddehhpacdddddddddfklggcdddddddddddddehiacbdcdddddddddfklopadddddehiammnopcddddddeejiadddehjlgobacddddddfklmpccddddehiacbcbdfehpadddddehjklmklmZZ
* 2LFU.pdb | model 10 | chain A
  ZZccfklcfkbccdddddehjbdcdddddddddfklggcdddddddddddddehiapaccdddddddddfklopadddddehjamlnopaddddddehjddcdddfbfghpacddddddfklpaccddddddfbcfbacfehpadddddddekpghiaZZ
Read 10 chain(s) in 2LFU.pdb
Assign PB for a set of structures

The pbxplore.chains_from_files() function can also handle several chains from several files.

[7]:
import glob
files = ['1BTA.pdb', '2LFU.pdb', '3ICH.pdb']
for pdb_name in files:
    urllib.request.urlretrieve('https://files.rcsb.org/view/{0}'.format(pdb_name), pdb_name)

print('The following files will be used:')
pprint(files)
for chain_name, chain in pbx.chains_from_files(files):
    dihedrals = chain.get_phi_psi_angles()
    pb_seq = pbx.assign(dihedrals)
    print('* {}'.format(chain_name))
    print('  {}'.format(pb_seq))
The following files will be used:
['1BTA.pdb', '2LFU.pdb', '3ICH.pdb']
* 1BTA.pdb | chain A
  ZZdddfklonbfklmmmmmmmmnopafklnoiaklmmmmmnoopacddddddehkllmmmmngoilmmmmmmmmmmmmnopacdcddZZ
Read 1 chain(s) in 1BTA.pdb
* 2LFU.pdb | model 1 | chain A
  ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcddddddddddfklopadddddfhpamlnopcddddddehjadddddehjacbddddddddfklmaccddddddfbgniaghiapaddddddfklnoambZZ
* 2LFU.pdb | model 2 | chain A
  ZZpcfblcffbccdddddeehjacdddddddddfklggcddddddddddddddfblghiadddddddddfklopadddddehpmmmnopcddddddeehiacdddfblopadcddddddfklpaccdddddfklmlmgcdehiaddddddfklmmgopZZ
* 2LFU.pdb | model 3 | chain A
  ZZmgghiafbbccdddddehjbdcdddddddddfklggcddddddddddddddfbfghpacddddddddfklopadddddehiaklmmmgcdddddeehiaddddfkbgciacdddddefklpaccddddddfkgojbdfehpaddddddfkbccfbgZZ
* 2LFU.pdb | model 4 | chain A
  ZZcghiacfkbacdddddfbhpacdddddddddfklmcfdddddddddddddehiacddddddddddddfknopadddddfkpamlnopaddddddehjaccdddfklnopacddddddfklmpccdddddddehiabghehiaddddddfklpccfkZZ
* 2LFU.pdb | model 5 | chain A
  ZZpaehiehkaccdddddehjbccdddddddddfklggcddddddddddddddfbhpadddddddddddfklopadddddehiamlmmpccdddddeehiadddddfbacddcddddddfklmaccddddddfbgghiafehiadddddddfklpacfZZ
* 2LFU.pdb | model 6 | chain A
  ZZmghbacfkbccdddddeehpacdddddddddfklggcdddddddddddddehiacadddddddddddfklopadddddehiaklnopcddddddeehiadddehjlnopacddddddfklmaccddddehiaehbgcdehiadddddddfehjlpcZZ
* 2LFU.pdb | model 7 | chain A
  ZZcchbacfkbccdddddfehpacdddddddddfklggcdddddddddddddddehjapadddddddddfknopadddddfklmmmnopcddddddehjiddddddfknopacddddddfklpaccdddddfklmaacdfehpadddddehjblckknZZ
* 2LFU.pdb | model 8 | chain A
  ZZcehjdeehiacdjdddedjbdcdddddddddfklggcdddddddddddddddbfblbacddddddddfklopacddddehiamlnopaddddddehjacddddfehpaaccdddddefklpaccdddddfklmbfbehehiaddddddffkgoiehZZ
* 2LFU.pdb | model 9 | chain A
  ZZpccdjdfkbccdddddehhpacdddddddddfklggcdddddddddddddehiacbdcdddddddddfklopadddddehiammnopcddddddeejiadddehjlgobacddddddfklmpccddddehiacbcbdfehpadddddehjklmklmZZ
* 2LFU.pdb | model 10 | chain A
  ZZccfklcfkbccdddddehjbdcdddddddddfklggcdddddddddddddehiapaccdddddddddfklopadddddehjamlnopaddddddehjddcdddfbfghpacddddddfklpaccddddddfbcfbacfehpadddddddekpghiaZZ
* 3ICH.pdb | chain A
  ZZccdfbdcdddddehjbdebjcdddddfklmmmlmmmmmmmmnopnopajeopacfbdcehibacehiamnonopgocdfkbjbdcdfblmbccfbghiacdddebehiafkbccddfbdcfklgokaccfbdcfbhklmmmmmmmpccdfkopafbacddfbgcddddfbacddddZZ
Read 10 chain(s) in 2LFU.pdb
Read 1 chain(s) in 3ICH.pdb
Assign PB for frames in a trajectory

PB sequences can be assigned from a trajectory. To do so, we use the pbxplore.chains_from_trajectory() function that takes the path to a trajectory and the path to the corresponding topology as argument. Any file formats readable by MDAnalysis can be used. Except for its arguments, pbxplore.chains_from_trajectory() works the same as pbxplore.chains_from_files().

[8]:
topology, _ = urllib.request.urlretrieve('https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj.gro',
                                         'psi_md_traj.gro')
trajectory, _ = urllib.request.urlretrieve('https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj.xtc',
                                           'psi_md_traj.xtc')

for chain_name, chain in pbx.chains_from_trajectory(trajectory, topology):
    dihedrals = chain.get_phi_psi_angles()
    pb_seq = pbx.assign(dihedrals)
    print('* {}'.format(chain_name))
    print('  {}'.format(pb_seq))
Frame 1/225.
* psi_md_traj.xtc | frame 0
  ZZfkbcnopabfklmmmmpckbccdfbfkbcghiaghidfklmmmmmpcfklccZZ
* psi_md_traj.xtc | frame 1
  ZZfkbcnhpabfklmmmmpckbccdfbfkbcchiachidfklmmmmmbdfklccZZ
* psi_md_traj.xtc | frame 2
  ZZfkbcnhpabfklmmmmcckbccdfbfkbcchiacdddfklmmmmmbdfklpcZZ
* psi_md_traj.xtc | frame 3
  ZZfkbcnhpacfklmmmmpckbccdfbfkbcehiaedddfklmmmpccdfklpcZZ
* psi_md_traj.xtc | frame 4
  ZZfkbmnopabfklmmmmmmmbccddbfkbcghiaehidfklmmmmmcdfklpcZZ
* psi_md_traj.xtc | frame 5
  ZZfkbcnlpabfklmmmmpmkbccdfbfkbcchiacdddfklmmmccbdfklccZZ
* psi_md_traj.xtc | frame 6
  ZZfkbcnopabfklmmmnofkbccdfbfkbcchiacdddfklmmoccbdfklpcZZ
* psi_md_traj.xtc | frame 7
  ZZfkbmnopabfklmmmmcfkbccdddfkbcchiaghidfklmmmccbdfklpcZZ
* psi_md_traj.xtc | frame 8
  ZZfkbcnhpabfklmmmmockbccdfbfkbcehiaehiafklmmmmcbcfklccZZ
* psi_md_traj.xtc | frame 9
  ZZfkbcnhpabfklmmmmcckbccdfbfkbcghiaehidfklmmmcfbdfklccZZ
* psi_md_traj.xtc | frame 10
  ZZfkbcnopabfklmmmmpmkbccdfbfklcghiaehidfklmmmccbdfklccZZ
* psi_md_traj.xtc | frame 11
  ZZfkbcghpabfklmmmnockbccdfbfcbcehiaedddfklmmmpcbdfklccZZ
* psi_md_traj.xtc | frame 12
  ZZfkbanopabfklmmmmockbccdfbfkbcchiaehidfklmmmmmbdfklccZZ
* psi_md_traj.xtc | frame 13
  ZZfkbcnhpabfklmmmmockbccdfbfkbcchiaghidfklmmmpccdfklccZZ
* psi_md_traj.xtc | frame 14
  ZZfkbcnbpabfklmmmmockbccdfbfkbcchiaghidfklmmopfbdfklccZZ
* psi_md_traj.xtc | frame 15
  ZZfkbcnlpabfklmmmmockbccdfbfklcchiacdddfklmmmpccdfklccZZ
* psi_md_traj.xtc | frame 16
  ZZfkbcghpabfklmmmmmmkbccdfbfkbcehiachidfklmmmpcbdfklccZZ
* psi_md_traj.xtc | frame 17
  ZZfkbanlpabfklmmmmpckbccdfbfkbcchiaedddfklmmmcfbdfklccZZ
* psi_md_traj.xtc | frame 18
  ZZfkbcnhpabfklmmmmomkbccdfbfkbcehiaehidfklmmmcfbdfklccZZ
* psi_md_traj.xtc | frame 19
  ZZfkbcnopabfklmmmmpmkbccdfbfkbcehiaehidfklmmmpmbdfklccZZ
* psi_md_traj.xtc | frame 20
  ZZfkbcnhpabfklmmmmockbccdfbfkbaghiaedddfklmmmgfbafklccZZ
* psi_md_traj.xtc | frame 21
  ZZfkbcnbpabfklmmmmcckbccdfbfkbcehiaghiafklmmmcfbdfklccZZ
* psi_md_traj.xtc | frame 22
  ZZfkbcnlpabfklmmmmofkbccdfbfkbcchiaghiafklmmmccbdfklccZZ
* psi_md_traj.xtc | frame 23
  ZZfkbcnopabfklmmmmockbccdfbfkbcchiaehiafklmmmccbdfklccZZ
* psi_md_traj.xtc | frame 24
  ZZfkbcnbpabfklmmmmockbccddbfkbcchiaehidfklmmmpmbdfklccZZ
* psi_md_traj.xtc | frame 25
  ZZfkbcgbpabfklmmmmockbccdfbfkbcchiachidfklmmmccbdfklpcZZ
* psi_md_traj.xtc | frame 26
  ZZfkbcnbpabfklmmmmbgkbccdfbfkbcchiacdidfklmmmpmbdfklccZZ
* psi_md_traj.xtc | frame 27
  ZZfkbcnopabfklmmmmpmkbccdfbfkbaghiaghiafklmmmcfbdfklccZZ
* psi_md_traj.xtc | frame 28
  ZZfkbcnhpabfklmmmmockbccdfbfkbcghiaedidfklmmmcfbdfklccZZ
* psi_md_traj.xtc | frame 29
  ZZfkbcnhpabfklmmmmmmmbccdfbfkbcghiaehidfklmmmcfbdfklccZZ
* psi_md_traj.xtc | frame 30
  ZZfkbcnlpabfklmmmmmmmbccdfbfklcchiaehidfklmmmpfbdfklccZZ
* psi_md_traj.xtc | frame 31
  ZZfkbonhpabfklmmmmmmmbccdfbfkbcghiaghidfklmmmmmbdfklccZZ
* psi_md_traj.xtc | frame 32
  ZZfkbcfbpabfklmmmmmmmbccdfbdcbcehiaghidfklmmmccbdfklccZZ
* psi_md_traj.xtc | frame 33
  ZZfkbcnhpabfklmmmmmmkbccdfbfkbcchiaghidfklmmmccbdfklccZZ
* psi_md_traj.xtc | frame 34
  ZZfklcnopabfklmmmmbmmbccdfbfklcghiaghidfklmmmccbdfklccZZ
* psi_md_traj.xtc | frame 35
  ZZfklcnopabfklmmmmpfkbccdfbfkbcehiaehiafklmmmccbdfklccZZ
* psi_md_traj.xtc | frame 36
  ZZfkbcnopabfklmmmnofkbccdfbfcbcehiacdddfklmmmcfbdfklccZZ
* psi_md_traj.xtc | frame 37
  ZZfklcnopabfklmmmmmmmbccdfbfklcghiaghidfklmmmcfbdfklccZZ
* psi_md_traj.xtc | frame 38
  ZZfkbcnopabfklmmmmockbccdfbfkbcehiaehiafklmmmccbdfklpcZZ
* psi_md_traj.xtc | frame 39
  ZZfkbcnopabfklmmmmmmmbccdfbfkbcghiaehiafklmmmpfbdfklccZZ
* psi_md_traj.xtc | frame 40
  ZZfkbcnlpabfklmmmmockbccdddfklcchiaghiafklmmmpfbafklccZZ
* psi_md_traj.xtc | frame 41
  ZZfkbcnopacfklmmmmmmkbccdddfkbcehiaghiafklmmmcfbdfklmcZZ
* psi_md_traj.xtc | frame 42
  ZZfkbcnopabfklmmmnockbccdddfkbcchiaehidfklmmncfbdfklpcZZ
* psi_md_traj.xtc | frame 43
  ZZfkbcnopabfklmmmmockbccdddfkbcchiaehidfklmmmgmbdfklpcZZ
* psi_md_traj.xtc | frame 44
  ZZfkbcnhpabfklmmmmcfkbccdfbfkbcehiaghiafklmmmcfbdfklpcZZ
* psi_md_traj.xtc | frame 45
  ZZfkbcnhpabfklmmmmmgcbccdfbfkbcchiaehidfklmmmccbdfkbccZZ
* psi_md_traj.xtc | frame 46
  ZZfkbcnopabfklmmmmocbbccdfbdfbcehiaehidfklmmmgfbacklccZZ
* psi_md_traj.xtc | frame 47
  ZZfkbcnbpabfklmmmmockbccdfbfkbcchiaehidfklmmoccbdfklccZZ
* psi_md_traj.xtc | frame 48
  ZZfkbmnbpabfklmmmmockbccdfbfkbcchiaehiafklmmommbcfklccZZ
* psi_md_traj.xtc | frame 49
  ZZfkbcnlpabfklmmmmcfkbccdfbfklcghiaehiafklmmmmmbcfklccZZ
* psi_md_traj.xtc | frame 50
  ZZfkbcnopabfklmmmmockbccdfbfcbcehiacdidfklmmmmmpcfklpcZZ
* psi_md_traj.xtc | frame 51
  ZZfkbcnopabfklmmmmockbccdfbfkbcchiaehidfklmmmmmbcfklccZZ
* psi_md_traj.xtc | frame 52
  ZZfkbcnbpabfklmmmmomkbccdfbfklcghiaghidfklmmmmmmcfklccZZ
* psi_md_traj.xtc | frame 53
  ZZfkbmnopcbfklmmmmpckbccdfbfklcghiaghiafklmmmmmccfklpcZZ
* psi_md_traj.xtc | frame 54
  ZZfkbcnbpfbfklmmmmpckbccdddfkbcchiaghiafklmmmmmpcfklpcZZ
* psi_md_traj.xtc | frame 55
  ZZfkbcnbpfbfklmmmmockbccdfbfkbcchiaehidfklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 56
  ZZfkbmnopabfklmmmmcckbccdfbfkbcchiaehiafklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 57
  ZZfkbcnbpabfklmmmmcfkbccdfbfkbcehiaehiafklmmmmmacdfbacZZ
* psi_md_traj.xtc | frame 58
  ZZfkbcnhpabfklmmmmockbccdfbdfbachiaehiafklmmmmmmcfklccZZ
* psi_md_traj.xtc | frame 59
  ZZfkbcnopabfklmmmmockbccdfbfkbcchiaghidfklmmmmmmccfbacZZ
* psi_md_traj.xtc | frame 60
  ZZfkbcnbpabfklmmmmofkbccdfbfkbcehiaghidfklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 61
  ZZfkbanbpfbfklmmmmockbccdfbfkbcehiaghidfklmmmmmcccfbacZZ
* psi_md_traj.xtc | frame 62
  ZZfkbcnbpabfklmmmmomkbccdfbfkbachiaehiafklmmnomacdfbacZZ
* psi_md_traj.xtc | frame 63
  ZZfkbcnbpabfklmmmmpmkbccdfbfkbcchiaehiafklmmmmnocchiacZZ
* psi_md_traj.xtc | frame 64
  ZZfkbcnbpabfklmmmmpckbccdfbfkbcehiaehiafklmmmmmpccfkacZZ
* psi_md_traj.xtc | frame 65
  ZZfkbcnbpcbfklmmmmpckbccdfbfkbcehiaghiafklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 66
  ZZfkbcnopabfklmmmmockbccdfbfkbcchiaehiafklmmmmmbccfbccZZ
* psi_md_traj.xtc | frame 67
  ZZfkbcnhpabfklmmmmcfkbccdfbfkbcehiaehiafklmmmmmbccfbccZZ
* psi_md_traj.xtc | frame 68
  ZZfkbcnbpabfklmmmmpckbccdfbfkbcehiaehidfklmmmmmbccfbacZZ
* psi_md_traj.xtc | frame 69
  ZZfkbcnbpabfklmmmmockbccdfbfkbcghiaehiafklmmmmmocfklccZZ
* psi_md_traj.xtc | frame 70
  ZZfkbcnopabfklmmmmmmkbccdfbfkbcchiaehidfklmmmmmcccfbacZZ
* psi_md_traj.xtc | frame 71
  ZZfkbcnbpabfklmmmmomkbccdfbfkbcehiaghidfklmmmmmcccfbacZZ
* psi_md_traj.xtc | frame 72
  ZZfkbcfbpabfklmmmmofkbccdfbfcbcehiaehidfklmmmmmmccfbacZZ
* psi_md_traj.xtc | frame 73
  ZZfkbcnbpabfklmmmmockbccdfbffbcchiaehiafklmmmmmccdfbacZZ
* psi_md_traj.xtc | frame 74
  ZZfkbcnbpabfklmmmmcckbccdfbfklcghiaehiafklmmmmmcccfbdcZZ
* psi_md_traj.xtc | frame 75
  ZZfkbcfhpabfklmmmmpmkbccdfbfklcchiaehiafklmmmmmccfklccZZ
* psi_md_traj.xtc | frame 76
  ZZfkbcghpabfklmmmmmmmbccdfbfkbcchiaehiafklmmmmmpacfbacZZ
* psi_md_traj.xtc | frame 77
  ZZfkbcchiabfklmmmmbfkbccdfbfkbcehiaehiafklmmmmmccdfbacZZ
* psi_md_traj.xtc | frame 78
  ZZfkbcchiacfklmmmmmmkbccdfbfkbcehiaehiafklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 79
  ZZfkbcghpabfklmmmmmmkbccdfbfkbcchiaehidfklmmmmmacfklpcZZ
* psi_md_traj.xtc | frame 80
  ZZfkbcghpabfklmmmmmmkbccdfbfkbcghiaghiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 81
  ZZfkbcnhpabfklmmmmockbccdfbfkbcehiaghidfklmmmmmbcchiacZZ
* psi_md_traj.xtc | frame 82
  ZZfkbcghpabfklmmmmmmcbccdfbfklcghiaghiafklmmmmmpcfklccZZ
* psi_md_traj.xtc | frame 83
  ZZfkbcfbpabfklmmmmockbccdfbfklcghiaghidfklmmnomacdfbacZZ
* psi_md_traj.xtc | frame 84
  ZZfkbcnbpabfklmmmmockbccdfbdfbgghiaehiafklmmmmmbccfbccZZ
* psi_md_traj.xtc | frame 85
  ZZfkbcgbpacfklmmmmomkbccdfbfkbcehiaghiafklmmmmmpcfklccZZ
* psi_md_traj.xtc | frame 86
  ZZfkbcghpacfklmmmmmmmbccdfbfkbcehiaehiafklmmmmmccfklccZZ
* psi_md_traj.xtc | frame 87
  ZZfkbcnopabfklmmmmomkbccdfbdcbcehiaghiafklmmmmmacfklpcZZ
* psi_md_traj.xtc | frame 88
  ZZfkbcfbpacfklmmmmmmmbccdfbfkbcghiaehiafklmmmmmacfklacZZ
* psi_md_traj.xtc | frame 89
  ZZfkbcnopabfklmmmmomkbccdfbfklcghiaghiafklmmmmmmcfklpcZZ
* psi_md_traj.xtc | frame 90
  ZZfkbcnhpacfklmmmmockbccdfbfkbcghiaghiafklmmmmmccfklccZZ
* psi_md_traj.xtc | frame 91
  ZZffbcnopacfklmmmmockbccdfbfklaghiaehiafklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 92
  ZZfkbcnlpacfklmmmmcckbccdfbfkbcghiaehidfklmmmmmccdfbacZZ
* psi_md_traj.xtc | frame 93
  ZZdfbanhpacfklmmmmcfkbccdfbfkbcehiaghiafklmmmmmccdfbacZZ
* psi_md_traj.xtc | frame 94
  ZZfkbcnhpacfklmmmmockbccdfbfkbcehiaehiafklmmmmmccdfbacZZ
* psi_md_traj.xtc | frame 95
  ZZfkbcnopabfklmmmmmmmbccdfbfkbcehiaehiafklmmmmmcccfbacZZ
* psi_md_traj.xtc | frame 96
  ZZfkbcnbpabfklmmmmockbccdfbfkbcehiaghiafklmmmmmcccfbacZZ
* psi_md_traj.xtc | frame 97
  ZZfkbcnbpabfklmmmmmmkbccdfbfklcghiaehiafklmmmmmmccfbccZZ
* psi_md_traj.xtc | frame 98
  ZZdfbfnbpfbfklmmmmmmkbccdfbfkbcchiaehidfklmmmmmpccklccZZ
* psi_md_traj.xtc | frame 99
  ZZfkbcnbpabfklmmmncckbccdfbafbcchiaghidfklmmmmmbcehiacZZ
* psi_md_traj.xtc | frame 100
  ZZfkbcnbpabfklmmmmmmkbccdfbfkbachiaghidfklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 101
  ZZfkbcnbjabfklmmmmmmmbccdfbfklpghiaghidfklmmmmmpcfkbccZZ
* psi_md_traj.xtc | frame 102
  ZZfkbcnbpcbfklmmmmockbccdfbfklcghiaehidfklmmmmmpccklccZZ
* psi_md_traj.xtc | frame 103
  ZZfkbcnbpabfklmmmmockbccdfbfkbcchiaehidfklmmmmmccdhiacZZ
* psi_md_traj.xtc | frame 104
  ZZfkbcnopacfklmmmmpckbccdfbfkbcehiaehiafklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 105
  ZZfkbcnopabfklmmmmockbccdfbfklcchiaehidfklmmmmmbccfbacZZ
* psi_md_traj.xtc | frame 106
  ZZfkbcnbpcbfklmmmmoccbccdfbfkbcghiaehidfklmmmmmbccfbacZZ
* psi_md_traj.xtc | frame 107
  ZZfkbcnbjfbfklmmmmockbccdfbfkbcghiaehidfklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 108
  ZZfkbckbpcbfklmmmmpmkbccdfbfkbcehiaghidfklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 109
  ZZfkbcnbpabfklmmmmpmkbccdfbfklcghiaghiafklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 110
  ZZfkbcnbpabfklmmmnockbccdfbfklcehiaghiafklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 111
  ZZfkbcnbpabfklmmmmockbccdfbfklcghiaehiafklmmmnmpacfbacZZ
* psi_md_traj.xtc | frame 112
  ZZfkbcfbpcbfklmmmmockbccdfbfklcghiaghiafklmmmmmpccfbacZZ
* psi_md_traj.xtc | frame 113
  ZZfkbcnhpabfklmmmmpckbccdfbfklcchiaehiafklmmmmmcccfbacZZ
* psi_md_traj.xtc | frame 114
  ZZfkbcehpabfklmmmmpmkbccdfbfkbcghiaghiafklmmmmmccdfbacZZ
* psi_md_traj.xtc | frame 115
  ZZfkbcnbpabfklmmmmomkbccdfbfkbcehiaghiafklmmmmmcccfbacZZ
* psi_md_traj.xtc | frame 116
  ZZfkbcnbpabfklmmmmomkbccdfbfklcghiaghiafklmmmmmcccfbccZZ
* psi_md_traj.xtc | frame 117
  ZZfkbcfbpabfklmmmmomkbccdfbfkbcghiaghiafklmmmmmccdfbccZZ
* psi_md_traj.xtc | frame 118
  ZZfkbcnbpcbfklmmmmockbccdfbfkbcghiaghiafklmmmmmpccfbpcZZ
* psi_md_traj.xtc | frame 119
  ZZfkbcnopabfklmmmmockbccdfbfklcghiaehiafklmmmmmbccfbdcZZ
* psi_md_traj.xtc | frame 120
  ZZfkbcnopabfklmmmmockbccdfbfkbcehiaehiafklmmmmmccdfbacZZ
* psi_md_traj.xtc | frame 121
  ZZfkbcnbpabfklmmmnockbccdfbfklcehiaghiafklmmmmmpccfbccZZ
* psi_md_traj.xtc | frame 122
  ZZfkbcnopabfklmmmmmckbccdfbfklcghiaghiafklmmmmmcccfbacZZ
* psi_md_traj.xtc | frame 123
  ZZfkbcnopabfklmmmmbckbccdfbfkbcchiaghiafklmmnombcfklccZZ
* psi_md_traj.xtc | frame 124
  ZZfkbcehpabfklmmmmcckbccdfbfkbcehiaehiafklmmmmmbcfklpcZZ
* psi_md_traj.xtc | frame 125
  ZZfkbcnbpabfklmmmmockbccdfbffbcchiaghiafklmmmmmcccfbacZZ
* psi_md_traj.xtc | frame 126
  ZZfkbcnopabfklmmmmockbccdfbfklcghiaehiafklmmmmmpccfbccZZ
* psi_md_traj.xtc | frame 127
  ZZfkbcnbpabfklmmmmmmkbccdfbfkbcehiachiafklmmmmmmccfbacZZ
* psi_md_traj.xtc | frame 128
  ZZfkbcnopabfklmmmmockbccdfbfbbcehiaghiafklmmmmmpccfbacZZ
Frame 100/225.
* psi_md_traj.xtc | frame 129
  ZZfkbcnopabfklmmmmmmkbccdfbfbdcehiaehiafklmmmmmcccfbacZZ
* psi_md_traj.xtc | frame 130
  ZZfkbcnopabfklmmmmockbccdfbfkbcghiaghiafklmmmmmcccfbacZZ
* psi_md_traj.xtc | frame 131
  ZZfkbcnbpabfklmmmmockbccdfbfkbcehiaghiafklmmmmmmgcfkacZZ
* psi_md_traj.xtc | frame 132
  ZZfkbmnlpabfklmmmmockbccdfbfkbcghiaehiafklmmmmmpcghiacZZ
* psi_md_traj.xtc | frame 133
  ZZfkbcnopabfklmmmmbckbccdfbfcbcehiaehiafklmmmmmacehiacZZ
* psi_md_traj.xtc | frame 134
  ZZfkbcnbpabfklmmmmockbccdfbfkbcehiaehidfklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 135
  ZZfkbcnopabfklmmmmockbccdfbfkbcehiaghiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 136
  ZZfkbcnbpcbfklmmmmmmkbccdfbfkbcehiaehiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 137
  ZZfkbmnopabfklmmmmcckbccdfbfkbcghiaghiafklmmmmmpgehiacZZ
* psi_md_traj.xtc | frame 138
  ZZfkbcnbpabfklmmmmmmmbccdfblkbcehiaehiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 139
  ZZfkbcnbpcbfklmmmmbmkbccdfbfkbcehiaehiafklmmmmmmcehjacZZ
* psi_md_traj.xtc | frame 140
  ZZfkbfnopcbfklmmmmpmkbccdfbfkbcehiaghiafklmmmmmmgehiacZZ
* psi_md_traj.xtc | frame 141
  ZZfkbcnbpcbfklmmmmmmmbccdfbfkbcehiaehiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 142
  ZZfkbcnopabfklmmmmpmkbccdfbfkbcehiaehiafklmmmmmacehiacZZ
* psi_md_traj.xtc | frame 143
  ZZfkbcnhpabfklmmmmomkbccdfbfkbcehiaghiafklmmmmmacehiacZZ
* psi_md_traj.xtc | frame 144
  ZZfkbcnbpabfklmmmmmmkbccdfbfkbcehiachiafklmmmmmmcehiacZZ
* psi_md_traj.xtc | frame 145
  ZZfkbcfbpabfklmmmmbckbccdfbdfbaehiaghiafklmmmmmacehiacZZ
* psi_md_traj.xtc | frame 146
  ZZfkbcnbpcbfklmmmmockbccdfbdfbaehiaehiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 147
  ZZfkbcnbpabfklmmmmockbccdfbdhiaghiaghiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 148
  ZZfkbcnbpcbfklmmmmomkbccdfbdcbdehiaghiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 149
  ZZfkbcnbpabfklmmmmcfkbccdfbfkbcehiaghiafklmmmmmpcghiacZZ
* psi_md_traj.xtc | frame 150
  ZZfkbcnbpcbfklmmmmomkbccdfbfkbcehiaehiafklmmnombcghiacZZ
* psi_md_traj.xtc | frame 151
  ZZfkbcnbpabfklmmmmockbccdfbfblcehiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 152
  ZZfkbcnbpabfklmmmmockbccdfbfklcehiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 153
  ZZfkbcnbpabfklmmmmpfkbccdfbfblcehiaghiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 154
  ZZfkbcnbpabfklmmmmpmkbccdfbfklcghiaehiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 155
  ZZfkbcnbpabfklmmmmpmkbccdfbfklcghiaghiafklmmmmmpcghiacZZ
* psi_md_traj.xtc | frame 156
  ZZfkbcnbpabfklmmmmpckbccdfbfklcghiaehiafklmmmmmbcehiacZZ
* psi_md_traj.xtc | frame 157
  ZZfkbcnbpabfklmmmmofkbccdfbfkbcghiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 158
  ZZfkbcnbpabfklmmmnockbccdfbfkbcehiaehiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 159
  ZZfkbcnbpcbfklmmmmockbccdfbdfbaehiaehiafklmmmmmmcehiacZZ
* psi_md_traj.xtc | frame 160
  ZZfkbcnbpabfklmmmmmmkbccdfbdfbachiaghiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 161
  ZZfkbcnbpabfklmmmmockbccdfbacbdehiaehiafklmmmmmmcehiacZZ
* psi_md_traj.xtc | frame 162
  ZZfkbcnbpabfklmmmmockbccdfbfkbcehiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 163
  ZZfkbcfbpcbfklmmmmpckbccdfbfkbcghiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 164
  ZZfkbcnhpabfklmmmmomkbccdfbdfbaehiaehiafklmmmmmpcehkacZZ
* psi_md_traj.xtc | frame 165
  ZZfkbcnopabfklmmmmockbccdfbdcbcehiaehiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 166
  ZZfkbcnbpabfklmmmmcfkbccdfbfkbcghiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 167
  ZZfkbcfbpfbfklmmmmockbccdfbfkbcghiaghiafklmmommpcehiacZZ
* psi_md_traj.xtc | frame 168
  ZZfkbcfbpabfklmmmmockbccdfbfkbcghiaehiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 169
  ZZfkbcfbpabfklmmmmockbccdfbfkbcghiaghiafklmmmmmmcehiacZZ
* psi_md_traj.xtc | frame 170
  ZZfkbcfbpabfklmmmmockbccdfbfkbcghiachiafklmmmmmacehiacZZ
* psi_md_traj.xtc | frame 171
  ZZfkbcnopabfklmmmmockbccdfbfklcehiaghiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 172
  ZZfkbcnopabfklmmmmockbccdfbfklcghiaehiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 173
  ZZfkbcnopabfklmmmmpckbccdfbfkbcehiaehiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 174
  ZZfkbmnopabfklmmmmockbccdfbfklcehiaehiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 175
  ZZfkbcnbpabfklmmmmockbccdfbfklcchiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 176
  ZZfkbcnbpabfklmmmncfkbccdfbfkbcehiaehiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 177
  ZZfkbcnopabfklmmmnockbccdfbfkbcghiaehiafklmmmmmmcehiacZZ
* psi_md_traj.xtc | frame 178
  ZZfkbcnbpabfklmmmmcckbccdfbfkbcehiaghiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 179
  ZZfkbcnbpabfklmmmncfkbccdfbfkbcghiaghiafklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 180
  ZZfkbcnopabfklmmmmpckbccdfbfklcehiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 181
  ZZfkbfnbpcbfklmmmmockbccdfbdcbcehiacdddfklmmmmnpcehiacZZ
* psi_md_traj.xtc | frame 182
  ZZfkbcnbpcbfklmmmmpmkbccdfbfkbcghiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 183
  ZZfkbcnbpabfklmmmmockbccdfbfkbcehiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 184
  ZZfkbcnbpabfklmmmmcckbccdfbfklcghiaehiafklmmommccehiacZZ
* psi_md_traj.xtc | frame 185
  ZZfkbcnbpabfklmmmmpfkbccdfbfkbcehiaehiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 186
  ZZfkbcnopabfklmmmmmmkbccdfbfklcghiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 187
  ZZfkbcnopabfklmmmmockbccdfbfklcghiaghiafklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 188
  ZZfkbcnbpabfklmmmmpmkbccdfbfkbcchiafbdcfklmmmmmpcehjacZZ
* psi_md_traj.xtc | frame 189
  ZZfkbcnbpcbfklmmmmockbccdfbfcbcehiafbdcfklmmmmmbcehiacZZ
* psi_md_traj.xtc | frame 190
  ZZfkbcnbpcbfklmmmmockbccdfbfklcghiafbdcfklmmmmmbcehiacZZ
* psi_md_traj.xtc | frame 191
  ZZfkbmnbpcbfklmmmnockbccdfblbacehiafbdcfklmmnmmbcehiacZZ
* psi_md_traj.xtc | frame 192
  ZZfkbcnbpfbfklmmmmpmkbccdfbfbdcehiafbdcfklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 193
  ZZfkbcnbpcbfklmmmnomkbccdfbfklcghiacbdcfklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 194
  ZZfkbmnbpabfklmmmmpmkbccdfbfkbcghiafbdcfklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 195
  ZZfkbfnbpabfklmmmmpmkbccdfbfkbcehiafbdcfklmmmmmmcehiacZZ
* psi_md_traj.xtc | frame 196
  ZZfkbfnbpabfklmmmmpmkbccdfbfkbcehiafbdcfklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 197
  ZZfkbmnbpabfklmmmmockbccdfbfkbcehiafbdcfklmmmmmpcchiacZZ
* psi_md_traj.xtc | frame 198
  ZZfkbmnbpcbfklmmmmpmkbccdfbdfbcehiafbdcfklmmmmmmcehiacZZ
* psi_md_traj.xtc | frame 199
  ZZfkbcnbpabfklmmmnockbccdfbfkbcghiafbdcfklmmmmmpaehiacZZ
* psi_md_traj.xtc | frame 200
  ZZfkbcnbpcbfklmmmmockbccdfbdcbcehiafbdcfklmmmmmmcehiacZZ
* psi_md_traj.xtc | frame 201
  ZZfkbcnopabfklmmmmomkbccdfbfbdcehiafbdcfklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 202
  ZZfkbcnopabfklmmmmomkbccdfbfkbcehiafbdcfklmmmmmpcghiacZZ
* psi_md_traj.xtc | frame 203
  ZZfkbcnopabfklmmmmmmkbccdfbfbbcehiafbdcfklmmmmmpcghiacZZ
* psi_md_traj.xtc | frame 204
  ZZfkbcnopabfklmmmmpmkbccdfbfbbcehiafbdcfklmmmmmpcghiacZZ
* psi_md_traj.xtc | frame 205
  ZZfkbcnopabfklmmmmomkbccdfbfbbcehiafbdcfklmmmmmbcghiacZZ
* psi_md_traj.xtc | frame 206
  ZZfkbcnopabfklmmmmpckbccdfbfkbgghiafbdcfklmmmmmpcehjacZZ
* psi_md_traj.xtc | frame 207
  ZZfkbcnopacfklmmmmomkbccdfblkbcghiafbdcfklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 208
  ZZfkbcfbpabfklmmmmockbccdfbfkbcghiafbdcfklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 209
  ZZfkbcnbpabfklmmmmockbccdfbfkbcghiafbdcfklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 210
  ZZfkbcnbpabfklmmmmockbccdfbfkbcghiafbdcfklmmmmmbcehiacZZ
* psi_md_traj.xtc | frame 211
  ZZdfbanbpabfklmmmmockbccdfbfkbcehiafbdcfklmmmnmccehiacZZ
* psi_md_traj.xtc | frame 212
  ZZfkbcnopabfklmmmnofkbccdfbfkbcehiafbdcfklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 213
  ZZfkbcnbpabfklmmmmockbccdfbfkbcehiafbdcfklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 214
  ZZfkbcnopabfklmmmmomkbccdfbfcbcehiafbdcfklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 215
  ZZfkbcnopabfklmmmmpfkbccdfbfkbcchiacbdcfklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 216
  ZZfkbcnopabfklmmmmockbccdfbfkbcehiacbdcfklmmmnmpaghiacZZ
* psi_md_traj.xtc | frame 217
  ZZfkbcnbpabfklmmmmcckbccdfbfkbcehiafbdcfklmmmmmmcehiacZZ
* psi_md_traj.xtc | frame 218
  ZZfkbcnhpabfklmmmmpckbccdfbfkbcchiafbdcfklmmmmmccehiacZZ
* psi_md_traj.xtc | frame 219
  ZZfkbcnbpabfklmmmmpckbccdfbfkbachiafbdcfklmmmmmpcehiacZZ
* psi_md_traj.xtc | frame 220
  ZZfkbcnbpabfklmmmnockbccdfbfkbcehiafbdcfklmmmnmpaghiacZZ
* psi_md_traj.xtc | frame 221
  ZZdfbanbpabfklmmmnockbccdfblkbcehiafbdcfklmmmmmmcehiacZZ
* psi_md_traj.xtc | frame 222
  ZZfkbcnopabfklmmmmcfkbccdfbfkbcehiafbdcfklmmmmmmcehiacZZ
* psi_md_traj.xtc | frame 223
  ZZfkbcnopabfklmmmmcfkbccdfbfkbcehiafbdcfklmmmmmmcghiacZZ
* psi_md_traj.xtc | frame 224
  ZZfkbcnbpabfklmmmmofkbccdfbfkbcghiafbdcfklmmmmmpcehiacZZ
Frame 200/225.
Frame 225/225.
Use a different structure parser

Providing the dihedral angles can be formated as expected by pbxplore.assign(), the source of these angles does not matter. For instance, other PDB parser can be used with PBxplore.

BioPython
[9]:
import Bio
import Bio.PDB
import math
print("BioPython version:", Bio.__version__)

pdb_name, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/2LFU.pdb', '2LFU.pdb')

for model in Bio.PDB.PDBParser().get_structure("2LFU", pdb_name):
    for chain in model:
        polypeptides = Bio.PDB.PPBuilder().build_peptides(chain)
        for poly_index, poly in enumerate(polypeptides):
            dihedral_list = poly.get_phi_psi_list()
            dihedrals = {}
            for resid, (phi, psi) in enumerate(dihedral_list, start=1):
                if not phi is None:
                    phi = 180 * phi / math.pi
                if not psi is None:
                    psi = 180 * psi / math.pi
                dihedrals[resid] = {'phi': phi, 'psi': psi}
        print(model, chain)
        pb_seq = pbx.assign(dihedrals)
        print(pb_seq)
BioPython version: 1.78
<Model id=0> <Chain id=A>
ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcddddddddddfklopadddddfhpamlnopcddddddehjadddddehjacbddddddddfklmaccddddddfbgniaghiapaddddddfklnoambZZ
<Model id=1> <Chain id=A>
ZZpcfblcffbccdddddeehjacdddddddddfklggcddddddddddddddfblghiadddddddddfklopadddddehpmmmnopcddddddeehiacdddfblopadcddddddfklpaccdddddfklmlmgcdehiaddddddfklmmgopZZ
<Model id=2> <Chain id=A>
ZZmgghiafbbccdddddehjbdcdddddddddfklggcddddddddddddddfbfghpacddddddddfklopadddddehiaklmmmgcdddddeehiaddddfkbgciacdddddefklpaccddddddfkgojbdfehpaddddddfkbccfbgZZ
<Model id=3> <Chain id=A>
ZZcghiacfkbacdddddfbhpacdddddddddfklmcfdddddddddddddehiacddddddddddddfknopadddddfkpamlnopaddddddehjaccdddfklnopacddddddfklmpccdddddddehiabghehiaddddddfklpccfkZZ
<Model id=4> <Chain id=A>
ZZpaehiehkaccdddddehjbccdddddddddfklggcddddddddddddddfbhpadddddddddddfklopadddddehiamlmmpccdddddeehiadddddfbacddcddddddfklmaccddddddfbgghiafehiadddddddfklpacfZZ
<Model id=5> <Chain id=A>
ZZmghbacfkbccdddddeehpacdddddddddfklggcdddddddddddddehiacadddddddddddfklopadddddehiaklnopcddddddeehiadddehjlnopacddddddfklmaccddddehiaehbgcdehiadddddddfehjlpcZZ
<Model id=6> <Chain id=A>
ZZcchbacfkbccdddddfehpacdddddddddfklggcdddddddddddddddehjapadddddddddfknopadddddfklmmmnopcddddddehjiddddddfknopacddddddfklpaccdddddfklmaacdfehpadddddehjblckknZZ
<Model id=7> <Chain id=A>
ZZcehjdeehiacdjdddedjbdcdddddddddfklggcdddddddddddddddbfblbacddddddddfklopacddddehiamlnopaddddddehjacddddfehpaaccdddddefklpaccdddddfklmbfbehehiaddddddffkgoiehZZ
<Model id=8> <Chain id=A>
ZZpccdjdfkbccdddddehhpacdddddddddfklggcdddddddddddddehiacbdcdddddddddfklopadddddehiammnopcddddddeejiadddehjlgobacddddddfklmpccddddehiacbcbdfehpadddddehjklmklmZZ
<Model id=9> <Chain id=A>
ZZccfklcfkbccdddddehjbdcdddddddddfklggcdddddddddddddehiapaccdddddddddfklopadddddehjamlnopaddddddehjddcdddfbfghpacddddddfklpaccddddddfbcfbacfehpadddddddekpghiaZZ

Writing PB in file

The API allows to write all the files the command line tools can. This includes the outputs of PBassign. The functions to handle several file formats are available in the pbxplore.io module.

[1]:
from pprint import pprint
import urllib.request
import os

# print date & versions
import datetime
print("Date & time:",datetime.datetime.now())
import sys
print("Python version:", sys.version)
Date & time: 2021-01-27 13:55:39.313685
Python version: 3.7.9 (default, Oct 19 2020, 15:13:17)
[GCC 7.5.0]
[2]:
import pbxplore as pbx
print("PBxplore version:", pbx.__version__)
PBxplore version: 1.4.0
Fasta files

The most common way to save PB sequences is to write them in a fasta file.

PBxplore allows two ways to write fasta files. The sequences can be written either all at once or one at a time. To write a batch of sequences at once, we need a list of sequences and a list of the corresponding sequence names. The writing function here is pbxplore.io.write_fasta().

[3]:
names = []
pb_sequences = []
pdb_name, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/2LFU.pdb', '2LFU.pdb')
for chain_name, chain in pbx.chains_from_files([pdb_name]):
    dihedrals = chain.get_phi_psi_angles()
    pb_seq = pbx.assign(dihedrals)
    names.append(chain_name)
    pb_sequences.append(pb_seq)

pprint(names)

pprint(pb_sequences)

with open('output.fasta', 'w') as outfile:
    pbx.io.write_fasta(outfile, pb_sequences, names)
['2LFU.pdb | model 1 | chain A',
 '2LFU.pdb | model 2 | chain A',
 '2LFU.pdb | model 3 | chain A',
 '2LFU.pdb | model 4 | chain A',
 '2LFU.pdb | model 5 | chain A',
 '2LFU.pdb | model 6 | chain A',
 '2LFU.pdb | model 7 | chain A',
 '2LFU.pdb | model 8 | chain A',
 '2LFU.pdb | model 9 | chain A',
 '2LFU.pdb | model 10 | chain A']
['ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcddddddddddfklopadddddfhpamlnopcddddddehjadddddehjacbddddddddfklmaccddddddfbgniaghiapaddddddfklnoambZZ',
 'ZZpcfblcffbccdddddeehjacdddddddddfklggcddddddddddddddfblghiadddddddddfklopadddddehpmmmnopcddddddeehiacdddfblopadcddddddfklpaccdddddfklmlmgcdehiaddddddfklmmgopZZ',
 'ZZmgghiafbbccdddddehjbdcdddddddddfklggcddddddddddddddfbfghpacddddddddfklopadddddehiaklmmmgcdddddeehiaddddfkbgciacdddddefklpaccddddddfkgojbdfehpaddddddfkbccfbgZZ',
 'ZZcghiacfkbacdddddfbhpacdddddddddfklmcfdddddddddddddehiacddddddddddddfknopadddddfkpamlnopaddddddehjaccdddfklnopacddddddfklmpccdddddddehiabghehiaddddddfklpccfkZZ',
 'ZZpaehiehkaccdddddehjbccdddddddddfklggcddddddddddddddfbhpadddddddddddfklopadddddehiamlmmpccdddddeehiadddddfbacddcddddddfklmaccddddddfbgghiafehiadddddddfklpacfZZ',
 'ZZmghbacfkbccdddddeehpacdddddddddfklggcdddddddddddddehiacadddddddddddfklopadddddehiaklnopcddddddeehiadddehjlnopacddddddfklmaccddddehiaehbgcdehiadddddddfehjlpcZZ',
 'ZZcchbacfkbccdddddfehpacdddddddddfklggcdddddddddddddddehjapadddddddddfknopadddddfklmmmnopcddddddehjiddddddfknopacddddddfklpaccdddddfklmaacdfehpadddddehjblckknZZ',
 'ZZcehjdeehiacdjdddedjbdcdddddddddfklggcdddddddddddddddbfblbacddddddddfklopacddddehiamlnopaddddddehjacddddfehpaaccdddddefklpaccdddddfklmbfbehehiaddddddffkgoiehZZ',
 'ZZpccdjdfkbccdddddehhpacdddddddddfklggcdddddddddddddehiacbdcdddddddddfklopadddddehiammnopcddddddeejiadddehjlgobacddddddfklmpccddddehiacbcbdfehpadddddehjklmklmZZ',
 'ZZccfklcfkbccdddddehjbdcdddddddddfklggcdddddddddddddehiapaccdddddddddfklopadddddehjamlnopaddddddehjddcdddfbfghpacddddddfklpaccddddddfbcfbacfehpadddddddekpghiaZZ']
Read 10 chain(s) in 2LFU.pdb
[4]:
!cat output.fasta
!rm output.fasta
>2LFU.pdb | model 1 | chain A
ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcd
dddddddddfklopadddddfhpamlnopcddddddehjadddddehjacbddddddddf
klmaccddddddfbgniaghiapaddddddfklnoambZZ
>2LFU.pdb | model 2 | chain A
ZZpcfblcffbccdddddeehjacdddddddddfklggcddddddddddddddfblghia
dddddddddfklopadddddehpmmmnopcddddddeehiacdddfblopadcddddddf
klpaccdddddfklmlmgcdehiaddddddfklmmgopZZ
>2LFU.pdb | model 3 | chain A
ZZmgghiafbbccdddddehjbdcdddddddddfklggcddddddddddddddfbfghpa
cddddddddfklopadddddehiaklmmmgcdddddeehiaddddfkbgciacdddddef
klpaccddddddfkgojbdfehpaddddddfkbccfbgZZ
>2LFU.pdb | model 4 | chain A
ZZcghiacfkbacdddddfbhpacdddddddddfklmcfdddddddddddddehiacddd
dddddddddfknopadddddfkpamlnopaddddddehjaccdddfklnopacddddddf
klmpccdddddddehiabghehiaddddddfklpccfkZZ
>2LFU.pdb | model 5 | chain A
ZZpaehiehkaccdddddehjbccdddddddddfklggcddddddddddddddfbhpadd
dddddddddfklopadddddehiamlmmpccdddddeehiadddddfbacddcddddddf
klmaccddddddfbgghiafehiadddddddfklpacfZZ
>2LFU.pdb | model 6 | chain A
ZZmghbacfkbccdddddeehpacdddddddddfklggcdddddddddddddehiacadd
dddddddddfklopadddddehiaklnopcddddddeehiadddehjlnopacddddddf
klmaccddddehiaehbgcdehiadddddddfehjlpcZZ
>2LFU.pdb | model 7 | chain A
ZZcchbacfkbccdddddfehpacdddddddddfklggcdddddddddddddddehjapa
dddddddddfknopadddddfklmmmnopcddddddehjiddddddfknopacddddddf
klpaccdddddfklmaacdfehpadddddehjblckknZZ
>2LFU.pdb | model 8 | chain A
ZZcehjdeehiacdjdddedjbdcdddddddddfklggcdddddddddddddddbfblba
cddddddddfklopacddddehiamlnopaddddddehjacddddfehpaaccdddddef
klpaccdddddfklmbfbehehiaddddddffkgoiehZZ
>2LFU.pdb | model 9 | chain A
ZZpccdjdfkbccdddddehhpacdddddddddfklggcdddddddddddddehiacbdc
dddddddddfklopadddddehiammnopcddddddeejiadddehjlgobacddddddf
klmpccddddehiacbcbdfehpadddddehjklmklmZZ
>2LFU.pdb | model 10 | chain A
ZZccfklcfkbccdddddehjbdcdddddddddfklggcdddddddddddddehiapacc
dddddddddfklopadddddehjamlnopaddddddehjddcdddfbfghpacddddddf
klpaccddddddfbcfbacfehpadddddddekpghiaZZ

Sequences can be written once at a time using the pbxplore.io.write_fasta_entry() function.

[5]:
pdb_name, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/2LFU.pdb', '2LFU.pdb')

with open('output.fasta', 'w') as outfile:
    for chain_name, chain in pbx.chains_from_files([pdb_name]):
        dihedrals = chain.get_phi_psi_angles()
        pb_seq = pbx.assign(dihedrals)
        pbx.io.write_fasta_entry(outfile, pb_seq, chain_name)
Read 10 chain(s) in 2LFU.pdb
[6]:
!cat output.fasta
!rm output.fasta
>2LFU.pdb | model 1 | chain A
ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcd
dddddddddfklopadddddfhpamlnopcddddddehjadddddehjacbddddddddf
klmaccddddddfbgniaghiapaddddddfklnoambZZ
>2LFU.pdb | model 2 | chain A
ZZpcfblcffbccdddddeehjacdddddddddfklggcddddddddddddddfblghia
dddddddddfklopadddddehpmmmnopcddddddeehiacdddfblopadcddddddf
klpaccdddddfklmlmgcdehiaddddddfklmmgopZZ
>2LFU.pdb | model 3 | chain A
ZZmgghiafbbccdddddehjbdcdddddddddfklggcddddddddddddddfbfghpa
cddddddddfklopadddddehiaklmmmgcdddddeehiaddddfkbgciacdddddef
klpaccddddddfkgojbdfehpaddddddfkbccfbgZZ
>2LFU.pdb | model 4 | chain A
ZZcghiacfkbacdddddfbhpacdddddddddfklmcfdddddddddddddehiacddd
dddddddddfknopadddddfkpamlnopaddddddehjaccdddfklnopacddddddf
klmpccdddddddehiabghehiaddddddfklpccfkZZ
>2LFU.pdb | model 5 | chain A
ZZpaehiehkaccdddddehjbccdddddddddfklggcddddddddddddddfbhpadd
dddddddddfklopadddddehiamlmmpccdddddeehiadddddfbacddcddddddf
klmaccddddddfbgghiafehiadddddddfklpacfZZ
>2LFU.pdb | model 6 | chain A
ZZmghbacfkbccdddddeehpacdddddddddfklggcdddddddddddddehiacadd
dddddddddfklopadddddehiaklnopcddddddeehiadddehjlnopacddddddf
klmaccddddehiaehbgcdehiadddddddfehjlpcZZ
>2LFU.pdb | model 7 | chain A
ZZcchbacfkbccdddddfehpacdddddddddfklggcdddddddddddddddehjapa
dddddddddfknopadddddfklmmmnopcddddddehjiddddddfknopacddddddf
klpaccdddddfklmaacdfehpadddddehjblckknZZ
>2LFU.pdb | model 8 | chain A
ZZcehjdeehiacdjdddedjbdcdddddddddfklggcdddddddddddddddbfblba
cddddddddfklopacddddehiamlnopaddddddehjacddddfehpaaccdddddef
klpaccdddddfklmbfbehehiaddddddffkgoiehZZ
>2LFU.pdb | model 9 | chain A
ZZpccdjdfkbccdddddehhpacdddddddddfklggcdddddddddddddehiacbdc
dddddddddfklopadddddehiammnopcddddddeejiadddehjlgobacddddddf
klmpccddddehiacbcbdfehpadddddehjklmklmZZ
>2LFU.pdb | model 10 | chain A
ZZccfklcfkbccdddddehjbdcdddddddddfklggcdddddddddddddehiapacc
dddddddddfklopadddddehjamlnopaddddddehjddcdddfbfghpacddddddf
klpaccddddddfbcfbacfehpadddddddekpghiaZZ

By default, the lines in fasta files are wrapped at 60 caracters as defined in pbxplore.io.fasta.FASTA_WIDTH. Both pbxplore.io.write_fasta() and pbxplore.io.write_fasta_entry() have a width optionnal argument that allows to control the wrapping.

[7]:
print(pb_sequences[0])
ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcddddddddddfklopadddddfhpamlnopcddddddehjadddddehjacbddddddddfklmaccddddddfbgniaghiapaddddddfklnoambZZ
[8]:
with open('output.fasta', 'w') as outfile:
    for width in (60, 70, 80):
        pbx.io.write_fasta_entry(outfile, pb_sequences[0],
                                        'width={} blocks'.format(width),
                                        width=width)
[9]:
!cat output.fasta
!rm output.fasta
>width=60 blocks
ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcd
dddddddddfklopadddddfhpamlnopcddddddehjadddddehjacbddddddddf
klmaccddddddfbgniaghiapaddddddfklnoambZZ
>width=70 blocks
ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcddddddddddf
klopadddddfhpamlnopcddddddehjadddddehjacbddddddddfklmaccddddddfbgniagh
iapaddddddfklnoambZZ
>width=80 blocks
ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcddddddddddfklopaddddd
fhpamlnopcddddddehjadddddehjacbddddddddfklmaccddddddfbgniaghiapaddddddfklnoambZZ
Dihedral angles

One needs the phi and psi dihedral angles to assign protein block sequences. Having these angles, it is sometime convenient to store them in a file. This can be done easily.

[10]:
pdb_name, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/2LFU.pdb', '2LFU.pdb')

with open('output.phipsi', 'w') as outfile:
    for chain_name, chain in pbx.chains_from_files([pdb_name]):
        dihedral = chain.get_phi_psi_angles()
        for res in sorted(dihedral):
            phi = "{:8.2f}".format(dihedral[res]["phi"]) if dihedral[res]["phi"] else "    None"
            psi = "{:8.2f}".format(dihedral[res]["psi"]) if dihedral[res]["psi"] else "    None"
            print("{} {:6d} {} {} ".format(chain_name, res, phi, psi), file=outfile)
Read 10 chain(s) in 2LFU.pdb

Note it’s better to write the dihedral for each PDB/frame due to the high memory cost to store all of them in a list.

The output is formated with one line per residue. The first columns repeat the name given for the chain, then is the residue id followed by the phi and the psi angle. If an angle is not defined, ‘None’ is written instead.

[11]:
!head output.phipsi
!tail output.phipsi
!rm output.phipsi
2LFU.pdb | model 1 | chain A    276     None    44.77
2LFU.pdb | model 1 | chain A    277   -81.67    15.27
2LFU.pdb | model 1 | chain A    278   -79.03   -21.52
2LFU.pdb | model 1 | chain A    279  -144.43    64.08
2LFU.pdb | model 1 | chain A    280   -84.23   144.90
2LFU.pdb | model 1 | chain A    281    70.83   -64.01
2LFU.pdb | model 1 | chain A    282  -107.77    93.35
2LFU.pdb | model 1 | chain A    283   -71.42   108.03
2LFU.pdb | model 1 | chain A    284   -72.99    99.31
2LFU.pdb | model 1 | chain A    285   -92.93     2.32
2LFU.pdb | model 10 | chain A    426   -88.16   127.99
2LFU.pdb | model 10 | chain A    427  -107.40  -161.76
2LFU.pdb | model 10 | chain A    428   -72.12   -14.94
2LFU.pdb | model 10 | chain A    429    66.62   -81.17
2LFU.pdb | model 10 | chain A    430  -141.67   111.76
2LFU.pdb | model 10 | chain A    431   -69.37   141.19
2LFU.pdb | model 10 | chain A    432    76.08   -75.91
2LFU.pdb | model 10 | chain A    433  -149.23   167.34
2LFU.pdb | model 10 | chain A    434   -63.43   -27.06
2LFU.pdb | model 10 | chain A    435  -166.70     None
Read fasta files

We want to read sequences that we wrote in files. PBxplore provides a function to read fasta files: the pbxplore.io.read_fasta() function.

[12]:
def pdb_to_fasta_pb(pdb_path, fasta_path):
    """
    Write a fasta file with all the PB sequences from a PDB
    """
    with open(fasta_path, 'w') as outfile:
        for chain_name, chain in pbx.chains_from_files([pdb_path]):
            dihedrals = chain.get_phi_psi_angles()
            pb_seq = pbx.assign(dihedrals)
            pbx.io.write_fasta_entry(outfile, pb_seq, chain_name)

# Write a fasta file
pdb_name, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/2LFU.pdb', '2LFU.pdb')
pdb_to_fasta_pb(pdb_name, 'output.fasta')

# Read a list of headers and a list of sequences from a fasta file
names, sequences = pbx.io.read_fasta('output.fasta')

print('names:')
pprint(names)
print('sequences:')
pprint(sequences)

!rm output.fasta
Read 10 chain(s) in 2LFU.pdb
read 10 sequences in output.fasta
names:
['2LFU.pdb | model 1 | chain A',
 '2LFU.pdb | model 2 | chain A',
 '2LFU.pdb | model 3 | chain A',
 '2LFU.pdb | model 4 | chain A',
 '2LFU.pdb | model 5 | chain A',
 '2LFU.pdb | model 6 | chain A',
 '2LFU.pdb | model 7 | chain A',
 '2LFU.pdb | model 8 | chain A',
 '2LFU.pdb | model 9 | chain A',
 '2LFU.pdb | model 10 | chain A']
sequences:
['ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcddddddddddfklopadddddfhpamlnopcddddddehjadddddehjacbddddddddfklmaccddddddfbgniaghiapaddddddfklnoambZZ',
 'ZZpcfblcffbccdddddeehjacdddddddddfklggcddddddddddddddfblghiadddddddddfklopadddddehpmmmnopcddddddeehiacdddfblopadcddddddfklpaccdddddfklmlmgcdehiaddddddfklmmgopZZ',
 'ZZmgghiafbbccdddddehjbdcdddddddddfklggcddddddddddddddfbfghpacddddddddfklopadddddehiaklmmmgcdddddeehiaddddfkbgciacdddddefklpaccddddddfkgojbdfehpaddddddfkbccfbgZZ',
 'ZZcghiacfkbacdddddfbhpacdddddddddfklmcfdddddddddddddehiacddddddddddddfknopadddddfkpamlnopaddddddehjaccdddfklnopacddddddfklmpccdddddddehiabghehiaddddddfklpccfkZZ',
 'ZZpaehiehkaccdddddehjbccdddddddddfklggcddddddddddddddfbhpadddddddddddfklopadddddehiamlmmpccdddddeehiadddddfbacddcddddddfklmaccddddddfbgghiafehiadddddddfklpacfZZ',
 'ZZmghbacfkbccdddddeehpacdddddddddfklggcdddddddddddddehiacadddddddddddfklopadddddehiaklnopcddddddeehiadddehjlnopacddddddfklmaccddddehiaehbgcdehiadddddddfehjlpcZZ',
 'ZZcchbacfkbccdddddfehpacdddddddddfklggcdddddddddddddddehjapadddddddddfknopadddddfklmmmnopcddddddehjiddddddfknopacddddddfklpaccdddddfklmaacdfehpadddddehjblckknZZ',
 'ZZcehjdeehiacdjdddedjbdcdddddddddfklggcdddddddddddddddbfblbacddddddddfklopacddddehiamlnopaddddddehjacddddfehpaaccdddddefklpaccdddddfklmbfbehehiaddddddffkgoiehZZ',
 'ZZpccdjdfkbccdddddehhpacdddddddddfklggcdddddddddddddehiacbdcdddddddddfklopadddddehiammnopcddddddeejiadddehjlgobacddddddfklmpccddddehiacbcbdfehpadddddehjklmklmZZ',
 'ZZccfklcfkbccdddddehjbdcdddddddddfklggcdddddddddddddehiapaccdddddddddfklopadddddehjamlnopaddddddehjddcdddfbfghpacddddddfklpaccddddddfbcfbacfehpadddddddekpghiaZZ']

If the sequences we want to read are spread amongst several fasta files, then we can use the pbxplore.io.read_several_fasta() function that takes a list of fasta file path as argument instead of a single path.

[13]:
# Write several fasta files
pdbname, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/1BTA.pdb', '1BTA.pdb')
pdb_to_fasta_pb(pdbname, '1BTA.fasta')
pdbname, _ = urllib.request.urlretrieve('https://files.rcsb.org/view/2LFU.pdb', '2LFU.pdb')
pdb_to_fasta_pb(pdb_name, '2FLU.fasta')

# Read the fasta files
names, sequences = pbx.io.read_several_fasta(['1BTA.fasta', '2FLU.fasta'])

# Print the first entries
print('names:')
pprint(names[:5])
print('sequences:')
pprint(sequences[:5])

!rm 1BTA.fasta 2FLU.fasta
Read 1 chain(s) in 1BTA.pdb
Read 10 chain(s) in 2LFU.pdb
read 1 sequences in 1BTA.fasta
read 10 sequences in 2FLU.fasta
names:
['1BTA.pdb | chain A',
 '2LFU.pdb | model 1 | chain A',
 '2LFU.pdb | model 2 | chain A',
 '2LFU.pdb | model 3 | chain A',
 '2LFU.pdb | model 4 | chain A']
sequences:
['ZZdddfklonbfklmmmmmmmmnopafklnoiaklmmmmmnoopacddddddehkllmmmmngoilmmmmmmmmmmmmnopacdcddZZ',
 'ZZbghiacfkbccdddddehiadddddddddddfklggcdddddddddddddehifbdcddddddddddfklopadddddfhpamlnopcddddddehjadddddehjacbddddddddfklmaccddddddfbgniaghiapaddddddfklnoambZZ',
 'ZZpcfblcffbccdddddeehjacdddddddddfklggcddddddddddddddfblghiadddddddddfklopadddddehpmmmnopcddddddeehiacdddfblopadcddddddfklpaccdddddfklmlmgcdehiaddddddfklmmgopZZ',
 'ZZmgghiafbbccdddddehjbdcdddddddddfklggcddddddddddddddfbfghpacddddddddfklopadddddehiaklmmmgcdddddeehiaddddfkbgciacdddddefklpaccddddddfkgojbdfehpaddddddfkbccfbgZZ',
 'ZZcghiacfkbacdddddfbhpacdddddddddfklmcfdddddddddddddehiacddddddddddddfknopadddddfkpamlnopaddddddehjaccdddfklnopacddddddfklmpccdddddddehiabghehiaddddddfklpccfkZZ']

Visualize protein deformability

Protein Blocks are great tools to study protein deformability. Indeed, if the block assigned to a residue changes between two frames of a trajectory, it represents a local deformation of the protein rather than the displacement of the residue.

The API allows to visualize Protein Block variability throughout a molecular dynamics simulation trajectory.

[1]:
from pprint import pprint
from IPython.display import Image, display
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
import urllib.request
import os
import numpy as np

# print date & versions
import datetime
print("Date & time:",datetime.datetime.now())
import sys
print("Python version:", sys.version)
print("Matplotlib version:", matplotlib.__version__)
Date & time: 2021-01-27 13:55:22.924405
Python version: 3.7.9 (default, Oct 19 2020, 15:13:17)
[GCC 7.5.0]
Matplotlib version: 3.3.2
[2]:
import pbxplore as pbx
print("PBxplore version:", pbx.__version__)
PBxplore version: 1.4.0

Here we will look at a molecular dynamics simulation of the barstar. As we will analyse Protein Block sequences, we first need to assign these sequences for each frame of the trajectory.

[3]:
# Assign PB sequences for all frames of a trajectory
topology, _ = urllib.request.urlretrieve('https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj.gro',
                                         'psi_md_traj.gro')
trajectory, _ = urllib.request.urlretrieve('https://raw.githubusercontent.com/pierrepo/PBxplore/master/demo_doc/psi_md_traj.xtc',
                                           'psi_md_traj.xtc')
sequences = []
for chain_name, chain in pbx.chains_from_trajectory(trajectory, topology):
    dihedrals = chain.get_phi_psi_angles()
    pb_seq = pbx.assign(dihedrals)
    sequences.append(pb_seq)
/home/docs/checkouts/readthedocs.org/user_builds/pbxplore/envs/latest/lib/python3.7/site-packages/MDAnalysis-1.0.0-py3.7-linux-x86_64.egg/MDAnalysis/coordinates/XDR.py:216: UserWarning: Reload offsets from trajectory
 ctime or size or n_atoms did not match
  warnings.warn("Reload offsets from trajectory\n "
Frame 1/225.
Frame 100/225.
Frame 200/225.
Frame 225/225.
Block occurences per position

The basic information we need to analyse protein deformability is the count of occurences of each PB for each position throughout the trajectory. This occurence matrix can be calculated with the pbxplore.analysis.count_matrix() function.

[4]:
count_matrix = pbx.analysis.count_matrix(sequences)

count_matrix is a numpy array with one row per PB and one column per position. In each cell is the number of time a position was assigned to a PB.

We can visualize count_matrix using Matplotlib as any 2D numpy array.

[5]:
im = plt.imshow(count_matrix, interpolation='none', aspect='auto')
plt.colorbar(im)
plt.xlabel('Position')
plt.ylabel('Block')
[5]:
Text(0, 0.5, 'Block')
_images/notebooks_Deformability_10_1.png

PBxplore provides the pbxplore.analysis.plot_map() function to ease the visualization of the occurence matrix.

[6]:
pbx.analysis.plot_map('map.png', count_matrix)
!rm map.png
_images/notebooks_Deformability_12_0.png

The pbxplore.analysis.plot_map() helper has a residue_min and a residue_max optional arguments to display only part of the matrix. These two arguments can be pass to all PBxplore functions that produce a figure.

[7]:
pbx.analysis.plot_map('map.png', count_matrix,
                      residue_min=20, residue_max=30)
!rm map.png
_images/notebooks_Deformability_14_0.png

Note that matrix in the the figure produced by pbxplore.analysis.plot_map() is normalized so as the sum of each column is 1. The matrix can be normalized with the pbxplore.analysis.compute_freq_matrix().

[8]:
freq_matrix = pbx.analysis.compute_freq_matrix(count_matrix)
[9]:
im = plt.imshow(freq_matrix, interpolation='none', aspect='auto')
plt.colorbar(im)
plt.xlabel('Position')
plt.ylabel('Block')
[9]:
Text(0, 0.5, 'Block')
_images/notebooks_Deformability_17_1.png
Protein Block entropy

The \(N_{eq}\) is a measure of variability based on the count matrix calculated above. It can be computed with the pbxplore.analysis.compute_neq() function.

[10]:
neq_by_position = pbx.analysis.compute_neq(count_matrix)

neq_by_position is a 1D numpy array with the \(N_{eq}\) for each residue.

[11]:
#Residus start by default at 1.
resids = np.arange(1,len(neq_by_position)+1)

plt.plot(resids, neq_by_position)
plt.xlabel('Position')
plt.ylabel('$N_{eq}$')
[11]:
Text(0, 0.5, '$N_{eq}$')
_images/notebooks_Deformability_21_1.png

The pbxplore.analysis.plot_neq() helper ease the plotting of the \(N_{eq}\).

[12]:
pbx.analysis.plot_neq('neq.png', neq_by_position)
!rm neq.png
_images/notebooks_Deformability_23_0.png

The residue_min and residue_max arguments are available.

[13]:
pbx.analysis.plot_neq('neq.png', neq_by_position,
                      residue_min=20, residue_max=30)
!rm neq.png
_images/notebooks_Deformability_25_0.png
Neq with RMSF

The \(N_{eq}\) and RMSF (Root Mean Square Fluctuation) can be plot together to highlight differences between flexible and rigid residues : the \(N_{eq}\) is a metric of deformability and flexibility whereas RMSF quantifies mobility.

Here an example of a plot with both metrics (You can adapt this code to your own need):

[14]:
# Let's assume you computed the RMSF (file rmsf.xvg)
# For this example, the rmsf was computed on the C-alpha and grouped by residue :
# g_rmsf -s psi_md_traj.gro -f psi_md_traj.xtc -res -o rmsf.xvg (Gromacs 4.6.7)


# Read rmsf file (ignore lines which start by '#@' and assume the data are in 2-column,
# first one the number of residue, 2nd one the rmsf
rmsf = np.array([line.split() for line in open("../../../demo_doc/rmsf.xvg") if not line[0] in '#@'], dtype=float)

#Generate 2 y-axes who share a same x-axis
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()

#Left Axis
ax1.plot(rmsf[:,0], neq_by_position, color='#1f77b4', lw=2)
ax1.set_xlabel('Position')
ax1.set_ylabel('$N_{eq}$', color='#1f77b4')
ax1.set_ylim([0.0, 5])

#Right Axis
ax2.plot(rmsf[:,0], rmsf[:,1],color='#ff7f0e', lw=2)
ax2.set_ylim([0.0, 0.4])
ax2.set_ylabel('RMSF (nm)', color='#ff7f0e')

[14]:
Text(0, 0.5, 'RMSF (nm)')
_images/notebooks_Deformability_27_1.png

We observe that the region 33-35 is rigid. The high values of RMSF we observed were due to flexible residues in the vicinity of the region 33-35, probably acting as hinges (residues 32 and 36–37). Those hinges, due to their flexibility, induced the mobility of the whole loop : the region 33-35 fluctuated but did not deform.

API Reference

This page provides the documentation generated from the source files.

Protein structure manipulation — pbxplore.structure

Constants

  • PDB_EXTENSIONS
    list of file extensions corresponding to PDB files
  • PDBx_EXTENSIONS
    list of file extensions corresponding to PDBx/mmCIF files

Functions

pbxplore.structure.chains_from_files(path_list)[source]
pbxplore.structure.chains_from_trajectory(trajectory, topology)[source]

Objects

class pbxplore.structure.structure.Chain[source]

Class to handle PDB chain

get_phi_psi_angles()[source]

Compute phi and psi angles.

Returns:phi_psi_angles – Dict with residue number (int) as keys and a {'phi' : (float), 'psi' : (float)} dictionnary as values.
Return type:dict
Raises:FloatingPointError – If the computation of angles produces NaN. Generally, it means there is some problem with the residue coordinates.

Examples

>>> lines = ("ATOM    840  C   ARG B  11      22.955  23.561  -4.012  1.00 28.07           C  ",
...          "ATOM    849  N   SER B  12      22.623  24.218  -2.883  1.00 24.77           N  ",
...          "ATOM    850  CA  SER B  12      22.385  23.396  -1.637  1.00 21.99           C  ",
...          "ATOM    851  C   SER B  12      21.150  24.066  -0.947  1.00 32.67           C  ",
...          "ATOM    855  N   ILE B  13      20.421  23.341  -0.088  1.00 30.25           N  ")
>>>
>>> import pbxplore as pbx
>>> ch = pbx.structure.structure.Chain()
>>> for line in lines:
...     at = pbx.structure.structure.Atom()
...     at.read_from_PDB(line)
...     ch.add_atom(at)
...
>>> print(ch.get_phi_psi_angles())
{11: {'phi': None, 'psi': None}, 12: {'phi': -139.77684605036447, 'psi': 157.94348570201197}, 13: {'phi': None, 'psi': None}}
class pbxplore.structure.structure.Atom(ident=0, name=None, resname=None, chain=None, resid=0, x=0.0, y=0.0, z=0.0, model=None)[source]

Class for atoms in PDB or PDBx/mmCIF format.

class pbxplore.structure.PDB.PDB(name)[source]

Class to read PDB files.

Exceptions

exception pbxplore.structure.structure.ChainError[source]

Exeption class for the Chain class

exception pbxplore.structure.structure.AtomError[source]

Exeption class for the Atom class.

Protrein Block assignation — pbxplore.assignment

pbxplore.assignment.assign(dihedrals, pb_ref={'a': [41.14, 75.53, 13.92, -99.8, 131.88, -96.27, 122.08, -99.68], 'b': [108.24, -90.12, 119.54, -92.21, -18.06, -128.93, 147.04, -99.9], 'c': [-11.61, -105.66, 94.81, -106.09, 133.56, -106.93, 135.97, -100.63], 'd': [141.98, -112.79, 132.2, -114.79, 140.11, -111.05, 139.54, -103.16], 'e': [133.25, -112.37, 137.64, -108.13, 133.0, -87.3, 120.54, 77.4], 'f': [116.4, -105.53, 129.32, -96.68, 140.72, -74.19, -26.65, -94.51], 'g': [0.4, -81.83, 4.91, -100.59, 85.5, -71.65, 130.78, 84.98], 'h': [119.14, -102.58, 130.83, -67.91, 121.55, 76.25, -2.95, -90.88], 'i': [130.68, -56.92, 119.26, 77.85, 10.42, -99.43, 141.4, -98.01], 'j': [114.32, -121.47, 118.14, 82.88, -150.05, -83.81, 23.35, -85.82], 'k': [117.16, -95.41, 140.4, -59.35, -29.23, -72.39, -25.08, -76.16], 'l': [139.2, -55.96, -32.7, -68.51, -26.09, -74.44, -22.6, -71.74], 'm': [-39.62, -64.73, -39.52, -65.54, -38.88, -66.89, -37.76, -70.19], 'n': [-35.34, -65.03, -38.12, -66.34, -29.51, -89.1, -2.91, 77.9], 'o': [-45.29, -67.44, -27.72, -87.27, 5.13, 77.49, 30.71, -93.23], 'p': [-27.09, -86.14, 0.3, 59.85, 21.51, -96.3, 132.67, -92.91]})[source]

Assign Protein Blocks.

Dihedral angles are provided as a dictionnary with one item per residue. The key is the residue number, and the value is a dictionnary with phi and psi as keys, and the dihedral angles as values.

The protein block definitions are provided as a dictionnary. Each key is a block name, the values are lists of dihedral angles.

Parameters:
  • dihedrals (dict) – Phi and psi dihedral angles for each residue.
  • pb_ref (dict) – The definition of the protein blocks.

Input/Output in files — pbxplore.io

Deals with writing and reading of files in various formats.

Fasta

pbxplore.io.read_fasta(name)[source]

Read fasta file and output sequences in a list.

Parameters:name (str) – Name of file containing sequences in fasta format.
Returns:
  • header_lst (list) – List of headers (str)
  • sequence_lst (list) – List of sequences (str)
pbxplore.io.read_several_fasta(input_files)[source]

Read several fasta files

Note that each fasta file may contain several sequences.

Parameters:input_files (a list of fasta file paths.) –
Returns:
  • pb_name (a list of the headers)
  • pb_seq (a list of the sequences)
pbxplore.io.write_fasta(outfile, sequences, comments)[source]

Write fasta entries (header + sequence) in an open file

Parameters:
  • name (file descriptor) – The file descriptor to write in. It must allow writing.
  • header_lst (list) – List of headers (str)
  • sequence_lst (list) – List of sequences (str)
pbxplore.io.write_fasta_entry(outfile, sequence, comment, width=60)[source]

Write a fasta entry (header + sequence) in an open file

Parameters:
  • name (file descriptor) – The file descriptor to write in. It must allow writing.
  • sequence (str) – Sequence to format.
  • comment (str) – Comment to make header of sequence.
  • width (int) – The width of a line. FASTA_WIDTH by default.

Results af analyses

pbxplore.io.write_count_matrix(pb_count, outfile, first=1)[source]

Write a PB occurence matrix in a file.

Parameters:
  • pb_count (an occurence matrix as a 2D numpy array.) –
  • outfile (an open file where to write the matrix.) –
  • first (the residue number of the first position.) –
pbxplore.io.write_neq(outfile, neq_array, idx_first_residue=1, residue_min=1, residue_max=None)[source]

Write the Neq matrix in an open file

Parameters:
  • outfile (file descriptor) – The file descriptor to write in. It must allow writing.
  • neq_array (numpy array) – a 1D array containing the neq values.
  • idx_first_residue (int) – the index of the first residue in the array
  • residue_min (int) – the lower bound of residue frame
  • residue_max (int) – the upper bound of residue frame

Analyses – pbxplore.analysis

Build occurence matrix

pbxplore.analysis.count_matrix(pb_seq)[source]

Count the occurences of each block at each position.

The occurence matrix has one row per sequence, and one column per block. The columns are ordered in as pbxplore.PB.NAMES.

Parameters:pb_seq – a list of PB sequences.
Returns:pb_count – The occurence matrix.
Return type:numpy array
Raises:pbxplore.PB.InvalidBlockError – encountered an unexpected PB
pbxplore.analysis.read_occurence_file(name)[source]

Read an occurence matrix from a file. It will return the matrix as a numpy array and the indexes of residues.

Parameters:name (str) – Name of the file.
Returns:
  • count_mat (numpy array) – the occurence matrix without the residue number
  • residues (list) – the list of residues indexes
Raises:ValueError – when something is wrong about the file
pbxplore.analysis.plot_map(fname, count_mat, idx_first_residue=1, residue_min=1, residue_max=None)[source]

Generate a map of the distribution of PBs along protein sequence from an occurence matrix.

Parameters:
  • fname (str) – The path to the file to write in
  • count_mat (numpy array) – an occurence matrix returned by count_matrix.
  • idx_first_residue (int) – the index of the first residue in the matrix
  • residue_min (int) – the lower bound of the protein sequence
  • residue_max (int) – the upper bound of the protein sequence

Compare protein block sequences

pbxplore.analysis.compare(header_lst, seq_lst, substitution_mat, fname)[source]

Command line wrapper for the comparison of all sequences with the first one

When the –compare option is given to the command line, the program compares all the sequences to the first one and writes these comparison as sequences of digits. These digits represent the distance between the PB in the target and the one in the reference at the same position. The digits are normalized in the [0; 9] range.

This function run the comparison, write the result in a fasta file, and display on screen informations about the process.

Parameters:
  • header_lst (list of strings) – The list of sequence headers ordered as the sequences
  • seq_lst (list of strings) – The list of sequences ordered as the headers
  • substitution_mat (numpy.array) – A substitution matrix expressed as similarity scores
  • fname (str) – The output file name

Visualize deformability

pbxplore.analysis.compute_neq(count_mat)[source]

Compute the Neq for each residue from an occurence matrix.

Parameters:count_mat (numpy array) – an occurence matrix returned by count_matrix.
Returns:neq_array – a 1D array containing the neq values
Return type:numpy array
pbxplore.analysis.plot_neq(fname, neq_array, idx_first_residue=1, residue_min=1, residue_max=None)[source]

Generate the Neq plot along the protein sequence

Parameters:
  • fname (str) – The path to the file to write in
  • neq_array (numpy array) – an array containing the neq value associated to the residue number
  • idx_first_residue (int) – the index of the first residue in the array
  • residue_min (int) – the lower bound of the protein sequence
  • residue_max (int) – the upper bound of the protein sequence

Generates logo representation of PBs frequency along protein sequence through the weblogo library.

The weblogo reference: G. E. Crooks, G. Hon, J.-M. Chandonia, and S. E. Brenner. ‘WebLogo: A Sequence Logo Generator.’ Genome Research 14:1188–90 (2004) doi:10.1101/gr.849004. http://weblogo.threeplusone.com/

Parameters:
  • fname (str) – The path to the file to write in
  • count_mat (numpy array) – an occurence matrix returned by count_matrix.
  • idx_first_residue (int) – the index of the first residue in the matrix
  • residue_min (int) – the lower bound of residue frame
  • residue_max (int) – the upper bound of residue frame
  • title (str) – the title of the weblogo. Default is empty.

Utils

pbxplore.analysis.substitution_score(substitution_matrix, seqA, seqB)[source]

Compute the substitution score to go from seqA to seqB

Both sequences must have the same length.

The score is either expressed as a similarity or a distance depending on the substitution matrix.

pbxplore.analysis.compute_freq_matrix(count_mat)[source]

Compute a PB frequency matrix from an occurence matrix.

The frequency matrix has one row per sequence, and one column per block. The columns are ordered in as pbxplore.PB.NAMES.

Parameters:count_mat (numpy array) – an occurence matrix returned by count_matrix.
Returns:freq – The frequency matrix
Return type:numpy array
pbxplore.analysis.compute_score_by_position(score_mat, seq1, seq2)[source]

Computes substitution score between two sequences position per position

The substitution score can represent a similarity or a distance depending on the score matrix provided. The score matrix should be provided as a 2D numpy array with score[i, j] the score to swich the PB at the i-th position in pbxplore.PB.NAMES to the PB at the j-th position in pbxplore.PB.NAMES.

The function returns the result as a list of substitution scores to go from seq1 to seq2 for each position. Both sequences must have the same length.

Note

The score to move from or to a Z block (dummy block) is always 0.

Raises:pbxplore.PB.InvalidBlockError – encountered an unexpected PB

Functions directly accessible at the module level

pbxplore.chains_from_files(path_list)

See pbxplore.structure.chains_from_files()

pbxplore.chains_from_trajectory(trajectory, topology)

See pbxplore.structure.chains_from_trajectory()

pbxplore.assign(dihedrals)

See pbxplore.assignment.assign()

Citation

If you use PBxplore, please cite this tool as:

Barnoud J, Santuz H, Craveur P, Joseph AP, Jallu V, de Brevern AG, Poulain P,
PBxplore: a tool to analyze local protein structure and deformability with Protein Blocks

Contact & Support

PBxplore is a research software and has been developped by:

  • Pierre Poulain, “Mitochondria, Metals and Oxidative Stress” group, Institut Jacques Monod, UMR 7592, Univ. Paris Diderot, CNRS, Sorbonne Paris Cité, France.
  • Jonathan Barnoud, University of Groningen, Groningen, The Netherlands.
  • Hubert Santuz, Laboratoire de Biochimie Théorique, CNRS UPR 9080, Institut de Biologie Physico-Chimique, Paris, France.
  • Alexandre G. de Brevern, DSIMB, INSERM, U 1134, INTS, Paris, France.

If you want to report a bug or request a feature, please use the GitHub issue system.

Licence

PBxplore is licensed under The MIT License.

[1]
    1. de Brevern, C. Etchebest, and S. Hazout. Bayesian Probabilistic Approach for Predicting Backbone Structures in Terms of Protein Blocks. Proteins 41:271-87 (2000).