Bio2BEL ExPASy

This library helps download and parse the enzyme classes from the ExPASy ENZYME database.

Installation

Easiest

Download the latest stable code from PyPI with:

$ python3 -m pip install bio2bel_expasy

Get the Latest

Download the most recent code from GitHub with:

$ python3 -m pip install git+https://github.com/bio2bel/expasy.git

For Developers

Clone the repository from GitHub and install in editable mode with:

$ git clone https://github.com/bio2bel/expasy.git
$ cd expasy
$ python3 -m pip install -e .

Testing

Bio2BEL ExPASy is tested with Python 3 on Linux using Travis CI.

Manager

Manager for Bio2BEL ExPASy.

class bio2bel_expasy.manager.Manager(*args, **kwargs)

Create a connection to the database and a persistent session using SQLAlchemy.

namespace_model

alias of bio2bel_expasy.models.Enzyme

id_enzyme = None

Maps canonicalized ExPASy enzyme identifiers to their SQLAlchemy models

is_populated() → bool

Check if the database is already populated.

count_enzymes() → int

Count the number of enzyme entries in the database.

count_enzyme_prosites() → int

Count the number of enzyme-ProSite annotations.

count_prosites() → int

Count the number of ProSite entries in the database.

count_enzyme_proteins() → int

Count the number of enzyme-protein annotations.

count_proteins() → int

Count the number of protein entries in the database.

summarize() → Mapping[str, int]

Return a summary dictionary over the content of the database.

get_or_create_enzyme(expasy_id: str, description: Optional[str] = None) → bio2bel_expasy.models.Enzyme

Get an enzyme from the database or create it.

get_or_create_prosite(prosite_id: str, **kwargs) → bio2bel_expasy.models.Prosite

Get a ProSite entry from the database or create it.

get_or_create_protein(accession_number: str, entry_name: str, **kwargs) → bio2bel_expasy.models.Protein

Get a protein by its UniProt accession or create it.

Parameters:
  • accession_number
  • entry_name
  • kwargs
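
The three get-or-create methods above share one pattern: look an object up by its key, and build and store it only on a miss. The following is a minimal sketch of that pattern with an in-memory dictionary standing in for the database session. `ToyEnzyme` and `ToyRegistry` are hypothetical names used for illustration only; the real Manager works with SQLAlchemy models and may differ in detail.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToyEnzyme:
    """Hypothetical stand-in for an enzyme model (not the library's class)."""

    expasy_id: str
    description: Optional[str] = None


class ToyRegistry:
    """Hypothetical in-memory registry illustrating get-or-create."""

    def __init__(self) -> None:
        # Maps identifiers to objects that were already created
        self.id_enzyme: Dict[str, ToyEnzyme] = {}

    def get_or_create_enzyme(
        self, expasy_id: str, description: Optional[str] = None
    ) -> ToyEnzyme:
        enzyme = self.id_enzyme.get(expasy_id)
        if enzyme is None:
            # Miss: build the object once and remember it
            enzyme = ToyEnzyme(expasy_id=expasy_id, description=description)
            self.id_enzyme[expasy_id] = enzyme
        return enzyme


registry = ToyRegistry()
first = registry.get_or_create_enzyme("1.3.3.19", "example description")
second = registry.get_or_create_enzyme("1.3.3.19")
print(first is second)  # True: the second call reuses the stored object
```

Calling the method twice with the same identifier returns the same object, so callers do not need to check for existence first.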

populate(tree_path: Optional[str] = None, database_path: Optional[str] = None) → None

Populate the database.

Parameters:
  • tree_path
  • database_path

populate_tree(path: Optional[str] = None, force_download: bool = False) → None

Download and populate the ExPASy tree.

Parameters:
  • path – A custom URL to download from
  • force_download – If true, overwrites a previously cached file
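
The `force_download` flag describes a common cache-or-fetch rule: download only when no cached copy exists, unless the caller asks for a fresh one. Below is a minimal sketch of that rule. `ensure_cached` and its `fetch` parameter are hypothetical and are not part of bio2bel_expasy; the fetcher is injectable so the logic can be exercised without network access.

```python
import os
import tempfile
from typing import Callable, List, Optional


def ensure_cached(
    url: str,
    path: str,
    force_download: bool = False,
    fetch: Optional[Callable[[str, str], object]] = None,
) -> str:
    """Return ``path``, fetching ``url`` into it only when needed."""
    if fetch is None:
        # Default fetcher from the standard library
        from urllib.request import urlretrieve
        fetch = urlretrieve
    if force_download or not os.path.exists(path):
        fetch(url, path)
    return path


calls: List[str] = []


def fake_fetch(url: str, path: str) -> None:
    """Record the call and write a placeholder file instead of downloading."""
    calls.append(url)
    with open(path, "w") as file:
        file.write("placeholder")


with tempfile.TemporaryDirectory() as directory:
    target = os.path.join(directory, "enzclass.txt")
    source = "ftp://example.invalid/enzclass.txt"  # placeholder URL
    ensure_cached(source, target, fetch=fake_fetch)  # no cache yet: fetches
    ensure_cached(source, target, fetch=fake_fetch)  # cache hit: no fetch
    ensure_cached(source, target, force_download=True, fetch=fake_fetch)  # fetches again

print(len(calls))  # 2
```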

populate_database(path: Optional[str] = None, force_download: bool = False) → None

Populate the ExPASy database.

Parameters:
  • path – A custom URL to download from
  • force_download – If true, overwrites a previously cached file
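
To give a feel for what populating from the ENZYME flat file involves, here is a small sketch that parses one cross-reference line into protein pairs. It assumes the layout in which `DR` lines hold `accession, entry_name;` pairs; that layout is an assumption here, not something this page documents, and `parse_dr_line` is a hypothetical helper rather than a library function. The sample line is illustrative.

```python
from typing import List, Tuple


def parse_dr_line(line: str) -> List[Tuple[str, str]]:
    """Parse one ``DR`` line into (accession_number, entry_name) pairs."""
    if not line.startswith("DR"):
        raise ValueError("not a DR line")
    pairs = []
    for chunk in line[2:].split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece after the final semicolon
        accession_number, entry_name = (part.strip() for part in chunk.split(","))
        pairs.append((accession_number, entry_name))
    return pairs


sample = "DR   P07327, ADH1A_HUMAN;  P28469, ADH1A_MACMU;"
print(parse_dr_line(sample))
# [('P07327', 'ADH1A_HUMAN'), ('P28469', 'ADH1A_MACMU')]
```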

get_enzyme_by_id(expasy_id: str) → Optional[bio2bel_expasy.models.Enzyme]

Get an enzyme by its ExPASy identifier.

Implementation note: canonicalizes identifier to remove all spaces first.

Parameters:
  • expasy_id – An ExPASy identifier. Example: 1.3.3.- or 1.3.3.19
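
The canonicalization mentioned in the implementation note can be pictured as follows. This is a sketch only: `normalize_expasy_id` is a hypothetical helper, and the spaced input is an illustrative example of an identifier written with padding.

```python
def normalize_expasy_id(expasy_id: str) -> str:
    """Remove all spaces from an ExPASy identifier (hypothetical helper)."""
    return expasy_id.replace(" ", "")


print(normalize_expasy_id("1. 3. 3.-"))  # 1.3.3.-
print(normalize_expasy_id("1.3.3.19"))  # 1.3.3.19 (already canonical)
```

Normalizing before the lookup means that padded and compact spellings of the same identifier resolve to the same entry.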

get_parent_by_expasy_id(expasy_id: str) → Optional[bio2bel_expasy.models.Enzyme]

Return the parent of the enzyme with the given ExPASy identifier if it exists; otherwise return None.

Parameters:
  • expasy_id – An ExPASy identifier
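
The parent relationship follows from the shape of the identifiers themselves. The sketch below derives a parent identifier by replacing the most specific component with a dash. It assumes four-part identifiers such as 1.3.3.19; `get_parent_id` is a hypothetical helper, whereas the Manager method returns a model from the database.

```python
from typing import Optional


def get_parent_id(expasy_id: str) -> Optional[str]:
    """Return the parent identifier, or None for a top-level class."""
    parts = expasy_id.split(".")
    # Positions of the components that are not dash placeholders
    specific = [i for i, part in enumerate(parts) if part != "-"]
    if len(specific) <= 1:
        return None  # a top-level class such as 1.-.-.- has no parent
    parts[specific[-1]] = "-"
    return ".".join(parts)


print(get_parent_id("1.3.3.19"))  # 1.3.3.-
print(get_parent_id("1.3.3.-"))  # 1.3.-.-
print(get_parent_id("1.-.-.-"))  # None
```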

get_children_by_expasy_id(expasy_id: str) → Optional[List[bio2bel_expasy.models.Enzyme]]

Return a list of enzymes which are children of the enzyme with the given ExPASy enzyme identifier.

Parameters:
  • expasy_id – An ExPASy enzyme identifier

get_protein_by_uniprot_id(uniprot_id: str) → Optional[bio2bel_expasy.models.Protein]

Get a protein having the given UniProt identifier.

Parameters:
  • uniprot_id – A UniProt identifier

Example:

>>> from bio2bel_expasy import Manager
>>> manager = Manager()
>>> protein = manager.get_protein_by_uniprot_id('Q6AZW2')
>>> protein.accession_number
'Q6AZW2'

get_prosite_by_id(prosite_id: str) → Optional[bio2bel_expasy.models.Prosite]

Get a ProSite having the given ProSite identifier.

Parameters:
  • prosite_id – A ProSite identifier

get_prosites_by_expasy_id(expasy_id: str) → Optional[List[bio2bel_expasy.models.Prosite]]

Get a list of ProSites associated with the enzyme corresponding to the given identifier.

Parameters:
  • expasy_id – An ExPASy identifier

get_enzymes_by_prosite_id(prosite_id: str) → Optional[List[bio2bel_expasy.models.Enzyme]]

Return a list of enzymes associated with the given ProSite ID.

Parameters:
  • prosite_id – A ProSite identifier

get_proteins_by_expasy_id(expasy_id: str) → Optional[List[bio2bel_expasy.models.Protein]]

Return a list of UniProt entries annotated to the enzyme with the given ExPASy identifier.

Parameters:
  • expasy_id – An ExPASy identifier

get_enzymes_by_uniprot_id(uniprot_id: str) → Optional[List[bio2bel_expasy.models.Enzyme]]

Return a list of enzymes annotated to the protein with the given UniProt accession number.

Parameters:
  • uniprot_id – A UniProt identifier

Example:

>>> from bio2bel_expasy import Manager
>>> manager = Manager()
>>> manager.get_enzymes_by_uniprot_id('Q6AZW2')
>>> ...

enrich_proteins_with_enzyme_families(graph: pybel.struct.graph.BELGraph) → None

Enrich proteins in the BEL graph with IS_A relations to their enzyme classes.

  1. Gets a list of UniProt proteins
  2. Annotates pybel.constants.IS_A relations for all enzyme classes it finds
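
The two steps above can be sketched on a toy graph, with plain tuples standing in for BEL edges. Everything named here is hypothetical: the lookup table replaces the populated database, `PROT_A` and `PROT_B` are placeholder accessions, and the local `IS_A` string only stands in for the PyBEL constant.

```python
from typing import Dict, List, Set, Tuple

IS_A = "isA"  # local stand-in for the relation label

# Hypothetical lookup from protein accession to enzyme class identifiers
PROTEIN_TO_ENZYMES: Dict[str, List[str]] = {
    "PROT_A": ["1.1.1.1"],
    "PROT_B": ["1.1.1.1", "1.1.1.2"],
}

Edge = Tuple[str, str, str]


def enrich_with_enzyme_families(
    proteins: List[str], lookup: Dict[str, List[str]]
) -> Set[Edge]:
    """Return (protein, IS_A, enzyme) edges for every known annotation."""
    edges: Set[Edge] = set()
    for accession in proteins:  # step 1: walk the proteins in the graph
        for expasy_id in lookup.get(accession, []):  # step 2: one edge per class
            edges.add((accession, IS_A, expasy_id))
    return edges


edges = enrich_with_enzyme_families(["PROT_A", "PROT_B", "PROT_C"], PROTEIN_TO_ENZYMES)
print(sorted(edges))
```

Proteins without any annotation, such as `PROT_C` here, simply contribute no edges.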

look_up_enzyme(node: pybel.dsl.node_classes.BaseEntity) → Optional[bio2bel_expasy.models.Enzyme]

Try to get an enzyme model from the given node.

enrich_enzyme_with_proteins(graph: pybel.struct.graph.BELGraph, node: pybel.dsl.node_classes.BaseEntity) → None

Enrich an enzyme with all of its member proteins.

enrich_enzyme_parents(graph: pybel.struct.graph.BELGraph, node: pybel.dsl.node_classes.BaseEntity) → None

Enrich an enzyme with its parents.

enrich_enzyme_children(graph: pybel.struct.graph.BELGraph, node: pybel.dsl.node_classes.BaseEntity) → None

Enrich an enzyme with all of its children.

enrich_enzymes(graph: pybel.struct.graph.BELGraph) → None

Add all children of entries.

enrich_enzymes_with_prosites(graph: pybel.struct.graph.BELGraph) → None

Enrich enzyme classes in the graph with ProSites.

Models

SQLAlchemy models for Bio2BEL ExPASy.

class bio2bel_expasy.models.Enzyme(**kwargs)

ExPASy’s main entry.

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

expasy_id

The ExPASy enzyme code.

description

The ExPASy enzyme description. May need context of parents.

level

Return the level (1, 2, 3, or 4) of this enzyme, based on the number of dashes in its identifier.
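
One plausible reading of this rule is that each dash marks an unspecified component, so the level is four minus the number of dashes. The sketch below follows that reading; `get_level` is a hypothetical helper and assumes four-part identifiers in which dashes appear only as placeholders.

```python
def get_level(expasy_id: str) -> int:
    """Return the depth of an identifier in the enzyme class hierarchy."""
    return 4 - expasy_id.count("-")


print(get_level("1.-.-.-"))  # 1
print(get_level("1.3.3.-"))  # 3
print(get_level("1.3.3.19"))  # 4
```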

to_json() → Mapping

Return the data from this model as a dictionary.

as_bel() → pybel.dsl.node_classes.Protein

Return a PyBEL node representing this enzyme.

class bio2bel_expasy.models.Prosite(**kwargs)

Maps EC numbers to ProSite entries.

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

prosite_id

ProSite Identifier

as_bel() → pybel.dsl.node_classes.Protein

Return a PyBEL node representing this ProSite entry.

class bio2bel_expasy.models.Protein(**kwargs)

Maps enzyme to SwissProt or UniProt.

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

accession_number

UniProt accession number

entry_name

UniProt entry name.

as_bel() → pybel.dsl.node_classes.Protein

Return a PyBEL node representing this UniProt entry.

Constants

Constants for Bio2BEL ExPASy.

bio2bel_expasy.constants.EXPASY_TREE_URL = 'ftp://ftp.expasy.org/databases/enzyme/enzclass.txt'

The web location of the enzyme class tree document

bio2bel_expasy.constants.EXPASY_TREE_DATA_PATH = '/home/docs/.bio2bel/expasy/enzclass.txt'

The local cache location where the enzyme class tree document is stored

bio2bel_expasy.constants.EXPASY_DATABASE_URL = 'ftp://ftp.expasy.org/databases/enzyme/enzyme.dat'

The web location of the ENZYME database document

bio2bel_expasy.constants.EXPASY_DATA_PATH = '/home/docs/.bio2bel/expasy/enzyme.dat'

The local cache location where the ENZYME database document is stored
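
The `/home/docs` prefix in the two cache paths above looks like the home directory of the machine that built this documentation, so on another machine the cache presumably sits under the current user's home directory. The sketch below builds such a path under that assumption; `get_cache_path` is a hypothetical helper and not a library function.

```python
import os
from typing import Optional


def get_cache_path(file_name: str, home: Optional[str] = None) -> str:
    """Build a cache path of the form <home>/.bio2bel/expasy/<file_name>."""
    if home is None:
        home = os.path.expanduser("~")  # current user's home directory
    return os.path.join(home, ".bio2bel", "expasy", file_name)


# Passing the documented home directory reproduces the documented layout
print(get_cache_path("enzyme.dat", home="/home/docs"))
```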
