Welcome to i5k_doc’s documentation!¶
This is my introduction to this project
Requirements
My project depend on xxx
Contents:
Pre-requisites¶
Python modules¶
Described in requirements.txt
- Django==1.8.12
- Markdown==2.6.6
- celery==3.1.23
- cssmin==0.2.0
- django-pipeline==1.6.8
- django-simple-captcha==0.4.5
- djangorestframework==2.3.4
- django-rest-swagger==0.3.5
- docutils==0.12
- jsmin==2.0.11
- pillow==2.2.2
- psycopg2==2.6
- pycrypto==2.6.1
- python-memcached==1.57
- python-social-auth==0.2.16
- requests-oauthlib==0.6.1
- wsgiref
- django-suit==0.2.18
Service-side pre-requisites¶
- RabbitMQ
- mod_wsgi
- PostgreSQL
Setup guide¶
This setup guide is tested in Centos 6.7 and django 1.8.12
Note: The following variables may be used in path names; substitute as appropriate:
<user> : the name of the user doing a set up.
<user-home> : the user's home directory, e.g., /home/<user>
<app-home> : the root directory of the i5K application, e.g., /app/local/i5k
<virt-env> : the root directory of the virtualenv this set up creates.
<git-home> : the directory containing the django-blast git repository, e.g. <user-home>/git
Project Applications¶
Clone or refresh the django-blast project:
git clone https://github.com/NAL-i5K/django-blast
# Or if the django-blast repository exists:
cd <git-home>
git fetch
Python 2.7.8¶
Install necessary packages:
sudo yum -y groupinstall "Development tools"
sudo yum -y install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel
sudo yum -y install readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel python-devel
Install python 2.7.8 from source:
cd <user-home>
wget http://www.python.org/ftp/python/2.7.8/Python-2.7.8.tar.xz
tar -xf Python-2.7.8.tar.xz
# Configure as a shared library:
cd Python-2.7.8
./configure --prefix=/usr/local --enable-unicode=ucs4 --enable-shared LDFLAGS="-Wl,-rpath /usr/local/lib"
# Compile and install:
make
sudo make altinstall
# Update PATH:
export PATH="/usr/local/bin:$PATH"
# Checking Python version (output should be: Python 2.7.8):
python2.7 -V
# Cleanup if desired:
cd ..
rm -rf Python-2.7.8.tar.xz Python-2.7.8
Install pip and virtualenv:
wget https://bootstrap.pypa.io/ez_setup.py
sudo /usr/local/bin/python2.7 ez_setup.py
wget https://bootstrap.pypa.io/get-pip.py
sudo /usr/local/bin/python2.7 get-pip.py
sudo /usr/local/bin/pip2.7 install virtualenv
Build a separate virtualenv:
# Make root dir for virtualenv and cd into it:
cd django-blast
# Create a virtual environment called py2.7 and activate:
virtualenv py2.7
source py2.7/bin/activate
Python Modules and Packages¶
Install additional Python packages:
cd <virt-env>
# Cut, paste and run the following bash script.
# If any installation fails script halts:
for package in \
"django==1.8.12" \
"markdown==2.6.6" \
"cssmin==0.2.0" \
"django-pipeline==1.6.8" \
"djangorestframework==2.3.4" \
"django-rest-swagger==0.3.5" \
"django-suit==0.2.18" \
"django-axes" \
"docutils==0.12" \
"jsmin==2.0.11" \
"pycrypto==2.6.1" \
"python-memcached==1.57" \
"python-social-auth==0.2.16" \
"requests-oauthlib==0.6.1" \
"wsgiref==0.1.2" \
"pillow==2.2.2" \
"django-simple-captcha==0.4.5"
do
echo -e "\nInstalling $package..."
if ! yes | pip install $package ; then
echo -e "\nInstallation of package $package FAILED"
break
fi
done
RabbitMQ¶
Install RabbitMQ Server:
cd <user-home>
# Install RHEL/CentOS 6.8 64-Bit Extra Packages for Enterprise Linux (Epel).
# The 6.8 Epel caters for CentOS 6.*:
wget https://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
sudo rpm -ivh epel-release-6-8.noarch.rpm
# Install Erlang:
sudo yum -y install erlang
# Install RabbitMQ server:
sudo yum -y install rabbitmq-server
# To start the daemon by default when system boots run:
sudo chkconfig rabbitmq-server on
# Start the server:
sudo /sbin/service rabbitmq-server start
# Clean up:
rm epel-release-6-8.noarch.rpm
Celery¶
Install celery in the virtualenv and configure:
# At this point <virt-env> has all project files
# including celery config files.
cd <virt-env>
pip install celery==3.1.23
# Copy files:
sudo cp celeryd /etc/init.d
sudo cp celerybeat /etc/init.d
sudo cp celeryd.sysconfig /etc/default/celeryd
sudo cp celerybeat.sysconfig /etc/default/celerybeat
# Sudo edit '/etc/default/celeryd' as follows:
CELERYD_CHDIR="<virt-env>"
CELERYD_MULTI="<virt-env>/py2.7/bin/celery multi"
# Sudo edit '/etc/default/celerybeat' as follows:
CELERYBEAT_CHDIR="<virt-env>"
CELERY_BIN="<virt-env>/py2.7/bin/celery"
# Set as daemon:
sudo chkconfig celeryd on
sudo chkconfig celerybeat on
Memcached¶
Install and activate memcached:
sudo yum -y install memcached
# Set to start at boot time:
sudo chkconfig memcached on
Database¶
Install PostgreSQL:
# Add line to yum repository:
echo 'exclude=postgresql*' | sudo tee -a /etc/yum.repos.d/CentOS-Base.repo
# Install the PostgreSQL Global Development Group (PGDG) RPM file:
sudo yum -y install http://yum.postgresql.org/9.5/redhat/rhel-6-x86_64/pgdg-centos95-9.5-2.noarch.rpm
# Install PostgreSQL 9.5:
sudo yum -y install postgresql95-server postgresql95-contrib postgresql95-devel
# Initialize (uses default data directory: /var/lib/pgsql):
sudo service postgresql-9.5 initdb
# Startup at boot:
sudo chkconfig postgresql-9.5 on
# Control:
# sudo service postgresql-9.5 <command>
#
# where <command> can be:
#
# start : start the database.
# stop : stop the database.
# restart : stop/start the database; used to read changes to core configuration files.
# reload : reload pg_hba.conf file while keeping database running.
# Start:
sudo service postgresql-9.5 start
#
# (To remove everything: sudo yum erase postgresql95*)
#
# Create django database and user:
sudo su - postgres
psql
# At the prompt 'postgres=#' enter:
create database django;
create user django;
grant all on database django to django;
# Connect to django database:
\c django
# Create extension hstore:
create extension hstore;
# Exit psql and postgres user:
\q
exit
# Config in pg_hba.conf:
cd <virt-env>
export PATH=/usr/pgsql-9.5/bin:$PATH
# Restart:
sudo service postgresql-9.5 restart
# Install pycopg2:
pip install psycopg2==2.6
Apache¶
Please note: It is essential that tcp port 80 be open in your system. Sometimes the firewall may deny access to it. Check if iptables will drop input packets in the output of this command:
sudo iptables -L
If you see “INPUT” and “DROP” on the same line and no specific ACCEPT rule for tcp port 80 chances are web traffic will be blocked. Ask your sysadmin to open tcp ports 80 and 443 for http and https. Alternatively, check this iptables guide.
Install Apache and related modules:
sudo yum -y install httpd httpd-devel mod_ssl
Give the system a fully qualified domain name (FQDN) if needed:
# Find out the system IP addres with 'ifconfig'.
# Assuming it is a VM created by Vagrant, this could be 10.0.2.15.
# Sudo edit '/etc/hosts' and add an address and domain name entry.
# For example:
10.0.2.15 virtualCentOS.local virtualCentOS
# Sudo edit the file /etc/httpd/conf/httpd.conf,
# and set the ServerName, for example:
ServerName virtualCentOS.local:80
# Set to start httpd at boot:
sudo chkconfig httpd on
# Check this setting if you wish, with:
sudo chkconfig --list httpd
# Control:
# sudo apachectl <command>
# Where <command> can be:
# start : Start httpd daemon.
# stop : Stop httpd daemon.
# restart : Restart httpd daemon, start it if not running.
# status : Brief status report.
# graceful : Restart without aborting open connections.
# graceful-stop : stop without aborting open connections.
#
# Start httpd daemon:
sudo apachectl start
# Test Apache:
# If all is well. This command should produce copious
# HTML output and in the first few lines you should see:
# '<title>Apache HTTP Server Test Page powered by CentOS</title>'
curl localhost
# You can also view the formatted Apache test page in your
# browser, e.g., firefox http://<setup-machine-ip-address>
Workplace Apps¶
Blast¶
Introduction
I5K BLAST Tutorial is on https://i5k.nal.usda.gov/content/blast-tutorial
BLAST DB Configuration¶
There are five tables for creating BLAST DB and browsing in I5K-blast.
- Add Organism:
- Display name should be scientific name.
- Short name are used by system as a abbreviation.
- Descriptions and NCBI taxa ID are automatically filled.

- Add Sequence types:
- Used to classify BLAST DBs in distinct catagories.
- Provide two kinds of molecule type for choosing, Nucleotide/Peptide.
- Add Sequence
- Add BLAST DB
- Choose
Organsim
- Choose
Type
(Sequence type) - Type location of fasta file in
FASTA file path
- Type
Title
name. (showed in HMMER page) - Type
Descriptions
. - Check
is shown
, if not check, this database would show in HMMER page. - Save
- Choose

- Add JBrowse settings
Hmmer¶
HMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs).
I5K HMMER Tutorial is on https://i5k.nal.usda.gov/webapp/hmmer/manual.
HMMER DB Configuration¶
Like Blast, HMMER databases must be configured then they could be searched.
Go django admin page and click Hmmer on left-menubar. You need to create HMMER db instance (Hmmer dbs) for each fasta file.
- Choose
Organsim
- Type location of peptide fasta file in
FASTA file path
- Type
Title
name. (showed in HMMER page) - Type
Descriptions
. - Check
is shown
, if not check, this database would show in HMMER page. - Save

HMMER Query Histroy¶
HMMER query histories are stored in table HMMER results
. Users could review them on dashboard.
All query results (files on disk) will be removed if it’s expired. (default: after seven days)
Query results locate in directory $MEDIA_ROOT/hmmer/task/
.
Clustal¶
ClustalW is the oldest of the currently most widely used programs for multiple sequence alignment. Clustal Omega is the latest version of CLUSTAL series. ClustalO is faster and more accurate because of new HMM alignment engine.
I5K CLUSTAL Tutorial is on https://i5k.nal.usda.gov/webapp/clustal/manual.
Install & Configuration¶
Install Clustalw and Clustal Omega.
Then append both bin directory in evironment varialbe PATH
.
Clustal Query Histroy¶
Clustal query histories are stored in table Clustal results
. Users could review them on dashboard.
All query results (files on disk) will be removed if it’s expired. (default: after seven days)
Query results locate in directory $MEDIA_ROOT/clustal/task/
.
Dashboard¶
Personal query history.
Data¶
Rest framework. Not finished
Proxy¶
For providing indirect access to some resources without https. Currently it is used by Web Apollo instances for looking up GO Terms.
Drupal_SSO¶
Coonection to Drupal summit data function.
DRUPAL_URL = 'https://gmod-dev.nal.usda.gov'
# cookie can be seen in same domain
DRUPAL_COOKIE_DOMAIN=".nal.usda.gov"
WebApollo SSO¶
Complete introduction locate in Section 4.
WebApollo Single Sign On¶
What is WebApollo SSO?¶
The basic idea in SSO is to provide handy user interface and make WebApollo user more like a community. In order to accomplish those ideas, we try to transfer management jobs from WebApollo to SSO. SSO gives the coordinators more authority to manage their members who can annotating and grant the priviledges on their own.
In SSO, we seperate users into three different roles.
- First, the
ADMIN
who actually owns ‘admin priviledge’ in WebApollo, can manage users/groups/eroll event. - Second, the
COORDINATOR
who belong to groupGROUP_(Organism_short_name(OSN))_ADMIN
, can manage membership in specific (Organism). - Last, the remaining users are in
USER
. They can make request to join (or leave) different organism team. Once be recuited in, user will pertain to groupGROUP_(OSN)_USER
.
SSO make a virtual role COORDINATOR
by exploiting a conventional group name GROUP_(OSN)_ADMIN
and the user in the team
would be in group GROUP_(OSN)_USER
.
Role\ | WebApollo | Single Sign On (SSO) |
ADMIN | Global Admin | Global Admin |
COORDINATOR | Admin permission | in GROUP_()_ADMIN |
USER | RWE permission | in GROUP_()_USER with RWE permission |
Note
Mapping between full organism name and short organism name are stored in django-blast app. Full organism name is the real name in WebApollo and short name is a abbreviation used in django-blast app.
Framework Overview¶
SSO was implemented in Django and JQuery. Conceptually, SSO is a proxy service for delegating user request to appropriate WebApollo service. The main advantage here is that SSO could provides more social utilities for the I5K community.

Database Schema (UserMapping)
Apollo_user_id | Apollo_user_name | Apollo_user_pwd | django_user | last_date |
1 | Chris | (AES encrpted pwd) | Christopher | |
2 | Monica | (AES encrpted pwd) | Monica | |
3 | Mei | (AES encrpted pwd) | NULL |
SSO records the mapping between Apollo_user and django_user in table UserMapping
.
Apollo_user_id
and django_user
are unique attribute and this makes mapping a one to one relationship.
(apollo_user_name could be changed and is not unique)
In above table, record 1 and 2 tell a formal relationship but record 3 describes an Apollo user doesn’t belong to any django user. User can claim it by re-register process. (mentioned below)
Configuration¶
SSO uses a pre-assigned admin Apollo account to communiate with Apollo server. The account must be create on apollo server first. Two URLs address of i5k server and apollo server are used to identify each others’ locations. In order to secure user password, SSO encrpt it before saving password into database.
WebApollo SSO configuration in django setting.py:
# WebApollo SSO robot account
ROBOT_ID='R2D2'
ROBOT_PWD='demo'
#URL of i5k workspace and webapollo
I5K_URL='http://i5k.nal.gov'
APOLLO_URL='http://i5k.apollo.nal.gov/apollo'
# cookie can be seen in Apollo-prod and Gmod-prod
APOLLO_COOKIE_DOMAIN=".nal.usda.gov"
#Encypt webapollo user password in SSO database.
#AES key must be either 16, 24, or 32 bytes long.
SSO_CIPHER='1234567890123456'
Register WebApollo¶
There are three ways to make connection between i5k account to apollo account.

When registering an new i5k account, SSO also create an apollo account(same ID).
When entering SSO, if SSO doesn’t have mapping record of user,
- it asks user to create a new apollo account
- or register his account info into SSO.
When entering SSO, if SSO has mapping record of user but login failed, it asks user to re-enter his password into SSO.
Utilities¶
There are six individual tab pages, three of them are general and others are specific for Admin user.
Utilities only for Admin¶
Tab\ | Function Descriptions |
User(Admin) | View/Create/Delete/Update/Disconnect Apollo User |
Group(Admin) | View/Create/Delete Apollo Group |
PReq(Admin) | View Pending request |
General Utilities¶
Tab\ | Function Descriptions |
My Organism | Manage organism which you joined in / Go WebApollo |
My Request | Make request to join/leave a organism community |
My Info | User basic information |