Welcome to CTS-ircdeploy’s documentation!


This project contains the deployment architecture used by the International Rescue Committee (IRC) for the CTS project. While this repository is specific to IRC’s instance of CTS, the architecture may be used as an example or reference for alternative deployments. To explore other open source IRC projects, please see IRC’s GitHub account.

The purpose of this documentation is to help system administrators deploy and maintain CTS. A certain level of knowledge of Linux is assumed.


Overview

This is an overview of how CTS is deployed.

Server Architecture

CTS is deployed on the following stack using Fabric and SaltStack:

  • OS: Ubuntu 12.04 LTS
  • Python: 2.7
  • Database: Postgres 9.1
  • Application Server: Gunicorn
  • Frontend Server: Nginx
  • Cache: Memcached

Deploys are done to a single server, which will serve all the CTS instances under one domain name using URL prefixing.

For development and test purposes, CTS can be deployed to servers other than production. These are called environments, and there are two defined initially:

  • staging
  • production

The fab commands used to deploy and provision a server always take an environment name as the first argument, e.g. fab staging do_something.

When deploying to a server, the code is deployed from a branch of the code repository on GitHub. Which branch is used is controlled by a setting in the local file conf/pillar/<ENVIRONMENT>/env.sls, e.g. conf/pillar/staging/env.sls might contain:

environment: staging

domain: cts-staging.caktusgroup.com

repo:
  url: git@github.com:theirc/CTS.git
  branch: origin/develop

This indicates that the staging server will use the code from the origin/develop branch.

Country Instances

On a server, there can be multiple copies of CTS running, each with completely independent data. Each copy is called an instance.

All the instances on a server are running the same code, but they run in different processes and use different databases.

An Nginx server receives incoming requests and routes them to the appropriate instance based on the first part of the URL path. E.g. https://cts.rescue.org/IQ/ might go to an instance for Iraq, while https://cts.rescue.org/TR/ might go to an instance for Turkey.

The instances are defined in the file conf/pillar/project.sls and are the same for all environments. Here’s a sample excerpt from that file:

instances:
  turkey:
    name: Turkey
    prefix: /TR
    currency: TRY
    port: 8001
  iraq:
    name: Iraq
    prefix: /IQ
    currency: IQD
    port: 8002
  jordan:
    name: Jordan
    prefix: /JO
    currency: JOD
    port: 8003

Note that this file defines for each instance a human-readable name, a URL prefix, the international code for the instance’s currency, and an internal port where the instance will listen.

Logs for each instance are in /var/www/cts/log/<INSTANCE>/ on the server, where <INSTANCE> is the key of the instance in the configuration, e.g. iraq or turkey.

Some fab commands require an instance to be specified. Here’s an example of how that is done:

fab staging instance:iraq manage_shell

For each instance, there’s a file cts/settings/<INSTANCE>.py with the settings that are unique for that instance. See the existing files, such as cts/settings/jordan.py, to see what needs to be in the instance’s settings file. (Actually, very little needs to be in there.)
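
For illustration only, here is a hypothetical sketch of what such an instance settings module might contain. The import path and the choice of FORCE_SCRIPT_NAME are assumptions, not the actual CTS code; treat the real files in the CTS repository as the reference.

# cts/settings/iraq.py -- hypothetical sketch only; see the real files in CTS.
from cts.settings.base import *  # noqa -- assumed name of the shared settings module

# Example of a per-instance value: Django's FORCE_SCRIPT_NAME makes generated
# URLs include the instance's URL prefix. Whether CTS sets the prefix this way
# is an assumption.
FORCE_SCRIPT_NAME = '/IQ'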

Local Development

When running locally (e.g. django-admin.py runserver), the environment name is dev and there’s only one instance, local, with no URL prefix. Since there’s no prefix, it should work the way developers are used to.

Server Setup Reference

This is a detailed description of how a server ends up configured by the CTS provisioning process.

Files

Below is the server layout created by this provisioning process:

/var/www/cts/
    source/
    env/
    log/
    public/
        static/
        media/
    ssl/

source contains the source code of the project, checked out from git. env is the virtualenv for Python requirements. log stores the Nginx, Gunicorn and other logs used by the project. public holds the static resources (css/js) for the project and the uploaded user media. public/static/ and public/media/ map to the STATIC_ROOT and MEDIA_ROOT settings. ssl contains the SSL key and certificate pair.
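
To make the mapping explicit, the corresponding Django settings on the server point at the public directories shown above; the exact way CTS derives these values may differ, but the result is along these lines:

# Deploy-time settings implied by the layout above.
STATIC_ROOT = '/var/www/cts/public/static/'
MEDIA_ROOT = '/var/www/cts/public/media/'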

Configuration files are updated in:

/etc/nginx
/etc/postgresql
/etc/rabbitmq
/etc/supervisor

Processes

Nginx

Incoming HTTP requests are received by the Nginx web server. /etc/nginx/sites-enabled/cts.conf has the specific configuration for CTS.

Nginx serves static files itself, and routes dynamic requests to the appropriate backend processes. It uses the request URL path to determine how to handle each request.

Nginx is started by an init.d script. There is only one logical nginx running on a server, though it might consist of a master process and multiple worker processes.

Gunicorn

Gunicorn is a Python WSGI server. It is used to run processes with the Django code that can handle HTTP requests routed from Nginx.

Gunicorn processes are managed by Supervisord.

For each CTS instance, there will be one or more Gunicorn processes running on the server.
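
Gunicorn itself is configured by Salt and Supervisor in this deployment, but purely as an illustration of how an instance relates to its internal port, an equivalent Gunicorn configuration for the iraq instance could look like the following (hypothetical; no such file exists in this repository):

# Hypothetical Gunicorn settings for the "iraq" instance.
bind = '127.0.0.1:8002'  # the internal port defined in conf/pillar/project.sls
workers = 2              # illustrative worker count
# Log file names are assumptions; only the log directory layout is documented.
accesslog = '/var/www/cts/log/iraq/gunicorn.access.log'
errorlog = '/var/www/cts/log/iraq/gunicorn.error.log'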

Celery

Celery is a Python library allowing tasks to be scheduled for later execution. In CTS, Celery tasks are used to poll for new package scans.

There are two kinds of Celery processes. Worker processes do the work; there can be many worker processes for an instance. A beat process is like cron: it schedules tasks to run at certain times. Each instance has exactly one beat process.
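
As a sketch of how the beat process drives the polling described above, a Celery beat schedule entry might look like this; the task path and interval are illustrative assumptions, not the actual CTS configuration:

# Hypothetical beat schedule entry; the task path and timing are assumptions.
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    'poll-new-package-scans': {
        'task': 'cts.ona.tasks.update_package_locations',  # assumed dotted path
        'schedule': timedelta(minutes=10),
    },
}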

Celery processes are managed by Supervisord.

Supervisor

Supervisor is a daemon that manages background processes. Each process is configured by a file in /etc/supervisor/conf.d, and supervisor ensures that each process is started and continues to run.

CTS uses Supervisor to manage long-running Python processes, like Gunicorn and Celery.

Supervisor itself is started by an init.d script.

Only one logical Supervisor process runs on a server.

RabbitMQ

RabbitMQ provides reliable asynchronous message queuing among Celery’s processes.

RabbitMQ is started by an init.d script.

Only one logical RabbitMQ process runs on a server.

Postgres

Postgres is our primary database server. We use a PostgreSQL instance hosted on Amazon RDS.

Getting Started

This is a step-by-step guide to start administering IRC’s CTS servers.

  1. Clone the git repository:

    git clone https://github.com/theirc/CTS-ircdeploy.git
    cd CTS-ircdeploy/
    

    Or if you’ll be contributing to the repository:

    git clone git@github.com:theirc/CTS-ircdeploy.git
    cd CTS-ircdeploy/

  2. To set up your local environment, create a virtualenv and install the necessary requirements:

    mkvirtualenv cts-ircdeploy
    $VIRTUAL_ENV/bin/pip install -r $PWD/requirements.txt
    
  3. Add a developer user to the configuration.

    Edit conf/pillar/devs.sls and add a username and SSH public key. This will be used to grant access to the servers later.

    Each user record should match the format:

    <username>:
      public_key:
       - ssh-rsa <Full SSH Public Key would go here>
    

    e.g.

    popeye:
      public_key:
        - ssh-rsa <popeye's full SSH public key>

    Additional developers can be added later, but you will need to create at least one user for yourself.

    Submit a pull request to the repository and get the change merged.

  4. Ask someone who already has access to the server to deploy.

    This will apply your changes, so you’ll have an account on the server with ssh access and sudo privileges.

    If you need to administer multiple environments, ask to have the changes deployed to all of them.

  5. Fetch the latest secrets.

    The secrets files are not in git, so you’ll need to download them from the server. Each environment has its own secrets file, so you’ll need to run the appropriate command for each. Suppose you’re working with the staging server, then you’d run:

    fab staging get_secrets
    

    After running this, you should have a local file conf/pillar/staging/secrets.sls with the passwords, keys, etc that aren’t kept in git.

At this point, you should be able to do any of the needed administration tasks.

Common Administration Tasks

Here are some common tasks and how to perform them.

Get secrets

Since the secrets aren’t stored in Git, it’s a good idea, before doing any server administration, to fetch the current secrets from the servers:

fab staging get_secrets
fab production get_secrets

Update secrets

If you need to update any secrets, be sure to first get the latest secrets file from the relevant server (see above). Then you can edit your local copy (e.g. conf/pillar/production/secrets.sls or conf/pillar/staging/secrets.sls) and deploy (see next item).

Deploy new code

Running a deploy does several things:

  1. If the local secrets files are different from the ones on the server, display the differences and ask whether to update the server files from the local ones. If you answer “no” at this point, the deploy is aborted.
  2. Ensure system and Python packages are installed, configuration files are correct, and generally check and update the provisioning on the server. This uses Salt.
  3. Sync all the configuration files under conf from your local system to the server. This makes it easier to test deploy changes without having to continually commit possibly broken code first.
  4. Check out the source code from GitHub. It’ll use whatever branch name is set in the local conf/pillar/<environment>/env.sls file, so you can test by editing that file locally and deploying. But the actual source code you want to test has to be pushed to GitHub. (By “source code” here, we basically mean everything in the git repository that is outside of the conf directory.)
  5. Run the usual Django deploy-time commands such as collectstatic and syncdb --migrate.
  6. Restart the servers.

To do a deploy, the command is just “deploy”, e.g.:

fab production deploy

Run arbitrary Django management commands

If you want to run an arbitrary Django management command, like “syncdb” or “dbshell”, you can use a command from your local system like:

fab staging instance:iraq manage_run:syncdb

Note that you have to pick an instance.

SSH to the server

There’s a shortcut for this:

fab staging ssh

Provisioning Servers and Environments

New EC2 Servers

These are instructions for creating and deploying a new server. Production servers are typically hosted on Amazon EC2, but most of these instructions apply to any server.

For the purposes of this documentation, we’ll assume we’re adding a new server, to be referred to as the testing environment.

  1. Create a new EC2 server. Some tips:
  • Put it in a region close to where most users will be, e.g. Ireland (eu-west-1). (To switch regions in the AWS EC2 console, look near the top-right of the window for a light-gray selector on a black background.)
  • Use an AMI (image) of Ubuntu 12.04 server, 64-bit, EBS - e.g. ubuntu-precise-12.04-amd64-server-20140408 (ami-d1f308a6)
  • Be sure to save the private key that is created, or use an existing one you already own. (Caktus: key pairs are stored in LastPass, search for CTS.) The AWS private key is only needed until CTS has been deployed the first time, but it is essential until then.
  2. If needed, follow the New Environments instructions to add a new environment.

  3. Add the new server’s ssh key to your ssh-agent, e.g.:

    ssh-add /path/to/newserver.pem
    

    This will allow you to ssh into the new server as root initially. After we’ve finished our deploy, you’ll have your own userid on the server that you can use to ssh in.

  4. Create a minion:

    fab -u root testing setup_minion
    
  5. Initial deploy:

    fab -u root testing deploy
    

After that, developer accounts will exist on the server with ssh access, so “-u root” will no longer be needed. You’ll be able to update the server with:

fab testing deploy

New Environments

(You should rarely need to do this.)

An environment defines a server where CTS will run, e.g. “production” or “staging”.

Creating a new environment requires adding parts of its configuration to multiple places in the CTS configuration files.

For the purposes of this documentation, we’ll assume we’re adding a new environment named testing, which will be accessed at cts-testing.caktusgroup.com.

  1. Edit the fabfile (fabfile.py in the top directory). Create a new task near the top, modeled on the existing tasks like ‘production’. Fill in the new server’s hostname or IP address. Like this:

    @task
    def testing():
        env.environment = 'testing'
        env.hosts = ['cts-testing.caktusgroup.com']
        env.master = env.hosts[0]
    
  2. In the fabfile, add the new environment to SERVER_ENVIRONMENTS near the top:

    SERVER_ENVIRONMENTS = ['staging', 'production', 'testing']
    
  3. In conf/pillar/top.sls, add the new environment to the list:

    {% for env in ['staging', 'production', 'testing'] %}
    
  4. Under the conf/pillar directory, create a new directory with the same name as your new environment. Copy the env.sls and secrets.sls files from an existing directory, such as production. Add the env.sls file to git, but DO NOT add the secrets.sls file to git. Edit both as seems appropriate. The environment and domain names should match those in fabfile.py.

    conf/pillar/testing/env.sls:

    environment: testing
    
    domain: cts-testing.caktusgroup.com
    
    repo:
      url: git@github.com:theirc/CTS.git
      branch: origin/develop
    
    # Additional public environment variables to set for the project
    env:
      FOO: BAR
    

    The repo will also need a deployment key generated so that the Salt minion can access the repository. Or, if the repository already has a deployment key, you’ll need access to the private key. See the GitHub docs on managing deploy keys.

    The private key should be added to conf/pillar/<environment>/secrets.sls under the label github_deploy_key:

    github_deploy_key: |
      -----BEGIN RSA PRIVATE KEY-----
      foobar
      -----END RSA PRIVATE KEY-----
    

    You may choose to include the public SSH key in the repo as well, but this is not strictly required.

    The project_name and python_version are set in conf/pillar/project.sls. Currently we support using Python 2.7 on this project.

    The secrets.sls can also contain a section to enable HTTP basic authentication. This is useful for staging environments where you want to limit who can see the site before it is ready. This will also prevent bots from crawling and indexing the pages. To enable basic auth, add a section called http_auth to the relevant conf/pillar/<environment>/secrets.sls:

    http_auth:
      admin: 123456
    

    This should be a mapping of key/value pairs: each key serves as a username and its value as that user’s password. As with any password, pick a strong one.

    Here’s what conf/pillar/testing/secrets.sls might look like:

    secrets:
        DB_PASSWORD: xxxxxx
        BROKER_PASSWORD: yyyyy
        newrelic_license_key: zzzzz
    
        # Iraq:
        ONA_DOMAIN_IQ: ona-staging.caktusgroup.com
        ONA_API_ACCESS_TOKEN_IQ: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        ONA_FORM_IDS_IQ: 4
        ONA_DEVICEID_VERIFICATION_FORM_ID_IQ: 52
    
        # Jordan:
        ONA_DOMAIN_JO: ona-staging.caktusgroup.com
        ONA_API_ACCESS_TOKEN_JO: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        ONA_FORM_IDS_JO: 3;14
        ONA_DEVICEID_VERIFICATION_FORM_ID_JO: 35
    
        # Turkey:
        ONA_DOMAIN_TR: ona-staging.caktusgroup.com
        ONA_API_ACCESS_TOKEN_TR: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        ONA_FORM_IDS_TR: 5;6;23
        ONA_DEVICEID_VERIFICATION_FORM_ID_TR: 65
    
    # Uncomment and update username/password to enable HTTP basic auth
    # Comment out to make the site publicly accessible
    http_auth:
        caktus: abc123
    
    github_deploy_key: |
        -----BEGIN RSA PRIVATE KEY-----
        xxxxxxxx....xxxxxxxxx
        -----END RSA PRIVATE KEY-----
    
    # Key and cert are optional; if either is missing, self-signed cert will be generated
    ssl_certificate: |
        -----BEGIN CERTIFICATE-----
        MIIFtzCCBJ+gAwIBAgIRAKExk5E8hLbFJa3HRZCMlowwDQYJKoZIhvcNAQEFBQAw
        ...
        lgFKqqiPJXgcYrkEaCFpGG2KVI2oRVCc6EOS
        -----END CERTIFICATE-----
    
    ssl_key: |
        -----BEGIN PRIVATE KEY-----
        MIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQCoU2/FjOX/XWbf
        ...
        VtAT+BRfNZvJ3f2bWV8U2A==
        -----END PRIVATE KEY-----
    
  5. Edit conf/salt/project/new_relic_webmon/newrelic.ini. At the end, add a new New Relic environment:

    [newrelic:testing]
    monitor_mode = false
    
  6. Commit changes to git and push them. Merge to master if this is going to be a production server, or to whatever branch env.sls is configured to pull from.

    If you want to test without merging the changes to master yet, then push the changes to some other branch, and edit your local copy of conf/pillar/testing/env.sls to change the branch name to the one you’re using.

Ona Support

Overview

CTS utilizes Ona to capture mobile form data. The web application communicates with Ona via its REST API.

Ona Secrets

Here is the data that should be in conf/pillar/<environment>/secrets.sls to access an Ona server, for the IQ instance:

secrets:
  ONA_DOMAIN_IQ: ona.io # domain of your Ona instance
  ONA_API_ACCESS_TOKEN_IQ: changeme # API access token of a valid Ona User
  ONA_FORM_IDS_IQ: 23;5 # Semicolon-separated Form IDs for package/voucher tracking for this instance of the web application
  ONA_DEVICEID_VERIFICATION_FORM_ID_IQ: changeme # Form ID for binding a device to a user for this instance of the web application
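
These values are exposed to the application (typically as environment variables) and read by the Django settings. A hedged sketch of that pattern, mirroring the os.environ.get usage shown in the next section, with illustrative setting names:

# Illustrative only; the actual CTS settings code and variable names may differ.
import os

ONA_DOMAIN = os.environ.get('ONA_DOMAIN_IQ', '')
ONA_API_ACCESS_TOKEN = os.environ.get('ONA_API_ACCESS_TOKEN_IQ', '')
# Semicolon-separated form IDs become a list of integers, e.g. "23;5" -> [23, 5].
ONA_FORM_IDS = [int(form_id)
                for form_id in os.environ.get('ONA_FORM_IDS_IQ', '').split(';')
                if form_id]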

Additional Forms

If additional form support is required, a few code changes will be necessary. The components needed are:

  • an environment variable and/or Django setting to define the form id to capture
  • a celery task to poll and consume form submissions

Here is a made-up example:

# Django setting
import os

ONA_MY_FORM_ID = os.environ.get('ONA_MY_FORM_ID', '')

# Celery task. `app` is the project's Celery application; OnaApiClient,
# PackageLocationFormSubmission and FormSubmission come from the CTS codebase.
from django.conf import settings

@app.task
def update_package_locations():
    """Updates the local database with new package tracking form submissions"""
    form_id = settings.ONA_MY_FORM_ID
    client = OnaApiClient()
    submissions = client.get_form_submissions(form_id)
    for data in submissions:
        submission = PackageLocationFormSubmission(data)
        # Only create records for submissions we haven't seen before.
        if not FormSubmission.objects.filter(uuid=submission._uuid).exists():
            FormSubmission.from_ona_form_data(submission)

The above task uses a helper object, PackageLocationFormSubmission, to parse the data. For many forms, it is possible to use the OnaItemBase base class directly. Depending on your specific needs for the form, you may want to write a custom class based on OnaItemBase to process your form submissions.
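
Purely as an illustration of what such a helper object does, a minimal wrapper might look like the sketch below; the real classes are based on OnaItemBase, whose API in the CTS codebase may differ.

# Hypothetical sketch of a submission wrapper, not the real OnaItemBase API.
class MyFormSubmission(object):
    """Wraps one raw Ona submission (a dict) for convenient access."""

    def __init__(self, data):
        self._data = data
        # Ona submissions carry a unique _uuid, which the task above uses to
        # avoid importing the same submission twice.
        self._uuid = data.get('_uuid')

    def get(self, field, default=None):
        """Convenience accessor for raw form fields."""
        return self._data.get(field, default)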

If you need to query submissions for a specific form, filter on the form_id field:

FormSubmission.objects.filter(form_id=settings.ONA_MY_FORM_ID)

CTS Backups

Backups are taken regularly and stored on the Caktus backup server. This document explains how to access and use those backups.

Getting a backup dump and restoring it locally

Steps you can do ahead of time:

  • Get access to the Caktus backup server (open a tech support request).

When you need to restore a backup:

  • Make sure you are in your CTS-IRCDeploy directory, and not in the CTS project directory:

    $ git config --get remote.origin.url
    git@github.com:theirc/CTS-ircdeploy.git
    
  • List the files in the latest backup directory and find the most recent backup file for each instance (i.e. “iraq”, “jordan”, and “turkey”):

    $ backup_path=/mnt/rsnapshot/cts/daily.0/home/caktus-backup
    $ ssh caktus-backup@backup.caktus.lan ls $backup_path/cts_iraq* | tail -1
    cts_iraq.rescue.org-20170927.bz2
    $ ssh caktus-backup@backup.caktus.lan ls $backup_path/cts_jordan* | tail -1
    cts_jordan.rescue.org-20170927.bz2
    $ ssh caktus-backup@backup.caktus.lan ls $backup_path/cts_turkey* | tail -1
    cts_turkey.rescue.org-20170927.bz2
    
  • Copy those files to your local directory:

    $ scp caktus-backup@backup.caktus.lan:${backup_path}/cts_iraq.rescue.org-20170927.bz2 cts_iraq.bz2
    $ scp caktus-backup@backup.caktus.lan:${backup_path}/cts_jordan.rescue.org-20170927.bz2 cts_jordan.bz2
    $ scp caktus-backup@backup.caktus.lan:${backup_path}/cts_turkey.rescue.org-20170927.bz2 cts_turkey.bz2
    

    The iraq file is about 3MB, and the others are about 5MB each, as of Sept 2017.

  • Decompress the file using the -k flag which keeps the compressed version around (since we’ll be SCP’ing that to staging in a few steps):

    $ bunzip2 -k cts_iraq.bz2
    $ bunzip2 -k cts_jordan.bz2
    $ bunzip2 -k cts_turkey.bz2
    
  • Drop your existing local database and restore from the backup:

    $ dropdb cts
    $ createdb --template=template0 cts
    $ psql --quiet cts -f cts_iraq > sql-import.log 2>&1
    

    You can look through sql-import.log to view the output from that command. There will be a bunch of errors about missing relations and roles. It’s OK to ignore them.

  • Change to the CTS project directory:

    $ cd ../CTS
    $ git config --get remote.origin.url
    git@github.com:theirc/CTS.git
    
  • Migrate the database:

    $ workon cts
    $ python manage.py migrate --noinput
    $ python manage.py createsuperuser
    $ python manage.py runserver
    

    Check out localhost:8000 and poke around.

  • Repeat the above process with the ‘jordan’ and ‘turkey’ dumps.

Bringing up a new site using the backup dump

  • Change back to the CTS-ircdeploy directory:

    $ cd ../CTS-ircdeploy
    $ git config --get remote.origin.url
    git@github.com:theirc/CTS-ircdeploy.git
    
  • Copy the three compressed dump files to staging:

    $ fab staging put_file:cts_iraq.bz2
    $ fab staging put_file:cts_jordan.bz2
    $ fab staging put_file:cts_turkey.bz2
    
  • SSH into staging and unzip the files:

    $ fab staging ssh
    user@cts-staging$ cd /tmp
    user@cts-staging$ bunzip2 cts_iraq.bz2
    user@cts-staging$ bunzip2 cts_jordan.bz2
    user@cts-staging$ bunzip2 cts_turkey.bz2
    
  • Stop the web and celery processes:

    user@cts-staging$ sudo supervisorctl stop all
    
  • Switch to the cts user and set up the environment to allow you to access RDS:

    user@cts-staging$ sudo -u cts -i
    cts@cts-staging$ . /var/www/cts/run.sh
    (env)cts@cts-staging$ export PGHOST=$DB_HOST PGPASSWORD=$DB_PASSWORD PGUSER=$DB_USER
    (env)cts@cts-staging$ dropdb cts_iraq
    (env)cts@cts-staging$ createdb cts_iraq
    (env)cts@cts-staging$ psql --quiet cts_iraq -f /tmp/cts_iraq > /tmp/sql-import.log 2>&1
    
  • Review the sql-import.log. There will be lots of errors about missing roles, tables, etc, but that is OK. Now, run migrations:

    (env)cts@cts-staging$ INSTANCE=iraq django-admin.py migrate --noinput
    
  • Repeat this process for the other two instances: ‘jordan’ and ‘turkey’.

  • After completing all 3 instances, switch back to your user and restart the servers:

    (env)cts@cts-staging$ logout
    user@cts-staging$ sudo supervisorctl start all
    cts-celery-jordan: started
    cts-turkey-server: started
    cts-celery-turkey: started
    cts-celery-beat-jordan: started
    cts-celery-iraq: started
    cts-celery-beat-iraq: started
    cts-jordan-server: started
    cts-iraq-server: started
    cts-celery-beat-turkey: started
    
  • Give the load balancer a few minutes to realize we’re healthy, then poke around the staging servers to make sure everything looks good.

  • Finally clean up the dumps from the staging server and locally:

    user@cts-staging$ cd /tmp/
    user@cts-staging$ sudo rm -f cts_* sql-import.log
    user@cts-staging$ logout
    Connection to ec2-54-86-123-211.compute-1.amazonaws.com closed.

    Done.
    $ rm -f cts_* sql-import.log
    

Releases

0.4 on Sep. 29, 2017

  • Add ssh keys for maintenance and backup.
  • Document how to restore from backup (#16).

0.3 on Jan. 12, 2016

  • Update Ona docs (#11)
  • Update for voucher support (#12)
  • Restart celery nightly (#13)
  • Separate celery queues (#14). Requires CTS v0.7.0

0.2 on Nov. 13, 2015

  • Move to Amazon RDS (#6)
  • Move logs to syslog (#8)
  • Stop using dbbackup script (#10)

0.1 on Jun. 9, 2015

  • Initial release
