DevOps Knowledge Base

This knowledge base covers Software Configuration Management, Continuous Delivery, Infrastructure Orchestration, and DevOps.

Contents

Software Configuration Management

SCM Generic Topics

  • Difference between Software Configuration Management and Hardware Configuration Management: scm-vs-hcm
  • Streamlining Build Processes and Configuration Management: streamlining-scm
SCM and CM Industry Standards
  • EIA 649-A National Consensus Standard for Configuration Management
  • IEEE Std 1042-1987 Guide to Software Configuration Management
  • IEEE Std-828-1990 Standard for software CM Plans
  • ISO 10007 Quality management systems – Guidelines for configuration management
  • STANAG 4159 NATO Materiel Configuration Management Policy and Procedures for Multinational Joint Projects
  • STANAG 4427 Allied Configuration Management Publications
  • MIL-HDBK-61A(SE)
  • ECSS-M-ST-40C – ESA (European Space Agency) – Aerospace Industry
ITIL
SCM Frameworks
BOSH
  • BOSH is a Cloud Foundry project for release engineering, deployment, and lifecycle management: bosh-homepage

Code Management

Best Practices
  1. Build an abstraction layer between your version control system and the rest of the SCM toolkit.
  2. Keep every operation atomic: it either completes fully or has no effect, with no states in between.
  3. Store only source code in the repository (no binaries, media files, etc.).
Common Problems
Pull Request VS Code Review
Choosing VCS System
Branching
Performance & Scaling
Binary Files

Handling binary files, especially large ones, is a problem for every Distributed Version Control System (DVCS).

Creating Workspace

Creating a workspace can be a problem for every Distributed Version Control System (DVCS), especially for big repositories or repositories with a large number of projects (subrepositories).

  • Facebook: “it takes about 10 minutes to begin discovering commits in Facebook’s 350,000-commit primary repository, and about 18 hours to import it all with 64 taskmasters on modern hardware.” Source: facebook-creating-workspace.
VCS/DVCS Migrations
Large Changes in Code Review

Most code review tools do not handle large changes well.

Versioning
Android

Starting with Cupcake, individual builds are identified with a short build code, e.g. FRF85B. The first letter is the code name of the release family, e.g. F is Froyo. The second letter is a branch code that allows Google to identify the exact code branch that the build was made from, and R is by convention the primary release branch. The next letter and two digits are a date code. The letter counts quarters, with A being Q1 2009. Therefore, F is Q2 2010. The two digits count days within the quarter, so F85 is June 24 2010. Finally, the last letter identifies individual versions related to the same date code, sequentially starting with A; A is actually implicit and usually omitted for brevity.
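
The decoding rules above can be sketched in a few lines of Python. This is only an illustration of the scheme as described in this section, not an official tool: the function name `decode_build_id` and the returned dictionary keys are made up for the example, and it assumes that day 01 is the first day of the quarter, which is what the FRF85B example implies.

```python
from datetime import date, timedelta


def decode_build_id(build_id: str) -> dict:
    """Decode a build code such as 'FRF85B' (hypothetical helper)."""
    if len(build_id) < 5 or not build_id[:3].isalpha() or not build_id[3:5].isdigit():
        raise ValueError(f"unexpected build code format: {build_id!r}")

    family = build_id[0].upper()          # code name letter, e.g. F = Froyo
    branch = build_id[1].upper()          # branch code, R = primary release branch
    quarter_letter = build_id[2].upper()  # A = Q1 2009, B = Q2 2009, ...
    day_in_quarter = int(build_id[3:5])   # assumed to start at 1
    revision = build_id[5:] or "A"        # 'A' is implicit when omitted

    # Number of quarters elapsed since Q1 2009.
    quarters = ord(quarter_letter) - ord("A")
    year = 2009 + quarters // 4
    first_month = 1 + 3 * (quarters % 4)
    build_date = date(year, first_month, 1) + timedelta(days=day_in_quarter - 1)

    return {
        "family": family,
        "branch": branch,
        "build_date": build_date,
        "revision": revision,
    }


if __name__ == "__main__":
    print(decode_build_id("FRF85B"))
```

For FRF85B the sketch yields family F, branch R, and a build date of June 24 2010, which matches the example in the text.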

Tools
Centralized VS Distributed
Git, Mercurial and Bazaar domination
Git
Tools and Plugins
Git Propaganda
Git Branching
Git on Windows
Git & Multiple Projects
Git Tools
Git Use Cases
Online Tutorials
Git Presentations
Git cheatsheets
Best Practices
Git and Android
Gerrit
Cross Repo Dependencies
Gerrit Server - public instances
Tips and Tricks
Mercurial
Propaganda
  • Google announces Mercurial support:
  • CodePlex announces Mercurial support:
Architecture
  • Mercurial Architecture: ols-mercurial-paper.pdf
Veracity
Fossil

Boar

VCS

VCS is an abstraction layer over various version control systems: vcs-homepage. The project seems to be dead.

Commercial

Perforce

Code Review
Phabricator

Phabricator is developed and used by Facebook (and many other companies).

  • Homepage: phabricator-homepage
Rietveld
Code Review Use Cases
Tips and Tricks
Resources
Software Development KPIs

Development KPIs

  • Lines of code per developer
  • Build test failures
  • Unit test failures
  • Number of bugs found in their code
  • Number of bugs fixed
  • Actual time to finish a task compared with the developer's own estimate
  • Number of developers and commits by organization, site or country (Bangalore, Brugge)
  • Number of revisions merged per contributor
  • Number of revisions abandoned per contributor
  • Number of revisions merged per organization, site, country
  • Number of revisions abandoned per organization, site, country
  • Ratios merged/abandoned
  • Number of new contributors with 1 / 2-5 / 6+ changes submitted in the past 3 months
  • Number of contributors who stopped contributing or whose contributions decreased continuously in the past 3 months

Gerrit KPIs

  • Number of Code review comments
  • Average time spent on Code Review
  • Number of commits reviewed in <2 days, <1 week, <1 month, <3 months, >3 months or unreviewed
  • Code Review queue size
  • How many new users registered (per day, per month, per year)
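
As a rough illustration of the review-time KPI above, the following sketch groups review durations into the listed buckets. It does not query Gerrit; the input is assumed to be a plain list of `timedelta` values (or `None` for unreviewed changes) that you have already extracted by other means. The names `bucket_review_time` and `review_time_histogram` are hypothetical, and a month is approximated as 30 days.

```python
from collections import Counter
from datetime import timedelta
from typing import Iterable, Optional

# Upper bounds for each bucket, checked in order (month approximated as 30 days).
BUCKETS = [
    ("<2 days", timedelta(days=2)),
    ("<1 week", timedelta(weeks=1)),
    ("<1 month", timedelta(days=30)),
    ("<3 months", timedelta(days=90)),
]


def bucket_review_time(elapsed: Optional[timedelta]) -> str:
    """Return the bucket label for one change; None means never reviewed."""
    if elapsed is None:
        return "unreviewed"
    for label, limit in BUCKETS:
        if elapsed < limit:
            return label
    # Anything at or beyond the last limit lands in the open-ended bucket.
    return ">3 months"


def review_time_histogram(times: Iterable[Optional[timedelta]]) -> Counter:
    """Count how many changes fall into each bucket."""
    return Counter(bucket_review_time(t) for t in times)


if __name__ == "__main__":
    sample = [timedelta(hours=5), timedelta(days=3), timedelta(days=45), None]
    print(review_time_histogram(sample))
```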

SCM Team KPIs

  • Time to set up an environment
  • Time from change request to release
  • Mean time to resolution

JIRA Related KPIs:

  • Average time for an accepted bug report between bug creation date and PATCH_TO_REVIEW status being set
  • Average time for an accepted bug report between PATCH_TO_REVIEW status being set and RESOLVED FIXED status being set.
  • Average time for an accepted bug report between bug creation date and the first comment by someone other than the reporter.

Deployment KPIs:

  • Speed of deployment
  • Deployment success rate
  • How quickly service can be restored after a failed deployment
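
The three deployment KPIs can be derived from a simple log of deployments. The sketch below assumes a hypothetical `Deployment` record with a duration, a success flag, and (for failed deployments) the time it took to restore service; none of these names come from a specific tool.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional


@dataclass
class Deployment:
    duration: timedelta                          # how long the deployment took
    succeeded: bool
    time_to_restore: Optional[timedelta] = None  # only set for failed deployments


def deployment_kpis(deployments: List[Deployment]) -> dict:
    """Compute mean duration, success rate, and mean time to restore."""
    if not deployments:
        raise ValueError("at least one deployment is required")

    total = len(deployments)
    successes = sum(1 for d in deployments if d.succeeded)
    restores = [
        d.time_to_restore
        for d in deployments
        if not d.succeeded and d.time_to_restore is not None
    ]

    mean_duration = sum((d.duration for d in deployments), timedelta()) / total
    mean_restore = sum(restores, timedelta()) / len(restores) if restores else None

    return {
        "mean_duration": mean_duration,        # speed of deployment
        "success_rate": successes / total,     # deployment success rate
        "mean_time_to_restore": mean_restore,  # None if nothing failed
    }
```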

Build Management

Best Practices
Build Parallelization
GNU Make

Pitfalls and Benefits: pitfals-and-benefits

Theory

Build System high level feature proposal: build-system-high-level

Build in the Cloud
Build Automation

Change Control

Risk
Lowering the risk

Change should be done through tools and culture.

  • you are always testing
  • use the bots, use the tools, but do not talk to me
  • use revision tracker
  • introduce post/pre commit gatekeepers
  • introduce push-karma
Database Patching
DB Patching Tools

Infrastructure Orchestration

Infrastructure Orchestration was previously known as Environment Configuration. Many people and organizations still use the old name, but this knowledge base uses the newer term, which is steadily gaining popularity.

Best Practices
  1. Treat configuration as source code
  2. Communicate configuration changes
Articles
Frameworks
Comparisons
  • Comparison Grid for Ansible / SaltStack / Chef / Puppet: comparison-grid
  • Comparison of open-source configuration management software: wikipedia
  • Ansible and Salt: A detailed comparison: ansible-and-salt
Salt Stack

Salt is an open source tool to manage your infrastructure. Easy enough to get running in minutes and fast enough to manage tens of thousands of servers (and still get a response back in seconds). Execute arbitrary shell commands or choose from dozens of pre-built modules of common (or complex) commands. Target individual servers or groups of servers based on name, defined roles, or a variety of system information such as hardware, software, operating system, current version, current environment, and many more.

Salt Tutorials
Usage Examples
Propaganda
Vagrant

Vagrant is a development tool which stands on the shoulders of giants, using tried and proven technologies to achieve its magic. Vagrant uses Oracle’s VirtualBox to create its virtual machines and then uses Chef or Puppet to provision them. By providing easy to configure, lightweight, reproducible, and portable virtual machines targeted at development environments, Vagrant helps maximize the productivity and flexibility of you and your team.

Vagrant Homepage: vagrant-homepage

Puppet

Puppet is IT automation software that helps system administrators manage infrastructure throughout its lifecycle, from provisioning and configuration to patch management and compliance. Using Puppet, you can easily automate repetitive tasks, quickly deploy critical applications, and proactively manage change, scaling from 10s of servers to 1000s, on-premise or in the cloud.

Capistrano

Capistrano is a utility and framework for executing commands in parallel on multiple remote machines, via SSH. It uses a simple DSL (borrowed in part from Rake) that allows you to define tasks, which may be applied to machines in certain roles. It also supports tunneling connections via some gateway machine to allow operations to be performed behind VPN’s and firewalls. Capistrano was originally designed to simplify and automate deployment of web applications to distributed environments, and originally came bundled with a set of tasks designed for deploying Rails applications.

Fabric

Fabric is a Python (2.5 or higher) library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks. It provides a basic suite of operations for executing local or remote shell commands (normally or via sudo) and uploading/downloading files, as well as auxiliary functionality such as prompting the running user for input, or aborting execution.

Fabric Homepage: fabric-homepage

Ubuntu’s juju
Glu

Glu takes a very declarative approach, in which you describe/model what you want, and glu can then:

  • compute the set of actions to deploy/upgrade your applications
  • ensure that it remains consistent over time
  • detect and alert you when there is a mismatch
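
The declarative idea in the list above (describe the desired state, then compute the actions needed to reach it) can be illustrated with a small, generic sketch. This is not glu's API or model format; `plan_actions` and the dictionaries mapping application names to versions are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str, str]  # (action, application, version)


def plan_actions(desired: Dict[str, str], actual: Dict[str, str]) -> List[Action]:
    """Compare the desired model with the observed state and list the differences."""
    actions: List[Action] = []

    for app, version in sorted(desired.items()):
        if app not in actual:
            actions.append(("deploy", app, version))
        elif actual[app] != version:
            actions.append(("upgrade", app, version))

    # Anything running that the model does not mention is a mismatch too.
    for app in sorted(actual.keys() - desired.keys()):
        actions.append(("undeploy", app, actual[app]))

    return actions


if __name__ == "__main__":
    desired = {"web": "2.1", "api": "1.4"}
    actual = {"web": "2.0", "cron": "0.9"}
    for action in plan_actions(desired, actual):
        print(action)
```

An empty plan means the running system matches the model; a non-empty plan is both the mismatch report and the list of steps that would restore consistency.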

Glu Homepage: glu-homepage

Ansible

Around the time I was looking at the alternatives, a new tool written in Python was launched: Ansible. I haven’t done a lot with it yet, but I really like what I’ve seen so far, and the design principles really resonate with me:

  • The easiest config management system to use, ever.
  • Requires no software to be installed on the remote box for bootstrapping.
  • Idempotent modules (although you can choose whether or not to have this for your own modules).

I think the author, Michael DeHaan, sums it up really well in this interview:

Chef
Nix

Nix Homepage: nix-homepage

  • Why Puppet/Chef/Ansible aren’t good enough: nix-vs-other
Gunnery

Gunnery is multipurpose task execution tool for distributed systems: gunnery-homepage

Rundeck

Rundeck Homepage: rundeck-homepage

Docker
Network
Scaling

Release Management

Best Practices
Articles
Naming Conventions
Versioning

Most common model:

  • X.0 for Major releases
  • X.X.0 for minor releases
  • X.X.X.0 for patches that don’t include functionality updates.
  • X.X.X.X for security and/or emergency patches.

Given a version number MAJOR.MINOR.PATCH, increment the:

  • MAJOR version when you make incompatible API changes,
  • MINOR version when you add functionality in a backwards-compatible manner, and
  • PATCH version when you make backwards-compatible bug fixes.

Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
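
A minimal sketch of the MAJOR.MINOR.PATCH increment rules follows. The helper names `parse_version` and `bump` are hypothetical, and pre-release and build metadata labels are deliberately not handled. Resetting the lower components to zero is not stated in the text above; it follows the Semantic Versioning specification.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into three integers."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return major, minor, patch


def bump(version: str, change: str) -> str:
    """Return the next version for a 'major', 'minor', or 'patch' change."""
    major, minor, patch = parse_version(version)
    if change == "major":   # incompatible API change
        return f"{major + 1}.0.0"
    if change == "minor":   # backwards-compatible functionality
        return f"{major}.{minor + 1}.0"
    if change == "patch":   # backwards-compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


if __name__ == "__main__":
    print(bump("1.4.2", "minor"))
```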

Other models:

Tools
Gitver

A very simple, lightweight, tag-based version string manager for git, written in Python.

Homepage: https://github.com/manuelbua/gitver

Deployment Management

Best Practices
  1. Deployment process is fully automated
  2. Rollback process is fully automated
  3. Deployment should be unnoticed by users (invisible upgrades)
Articles
Deployment Methodologies
Invisible Deployment
Torrent based Deployment

Torrent-based deployment is used by Facebook and Twitter and is becoming a standard for large-scale web sites. You can read more about it below:

Facebook and Torrent deployments:

Continuous Integration

Articles
Tools
Drone
Jenkins
Nix

Continuous Delivery

Tools
Articles
Youtube

DevOps

Best Practices
Propaganda
What is DevOps
DevOps and Security
DevOps and ITIL
  • DevOps and ITIL: Continuous Delivery doesn’t stop at software: devops-and-itil
  • How to Modify ITIL to Accommodate DevOps: itil-devops
DevOps and Culture
DevOps and Enterprise
Articles
DevOps and Robotics
DevOps and Microsoft
Containerization
  • Using Containers for Continuous Deployment: using-cointainers
  • Demonstration of realtime Linux container autoscaling: force12
LXC
  • Userspace tools for the Linux kernel containers: lxc-homepage
Break down the silos
DevOps History

Automation

TODO

See also

This topic overlaps with Infrastructure Orchestration chapter which you can find here: infrastructure-orchestration

Testing

Best Practices
  1. Fail constantly to verify if all the automated failover solutions are working properly.
Articles
Tools
Chaos Monkey

Motto: The best way to avoid failure is to fail constantly

Robot Framework
GWT

Monitoring

KPIs
Development KPIs
  • Lines of code per developer
  • Build test failures
  • Unit test failures
  • Number of bugs found in their code
  • Number of bugs fixed
  • Actual time to finish a task compared with the developer's own estimate
  • Number of developers and commits by organization, site or country (Bangalore, Brugge)
  • Number of revisions merged per contributor
  • Number of revisions abandoned per contributor
  • Number of revisions merged per organization, site, country
  • Number of revisions abandoned per organization, site, country
  • Ratios merged/abandoned
  • Number of new contributors with 1 / 2-5 / 6+ changes submitted in the past 3 months
  • Number of contributors who stopped contributing or whose contributions decreased continuously in the past 3 months
Gerrit KPIs
  • Number of Code review comments
  • Average time spent on Code Review
  • Number of commits reviewed in <2 days, <1 week, <1 month, <3 months, >3 months or unreviewed
  • Code Review queue size
  • How many new users registered (per day, per month, per year)
  • Average time for an accepted bug report between bug creation date and PATCH_TO_REVIEW status being set
  • Average time for an accepted bug report between PATCH_TO_REVIEW status being set and RESOLVED FIXED status being set.
  • Average time for an accepted bug report between bug creation date and the first comment by someone other than the reporter.
SCM Team KPIs
  • Time to set up an environment
  • Time from change request to release
  • Mean time to resolution
Deployment KPIs
  • Speed of deployment
  • Deployment success rate
  • How quickly service can be restored after a failed deployment
Articles
Tools
Statuspage
Nagios
Cacti

Cacti is a complete network graphing solution designed to harness the power of RRDTool’s data storage and graphing functionality.

Articles
Service Discovery

  • Service discovery in the cloud: http://jasonwilder.com/blog/2014/02/04/service-discovery-in-the-cloud/
  • Service discovery solutions: http://www.activestate.com/blog/2014/05/service-discovery-solutions

Books
Examples
Dashboards
Command Line Tools

CMDB

CMDB - Configuration Management DataBase

CMS - Configuration Management System (usually provides a web interface and is built on top of a CMDB)

Articles
Migrations

From ANY CMDB to Django CMDB

Django’s inspectdb management command generates models from an existing database, which lets you migrate from an old CMDB engine to a new Django-based one within an hour. The two examples below present the idea:

  • http://www.ianlewis.org/en/administer-wordpress-django-admin
  • http://labs.freehackers.org/projects/djangoredmineadmin

Tools
Ralph

Open source CMDB implementation with Hardware focus (Django + Python + Celery)

Home Page: https://github.com/allegro/ralph/blob/master/doc/cmdb.rst

Other SCM Resources

Books

Amazon
Software Configuration Management
  • Configuration Management Best Practices: Practical Methods that Work in the Real World: scm-best-practices
DevOps
CMDB
  • The CMDB Imperative: How to Realize the Dream and Avoid the Nightmares: cmdb
Continuous Delivery

Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation: continuous-delivery

Free

Websites

Video

Motivation

Youtube

Education

Certifications and Trainings

USA/Europe/Online
  • Software Configuration Management Professional (SCMP): scmp
  • Software Change, Configuration and Release Management: learningtree
  • iNTCCM Certified Professional for Configuration Management: intccm
  • The NDIA has a CDM certification course: ndia
  • CMPIC Configuration Management Training & Certification: cmpic
  • Software Engineering Institute - Configuration Management: sei
  • ITIL/ITSM - Configuration Management: itil-itsm
  • Institute of Configuration Management: icmhq
Poland
ITIL

Failures

Real life examples
General issues with poor SCM
  • Approved changes not incorporated
  • “Minor” changes cause major problems
  • Delayed or lag time in performance
  • Suppliers deliver non-compliant products
  • Unable to identify installed software versions
  • Delivered products do not meet performance requirements
  • Difficulty maintaining products
  • Like-models with different configurations
  • Constant changes without proper documentation
  • Changes do not match current product/system configuration
  • The latest version of engineering drawings cannot be found
  • The wrong versions of the configuration items are being baselined
  • A required change is implemented in the wrong version of the product
  • The wrong version of the product/source code was tested or deployed
  • The user is given documentation that doesn’t support the version of the product they’ve received
  • Cost increase in overall maintenance and operation
  • Unending blame over “who’s responsible for the failure” or who did not follow the procedures

Dictionary

  • KISS - Keep It Simple, Stupid

Events

Upcoming Events

Past Events

Surveys
2011
Reports
2012
2013
Recordings
2014

Python

Python

Propaganda

Popularity of programming language: https://sites.google.com/site/pydatalog/pypl/PyPL-PopularitY-of-Programming-Language

Python continues to grow slowly, and Perl continues its long decline. According to Google Trends, the number of searches for Perl is 19% of what it was in 2004. Source: http://www.drdobbs.com/jvm/the-rise-and-fall-of-languages-in-2012/240145800

Programming language popularity chart: http://langpop.corger.nl/

  • Python is Now the Most Popular Introductory Teaching Language at Top U.S. Universities: python-universities
Django

An Architecture for Django Templates: https://oncampus.oberlin.edu/webteam/2012/09/architecture-django-templates

Migrating Django projects with fixtures: http://kuttler.eu/post/django-db-utils-IntegrityError-duplicate-entry/

Making a specific Django app faster: http://reinout.vanrees.org/weblog/2014/05/06/making-faster.html

Understanding Test Driven Development with Django: http://arunrocks.com/understanding-tdd-with-django

Starting a Django 1.6 Project the Right Way: http://www.jeffknupp.com/blog/2013/12/18/starting-a-django-16-project-the-right-way/

Web development with Django and Python: http://www.slideshare.net/mpirnat/web-development-with-python-and-django

Complete single server Django stack tutorial: http://www.apreche.net/complete-single-server-django-stack-tutorial/

Django & Google App Engine: http://f.souza.cc/2010/08/flying-with-django-on-google-app-engine.html

Introduction to Django: http://gettingstartedwithdjango.com/en/lessons/introduction-and-launch/

Django + emails + Celery: http://www.cucumbertown.com/engineering/scheduling-morning-emails-with-django-and-celery/

Django mailviews: https://github.com/disqus/django-mailviews?utm_source=Python+Weekly+Newsletter&utm_campaign=c15c89a350-Python_Weekly_Issue_75_February_21_2013&utm_medium=email

Sphinx + GitHub: http://raxcloud.blogspot.de/2013/02/documenting-python-code-using-sphinx.html

Django Best Practices: http://lincolnloop.com/django-best-practices/

Fixing Database Connections in Django: http://craigkerstiens.com/2013/03/07/Fixing-django-db-connections

Deploying Django with Saltstack: http://www.barrymorrison.com/2013/Mar/11/deploying-django-with-salt-stack/

Effective Django: http://effectivedjango.com

Serving static files with Django: http://agiliq.com/blog/2013/03/serving-static-files-in-django/

Getting started with Django: http://gettingstartedwithdjango.com/

Overloading Django Form Fields: http://pydanny.com/overloading-form-fields.html

Creating PostgreSQL database: http://od-eon.com/blogs/calvin/postgresql-cheat-sheet-beginners/

Django and sending emails: http://ozkatz.github.io/getting-e-mail-right-with-django-and-ses.html

Django and sending emails2: http://stackful-dev.com/django-email-tricks-part-1.html

Django and Salt: https://github.com/wunki/django-salted

Testing: http://www.realpython.com/blog/python/testing-in-django-part-1-best-practices-and-examples/

Other

Linux

Linux

https://github.com/kahun/awesome-sysadmin

Usability

UX design for startups: http://uxpin.com/ux-design-for-startups.html

OS Architecture

Think OS is an introduction to Operating Systems for programmers: thinkos

Howtos/Tutorials
Virtualization
Performance

FAQ