Intro to Playbooks
About Playbooks
Playbooks are a completely different way to use Ansible than ad-hoc task execution mode, and are
particularly powerful.
Simply put, playbooks are the basis for a really simple configuration management and multi-machine deployment system,
unlike any that already exist, and one that is very well suited to deploying complex applications.
Playbooks can declare configurations, but they can also orchestrate steps of
any manual ordered process, even as different steps must bounce back and forth
between sets of machines in particular orders. They can launch tasks
synchronously or asynchronously.
While you might run the main /usr/bin/ansible
program for ad-hoc
tasks, playbooks are more likely to be kept in source control and used
to push out your configuration or assure the configurations of your
remote systems are in spec.
There are also some full sets of playbooks illustrating a lot of these techniques in the
ansible-examples repository. We'd recommend
looking at these in another tab as you go along.
There are also many jumping off points after you learn playbooks, so hop back to the documentation
index after you're done with this section.
Playbook Language Example
Playbooks are expressed in YAML format (see YAML Syntax) and have a minimum of syntax, which intentionally
tries to not be a programming language or script, but rather a model of a configuration or a process.
Each playbook is composed of one or more 'plays' in a list.
The goal of a play is to map a group of hosts to some well defined roles, represented by
things ansible calls tasks. At a basic level, a task is nothing more than a call
to an ansible module (see Working With Modules).
By composing a playbook of multiple 'plays', it is possible to
orchestrate multi-machine deployments, running certain steps on all
machines in the webservers group, then certain steps on the database
server group, then more commands back on the webservers group, etc.
The term "plays" is more or less a sports analogy. You can have quite a lot of plays that affect your systems
to do different things. It's not as if you were just defining one particular state or model, and you
can run different plays at different times.
For starters, here's a playbook that contains just one play:
---
- hosts: webservers
  vars:
    http_port: 80
    max_clients: 200
  remote_user: root
  tasks:
    - name: ensure apache is at the latest version
      yum:
        name: httpd
        state: latest
    - name: write the apache config file
      template:
        src: /srv/httpd.j2
        dest: /etc/httpd.conf
      notify:
        - restart apache
    - name: ensure apache is running
      service:
        name: httpd
        state: started
  handlers:
    - name: restart apache
      service:
        name: httpd
        state: restarted
Playbooks can contain multiple plays. You may have a playbook that targets first
the web servers, and then the database servers. For example:
---
- hosts: webservers
  remote_user: root
  tasks:
    - name: ensure apache is at the latest version
      yum:
        name: httpd
        state: latest
    - name: write the apache config file
      template:
        src: /srv/httpd.j2
        dest: /etc/httpd.conf

- hosts: databases
  remote_user: root
  tasks:
    - name: ensure postgresql is at the latest version
      yum:
        name: postgresql
        state: latest
    - name: ensure that postgresql is started
      service:
        name: postgresql
        state: started
You can use this method to switch between the host group you're targeting,
the username logging into the remote servers, whether to sudo or not, and so
forth. Plays, like tasks, run in the order specified in the playbook: top to
bottom.
Below, we'll break down what the various features of the playbook language are.
Basics
Hosts and Users
For each play in a playbook, you get to choose which machines in your infrastructure
to target and what remote user to complete the steps (called tasks) as.
The hosts
line is a list of one or more groups or host patterns,
separated by colons, as described in the Working with Patterns
documentation. The remote_user
is just the name of the user account:
---
- hosts: webservers
  remote_user: root
Note

The remote_user parameter was formerly called just user. It was renamed in Ansible 1.4 to make it more distinguishable from the user module (used to create users on remote systems).
Remote users can also be defined per task:
---
- hosts: webservers
  remote_user: root
  tasks:
    - name: test connection
      ping:
      remote_user: yourname
Support for running things as another user is also available (see Understanding Privilege Escalation):
---
- hosts: webservers
  remote_user: yourname
  become: yes
You can also use become on a particular task instead of the whole play:
---
- hosts: webservers
  remote_user: yourname
  tasks:
    - service:
        name: nginx
        state: started
      become: yes
      become_method: sudo
You can also login as you, and then become a user different than root:
---
- hosts: webservers
  remote_user: yourname
  become: yes
  become_user: postgres
You can also use other privilege escalation methods, like su:
---
- hosts: webservers
  remote_user: yourname
  become: yes
  become_method: su
If you need to specify a password for sudo, run ansible-playbook with --ask-become-pass or,
when using the old sudo syntax, --ask-sudo-pass (-K). If you run a become playbook and the
playbook seems to hang, it's probably stuck at the privilege escalation prompt.
Just Control-C to kill it and run it again adding the appropriate password.
Important
When using become_user
to a user other than root, the module
arguments are briefly written into a random tempfile in /tmp
.
These are deleted immediately after the command is executed. This
only occurs when changing privileges from a user like 'bob' to 'timmy',
not when going from 'bob' to 'root', or logging in directly as 'bob' or
'root'. If it concerns you that this data is briefly readable
(not writable), avoid transferring unencrypted passwords with
become_user set. In other cases, /tmp
is not used and this does
not come into play. Ansible also takes care to not log password
parameters.
You can also control the order in which hosts are run. The default is to follow the order supplied by the inventory:
- hosts: all
  order: sorted
  gather_facts: False
  tasks:
    - debug:
        var: inventory_hostname
Possible values for order are:
- inventory: The default. The order is 'as provided' by the inventory.
- reverse_inventory: As the name implies, this reverses the order 'as provided' by the inventory.
- sorted: Hosts are alphabetically sorted by name.
- reverse_sorted: Hosts are sorted by name in reverse alphabetical order.
- shuffle: Hosts are randomly ordered each run.
Tasks list
Each play contains a list of tasks. Tasks are executed in order, one
at a time, against all machines matched by the host pattern,
before moving on to the next task. It is important to understand that, within a play,
all hosts are going to get the same task directives. It is the purpose of a play to map
a selection of hosts to tasks.
When running the playbook, which runs top to bottom, hosts with failed tasks are
taken out of the rotation for the entire playbook. If things fail, simply correct the playbook file and rerun.
The goal of each task is to execute a module, with very specific arguments.
Variables, as mentioned above, can be used in arguments to modules.
Modules should be idempotent, that is, running a module multiple times
in a sequence should have the same effect as running it just once. One
way to achieve idempotency is to have a module check whether its desired
final state has already been achieved, and if that state has been achieved,
to exit without performing any actions. If all the modules a playbook uses
are idempotent, then the playbook itself is likely to be idempotent, so
re-running the playbook should be safe.
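As an illustration outside Ansible (plain Python, function name our own), an idempotent operation first checks whether the desired state already holds and only acts when it does not:

```python
import os

def ensure_directory(path):
    """Idempotently ensure that a directory exists.

    Returns True if a change was made, False if the desired state
    was already in place -- so a second run is a harmless no-op.
    """
    if os.path.isdir(path):
        return False  # desired state already achieved; do nothing
    os.makedirs(path)
    return True
```

Running it twice against the same path reports a change the first time and no change the second, which mirrors the "changed"/"ok" distinction Ansible modules report.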
The command and shell modules will typically rerun the same command again,
which is totally OK if the command is something like chmod or setsebool, etc.
However, a creates option is available which can be used to make these modules
idempotent as well.
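As a sketch (the script and marker-file paths here are hypothetical), creates tells the command module to skip the task when the named file already exists:

```yaml
tasks:
  - name: initialize the application (skipped once the marker file exists)
    command: /usr/local/bin/init-app.sh
    args:
      creates: /var/lib/myapp/.initialized
```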
Every task should have a name
, which is included in the output from
running the playbook. This is human readable output, and so it is
useful to provide good descriptions of each task step. If the name
is not provided though, the string fed to 'action' will be used for
output.
Tasks can be declared using the legacy action: module options
format, but
it is recommended that you use the more conventional module: options
format.
This recommended format is used throughout the documentation, but you may
encounter the older format in some playbooks.
Here is what a basic task looks like. As with most modules,
the service module takes key=value
arguments:
tasks:
  - name: make sure apache is running
    service:
      name: httpd
      state: started
The command and shell modules are the only modules that just take a list
of arguments and don't use the key=value
form. This makes
them work as simply as you would expect:
tasks:
  - name: enable selinux
    command: /sbin/setenforce 1
The command and shell modules care about return codes, so if you have a command
whose successful exit code is not zero, you may wish to do this:
tasks:
  - name: run this command and ignore the result
    shell: /usr/bin/somecommand || /bin/true
Or this:
tasks:
  - name: run this command and ignore the result
    shell: /usr/bin/somecommand
    ignore_errors: True
If the action line is getting too long for comfort you can break it on
a space and indent any continuation lines:
tasks:
  - name: Copy ansible inventory file to client
    copy: src=/etc/ansible/hosts dest=/etc/ansible/hosts
            owner=root group=root mode=0644
Variables can be used in action lines. Suppose you defined a variable called
vhost in the vars section; you could do this:
tasks:
  - name: create a virtual host file for {{ vhost }}
    template:
      src: somefile.j2
      dest: /etc/httpd/conf.d/{{ vhost }}
Those same variables are usable in templates, which we'll get to later.
Now in a very basic playbook all the tasks will be listed directly in that play, though it will usually
make more sense to break up tasks as described in Creating Reusable Playbooks.
Action Shorthand
Ansible prefers listing modules like this:
template:
  src: templates/foo.j2
  dest: /etc/foo.conf
Early versions of Ansible used the following format, which still works:
action: template src=templates/foo.j2 dest=/etc/foo.conf
Handlers: Running Operations On Change
As we've mentioned, modules should be idempotent and can relay when
they have made a change on the remote system. Playbooks recognize this and
have a basic event system that can be used to respond to change.
These 'notify' actions are triggered at the end of each block of tasks in a play, and will only be
triggered once even if notified by multiple different tasks.
For instance, multiple resources may indicate
that apache needs to be restarted because they have changed a config file,
but apache will only be bounced once to avoid unnecessary restarts.
Here's an example of restarting two services when the contents of a file
change, but only if the file changes:
- name: template configuration file
  template:
    src: template.j2
    dest: /etc/foo.conf
  notify:
    - restart memcached
    - restart apache
The things listed in the notify
section of a task are called
handlers.
Handlers are lists of tasks, not really any different from regular
tasks, that are referenced by a globally unique name, and are notified
by notifiers. If nothing notifies a handler, it will not
run. Regardless of how many tasks notify a handler, it will run only
once, after all of the tasks complete in a particular play.
Here's an example handlers section:
handlers:
  - name: restart memcached
    service:
      name: memcached
      state: restarted
  - name: restart apache
    service:
      name: apache
      state: restarted
As of Ansible 2.2, handlers can also "listen" to generic topics, and tasks can notify those topics as follows:
handlers:
  - name: restart memcached
    service:
      name: memcached
      state: restarted
    listen: "restart web services"
  - name: restart apache
    service:
      name: apache
      state: restarted
    listen: "restart web services"

tasks:
  - name: restart everything
    command: echo "this task will restart the web services"
    notify: "restart web services"
This use makes it much easier to trigger multiple handlers. It also decouples handlers from their names,
making it easier to share handlers among playbooks and roles (especially when using 3rd party roles from
a shared source like Galaxy).
Note
- Notify handlers are always run in the same order they are defined, not in the order listed in the notify statement. This is also the case for handlers using listen.
- Handler names and listen topics live in a global namespace.
- If two handler tasks have the same name, only one will run.
- You cannot notify a handler that is defined inside of an include, unless the include is static (this works as of Ansible 2.1).
Roles are described later on, but it's worthwhile to point out that:
- handlers notified within pre_tasks, tasks, and post_tasks sections are automatically flushed at the end of the section where they were notified;
- handlers notified within the roles section are automatically flushed at the end of the tasks section, but before any tasks handlers.
If you ever want to flush all the handler commands immediately you can do this:
tasks:
  - shell: some tasks go here
  - meta: flush_handlers
  - shell: some other tasks
In the above example any queued up handlers would be processed early when the meta
statement was reached. This is a bit of a niche case but can come in handy from
time to time.
Executing A Playbook
Now that you've learned playbook syntax, how do you run a playbook? It's simple.
Let's run a playbook using a parallelism level of 10:
ansible-playbook playbook.yml -f 10
Ansible-Pull
Should you want to invert the architecture of Ansible, so that nodes check in to a central location, instead
of pushing configuration out to them, you can.
ansible-pull is a small script that will check out a repo of configuration instructions from git, and then
run ansible-playbook against that content.
Assuming you load balance your checkout location, ansible-pull
scales essentially infinitely.
Run ansible-pull --help
for details.
There's also a clever playbook available to configure ansible-pull
via a crontab from push mode.
Tips and Tricks
To check the syntax of a playbook, use ansible-playbook
with the --syntax-check
flag. This will run the
playbook file through the parser to ensure its included files, roles, etc. have no syntax problems.
Look at the bottom of the playbook execution for a summary of the nodes that were targeted
and how they performed. General failures and fatal "unreachable" communication attempts are
kept separate in the counts.
If you ever want to see detailed output from successful modules as well as unsuccessful ones,
use the --verbose
flag. This is available in Ansible 0.5 and later.
To see what hosts would be affected by a playbook before you run it, you
can do this:
ansible-playbook playbook.yml --list-hosts
While automation exists to make it easier to make things repeatable, all systems are not exactly alike; some may require configuration that is slightly different from others. In some instances, the observed behavior or state of one system might influence how you configure other systems. For example, you might need to find out the IP address of a system and use it as a configuration value on another system.
Ansible uses variables to help deal with differences between systems.
To understand variables you'll also want to read Conditionals and Loops.
Useful things like the group_by module
and the when
conditional can also be used with variables, and to help manage differences between systems.
The ansible-examples github repository contains many examples of how variables are used in Ansible.
For advice on best practices, refer to Variables and Vaults in the Best Practices chapter.
Before we start using variables, it's important to know what valid variable names look like.
Variable names should consist of letters, numbers, and underscores, and should always start with a letter.
foo_port
is a great variable. foo5
is fine too.
foo-port
, foo port
, foo.port
and 12
are not valid variable names.
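The rule above can be sketched as a small Python check (the function name and the exact regex are our own, not part of Ansible):

```python
import re

# Letters, digits, and underscores; must start with a letter,
# per the naming rule described above.
_VALID_VAR_NAME = re.compile(r'^[a-zA-Z][a-zA-Z0-9_]*$')

def is_valid_var_name(name):
    """Return True if `name` follows the variable-naming rule above."""
    return bool(_VALID_VAR_NAME.match(name))
```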
YAML also supports dictionaries which map keys to values. For instance:
foo:
field1: one
field2: two
You can then reference a specific field in the dictionary using either bracket
notation or dot notation:

foo['field1']
foo.field1

These will both reference the same value ("one"). However, if you choose to
use dot notation be aware that some keys can cause problems because they
collide with attributes and methods of python dictionaries. You should use
bracket notation instead of dot notation if you use keys which start and end
with two underscores (Those are reserved for special meanings in python) or
are any of the known public attributes:
add, append, as_integer_ratio, bit_length, capitalize, center, clear, conjugate, copy, count, decode, denominator, difference, difference_update, discard, encode, endswith, expandtabs, extend, find, format, fromhex, fromkeys, get, has_key, hex, imag, index, insert, intersection, intersection_update, isalnum, isalpha, isdecimal, isdigit, isdisjoint, is_integer, islower, isnumeric, isspace, issubset, issuperset, istitle, isupper, items, iteritems, iterkeys, itervalues, join, keys, ljust, lower, lstrip, numerator, partition, pop, popitem, real, remove, replace, reverse, rfind, rindex, rjust, rpartition, rsplit, rstrip, setdefault, sort, split, splitlines, startswith, strip, swapcase, symmetric_difference, symmetric_difference_update, title, translate, union, update, upper, values, viewitems, viewkeys, viewvalues, zfill.
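The collision is easy to see in plain Python, whose attribute lookup is what Jinja2-style dot notation resolves to first:

```python
# A dictionary whose key collides with a built-in dict method.
foo = {"items": ["a", "b"], "field1": "one"}

# Bracket notation always returns the stored value:
by_key = foo["items"]            # the list ["a", "b"]

# Attribute lookup (what dot notation tries first) finds the
# dict.items method instead of the data under the "items" key:
by_attr = getattr(foo, "items")
```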
We've actually already covered a lot about variables in another section, so this shouldn't be terribly new,
but it makes for a bit of a refresher.
Often you'll want to set variables based on what groups a machine is in. For instance, maybe machines in Boston
want to use 'boston.ntp.example.com' as an NTP server.
See the Working with Inventory document for multiple ways on how to define variables in inventory.
In a playbook, it's possible to define variables directly inline like so:
- hosts: webservers
vars:
http_port: 80
This can be nice as it's right there when you are reading the playbook.
It turns out we've already talked about variables in another place too.
As described in Roles, variables can also be included in the playbook via include files, which may or may
not be part of an "Ansible Role". Usage of roles is preferred as it provides a nice organizational system.
It's nice enough to know about how to define variables, but how do you use them?
Ansible allows you to reference variables in your playbooks using the Jinja2 templating system. While you can do a lot of complex things in Jinja, only the basics are things you really need to learn at first.
For example, in a simple template, you can do something like:
My amp goes to {{ max_amp_value }}
And that will provide the most basic form of variable substitution.
This is also valid directly in playbooks, and you'll occasionally want to do things like:
template: src=foo.cfg.j2 dest={{ remote_install_path }}/foo.cfg
In the above example, we used a variable to help decide where to place a file.
Inside a template you automatically have access to all of the variables that are in scope for a host. Actually
it's more than that -- you can also read variables about other hosts. We'll show how to do that in a bit.
Note

Ansible allows Jinja2 loops and conditionals in templates, but in playbooks, we do not use them. Ansible
playbooks are pure machine-parseable YAML. This is a rather important feature as it means it is possible to code-generate
pieces of files, or to have other ecosystem tools read Ansible files. Not everyone will need this but it can unlock
possibilities.
Note
These are infrequently utilized features. Use them if they fit a use case you have, but this is optional knowledge.
Filters in Jinja2 are a way of transforming template expressions from one kind of data into another. Jinja2
ships with many of these. See builtin filters in the official Jinja2 template documentation.
In addition to those, Ansible supplies many more. See the Filters document
for a list of available filters and example usage guide.
YAML syntax requires that if you start a value with {{ foo }}
you quote the whole line, since it wants to be
sure you aren't trying to start a YAML dictionary. This is covered on the YAML Syntax documentation.
This won't work:
- hosts: app_servers
  vars:
    app_path: {{ base_path }}/22
Do it like this and you'll be fine:
- hosts: app_servers
  vars:
    app_path: "{{ base_path }}/22"
If you know you don't need any fact data about your hosts, and know everything about your systems centrally, you
can turn off fact gathering. This has advantages mainly in scaling Ansible in push mode with very large numbers of
systems, or if you are using Ansible on experimental platforms. In any play, just do this:

- hosts: whatever
  gather_facts: no
As discussed in the playbooks chapter, Ansible facts are a way of getting data about remote systems for use in playbook variables.
Usually these are discovered automatically by the setup
module in Ansible. Users can also write custom facts modules, as described in the API guide. However, what if you want to have a simple way to provide system or user provided data for use in Ansible variables, without writing a fact module?
"Facts.d" is one mechanism for users to control some aspect of how their systems are managed.
Note

Perhaps "local facts" is a bit of a misnomer; it means "locally supplied user values" as opposed to "centrally supplied user values", or what facts are -- "locally dynamically determined values".
If a remotely managed system has an /etc/ansible/facts.d directory, any files in this directory
ending in .fact (these can be JSON, INI, or executable files returning JSON) can supply local facts in Ansible.
An alternate directory can be specified using the fact_path play keyword.
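For instance (the directory path here is hypothetical), a play can point local-fact collection at another directory like so:

```yaml
- hosts: webservers
  fact_path: /opt/ansible/custom_facts.d
  tasks:
    - debug:
        var: ansible_local
```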
For example, assume /etc/ansible/facts.d/preferences.fact contains:

[general]
asdf=1
bar=2

This will produce a hash variable fact named general with asdf and bar as members.
To validate this, run the following:
ansible <hostname> -m setup -a "filter=ansible_local"
And you will see the following fact added:
"ansible_local": {
    "preferences": {
        "general": {
            "asdf" : "1",
            "bar" : "2"
        }
    }
}
And this data can be accessed in a template/playbook
as:
{{ ansible_local.preferences.general.asdf }}
The local namespace prevents any user supplied fact from overriding system facts or variables defined elsewhere in the playbook.
Note
The key part in the key=value pairs will be converted into lowercase inside the ansible_local variable. Using the example above, if the ini file contained XYZ=3
in the [general]
section, then you should expect to access it as: {{ ansible_local.preferences.general.xyz }}
and not {{ ansible_local.preferences.general.XYZ }}
. This is because Ansible uses Python's ConfigParser which passes all option names through the optionxform method and this method's default implementation converts option names to lower case.
If you have a playbook that is copying over a custom fact and then running it, making an explicit call to re-run the setup module
can allow that fact to be used during that particular play. Otherwise, it will be available in the next play that gathers fact information.
Here is an example of what that might look like:
- hosts: webservers
  tasks:
    - name: create directory for ansible custom facts
      file: state=directory recurse=yes path=/etc/ansible/facts.d
    - name: install custom ipmi fact
      copy: src=ipmi.fact dest=/etc/ansible/facts.d
    - name: re-read facts after adding custom fact
      setup: filter=ansible_local
In this pattern, however, you could also write a fact module, and may wish to consider this as an option.
To adapt playbook behavior to specific version of ansible, a variable ansible_version is available, with the following
structure:
"ansible_version": {
    "full": "2.0.0.2",
    "major": 2,
    "minor": 0,
    "revision": 0,
    "string": "2.0.0.2"
}
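A sketch of using this variable to gate a task on the Ansible version (the task itself is illustrative):

```yaml
tasks:
  - name: only run on Ansible 2.x or later
    debug:
      msg: "Running under Ansible {{ ansible_version.full }}"
    when: ansible_version.major >= 2
```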
As shown elsewhere in the docs, it is possible for one server to reference variables about another, like so:
{{ hostvars['asdf.example.com']['ansible_os_family'] }}
With "Fact Caching" disabled, in order to do this, Ansible must have already talked to 'asdf.example.com' in the
current play, or another play up higher in the playbook. This is the default configuration of ansible.
To avoid this, Ansible 1.8 allows the ability to save facts between playbook runs, but this feature must be manually
enabled. Why might this be useful?
With a very large infrastructure with thousands of hosts, fact caching could be configured to run nightly. Configuration of a small set of servers could run ad-hoc or periodically throughout the day. With fact caching enabled, it would
not be necessary to "hit" all servers to reference variables and information about them.
With fact caching enabled, it is possible for machines in one group to reference variables about machines in another group, despite the fact that they have not been communicated with in the current execution of /usr/bin/ansible-playbook.
To benefit from cached facts, you will want to change the gathering
setting to smart
or explicit
or set gather_facts
to False
in most plays.
Currently, Ansible ships with two persistent cache plugins: redis and jsonfile.
To configure fact caching using redis, enable it in ansible.cfg
as follows:
[defaults]
gathering = smart
fact_caching = redis
fact_caching_timeout = 86400
# seconds
To get redis up and running, perform the equivalent OS commands:
yum install redis
service redis start
pip install redis
Note that the Python redis library should be installed from pip; the version packaged in EPEL is too old for use by Ansible.
In current embodiments, this feature is in a beta-level state and the Redis plugin does not support port or password configuration; this is expected to change in the near future.
To configure fact caching using jsonfile, enable it in ansible.cfg
as follows:
[defaults]
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /path/to/cachedir
fact_caching_timeout = 86400
# seconds
fact_caching_connection
is a local filesystem path to a writeable
directory (ansible will attempt to create the directory if one does not exist).
fact_caching_timeout
is the number of seconds to cache the recorded facts.
Another major use of variables is running a command and saving its result into a variable. Results will vary from
module to module. Use of -v when executing playbooks will show possible values for the results.
The value of a task being executed in ansible can be saved in a variable and used later. See some examples of this in the
Conditionals chapter.
While it's mentioned elsewhere in that document too, here's a quick syntax example:
- hosts: web_servers
  tasks:
    - shell: /usr/bin/foo
      register: foo_result
      ignore_errors: True
    - shell: /usr/bin/bar
      when: foo_result.rc == 5
Registered variables are valid on the host for the remainder of the playbook run, which is the same as the lifetime of "facts"
in Ansible. Effectively, registered variables are just like facts.
When using register
with a loop, the data structure placed in the variable during the loop will contain a results
attribute, that is a list of all responses from the module. For a more in-depth example of how this works, see the Loops section on using register with a loop.
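A minimal sketch of that shape (the command and variable names are illustrative):

```yaml
tasks:
  - name: run a command once per item
    command: echo {{ item }}
    with_items:
      - one
      - two
    register: echo_out

  - name: each per-item response lives under the results attribute
    debug:
      msg: "{{ item.stdout }}"
    with_items: "{{ echo_out.results }}"
```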
Note

If a task fails or is skipped, the variable is still registered with a failed or skipped status; the only way to avoid registering a variable is by using tags.
We already described facts a little higher up in the documentation.
Some provided facts, like networking information, are made available as nested data structures. To access
them a simple {{ foo }}
is not sufficient, but it is still easy to do. Here's how we get an IP address:
{{ ansible_eth0["ipv4"]["address"] }}
OR alternatively:
{{ ansible_eth0.ipv4.address }}
Similarly, this is how we access the first element of an array:

{{ foo[0] }}
Even if you didn't define them yourself, Ansible provides a few variables for you automatically.
The most important of these are hostvars
, group_names
, and groups
. Users should not use
these names themselves as they are reserved. environment
is also reserved.
hostvars
lets you ask about the variables of another host, including facts that have been gathered
about that host. If, at this point, you haven't talked to that host yet in any play in the playbook
or set of playbooks, you can still get the variables, but you will not be able to see the facts.
If your database server wants to use the value of a 'fact' from another node, or an inventory variable
assigned to another node, it's easy to do so within a template or even an action line:
{{ hostvars['test.example.com']['ansible_distribution'] }}
Additionally, group_names is a list (array) of all the groups the current host is in. This can be used in templates using Jinja2 syntax to make template source files that vary based on the group membership (or role) of the host:
{% if 'webserver' in group_names %}
# some part of a configuration file that only applies to webservers
{% endif %}
groups
is a list of all the groups (and hosts) in the inventory. This can be used to enumerate all hosts within a group.
For example:
{% for host in groups['app_servers'] %}
# something that applies to all app servers.
{% endfor %}
A frequently used idiom is walking a group to find all IP addresses in that group
{% for host in groups['app_servers'] %}
{{ hostvars[host]['ansible_eth0']['ipv4']['address'] }}
{% endfor %}
An example of this could include pointing a frontend proxy server to all of the app servers, setting up the correct firewall rules between servers, etc.
You need to make sure that the facts of those hosts have been populated before though, for example by running a play against them if the facts have not been cached recently (fact caching was added in Ansible 1.8).
Additionally, inventory_hostname
is the name of the hostname as configured in Ansible's inventory host file. This can
be useful for when you don't want to rely on the discovered hostname ansible_hostname
or for other mysterious
reasons. If you have a long FQDN, inventory_hostname_short
also contains the part up to the first
period, without the rest of the domain.
play_hosts was deprecated in 2.2; it was the same as the new ansible_play_batch variable.
ansible_play_hosts
is the full list of all hosts still active in the current play.
ansible_play_batch
is available as a list of hostnames that are in scope for the current 'batch' of the play. The batch size is defined by serial
, when not set it is equivalent to the whole play (making it the same as ansible_play_hosts
).
ansible_playbook_python
is the path to the python executable used to invoke the Ansible command line tool.
These vars may be useful for filling out templates with multiple hostnames or for injecting the list into the rules for a load balancer.
Don't worry about any of this unless you think you need it. You'll know when you do.
Also available, inventory_dir is the pathname of the directory holding Ansible's inventory host file, and inventory_file is the pathname of, and the filename pointing to, Ansible's inventory host file.
playbook_dir
contains the playbook base directory.
We then have role_path
which will return the current role's pathname (since 1.8). This will only work inside a role.
And finally, ansible_check_mode
(added in version 2.1), a boolean magic variable which will be set to True
if you run Ansible with --check
.
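A sketch of using this variable to skip a step during check runs (the command is illustrative):

```yaml
tasks:
  - name: skip this step when running with --check
    command: /usr/bin/do-something
    when: not ansible_check_mode
```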
It's a great idea to keep your playbooks under source control, but
you may wish to make the playbook source public while keeping certain
important variables private. Similarly, sometimes you may just
want to keep certain information in different files, away from
the main playbook.
You can do this by using an external variables file, or files, just like this:
---
- hosts: all
  remote_user: root
  vars:
    favcolor: blue
  vars_files:
    - /vars/external_vars.yml
  tasks:
    - name: this is just a placeholder
      command: /bin/echo foo
This removes the risk of sharing sensitive data with others when
sharing your playbook source with them.
The contents of each variables file is a simple YAML dictionary, like this:
---
# in the above example, this would be vars/external_vars.yml
somevar: somevalue
password: magic
In addition to vars_prompt
and vars_files
, it is possible to set variables at the
command line using the --extra-vars
(or -e
) argument. Variables can be defined using
a single quoted string (containing one or more variables) using one of the formats below.
key=value format:
ansible-playbook release.yml --extra-vars "version=1.23.45 other_variable=foo"
Note
Values passed in using the key=value
syntax are interpreted as strings.
Use the JSON format if you need to pass in anything that shouldn't be a string (Booleans, integers, floats, lists etc).
JSON string format:
ansible-playbook release.yml --extra-vars '{"version":"1.23.45","other_variable":"foo"}'
ansible-playbook arcade.yml --extra-vars '{"pacman":"mrs","ghosts":["inky","pinky","clyde","sue"]}'
YAML string format:
ansible-playbook release.yml --extra-vars '
version: "1.23.45"
other_variable: foo'
ansible-playbook arcade.yml --extra-vars '
pacman: mrs
ghosts:
- inky
- pinky
- clyde
- sue'
vars from a JSON or YAML file:
ansible-playbook release.yml --extra-vars "@some_file.json"
This is useful for, among other things, setting the hosts group or the user for the playbook.
Escaping quotes and other special characters:
Ensure you're escaping quotes appropriately for both your markup (e.g. JSON) and for
the shell you're operating in:
ansible-playbook arcade.yml --extra-vars "{\"name\":\"Conan O\'Brien\"}"
ansible-playbook arcade.yml --extra-vars '{"name":"Conan O'\\\''Brien"}'
ansible-playbook script.yml --extra-vars "{\"dialog\":\"He said \\\"I just can\'t get enough of those single and double-quotes\\\"\"}"
In these cases, it's probably best to use a JSON or YAML file containing the variable
definitions.
A lot of folks may ask about how one variable overrides another. Ultimately it's Ansible's philosophy that it's better
you know where to put a variable, and then you have to think about it a lot less.
Avoid defining the variable "x" in 47 places and then asking the question "which x gets used?".
Why? Because that's not Ansible's Zen philosophy of doing things.
There is only one Empire State Building. One Mona Lisa, etc. Figure out where to define a variable, and don't make
it complicated.
However, let's go ahead and get precedence out of the way! It exists. It's a real thing, and you might have
a use for it.
If multiple variables of the same name are defined in different places, they get overwritten in a certain order.
Here is the order of precedence from least to greatest (the last listed variables winning prioritization):
- role defaults
- inventory file or script group vars
- inventory group_vars/all
- playbook group_vars/all
- inventory group_vars/*
- playbook group_vars/*
- inventory file or script host vars
- inventory host_vars/*
- playbook host_vars/*
- host facts / cached set_facts
- play vars
- play vars_prompt
- play vars_files
- role vars (defined in role/vars/main.yml)
- block vars (only for tasks in block)
- task vars (only for the task)
- include_vars
- set_facts / registered vars
- role (and include_role) params
- include params
- extra vars (always win precedence)
Basically, anything that goes into "role defaults" (the defaults folder inside the role) is the most malleable and easily overridden. Anything in the vars directory of the role overrides previous versions of that variable in the namespace. The idea to follow is that the more explicit you get in scope, the more precedence it takes, with command line -e
extra vars always winning. Host and/or inventory variables can win over role defaults, but not explicit includes like the vars directory or an include_vars
task.
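As a minimal sketch of that layering (file names and values here are illustrative, not from the official docs):

```yaml
# roles/x/defaults/main.yml -- lowest precedence, easily overridden
http_port: 80

# group_vars/all -- an inventory variable; beats the role default
http_port: 8080

# command line: ansible-playbook site.yml -e "http_port=9090"
# -e extra vars always win, so tasks in this run see http_port == 9090
```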
Footnotes
Note
Within any section, redefining a var will overwrite the previous instance.
If multiple groups have the same variable, the last one loaded wins.
If you define a variable twice in a play's vars:
section, the second one wins.
Note
The previous describes the default config hash_behaviour=replace
; switch to merge
to only partially overwrite.
Note
Group loading follows parent/child relationships. Groups of the same parent/child level are then merged following alphabetical order.
This last one can be superseded by the user via ansible_group_priority
, which defaults to 1
for all groups.
Another important thing to consider (for all versions) is that connection variables override config, command line and play/role/task specific options and keywords. For example:
ansible -u lola myhost
This will still connect as ramon
because ansible_ssh_user
is set to ramon
in inventory for myhost.
For plays/tasks this is also true for remote_user
:
- hosts: myhost
  tasks:
    - command: i'll connect as ramon still
      remote_user: lola
This is done so host-specific settings can override the general settings. These variables are normally defined per host or group in inventory,
but they behave like other variables. If you want to override the remote user globally (even over inventory) you can use extra vars:
ansible... -e "ansible_user=<user>"
You can also override as a normal variable in a play:
- hosts: all
  vars:
    ansible_user: lola
  tasks:
    - command: i'll connect as lola!
Ansible has three main scopes:
- Global: this is set by config, environment variables and the command line
- Play: each play and contained structures, vars entries (vars; vars_files; vars_prompt), role defaults and vars.
- Host: variables directly associated to a host, like inventory, include_vars, facts or registered task outputs
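A hypothetical sketch of where each scope shows up in practice (the variable names are just examples):

```yaml
# Global scope: set by config, environment variables, or the command line, e.g.
#   ANSIBLE_FORKS=10 ansible-playbook site.yml -e "version=1.23.45"
- hosts: all
  vars:
    play_scoped_var: yes      # Play scope: visible to tasks in this play
  tasks:
    - set_fact:
        host_scoped_var: yes  # Host scope: attached to each individual host
```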
Let's show some examples and where you would choose to put what based on the kind of control you might want over values.
First off, group variables are powerful.
Site wide defaults should be defined as a group_vars/all
setting. Group variables are generally placed alongside
your inventory file. They can also be returned by a dynamic inventory script (see Working With Dynamic Inventory) or defined
in things like Ansible Tower from the UI or API:
---
# file: /etc/ansible/group_vars/all
# this is the site wide default
ntp_server: default-time.example.com
Regional information might be defined in a group_vars/region
variable. If this group is a child of the all
group (which it is, because all groups are), it will override the group that is higher up and more general:
---
# file: /etc/ansible/group_vars/boston
ntp_server: boston-time.example.com
If for some crazy reason we wanted to tell just a specific host to use a specific NTP server, it would then override the group variable!:
---
# file: /etc/ansible/host_vars/xyz.boston.example.com
ntp_server: override.example.com
So that covers inventory and what you would normally set there. It's a great place for things that deal with geography or behavior. Since groups are frequently the entity that maps roles onto hosts, it is sometimes a shortcut to set variables on the group instead of defining them on a role. You could go either way.
Remember: Child groups override parent groups, and hosts always override their groups.
Next up: learning about role variable precedence.
We'll pretty much assume you are using roles at this point. You should be using roles for sure. Roles are great. You are using
roles aren't you? Hint hint.
If you are writing a redistributable role with reasonable defaults, put those in the roles/x/defaults/main.yml
file. This means
the role will bring along a default value but ANYTHING in Ansible will override it.
See Roles for more info about this:
---
# file: roles/x/defaults/main.yml
# if not overridden in inventory or as a parameter, this is the value that will be used
http_port: 80
If you are writing a role and want to ensure the value in the role is absolutely used in that role, and is not going to be overridden
by inventory, you should put it in roles/x/vars/main.yml
like so, and inventory values cannot override it. -e
, however, still will:
---
# file: roles/x/vars/main.yml
# this will absolutely be used in this role
http_port: 80
This is one way to plug in constants about the role that are always true. If you are not sharing your role with others,
app-specific behaviors like ports are fine to put in here. But if you are sharing roles with others, putting variables in here might
be bad. Nobody will be able to override them with inventory, but they still can by passing a parameter to the role.
Parameterized roles are useful.
If you are using a role and want to override a default, pass it as a parameter to the role like so:
roles:
  - role: apache
    vars:
      http_port: 8080
This makes it clear to the playbook reader that you've made a conscious choice to override some default in the role, or pass in some
configuration that the role can't assume by itself. It also allows you to pass something site-specific that isn't really part of the
role you are sharing with others.
This can often be used for things that might apply to some hosts multiple times. For example:
roles:
  - role: app_user
    vars:
      myname: Ian
  - role: app_user
    vars:
      myname: Terry
  - role: app_user
    vars:
      myname: Graham
  - role: app_user
    vars:
      myname: John
In this example, the same role was invoked multiple times. It's quite likely there was
no default for myname
supplied at all. Ansible can warn you when variables aren't defined -- it's the default behavior in fact.
There are a few other things that go on with roles.
Generally speaking, variables set in one role are available to others. This means if you have a roles/common/vars/main.yml
you
can set variables in there and make use of them in other roles and elsewhere in your playbook:
roles:
  - role: common_settings
  - role: something
    vars:
      foo: 12
  - role: something_else
Note
There are some protections in place to avoid the need to namespace variables.
In the above, variables defined in common_settings are most definitely available to 'something' and 'something_else' tasks, but
'something' is guaranteed to have foo set to 12, even if somewhere deep in common_settings it set foo to 20.
So, that's precedence, explained in a more direct way. Don't worry about precedence; just think about whether your role is defining a
variable that is a default, or a "live" variable you definitely want to use. Inventory lies right in the middle of the precedence order, and
if you want to forcibly override something, use -e
.
If you found that a little hard to understand, take a look at the ansible-examples repo on our github for a bit more about
how all of these things can work together.
For information about advanced YAML syntax used to declare variables and have more control over the data placed in YAML files used by Ansible, see Advanced Syntax.
Often the result of a play may depend on the value of a variable, fact (something learned about the remote system), or previous task result.
In some cases, the values of variables may depend on other variables.
Additional groups can be created to manage hosts based on whether the hosts match other criteria. This topic covers how conditionals are used in playbooks.
Sometimes you will want to skip a particular step on a particular host.
This could be something as simple as not installing a certain package if the operating system is a particular version,
or it could be something like performing some cleanup steps if a filesystem is getting full.
This is easy to do in Ansible with the when clause, which contains a raw Jinja2 expression without double curly braces (see Variables).
It's actually pretty simple:
tasks:
  - name: "shut down Debian flavored systems"
    command: /sbin/shutdown -t now
    when: ansible_os_family == "Debian"
    # note that Ansible facts and vars like ansible_os_family can be used
    # directly in conditionals without double curly braces
You can also use parentheses to group conditions:
tasks:
  - name: "shut down CentOS 6 and Debian 7 systems"
    command: /sbin/shutdown -t now
    when: (ansible_distribution == "CentOS" and ansible_distribution_major_version == "6") or
          (ansible_distribution == "Debian" and ansible_distribution_major_version == "7")
Multiple conditions that all need to be true (a logical 'and') can also be specified as a list:
tasks:
  - name: "shut down CentOS 6 systems"
    command: /sbin/shutdown -t now
    when:
      - ansible_distribution == "CentOS"
      - ansible_distribution_major_version == "6"
A number of Jinja2 "filters" can also be used in when statements, some of which are unique
and provided by Ansible. Suppose we want to ignore the error of one statement and then
decide to do something conditionally based on success or failure:
tasks:
  - command: /bin/false
    register: result
    ignore_errors: True

  - command: /bin/something
    when: result is failed

  # In older versions of ansible use ``success``, now both are valid but succeeded uses the correct tense.
  - command: /bin/something_else
    when: result is succeeded

  - command: /bin/still/something_else
    when: result is skipped
Note
Both success and succeeded work (and similarly fail/failed, etc.).
As a reminder, to see what facts are available on a particular system, you can do the following:
ansible hostname.example.com -m setup
Tip: Sometimes you'll get back a variable that's a string and you'll want to do a math operation comparison on it. You can do this like so:
tasks:
  - shell: echo "only on Red Hat 6, derivatives, and later"
    when: ansible_os_family == "RedHat" and ansible_lsb.major_release|int >= 6
Note
The above example requires the lsb_release package on the target host in order to return the ansible_lsb.major_release fact.
Variables defined in the playbooks or inventory can also be used. An example may be the execution of a task based on a variable's boolean value:
vars:
  epic: true
Then a conditional execution might look like:
tasks:
  - shell: echo "This certainly is epic!"
    when: epic
or:
tasks:
  - shell: echo "This certainly isn't epic!"
    when: not epic
If a required variable has not been set, you can skip or fail using Jinja2's defined test. For example:
tasks:
  - shell: echo "I've got '{{ foo }}' and am not afraid to use it!"
    when: foo is defined

  - fail: msg="Bailing out. this play requires 'bar'"
    when: bar is undefined
This is especially useful in combination with the conditional import of vars files (see below).
As the examples show, you don't need to use {{ }} to use variables inside conditionals, as these are already implied.
When combining when with loops (see Loops), be aware that the when statement is processed separately for each item. This is by design:
tasks:
  - command: echo {{ item }}
    loop: [ 0, 2, 4, 6, 8, 10 ]
    when: item > 5
If you need to skip the whole task depending on the loop variable being defined, use the |default filter to provide an empty iterator:
- command: echo {{ item }}
  loop: "{{ mylist|default([]) }}"
  when: item > 5
If using a dict in a loop:
- command: echo {{ item.key }}
  loop: "{{ query('dict', mydict|default({})) }}"
  when: item.value > 5
It's also easy to provide your own facts if you want, which is covered in Developing Modules. To run them, just
make a call to your own custom fact gathering module at the top of your list of tasks, and variables returned
there will be accessible to future tasks:
tasks:
  - name: gather site specific fact data
    action: site_facts

  - command: /usr/bin/thingy
    when: my_custom_fact_just_retrieved_from_the_remote_system == '1234'
Note that if you have several tasks that all share the same conditional statement, you can affix the conditional
to a task include statement as below. All the tasks get evaluated, but the conditional is applied to each and every task:
- import_tasks: tasks/sometasks.yml
  when: "'reticulating splines' in output"
Note
In versions prior to 2.0 this worked with task includes but not playbook includes. 2.0 allows it to work with both.
Or with a role:
- hosts: webservers
  roles:
    - role: debian_stock_config
      when: ansible_os_family == 'Debian'
You will note a lot of 'skipped' output by default in Ansible when using this approach on systems that don't match the criteria.
Read up on the 'group_by' module in the Working With Modules docs for a more streamlined way to accomplish the same thing.
When used with include_* tasks instead of imports, the conditional is applied _only_ to the include task itself and not any other
tasks within the included file(s). A common situation where this distinction is important is as follows:
# include a file to define a variable when it is not already defined

# main.yml
- include_tasks: other_tasks.yml
  when: x is not defined

# other_tasks.yml
- set_fact:
    x: foo
- debug:
    var: x
In the above example, if import_tasks
had been used instead both included tasks would have also been skipped. With include_tasks
instead, the tasks are executed as expected because the conditional is not applied to them.
Note
This is an advanced topic that is infrequently used.
Sometimes you will want to do certain things differently in a playbook based on certain criteria.
Having one playbook that works on multiple platforms and OS versions is a good example.
As an example, the name of the Apache package may be different between CentOS and Debian,
but it is easily handled with a minimum of syntax in an Ansible Playbook:
---
- hosts: all
  remote_user: root
  vars_files:
    - "vars/common.yml"
    - [ "vars/{{ ansible_os_family }}.yml", "vars/os_defaults.yml" ]
  tasks:
    - name: make sure apache is started
      service: name={{ apache }} state=started
Note
The variable 'ansible_os_family' is being interpolated into
the list of filenames being defined for vars_files.
As a reminder, the various YAML files contain just keys and values:
---
# for vars/RedHat.yml
apache: httpd
somethingelse: 42
How does this work? For Red Hat operating systems ('CentOS', for example), the first file Ansible tries to import
is 'vars/RedHat.yml'. If that file does not exist, Ansible attempts to load 'vars/os_defaults.yml'. If no files in
the list were found, an error is raised.
On Debian, Ansible first looks for 'vars/Debian.yml' instead of 'vars/RedHat.yml', before
falling back on 'vars/os_defaults.yml'.
Ansible's approach to configuration -- separating variables from tasks, keeping your playbooks
from turning into arbitrary code with nested conditionals -- results in more streamlined and auditable configuration rules because there are fewer decision points to track.
Note
This is an advanced topic that is infrequently used. You can probably skip this section.
Sometimes a configuration file you want to copy, or a template you will use may depend on a variable.
The following construct selects the first available file appropriate for the variables of a given host, which is often much cleaner than putting a lot of if conditionals in a template.
The following example shows how to template out a configuration file that was very different between, say, CentOS and Debian:
- name: template a file
  template:
    src: "{{ item }}"
    dest: /etc/myapp/foo.conf
  loop: "{{ query('first_found', { 'files': myfiles, 'paths': mypaths }) }}"
  vars:
    myfiles:
      - "{{ ansible_distribution }}.conf"
      - default.conf
    mypaths: ['search_location_one/somedir/', '/opt/other_location/somedir/']
Often in a playbook it may be useful to store the result of a given command in a variable and access
it later. Using the command module in this way can often eliminate the need to write site specific facts; for
instance, you could test for the existence of a particular program.
The 'register' keyword decides what variable to save a result in. The resulting variables can be used in templates, action lines, or when statements. It looks like this (in an obviously trivial example):
- name: test play
  hosts: all
  tasks:
    - shell: cat /etc/motd
      register: motd_contents

    - shell: echo "motd contains the word hi"
      when: motd_contents.stdout.find('hi') != -1
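The "test for the existence of a particular program" idea mentioned above could be sketched like this (the program name is just an example):

```yaml
- name: check whether tar is installed
  command: which tar
  register: tar_check
  ignore_errors: True

- name: only runs when tar was found
  debug:
    msg: "tar is available at {{ tar_check.stdout }}"
  when: tar_check.rc == 0
```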
As shown previously, the registered variable's string contents are accessible with the 'stdout' value.
The registered result can be used in the loop of a task if it is converted into
a list (or already is a list) as shown below. "stdout_lines" is already available on the object,
though you could also call "home_dirs.stdout.split()" if you wanted to, and could split by other
fields:
- name: registered variable usage as a loop list
  hosts: all
  tasks:
    - name: retrieve the list of home directories
      command: ls /home
      register: home_dirs

    - name: add home dirs to the backup spooler
      file:
        path: /mnt/bkspool/{{ item }}
        src: /home/{{ item }}
        state: link
      loop: "{{ home_dirs.stdout_lines }}"
      # same as loop: "{{ home_dirs.stdout.split() }}"
You may check the registered variable's string contents for emptiness:
- name: check registered variable for emptiness
  hosts: all
  tasks:
    - name: list contents of directory
      command: ls mydir
      register: contents

    - name: check contents for emptiness
      debug:
        msg: "Directory is empty"
      when: contents.stdout == ""
The following Facts are frequently used in Conditionals - see above for examples.
ansible_distribution
Possible values:
Alpine
Altlinux
Amazon
Archlinux
ClearLinux
Coreos
Debian
Fedora
Gentoo
Mandriva
NA
OpenWrt
OracleLinux
RedHat
Slackware
SMGL
SUSE
VMwareESX
ansible_os_family
Possible values:
AIX
Alpine
Altlinux
Archlinux
Darwin
Debian
FreeBSD
Gentoo
HP-UX
Mandrake
RedHat
SGML
Slackware
Solaris
Suse
Often you'll want to do many things in one task, such as create a lot of users, install a lot of packages, or
repeat a polling step until a certain result is reached.
This chapter is all about how to use loops in playbooks.
To save some typing, repeated tasks can be written in short-hand like so:
- name: add several users
  user:
    name: "{{ item }}"
    state: present
    groups: "wheel"
  loop:
    - testuser1
    - testuser2
If you have defined a YAML list in a variables file, or the 'vars' section, you can also do:
loop: "{{ somelist }}"
The above would be the equivalent of:
- name: add user testuser1
  user:
    name: "testuser1"
    state: present
    groups: "wheel"

- name: add user testuser2
  user:
    name: "testuser2"
    state: present
    groups: "wheel"
Note
Before 2.5 Ansible mainly used the with_<lookup>
keywords to create loops; the loop keyword is basically analogous to with_list
.
Some modules, like yum and apt, can take lists directly in their options; this is more optimal than looping over the task.
See each action's documentation for details; for now, here is an example:
- name: optimal yum
  yum:
    name: "{{ list_of_packages }}"
    state: present

- name: non optimal yum, not only slower but might cause issues with interdependencies
  yum:
    name: "{{ item }}"
    state: present
  loop: "{{ list_of_packages }}"
Note that the types of items you iterate over do not have to be simple lists of strings.
If you have a list of hashes, you can reference subkeys using things like:
- name: add several users
  user:
    name: "{{ item.name }}"
    state: present
    groups: "{{ item.groups }}"
  loop:
    - { name: 'testuser1', groups: 'wheel' }
    - { name: 'testuser2', groups: 'root' }
Also be aware that when combining Conditionals with a loop, the when:
statement is processed separately for each item.
See The When Statement for an example.
To loop over a dict, use the dict2items
filter:
- name: create a tag dictionary of non-empty tags
  set_fact:
    tags_dict: "{{ (tags_dict|default({}))|combine({item.key: item.value}) }}"
  loop: "{{ tags|dict2items }}"
  vars:
    tags:
      Environment: dev
      Application: payment
      Another: "{{ doesnotexist|default() }}"
  when: item.value != ""
Here, we don't want to set empty tags, so we create a dictionary containing only non-empty tags.
Sometimes you need more than what a simple list provides; you can use Jinja2 expressions to create complex lists.
For example, using the product filter, you can combine lists:
- name: give users access to multiple databases
  mysql_user:
    name: "{{ item[0] }}"
    priv: "{{ item[1] }}.*:ALL"
    append_privs: yes
    password: "foo"
  loop: "{{ ['alice', 'bob'] |product(['clientdb', 'employeedb', 'providerdb'])|list }}"
Note
with_
loops are actually a combination of with_
+ lookup()
; even items
is a lookup. loop
can be used in the same way as shown above.
In Ansible 2.5 a new Jinja2 function named query was introduced, which offers several benefits over lookup
when using the new loop
keyword.
This is described more in the lookup documentation; however, query
provides a simpler interface and a more predictable output from lookup plugins, ensuring better compatibility with loop
.
In certain situations the lookup
function may not return a list, which loop
requires.
The following invocations are equivalent, using wantlist=True
with lookup
to ensure a return type of a list:
loop: "{{ query('inventory_hostnames', 'all') }}"
loop: "{{ lookup('inventory_hostnames', 'all', wantlist=True) }}"
Sometimes you would want to retry a task until a certain condition is met. Here's an example:
- shell: /usr/bin/foo
  register: result
  until: result.stdout.find("all systems go") != -1
  retries: 5
  delay: 10
The above example runs the shell module repeatedly until the module's result has "all systems go" in its stdout, or the task has
been retried 5 times with a delay of 10 seconds. The default value for "retries" is 3 and "delay" is 5.
The task returns the results of the last run. The results of individual retries can be viewed with the -vv option.
The registered variable will also have a new key "attempts", which will have the number of retries for the task.
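The "attempts" key can then be inspected like any other part of the registered result; a hypothetical follow-up task:

```yaml
- debug:
    msg: "succeeded after {{ result.attempts }} attempt(s)"
```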
Note
If the until
parameter isn't defined, the value for the retries
parameter is forced to 1.
After using register
with a loop, the data structure placed in the variable will contain a results
attribute that is a list of all responses from the module.
Here is an example of using register
with loop
:
- shell: "echo {{ item }}"
  loop:
    - "one"
    - "two"
  register: echo
This differs from the data structure returned when using register
without a loop:
{
    "changed": true,
    "msg": "All items completed",
    "results": [
        {
            "changed": true,
            "cmd": "echo \"one\" ",
            "delta": "0:00:00.003110",
            "end": "2013-12-19 12:00:05.187153",
            "invocation": {
                "module_args": "echo \"one\"",
                "module_name": "shell"
            },
            "item": "one",
            "rc": 0,
            "start": "2013-12-19 12:00:05.184043",
            "stderr": "",
            "stdout": "one"
        },
        {
            "changed": true,
            "cmd": "echo \"two\" ",
            "delta": "0:00:00.002920",
            "end": "2013-12-19 12:00:05.245502",
            "invocation": {
                "module_args": "echo \"two\"",
                "module_name": "shell"
            },
            "item": "two",
            "rc": 0,
            "start": "2013-12-19 12:00:05.242582",
            "stderr": "",
            "stdout": "two"
        }
    ]
}
Subsequent loops over the registered variable to inspect the results may look like:
- name: Fail if return code is not 0
  fail:
    msg: "The command ({{ item.cmd }}) did not have a 0 return code"
  when: item.rc != 0
  loop: "{{ echo.results }}"
During iteration, the result of the current item will be placed in the variable:
- shell: echo "{{ item }}"
  loop:
    - one
    - two
  register: echo
  changed_when: echo.stdout != "one"
If you wish to loop over the inventory, or just a subset of it, there are multiple ways.
One can use a regular loop
with the ansible_play_batch
or groups
variables, like this:
# show all the hosts in the inventory
- debug:
    msg: "{{ item }}"
  loop: "{{ groups['all'] }}"

# show all the hosts in the current play
- debug:
    msg: "{{ item }}"
  loop: "{{ ansible_play_batch }}"
There is also a specific lookup plugin inventory_hostnames
that can be used like this:
# show all the hosts in the inventory
- debug:
    msg: "{{ item }}"
  loop: "{{ query('inventory_hostnames', 'all') }}"

# show all the hosts matching the pattern, ie all but the group www
- debug:
    msg: "{{ item }}"
  loop: "{{ query('inventory_hostnames', 'all!www') }}"
More information on the patterns can be found in Working with Patterns.
In 2.0 you are again able to use loops and task includes (but not playbook includes). This adds the ability to loop over the set of tasks in one shot.
Ansible by default sets the loop variable item
for each loop, which causes these nested loops to overwrite the value of item
from the "outer" loops.
As of Ansible 2.1, the loop_control
option can be used to specify the name of the variable to be used for the loop:
# main.yml
- include_tasks: inner.yml
  loop:
    - 1
    - 2
    - 3
  loop_control:
    loop_var: outer_item

# inner.yml
- debug:
    msg: "outer item={{ outer_item }} inner item={{ item }}"
  loop:
    - a
    - b
    - c
Note
If Ansible detects that the current loop is using a variable which has already been defined, it will raise an error to fail the task.
When using complex data structures for looping, the display might get a bit too "busy"; this is where the label
directive comes to help:
- name: create servers
  digital_ocean:
    name: "{{ item.name }}"
    state: present
  loop:
    - name: server1
      disks: 3gb
      ram: 15Gb
      network:
        nic01: 100Gb
        nic02: 10Gb
      ...
  loop_control:
    label: "{{ item.name }}"
This will now display just the label
field instead of the whole structure per item
; label defaults to {{ item }}
to display things as usual.
Another option to loop control is pause
, which allows you to control the time (in seconds) between the execution of items in a task loop:
# main.yml
- name: create servers, pause 3s before creating next
  digital_ocean:
    name: "{{ item }}"
    state: present
  loop:
    - server1
    - server2
  loop_control:
    pause: 3
If you need to keep track of where you are in a loop, you can use the index_var
option to loop control to specify a variable name to contain the current loop index:
- name: count our fruit
  debug:
    msg: "{{ item }} with index {{ my_idx }}"
  loop:
    - apple
    - banana
    - pear
  loop_control:
    index_var: my_idx
With the release of Ansible 2.5, the recommended way to perform loops is to use the new loop
keyword instead of with_X
style loops.
In many cases, loop
syntax is better expressed using filters instead of more complex use of query
or lookup
.
The following examples will show how to convert many common with_
style loops to loop
and filters.
with_list
with_list
is directly replaced by loop
.
- name: with_list
  debug:
    msg: "{{ item }}"
  with_list:
    - one
    - two

- name: with_list -> loop
  debug:
    msg: "{{ item }}"
  loop:
    - one
    - two
with_items
with_items
is replaced by loop
and the flatten
filter.
- name: with_items
  debug:
    msg: "{{ item }}"
  with_items: "{{ items }}"

- name: with_items -> loop
  debug:
    msg: "{{ item }}"
  loop: "{{ items|flatten(levels=1) }}"
with_indexed_items
with_indexed_items
is replaced by loop
, the flatten
filter and loop_control.index_var
.
- name: with_indexed_items
  debug:
    msg: "{{ item.0 }} - {{ item.1 }}"
  with_indexed_items: "{{ items }}"

- name: with_indexed_items -> loop
  debug:
    msg: "{{ index }} - {{ item }}"
  loop: "{{ items|flatten(levels=1) }}"
  loop_control:
    index_var: index
with_flattened
with_flattened
is replaced by loop
and the flatten
filter.
- name: with_flattened
  debug:
    msg: "{{ item }}"
  with_flattened: "{{ items }}"

- name: with_flattened -> loop
  debug:
    msg: "{{ item }}"
  loop: "{{ items|flatten }}"
with_together
with_together
is replaced by loop
and the zip
filter.
- name: with_together
  debug:
    msg: "{{ item.0 }} - {{ item.1 }}"
  with_together:
    - "{{ list_one }}"
    - "{{ list_two }}"

- name: with_together -> loop
  debug:
    msg: "{{ item.0 }} - {{ item.1 }}"
  loop: "{{ list_one|zip(list_two)|list }}"
with_dict
with_dict
can be substituted by loop
and either the dictsort
or dict2items
filters.
- name: with_dict
  debug:
    msg: "{{ item.key }} - {{ item.value }}"
  with_dict: "{{ dictionary }}"

- name: with_dict -> loop (option 1)
  debug:
    msg: "{{ item.key }} - {{ item.value }}"
  loop: "{{ dictionary|dict2items }}"

- name: with_dict -> loop (option 2)
  debug:
    msg: "{{ item.0 }} - {{ item.1 }}"
  loop: "{{ dictionary|dictsort }}"
with_sequence
with_sequence
is replaced by loop
and the range
function, and potentially the format
filter.
- name: with_sequence
  debug:
    msg: "{{ item }}"
  with_sequence: start=0 end=4 stride=2 format=testuser%02x

- name: with_sequence -> loop
  debug:
    msg: "{{ 'testuser%02x' | format(item) }}"
  # range is exclusive of the end point
  loop: "{{ range(0, 4 + 1, 2)|list }}"
with_subelements
with_subelements is replaced by loop and the subelements filter.
- name: with_subelements
  debug:
    msg: "{{ item.0.name }} - {{ item.1 }}"
  with_subelements:
    - "{{ users }}"
    - mysql.hosts

- name: with_subelements -> loop
  debug:
    msg: "{{ item.0.name }} - {{ item.1 }}"
  loop: "{{ users|subelements('mysql.hosts') }}"
with_nested/with_cartesian
with_nested and with_cartesian are replaced by loop and the product filter.
- name: with_nested
  debug:
    msg: "{{ item.0 }} - {{ item.1 }}"
  with_nested:
    - "{{ list_one }}"
    - "{{ list_two }}"

- name: with_nested -> loop
  debug:
    msg: "{{ item.0 }} - {{ item.1 }}"
  loop: "{{ list_one|product(list_two)|list }}"
with_random_choice
with_random_choice is replaced by the random filter, without the need for a loop.
- name: with_random_choice
  debug:
    msg: "{{ item }}"
  with_random_choice: "{{ my_list }}"

- name: with_random_choice -> loop (No loop is needed here)
  debug:
    msg: "{{ my_list|random }}"
  tags: random
Here are some tips for making the most of Ansible and Ansible playbooks.
You can find some example playbooks illustrating these best practices in our ansible-examples repository. (Note: these may not use all of the features of the latest release, but are still an excellent reference!)
The following section shows one of many possible ways to organize playbook content.
Your usage of Ansible should fit your needs, however, not ours, so feel free to modify this approach and organize as you see fit.
One crucial way to organize your playbook content is Ansible's "roles" organization feature, which is documented as part
of the main playbooks page. You should take the time to read and understand the roles documentation which is available here: Roles.
Directory Layout
The top level of the directory would contain files and directories like so:
production                # inventory file for production servers
staging                   # inventory file for staging environment

group_vars/
   group1.yml             # here we assign variables to particular groups
   group2.yml
host_vars/
   hostname1.yml          # here we assign variables to particular systems
   hostname2.yml

library/                  # if any custom modules, put them here (optional)
module_utils/             # if any custom module_utils to support modules, put them here (optional)
filter_plugins/           # if any custom filter plugins, put them here (optional)

site.yml                  # master playbook
webservers.yml            # playbook for webserver tier
dbservers.yml             # playbook for dbserver tier

roles/
   common/                # this hierarchy represents a "role"
      tasks/              #
         main.yml         # <-- tasks file can include smaller files if warranted
      handlers/           #
         main.yml         # <-- handlers file
      templates/          # <-- files for use with the template resource
         ntp.conf.j2      # <------- templates end in .j2
      files/              #
         bar.txt          # <-- files for use with the copy resource
         foo.sh           # <-- script files for use with the script resource
      vars/               #
         main.yml         # <-- variables associated with this role
      defaults/           #
         main.yml         # <-- default lower priority variables for this role
      meta/               #
         main.yml         # <-- role dependencies
      library/            # roles can also include custom modules
      module_utils/       # roles can also include custom module_utils
      lookup_plugins/     # or other types of plugins, like lookup in this case

   webtier/               # same kind of structure as "common" was above, done for the webtier role
   monitoring/            # ""
   fooapp/                # ""
Alternative Directory Layout
Alternatively, you can put each inventory file with its group_vars/host_vars in a separate directory. This is particularly useful if your group_vars/host_vars don't have that much in common in different environments. The layout could look something like this:
inventories/
   production/
      hosts               # inventory file for production servers
      group_vars/
         group1.yml       # here we assign variables to particular groups
         group2.yml
      host_vars/
         hostname1.yml    # here we assign variables to particular systems
         hostname2.yml

   staging/
      hosts               # inventory file for staging environment
      group_vars/
         group1.yml       # here we assign variables to particular groups
         group2.yml
      host_vars/
         stagehost1.yml   # here we assign variables to particular systems
         stagehost2.yml

library/
module_utils/
filter_plugins/

site.yml
webservers.yml
dbservers.yml

roles/
   common/
   webtier/
   monitoring/
   fooapp/
This layout gives you more flexibility for larger environments, as well as a total separation of inventory variables between different environments. The downside is that it is harder to maintain, because there are more files.
Use Dynamic Inventory With Clouds
If you are using a cloud provider, you should not be managing your inventory in a static file. See Working With Dynamic Inventory.
This does not just apply to clouds -- if you have another system maintaining a canonical list of systems in your infrastructure, using dynamic inventory is a great idea in general.
How to Differentiate Staging vs Production
When managing static inventory, a frequently asked question is how to differentiate different types of environments. The following example shows a good way to do this. Similar methods of grouping could be adapted to dynamic inventory (for instance, consider applying the AWS tag "environment:production", and you'll get an automatically discovered group of systems named "ec2_tag_environment_production").
Let's show a static inventory example though. Below, the production file contains the inventory of all of your production hosts.
It is suggested that you define groups based on the purpose of the host (roles) and also geography or datacenter location (if applicable):
# file: production
[atlanta-webservers]
www-atl-1.example.com
www-atl-2.example.com
[boston-webservers]
www-bos-1.example.com
www-bos-2.example.com
[atlanta-dbservers]
db-atl-1.example.com
db-atl-2.example.com
[boston-dbservers]
db-bos-1.example.com
# webservers in all geos
[webservers:children]
atlanta-webservers
boston-webservers
# dbservers in all geos
[dbservers:children]
atlanta-dbservers
boston-dbservers
# everything in the atlanta geo
[atlanta:children]
atlanta-webservers
atlanta-dbservers
# everything in the boston geo
[boston:children]
boston-webservers
boston-dbservers
Group And Host Variables
This section builds on the previous example.
Groups are nice for organization, but that's not all groups are good for. You can also assign variables to them! For instance, atlanta has its own NTP servers, so when setting up ntp.conf, we should use them. Let's set those now:
---
# file: group_vars/atlanta
ntp: ntp-atlanta.example.com
backup: backup-atlanta.example.com
Variables aren't just for geographic information either! Maybe the webservers have some configuration that doesn't make sense for the database servers:
---
# file: group_vars/webservers
apacheMaxRequestsPerChild: 3000
apacheMaxClients: 900
If we had any default values, or values that were universally true, we would put them in a file called group_vars/all:
---
# file: group_vars/all
ntp: ntp-boston.example.com
backup: backup-boston.example.com
We can define specific hardware variance in systems in a host_vars file, but avoid doing this unless you need to:
---
# file: host_vars/db-bos-1.example.com
foo_agent_port: 86
bar_agent_port: 99
Again, if we are using dynamic inventory sources, many dynamic groups are automatically created. So a tag like "class:webserver" would load in
variables from the file "group_vars/ec2_tag_class_webserver" automatically.
Top Level Playbooks Are Separated By Role
In site.yml, we import a playbook that defines our entire infrastructure. This is a very short example, because it's just importing
some other playbooks:
---
# file: site.yml
- import_playbook: webservers.yml
- import_playbook: dbservers.yml
In a file like webservers.yml (also at the top level), we map the configuration of the webservers group to the roles performed by the webservers group:
---
# file: webservers.yml
- hosts: webservers
  roles:
    - common
    - webtier
The idea here is that we can choose to configure our whole infrastructure by "running" site.yml or we could just choose to run a subset by running
webservers.yml. This is analogous to the "--limit" parameter to ansible but a little more explicit:
ansible-playbook site.yml --limit webservers
ansible-playbook webservers.yml
Task And Handler Organization For A Role
Below is an example tasks file that explains how a role works. Our common role here just sets up NTP, but it could do more if we wanted:
---
# file: roles/common/tasks/main.yml
- name: be sure ntp is installed
  yum:
    name: ntp
    state: installed
  tags: ntp

- name: be sure ntp is configured
  template:
    src: ntp.conf.j2
    dest: /etc/ntp.conf
  notify:
    - restart ntpd
  tags: ntp

- name: be sure ntpd is running and enabled
  service:
    name: ntpd
    state: started
    enabled: yes
  tags: ntp
Here is an example handlers file. As a review, handlers are only fired when certain tasks report changes, and are run at the end
of each play:
---
# file: roles/common/handlers/main.yml
- name: restart ntpd
  service:
    name: ntpd
    state: restarted
See Roles for more information.
What This Organization Enables (Examples)
Above we've shared our basic organizational structure.
Now what sort of use cases does this layout enable? Lots! If I want to reconfigure my whole infrastructure, it's just:
ansible-playbook -i production site.yml
To reconfigure NTP on everything:
ansible-playbook -i production site.yml --tags ntp
To reconfigure just my webservers:
ansible-playbook -i production webservers.yml
For just my webservers in Boston:
ansible-playbook -i production webservers.yml --limit boston
For just the first 10, and then the next 10:
ansible-playbook -i production webservers.yml --limit boston[0:9]
ansible-playbook -i production webservers.yml --limit boston[10:19]
And of course just basic ad-hoc stuff is also possible:
ansible boston -i production -m ping
ansible boston -i production -m command -a '/sbin/reboot'
And there are some useful commands to know:
# confirm what task names would be run if I ran this command and said "just ntp tasks"
ansible-playbook -i production webservers.yml --tags ntp --list-tasks
# confirm what hostnames might be communicated with if I said "limit to boston"
ansible-playbook -i production webservers.yml --limit boston --list-hosts
Deployment vs Configuration Organization
The above setup models a typical configuration topology. When doing multi-tier deployments, there are going
to be some additional playbooks that hop between tiers to roll out an application. In this case, 'site.yml'
may be augmented by playbooks like 'deploy_exampledotcom.yml' but the general concepts can still apply.
Consider "playbooks" as a sports metaphor -- you don't have to just have one set of plays to use against your infrastructure
all the time -- you can have situational plays that you use at different times and for different purposes.
Ansible allows you to deploy and configure using the same tool, so you would likely reuse groups and just
keep the OS configuration in separate playbooks from the app deployment.
As also mentioned above, a good way to keep your staging (or testing) and production environments separate is to use a separate inventory file for staging and production. This way you pick with -i what you are targeting. Keeping them all in one file can lead to surprises!
Testing things in a staging environment before trying in production is always a great idea. Your environments need not be the same
size and you can use group variables to control the differences between those environments.
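As a sketch of that idea, the same group variable can take different values per environment when using the alternative directory layout shown earlier (the variable value and file paths are illustrative):

```yaml
# file: inventories/staging/group_vars/webservers.yml
# a smaller staging footprint
apacheMaxClients: 100
```

```yaml
# file: inventories/production/group_vars/webservers.yml
# full production capacity
apacheMaxClients: 900
```

The same playbooks then run unchanged against either environment; only the -i inventory choice differs.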
The 'state' parameter is optional for a lot of modules. Whether 'state=present' or 'state=absent', it's always best to leave that parameter in your playbooks to make the intent clear, especially as some modules support additional states.
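For instance, the two tasks below do the same thing, but the second spells out the intent (a minimal sketch; the package is just an example):

```yaml
# Implicit: 'state' defaults to 'present' for the yum module.
- name: install ntp (implicit state)
  yum:
    name: ntp

# Explicit: 'state: present' makes the intent unambiguous, and reads
# clearly alongside tasks using 'state: absent' or 'state: latest'.
- name: install ntp (explicit state)
  yum:
    name: ntp
    state: present
```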
We're somewhat repeating ourselves with this tip, but it's worth repeating. A system can be in multiple groups. See Working with Inventory and Working with Patterns. Having groups named after things like
webservers and dbservers is repeated in the examples because it's a very powerful concept.
This allows playbooks to target machines based on role, as well as to assign role specific variables
using the group variable system.
See Roles.
When dealing with a parameter that differs between two operating systems, a great way to handle it is with the group_by module.
This creates a dynamic group of hosts matching certain criteria, even if that group is not defined in the inventory file:
---
# talk to all hosts just so we can learn about them
- hosts: all
  tasks:
    - group_by:
        key: os_{{ ansible_distribution }}

# now just on the CentOS hosts...

- hosts: os_CentOS
  gather_facts: False
  tasks:
    - # tasks that only happen on CentOS go here
This will throw all systems into a dynamic group based on the operating system name.
If group-specific settings are needed, this can also be done. For example:
---
# file: group_vars/all
asdf: 10
---
# file: group_vars/os_CentOS
asdf: 42
In the above example, CentOS machines get the value of '42' for asdf, but other machines get '10'.
This can be used not only to set variables, but also to apply certain roles to only certain systems.
Alternatively, if only variables are needed:
- hosts: all
  tasks:
    - include_vars: "os_{{ ansible_distribution }}.yml"

    - debug:
        var: asdf
This will pull in variables based on the OS name.
If a playbook has a ./library directory relative to its YAML file, this directory can be used to add ansible modules that will automatically be in the ansible module path. This is a great way to keep modules that go with a playbook together. This is shown in the directory structure example at the start of this section.
It is possible to leave off the 'name' for a given task, though it is recommended to provide a description of why something is being done instead. This name is shown when the playbook is run.
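For instance, both tasks below work, but the named one produces much more useful playbook output (a minimal sketch; the files are illustrative):

```yaml
# Without a name, output shows only the module: TASK [template]
- template:
    src: ntp.conf.j2
    dest: /etc/ntp.conf

# With a name, output explains the intent:
# TASK [be sure ntp.conf reflects this group's NTP servers]
- name: be sure ntp.conf reflects this group's NTP servers
  template:
    src: ntp.conf.j2
    dest: /etc/ntp.conf
```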
When you can do something simply, do something simply. Do not reach to use every feature of Ansible together, all at once. Use what works for you. For example, you will probably not need vars, vars_files, vars_prompt and --extra-vars all at once, while also using an external inventory file.
If something feels complicated, it probably is, and may be a good opportunity to simplify things.
Use version control. Keep your playbooks and inventory file in git
(or another version control system), and commit when you make changes
to them. This way you have an audit trail describing when and why you
changed the rules that are automating your infrastructure.
For general maintenance, it is often easier to use grep, or similar tools, to find variables in your Ansible setup. Since vaults obscure these variables, it is best to work with a layer of indirection. When running a playbook, Ansible finds the variables in the unencrypted file, and all sensitive variables come from the encrypted file.
A best practice approach for this is to start with a group_vars/ subdirectory named after the group. Inside of this subdirectory, create two files named vars and vault. Inside of the vars file, define all of the variables needed, including any sensitive ones. Next, copy all of the sensitive variables over to the vault file and prefix these variables with vault_. You should adjust the variables in the vars file to point to the matching vault_ variables using jinja2 syntax, and ensure that the vault file is vault encrypted.
This best practice has no limit on the amount of variable and vault files or their names.
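As a sketch of the indirection described above, for a hypothetical 'webservers' group the two files might look like this (the variable names are only examples):

```yaml
# file: group_vars/webservers/vars  (unencrypted; safe to grep)
db_user: app
# indirection: the real value lives in the vault file
db_password: "{{ vault_db_password }}"
```

```yaml
# file: group_vars/webservers/vault
# encrypt with: ansible-vault encrypt group_vars/webservers/vault
vault_db_password: S3cr3tValue
```

Searching for db_password with grep now finds the unencrypted vars file, which points you to the matching vault_ variable in the encrypted file.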
Social media
If you like Ansible and just want to spread the good word, feel free to share on your social media platform of choice, and let us know by using @ansible or #ansible. We'll be looking for you.