rotate-backups: Simple command line interface for backup rotation¶
Welcome to the documentation of rotate-backups version 8.1! The following sections are available:
User documentation¶
The readme is the best place to start reading, it’s targeted at all users and documents the command line interface:
Backups are good for you. Most people learn this the hard way (including me).
Nowadays my Linux laptop automatically creates a full system snapshot every
four hours by pushing changed files to an rsync daemon running on the server
in my home network and creating a snapshot afterwards using the cp -al
command (the article Easy Automated Snapshot-Style Backups with Linux and
Rsync explains the basic technique). The server has a second disk attached
which asynchronously copies from the main disk so that a single disk failure
doesn’t wipe all of my backups (the “time delayed replication” aspect has also
proven to be very useful).
Okay, cool, now I have backups of everything, up to date and going back in time! But I’m running through disk space like crazy… A proper deduplicating filesystem would be awesome but I’m running crappy consumer grade hardware and e.g. ZFS has not been a good experience in the past. So I’m going to have to delete backups…
Deleting backups is never nice, but an easy and proper rotation scheme can help a lot. I wanted to keep things manageable so I wrote a Python script to do it for me. Over the years I actually wrote several variants. Because I kept copy/pasting these scripts around I decided to bring the main features together in a properly documented Python package and upload it to the Python Package Index.
The rotate-backups package is currently tested on CPython 2.7, 3.5+ and PyPy (2.7). It’s tested on Linux and Mac OS X and may work on other Unix-like systems, but it definitely won’t work on Windows right now.
Features¶
- Dry run mode
  - Use it. I’m serious. If you don’t and rotate-backups eats more backups than intended you have no right to complain ;-)
- Flexible rotation
  - Rotation with any combination of hourly, daily, weekly, monthly and yearly retention periods.
- Fuzzy timestamp matching in filenames
  - The modification times of the files and/or directories are not relevant. If you speak Python regular expressions, here is how the fuzzy matching works:

    # Required components.
    (?P<year>\d{4}) \D?
    (?P<month>\d{2}) \D?
    (?P<day>\d{2}) \D?
    (
       # Optional components.
       (?P<hour>\d{2}) \D?
       (?P<minute>\d{2}) \D?
       (?P<second>\d{2})?
    )?
- All actions are logged
  - Log messages are saved to the system log (e.g. /var/log/syslog) so you can retrace what happened when something seems to have gone wrong.
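To make the fuzzy matching concrete, here is a minimal sketch that compiles the pattern quoted above with re.VERBOSE and turns a filename into a datetime. The helper name parse_timestamp is hypothetical and this is not the package's own implementation; treating missing optional components as zero is an assumption of the sketch.

```python
import datetime
import re

# The pattern quoted above, compiled with re.VERBOSE so that whitespace
# and comments inside the pattern are ignored.
TIMESTAMP_PATTERN = re.compile(r"""
    # Required components.
    (?P<year>\d{4}) \D?
    (?P<month>\d{2}) \D?
    (?P<day>\d{2}) \D?
    (
        # Optional components.
        (?P<hour>\d{2}) \D?
        (?P<minute>\d{2}) \D?
        (?P<second>\d{2})?
    )?
""", re.VERBOSE)


def parse_timestamp(filename):
    """Return the datetime embedded in a filename, or None (hypothetical helper)."""
    match = TIMESTAMP_PATTERN.search(filename)
    if match is None:
        return None
    # Assumption of this sketch: optional components that were not
    # captured are treated as zero.
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    return datetime.datetime(**parts)


print(parse_timestamp("backup-2024-03-15_12-30-45.tar.gz"))
print(parse_timestamp("2024-03-15"))
print(parse_timestamp("notes.txt"))
```

Because every separator is an optional non-digit, names such as 2024-03-15, 20240315 and 2024_03_15 are all recognized by the same pattern.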
Installation¶
The rotate-backups package is available on PyPI which means installation should be as simple as:
$ pip install rotate-backups
There’s actually a multitude of ways to install Python packages (e.g. the per user site-packages directory, virtual environments or just installing system wide) and I have no intention of getting into that discussion here, so if this intimidates you then read up on your options before returning to these instructions ;-).
Usage¶
There are two ways to use the rotate-backups package: As the command line program rotate-backups and as a Python API. For details about the Python API please refer to the API documentation available on Read the Docs. The command line interface is described below.
Command line¶
Usage: rotate-backups [OPTIONS] [DIRECTORY, ..]
Easy rotation of backups based on the Python package by the same name.
To use this program you specify a rotation scheme via (a combination of) the --hourly, --daily, --weekly, --monthly and/or --yearly options and the directory (or directories) containing backups to rotate as one or more positional arguments.
You can rotate backups on a remote system over SSH by prefixing a DIRECTORY with an SSH alias and separating the two with a colon (similar to how rsync accepts remote locations).
Instead of specifying directories and a rotation scheme on the command line you can also add them to a configuration file. For more details refer to the online documentation (see also the --config option).
Please use the --dry-run option to test the effect of the specified rotation scheme before letting this program loose on your precious backups! If you don’t test the results using the dry run mode and this program eats more backups than intended you have no right to complain ;-).
Supported options:
Option | Description |
---|---|
-M, --minutely=COUNT | In a literal sense this option sets the number of “backups per minute” to preserve during rotation. For most use cases that doesn’t make a lot of sense :-) but you can combine the --minutely and --relaxed options to preserve more than one backup per hour. Refer to the usage of the -H, --hourly option for details about COUNT. |
-H, --hourly=COUNT | Set the number of hourly backups to preserve during rotation. COUNT can be a number (backups are counted back in time, starting from the most recent one), an expression that evaluates to a number (e.g. 7 * 2) or the text ‘always’ (all hourly backups are preserved). |
-d, --daily=COUNT | Set the number of daily backups to preserve during rotation. Refer to the usage of the -H, --hourly option for details about COUNT. |
-w, --weekly=COUNT | Set the number of weekly backups to preserve during rotation. Refer to the usage of the -H, --hourly option for details about COUNT. |
-m, --monthly=COUNT | Set the number of monthly backups to preserve during rotation. Refer to the usage of the -H, --hourly option for details about COUNT. |
-y, --yearly=COUNT | Set the number of yearly backups to preserve during rotation. Refer to the usage of the -H, --hourly option for details about COUNT. |
-t, --timestamp-pattern=PATTERN | Customize the regular expression pattern that is used to match and extract timestamps from filenames. PATTERN is expected to be a Python compatible regular expression that must define the named capture groups ‘year’, ‘month’ and ‘day’ and may define ‘hour’, ‘minute’ and ‘second’. |
-I, --include=PATTERN | Only process backups that match the shell pattern given by PATTERN. This argument can be repeated. Make sure to quote PATTERN so the shell doesn’t expand the pattern before it’s received by rotate-backups. |
-x, --exclude=PATTERN | Don’t process backups that match the shell pattern given by PATTERN. This argument can be repeated. Make sure to quote PATTERN so the shell doesn’t expand the pattern before it’s received by rotate-backups. |
-j, --parallel | Remove backups in parallel, one backup per mount point at a time. The idea behind this approach is that parallel rotation is most useful when the files to be removed are on different disks and so multiple devices can be utilized at the same time. Because mount points are per system the -j, --parallel option will also parallelize over backups located on multiple remote systems. |
-p, --prefer-recent | By default the first (oldest) backup in each time slot is preserved. If you’d prefer to keep the most recent backup in each time slot instead then this option is for you. |
-r, --relaxed | By default the time window for each rotation scheme is enforced (this is referred to as strict rotation) but the -r, --relaxed option can be used to alter this behavior. For example, when the number of hourly backups to preserve is three, strict rotation only matches backups created in the relevant time window (the hour of the most recent backup and the two hours leading up to that), whereas relaxed rotation matches the three most recent backups regardless of the calculated time window. If the explanation above is not clear enough, here’s a simple way to decide whether you want to customize this behavior or not: if your backups are created at regular intervals and you never miss an interval then strict rotation (the default) is most likely fine; if your backups are created at irregular intervals then you may want to use the --relaxed option to preserve more backups. |
-i, --ionice=CLASS | Use the “ionice” program to set the I/O scheduling class and priority of the “rm” invocations used to remove backups. CLASS is expected to be one of the values “idle” (3), “best-effort” (2) or “realtime” (1). Refer to the man page of the “ionice” program for details about these values. The numeric values are required by the ‘busybox’ implementation of ‘ionice’. |
-c, --config=FILENAME | Load configuration from FILENAME. If this option isn’t given the following default locations are searched for configuration files: /etc/rotate-backups.ini and /etc/rotate-backups.d/*.ini; ~/.rotate-backups.ini and ~/.rotate-backups.d/*.ini; ~/.config/rotate-backups.ini and ~/.config/rotate-backups.d/*.ini. Any available configuration files are loaded in the order given above, so that sections in user-specific configuration files override sections by the same name in system-wide configuration files. For more details refer to the online documentation. |
-C, --removal-command=CMD | Change the command used to remove backups. The value of CMD defaults to ‘rm -fR’. This choice was made because it works regardless of whether the backups to be rotated are files or directories or a mixture of both. As an example of why you might want to change this, CephFS snapshots are represented as regular directory trees that can be deleted at once with a single ‘rmdir’ command (even though according to POSIX semantics this command should refuse to remove nonempty directories, but I digress). |
-u, --use-sudo | Enable the use of “sudo” to rotate backups in directories that are not readable and/or writable for the current user (or the user logged in to a remote system over SSH). |
-S, --syslog=CHOICE | Explicitly enable or disable system logging instead of letting the program figure out what to do. The values ‘1’, ‘yes’, ‘true’ and ‘on’ enable system logging whereas the values ‘0’, ‘no’, ‘false’ and ‘off’ disable it. |
-f, --force | If a sanity check fails an error is reported and the program aborts. You can use --force to continue with backup rotation instead. Sanity checks are done to ensure that the given DIRECTORY exists, is readable and is writable. If the --removal-command option is given then the last sanity check (that the given location is writable) is skipped (because custom removal commands imply custom semantics). |
-n, --dry-run | Don’t make any changes, just print what would be done. This makes it easy to evaluate the impact of a rotation scheme without losing any backups. |
-v, --verbose | Increase logging verbosity (can be repeated). |
-q, --quiet | Decrease logging verbosity (can be repeated). |
-h, --help | Show this message and exit. |
Configuration files¶
Instead of specifying directories and rotation schemes on the command line you can also add them to a configuration file.
Configuration files are text files in the subset of ini syntax supported by Python’s configparser module. They can be located in the following places:
Directory | Main configuration file | Modular configuration files |
---|---|---|
/etc | /etc/rotate-backups.ini | /etc/rotate-backups.d/*.ini |
~ | ~/.rotate-backups.ini | ~/.rotate-backups.d/*.ini |
~/.config | ~/.config/rotate-backups.ini | ~/.config/rotate-backups.d/*.ini |
The available configuration files are loaded in the order given above, so that user specific configuration files override system wide configuration files.
You can load a configuration file in a nonstandard location using the command line option --config; in this case the default locations mentioned above are ignored.
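The effect of the loading order can be illustrated with Python’s configparser module. This is only a sketch under assumptions: the two configuration snippets are made up, they are read from strings instead of files, and the package’s own loader may differ in detail (for example in whether a whole section or only individual options are overridden).

```python
import configparser

# Made-up stand-in for a system-wide file such as /etc/rotate-backups.ini.
SYSTEM_WIDE = """
[/backups/laptop]
daily = 7
weekly = 4
"""

# Made-up stand-in for a user-specific file such as ~/.rotate-backups.ini.
USER_SPECIFIC = """
[/backups/laptop]
daily = 14
"""

parser = configparser.RawConfigParser()
# Sources that are read later win when the same option is defined twice.
for source in (SYSTEM_WIDE, USER_SPECIFIC):
    parser.read_string(source)

laptop = dict(parser["/backups/laptop"])
print(laptop)
```

In this sketch the user-specific value for daily replaces the system-wide one, while weekly is kept because only the system-wide snippet defines it.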
Each section in the configuration defines a directory that contains backups to be rotated. The options in each section define the rotation scheme and other options. Here’s an example based on how I use rotate-backups to rotate the backups of the Linux installations that I make regular backups of:
# /etc/rotate-backups.ini:
# Configuration file for the rotate-backups program that specifies
# directories containing backups to be rotated according to specific
# rotation schemes.
[/backups/laptop]
hourly = 24
daily = 7
weekly = 4
monthly = 12
yearly = always
ionice = idle
[/backups/server]
daily = 7 * 2
weekly = 4 * 2
monthly = 12 * 4
yearly = always
ionice = idle
[/backups/mopidy]
daily = 7
weekly = 4
monthly = 2
ionice = idle
[/backups/xbmc]
daily = 7
weekly = 4
monthly = 2
ionice = idle
As you can see in the retention periods of the directory /backups/server in the example above you are allowed to use expressions that evaluate to a number (instead of having to write out the literal number).
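As an illustration of how values like 7 * 2 or always could be turned into retention counts, here is a hedged sketch. The function name evaluate_retention and the set of supported operators are assumptions made for this example; the package’s own coercion logic may accept more or less than this.

```python
import ast
import operator

# Arithmetic operators this sketch is willing to evaluate (an assumption).
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}


def evaluate_retention(value):
    """Turn 'always', '7' or '7 * 2' into 'always' or an int (hypothetical helper)."""
    text = value.strip()
    if text.lower() == "always":
        return "always"

    def walk(node):
        # Only integer literals and whitelisted binary operators are allowed.
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported retention period: %r" % value)

    try:
        tree = ast.parse(text, mode="eval")
    except SyntaxError:
        raise ValueError("unsupported retention period: %r" % value)
    return walk(tree.body)


print(evaluate_retention("7 * 2"))   # daily value of /backups/server
print(evaluate_retention("12 * 4"))  # monthly value of /backups/server
print(evaluate_retention("always"))
```

Walking the parsed expression instead of calling eval() keeps arbitrary code in a configuration file from being executed, which is a design choice of this sketch.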
Here’s an example of a configuration for two remote directories:
# SSH as a regular user and use `sudo' to elevate privileges.
[server:/backups/laptop]
use-sudo = yes
hourly = 24
daily = 7
weekly = 4
monthly = 12
yearly = always
ionice = idle
# SSH as the root user (avoids sudo passwords).
[server:/backups/server]
ssh-user = root
hourly = 24
daily = 7
weekly = 4
monthly = 12
yearly = always
ionice = idle
As this example shows you have the option to connect as the root user or to connect as a regular user and use sudo to elevate privileges.
Customizing the rotation algorithm¶
Since publishing rotate-backups I’ve found that the default rotation algorithm is not to everyone’s satisfaction and because the suggested alternatives were just as valid as the choices that I initially made, options were added to expose the alternative behaviors:
Default | Alternative |
---|---|
Strict rotation (the time window for each rotation frequency is enforced). | Relaxed rotation (time windows are not enforced). Enabled by the -r, --relaxed option. |
The oldest backup in each time slot is preserved and newer backups in the time slot are removed. | The newest backup in each time slot is preserved and older backups in the time slot are removed. Enabled by the -p, --prefer-recent option. |
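The second row of the table can be sketched in a few lines of Python. The helper pick_per_day is hypothetical and only handles daily time slots; it illustrates the oldest versus newest choice and is not the package’s implementation.

```python
import datetime
from collections import defaultdict


def pick_per_day(timestamps, prefer_recent=False):
    """Keep one timestamp per calendar day (hypothetical helper)."""
    slots = defaultdict(list)
    for timestamp in timestamps:
        slots[timestamp.date()].append(timestamp)
    # Default: oldest backup in each slot; prefer_recent: newest backup.
    choose = max if prefer_recent else min
    return sorted(choose(group) for group in slots.values())


backups = [
    datetime.datetime(2024, 3, 15, 4, 0),
    datetime.datetime(2024, 3, 15, 8, 0),
    datetime.datetime(2024, 3, 15, 12, 0),
    datetime.datetime(2024, 3, 16, 4, 0),
]
print(pick_per_day(backups))
print(pick_per_day(backups, prefer_recent=True))
```

With the default the 04:00 backup of 15 March survives, with prefer_recent=True the 12:00 backup does; the single backup of 16 March is kept either way.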
Supported configuration options¶
- Rotation schemes are defined using the minutely, hourly, daily, weekly, monthly and yearly options, these options support the same values as documented for the command line interface.
- The timestamp-pattern option can be used to customize the regular expression that’s used to extract timestamps from filenames. The value is expected to be a Python compatible regular expression that must contain the named capture groups ‘year’, ‘month’ and ‘day’ and may contain the groups ‘hour’, ‘minute’ and ‘second’. As an example here is the default regular expression:

  # Required components.
  (?P<year>\d{4} ) \D?
  (?P<month>\d{2}) \D?
  (?P<day>\d{2} ) \D?
  (?:
     # Optional components.
     (?P<hour>\d{2} ) \D?
     (?P<minute>\d{2}) \D?
     (?P<second>\d{2})?
  )?

  Note how this pattern spans multiple lines: Regular expressions are compiled using the re.VERBOSE flag which means whitespace (including newlines) is ignored.
- The include-list and exclude-list options define a comma separated list of filename patterns to include or exclude, respectively:
  - Make sure not to quote the patterns in the configuration file, just provide them literally.
  - If an include or exclude list is defined in the configuration file it overrides the include or exclude list given on the command line.
- The prefer-recent, strict and use-sudo options expect a boolean value (yes, no, true, false, 1 or 0).
- The removal-command option can be used to customize the command that is used to remove backups.
- The ionice option expects one of the I/O scheduling class names idle, best-effort or realtime (or the corresponding numbers).
- The ssh-user option can be used to override the name of the remote SSH account that’s used to connect to a remote system.
How it works¶
The basic premise of rotate-backups is fairly simple:
- You point rotate-backups at a directory containing timestamped backups.
- It will scan the directory for entries (it doesn’t matter whether they are files or directories) with a recognizable timestamp in the name.

  Note
  All of the matched directory entries are considered to be backups of the same data source, i.e. there’s no filename similarity logic to distinguish unrelated backups that are located in the same directory. If this presents a problem consider using the --include and/or --exclude options.
- The user defined rotation scheme is applied to the entries. If this doesn’t do what you’d expect it to you can try the --relaxed and/or --prefer-recent options.
- The entries to rotate are removed (or printed in dry run).
Contact¶
The latest version of rotate-backups is available on PyPI and GitHub. The documentation is hosted on Read the Docs and includes a changelog. For bug reports please create an issue on GitHub. If you have questions, suggestions, etc. feel free to send me an e-mail at peter@peterodding.com.
API documentation¶
The following API documentation is automatically generated from the source code:
This documentation is based on the source code of version 8.1 of the rotate-backups package. The following modules are available:
rotate_backups¶
Simple to use Python API for rotation of backups.
The rotate_backups module contains the Python API of the rotate-backups package. The core logic of the package is contained in the RotateBackups class.
rotate_backups.DEFAULT_REMOVAL_COMMAND = ['rm', '-fR']¶
The default removal command (a list of strings).
rotate_backups.ORDERED_FREQUENCIES = (('minutely', relativedelta(minutes=+1)), ('hourly', relativedelta(hours=+1)), ('daily', relativedelta(days=+1)), ('weekly', relativedelta(days=+7)), ('monthly', relativedelta(months=+1)), ('yearly', relativedelta(years=+1)))¶
An iterable of tuples with two values each:
- The name of a rotation frequency (a string like ‘hourly’, ‘daily’, etc.).
- A relativedelta object.
The tuples are sorted by increasing delta (intentionally).
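rotate_backups.SUPPORTED_DATE_COMPONENTS = (('year', True), ('month', True), ('day', True), ('hour', False), ('minute', False), ('second', False))¶
An iterable of tuples with two values each:
- The name of a date component (a string like ‘year’, ‘month’, etc.).
- True if the date component is required, False if it is optional.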
rotate_backups.SUPPORTED_FREQUENCIES = {'daily': relativedelta(days=+1), 'hourly': relativedelta(hours=+1), 'minutely': relativedelta(minutes=+1), 'monthly': relativedelta(months=+1), 'weekly': relativedelta(days=+7), 'yearly': relativedelta(years=+1)}¶
A dictionary with rotation frequency names (strings) as keys and relativedelta objects as values. This dictionary is generated based on the tuples in ORDERED_FREQUENCIES.
rotate_backups.TIMESTAMP_PATTERN = <_sre.SRE_Pattern object>¶
A compiled regular expression object used to match timestamps encoded in filenames.
rotate_backups.coerce_location(value, **options)¶
Coerce a string to a Location object.
Parameters:
- value – The value to coerce (a string or Location object).
- options – Any keyword arguments are passed on to create_context().
Returns: A Location object.
rotate_backups.coerce_retention_period(value)¶
Coerce a retention period to a Python value.
Parameters: value – A string containing the text ‘always’, a number or an expression that can be evaluated to a number.
Returns: A number or the string ‘always’.
Raises: ValueError when the string can’t be coerced.
rotate_backups.load_config_file(configuration_file=None, expand=True)¶
Load a configuration file with backup directories and rotation schemes.
Returns: A generator of tuples with four values each:
- An execution context created using executor.contexts.
- The pathname of a directory with backups (a string).
- A dictionary with the rotation scheme.
- A dictionary with additional options.
Raises: ValueError when configuration_file is given but doesn’t exist or can’t be loaded.
This function is used by RotateBackups to discover user defined rotation schemes and by rotate_backups.cli to discover directories for which backup rotation is configured. When configuration_file isn’t given ConfigLoader is used to search for configuration files in the following locations:
- /etc/rotate-backups.ini and /etc/rotate-backups.d/*.ini
- ~/.rotate-backups.ini and ~/.rotate-backups.d/*.ini
- ~/.config/rotate-backups.ini and ~/.config/rotate-backups.d/*.ini
All of the available configuration files are loaded in the order given above, so that sections in user-specific configuration files override sections by the same name in system-wide configuration files.
rotate_backups.rotate_backups(directory, rotation_scheme, **options)¶
Rotate the backups in a directory according to a flexible rotation scheme.
Note
This function exists to preserve backwards compatibility with older versions of the rotate-backups package where all of the logic was exposed as a single function. Please refer to the documentation of the RotateBackups initializer and the rotate_backups() method for an explanation of this function’s parameters.
class rotate_backups.RotateBackups(rotation_scheme, **options)¶
Python API for the rotate-backups program.
Here’s an overview of the RotateBackups class:
Superclass: PropertyManager
Special methods: __init__()
Public methods: apply_rotation_scheme(), collect_backups(), find_preservation_criteria(), group_backups(), load_config_file(), match_to_datetime(), rotate_backups() and rotate_concurrent()
Properties: config_file, dry_run, exclude_list, force, include_list, io_scheduling_class, prefer_recent, removal_command, rotation_scheme, strict and timestamp_pattern
When you initialize a RotateBackups object you are required to provide a value for the rotation_scheme property. You can set the values of the config_file, dry_run, exclude_list, force, include_list, io_scheduling_class, prefer_recent, removal_command, rotation_scheme, strict and timestamp_pattern properties by passing keyword arguments to the class initializer.
__init__(rotation_scheme, **options)¶
Initialize a RotateBackups object.
Parameters:
- rotation_scheme – Used to set rotation_scheme.
- options – Any keyword arguments are used to set the values of instance properties that support assignment (config_file, dry_run, exclude_list, include_list, io_scheduling_class, removal_command and strict).
config_file¶
The pathname of a configuration file (a string or None).
When this property is set rotate_backups() will use load_config_file() to give the user (operator) a chance to set the rotation scheme and other options via a configuration file.
Note
The config_file property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
dry_run¶
True to simulate rotation, False to actually remove backups (defaults to False).
If this is True then rotate_backups() won’t make any actual changes, which provides a ‘preview’ of the effect of the rotation scheme. Right now this is only useful in the command line interface because there’s no return value.
Note
The dry_run property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
exclude_list¶
Filename patterns to exclude specific backups (a list of strings).
This is a list of strings with fnmatch patterns. When collect_backups() encounters a backup whose name matches any of the patterns in this list the backup will be ignored, even if it also matches the include list (it’s the only logical way to combine both lists).
See also: include_list
Note
The exclude_list property is a custom_property. You can change the value of this property using normal attribute assignment syntax. This property’s value is computed once (the first time it is accessed) and the result is cached. To clear the cached value you can use del or delattr().
force¶
True to continue if sanity checks fail, False to raise an exception.
Sanity checks are performed before backup rotation starts to ensure that the given location exists, is readable and is writable. If removal_command is customized then the last sanity check (that the given location is writable) is skipped (because custom removal commands imply custom semantics, see also #18). If a sanity check fails an exception is raised, but you can set force to True to continue with backup rotation instead (the default is obviously False).
Note
The force property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
include_list¶
Filename patterns to select specific backups (a list of strings).
This is a list of strings with fnmatch patterns. When it’s not empty collect_backups() will only collect backups whose name matches a pattern in the list.
See also: exclude_list
Note
The include_list property is a custom_property. You can change the value of this property using normal attribute assignment syntax. This property’s value is computed once (the first time it is accessed) and the result is cached. To clear the cached value you can use del or delattr().
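How the include and exclude lists combine can be sketched with the standard fnmatch module. The helper name is_selected is hypothetical; the sketch follows the rules stated for exclude_list and include_list (an exclude match always wins, an empty include list selects everything) and is not the code used by collect_backups().

```python
import fnmatch


def is_selected(name, include_list=(), exclude_list=()):
    """Decide whether a backup name passes both lists (hypothetical helper)."""
    # An exclude match always wins, even when the name is also included.
    if any(fnmatch.fnmatch(name, pattern) for pattern in exclude_list):
        return False
    # A non-empty include list restricts the selection to matching names.
    if include_list:
        return any(fnmatch.fnmatch(name, pattern) for pattern in include_list)
    return True


print(is_selected("laptop-2024-03-15", include_list=["laptop-*"]))
print(is_selected("server-2024-03-15", include_list=["laptop-*"]))
print(is_selected("laptop-2024-03-15", ["laptop-*"], ["*-03-15"]))
```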
io_scheduling_class¶
The I/O scheduling class for backup rotation (a string or None).
When this property is set (and have_ionice is True) then ionice will be used to set the I/O scheduling class for backup rotation. This can be useful to reduce the impact of backup rotation on the rest of the system.
The value of this property is expected to be one of the strings ‘idle’, ‘best-effort’ or ‘realtime’.
Note
The io_scheduling_class property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
prefer_recent¶
Whether to prefer older or newer backups in each time slot (a boolean).
Defaults to False which means the oldest backup in each time slot (an hour, a day, etc.) is preserved while newer backups in the time slot are removed. You can set this to True if you would like to preserve the newest backup in each time slot instead.
Note
The prefer_recent property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
removal_command¶
The command used to remove backups (a list of strings).
By default the command rm -fR is used. This choice was made because it works regardless of whether the user’s “backups to be rotated” are files or directories or a mixture of both.
Note
The removal_command property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
rotation_scheme¶
The rotation scheme to apply to backups (a dictionary).
Each key in this dictionary defines a rotation frequency (one of the strings ‘minutely’, ‘hourly’, ‘daily’, ‘weekly’, ‘monthly’ and ‘yearly’) and each value defines a retention count:
- An integer value represents the number of backups to preserve in the given rotation frequency, starting from the most recent backup and counting back in time.
- The string ‘always’ means all backups in the given rotation frequency are preserved (this is intended to be used with the biggest frequency in the rotation scheme, e.g. yearly).
No backups are preserved for rotation frequencies that are not present in the dictionary.
Note
The rotation_scheme property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named rotation_scheme (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
strict¶
Whether to enforce the time window for each rotation frequency (a boolean, defaults to True).
The easiest way to explain the difference between strict and relaxed rotation is using an example:
- If strict is True and the number of hourly backups to preserve is three, only backups created in the relevant time window (the hour of the most recent backup and the two hours leading up to that) will match the hourly frequency.
- If strict is False then the three most recent backups will all match the hourly frequency (and thus be preserved), regardless of the calculated time window.
If the explanation above is not clear enough, here’s a simple way to decide whether you want to customize this behavior:
- If your backups are created at regular intervals and you never miss an interval then the default (True) is most likely fine.
- If your backups are created at irregular intervals then you may want to set strict to False to convince RotateBackups to preserve more backups.
Note
The strict property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
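The difference between strict and relaxed rotation can be sketched for the hourly frequency as follows. This is one reading of the description of the strict property, not the package’s algorithm; the helper name hourly_matches is hypothetical and the sketch keeps the oldest backup in each hour slot.

```python
import datetime


def hourly_matches(timestamps, count, strict=True):
    """Return the timestamps that match the hourly frequency (hypothetical helper)."""
    # One candidate per hour slot: the oldest backup in each slot.
    slots = {}
    for timestamp in sorted(timestamps):
        slot = timestamp.replace(minute=0, second=0, microsecond=0)
        slots.setdefault(slot, timestamp)
    if not slots:
        return []
    if strict:
        # Only slots inside the window that ends at the most recent slot count.
        earliest = max(slots) - datetime.timedelta(hours=count - 1)
        slots = {slot: ts for slot, ts in slots.items() if slot >= earliest}
    # Keep the `count` most recent slots that are left.
    return [slots[slot] for slot in sorted(slots)[-count:]]


irregular = [
    datetime.datetime(2024, 3, 15, 3, 10),
    datetime.datetime(2024, 3, 15, 9, 5),
    datetime.datetime(2024, 3, 15, 12, 20),
]
print(hourly_matches(irregular, 3, strict=True))
print(hourly_matches(irregular, 3, strict=False))
```

With these irregular backups strict rotation matches only the 12:20 backup, because the 03:10 and 09:05 backups fall outside the window from 10:00 to 12:59, while relaxed rotation matches all three.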
timestamp_pattern¶
The pattern used to extract timestamps from filenames (defaults to TIMESTAMP_PATTERN).
The value of this property is a compiled regular expression object. Callers can provide their own compiled regular expression which makes it possible to customize the compilation flags (see the re.compile() documentation for details).
The regular expression pattern is expected to be a Python compatible regular expression that defines the named capture groups ‘year’, ‘month’ and ‘day’ and optionally ‘hour’, ‘minute’ and ‘second’.
String values are automatically coerced to compiled regular expressions by calling coerce_pattern(), in this case only the re.VERBOSE flag is used.
If the caller provides a custom pattern it will be validated to confirm that the pattern contains named capture groups corresponding to each of the required date components defined by SUPPORTED_DATE_COMPONENTS.
Note
The timestamp_pattern property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
rotate_concurrent(*locations, **kw)¶
Rotate the backups in the given locations concurrently.
Parameters:
- locations – One or more values accepted by coerce_location().
- kw – Any keyword arguments are passed on to rotate_backups().
This function uses rotate_backups() to prepare rotation commands for the given locations and then it removes backups in parallel, one backup per mount point at a time.
The idea behind this approach is that parallel rotation is most useful when the files to be removed are on different disks and so multiple devices can be utilized at the same time.
Because mount points are per system rotate_concurrent() will also parallelize over backups located on multiple remote systems.
rotate_backups(location, load_config=True, prepare=False)¶
Rotate the backups in a directory according to a flexible rotation scheme.
Parameters:
- location – Any value accepted by coerce_location().
- load_config – If True (so by default) the rotation scheme and other options can be customized by the user in a configuration file. In this case the caller’s arguments are only used when the configuration file doesn’t define a configuration for the location.
- prepare – If this is True (not the default) then rotate_backups() will prepare the required rotation commands without running them.
Returns: A list with the rotation commands (ExternalCommand objects).
Raises: ValueError when the given location doesn’t exist, isn’t readable or isn’t writable. The third check is only performed when dry run isn’t enabled.
This function binds the main methods of the RotateBackups class together to implement backup rotation with an easy to use Python API. If you’re using rotate-backups as a Python API and the default behavior is not satisfactory, consider writing your own rotate_backups() function based on the underlying collect_backups(), group_backups(), apply_rotation_scheme() and find_preservation_criteria() methods.
load_config_file(location)¶
Load a rotation scheme and other options from a configuration file.
Parameters: location – Any value accepted by coerce_location().
Returns: The configured or given Location object.
collect_backups(location)¶
Collect the backups at the given location.
Parameters: location – Any value accepted by coerce_location().
Returns: A sorted list of Backup objects (the backups are sorted by their date).
Raises: ValueError when the given directory doesn’t exist or isn’t readable.
match_to_datetime(match)¶
Convert a regular expression match to a datetime value.
Parameters: match – A regular expression match object.
Returns: A datetime value.
Raises: exceptions.ValueError when a required date component is not captured by the pattern, the captured value is an empty string or the captured value cannot be interpreted as a base-10 integer.
-
group_backups
(backups)[source]¶ Group backups collected by collect_backups() by rotation frequencies.
Parameters: backups – A set of Backup objects.
Returns: A dict whose keys are the names of rotation frequencies (‘hourly’, ‘daily’, etc.) and whose values are dictionaries. Each nested dictionary contains lists of Backup objects that are grouped together because they belong to the same time unit for the corresponding rotation frequency.
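The nested structure described above can be pictured with a small, self-contained sketch. This is not the library's implementation: the function name `group_by_frequency` and the tuple keys used for the time units are assumptions made for illustration, and plain datetime values stand in for Backup objects.

```python
import collections
import datetime


def group_by_frequency(timestamps):
    """Group datetime values into time units per (hypothetical) rotation frequency."""
    grouped = {
        name: collections.defaultdict(list)
        for name in ("hourly", "daily", "weekly", "monthly", "yearly")
    }
    for ts in sorted(timestamps):
        # ISO week numbers keep weeks that span a year boundary together.
        iso_year, iso_week, _ = ts.isocalendar()
        grouped["hourly"][(ts.year, ts.month, ts.day, ts.hour)].append(ts)
        grouped["daily"][(ts.year, ts.month, ts.day)].append(ts)
        grouped["weekly"][(iso_year, iso_week)].append(ts)
        grouped["monthly"][(ts.year, ts.month)].append(ts)
        grouped["yearly"][(ts.year,)].append(ts)
    return grouped


backups = [
    datetime.datetime(2020, 5, 17, 8, 0),
    datetime.datetime(2020, 5, 17, 12, 0),
    datetime.datetime(2020, 5, 18, 8, 0),
]
grouped = group_by_frequency(backups)
```

Two of the three example timestamps share a day, so they end up in the same list under the ‘daily’ key while occupying separate ‘hourly’ time units.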
-
apply_rotation_scheme
(backups_by_frequency, most_recent_backup)[source]¶ Apply the user defined rotation scheme to the result of group_backups().
Parameters:
- backups_by_frequency – A dict in the format generated by group_backups().
- most_recent_backup – The datetime of the most recent backup.
Raises: ValueError when the rotation scheme dictionary is empty (this would cause all backups to be deleted).
Note
This method mutates the given data structure by removing all backups that should be removed to apply the user defined rotation scheme.
-
find_preservation_criteria
(backups_by_frequency)[source]¶ Collect the criteria used to decide which backups to preserve.
Parameters: backups_by_frequency – A dict in the format generated by group_backups() which has been processed by apply_rotation_scheme().
Returns: A dict with Backup objects as keys and list objects containing strings (rotation frequencies) as values.
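As a rough illustration of the return value described above, the following sketch inverts a grouped dictionary into a mapping from each remaining backup to the frequencies that preserve it. The name `collect_criteria` is hypothetical and strings stand in for Backup objects, so treat it as a sketch of the data shape rather than the library's code.

```python
import collections


def collect_criteria(backups_by_frequency):
    """Map each remaining backup to the rotation frequencies that preserve it."""
    criteria = collections.defaultdict(list)
    for frequency, time_units in backups_by_frequency.items():
        for backups in time_units.values():
            for backup in backups:
                # Record each frequency only once per backup.
                if frequency not in criteria[backup]:
                    criteria[backup].append(frequency)
    return dict(criteria)


# Hypothetical input: what is left after a rotation scheme has been applied.
remaining = {
    "daily": {(2020, 5, 17): ["backup-a"], (2020, 5, 18): ["backup-b"]},
    "monthly": {(2020, 5): ["backup-a"]},
}
criteria = collect_criteria(remaining)
```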
-
-
class
rotate_backups.
Location
(**kw)[source]¶ Location objects represent a root directory containing backups.
Here’s an overview of the Location class:
Superclass: PropertyManager
Special methods: __str__()
Public methods: add_hints(), ensure_exists(), ensure_readable(), ensure_writable() and match()
Properties: context, directory, have_ionice, have_wildcards, is_remote, key_properties, mount_point and ssh_alias
When you initialize a Location object you are required to provide values for the context and directory properties. You can set the values of the context and directory properties by passing keyword arguments to the class initializer.
-
context
[source]¶ An execution context created using executor.contexts.
Note
The context property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named context (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
directory
[source]¶ The pathname of a directory containing backups (a string).
Note
The directory property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named directory (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
have_ionice
[source]¶ True when ionice is available, False otherwise.
Note
The have_ionice property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
have_wildcards
[source]¶ True if directory is a filename pattern, False otherwise.
Note
The have_wildcards property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
mount_point
[source]¶ The pathname of the mount point of directory (a string or None).
If the stat --format=%m ... command that is used to determine the mount point fails, the value of this property defaults to None. This enables graceful degradation on e.g. Mac OS X whose stat implementation is rather bare bones compared to GNU/Linux.
Note
The mount_point property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
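The graceful degradation described for mount_point could look roughly like the sketch below. It uses subprocess from the standard library instead of the executor package that rotate-backups builds on, and the function name `find_mount_point` is an assumption for illustration.

```python
import subprocess


def find_mount_point(directory):
    """Return the mount point of a directory, or None when it can't be determined."""
    try:
        result = subprocess.run(
            ["stat", "--format=%m", directory],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Either stat isn't installed or it rejected the arguments
        # (e.g. a stat implementation without GNU's --format option).
        return None
    return result.stdout.strip() or None


mount_point = find_mount_point("/")
```

Callers then only have to handle two cases, a string or None, regardless of which stat implementation is installed.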
-
is_remote
[source]¶ True if the location is remote, False otherwise.
Note
The is_remote property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
ssh_alias
[source]¶ The SSH alias of a remote location (a string or None).
Note
The ssh_alias property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
key_properties
¶ A list of strings with the names of the key properties.
Overrides key_properties to customize the ordering of Location objects so that they are ordered first by their ssh_alias and second by their directory.
-
ensure_exists
(override=False)[source]¶ Sanity check that the location exists.
Parameters: override – True to log a message, False to raise an exception (when the sanity check fails).
Returns: True if the sanity check succeeds, False if it fails (and override is True).
Raises: ValueError when the sanity check fails and override is False.
-
ensure_readable
(override=False)[source]¶ Sanity check that the location exists and is readable.
Parameters: override – True to log a message, False to raise an exception (when the sanity check fails).
Returns: True if the sanity check succeeds, False if it fails (and override is True).
Raises: ValueError when the sanity check fails and override is False.
-
ensure_writable
(override=False)[source]¶ Sanity check that the directory exists and is writable.
Parameters: override – True to log a message, False to raise an exception (when the sanity check fails).
Returns: True if the sanity check succeeds, False if it fails (and override is True).
Raises: ValueError when the sanity check fails and override is False.
-
add_hints
(message)[source]¶ Provide hints about failing sanity checks.
Parameters: message – The message to the user (a string).
Returns: The message including hints (a string).
When superuser privileges aren’t being used a hint about the --use-sudo option will be added (in case a sanity check failed because we don’t have permission to one of the parent directories).
In all cases a hint about the --force option is added (in case the sanity checks themselves are considered the problem, which is obviously up to the operator to decide).
-
-
class
rotate_backups.
Backup
(**kw)[source]¶ Backup objects represent a rotation subject.
Here’s an overview of the Backup class:
Superclass: PropertyManager
Special methods: __getattr__()
Properties: pathname, timestamp and week
When you initialize a Backup object you are required to provide values for the pathname and timestamp properties. You can set the values of the pathname and timestamp properties by passing keyword arguments to the class initializer.
-
key_properties
= ('timestamp', 'pathname')¶ Customize the ordering of Backup objects.
Backup objects are ordered first by their timestamp and second by their pathname. This class variable overrides key_properties.
-
pathname
[source]¶ The pathname of the backup (a string).
Note
The pathname property is a key_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named pathname (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). Once this property has been assigned a value you are not allowed to assign a new value to the property.
-
timestamp
[source]¶ The date and time when the backup was created (a datetime object).
Note
The timestamp property is a key_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named timestamp (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). Once this property has been assigned a value you are not allowed to assign a new value to the property.
-
rotate_backups.cli
¶
Usage: rotate-backups [OPTIONS] [DIRECTORY, ..]
Easy rotation of backups based on the Python package by the same name.
To use this program you specify a rotation scheme via (a combination of) the --hourly, --daily, --weekly, --monthly and/or --yearly options and the directory (or directories) containing backups to rotate as one or more positional arguments.
You can rotate backups on a remote system over SSH by prefixing a DIRECTORY with an SSH alias and separating the two with a colon (similar to how rsync accepts remote locations).
Instead of specifying directories and a rotation scheme on the command line you can also add them to a configuration file. For more details refer to the online documentation (see also the --config option).
Please use the --dry-run option to test the effect of the specified rotation scheme before letting this program loose on your precious backups! If you don’t test the results using the dry run mode and this program eats more backups than intended you have no right to complain ;-).
Supported options:
Option | Description |
---|---|
-M , --minutely=COUNT |
In a literal sense this option sets the number of “backups per minute” to
preserve during rotation. For most use cases that doesn’t make a lot of
sense :-) but you can combine the --minutely and --relaxed options to
preserve more than one backup per hour. Refer to the usage of the -H ,
--hourly option for details about COUNT . |
-H , --hourly=COUNT |
Set the number of hourly backups to preserve during rotation:
|
-d , --daily=COUNT |
Set the number of daily backups to preserve during rotation. Refer to the
usage of the -H , --hourly option for details about COUNT . |
-w , --weekly=COUNT |
Set the number of weekly backups to preserve during rotation. Refer to the
usage of the -H , --hourly option for details about COUNT . |
-m , --monthly=COUNT |
Set the number of monthly backups to preserve during rotation. Refer to the
usage of the -H , --hourly option for details about COUNT . |
-y , --yearly=COUNT |
Set the number of yearly backups to preserve during rotation. Refer to the
usage of the -H , --hourly option for details about COUNT . |
-t , --timestamp-pattern=PATTERN |
Customize the regular expression pattern that is used to match and extract
timestamps from filenames. PATTERN is expected to be a Python compatible
regular expression that must define the named capture groups ‘year’,
‘month’ and ‘day’ and may define ‘hour’, ‘minute’ and ‘second’. |
-I , --include=PATTERN |
Only process backups that match the shell pattern given by PATTERN . This
argument can be repeated. Make sure to quote PATTERN so the shell doesn’t
expand the pattern before it’s received by rotate-backups. |
-x , --exclude=PATTERN |
Don’t process backups that match the shell pattern given by PATTERN . This
argument can be repeated. Make sure to quote PATTERN so the shell doesn’t
expand the pattern before it’s received by rotate-backups. |
-j , --parallel |
Remove backups in parallel, one backup per mount point at a time. The idea behind this approach is that parallel rotation is most useful when the files to be removed are on different disks and so multiple devices can be utilized at the same time. Because mount points are per system the |
-p , --prefer-recent |
By default the first (oldest) backup in each time slot is preserved. If you’d prefer to keep the most recent backup in each time slot instead then this option is for you. |
-r , --relaxed |
By default the time window for each rotation scheme is enforced (this is
referred to as strict rotation) but the
If the explanation above is not clear enough, here’s a simple way to decide whether you want to customize this behavior or not:
|
-i , --ionice=CLASS |
Use the “ionice” program to set the I/O scheduling class and priority of
the “rm” invocations used to remove backups. CLASS is expected to be one of
the values “idle” (3), “best-effort” (2) or “realtime” (1). Refer to the
man page of the “ionice” program for details about these values. The
numeric values are required by the ‘busybox’ implementation of ‘ionice’. |
-c , --config=FILENAME |
Load configuration from
Any available configuration files are loaded in the order given above, so that sections in user-specific configuration files override sections by the same name in system-wide configuration files. For more details refer to the online documentation. |
-C , --removal-command=CMD |
Change the command used to remove backups. The value of As an example of why you might want to change this, CephFS snapshots are represented as regular directory trees that can be deleted at once with a single ‘rmdir’ command (even though according to POSIX semantics this command should refuse to remove nonempty directories, but I digress). |
-u , --use-sudo |
Enable the use of “sudo” to rotate backups in directories that are not readable and/or writable for the current user (or the user logged in to a remote system over SSH). |
-S , --syslog=CHOICE |
Explicitly enable or disable system logging instead of letting the program figure out what to do. The values ‘1’, ‘yes’, ‘true’ and ‘on’ enable system logging whereas the values ‘0’, ‘no’, ‘false’ and ‘off’ disable it. |
-f , --force |
If a sanity check fails an error is reported and the program aborts. You
can use --force to continue with backup rotation instead. Sanity checks
are done to ensure that the given DIRECTORY exists, is readable and is
writable. If the --removal-command option is given then the last sanity
check (that the given location is writable) is skipped (because custom
removal commands imply custom semantics). |
-n , --dry-run |
Don’t make any changes, just print what would be done. This makes it easy to evaluate the impact of a rotation scheme without losing any backups. |
-v , --verbose |
Increase logging verbosity (can be repeated). |
-q , --quiet |
Decrease logging verbosity (can be repeated). |
-h , --help |
Show this message and exit. |
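The difference between the default behavior and --prefer-recent, as described in the table above, comes down to which end of a time slot is kept. The sketch below only illustrates that single choice; the function name `pick_backup` and the string timestamps are assumptions, and the actual rotation logic involves more than this.

```python
def pick_backup(time_slot, prefer_recent=False):
    """Pick the backup to preserve from one time slot.

    The time_slot argument is a list of sortable values (e.g. timestamps).
    """
    candidates = sorted(time_slot)
    # Oldest by default, newest when the caller prefers recent backups.
    return candidates[-1] if prefer_recent else candidates[0]


slot = ["2020-05-17T08:00", "2020-05-17T12:00", "2020-05-17T16:00"]
oldest = pick_backup(slot)
newest = pick_backup(slot, prefer_recent=True)
```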
Change log¶
The change log lists notable changes to the project:
Changelog¶
The purpose of this document is to list all of the notable changes to this project. The format was inspired by Keep a Changelog. This project adheres to semantic versioning.
- Release 8.1 (2020-05-17)
- Release 8.0 (2020-02-18)
- Release 7.2 (2020-02-14)
- Release 7.1 (2020-02-13)
- Release 7.0 (2020-02-12)
- Release 6.0 (2018-08-03)
- Release 5.3 (2018-08-03)
- Release 5.2 (2018-04-27)
- Release 5.1 (2018-04-27)
- Release 5.0 (2018-03-29)
- Release 4.4 (2017-04-13)
- Release 4.3.1 (2017-04-13)
- Release 4.3 (2016-10-31)
- Release 4.2 (2016-08-05)
- Release 4.1 (2016-08-05)
- Release 4.0 (2016-07-09)
- Release 3.5 (2016-07-09)
- Release 3.4 (2016-07-09)
- Release 3.3 (2016-07-09)
- Release 3.2 (2016-07-08)
- Release 3.1 (2016-04-13)
- Release 3.0 (2016-04-13)
- Release 2.3 (2015-08-30)
- Release 2.2 (2015-07-19)
- Release 2.1 (2015-07-19)
- Release 2.0 (2015-07-19)
- Release 1.1 (2015-07-19)
- Release 1.0 (2015-07-19)
- Release 0.1.2 (2015-07-15)
- Release 0.1.1 (2014-07-03)
- Release 0.1 (2014-07-03)
Release 8.1 (2020-05-17)¶
- Bug fix to really make the ‘hour’, ‘minute’ and ‘second’ capture groups in user defined timestamp patterns optional (this fixes issue #26).
- Fixed humanfriendly 8 deprecation warnings.
Release 8.0 (2020-02-18)¶
This is a bit of an awkward release:
- An ImportError was reported in issue #24 caused by a backwards incompatible change in humanfriendly concerning an undocumented module level variable (shouldn’t have used that).
- I’ve now updated rotate-backups to be compatible with the newest release of humanfriendly, however in the meantime that package dropped support for Python 3.4.
- This explains how a simple bug fix release concerning two lines in the code base triggered a major version bump because compatibility is changed.
- While I was at it I set up Python 3.8 testing on Travis CI which seems to work fine, so I’ve documented Python 3.8 as compatible. Python 3.9 seems to be a whole other story, I’ll get to that soon.
Release 7.2 (2020-02-14)¶
Merged pull request #23 which makes it possible to customize the regular expression that’s used to match timestamps in filenames using a new command line option rotate-backups --timestamp-pattern.
The pull request wasn’t exactly complete (the code couldn’t have run as written, although it showed the general idea clearly enough) so I decided to treat #23 as more of a feature suggestion. However there was no reason not to merge the pull request and use it as a base for my changes, which is why I decided to do so despite rewriting the code.
Changes from the pull request:
- Renamed timestamp to timestamp_pattern to make it less ambiguous.
- Added validation that custom patterns provided by callers define named capture groups corresponding to the required date components (year, month and day).
- Rewrote the mapping from capture groups to datetime.datetime arguments as follows:
  - Previously positional datetime.datetime arguments were used which depended on the order of capture groups in the hard coded regular expression pattern to function correctly.
  - Now that users can define their own patterns, this is no longer a reasonable approach. As such the code now constructs and passes a dictionary of keyword arguments to datetime.datetime.
- Updated the documentation and the command line interface usage message to describe the new command line option and configuration file option.
- Added tests for the new behavior.
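The keyword argument approach described in this release can be sketched as follows. The pattern and the function name `to_datetime` are illustrative assumptions rather than the code in rotate-backups; the point is that named capture groups make the mapping independent of group order, and that the time components can be left out.

```python
import datetime
import re

# Illustrative pattern only; not necessarily the default used by rotate-backups.
PATTERN = re.compile(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
    r"(?:_(?P<hour>\d{2})-(?P<minute>\d{2}))?"
)


def to_datetime(match):
    """Convert a match with named date groups to a datetime value."""
    kw = {}
    captures = match.groupdict()
    for name in ("year", "month", "day"):
        value = captures.get(name)
        if not value:
            raise ValueError("required component %r not captured" % name)
        kw[name] = int(value, 10)
    for name in ("hour", "minute", "second"):
        value = captures.get(name)
        if value:
            # Optional components are only passed on when they were captured.
            kw[name] = int(value, 10)
    return datetime.datetime(**kw)


stamp = to_datetime(PATTERN.search("backup-2020-02-14_09-30.tar"))
date_only = to_datetime(PATTERN.search("backup-2020-02-14.tar"))
```

A pattern that omits one of the required groups, or captures an empty string for it, makes the function raise ValueError instead of silently producing a wrong date.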
Release 7.1 (2020-02-13)¶
- Make it possible to disable system logging using rotate-backups --syslog=false (fixes #20).
- Explicitly support numeric ionice classes (as required by busybox and suggested in #14):
  - This follows up on a pull request to executor (a dependency of rotate-backups) that was merged in 2018.
  - Since that pull request was merged this new “feature” has been implicitly supported by rotate-backups by upgrading the installed version of the executor package, however this probably wasn’t clear to anyone who’s not a Python developer 😇.
  - I’ve now merged pull request #14 which adds a test to confirm that numeric ionice classes are supported.
  - I also bumped the executor requirement and updated the usage instructions to point out that numeric ionice classes are now supported.
Release 7.0 (2020-02-12)¶
Significant changes:
- Sanity checks are done to ensure the directory with backups exists, is readable and is writable. However #18 made it clear that such sanity checks can misjudge the situation, which made me realize an escape hatch should be provided. The new --force option makes rotate-backups continue even if sanity checks fail.
- Skip the sanity check that the directory with backups is writable when the --removal-command option is given (because custom removal commands imply custom semantics, see #18 for an example).
Miscellaneous changes:
- Start testing on Python 3.7 and document compatibility.
- Dropped Python 2.6 (I don’t think anyone still cares about this 😉).
- Copied Travis CI workarounds for MacOS from humanfriendly.
- Updated Makefile to use Python 3 for local development.
- Bumped copyright to 2020.
Release 6.0 (2018-08-03)¶
This is a bug fix release that changes the behavior of the program, and because rotate-backups involves the deletion of important files I’m considering this a significant change in behavior that deserves a major version bump…
It was reported in issue #12 that filenames that match the filename pattern but contain digits with invalid values for the year/month/day/etc fields would cause a ValueError exception to be raised.
Starting from this release these filenames are ignored instead, although a warning is logged to make sure the operator understands what’s going on.
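The "ignore and warn" behavior can be illustrated with a short sketch. The pattern, the logger and the function name `parse_names` are assumptions chosen for the example, so this shows the general technique rather than the code that shipped in this release.

```python
import datetime
import logging
import re

logger = logging.getLogger(__name__)

# Illustrative pattern; an assumption, not necessarily what rotate-backups uses.
PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_names(filenames):
    """Return (filename, datetime) pairs, skipping names with invalid dates."""
    parsed = []
    for name in filenames:
        match = PATTERN.search(name)
        if not match:
            continue
        try:
            stamp = datetime.datetime(*map(int, match.groups()))
        except ValueError as e:
            # The digits matched the pattern but don't form a valid date.
            logger.warning("Ignoring %s due to invalid date (%s)", name, e)
            continue
        parsed.append((name, stamp))
    return parsed


parsed = parse_names(["etc-2018-08-03.tar", "etc-2018-13-45.tar", "notes.txt"])
```

Only the first filename survives: the second one matches the pattern but month 13 is rejected by datetime, and the third doesn’t match at all.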
Release 5.3 (2018-08-03)¶
- Merged pull request #11 which introduces the --use-rmdir option with the suggested use case of removing CephFS snapshots.
- Replaced --use-rmdir with --removal-command=rmdir (more general).
Release 5.2 (2018-04-27)¶
- Added support for filename patterns in configuration files (#10).
- Bug fix: Skip human friendly pathname formatting for remote backups.
- Improved documentation using property_manager.sphinx module.
Release 5.1 (2018-04-27)¶
Release 5.0 (2018-03-29)¶
The focus of this release is improved configuration file handling:
Refactor configuration file handling (backwards incompatible). These changes are backwards incompatible because of the following change in semantics between the logic that was previously in rotate-backups and has since been moved to update-dotdee:
- Previously only the first configuration file that was found in a default location was loaded (there was a ‘break’ in the loop).
- Now all configuration files in default locations will be loaded.
My impression is that this won’t bite any unsuspecting users, at least not in a destructive way, but I guess only time and a lack of negative feedback will tell :-p.
Added Python 3.6 to supported versions.
Include documentation in source distributions.
Change theme of Sphinx documentation.
Moved test helpers to humanfriendly.testing.
Release 4.4 (2017-04-13)¶
Moved ionice support to executor.
Release 4.3.1 (2017-04-13)¶
Restore Python 2.6 compatibility by pinning simpleeval dependency.
While working on an unreleased Python project that uses rotate-backups I noticed that the tox build for Python 2.6 was broken. Whether it’s worth it for me to keep supporting Python 2.6 is a valid question, but right now the readme and setup script imply compatibility with Python 2.6 so I feel half obliged to ‘fix this issue’ :-).
Release 4.3 (2016-10-31)¶
Added MacOS compatibility (#6):
- Ignore stat --format=%m failures.
- Don’t use ionice when not available.
Release 4.2 (2016-08-05)¶
Release 4.1 (2016-08-05)¶
- Enable choice for newest backup per time slot (#5).
- Converted RotateBackups attributes to properties (I ❤ documentability :-).
- Renamed ‘constructor’ to ‘initializer’ where applicable.
- Simplified the rotate_backups.cli module a bit.
Release 4.0 (2016-07-09)¶
Added support for concurrent backup rotation.
Release 3.5 (2016-07-09)¶
- Use key properties on Location objects.
- Bring test coverage back up to >= 90%.
Release 3.4 (2016-07-09)¶
Added support for expression evaluation for retention periods.
Release 3.3 (2016-07-09)¶
Started using verboselogs.
Release 3.2 (2016-07-08)¶
Added support for Python 2.6 :-P.
By switching to the key_property support added in property-manager 2.0 I was able to reduce code duplication and improve compatibility:
6 files changed, 20 insertions(+), 23 deletions(-)
This removes the dependency on functools.total_ordering and to the best of my knowledge this was the only Python >= 2.7 feature that I was using so out of curiosity I changed tox.ini to run the tests on Python 2.6 and indeed everything worked fine! :-)
Refactored the makefile and setup.py script (checkers, docs, wheels, twine, etc).
Release 3.1 (2016-04-13)¶
Implement relaxed rotation mode, adding a --relaxed option (#2, #3).
Release 3.0 (2016-04-13)¶
- Support for backup rotation on remote systems.
- Added Python 3.5 to supported versions.
- Added support for -q, --quiet command line option.
- Delegate system logging to coloredlogs.
- Improved rotate_backups.load_config_file() documentation.
- Use humanfriendly.sphinx module to generate documentation.
- Configured autodoc to order members based on source order.
Some backwards incompatible changes slipped in here, e.g. removing Backup.__init__() and renaming Backup.datetime to Backup.timestamp.
In fact the refactoring that I’ve started here isn’t finished yet, because the separation of concerns between the RotateBackups, Location and Backup classes doesn’t make a lot of sense at the moment and I’d like to improve on this. Rewriting projects takes time though :-(.
Release 2.3 (2015-08-30)¶
Add/restore Python 3.4 compatibility.
It was always the intention to support Python 3 but a couple of setbacks made it harder than just “flipping the switch” before now :-). This issue was reported here: https://github.com/xolox/python-naturalsort/issues/2.
Release 2.2 (2015-07-19)¶
Added support for configuration files.
Release 2.1 (2015-07-19)¶
Bug fix: Guard against empty rotation schemes.
Release 2.0 (2015-07-19)¶
Backwards incompatible: Implement a new Python API.
The idea is that this restructuring will make it easier to re-use (parts of) the rotate-backups package in my other Python projects.
Release 1.1 (2015-07-19)¶
Merged pull request #1: Add include/exclude filters.
I made significant changes while merging this (e.g. the short option for the include list and the use of shell patterns using the fnmatch module) and I added tests to verify the behavior of the include/exclude logic.
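Shell pattern filtering with the fnmatch module can be sketched like this. The function name `select_backups` and the choice to let exclude patterns win over include patterns are assumptions made for the example, so check the actual behavior with a dry run before relying on it.

```python
import fnmatch


def select_backups(names, include=(), exclude=()):
    """Filter names with shell patterns; exclude patterns take precedence here."""
    selected = []
    for name in names:
        # Without include patterns every name is a candidate.
        if include and not any(fnmatch.fnmatch(name, p) for p in include):
            continue
        if any(fnmatch.fnmatch(name, p) for p in exclude):
            continue
        selected.append(name)
    return selected


names = ["laptop-2015-07-19", "server-2015-07-19", "server-2015-07-18.partial"]
selected = select_backups(names, include=["server-*"], exclude=["*.partial"])
```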
Release 1.0 (2015-07-19)¶
- Started working on a proper test suite.
- Split the command line interface from the Python API.
- Prepare for API documentation on Read The Docs.
- Switch from py_modules=[...] to packages=find_packages() in setup.py.
Release 0.1.2 (2015-07-15)¶
- Bug fix for -y, --yearly command line option mapping.
- Fixed some typos (in the README and a comment in setup.py).
Release 0.1.1 (2014-07-03)¶
- Added missing dependency.
- Removed Sphinx-isms from README (PyPI doesn’t like it, falls back to plain text).
Release 0.1 (2014-07-03)¶
Initial commit (not very well tested yet).