Welcome to Platform Migrator’s documentation!¶
Platform migrator is a tool which helps with migrating software from system to another. Its main aim is to ease the process of installing dependencies required for the software. It interfaces with various package managers to install the required dependencies and runs tests to identify whether the installation was succesful or not.
Abstract¶
Currently, one of the major problems in software development and maintenance, specially in academia, is managing packages across time and systems. An application developed under a particular package manager using a certain set of packages does not always work reliably when ported to a different system or when abandoned for a period of time and picked up again with newer versions of the packages. In this report, we provide and describe Platform Migrator, a software that makes it easy to test applications across systems by identifying various packages in the base system, figuring out their corresponding equivalents in the new system and testing whether the software works as expected on the new system. Platform migrator can migrate software written and set up inside a conda environment to any Linux based system with conda or some other package manager. The philosophy of platform migrator is to identify a closure of the required dependencies for the software being migrated using the conda environment metadata and then use that closure to install the various dependencies on the target system. This documentation provides comprehensive details on how to use platform migrator and what it does internally to migrate software from one system to another. It also contains tutorials and case studies that can be replicated for better understanding of the process.
Platform Migrator Introduction¶
Platform migrator is a tool which helps with migrating software from system to another. Its main aim is to ease the process of installing dependencies required for the software. It interfaces with various package managers to install the required dependencies and runs tests to identify whether the installation was successful or not.
For platform migrator, a system is defined by a set of package managers provided as part of a configuration file. This set may include some or all of the package managers available on the machine. In case only some package managers are used, the system is defined only as the set of those package managers. This can be used to migrate dependencies of a software from one package manager to another, on the same machine.
The two systems could also be different OSes or running in a VM. As long as there is a package manager available, platform migrator can try to port software across the systems. The rest of the documentation uses the terms base system and target system to refer to these systems:
- base system
- This is the system on which the software is currently installed within a conda environment. The current version of platform migrator requires that there is a conda installation which can provide the list of packages used by the software. The software itself also must be available in an executable form, either as a compiled object or as source code.
- target system
- This is the system to which the software will be migrated. Platform migrator should be installed on this system. Conda is not required on this system, but there must be some package manager present that the user can use to install software.
The base system and the target system must be able to connect with each other over a network through HTTP or the user of the base system must also install platform migrator and email the zip file produced over to the target system.
Installation¶
Platform migrator requires Python >= 3.5 to be available on the target system. It can be installed directly from pip on the target system
pip install --user platform-migrator
Or, if you want to install a development version
pip install --user git+https://gitlab.com/mmc691/platform-migrator.git
- Note:
- On some systems, you may need to specify
pip3
instead ofpip
to make sure that Python 3 is used. Runpip --version
to check which version of Python the default pip installation uses.
If pip is not available, it can be installed by cloning the git repository or downloading a zip file from Gitlab repository and executing
git clone https://gitlab.com/mmc691/platform-migrator
cd platform-migrator
python3 setup.py install --user
Execution¶
This provides a basic overview of how platform migrator works. See the tutorials and the internals documentation for the full details on how to use platform migrator.
The whole process is executed in 4 main steps:
Start the server on the target system with
platform-migrator server start
. This starts an HTTP server which by default listens on localhost:9001.On the base system, execute the following on the command line:
curl http://<server-name>:<server-port>/migrate > script.py python script.py
The server name and port are the hostname and port on which the server on the target system is listening. See
base_sys_script
documentation for details on what the script does.Generate the test configuration file on the target system. See the Writing test configuration files section on how to do this.
Run
platform-migrator migrate <name> <config>
on the target system. See the Tutorials and Case Studies section for detailed description about this step.
Alternately, if a network connection cannot be setup between the systems, the software can be transported over email as a zip file. In this case, platform migrator must be installed on the base system as well as the target system. The steps will be:
- Package the software on the base system using
platform-migrator pack
and send the zip file created by platform migrator to the target system. - Unpack the zip file on the target system using
platform-migrator unpack <zip-file>
. - Generate the test configuration file on the target system. See the Writing test configuration files section on how to do this.
- Run
platform-migrator migrate <name> <config>
on the target system. See the Tutorials and Case Studies section for detailed description about this step.
If the migration was successful, the software will be saved in the configured output directory.
Supported Package Managers¶
Out of the box, platform migrator supports conda as a package manager on any OS. If you wish to use platform migrator only with conda, you can skip this section. For Linux distros, pip, pacman, apt-get and aptitude are also supported out of the box. For Mac OS, pip is supported out of the box. For Windows, pip is supported in a POSIX shell environment like Cygwin.
In case of pip, pip2 and pip3 options are provided as well to explicitly use pip for a specific Python version. Using pip as the package manager will default to whichever version of pip is installed as the default. The other version of pip must be installed before it can be used.
However, platform migrator allows you to configure your own package manager if you do not wish to use any of the package managers listed above. See Package manager config file and Configuring an alternate package manager for complete instructions for that.
Git repo and other links¶
In case of bugs or questions, please create an issue through Gitlab.
Command Line Interface (CLI)¶
Command line arguments and options¶
On a target system, platform migrator executes in server mode or in migrate
mode. In server mode, it accepts commands to control the HTTP server,
interacts with the base system to receive the software and prepares it for
migration. In migrate mode, it tries to migrate a software that has been
prepared using either the server mode or from a zip file via the pack
and unpack
commands described below.
Additionally pack
and unpack
commands are available that can be used to
pack and unpack a zip file for transfer over email or some means other than the
platform migrator server. The pack
command is run on the base system and it
executes the client side of the server mode, while the unpack
command is
run on the target system to prepare the software for migration. These are
helper commands which can be used as a replacement for the server mode, when it
is not possible to connect the two systems over a network.
The usage for server mode is
platform-migrator server [-h] [--host HOST] [--port PORT] {start,stop,status}
commands:
start Starts a simple http server on specified host and port
stop Stops the currently running server
status Prints whether a server is currently running or not
optional arguments:
-h, --help show this help message and exit
--host HOST Hostname or IP address to listen on (Default: localhost)
--port PORT Port number to listen on (Default: 9000)
The usage for migrate mode is
platform-migrator migrate [-h] [--skip-tests] [--pck-mgr-configs PCK_MGR_CONFIGS] name test_files [test_files ...]
positional arguments:
name Name of package to migrate
test_files One or more test configuration files for migration
optional arguments:
-h, --help show this help message and exit
--skip-tests Skip tests but install dependencies and copy files
--pck-mgr-configs PCK_MGR_CONFIGS
Config file for additional package managers
The usage for pack mode is
platform-migrator pack [-h] [-n NAME] [-d DIR]
Pack the software in a zip file for emailing
optional arguments:
-h, --help show this help message and exit
-n NAME, --name NAME Name of the software to migrate
-d DIR, --dir DIR Directory containing the software
The usage for unpack mode is
platform-migrator unpack [-h] zip_file
Unpack a zip file
positional arguments:
zip_file The zip file to unpack
optional arguments:
-h, --help show this help message and exit
Additionally, when receiving the script from the server to run, the script
takes the same command line as platform-migrator pack
. It can be executed
as
python script.py [-h] [-n NAME] [-d DIR]
Transfer the software and additional data to the target system
optional arguments:
-h, --help show this help message and exit
-n NAME, --name NAME Name of the software to migrate
-d DIR, --dir DIR Directory containing the software
Environment variables¶
Platform migrator uses CONDA_HOME or CONDA_ROOT_DIR environment variables to
determine the location of conda installation. These should point to the top
level directory under which conda is installed. So, if conda is installed in
the home directory of user bob, the value for either variable should be
/home/bob/anaconda
.
Tutorials¶
The following tutorials describe some use cases of platform migrator in moving software from one system to another.
Migrating from conda to conda¶
This is the simplest and most straightforward use case of platform migrator. The requirements for this use case are:
- Software to be migrated
- The software must be in an executable format (either binary or source code)
- There should be a set of tests that can be run to determine whether the software works as intended or not.
- Base System
- The base system must contain a dedicated conda environment for the software. This may be the conda root environment.
- Access to internet
- Target System
- The target system must be a system where the user can install or has already installed conda. The tutorial will cover installation of conda.
- Access to internet
Preparing the target system¶
First, install platform migrator on the target system as per the instructions in Installation. There is no need for any configuration in this use case.
Next, go to Anaconda installation page and install the version of conda for your target system. You can skip this if you already have conda installed on the target system.
Transferring the software¶
There are 2 ways to do this. The end result of both the ways is the same and has no impact on the overall success or failure of the migration. Following any one of the two options below is sufficient to attempt an migration.
Option 1. Over a local network¶
On the target system, start the platform migrator server by executing
platform-migrator server --host 0.0.0.0 start
Here, platform migrator runs in server mode and starts a daemon process, which
is a simple HTTP server which listens on all IP addresses and port 9001. If you
wish to listen only on a particular IP address and port, use the --host
and
--port
options to change them. Regardless, you should note the IP address of
your target system. On Linux or Mac, this can be obtained by running ifconfig
in a terminal and using the IP address on an interface other than the loopback
interface, lo
. On Windows, this can be obtained by running cmd and typing
in ipconfig
. Note the IP address for use on the base system.
Now, on the base system, download a python script from the target system by running
curl http://<IP-address-of-target-system>:9001/migrate > script.py
If curl is not available, activate the conda root environment on the base system and run the following code in a python interpreter
import requests
resp = requests.get('http://<IP-address-of-target-system>:9001/migrate')
with open('script.py', 'w') as fp:
fp.write(resp.text)
exit()
The script is a Python script generated on the target system and it contains some procedures to probe the conda environment for the packages installed and to transfer the software and settings over to the target system.
Make sure that the conda executable is on the PATH of the shell. If it is not,
run export PATH=$PATH:/path/to/conda/bin
to add it to the PATH variable.
Now, activate the conda environment that is used by the software to be
migrated and run python script.py
and follow the prompts on screen. The
script will ask you to enter the name of the software and the directory under
which the source code is saved, if everything goes well. When prompted
regarding saving a zip file for emailing later, enter n
. The prompts can be
avoided by passing the name of the of the software and directory using command
line options -n
and -d
respectively.
Option 2. With email, scp or other manual methods¶
This an alternate way to transfer software, in case network connection cannot be setup between the base and the target system. In this case, the server startup is not required on the target system and server mode is not involved.
To do this, first install platform migrator on the base system, by following
the installation steps. After that, execute platform-migrator pack
. This
will run the same script as above, but when prompted to save a zip file say
y
. A zip file will be saved in the current directory. This can be copied
to the target system using any method like email, scp, ftp or manual copy,
and unpacked there using platform-migrator unpack
. The prompts can be
avoided by passing the name of the of the software and directory using command
line options -n
and -d
respectively.
Installing and testing on the target system¶
Open up a text editor and create a test configuration for the software. A minimal test configuration would look like
[<software-name>]
package_managers = conda
output_dir = <directory-where-you-want-to-save-the-software>
tests = <test-name0>, <test-name1>
[test_<test-name0>]
exec = <command-to-run-test0>
[test_<test-name1>]
exec = <command-to-run-test1>
The <software-name>
should be replaced by the name used on the base system.
The output_dir
should be an absolute path to the directory where the
software must be saved post migration. The tests
option contains a comma
separated list of tests that should be run on the software. For each test in
this list, there must be a corresponding test_<name>
section which contains
a exec
option containing the command to execute. This command must return 0
on success and any other value on failure. The command will be run from inside
the directory in which software is installed and can use paths relative to that.
So, for example, a test called foo
executes a script called tests/foo.sh
inside the software, the test configuration will look like
[<software-name>]
package_managers = conda
output_dir = /tmp
tests = foo
[test_foo]
exec = bash tests/foo.sh
See Writing test configuration files for the complete documentation on this.
Once the tests have been configured, execute
platform-migrator migrate <software-name> <tests-config-file-name>
Now, platform migrator will run in migrate
mode and try to install the
software along with any required dependencies.
Platform migrator will create a new conda environment called <software-name> and run the tests for the software there. If the tests pass, the software will be copied into the output directory. Otherwise, the error message from the tests will be displayed on screen and the environment will be destroyed.
Installing without any tests¶
In case there are not tests available, you use the --skip-tests
option and
platform migrator will create a conda environment and copy the files to the output
directory. A configuration file is still required but it need not contain any
tests.
Configuring an alternate package manager¶
This tutorial walks you through another essential part of platform migrator,
which is configuring it to use a package manage other than conda. These
additional package managers can be added to the default file in
~/.platform-migrator/config/
on POSIX OSes. For Windows, replace ~
with the user’s home directory.
Open up the package-managers.ini
file in a text editor. The file contains
the documentation of the file along with some sample package managers. These
package managers are supported out of the box on Linux distros, but not on Windows
and Mac. On Windows, you will probably need Cygwin or some other POSIX shell
utility for these to be actually useful.
To add a new package manager to the file, create a new section. The section name will be used in the list of package managers in the test configuration file during the migration process, so keep the indicative of the package manager. Typically, this would be the name of the executable of the package manager.
In the section, add the following options and their values as described:
name: | The actual name. This is only for your benefit and not used by platform migrator. |
---|---|
exec: | The executable of the package manager used on the command line |
install: | The command used to install a package where the package name will be at the end of the command. |
install_version: | |
The command used to install a specific version of package. | |
search: | The command used to search the package manager for packages. This will typically require some further text manipulation since most package managers tend to provide human readable output rather than machine parseable output. |
result_fmt: | A regular expression describing the results of the search. |
Read the Package manager config file for details on how the values of each option should be written. An example of Configuring apt-get is also provided in the Case Studies section.
If you do not wish to edit the default file, you can save your configuration in
a separate .ini file in the ~/.platform-migrator
directory and use it in
addition to the default file by using the
--pck-mgr-configs ~/.platform_migrator/<filename>.ini
option during
migration.
Migrating from conda to a different package manager(s)¶
This is a more complex migration which requires some input and knowledge on the user’s end about the software being migrated. The requirements for this use case are:
- Software to be migrated
- The software must be in an executable format (either binary or source code)
- There should be a set of tests that can be run to determine whether the software works as intended or not.
- Base System
- The base system must contain a dedicated conda environment for the software. This may be the conda root environment.
- Access to internet
- Target System
- The target system must contain one or more package managers and platform migrator must be configured to use them.
- Access to internet
Preparing the target system¶
First, install platform migrator on the target system as per the instructions
in Installation. Follow the tutorial in Configuring an alternate package manager and configure a
new package manager. Further instructions will assume that the package manager
has been configured in a separate file called ~/.platform-migrator/site-pkg-mgrs.ini
Transferring the software¶
This part is exactly the same as in the migration from conda to conda. Please refer the Transferring the software section for the documentation.
Installing and testing on the target system¶
Open up a text editor and create a test configuration for the software. A minimal test configuration would look like
[<software-name>]
package_managers = <site-package-manager0>, <site-package-manager1>
output_dir = <directory-where-you-want-to-save-the-software>
tests = <test-name0>, <test-name1>
[test_<test-name0>]
exec = <command-to-run-test0>
[test_<test-name1>]
exec = <command-to-run-test1>
The package_managers
option is a comma-separated list of the package
managers that will be used by platform-migrator for installing the
required packages. The package managers are used in the order entered, so
<site-package-manager0>
is probed before probing <site-package-manager1>
.
The <software-name>
should be replaced by the name used on the base system.
The output_dir
should be an absolute path to the directory where the
software must be saved post migration. The tests
option contains a comma
separated list of tests that should be run on the software. For each test in
this list, there must be a corresponding test_<name>
section which contains
a exec
option containing the command to execute. This command must return 0
on success and any other value on failure. The command will be run from inside
the directory in which software is installed and can use paths relative to that.
So, for example, a test called foo
executes a script called tests/foo.sh
inside the software and the site runs Archlinux with pip installed on it, the
test configuration will look like
[<software-name>]
package_managers = pacman, pip
output_dir = /tmp
tests = foo
[test_foo]
exec = bash tests/foo.sh
Once the tests have been configured, execute
platform-migrator migrate --pck-mgr-configs site-pkg-mgrs.ini <software-name> <tests-config-file-name>
Now, platform migrator will run in migrate
mode and try to install the
software along with any required dependencies.
Platform migrator will prompt the user for each package that needs to be installed based on the conda environment. For each package, there will be an option to skip the installation using that package manager. In this case, the next package manager will be probed, or the package will not be installed in case there are no further package managers to probe. If a package is not found in any of the package managers, the user will be asked whether the installation should be continued or not. After all packages are selected for installation, each package will be installed individually. Hence, a failure in the installation of one package will not prevent other packages from being installed. Once all the packages are installed, the tests will be executed and if successful, the software will be copied to the output directory.
Installing without any tests¶
In case there are not tests available, you use the --skip-tests
option and
platform migrator will follow the above process for installing the packages and
copy the software over to the output directory, if the installations were
completed successfully.
Package manager config file¶
A package manager is configured in a .ini file. One file can contain multiple package managers.
Each package manager is a section and must contain the following keys:
name: The name of the package manager. This can be any string. exec: The package manager executable. If empty, it defaults to the package manager name. install: An install command which takes the package name as the argument. Wildcard %p can be used to indicate the position of package name in the string. Otherwise, it is assumed that the package name is to be added to the end. install_version: An install command which takes package name and version as arguments. Wildcards %p and %v can be used in a format string to indicate the position of the package and version respectively. search: A search command which takes the search expression an input. Wildcard %s can be used to indicate position of the search expression. Otherwise, it is assumes that the expression should go at the end. result_fmt: A Python regexp which will match individual lines in the output returned by the search command. Wildcards %p and %v must be used to indicate the package name and version.
The commands can have the following substitution parameters:
%e: The package manager executable defined in the same section. It may be used only in the search
,install
orinstall_version
options. If this is not present in the option, the command will be executed as is.%p: The package name. It must be used in the install_version
and inresult_fmt
otherwise the options are invalid. It may also be used in theinstall
option, otherwise it will be appended to the theinstall
option by default. It is not valid in other options.%s: A search string. It may be used only in the search
option and will be appended by default if not provided.%v: The package version. It must be used in the install_version
otherwise it is invalid. It may also be used in theresult_fmt
option. It is not valid in any other option.
Writing test configuration files¶
This describes how to write a test configuration file for a package migration. The test configuration can contain tests and settings for multiple software or multiple test configuration files can be used for a particular software.
In other words, multiple test configurations files are read by platform migrator as if they were one file and only those sections which are applicable to a particular software are used.
There are two main types of sections, software sections and test sections. Test
sections have names starting with test_
. All other sections are treated as
software sections. A special section [DEFAULT]
may be used to create some
default options for each software like package manager names and output
directory location.
Software sections¶
A software section is named after the software being migrated and contains the following options:
[<software-name>]
package_managers = <pkg-mgr0>[, <pkg-mgr1>, ...]
output_dir = /an/absolute/path
tests = <test0>[, <test1>, ...]
The package_managers
options is a comma separated list of package managers
that will be used for migrating that particular software. These package manager
names must correspond to section names in the package manager configuration
files being used for the migration.
The output_dir
is an absolute path to the location where the software
should be saved.
tests
contains a comma separated list of tests for the software. They must
be test section names without the test_
prefix and the test sections must
be in one of the test configuration files being used for the migration.
Test sections¶
A test section begins with test_
and contains only one option, exec
:
[test_<test0>]
exec = python some_script.py
[test_<test1>]
exec = ~/test-script.sh
The option must be a command that can be executed by the default shell or an
executable shell script. Any paths in the command must be relative to the
directory which contains the software being migrated or an absolute path. So,
some_script.py
should be directly inside the software’s top level directory
while test-script.sh
is in the home directory.
If the command returns 0 upon execution, the test is considered successful, otherwise it is considered a failure. The output from the tests is always printed out on the terminal.
Case Studies¶
All cases below have been run using conda version 4.3 or above. The cases where the target system is not conda are only recommended for expert users since the system-wide packages may need to be installed and user needs to be aware of what packages to install.
For all case studies, <ip-address>
should be replaced with the IP address
of the target system and <conda-prefix>
should be replaced with the
directory under which anaconda or miniconda is installed (eg.
/home/joe/miniconda3
).
Hello World over network¶
This is a basic use case which demonstrates how you can migrate simple script
from one platform to another over a network. The script used for migration is
a simple Python script that prints ‘Hello world’ and is located at
tests/basic-migration/simple-app/simple_app.py
in the
git repository.
Two different scenarios, one where the target system is conda and one where the target system is Archlinux were run. The base system used was an Ubuntu 16.04 OS running inside VirtualBox, but it can be any system running conda and bash shell. Miniconda3 was installed on the base system by following the instructions at the Anaconda installation page
Target system with conda¶
First, a conda environment was created on the base system with
<conda-prefix>/bin/conda create -n simple-app python=3
. <conda-prefix>/bin/activate simple-app
The script was created in the home directory with
mkdir -pv simple-app
echo "print('Hello world')" > simple-app/simple_app.py
On the target system, the platform migrator was installed and the server was started with
pip3 install --user git+ssh://git@gitlab.com/mmc691/platform-migrator.git
platform-migrator server --host <IP-address> start
Now, back on the base system, the script was transferred with
curl http://<IP-address>:9001/migrate > script.py
python script.py
When prompted, for the application name, simple-app
was entered, and for the
directory, simple-app
was entered. A zip file was not created by entering
n
on the prompt and the software was transferred over the network.
- Note:
- The
python script.py
command can also be run aspython script.py -n simple-app -d simple-app/
to avoid the prompts.
After this, on the target system, a test configuration was created with the
below contents and saved as test-config.ini
cat > test-config.ini << EOF
[simple-app]
package_managers = conda
output_dir = /tmp
tests = hello-world
[test_hello-world]
exec = python simple_app.py
EOF
- Note:
- If you are trying to reproduce the case, you will have to manually create this file in a text editor. It is not generated by platform migrator.
Platform migrator was then executed with the following command
CONDA_HOME=<conda-prefix> platform-migrator migrate simple-app test-config.ini
It was verified that the package is available in /tmp
by executing
ls -l /tmp
ls -l /tmp/simple-app
Target system with pacman¶
The initial steps of installing conda on the base system and platform migrator
on the target system were completed as described above. The script was also
transferred to the base system using curl command and was executed. However, on
being prompted for the application name, simple-app-pacman
was entered to
avoid conflicts with the previous case. On the target system, the previously
created test-config.ini
was edited and the following content was added to
it
[simple-app-pacman]
package_managers = pacman
output_dir = /tmp
tests = hello-world
Platform migrator was then executed with the following command
platform-migrator migrate simple-app-pacman test-config.ini
Since the script only requires Python, all packages other than python
were
skipped from installation. The tests were run and it was verified that the
software was available in /tmp
with
ls -l /tmp
ls -l /tmp/simple-app-pacman
This use case is available as a regression test in the git repository as
tests/basic-migration/test_simple_app.py
and the test config file is
available as tests/basic-migration/test-config.ini
. The regression test
uses two different conda environments on the local machine instead of a VM or
a remote machine.
Hello World using zip file¶
This is a basic use case which demonstrates how you can migrate simple script
from one platform to another using a zip file. The script used for migration
is a simple Python script that prints ‘Hello world’ and is located at
tests/basic-migration/simple-app/simple_app.py
in the
git repository.
Two different scenarios, one where the target system is conda and one where the target system is Archlinux were run. The base system used was an Ubuntu 16.04 OS running inside VirtualBox, but it can be any system running conda and bash shell. Miniconda3 was installed on the base system by following the instructions at the Anaconda installation page
Target system with conda¶
First, a conda environment was created on the base system with
<conda-prefix>/bin/conda create -n simple-app python=3
. <conda-prefix>/bin/activate simple-app
The script was created in the home directory with
mkdir -pv simple-app
echo "print('Hello world')" > simple-app/simple_app.py
Next, platform migrator was installed and the zip file for the software was created with
pip3 install --user git+ssh://git@gitlab.com/mmc691/platform-migrator.git
platform-migrator pack
When prompted, for the application name, simple-app
was entered, and for the
directory, simple-app
was entered. A zip file was created by entering y
on the prompt and it was copied manually to the target system.
- Note:
- The
platform-migrator pack
command can also be run asplatform-migrator pack -n simple-app -d simple-app/
to avoid the prompts.
Now, on the target system, platform migrator was installed and the software was setup for migration using
pip3 install --user git+ssh://git@gitlab.com/mmc691/platform-migrator.git
platform-migrator unpack simple-app.zip
After this, on the target system, a test configuration was created with the
below contents and saved as test-config.ini
cat > test-config.ini << EOF
[simple-app]
package_managers = conda
output_dir = /tmp
tests = hello-world
[test_hello-world]
exec = python simple_app.py
EOF
- Note:
- If you are trying to reproduce the case, you will have to manually create this file in a text editor. It is not generated by platform migrator.
Platform migrator was then executed with the following command
CONDA_HOME=<conda-prefix> platform-migrator migrate simple-app test-config.ini
It was verified that the package is available in /tmp
by executing
ls -l /tmp
ls -l /tmp/simple-app
Target system with pacman¶
The initial steps of installing conda and platform migrator on the base system
were completed as described above. A zip file was created once again by using
the platform-migrator pack
command and was copied to the target system. On
being prompted for the application name, simple-app-pacman
was entered to
avoid conflicts with the previous case. Again as described above, platform
migrator was installed on the target system and the software was unpacked using
platform-migrator unpack
.
On the target system, the previously created test-config.ini
was edited
and the following content was added to it
[simple-app-pacman]
package_managers = pacman
output_dir = /tmp
tests = hello-world
Platform migrator was then executed with the following command
platform-migrator migrate simple-app-pacman test-config.ini
Since the script only requires Python, all packages other than python
were
skipped from installation. The tests were run and it was verified that the
software was available in /tmp
with
ls -l /tmp
ls -l /tmp/simple-app-pacman
This use case is available as a regression test in the git repository as
tests/basic-migration/test_simple_app.py
and the test config file is
available as tests/basic-migration/test-config.ini
. The regression test
uses two different conda environments on the local machine instead of a VM or
a remote machine.
scikit-learn and scikit-image over network¶
Similar to the Hello World case, this case was run from an Ubuntu 16.04 VM system, once with conda and once with pacman and pip as the target package managers.
The script used for the software is available as tests/basic-migration/scikit-app/scikit_app.py
in the git repo. It was saved on the base system as ~/scikit-app/scikit_app.py
.
Target system with conda¶
First, a conda environment was created on the base system with
<conda-prefix>/bin/conda create -n scikit-app python=2 scikit-learn scikit-image
. <conda-prefix>/bin/activate scikit-app
The script was created in the home directory with
git clone https://gitlab.com/mmc691/platform-migrator
cp -Rv platform-migrator/tests/basic-migration/scikit-app .
On the target system, platform migrator was installed and the server was started with
pip3 install --user git+ssh://git@gitlab.com/mmc691/platform-migrator.git
platform-migrator server --host <IP-address> start
Now, back on the base system, the script was transferred using
curl http://<IP-address>:9001/migrate > script.py
python script.py
When prompted, for the application name, scikit-app
was entered, and for the
directory, scikit-app
was entered. A zip file was not created by entering
n
on the prompt and the software was transferred over the network.
- Note:
- The
python script.py
command can also be run aspython script.py -n scikit-app -d scikit-app/
to avoid the prompts.
Since pip is listed before pacman, packages are first searched for in pip and only if the user decides not to install from pip, are they searched for in pacman. Also, pip2 was explicitly specified since it is known beforehand that the code works only for Python 2. This requires that pip2 is installed on the target system prior to using platform migrator.
Platform migrator was then executed with the following command
platform-migrator migrate scikit-app-pacman test-config.ini
If everything is successful, the application script will be installed under
/tmp
.
Target system with aptitude and pip¶
The initial steps of installing conda on the base system and platform migrator
on the target system were completed as described above. The script was also
transferred to the base system using curl command and was executed. However, on
being prompted for the application name, scikit-app-apt
was entered to
avoid conflicts with the previous case. On the target system, the previously
created test-config.ini
was edited and the following content was added to
it
[simple-app-apt]
package_managers = pip2, aptitude
output_dir = /tmp
tests = scikit-app
Since pip is listed before aptitude, packages are first searched for in pip and only if the user decides not to install from pip, are they searched for in aptitude.
Platform migrator was then executed with the following command
platform-migrator migrate scikit-app-apt test-config.ini
If everything is successful, the application script will be installed under
/tmp
.
The conda portion of this use case is available as a regression test in the git
repository as tests/basic-migration/test_scikit_app.py
and the test config
file is available as tests/basic-migration/test-config.ini
. The regression
test uses two different conda environments on the local machine instead of a VM
or a remote machine.
scikit-learn and scikit-image using zip file¶
This case study is exactly the same as the previous one, except the software
was sent over to the target system as a zip file created using
platform-migrator pack
. This was also run from an Ubuntu 16.04 VM system,
once with conda and once with pacman and pip as the target package managers.
The script used for the software is available as
tests/basic-migration/scikit-app/scikit_app.py
in the git repo. It was
saved on the base system as ~/scikit-app/scikit_app.py
.
Target system with conda¶
First, platform migrator was installed on the base system with
pip3 install --user platform-migrator
Next, a conda environment was created on the base system (same as above) with
<conda-prefix>/bin/conda create -n scikit-app python=2 scikit-learn scikit-image
. <conda-prefix>/bin/activate scikit-app
The script was created in the home directory with
git clone https://gitlab.com/mmc691/platform-migrator
cp -Rv platform-migrator/tests/basic-migration/scikit-app .
Then, the a zip file was created using
platform-migrator pack
When prompted, for the application name, scikit-app
was entered, and for the
directory, ~/scikit-app
was entered. Here, a zip file was generated by
entering y
when prompted to create the file. The zip file was then emailed
to the target system.
- Note:
- The
platform-migrator pack
command can also be run asplatform-migrator pack -n scikit-app -d scikit-app/
to avoid the prompts.
After this, on the target system, a test configuration was created with the
below contents and saved as test-config.ini
cat > test-config.ini << EOF
[scikit-app]
package_managers = conda
output_dir = /tmp
tests = scikit-app
[test_scikit-app]
exec = python2 scikit_app.py
EOF
- Note:
- If you are trying to reproduce the case, you will have to manually create this file in a text editor. It is not generated by platform migrator.
Platform migrator was then executed with the following command
CONDA_HOME=<conda-prefix> platform-migrator migrate scikit-app test-config.ini
If everything is successful, the application script will be installed under
/tmp
.
Target system with pacman and pip¶
The initial steps of installing conda on the base system and platform migrator
on the target system were completed as described above. The script was also
transferred to the base system using curl command and was executed. However, on
being prompted for the application name, scikit-app-pacman
was entered to
avoid conflicts with the previous case. On the target system, the previously
created test-config.ini
was edited and the following content was added to
it
[simple-app-pacman]
package_managers = pip2, pacman
output_dir = /tmp
tests = scikit-app
On the target system, platform migrator was installed and the zip file was unpacked with
pip3 install --user git+ssh://git@gitlab.com/mmc691/platform-migrator.git
platform-migrator unpack <zip-file>
The rest of the steps are same as when transferring over a network, but are
repeated here for the sake of completeness. After this, on the target system,
a test configuration was created with the below contents and saved as
test-config.ini
cat > test-config.ini << EOF
[scikit-app]
package_managers = conda
output_dir = /tmp
tests = scikit-app
[test_scikit-app]
exec = python2 scikit_app.py
EOF
- Note:
- If you are trying to reproduce the case, you will have to manually create this file in a text editor. It is not generated by platform migrator.
Platform migrator was then executed with the following command
CONDA_HOME=<conda-prefix> platform-migrator migrate scikit-app test-config.ini
If everything is successful, the application script will be installed under
/tmp
.
Target system with pacman and pip¶
The initial steps of installing conda on the base system and platform migrator
on the target system were completed as described above. The zip file was
created and setup to the target system using pack and unpack commands. On
being prompted for the application name, scikit-app-pacman
was entered to
avoid conflicts with the previous case. On the target system, the previously
created test-config.ini
was edited and the following content was added to
it
[simple-app-pacman]
package_managers = pip2, pacman
output_dir = /tmp
tests = scikit-app
Since pip is listed before pacman, packages are first searched for in pip and only if the user decides not to install from pip, are they searched for in pacman. Also, pip2 was explicitly specified since it is known beforehand that the code works only for Python 2. This requires that pip2 is installed on the target system prior to using platform migrator.
Platform migrator was then executed with the following command
platform-migrator migrate scikit-app-pacman test-config.ini
If everything is successful, the application script will be installed under
/tmp
.
Target system with aptitude and pip¶
The initial steps of installing conda on the base system and platform migrator
on the target system were completed as described above. The zip file was
created and setup to the target system using pack and unpack commands. On
being prompted for the application name, scikit-app-apt
was entered to
avoid conflicts with the previous case. On the target system, the previously
created test-config.ini
was edited and the following content was added to
it
[simple-app-apt]
package_managers = pip2, aptitude
output_dir = /tmp
tests = scikit-app
Since pip is listed before aptitude, packages are first searched for in pip and only if the user decides not to install from pip, are they searched for in aptitude.
Platform migrator was then executed with the following command
platform-migrator migrate scikit-app-apt test-config.ini
If everything is successful, the application script will be installed under
/tmp
.
The conda portion of this use case is available as a regression test in the git
repository as tests/basic-migration/test_scikit_app.py
and the test config
file is available as tests/basic-migration/test-config.ini
. The regression
test uses two different conda environments on the local machine instead of a VM
or a remote machine.
OpenAlea¶
This case was only done for conda to conda migration due to the complex dependencies required to install OpenAlea. The base system used was an Ubuntu 14.04 server with miniconda2 installed. The target system used was Archlinux with miniconda3 installed.
First, on the base system, a list of modules to install was obtained by searching the OpenAlea channel using
conda search -c OpenAlea --override-channels openalea.*
conda search -c OpenAlea --override-channels vplants.*
conda search -c OpenAlea --override-channels alinea.*
A conda environment was created on the base system with the above modules using the following commands
conda create -n openalea
. <conda-prefix>/bin/activate openalea
conda install -c OpenAlea <all-openalea-modules> <all-vplants-modules> <all-alinea-modules>
A simple script was created to see if all packages can be imported. This script was used as the application to migrate.
cd ~
mkdir openalea
cat > openalea/test.py <<< EOF
try:
import openalea, vplants, alinea
except ImportError:
print('OpenAlea install failed')
exit(-1)
else:
print('OpenAlea installed succesfully')
exit(0)
EOF
Transferring over network¶
When using the network to transfer the software, the server was started
on the target system by executing
platform-migrator server --host <ip-address> start
After that, the request was made to the platform-migrator server and the environment and the test script were sent over. The conda environment is still active on the base the system.
curl http://<ip-address>:9001/migrate > script.py
python script.py
The details were entered when prompted and a zip file was not created. After this, the steps from the Installation and Testing section below were executed.
- Note:
- The
python script.py
command can also be run aspython script.py -n openalea -d openalea/
to avoid the prompts.
Transferring using zip file¶
When using a zip file to transfer the software, platform migrator was installed
on the base system and the software was packed with
platform-migrator pack -n openalea -d openalea
.
The zip file was manually transferred to the target system using scp. Then, it
was unpacked on the target system with platform-migrator unpack openalea.zip
.
Installation and Testing¶
Now, on the target system, the test-config.ini
file was created with
cat > test-config.ini << EOF
[openalea]
package_managers = conda
output_dir = /tmp
tests = import_openalea
[test_import_openalea]
exec = python test.py
EOF
- Note:
- If you are trying to reproduce the case, you will have to manually create this file in a text editor. It is not generated by platform migrator.
The migration was performed with
platform-migrator migrate openalea test-config.ini
If the conda environment was succesfully created, the test will pass otherwise it will fail.
Configuring apt-get¶
This case study describes how apt-get package manager was configured in the
config/package-managers.ini
file. This is the typical process that should
be followed for configuring a new package manager.
First, a new section called apt-get
was created in the config file. Then,
the various options were added without any values.
[apt-get]
name =
exec =
install =
install_version =
search =
result_fmt =
The name
and exec
options were set to apt-get
since the executable
is called apt-get
.
The installation of a new package is done using
sudo apt-get -y install <package-name>
. So, the install
option was set
to sudo %e install
. The %e
wildcard is replaced by the value in
exec
option and the package name is automatically append to the end of
the command by platform migrator since the %p
wildcard is not specified.
To install a specific version of the package in apt-get, -
is used as a
delimiter. So, the install_version
option was set to
sudo %e apt-get %p-%v
and the %p
and %v
wildcards were used to
specify the position of the package name and version.
Now, packages are searched using the apt-cache
command. Since platform
migrator searched using package name and version only, the -n
flag is
specified to apt-cache search
so that package descriptions are not
searched. However, the results from this still contain a small description
for the package. So, the description needs to trimmed using other tools
typically available on OSes which use apt-get
. First, the results are
piped into awk
with the field delimiter set to ' - '
. The spaces
around the hyphen make sure that the package name and description are split
but the version and build string are part of the name. Only the name is
printed out using '{print $1}'
command. So, the search command now looks
like apt-cache search -n %s | awk -F ' - ' '{print $2}'
. Here, the %s
wildcard replaced by the search expression.
Since the results are now in a format suitable to use for the install command,
each result is treated as a separate package. So, the result_fmt
option is
set to %p
. The final config section looks like
[apt-get]
name = apt-get
exec = apt-get
install = sudo %e -y install
install_version = sudo %e -y install %p-%v
search = apt-cache search -n %s | awk -F ' - ' '{print $2}'
result_fmt = %p
For other default package managers, a similar process was followed.
Process Internals¶
Internal data and files¶
When platform migrator is installed, it creates an internal directory,
.platform_migrator
in the home directory of the user. This directory is
stores all data related to the migrations attempted and also acts as the
initial working directory for the process and the server. Platform migrator
does not put any internal data outside this directory unless instructed to
do so otherwise.
Starting the server¶
When the command platform-migrator server start
is executed, the
main
module, switches the working directory to
.platform_migrator
and executes the script
server
as subprocess and exits.
The server script contains
MigrateRequestHandler
and
PMServer
classes, which are the request
handler and the HTTP server respectively. The request handler implements a
do_GET()
method,
which listens for get requests on the /migrate
route and a
do_POST()
, which
listens for any incoming data on /yml
, /min
and /zip
routes. The
/yml
route is to receive the conda environment YAML file from the base
system, /min
for the minimal dependencies identified, if any, and /zip
to receive a zip of the software itself.
All three POST routes take a JSON input with two attributes, name
and
data
, where name
contains the name of the software being migrated and
data
contains base64 encoded binary data corresponding to the what the
route takes.
On startup, the server first creates a PIDFILE in the working directory, which
is used to track which PID the server is running as, and also creates a copy of
base_sys_script
with the HOST
and PORT
variables updated to the value under which the server is running. This script
is returned as the response when /migrate
a request is received on
/migrate
route. The server now waits for requests.
The server will fail to start in case a PIDFILE already exists. So, only one instance of the server can be running at any point in time. When the server is stopped, it deletes the PIDFILE and the script before exiting.
Probing the base system¶
On the base system, platform migrator probes the conda environment using the
functions in base_sys_script
.
Receiving the probing script¶
This section is only run when transferring using the server. If transferring
using a zip file, the script is executed using platform-migrator pack
command.
The script gets downloaded to the base system when a request is made to the
platform migrator server on the /migrate
path. See the Tutorials for
how to make the request using curl or Python. The script is compatible with
both Python 2 and 3 specifically to allow it to run on older systems that may
not be upgraded to latest version of Python.
When the script is executed, it first tries to identify which conda
environment is active and uses that or prompts the user to enter the name of
a conda environment and tries to source that before. In case the user input
is required, it assumes that the activate
shell script provided by conda
is in the current working directory.
Handling transitive dependencies¶
Once the conda environment is identified, the script will run the
get_conda_min_deps()
function to
identify a closure of conda packages that can be used to minimize the number
of installs on the target system.
First, all the packages installed in the conda environment are obtained using
conda list
command. Then, for each dependency, a dry run of creating a new
environment with that dependency as a target is done. This gives a list of
packages that are required by the dependency. These packages are a second level
dependency to the software being migrated and are removed from the list of
direct dependencies of the software. Once all the dependencies have been
processed, the remaining ones are the direct requirements for the software.
This list is transferred over to the target system as well and only the
packages in this list are installed on the target system. Their transitive
dependencies are automatically installed by the package manager on the target
system.
Collecting info¶
It then prompts the user to enter the name of the software and the directory where it is saved. The conda environment data, direct dependency list and the software are zipped up and sent back to the target system using the POST routes of the server. The script then exits.
Running a migration¶
Once the data from the base system has been transferred over, any number of
migration attempts can be made on it using platform migrator. All data gets
saved in ~/.platform_migrator/<software-name>/
directory on the target
system. When platform-migrator migrate <software-name> <test-config>
is
executed, the main
script parses the command line
arguments and passes them over to the
migrate()
function, which is works as a
wrapper to control the Migrator
class.
The Migrator
class parses the test
config files and creates a new migration id for the job. This migration id is
used to create a new directory for the migration and allows users to perform
multiple attempts for the same software. In future, this may also store
metadata about the attempted migration.
With conda as a package manager¶
Next, the conda environment YAML file is parsed and conda internal packages are removed. Now, if the test configuration lists conda as one of the package managers available, platform migrator will just use conda and ignore all other package managers. A new conda environment with the same name as the software is created and the YAML file is used to install the packages in it.
Now, the software is unzipped inside the migration directory and the tests configured for the software are run inside a sub-shell with the new conda environment activated. If multiple tests are configured, each test is run in a separate sub-shell. All tests are run even if one of the test fails. However, the migration is marked as an failure if any of the tests fail.
Once the migration is complete, the unzipped software is deleted. If the tests were successful, the software is unzipped again in the output directory from the test configuration.
With external package managers¶
If conda is not one of the package managers listed in the test configuration,
get_package_manager()
is used to
parse the package manager configuration files. The function is a factory
function for creating
:py:class`~platform_migrator.package_manager.PackageManager` objects. Once the
package managers are obtained, each package from the direct dependency list
of the software is searched for in the package managers and the user is prompted
to confirm which of the search result should be used. If the user does not
install any of the offered packages from any of the package managers, an option
to abort the migration is offered.
The searches try to offer the user with as few options as possible by removing by using stricter search criteria first and only using the relaxed criteria if there are no results returned. As soon as a search returns results, they are presented to the user for selection.
Once all the packages have been selected by the user, they are installed one by one. This is intentionally done so that even if one package fails to install, other packages can still be installed and tests can be run.
The tests here are run similar to the conda case, except that there is no conda environment which needs to be activated. Other than that, the same process is used for running the tests.
Future work¶
- Currently, platform migrator stores history of all migrations done, including failed attempts. However, the user cannot easily see this and the history is not used for anything. Work can be done around how this history can be used to determine systems and packages that are not compatible with each other. Also, features can be built around querying it.
- More package managers can be added and configured to be available by default. Also, there can be an option to list the usable package managers available on the system.
- Add integration for fetching software from Github, Gitlab and other git
providers. Right now, the software needs to be fetched manually using
git clone
and then the conda environment needs to be setup using themeta.yaml
. This can be automated in future iterations so that only the git repo URL is required.
Source Code Documentation¶
Main module¶
Server module¶
Migrator module¶
Package manager module¶
Base system script¶
This module runs on the base system as a Python script
The module mainly discovers the existing conda environments and sends the source code over to the target system for further processing.
- NOTE:
- The script should be compatible with both Python 2 and 3. It will
be executed using whichever version of Python is available using
python
command.
-
platform_migrator.base_sys_script.
create_zip_file
(zip_dir, env_yml=None, min_deps=None, quiet=False)[source]¶ Create a zip file of a directory
- Args:
zip_dir: The directory to zip - Kwargs:
env_yml=None (str): If provided, the string is copied as the env.yml
file in the zip archivemin_deps=None (str): If provided, the string is copied as the min-deps.json
file in the zip archivequiet=False: If True, no messages are printed - Return:
- A bytes buffer containing the zip file data
-
platform_migrator.base_sys_script.
get_conda_env
(env=None)[source]¶ Return the yml file of current conda environment
The function runs a subprocess to get the yml file of the current conda environment. The results of the subprocess are returned as a tuple and if the yml file was obtained, it will be index 1 of the tuple.
- Kwargs:
env=None: Conda environment to source, if not the currently active environment. - Return:
- A tuple whose elements are a boolean indicating whether the process ran error free, the stdout of the process and the stderr of the process
-
platform_migrator.base_sys_script.
get_conda_min_deps
(env, source_env=False)[source]¶ Try to create a list of minimal dependencies to install
The function queries the conda environment for all dependencies and then tries to determine the which dependencies are sub-dependencies of others. In this way, a minimal list of dependencies is created.
- Args:
env: The name of the conda environment to query - Kwargs:
source_env=False: Boolean indicator whether to source conda env before getting data or not - Return:
- A list of conda package objects if succesful, or None if the conda envrionment could not be loaded
-
platform_migrator.base_sys_script.
main
(cl_args=None, save_zip=None)[source]¶ Main function that is executed
- Kwargs:
cl_args=None (argparse.Namespace): Any command line args that have been parsed. save_zip=None (bool): If True, a zip file is created for saving without prompting the user. Otherwise, the user is prompted for the action.
-
platform_migrator.base_sys_script.
parse_args
()[source]¶ Parse the command line arguments when run as a script
- Return:
- An argparse.Namespace instance for the parsed arguments
-
platform_migrator.base_sys_script.
send_data
(app_name, path, data, name, quiet=False)[source]¶ Send data back to the server
- Args:
app_name (str): The name of the application path (str): The path to send data to, without the base address data: Buffer containing the data name (str): The name to use for the data in messages - Kwargs:
quiet=False (bool): If True, no messages are printed
-
platform_migrator.base_sys_script.
send_env_yml
(app_name, env_yml, quiet=False)[source]¶ Send the environment yml data back to the server
- Args:
app_name (str): The name of the application env_yml: Buffer containing the environment yml data - Kwargs:
quiet=False (bool): If True, no messages are printed
-
platform_migrator.base_sys_script.
send_min_deps
(app_name, min_deps, quiet=False)[source]¶ Send a zip of the repository back to the server
- Args:
app_name (str): The name of the application min_deps (list): List containing the minimal required dependencies - Kwargs:
quiet=False (bool): If True, no messages are printed
-
platform_migrator.base_sys_script.
send_repo_zip
(app_name, zip_file, quiet=False)[source]¶ Send a zip of the repository back to the server
- Args:
app_name (str): The name of the application zip_file: Buffer containing the zip of the code repository - Kwargs:
quiet=False (bool): If True, no messages are printed
Unpack script¶
This module contains a helper function to unpack a zip file
created by the base_sys_script
into the right location and
format
-
platform_migrator.unpack.
unpack
(packed_zip)[source]¶ Unpack a zip file prepared by base_sys_script
This function unpacks the zip file, removes the env.yml and min-deps.json files from it and repacks the remaining files back as pkg.zip for use with the migrator
- Args:
packed_zip (str): Absolute path to the zip file