MedCo Technical Documentation¶
System Administrator Guide¶
Specifications¶
We recommend the following specifications for running MedCo:
Network Bandwidth: >100 Mbps (ideal), >10 Mbps (minimum), symmetrical
Ports Opening and IP Restrictions: see Network Architecture
Hardware
- CPU: 8 cores (ideal), 4 cores (minimum)
- RAM: >16 GB (ideal), >8 GB (minimum)
- Storage: dependent on data loaded, >100 GB
Software
- OS: Any flavor of Linux, physical or virtualized (tested with Ubuntu 16.04, 18.04, Fedora 29)
- Packages: OpenSSL, Docker (tested with Docker 18.09.1) and Docker-Compose (tested with Docker-Compose 1.23.2), Git and Git-LFS
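As a quick sanity check before deploying, the software list can be verified with a few lines of shell. This is only a sketch: the function name missing_tools is hypothetical, and the binary names (openssl, docker, docker-compose, git, git-lfs) are assumed to be the usual ones for these packages.

```shell
# Print the name of every listed tool that is not found on the PATH.
# Prints nothing when all tools are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical usage with the tools named in the list above:
#   missing_tools openssl docker docker-compose git git-lfs
```

An empty output means every tool was found; the check does not compare versions against the tested ones.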
Deployment¶
Local Test Deployment¶
Profile test-local-3nodes
This test profile deploys 3 MedCo nodes on a single machine for test purposes. It can be used either on your local machine, or on any other machine to which you have access. The Docker images used are the latest released versions. This profile is used, for example, for the MedCo public demo.
MedCo Node Deployment (except IRCT)¶
The first step is to get the latest release of MedCo Deployment.
$ cd ~
$ wget https://github.com/lca1/medco-deployment/archive/v0.1.1c.tar.gz
$ tar xvzf v0.1.1c.tar.gz
$ mv medco-deployment-0.1.1c medco-deployment
The next step is to download and build the Docker images:
$ cd ~/medco-deployment/compose-profiles/test-local-3nodes
$ docker-compose pull
$ docker-compose build
The final step is to run the nodes. They will run simultaneously, and the logs of the running containers will keep the console captive. No configuration changes are needed in this scenario before running the nodes. To run them:
$ docker-compose up
Wait for the containers to finish initializing (until the message “i2b2-medco-srv… - Started x of y services (z services are lazy, passive or on-demand)” appears); this can take up to 10 minutes. Subsequent runs start faster.
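Rather than watching the console, the startup message can be detected with a small filter. The sketch below assumes the message has the form "Started x of y services" with numeric x and y, as quoted above; the function name wait_for_started is hypothetical, and grep -m is a GNU/BSD extension rather than POSIX.

```shell
# Read container log lines on stdin and stop at the first line that looks
# like the startup message. The exit status is 0 when the message was seen
# and non-zero when the input ended without it.
wait_for_started() {
  grep -m 1 -E 'Started [0-9]+ of [0-9]+ services'
}

# Hypothetical usage, from the compose profile folder in a second terminal:
#   docker-compose logs -f --no-color | wait_for_started
```

With three nodes on one machine the message is expected once per node, so the first match only shows that one node has finished starting.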
IRCT Deployment and Configuration¶
The first step is to clone the IRCT repository with the correct branch.
$ cd ~
$ git clone -b MedCo-v0.1.1 https://github.com/lca1/IRCT.git
Currently IRCT must be deployed separately; this will change in the future:
$ cd ~/IRCT/deployments
$ docker-compose -f docker-compose.medco.test-local-3nodes.yml build
Next, if running on a machine other than the local host, a configuration file must be changed.
If running on the local host, the default settings can be left in place.
Edit the file ~/medco-deployment/compose-profiles/test-local-3nodes/.env to reflect your configuration. For example:
MEDCO_NODE_URL=https://medco-demo.epfl.ch
HTTP_SCHEME=https
MEDCO_NODE_URL should include the protocol and the fully qualified domain name of the host; HTTP_SCHEME should be http or https.
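The two settings can be checked before starting IRCT with a short script. This is a sketch under assumptions: the function name check_env is hypothetical, and requiring MEDCO_NODE_URL to start with the same scheme as HTTP_SCHEME is an assumption drawn from the example above, not a documented rule.

```shell
# Check that a .env file sets HTTP_SCHEME to http or https and that
# MEDCO_NODE_URL starts with that scheme. Prints "ok" or a short reason.
check_env() {
  env_file=$1
  url=$(sed -n 's/^MEDCO_NODE_URL=//p' "$env_file" | tail -n 1)
  scheme=$(sed -n 's/^HTTP_SCHEME=//p' "$env_file" | tail -n 1)
  case "$scheme" in
    http|https) ;;
    *) echo "HTTP_SCHEME must be http or https"; return 1 ;;
  esac
  case "$url" in
    "$scheme"://?*) echo "ok" ;;
    *) echo "MEDCO_NODE_URL does not start with $scheme://"; return 1 ;;
  esac
}

# Hypothetical usage:
#   check_env ~/medco-deployment/compose-profiles/test-local-3nodes/.env
```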
Follow HTTPS Configuration to set up the certificates needed for HTTPS. If you are deploying without HTTPS on a host other than the local host, see Disabling HTTPS requirement for external connections.
In a separate terminal, from the ~/IRCT/deployments folder, run the IRCT container:
$ chmod -R a+rw ../
$ docker-compose -f docker-compose.medco.test-local-3nodes.yml up
Again, the initial startup takes a few minutes as IRCT is compiled at that point (until the message “irct_1… - Started x of y services (z services are lazy, passive or on-demand)” appears).
To stop the containers, simply hit Ctrl+C in all the active windows.
Keycloak Configuration¶
Follow the instructions from Keycloak Configuration; you should then be able to log in to Glowing Bear.
Test the deployment¶
To test that the local test deployment of MedCo is working, access Glowing Bear in your web browser at http://<domain name> (or https) and use the credentials previously configured during the Keycloak Configuration. If you are new to Glowing Bear you can watch the Glowing Bear user interface walkthrough video.
By default MedCo loads a specific test dataset; refer to Description of the default test data for the expected results of queries.
To load a dataset, follow the guide Loading Data.
For reference, the database address (host) to use during loading is <domain name>:5432 and the databases are i2b2medcosrv0, i2b2medcosrv1 and i2b2medcosrv2.
Network Test Deployment¶
Profile test-network
This test profile deploys an arbitrary set of MedCo nodes independently on different machines that together form a MedCo network. This deployment assumes each node is deployed on a single dedicated machine. All the machines must be able to reach one another. Nodes should agree on a network name and individual indexes beforehand (each node is assigned a UID). The next set of steps must be executed individually by each node of the network.
This guide is for the latest released version of the docker images.
Preliminaries¶
The first step is to get the latest release of MedCo Deployment at each node.
$ cd ~
$ wget https://github.com/lca1/medco-deployment/archive/v0.1.1b.tar.gz
$ tar xvzf v0.1.1b.tar.gz
$ mv medco-deployment-0.1.1b medco-deployment
Generation of the Deployment Profile¶
Next the compose and configuration profiles must be generated using a script. This script is executed in two steps.
- Step 1: each node generates its keys and certificates, and shares its public information with the other nodes
- Step 2: each node collects the public keys and certificates of all the other nodes
For step 1, the network name should be common to all the nodes. A <node domain name> corresponds to the machine domain name where the node is being deployed. As mentioned before the different parties should have agreed beforehand on the members of the network, and assigned an index to each different node to construct its UID (starting from 0, to n-1, n being the total number of nodes).
$ cd ~/medco-deployment/resources/profile-generation-scripts/test-network
$ bash step1.sh <network name> <node index> <node domain name>
This script will generate part of the configuration profile, including a file srv<node index>-public.tar.gz.
This file should be shared with the other nodes, and all of them need to place it in the configuration profile folder, ~/medco-deployment/configuration-profiles/test-network-<network name>-node<node index>, such that all the files inside srv<node index>-public.tar.gz are in the same location in each node.
Once this is done, step 2 can be executed:
$ bash step2.sh <network name> <node index>
The deployment profile is now ready to be used.
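Before running step 2, it can help to confirm that the public archive of every node has arrived. The sketch below assumes the archives are kept under their original names (srv0-public.tar.gz to srv<n-1>-public.tar.gz) directly in the configuration profile folder; the function names are hypothetical.

```shell
# Build the profile name from the agreed network name and node index.
profile_name() {
  printf 'test-network-%s-node%s\n' "$1" "$2"
}

# Print the public archives (srv<i>-public.tar.gz, i = 0..n-1) that are
# not present in the given folder. Prints nothing when all are there.
missing_public_archives() {
  dir=$1
  n=$2
  i=0
  while [ "$i" -lt "$n" ]; do
    [ -f "$dir/srv$i-public.tar.gz" ] || printf 'srv%s-public.tar.gz\n' "$i"
    i=$((i + 1))
  done
}

# Hypothetical usage for node 0 of a 3-node network called "demo":
#   missing_public_archives \
#     ~/medco-deployment/configuration-profiles/"$(profile_name demo 0)" 3
```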
MedCo Node Deployment (except IRCT)¶
The next step is to download and build the Docker images, and run a node.
$ cd ~/medco-deployment/compose-profiles/test-network-<network name>-node<node index>
$ docker-compose pull
$ docker-compose build
$ docker-compose up
Wait for the containers to finish initializing (until the message “- Started x of y services (z services are lazy, passive or on-demand)” appears); this can take up to 10 minutes. Subsequent runs start faster.
IRCT Deployment and Configuration¶
Currently IRCT must be configured manually and deployed separately in each of the nodes. This will change in the future.
$ cd ~
$ git clone -b MedCo-v0.1.1 https://github.com/lca1/IRCT.git
Edit the file ~/IRCT/deployments/.env and adjust for each node:
MEDCO_NODE_URL=https://<node domain name>
MEDCO_NODE_IDX=<node index>
MEDCO_PROFILE_NAME=test-network-<network name>-node<node index>
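The three values follow directly from the network name, node index and domain name, so they can be generated rather than typed. This is a minimal sketch: irct_env is a hypothetical helper, and the real .env file may contain other settings, so its output is meant to be merged by hand and not written over the file.

```shell
# Print the three per-node settings for ~/IRCT/deployments/.env.
# Arguments: <network name> <node index> <node domain name>
irct_env() {
  network=$1
  index=$2
  domain=$3
  printf 'MEDCO_NODE_URL=https://%s\n' "$domain"
  printf 'MEDCO_NODE_IDX=%s\n' "$index"
  printf 'MEDCO_PROFILE_NAME=test-network-%s-node%s\n' "$network" "$index"
}

# Hypothetical usage:
#   irct_env demo 1 node1.example.org
```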
Copy all the certificates obtained from the previous step to the folder ~/IRCT/deployments/irct/volumes/certificates/:
$ cp ~/medco-deployment/configuration-profiles/test-network-<network name>-node<node index>/*.crt ~/IRCT/deployments/irct/volumes/certificates/
Then, build and run the IRCT container:
$ cd ~/IRCT/deployments
$ docker-compose -f docker-compose.medco.test-network.yml build
$ chmod -R a+rw ../
$ docker-compose -f docker-compose.medco.test-network.yml up
Use the pgAdmin tool to add the IRCT configuration (see The PostgreSQL database).
With the query tool, execute the following SQL in the database irct, adapting it to your case:
select add_i2b2_medco_resource(
'i2b2-medco-test-network',
'https://<node 0 domain name>/i2b2/services/,https://<node 1 domain name>/i2b2/services/,...',
'i2b2medco,i2b2medco,i2b2medco',
'medcouser',
'demouser',
'true',
'false',
'edu.harvard.hms.dbmi.bd2k.irct.ri.medco.I2B2MedCoResourceImplementation',
'TREE'
);
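The second argument of the SQL call is a comma-separated list with one i2b2 services URL per node. A small helper can build it from the node domain names; the function name i2b2_service_urls is hypothetical, and listing the domains in node-index order is an assumption.

```shell
# Join the i2b2 service URLs of all nodes with commas.
# Arguments: the node domain names, assumed to be given in node-index order.
i2b2_service_urls() {
  sep=''
  for domain in "$@"; do
    printf '%shttps://%s/i2b2/services/' "$sep" "$domain"
    sep=','
  done
  printf '\n'
}

# Hypothetical usage:
#   i2b2_service_urls node0.example.org node1.example.org node2.example.org
```

The printed string can be pasted between the quotes of the second argument.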
Finally, restart IRCT to account for the new configuration by hitting Ctrl+C in the IRCT terminal, and starting it again:
$ docker-compose -f docker-compose.medco.test-network.yml up
To stop the containers, simply hit Ctrl+C in all the active windows.
Keycloak Configuration¶
Follow the instructions from Keycloak Configuration; you should then be able to log in to Glowing Bear.
Data Loading¶
Contrary to the other deployment profiles, the default test data will not work (queries will fail), since the data is not encrypted with the collective key that was generated (the encryption key derived from all the nodes’ public keys).
Run the MedCo loader (see Loading Data) to be able to test this deployment.
For reference, the database address (host) to use during loading is <domain name>:5432 and the database is i2b2medco.
Test the deployment¶
To test that the network deployment of MedCo is working, access Glowing Bear in your web browser at http://<node domain name> and use the credentials previously configured during the Keycloak Configuration. If you are new to Glowing Bear you can watch the Glowing Bear user interface walkthrough video.
Note that by default the certificates generated by the script are self-signed and thus, when using Glowing Bear, the browser will issue a security warning. To use your own valid certificates, see HTTPS Configuration.
Local Development Deployment¶
Profile dev-local-3nodes
This deployment profile deploys 3 MedCo nodes on a single machine for development purposes.
It is meant to be used only on your local machine, i.e. localhost.
The Docker images used are all dev versions, i.e. the ones built from the development version of the different source codes.
They are available either through Docker Hub, or built locally.
MedCo Node Deployment (except IRCT)¶
The first step is to clone the medco-deployment repository with the correct branch.
This example gets the data in the home directory of the current user, but that can be changed.
$ cd ~
$ git clone -b dev https://github.com/lca1/medco-deployment.git
The next step is to download or build the Docker images:
$ cd ~/medco-deployment/compose-profiles/dev-local-3nodes
$ docker-compose pull
$ docker-compose build
The next step is to run the nodes. They will run simultaneously, and the logs of the running containers will keep the console captive. No configuration changes are needed in this scenario before running the nodes. To run them:
$ docker-compose up
Wait for the containers to finish initializing (until the message “i2b2-medco-srv… - Started x of y services (z services are lazy, passive or on-demand)” appears); this can take up to 10 minutes. Subsequent runs start faster.
IRCT Deployment and Configuration¶
The first step is to clone the IRCT repository with the correct branch.
$ cd ~
$ git clone -b fork/thehyve https://github.com/lca1/IRCT.git
Currently IRCT must be deployed separately; this will change in the future:
$ cd ~/IRCT/deployments
$ docker-compose -f docker-compose.medco.dev-local-3nodes.yml build
In a separate terminal, from the ~/IRCT/deployments folder, run the IRCT container:
$ chmod -R a+rw ../
$ docker-compose -f docker-compose.medco.dev-local-3nodes.yml up
Again, the initial startup takes a few minutes as IRCT is compiled at that point (until the message “irct_1… - Started x of y services (z services are lazy, passive or on-demand)” appears).
Glowing Bear Deployment and Configuration¶
The first step is to clone the glowing-bear repository with the correct branch.
$ cd ~
$ git clone -b picsure https://github.com/lca1/glowing-bear-medco.git
Glowing Bear is deployed separately for development, as we use its very practical development server:
$ cd ~/glowing-bear-medco/deployment
$ docker-compose build dev-server
In another separate terminal, from the same folder, run the Glowing Bear development server:
$ docker-compose up dev-server
To stop the containers, simply hit Ctrl+C in all the active windows.
Keycloak Configuration¶
Follow the instructions from Keycloak Configuration; you should then be able to log in to Glowing Bear.
Test the deployment¶
To test that the development deployment of MedCo is working, access Glowing Bear in your web browser at http://localhost:4200 and use the credentials previously configured during the Keycloak Configuration. If you are new to Glowing Bear you can watch the Glowing Bear user interface walkthrough video.
By default MedCo loads a specific test dataset; refer to Description of the default test data for the expected results of queries.
To load a dataset, follow the guide Loading Data.
For reference, the database address (host) to use during loading is localhost:5432 and the databases are i2b2medcosrv0, i2b2medcosrv1 and i2b2medcosrv2.
These pages explain how to deploy MedCo in different scenarios. Each deployment scenario corresponds to a deployment profile, as described below. All these instructions use the deployment scripts from the medco-deployment repository.
If you are new to MedCo…
… and want to try to deploy the system on a single machine to test it, you should follow the Local Test Deployment guide.
… and want to create or join a MedCo network, you should follow the Network Test Deployment guide.
… and want to develop around MedCo, you should follow the Local Development Deployment guide.
Deployment Profiles¶
A deployment profile is composed of two things:
- a compose profile in ~/medco-deployment/compose-profiles/<profile name>/: docker-compose file and parameters like ports to expose, log level, etc.
- a configuration profile in ~/medco-deployment/configuration-profiles/<profile name>/: files mounted in the docker containers, containing the cryptographic keys, the certificates, etc.
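Since a deployment profile is only usable when both folders exist, a quick check can catch a half-generated profile. The sketch below relies only on the two folder names given above; the function name missing_profile_parts is hypothetical.

```shell
# Print which of the two parts of a deployment profile are missing under
# the given medco-deployment checkout. Prints nothing when both exist.
# Arguments: <path to medco-deployment> <profile name>
missing_profile_parts() {
  root=$1
  name=$2
  for part in compose-profiles configuration-profiles; do
    [ -d "$root/$part/$name" ] || printf '%s/%s\n' "$part" "$name"
  done
}

# Hypothetical usage:
#   missing_profile_parts ~/medco-deployment test-local-3nodes
```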
Some profiles are provided by default, for development or testing purposes.
Those should not be used in a production scenario with real data, as the private keys are set by default, thus not private.
Other types of profiles must be generated using the scripts in ~/medco-deployment/resources/profile-generation-scripts/<profile name>/.
The different profiles are the following:
test-local-3nodes (Local Test Deployment)
- for test on a single machine (used by the MedCo live demo)
- 3 nodes on any host
- using the latest release of the source codes
- no debug logging
- profile pre-generated
test-network (Network Test Deployment)
- for test on several different hosts
- a single node on a host part of a MedCo network
- using the latest release of the source codes
- no debug logging
- profile must be generated prior to use with the provided scripts
dev-local-3nodes (Local Development Deployment)
- for software development
- 3 nodes on the local host
- using development version of source codes
- debug logging enabled
- profile pre-generated
The database is pre-loaded with some encrypted test data using a key that is pre-generated from the combination of all the participating nodes’ public keys. For the test-network deployment profile this data will not be correctly encrypted, since the public key of each node is generated independently, and, as such, the data must be re-loaded.
Configuration¶
Keycloak Configuration¶
Here follow some MedCo-specific instructions for the administration of Keycloak. For anything else, please refer to the Keycloak Server Administration Guide.
Accessing the web administration interface¶
In the case of the development profile dev-local-3nodes (i.e. without reverse proxy), the address is http://localhost:8081/auth/admin.
In the other cases (with the reverse proxy), the address is http://<node domain name>/auth/admin.
The default credentials are:
- User: keycloak
- Password: keycloak
unless other values were configured at the initial deployment.
Disabling HTTPS requirement for external connections¶
When deploying the test-local-3nodes profile without HTTPS on a machine other than localhost, the administration interface will refuse to load.
To solve this, access pgAdmin (see The PostgreSQL database) and execute the following SQL on the keycloak database:
update REALM set ssl_required = 'NONE' where id = 'master';
You need to restart the Keycloak docker container for the change to take effect.
Manually add an authorized user¶
- Go to the configuration panel Users, click on Add user.
- Fill in the Username field, toggle the Email Verified button to ON and click Save.
- In the next window, click on Credentials, enter the user’s password twice, toggle the Temporary button to OFF if desired and click Reset Password.
Add the default OpenID Connect client configuration for MedCo¶
- Go to the configuration panel Clients, click on Create.
- There, specify in Client ID the value i2b2-local (or another value if previously configured) and click Save.
- In the next window, fill Valid Redirect URIs and Web Origins according to the table below and click Save.
Deployment Profile | Valid Redirect URIs | Web Origins |
---|---|---|
test-local-3nodes | http(s)://<node domain name>/glowing-bear | http(s)://<node domain name> |
test-network | https://<node domain name>/glowing-bear | https://<node domain name> |
dev-local-3nodes | http://localhost:4200 | http://localhost:4200 |
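The table can also be read as a small lookup from deployment profile to the two values. The following sketch encodes exactly the rows above; keycloak_client_uris is a hypothetical helper, and the scheme argument only matters for test-local-3nodes.

```shell
# Print the Valid Redirect URIs and Web Origins values for a profile.
# Arguments: <profile> <http|https> <node domain name>
keycloak_client_uris() {
  profile=$1
  scheme=$2
  domain=$3
  case "$profile" in
    test-local-3nodes)
      origin="$scheme://$domain"
      redirect="$origin/glowing-bear"
      ;;
    test-network)
      origin="https://$domain"
      redirect="$origin/glowing-bear"
      ;;
    dev-local-3nodes)
      origin="http://localhost:4200"
      redirect="$origin"
      ;;
    *)
      echo "unknown profile: $profile" >&2
      return 1
      ;;
  esac
  printf 'Valid Redirect URIs: %s\nWeb Origins: %s\n' "$redirect" "$origin"
}

# Hypothetical usage:
#   keycloak_client_uris test-local-3nodes https medco.example.org
```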
HTTPS Configuration¶
HTTPS is supported for the profiles test-local-3nodes and test-network.
Certificate¶
The certificates are held in the configuration profile folder (e.g., ~/medco-deployment/configuration-profiles/test-local-3nodes):
- certificate.key: private key
- certificate.crt: certificate of own node
- srv0-certificate.crt, srv1-certificate.crt, …: certificates of all nodes of the network
Enable HTTPS for the Test Local Deployment¶
To enable HTTPS for the profile test-local-3nodes, replace the files certificate.key and certificate.crt from the configuration profile folder with your own versions. Such a certificate can be obtained for example through Let’s Encrypt.
Then edit the file .env from the compose profile, replace http with https, and restart the deployment.
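The edit to .env can be scripted. This sketch assumes the file uses the MEDCO_NODE_URL and HTTP_SCHEME variables shown in the Local Test Deployment example; enable_https is a hypothetical helper, and it keeps a .bak copy of the original file.

```shell
# Switch HTTP_SCHEME and the scheme of MEDCO_NODE_URL from http to https
# in a .env file. The original is kept next to it with a .bak suffix.
enable_https() {
  env_file=$1
  cp "$env_file" "$env_file.bak" || return 1
  sed -e 's|^HTTP_SCHEME=http$|HTTP_SCHEME=https|' \
      -e 's|^\(MEDCO_NODE_URL=\)http://|\1https://|' \
      "$env_file.bak" > "$env_file"
}

# Hypothetical usage:
#   enable_https ~/medco-deployment/compose-profiles/test-local-3nodes/.env
```

Running it on a file that already uses https leaves the settings unchanged.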
Configure HTTPS for the Test Network Deployment¶
Coming soon
The PostgreSQL database¶
Administration with PgAdmin¶
PgAdmin can be accessed through http://<node domain name>/pgadmin with username admin and password admin (by default). To access the test database, create a server with the name MedCo, the address postgresql, username postgres and password postgres.

Loading Data¶
v0 (Genomic Data)¶
The v0 loader expects an ontology, with mutation and clinical data in the MAF format.
For the ontology data you must use ~/medco-loader/data/genomic/tcga_cbio/clinical_data.csv and ~/medco-loader/data/genomic/tcga_cbio/mutation_data.csv.
For clinical data you can keep using the same two files or a subset of the data (e.g. 8_clinical_data.csv).
More information about how to generate sample datafiles can be found below.
After the following script is executed all the data is encrypted and ‘deterministically tagged’ in compliance with the MedCo data model.
Loading from the same host¶
If you are using the same host machine to deploy and load the data, you can use the table below to adapt some of the script parameters to the deployment scenario.
This includes the test-network scenario, where the data of each node is loaded from its own hosting machine.
You need to repeat the loading process for all nodes, by modifying the arguments “network”, “entryPointIdx” and “dbName”.
Deployment Profile | --network | -v (volumes) | --dbHost | --dbName |
---|---|---|---|---|
test-local-3nodes | test-local-3nodes_medco-network + test-local-3nodes_medco-srv<node index> | ~/medco-loader/data/genomic:/dataset + ~/medco-deployment/configuration-profiles/test-local-3nodes/group.toml:/group.toml | postgresql | i2b2medcosrv<node index> |
test-network | test-network-<network name>-node<node index>_default | ~/medco-loader/data/genomic:/dataset + ~/medco-deployment/configuration-profiles/test-network-<network name>-node<node index>/group.toml:/group.toml | postgresql | i2b2medco |
dev-local-3nodes | dev-local-3nodes_medco-network + dev-local-3nodes_medco-srv<node index> | ~/medco-loader/data/genomic:/dataset + ~/medco-deployment/configuration-profiles/dev-local-3nodes/group.toml:/group.toml | postgresql | i2b2medcosrv<node index> |
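The per-profile values in the table follow a simple pattern, which the sketch below spells out for the same-host case. The helper loader_same_host_params is hypothetical; for test-network it expects the full profile name (test-network-<network name>-node<node index>) as its first argument.

```shell
# Print the network, dbHost and dbName values for loading from the same
# host, one per line. Arguments: <profile name> <node index>
loader_same_host_params() {
  profile=$1
  index=$2
  case "$profile" in
    test-local-3nodes|dev-local-3nodes)
      printf 'network=%s_medco-network\n' "$profile"
      printf 'network=%s_medco-srv%s\n' "$profile" "$index"
      printf 'dbHost=postgresql\n'
      printf 'dbName=i2b2medcosrv%s\n' "$index"
      ;;
    test-network-*)
      printf 'network=%s_default\n' "$profile"
      printf 'dbHost=postgresql\n'
      printf 'dbName=i2b2medco\n'
      ;;
    *)
      echo "unknown profile: $profile" >&2
      return 1
      ;;
  esac
}

# Hypothetical usage:
#   loader_same_host_params test-local-3nodes 2
```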
Loading from a different host¶
If you are using an external machine (e.g. your laptop) to load the data into one of the nodes, you can use the table below to adapt some of the script parameters to the deployment scenario. In this case you do not need to specify the --network parameters.
You need to repeat the loading process for all nodes, by modifying the arguments “entryPointIdx” and “dbName”.
Deployment Profile | -v (volumes) | --dbHost | --dbName |
---|---|---|---|
test-local-3nodes | ~/medco-loader/data/genomic:/dataset + ~/medco-deployment/configuration-profiles/test-local-3nodes/group.toml:/group.toml | <domain name> | i2b2medcosrv<node index> |
test-network | ~/medco-loader/data/genomic:/dataset + ~/medco-deployment/configuration-profiles/test-network-<network name>-node<node index>/group.toml:/group.toml | <domain name> | i2b2medco |
Example¶
The following example loads data into a running MedCo development deployment (dev-local-3nodes), on node 0.
Adapt the arguments network, entryPointIdx and dbName accordingly for the 2 other nodes.
cd ~/medco-loader/deployment
docker run --network="dev-local-3nodes_medco-network" --network="dev-local-3nodes_medco-srv0" \
-v ~/medco-loader/data/genomic:/dataset \
-v ~/medco-deployment/configuration-profiles/dev-local-3nodes/group.toml:/group.toml \
medco/medco-loader:v0.1.1 medco-loader -debug 2 v0 --group /group.toml --entryPointIdx 0 \
--ont_clinical /dataset/tcga_cbio/8_clinical_data.csv --sen /dataset/sensitive.txt \
--ont_genomic /dataset/tcga_cbio/8_mutation_data.csv --clinical /dataset/tcga_cbio/8_clinical_data.csv \
--genomic /dataset/tcga_cbio/8_mutation_data.csv --output /dataset/ --dbHost localhost --dbPort 5432 \
--dbName i2b2medcosrv0 --dbUser i2b2 --dbPassword i2b2
Explanation of the arguments:
NAME:
medco-loader v0 - Load genomic data (e.g. tcga_bio dataset)
USAGE:
medco-loader v0 [command options] [arguments...]
OPTIONS:
--group value, -g value UnLynx group definition file
--entryPointIdx value, --entry value Index (relative to the group definition file) of the collective authority server to load the data
--sensitive value, --sen value File containing a list of sensitive concepts
--dbHost value, --dbH value Database hostname
--dbPort value, --dbP value Database port (default: 0)
--dbName value, --dbN value Database name
--dbUser value, --dbU value Database user
--dbPassword value, --dbPw value Database password
--ont_clinical value, --oc value Clinical ontology to load
--ont_genomic value, --og value Genomic ontology to load
--clinical value, --cl value Clinical file to load
--genomic value, --gen value Genomic file to load
--output value, -o value Output path to the .csv files
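To adapt the example to the other two nodes, the command can be generated per node index. The sketch below only prints the command from the example with the index substituted, so it can be reviewed before being run. Using the same index for the srv network, entryPointIdx and dbName is an assumption based on the tables above, and v0_loader_cmd is a hypothetical helper.

```shell
# Print (without running) the v0 loader command from the example above
# for one node of the dev-local-3nodes profile.
# Argument: <node index>
v0_loader_cmd() {
  idx=$1
  cat <<EOF
docker run --network="dev-local-3nodes_medco-network" --network="dev-local-3nodes_medco-srv$idx" \\
  -v ~/medco-loader/data/genomic:/dataset \\
  -v ~/medco-deployment/configuration-profiles/dev-local-3nodes/group.toml:/group.toml \\
  medco/medco-loader:v0.1.1 medco-loader -debug 2 v0 --group /group.toml --entryPointIdx $idx \\
  --ont_clinical /dataset/tcga_cbio/8_clinical_data.csv --sen /dataset/sensitive.txt \\
  --ont_genomic /dataset/tcga_cbio/8_mutation_data.csv --clinical /dataset/tcga_cbio/8_clinical_data.csv \\
  --genomic /dataset/tcga_cbio/8_mutation_data.csv --output /dataset/ --dbHost localhost --dbPort 5432 \\
  --dbName i2b2medcosrv$idx --dbUser i2b2 --dbPassword i2b2
EOF
}

# Hypothetical usage, printing the command for each of the three nodes:
#   for idx in 0 1 2; do v0_loader_cmd "$idx"; echo; done
```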
Data Manipulation¶
Inside ~/medco-loader/data/scripts/ you can find a small Python application to extract (or replicate) data out of the original tcga_cbio dataset.
You can decide which patients you want to consider for your ‘new’ dataset or simply pick a random sample.
To check that it is working you can query for:
-> MedCo Genomic Ontology -> Gene Name -> BRPF3
For the small dataset 8_xxxx you should obtain 3 matching subjects (one at each site).
v1 (I2B2 Demodata)¶
The v1 loader expects an already existing i2b2 database (in .csv format) that will be converted in a way that is compliant with the MedCo data model. This involves encrypting and ‘deterministically tagging’ some of the data.
List of input (‘original’) files:
- all i2b2metadata files (e.g. i2b2.csv)
- dummy_to_patient.csv
- patient_dimension.csv
- visit_dimension.csv
- concept_dimension.csv
- modifier_dimension.csv
- observation_fact.csv
- table_access.csv
Loading in the same host¶
If you are using the same host machine to deploy and load the data, you can use the table below to adapt some of the script parameters to the deployment scenario.
This includes the test-network scenario, where the data of each node is loaded from its own hosting machine.
You need to repeat the loading process for all nodes, by modifying the arguments “network”, “entryPointIdx” and “dbName”.
Deployment Profile | --network | -v (volumes) | --dbHost | --dbName |
---|---|---|---|---|
test-local-3nodes | test-local-3nodes_medco-network + test-local-3nodes_medco-srv<node index> | ~/medco-loader/data/i2b2:/dataset + ~/medco-deployment/configuration-profiles/test-local-3nodes/group.toml:/group.toml | postgresql | i2b2medcosrv<node index> |
test-network | test-network-<network name>-node<node index>_default | ~/medco-loader/data/i2b2:/dataset + ~/medco-deployment/configuration-profiles/test-network-<network name>-node<node index>/group.toml:/group.toml | postgresql | i2b2medco |
dev-local-3nodes | dev-local-3nodes_medco-network + dev-local-3nodes_medco-srv<node index> | ~/medco-loader/data/i2b2:/dataset + ~/medco-deployment/configuration-profiles/dev-local-3nodes/group.toml:/group.toml | postgresql | i2b2medcosrv<node index> |
Loading in a different host¶
If you are using an external machine (e.g. your laptop) to load the data into one of the nodes, you can use the table below to adapt some of the script parameters to the deployment scenario. In this case you do not need to specify the --network parameters.
You need to repeat the loading process for all nodes, by modifying the arguments “entryPointIdx” and “dbName”.
Deployment Profile | -v (volumes) | --dbHost | --dbName |
---|---|---|---|
test-local-3nodes | ~/medco-loader/data/i2b2:/dataset + ~/medco-deployment/configuration-profiles/test-local-3nodes/group.toml:/group.toml | <domain name> | i2b2medcosrv<node index> |
test-network | ~/medco-loader/data/i2b2:/dataset + ~/medco-deployment/configuration-profiles/test-network-<network name>-node<node index>/group.toml:/group.toml | <domain name> | i2b2medco |
Dummy Generation¶
The provided example data set files come with dummy data pre-generated.
Those data are random dummy entries whose purpose is to prevent frequency attacks.
For more information on how this dummy generation is done, please refer to ~/medco-loader/data/scripts/import-tool/report/report.pdf.
In a future release, the generation will be done dynamically by the loader.
Example¶
The following example loads data into a running MedCo development deployment (dev-local-3nodes), on node 0.
Adapt the arguments network, entryPointIdx and dbName accordingly for the 2 other nodes.
cd ~/medco-loader/deployment
docker run --network="dev-local-3nodes_medco-network" --network="dev-local-3nodes_medco-srv0" \
-v ~/medco-loader/data/i2b2:/dataset -v ~/medco-deployment/configuration-profiles/dev-local-3nodes/group.toml:/group.toml \
medco/medco-loader:v0.1.1 medco-loader -debug 2 v1 --group /group.toml --entryPointIdx 0 --sen /dataset/sensitive.txt \
--files /dataset/files.toml --dbHost localhost --dbPort 5432 --dbName i2b2medcosrv0 --dbUser i2b2 --dbPassword i2b2
NAME:
medco-loader v1 - Convert existing i2b2 data model
USAGE:
medco-loader v1 [command options] [arguments...]
OPTIONS:
--group value, -g value UnLynx group definition file
--entryPointIdx value, --entry value Index (relative to the group definition file) of the collective authority server to load the data
--sensitive value, --sen value File containing a list of sensitive concepts
--dbHost value, --dbH value Database hostname
--dbPort value, --dbP value Database port (default: 0)
--dbName value, --dbN value Database name
--dbUser value, --dbU value Database user
--dbPassword value, --dbPw value Database password
--files value, -f value Configuration toml with the path of the all the necessary i2b2 files
--empty, -e Empty patient and visit dimension tables (y/n)
To check that it is working you can query for:
-> Diagnoses -> Neoplasm -> Benign neoplasm -> Benign neoplasm of breast
You should obtain 2 matching subjects.
The current version offers two different loading alternatives: (v0) loading of clinical and genomic data based on MAF datasets; and (v1) loading of generic i2b2 data. Currently these two loaders support each one dataset:
- v0: a genomic dataset (tcga_cbio publicly available in cBioPortal)
- v1: the i2b2 demodata.
Future releases of this software will allow for other arbitrary data sources, given that they follow a specific structure (e.g. BAM format).
Pre-Requisites¶
First get the repository containing the MedCo loader software, which already contains some test data for you to work with. Note that you need git-lfs for those data to be retrieved with the repository.
$ cd ~
$ git clone -b v0.1.1 https://github.com/lca1/medco-loader.git
Building Application
To get the MedCo loader application, pull it with Docker:
docker pull medco/medco-loader:v0.1.1
Network Architecture¶

External Entities¶
Entities that need to connect to a machine running MedCo can be categorized as follows:
- System administrators: Persons administrating the MedCo node. Likely to remain inside the clinical site internal network.
- End-users: Researchers using MedCo to access the shared data. Likely to remain inside the clinical site internal network.
- Other MedCo nodes: MedCo nodes belonging to other clinical sites of the network.
Firewall Ports Opening¶
The following ports should be accessible by the listed entities, which makes IP address white-listing possible:
- Port 22, 5432 (TCP): System Administrators
- Port 80 (TCP): End-Users (HTTP automatic redirect to HTTPS (443))
- Port 443 (TCP): System Administrators, End-Users, Other MedCo Nodes
- Ports 2000-2001 (TCP): Other MedCo Nodes
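As an illustration of IP white-listing, the port list above can be turned into firewall rules. The sketch below prints rules for ufw, which is an assumption: this guide does not prescribe a firewall tool, and the address ranges passed in are placeholders. Nothing is applied; the commands are only printed for review.

```shell
# Print ufw rules matching the port list above.
# Arguments: <admin addresses> <end-user addresses> <other node addresses>
# Each argument is a single IP address or CIDR range.
firewall_rules() {
  admins=$1
  users=$2
  nodes=$3
  cat <<EOF
ufw allow proto tcp from $admins to any port 22,5432
ufw allow proto tcp from $users to any port 80
ufw allow proto tcp from $admins to any port 443
ufw allow proto tcp from $users to any port 443
ufw allow proto tcp from $nodes to any port 443
ufw allow proto tcp from $nodes to any port 2000:2001
EOF
}

# Hypothetical usage:
#   firewall_rules 10.0.0.0/24 10.0.1.0/24 192.0.2.10
```

With several other MedCo nodes, the last two rules would be repeated once per node address.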
This guide explains the deployment and configuration of MedCo instances.
Developer Guide¶
System Architecture¶

Containers¶
medco-unlynx¶
The software executing the distributed cryptographic protocols, based on Unlynx.
i2b2-medco¶
The i2b2 stack (all the cells), with the addition of the MedCo i2b2 cell to process the queries. This cell communicates with medco-unlynx to execute the distributed cryptographic protocols.
irct¶
The query translation and broadcasting layer.
glowing-bear¶
Nginx web server serving Glowing Bear and the crypto module.
keycloak¶
OpenID Connect identity provider.
postgresql¶
The SQL database used by all other services, contains all the data.
pg-admin¶
A web-based administration tool for the PostgreSQL database.
nginx¶
Web server and (HTTPS-enabled) reverse proxy.
php-fpm¶
PHP processor running with FPM (FastCGI Process Manager), used by Nginx. Executes the PHP code needed to serve the genomic annotations.
Description of the default test data¶
Coming soon
If you are interested in developing around MedCo, the first thing you might want to do is to follow the Local Development Deployment guide to set up the development version of MedCo.
User Guide¶
Coming soon
Disclaimer: MedCo is still experimental software under development and should not, at this point, be used with real sensitive data.
Releases¶
- 0.1.1, 23rd Jan. 2019: Deployment for test purposes on several machines, enhancements of documentation and deployment infrastructure, Nginx reverse proxy with HTTPS support, Keycloak update.
- 0.1, 1st Dec. 2018: First public release of MedCo, running with i2b2 v1.7, PIC-SURE/IRCT v1.4 and centralized OpenID Connect authentication. Deployment for development and test purposes on a single machine.
Resources¶
- MedCo software repositories
  - I2B2 Cell: https://github.com/lca1/medco-i2b2-cell
  - Unlynx Wrapper: https://github.com/lca1/medco-unlynx
  - Unlynx Javascript Crypto Library: https://github.com/lca1/medco-unlynx-js
  - Data Loader: https://github.com/lca1/medco-loader
- MedCo software forked repositories
  - IRCT: https://github.com/lca1/IRCT (forked from https://github.com/hms-dbmi/IRCT)
  - Glowing Bear: https://github.com/lca1/glowing-bear-medco (forked from https://github.com/thehyve/glowing-bear)
- Other repositories
  - Deployment: https://github.com/lca1/medco-deployment
  - Documentation: https://github.com/lca1/medco-documentation
- Other resources
  - Docker Hub organization: https://hub.docker.com/u/medco/
  - NPM.js organization: https://www.npmjs.com/package/@medco/unlynx-crypto-js-lib
Contact¶
For assistance with deploying MedCo or any other technical questions, send an email to medco-dev@listes.epfl.ch or to any of the contributors.
- Mickaël Misbach (Privacy and Security Software Engineer, EPFL) - mickael.misbach@epfl.ch
- Joao Andre Sa (Privacy and Security Software Engineer, EPFL) - joao.gomesdesaesousa@epfl.ch
- Jean Louis Raisaro (Data Protection Specialist, CHUV) - jean.raisaro@chuv.ch
- Juan Troncoso-Pastoriza (Post-Doctoral Researcher, EPFL) - juan.troncoso-pastoriza@epfl.ch
- Jean-Pierre Hubaux (Professor, EPFL) - jean-pierre.hubaux@epfl.ch
License¶
MedCo is licensed under an End User Software License Agreement (‘EULA’) for non-commercial use. If you need more information, please contact us.