The Swedish Frequency Database

Code

Backend

application

class ApproveUser(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.AdminHandler

post(dataset, email)[source]
class Collection(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

get(dataset, ds_version=None)[source]
class CountryList(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

country_list
get()[source]
class DatasetFiles(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.AuthorizedHandler

get(dataset, ds_version=None)[source]
class DatasetUsersCurrent(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.AdminHandler

get(dataset)[source]
class DatasetUsersPending(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.AdminHandler

get(dataset)[source]

Bases: handlers.AuthorizedHandler

post(dataset, ds_version=None)[source]
class GetDataset(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

get(dataset, version=None)[source]
class GetSchema(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

Returns the schema.org and bioschemas.org annotation for a given URL.

This function behaves quite differently from the rest of the application because the structured data testing tool had trouble picking up the schema when it was injected through AngularJS. The current solution is this very general function, which “re-parses” the ‘url’ request parameter to figure out what information to return.

get()[source]
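
As a rough illustration of the “re-parsing” described above, the sketch below takes the value of a ‘url’ parameter and decides what kind of annotation is being asked for. The URL layout (`/dataset/<name>` with an optional `/version/<version>` suffix) and the helper name `parse_schema_target` are assumptions made for this example, not the application's actual routes.

```python
from urllib.parse import urlparse


def parse_schema_target(url):
    """Guess what a page URL refers to (hypothetical URL layout)."""
    # Keep only the non-empty path segments, e.g. ['dataset', 'my-dataset'].
    segments = [part for part in urlparse(url).path.split("/") if part]

    target = {"type": "site", "dataset": None, "version": None}
    if len(segments) >= 2 and segments[0] == "dataset":
        target["type"] = "dataset"
        target["dataset"] = segments[1]
        # Optional '/version/<version>' suffix.
        if len(segments) >= 4 and segments[2] == "version":
            target["version"] = segments[3]
    return target
```

Anything that does not match the assumed dataset layout falls back to a site-level annotation.
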
class GetUser(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

get()[source]
class ListDatasetVersions(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

get(dataset)[source]
class ListDatasets(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

get()[source]
class LogEvent(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.SafeHandler

post(dataset, event, target)[source]
class QuitHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

get()[source]
class RequestAccess(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.SafeHandler

post(dataset)[source]
class RevokeUser(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.AdminHandler

post(dataset, email)[source]
class SFTPAccess(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.SafeHandler

Creates, or re-enables, sFTP users in the database.

generate_password(size=12)[source]

Generates a password of length ‘size’, composed of random lowercase letters, uppercase letters, and digits.
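
A minimal sketch of such a generator is shown here. Using the standard-library `secrets` module is an assumption of this example; the method documented above may draw its random characters from a different source.

```python
import secrets
import string


def generate_password(size=12):
    """Return a random password of letters and digits with length `size`."""
    alphabet = string.ascii_letters + string.digits
    # secrets.choice draws from the operating system's secure random source.
    return "".join(secrets.choice(alphabet) for _ in range(size))
```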

get()[source]

Returns sFTP credentials for the current user.

post()[source]

Handles generation of new credentials. This function either creates a new set of sFTP credentials for a user, or updates the existing ones with a new password and expiry date.

Bases: handlers.UnsafeHandler

get(dataset)[source]
class UserDatasetAccess(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.SafeHandler

get()[source]
build_dataset_structure(dataset_version, user=None, dataset=None)[source]
format_bytes(nbytes)[source]

auth

class DeveloperLoginHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.BaseHandler

get()[source]
class DeveloperLogoutHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.BaseHandler

get()[source]
class ElixirLoginHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.BaseHandler, tornado.auth.OAuth2Mixin

get()[source]
get_user(access_token)[source]
get_user_token(code)[source]
class ElixirLogoutHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.BaseHandler

get()[source]

db

class BaseModel(*args, **kwargs)[source]

Bases: peewee.Model

DoesNotExist

alias of BaseModelDoesNotExist

id = <AutoField: BaseModel.id>
class BeaconCounts(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of BeaconCountsDoesNotExist

callcount = <IntegerField: BeaconCounts.callcount>
datasetid = <CharField: BeaconCounts.datasetid>
variantcount = <IntegerField: BeaconCounts.variantcount>
class Collection(*args, **kwargs)[source]

Bases: backend.db.BaseModel

A collection is a source of data which can be sampled into a SampleSet.

DoesNotExist

alias of CollectionDoesNotExist

ethnicity = <CharField: Collection.ethnicity>
id = <AutoField: Collection.id>
name = <CharField: Collection.name>
sample_sets
class Coverage(*args, **kwargs)[source]

Bases: backend.db.BaseModel

Coverage statistics are pre-calculated for each variant for a given dataset.

The fields show the fraction of a population that reaches the mapping coverages given by the variable names.

For example, cov20 = 0.994 means that 99.4% of the population had a mapping coverage of at least 20 at this position.
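
To make the meaning of these fractions concrete, the sketch below computes them from per-sample depths at a single position. The function name, the sample depths, and the chosen coverage levels are made up for illustration; the database stores pre-calculated values and does not necessarily derive them this way.

```python
def coverage_fractions(depths, levels=(1, 5, 10, 20, 30)):
    """Fraction of samples whose depth reaches each coverage level."""
    total = len(depths)
    if total == 0:
        raise ValueError("at least one sample depth is required")
    return {
        level: sum(1 for depth in depths if depth >= level) / total
        for level in levels
    }


# Five hypothetical samples at one position.
fractions = coverage_fractions([35, 22, 19, 40, 8])
```

Here three of the five samples reach a depth of 20, so `fractions[20]` is 0.6.
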
DoesNotExist

alias of CoverageDoesNotExist

chrom = <CharField: Coverage.chrom>
coverage = <ArrayField: Coverage.coverage>
dataset_version = <ForeignKeyField: Coverage.dataset_version>
dataset_version_id = <ForeignKeyField: Coverage.dataset_version>
id = <AutoField: Coverage.id>
mean = <FloatField: Coverage.mean>
median = <FloatField: Coverage.median>
pos = <IntegerField: Coverage.pos>
class Dataset(*args, **kwargs)[source]

Bases: backend.db.BaseModel

A dataset is part of a study, and usually includes a certain population.

Most studies only have a single dataset, but multiple are allowed.

DoesNotExist

alias of DatasetDoesNotExist

access
access_current
access_logs
access_pending
avg_seq_depth = <FloatField: Dataset.avg_seq_depth>
beacon_uri = <CharField: Dataset.beacon_uri>
browser_uri = <CharField: Dataset.browser_uri>
current_version
dataset_size = <IntegerField: Dataset.dataset_size>
description = <TextField: Dataset.description>
full_name = <CharField: Dataset.full_name>
has_image()[source]
id = <AutoField: Dataset.id>
sample_sets
seq_center = <CharField: Dataset.seq_center>
seq_tech = <CharField: Dataset.seq_tech>
seq_type = <CharField: Dataset.seq_type>
short_name = <CharField: Dataset.short_name>
study = <ForeignKeyField: Dataset.study>
study_id = <ForeignKeyField: Dataset.study>
versions
class DatasetAccess(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of DatasetAccessDoesNotExist

dataset = <ForeignKeyField: DatasetAccess.dataset>
dataset_id = <ForeignKeyField: DatasetAccess.dataset>
id = <AutoField: DatasetAccess.id>
is_admin = <BooleanField: DatasetAccess.is_admin>
user = <ForeignKeyField: DatasetAccess.user>
user_id = <ForeignKeyField: DatasetAccess.user>
wants_newsletter = <BooleanField: DatasetAccess.wants_newsletter>
class DatasetAccessCurrent(*args, **kwargs)[source]

Bases: backend.db.DatasetAccess

DoesNotExist

alias of DatasetAccessCurrentDoesNotExist

access_requested = <DateTimeField: DatasetAccessCurrent.access_requested>
dataset = <ForeignKeyField: DatasetAccessCurrent.dataset>
dataset_id = <ForeignKeyField: DatasetAccessCurrent.dataset>
has_access = <IntegerField: DatasetAccessCurrent.has_access>
id = <AutoField: DatasetAccessCurrent.id>
is_admin = <BooleanField: DatasetAccessCurrent.is_admin>
user = <ForeignKeyField: DatasetAccessCurrent.user>
user_id = <ForeignKeyField: DatasetAccessCurrent.user>
wants_newsletter = <BooleanField: DatasetAccessCurrent.wants_newsletter>
class DatasetAccessPending(*args, **kwargs)[source]

Bases: backend.db.DatasetAccess

DoesNotExist

alias of DatasetAccessPendingDoesNotExist

access_requested = <DateTimeField: DatasetAccessPending.access_requested>
dataset = <ForeignKeyField: DatasetAccessPending.dataset>
dataset_id = <ForeignKeyField: DatasetAccessPending.dataset>
has_access = <IntegerField: DatasetAccessPending.has_access>
id = <AutoField: DatasetAccessPending.id>
is_admin = <BooleanField: DatasetAccessPending.is_admin>
user = <ForeignKeyField: DatasetAccessPending.user>
user_id = <ForeignKeyField: DatasetAccessPending.user>
wants_newsletter = <BooleanField: DatasetAccessPending.wants_newsletter>
class DatasetFile(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of DatasetFileDoesNotExist

dataset_version = <ForeignKeyField: DatasetFile.dataset_version>
dataset_version_id = <ForeignKeyField: DatasetFile.dataset_version>
download_logs
file_size = <IntegerField: DatasetFile.file_size>
id = <AutoField: DatasetFile.id>
name = <CharField: DatasetFile.name>
uri = <CharField: DatasetFile.uri>
class DatasetLogo(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of DatasetLogoDoesNotExist

data = <BlobField: DatasetLogo.data>
dataset = <ForeignKeyField: DatasetLogo.dataset>
dataset_id = <ForeignKeyField: DatasetLogo.dataset>
id = <AutoField: DatasetLogo.id>
mimetype = <CharField: DatasetLogo.mimetype>
class DatasetVersion(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of DatasetVersionDoesNotExist

available_from = <DateTimeField: DatasetVersion.available_from>
beacon_access = <EnumField: DatasetVersion.beacon_access>
consent_logs
coverage_levels = <ArrayField: DatasetVersion.coverage_levels>
coverage_set
data_contact_name = <CharField: DatasetVersion.data_contact_name>
dataset = <ForeignKeyField: DatasetVersion.dataset>
dataset_id = <ForeignKeyField: DatasetVersion.dataset>
description = <TextField: DatasetVersion.description>
file_access = <EnumField: DatasetVersion.file_access>
files
id = <AutoField: DatasetVersion.id>
mate
metrics_set
num_variants = <IntegerField: DatasetVersion.num_variants>
portal_avail = <BooleanField: DatasetVersion.portal_avail>
ref_doi = <CharField: DatasetVersion.ref_doi>
reference_set = <ForeignKeyField: DatasetVersion.reference_set>
reference_set_id = <ForeignKeyField: DatasetVersion.reference_set>
terms = <TextField: DatasetVersion.terms>
variants
version = <CharField: DatasetVersion.version>
class DatasetVersionCurrent(*args, **kwargs)[source]

Bases: backend.db.DatasetVersion

DoesNotExist

alias of DatasetVersionCurrentDoesNotExist

available_from = <DateTimeField: DatasetVersionCurrent.available_from>
beacon_access = <EnumField: DatasetVersionCurrent.beacon_access>
coverage_levels = <ArrayField: DatasetVersionCurrent.coverage_levels>
data_contact_name = <CharField: DatasetVersionCurrent.data_contact_name>
dataset = <ForeignKeyField: DatasetVersionCurrent.dataset>
dataset_id = <ForeignKeyField: DatasetVersionCurrent.dataset>
description = <TextField: DatasetVersionCurrent.description>
file_access = <EnumField: DatasetVersionCurrent.file_access>
id = <AutoField: DatasetVersionCurrent.id>
num_variants = <IntegerField: DatasetVersionCurrent.num_variants>
portal_avail = <BooleanField: DatasetVersionCurrent.portal_avail>
ref_doi = <CharField: DatasetVersionCurrent.ref_doi>
reference_set = <ForeignKeyField: DatasetVersionCurrent.reference_set>
reference_set_id = <ForeignKeyField: DatasetVersionCurrent.reference_set>
terms = <TextField: DatasetVersionCurrent.terms>
version = <CharField: DatasetVersionCurrent.version>
class EnumField(choices=None, *args, **kwargs)[source]

Bases: peewee.Field

db_field = 'string'
db_value(value)[source]
python_value(value)[source]
class Feature(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of FeatureDoesNotExist

chrom = <CharField: Feature.chrom>
feature_type = <CharField: Feature.feature_type>
gene = <ForeignKeyField: Feature.gene>
gene_id = <ForeignKeyField: Feature.gene>
id = <AutoField: Feature.id>
start = <IntegerField: Feature.start>
stop = <IntegerField: Feature.stop>
strand = <EnumField: Feature.strand>
transcript = <ForeignKeyField: Feature.transcript>
transcript_id = <ForeignKeyField: Feature.transcript>
class Gene(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of GeneDoesNotExist

canonical_transcript = <CharField: Gene.canonical_transcript>
chrom = <CharField: Gene.chrom>
exons
full_name = <CharField: Gene.full_name>
gene_id = <CharField: Gene.gene_id>
id = <AutoField: Gene.id>
name = <CharField: Gene.name>
other_names
reference_set = <ForeignKeyField: Gene.reference_set>
reference_set_id = <ForeignKeyField: Gene.reference_set>
start = <IntegerField: Gene.start>
stop = <IntegerField: Gene.stop>
strand = <EnumField: Gene.strand>
transcripts
variants
class GeneOtherNames(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of GeneOtherNamesDoesNotExist

gene = <ForeignKeyField: GeneOtherNames.gene>
gene_id = <ForeignKeyField: GeneOtherNames.gene>
id = <AutoField: GeneOtherNames.id>
name = <CharField: GeneOtherNames.name>
class Linkhash(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of LinkhashDoesNotExist

dataset_version = <ForeignKeyField: Linkhash.dataset_version>
dataset_version_id = <ForeignKeyField: Linkhash.dataset_version>
expires_on = <DateTimeField: Linkhash.expires_on>
hash = <CharField: Linkhash.hash>
id = <AutoField: Linkhash.id>
user = <ForeignKeyField: Linkhash.user>
user_id = <ForeignKeyField: Linkhash.user>
class Metrics(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of MetricsDoesNotExist

dataset_version = <ForeignKeyField: Metrics.dataset_version>
dataset_version_id = <ForeignKeyField: Metrics.dataset_version>
hist = <ArrayField: Metrics.hist>
id = <AutoField: Metrics.id>
metric = <CharField: Metrics.metric>
mids = <ArrayField: Metrics.mids>
class ReferenceSet(*args, **kwargs)[source]

Bases: backend.db.BaseModel

The GENCODE, Ensembl, dbNSFP and OMIM data are combined to fill out the Gene, Transcript and Feature tables. dbSNP data is separate, and may be shared between reference sets, so it uses a foreign key instead.

DoesNotExist

alias of ReferenceSetDoesNotExist

current_version
dataset_versions
dbnsfp_version = <CharField: ReferenceSet.dbnsfp_version>
ensembl_version = <CharField: ReferenceSet.ensembl_version>
gencode_version = <CharField: ReferenceSet.gencode_version>
genes
id = <AutoField: ReferenceSet.id>
name = <CharField: ReferenceSet.name>
reference_build = <CharField: ReferenceSet.reference_build>
class SFTPUser(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of SFTPUserDoesNotExist

account_expires = <DateTimeField: SFTPUser.account_expires>
id = <AutoField: SFTPUser.id>
password_hash = <CharField: SFTPUser.password_hash>
user = <ForeignKeyField: SFTPUser.user>
user_id = <ForeignKeyField: SFTPUser.user>
user_name = <CharField: SFTPUser.user_name>
user_uid = <IntegerField: SFTPUser.user_uid>
class SampleSet(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of SampleSetDoesNotExist

collection = <ForeignKeyField: SampleSet.collection>
collection_id = <ForeignKeyField: SampleSet.collection>
dataset = <ForeignKeyField: SampleSet.dataset>
dataset_id = <ForeignKeyField: SampleSet.dataset>
id = <AutoField: SampleSet.id>
phenotype = <CharField: SampleSet.phenotype>
sample_size = <IntegerField: SampleSet.sample_size>
class Study(*args, **kwargs)[source]

Bases: backend.db.BaseModel

A study is a scientific study with a PI and a description, and may include one or more datasets.

DoesNotExist

alias of StudyDoesNotExist

contact_email = <CharField: Study.contact_email>
contact_name = <CharField: Study.contact_name>
datasets
description = <TextField: Study.description>
id = <AutoField: Study.id>
pi_email = <CharField: Study.pi_email>
pi_name = <CharField: Study.pi_name>
publication_date = <DateTimeField: Study.publication_date>
ref_doi = <CharField: Study.ref_doi>
title = <CharField: Study.title>
class Transcript(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of TranscriptDoesNotExist

chrom = <CharField: Transcript.chrom>
gene = <ForeignKeyField: Transcript.gene>
gene_id = <ForeignKeyField: Transcript.gene>
id = <AutoField: Transcript.id>
mim_annotation = <CharField: Transcript.mim_annotation>
mim_gene_accession = <IntegerField: Transcript.mim_gene_accession>
start = <IntegerField: Transcript.start>
stop = <IntegerField: Transcript.stop>
strand = <EnumField: Transcript.strand>
transcript_id = <CharField: Transcript.transcript_id>
transcripts
variants
class User(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of UserDoesNotExist

access_current
access_logs
access_pending
affiliation = <CharField: User.affiliation>
consent_logs
country = <CharField: User.country>
dataset_access
download_logs
email = <CharField: User.email>
has_access(dataset, ds_version=None)[source]

Check whether the user has permission to access a dataset.

Parameters:
  • dataset (Dataset) – peewee Dataset object
  • ds_version (str) – the dataset version
Returns: allowed to access
Return type: bool

has_requested_access(dataset)[source]
id = <AutoField: User.id>
identity = <CharField: User.identity>
identity_type = <EnumField: User.identity_type>
is_admin(dataset)[source]
name = <CharField: User.name>
sftp_user
class UserAccessLog(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of UserAccessLogDoesNotExist

action = <EnumField: UserAccessLog.action>
dataset = <ForeignKeyField: UserAccessLog.dataset>
dataset_id = <ForeignKeyField: UserAccessLog.dataset>
id = <AutoField: UserAccessLog.id>
ts = <DateTimeField: UserAccessLog.ts>
user = <ForeignKeyField: UserAccessLog.user>
user_id = <ForeignKeyField: UserAccessLog.user>
class UserConsentLog(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of UserConsentLogDoesNotExist

dataset_version = <ForeignKeyField: UserConsentLog.dataset_version>
dataset_version_id = <ForeignKeyField: UserConsentLog.dataset_version>
id = <AutoField: UserConsentLog.id>
ts = <DateTimeField: UserConsentLog.ts>
user = <ForeignKeyField: UserConsentLog.user>
user_id = <ForeignKeyField: UserConsentLog.user>
class UserDownloadLog(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of UserDownloadLogDoesNotExist

dataset_file = <ForeignKeyField: UserDownloadLog.dataset_file>
dataset_file_id = <ForeignKeyField: UserDownloadLog.dataset_file>
id = <AutoField: UserDownloadLog.id>
ts = <DateTimeField: UserDownloadLog.ts>
user = <ForeignKeyField: UserDownloadLog.user>
user_id = <ForeignKeyField: UserDownloadLog.user>
class Variant(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of VariantDoesNotExist

allele_count = <IntegerField: Variant.allele_count>
allele_freq = <FloatField: Variant.allele_freq>
allele_num = <IntegerField: Variant.allele_num>
alt = <CharField: Variant.alt>
chrom = <CharField: Variant.chrom>
dataset_version = <ForeignKeyField: Variant.dataset_version>
dataset_version_id = <ForeignKeyField: Variant.dataset_version>
filter_string = <CharField: Variant.filter_string>
genes
hom_count = <IntegerField: Variant.hom_count>
id = <AutoField: Variant.id>
orig_alt_alleles = <ArrayField: Variant.orig_alt_alleles>
pos = <IntegerField: Variant.pos>
quality_metrics = <BinaryJSONField: Variant.quality_metrics>
ref = <CharField: Variant.ref>
rsid = <IntegerField: Variant.rsid>
site_quality = <FloatField: Variant.site_quality>
transcripts
variant_id = <CharField: Variant.variant_id>
vep_annotations = <BinaryJSONField: Variant.vep_annotations>
class VariantGenes(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of VariantGenesDoesNotExist

gene = <ForeignKeyField: VariantGenes.gene>
gene_id = <ForeignKeyField: VariantGenes.gene>
id = <AutoField: VariantGenes.id>
variant = <ForeignKeyField: VariantGenes.variant>
variant_id = <ForeignKeyField: VariantGenes.variant>
class VariantMate(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of VariantMateDoesNotExist

allele_count = <IntegerField: VariantMate.allele_count>
allele_freq = <FloatField: VariantMate.allele_freq>
allele_num = <IntegerField: VariantMate.allele_num>
alt = <CharField: VariantMate.alt>
chrom = <CharField: VariantMate.chrom>
chrom_id = <CharField: VariantMate.chrom_id>
dataset_version = <ForeignKeyField: VariantMate.dataset_version>
dataset_version_id = <ForeignKeyField: VariantMate.dataset_version>
id = <AutoField: VariantMate.id>
mate_chrom = <CharField: VariantMate.mate_chrom>
mate_id = <CharField: VariantMate.mate_id>
mate_start = <IntegerField: VariantMate.mate_start>
pos = <IntegerField: VariantMate.pos>
ref = <CharField: VariantMate.ref>
variant_id = <CharField: VariantMate.variant_id>
class VariantTranscripts(*args, **kwargs)[source]

Bases: backend.db.BaseModel

DoesNotExist

alias of VariantTranscriptsDoesNotExist

id = <AutoField: VariantTranscripts.id>
transcript = <ForeignKeyField: VariantTranscripts.transcript>
transcript_id = <ForeignKeyField: VariantTranscripts.transcript>
variant = <ForeignKeyField: VariantTranscripts.variant>
variant_id = <ForeignKeyField: VariantTranscripts.variant>
build_dict_from_row(row)[source]
get_admin_datasets(user)[source]

Get a list of datasets where the user is admin.

Parameters: user (User) – peewee User object for the user of interest
Returns: the DatasetAccess entries in which the user is admin
Return type: DatasetAccess
get_dataset(dataset: str)[source]

Given a dataset name, get the Dataset.

Parameters: dataset (str) – short name of the dataset
Returns: the corresponding Dataset entry
Return type: Dataset
get_dataset_version(dataset: str, version: str = None)[source]

Given a dataset name, get the DatasetVersion.

Parameters:
  • dataset (str) – short name of the dataset
  • version (str) – the dataset version (optional)
Returns: the corresponding DatasetVersion entry
Return type: DatasetVersion
get_next_free_uid()[source]

Get the next free uid: at least 10000 and greater than every uid currently in the sftp_user table in the database.

Returns:the next free uid
Return type:int
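
The rule can be sketched as follows. In this example the existing uids are passed in as a plain list, whereas the documented function reads them from the sftp_user table; the function name and the `minimum` parameter are assumptions of the sketch.

```python
def next_free_uid(existing_uids, minimum=10000):
    """Return the smallest uid that is >= minimum and above every existing uid."""
    # With no existing uids, start just below the minimum.
    highest = max(existing_uids, default=minimum - 1)
    return max(highest + 1, minimum)
```

With an empty table this yields 10000; with existing uids 10000 and 10005 it yields 10006.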

handlers

class AdminHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: backend.handlers.SafeHandler

prepare()[source]

This method is called before any other method. Having the decorator @tornado.web.authenticated here implies that all the Handlers that inherit from this one are going to require authentication in all their methods.

class AngularTemplate(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: backend.handlers.UnsafeHandler

get(path)[source]
initialize(path)[source]
class AuthorizedHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: backend.handlers.SafeHandler

prepare()[source]

This method is called before any other method. Having the decorator @tornado.web.authenticated here implies that all the Handlers that inherit from this one are going to require authentication in all their methods.

class AuthorizedStaticNginxFileHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: backend.handlers.AuthorizedHandler, backend.handlers.BaseStaticNginxFileHandler

Serve static files for authenticated users from the nginx frontend

Requires a “path” argument in constructor which should be the root of the nginx frontend where the files can be found. Then configure the nginx frontend something like this:

location <path> {
    internal;
    alias <location of files>;
}
class BaseHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: tornado.web.RequestHandler

Base Handler. Handlers should not inherit from this class directly but from either SafeHandler or UnsafeHandler to make security status explicit.

get_current_user()[source]

Override to determine the current user from, e.g., a cookie.

This method may not be a coroutine.

on_finish()[source]

Called after the end of a request.

Override this method to perform cleanup, logging, etc. This method is a counterpart to prepare. on_finish may not produce any output, as it is called after the response has been sent to the client.

prepare()[source]

Called at the beginning of a request before get/post/etc.

Override this method to perform common initialization regardless of the request method.

Asynchronous support: Use async def or decorate this method with gen.coroutine to make it asynchronous. If this method returns an Awaitable, execution will not proceed until the Awaitable is done.

New in version 3.1: Asynchronous support.

set_user_msg(msg, level='info')[source]

This function sets the user message cookie. The system accepts four levels by default: ‘success’, ‘info’, ‘warning’, and ‘error’. Messages set to any other level default to ‘info’.
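
The fallback for unknown levels can be sketched like this. The helper name `normalize_level` is hypothetical, and the cookie handling itself is left out.

```python
VALID_LEVELS = ("success", "info", "warning", "error")


def normalize_level(level="info"):
    """Fall back to 'info' for any level outside the four defaults."""
    return level if level in VALID_LEVELS else "info"
```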

write(chunk)[source]

Writes the given chunk to the output buffer.

To write the output to the network, use the flush() method below.

If the given chunk is a dictionary, we write it as JSON and set the Content-Type of the response to be application/json. (if you want to send JSON as a different Content-Type, call set_header after calling write()).

Note that lists are not converted to JSON because of a potential cross-site security vulnerability. All JSON output should be wrapped in a dictionary. More details at http://haacked.com/archive/2009/06/25/json-hijacking.aspx/ and https://github.com/facebook/tornado/issues/1009
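
Because lists are not converted, a handler that wants to return a list can wrap it in a dictionary before calling write(). A minimal sketch of that wrapping step, where the helper name `as_json_body` and the `data` key are assumptions of this example:

```python
import json


def as_json_body(payload, key="data"):
    """Serialize payload as JSON, wrapping lists in a dictionary first."""
    if isinstance(payload, list):
        payload = {key: payload}
    return json.dumps(payload)
```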

write_error(status_code, **kwargs)[source]

Overrides the write_error method to provide custom error pages. http://tornado.readthedocs.org/en/latest/web.html#tornado.web.RequestHandler.write_error

class BaseStaticNginxFileHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: backend.handlers.UnsafeHandler

Serve static files for users from the nginx frontend

Requires a path argument in constructor which should be the root of the nginx frontend where the files can be found. Then configure the nginx frontend something like this:

location <path> {
    internal;
    alias <location of files>;
}
get(dataset, file, ds_version=None, user=None)[source]
initialize(path)[source]
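
An `internal` location like the one above is normally reached by having the backend answer with an `X-Accel-Redirect` header that names a path under that location; nginx then serves the file itself. The sketch below only builds such a header. The function name and the path layout (`<path>/<dataset>/<version>/<file>`) are assumptions for illustration and may not match what this handler does.

```python
import posixpath


def accel_redirect_header(root, dataset, ds_version, filename):
    """Build an X-Accel-Redirect header for an internal nginx location."""
    # basename() drops any directory components from the file name.
    safe_name = posixpath.basename(filename)
    target = posixpath.join(root, dataset, ds_version, safe_name)
    return {"X-Accel-Redirect": target}
```
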
class SafeHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: backend.handlers.BaseHandler

All handlers that need authentication and authorization should inherit from this class.

prepare()[source]

This method is called before any other method. Having the decorator @tornado.web.authenticated here implies that all the Handlers that inherit from this one are going to require authentication in all their methods.

class SafeStaticFileHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: tornado.web.StaticFileHandler, backend.handlers.SafeHandler

Serve static files for logged in users

class TemporaryStaticNginxFileHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: backend.handlers.BaseStaticNginxFileHandler

get(dataset, ds_version, hash_value, file)[source]
class UnsafeHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: backend.handlers.BaseHandler

route

class Application(settings)[source]

Bases: tornado.web.Application

settings

test

class RequestTests(methodName='runTest')[source]

Bases: unittest.case.TestCase

HOST = 'http://localhost:4000'
assertHTTPCode(path, code=200, method='get', *args, **kwargs)[source]
cookies
destroySession()[source]
get(path, *args, **kwargs)[source]
getUrl(path)[source]
login_user(user)[source]
newSession()[source]
post(path, *args, **kwargs)[source]
session
class TestAdminAccess(methodName='runTest')[source]

Bases: backend.test.RequestTests

setUp()[source]

Hook method for setting up the test fixture before exercising it.

tearDown()[source]

Hook method for deconstructing the test fixture after testing it.

test_admin_is_admin()[source]
test_admin_list_users()[source]
test_admin_list_users_get_data()[source]
test_admin_list_users_only_own_project_1()[source]
test_admin_list_users_only_own_project_2()[source]
test_login_admin()[source]
class TestEndpoints(methodName='runTest')[source]

Bases: backend.test.RequestTests

test_dataset_collection()[source]
test_datasets()[source]
test_get_one_version()[source]
test_get_users_current()[source]
test_get_users_pending()[source]
test_get_versions()[source]
test_list_files()[source]
test_one_dataset()[source]
test_request_access()[source]
test_user_datasets()[source]
test_user_me()[source]
class TestLoggedInUser(methodName='runTest')[source]

Bases: backend.test.RequestTests

setUp()[source]

Hook method for setting up the test fixture before exercising it.

tearDown()[source]

Hook method for deconstructing the test fixture after testing it.

testLoggedInFiles()[source]
testLoggedInTempLinkGet()[source]
testLoggedInTempLinkPost()[source]
testLoggedInTempLinkPostXSRF1()[source]
testLoggedInTempLinkPostXSRF2()[source]
class TestRequestAccess(methodName='runTest')[source]

Bases: backend.test.RequestTests

USER = 'e1'
setUp()[source]

Hook method for setting up the test fixture before exercising it.

tearDown()[source]

Hook method for deconstructing the test fixture after testing it.

test_get_xsrf_token()[source]
test_login()[source]
test_request_access_correctly()[source]
test_request_access_with_get()[source]
test_request_access_with_wrong_xsrf_1()[source]
test_request_access_with_wrong_xsrf_2()[source]
test_request_access_without_xsrf()[source]
class TestUserManagement(methodName='runTest')[source]

Bases: backend.test.RequestTests

setUp()[source]

Hook method for setting up the test fixture before exercising it.

tearDown()[source]

Hook method for deconstructing the test fixture after testing it.

test_admin_approve_user()[source]
test_full_user_roundabout()[source]
test_recently_approved_user_can_list_files()[source]
test_recently_revoked_user_cant_list_files()[source]

Variant browser

browser_handlers

Request handlers for the variant browser.

class Autocomplete(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

Provide autocompletion for protein names based on current query.

get(dataset: str, query: str, ds_version: str = None)[source]

Provide autocompletion for protein names based on current query.

Parameters:
  • dataset (str) – dataset short name
  • query (str) – query
  • ds_version (str) – dataset version
class Download(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

Download variants in CSV format.

get(dataset: str, datatype: str, item: str, ds_version: str = None, filter_type: str = None)[source]

Download variants in CSV format.

Will filter the variants if filter_type is provided.

Parameters:
  • dataset (str) – dataset short name
  • datatype (str) – type of data
  • item (str) – query item
  • ds_version (str) – dataset version
  • filter_type (str) – type of filter to apply
class GetCoverage(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

Retrieve coverage.

get(dataset: str, datatype: str, item: str, ds_version: str = None)[source]

Retrieve coverage.

Parameters:
  • dataset (str) – dataset short name
  • datatype (str) – type of data
  • item (str) – query item
  • ds_version (str) – dataset version
class GetCoveragePos(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

Retrieve coverage range.

get(dataset: str, datatype: str, item: str, ds_version: str = None)[source]

Retrieve coverage range.

Parameters:
  • dataset (str) – dataset short name
  • datatype (str) – type of data
  • item (str) – query item
  • ds_version (str) – dataset version
class GetGene(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

Request information about a gene.

get(dataset: str, gene: str, ds_version: str = None)[source]

Request information about a gene.

Parameters:
  • dataset (str) – short name of the dataset
  • gene (str) – the gene id
  • ds_version (str) – dataset version
class GetRegion(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

Request information about genes in a region.

get(dataset: str, region: str, ds_version: str = None)[source]

Request information about genes in a region.

Parameters:
  • dataset (str) – short name of the dataset
  • region (str) – the region in the format chr-startpos-endpos
  • ds_version (str) – dataset version
class GetTranscript(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

Request information about a transcript.

get(dataset: str, transcript: str, ds_version: str = None)[source]

Request information about a transcript.

Parameters:
  • dataset (str) – short name of the dataset
  • transcript (str) – the transcript id
Returns:

transcript (transcript and exons), gene (gene information)

Return type:

dict

class GetVariant(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

Request information about a variant.

get(dataset: str, variant: str, ds_version: str = None)[source]

Request information about a variant.

Parameters:
  • dataset (str) – short name of the dataset
  • variant (str) – variant in the format chrom-pos-ref-alt
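
A variant string in the chrom-pos-ref-alt format could be split into typed fields roughly as follows. This is a sketch only: `VariantId` and `split_variant_id` are hypothetical names and are not part of the documented module.

```python
from typing import NamedTuple


class VariantId(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def split_variant_id(variant: str) -> VariantId:
    """Split 'chrom-pos-ref-alt' into its four parts (hypothetical helper)."""
    parts = variant.split("-")
    if len(parts) != 4:
        raise ValueError(f"expected chrom-pos-ref-alt, got {variant!r}")
    chrom, pos, ref, alt = parts
    # int() raises ValueError if the position is not numeric.
    return VariantId(chrom, int(pos), ref, alt)


if __name__ == "__main__":
    print(split_variant_id("1-1000-A-T"))
```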
class GetVariants(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

Retrieve variants.

get(dataset: str, datatype: str, item: str, ds_version: str = None)[source]

Retrieve variants.

Parameters:
  • dataset (str) – short name of the dataset
  • datatype (str) – gene, region, or transcript
  • item (str) – item to query
class Search(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]

Bases: handlers.UnsafeHandler

Perform a search for the wanted object.

get(dataset: str, query: str, ds_version: str = None)[source]

Perform a search for the wanted object.

Parameters:
  • dataset (str) – short name of the dataset
  • query (str) – search query
lookups

Lookups for a PostgreSQL database with genomic data.

Lookup functions for the variant browser.

autocomplete(dataset: str, query: str, ds_version: str = None)[source]

Provide autocomplete suggestions based on the query.

Parameters:
  • dataset (str) – short name of dataset
  • query (str) – the query to compare to the available gene names
  • ds_version (str) – the dataset version
Returns:

A list of gene names whose beginning matches the query

Return type:

list
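
The prefix matching described above can be sketched in memory. The real function queries the database; the helper name `prefix_matches`, the case-insensitive comparison, and the result limit are all assumptions made for illustration.

```python
def prefix_matches(query: str, gene_names: list, limit: int = 10) -> list:
    """Return gene names that start with the query (hypothetical helper)."""
    wanted = query.upper()
    hits = [name for name in gene_names if name.upper().startswith(wanted)]
    # Sort for a stable order, then cap the number of suggestions.
    return sorted(hits)[:limit]


if __name__ == "__main__":
    print(prefix_matches("brc", ["TP53", "BRCA2", "BRCA1"]))
```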

get_awesomebar_result(dataset: str, query: str, ds_version: str = None)[source]

Parse the search input.

Datatype is one of:

  • gene
  • transcript
  • variant
  • dbsnp_variant_set
  • region

Identifier is one of:

  • ensembl ID for gene
  • variant ID string for variant (e.g. 1-1000-A-T)
  • region ID string for region (e.g. 1-1000-2000)

Follow these steps:

  • if query is an ensembl ID, return it
  • if a gene symbol, return that gene’s ensembl ID
  • if an RSID, return that variant’s string
Parameters:
  • dataset (str) – short name of dataset
  • query (str) – the search query
  • ds_version (str) – the dataset version
Returns:

(datatype, identifier)

Return type:

tuple
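
The purely syntactic part of this classification can be sketched with regular expressions. The function name `classify_query` and the exact patterns are assumptions. Gene symbols and rsids need a database lookup in the real function, so this sketch returns None for anything it cannot classify from the text alone and does not resolve an rsid to a variant string.

```python
import re
from typing import Optional, Tuple

# Assumed patterns; the real parser may accept other forms.
GENE_ID = re.compile(r"ENSG\d+", re.IGNORECASE)
TRANSCRIPT_ID = re.compile(r"ENST\d+", re.IGNORECASE)
RSID = re.compile(r"rs\d+", re.IGNORECASE)
VARIANT = re.compile(r"\w+-\d+-[ACGTN]+-[ACGTN]+")
REGION = re.compile(r"\w+-\d+-\d+")


def classify_query(query: str) -> Optional[Tuple[str, str]]:
    """Return (datatype, identifier), or None if a lookup is needed."""
    query = query.strip()
    if GENE_ID.fullmatch(query):
        return ("gene", query.upper())
    if TRANSCRIPT_ID.fullmatch(query):
        return ("transcript", query.upper())
    if RSID.fullmatch(query):
        return ("dbsnp_variant_set", query.lower())
    if VARIANT.fullmatch(query):
        return ("variant", query)
    if REGION.fullmatch(query):
        return ("region", query)
    return None  # e.g. a gene symbol, which needs the database


if __name__ == "__main__":
    for text in ("ENSG00000139618", "rs123", "1-1000-A-T", "1-1000-2000", "BRCA2"):
        print(text, classify_query(text))
```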

get_coverage_for_bases(dataset: str, chrom: str, start_pos: int, end_pos: int = None, ds_version: str = None)[source]

Get the coverage for the list of bases given by start_pos->end_pos, inclusive.

Parameters:
  • dataset (str) – short name for the dataset
  • chrom (str) – chromosome
  • start_pos (int) – first position of interest
  • end_pos (int) – last position of interest; if None it will be set to start_pos
  • ds_version (str) – version of the dataset
Returns:

coverage dicts for the region of interest. None if failed

Return type:

list

get_coverage_for_transcript(dataset: str, chrom: str, start_pos: int, end_pos: int = None, ds_version: str = None)[source]

Get the coverage for the list of bases given by start_pos->end_pos, inclusive.

Parameters:
  • dataset (str) – short name for the dataset
  • chrom (str) – chromosome
  • start_pos (int) – first position of interest
  • end_pos (int) – last position of interest; if None it will be set to start_pos
  • ds_version (str) – version of the dataset
Returns:

coverage dicts for the region of interest

Return type:

list

get_exons_in_transcript(dataset: str, transcript_id: str, ds_version=None)[source]

Retrieve exons associated with the given transcript id.

Parameters:
  • dataset (str) – short name of the dataset
  • transcript_id (str) – the id of the transcript
  • ds_version (str) – dataset version
Returns:

dicts with values for each exon sorted by start position

Return type:

list

get_gene(dataset: str, gene_id: str, ds_version: str = None)[source]

Retrieve gene by gene id.

Parameters:
  • dataset (str) – short name of the dataset
  • gene_id (str) – the id of the gene
  • ds_version (str) – dataset version
Returns:

values for the gene; None if not found

Return type:

dict

get_gene_by_dbid(gene_dbid: str)[source]

Retrieve gene by gene database id.

Parameters:gene_dbid (str) – the database id of the gene
Returns:values for the gene; empty if not found
Return type:dict
get_gene_by_name(dataset: str, gene_name: str, ds_version=None)[source]

Retrieve gene by gene_name.

Parameters:
  • dataset (str) – short name of the dataset
  • gene_name (str) – the name of the gene
  • ds_version (str) – dataset version
Returns:

values for the gene; empty if not found

Return type:

dict

get_genes_in_region(dataset: str, chrom: str, start_pos: int, stop_pos: int, ds_version: str = None)[source]

Retrieve genes located within a region.

Parameters:
  • dataset (str) – short name of the dataset
  • chrom (str) – chromosome name
  • start_pos (int) – start of region
  • stop_pos (int) – end of region
  • ds_version (str) – dataset version
Returns:

values for the gene; empty if not found

Return type:

dict

get_raw_variant(dataset: str, pos: int, chrom: str, ref: str, alt: str, ds_version: str = None)[source]

Retrieve variant by position and change.

Parameters:
  • dataset (str) – short name of the reference set
  • pos (int) – position of the variant
  • chrom (str) – name of the chromosome
  • ref (str) – reference sequence
  • alt (str) – variant sequence
  • ds_version (str) – dataset version
Returns:

values for the variant; None if not found

Return type:

dict

get_transcript(dataset: str, transcript_id: str, ds_version: str = None)[source]

Retrieve transcript by transcript id.

Also includes exons as [‘exons’]

Parameters:
  • dataset (str) – short name of the dataset
  • transcript_id (str) – the id of the transcript
  • ds_version (str) – dataset version
Returns:

values for the transcript, including exons; None if not found

Return type:

dict

get_transcripts_in_gene(dataset: str, gene_id: str, ds_version: str = None)[source]

Get the transcripts associated with a gene.

Parameters:
  • dataset (str) – short name of the reference set
  • gene_id (str) – id of the gene
  • ds_version (str) – dataset version
Returns:

transcripts (dict) associated with the gene; empty if no hits

Return type:

list

get_transcripts_in_gene_by_dbid(gene_dbid: int)[source]

Get the transcripts associated with a gene.

Parameters:gene_dbid (int) – database id of the gene
Returns:transcripts (dict) associated with the gene; empty if no hits
Return type:list
get_variant(dataset: str, pos: int, chrom: str, ref: str, alt: str, ds_version: str = None)[source]

Retrieve variant by position and change.

Parameters:
  • dataset (str) – short name of the dataset
  • pos (int) – position of the variant
  • chrom (str) – name of the chromosome
  • ref (str) – reference sequence
  • alt (str) – variant sequence
  • ds_version (str) – version of the dataset
Returns:

values for the variant; None if not found

Return type:

dict

get_variants_by_rsid(dataset: str, rsid: str, ds_version: str = None)[source]

Retrieve variants by their associated rsid.

Parameters:
  • dataset (str) – short name of dataset
  • rsid (str) – rsid of the variant (starting with rs)
  • ds_version (str) – version of the dataset
Returns:

variants as dict; no hits returns None

Return type:

list

get_variants_in_gene(dataset: str, gene_id: str, ds_version: str = None)[source]

Retrieve variants present inside a gene.

Parameters:
  • dataset (str) – short name of the dataset
  • gene_id (str) – id of the gene
  • ds_version (str) – version of the dataset
Returns:

values for the variants

Return type:

list

get_variants_in_region(dataset: str, chrom: str, start_pos: int, end_pos: int, ds_version: str = None)[source]

Variants that overlap a region.

Parameters:
  • dataset (str) – short name of the dataset
  • chrom (str) – name of the chromosome
  • start_pos (int) – start of the region
  • end_pos (int) – end of the region
  • ds_version (str) – version of the dataset
Returns:

variant dicts, None if no hits

Return type:

list

get_variants_in_transcript(dataset: str, transcript_id: str, ds_version: str = None)[source]

Retrieve variants inside a transcript.

Parameters:
  • dataset (str) – short name of the dataset
  • transcript_id (str) – id of the transcript (ENST)
  • ds_version (str) – version of the dataset
Returns:

values for the variant; None if not found

Return type:

dict

route

Routing definitions

utils

Utility functions for lookups and browser_handlers.

add_consequence_to_variant(variant: dict)[source]

Add information about variant consequence to a variant.

Parameters:variant (dict) – variant information
add_consequence_to_variants(variant_list: list)[source]

Add information about variant consequence to multiple variants.

Parameters:
  • variant_list (list) – list of variants
  • datatype (str) – type of data
  • item (str) – query item
annotation_severity(annotation: dict)[source]

Evaluate severity of the consequences; “bigger is more important”.

Parameters:annotation (dict) – vep_annotation from a variant
Returns:severity score
Return type:float
get_coverage(dataset: str, datatype: str, item: str, ds_version: str = None)[source]

Retrieve coverage for a gene/region/transcript.

Parameters:
  • dataset (str) – short name of the dataset
  • datatype (str) – type of “region” (gene/region/transcript)
  • item (str) – the datatype item to look up
  • ds_version (str) – the dataset version
Returns:

start, stop, coverage list

Return type:

dict

get_coverage_pos(dataset: str, datatype: str, item: str, ds_version: str = None)[source]

Retrieve coverage range.

Parameters:
  • dataset (str) – short name of the dataset
  • datatype (str) – type of “region” (gene/region/transcript)
  • item (str) – the datatype item to look up
Returns:

start, stop, chrom

Return type:

dict

get_flags_from_variant(variant: dict)[source]

Get flags from variant.

Checks for:

  • MNP
  • LoF (loss of function)

Parameters:variant (dict) – a variant
Returns:flags for the variant
Return type:list
get_proper_hgvs(annotation: dict)[source]

Get HGVS for change, either at transcript or protein level.

Parameters:annotation (dict) – VEP annotation with HGVS information
Returns:variant effect at aa level in HGVS format (p.), None if parsing fails
Return type:str
get_protein_hgvs(annotation)[source]

Amino acid changes in HGVS format.

Parameters:annotation (dict) – VEP annotation with HGVS information
Returns:variant effect at aa level in HGVS format (p.), None if parsing fails
Return type:str
get_transcript_hgvs(annotation: dict)[source]

Nucleotide change in HGVS format.

Parameters:annotation (dict) – VEP annotation with HGVS information
Returns:variant effect at nucleotide level in HGVS format (c.), None if parsing fails
Return type:str
get_variant_list(dataset: str, datatype: str, item: str, ds_version: str = None)[source]

Retrieve variants for a datatype.

Parameters:
  • dataset (str) – dataset short name
  • datatype (str) – type of data
  • item (str) – query item
  • ds_version (str) – dataset version
Returns:

{variants:list, headers:list}

Return type:

dict

is_region_too_large(start: int, stop: int)[source]

Evaluate whether the size of a region is larger than maximum query.

Parameters:
  • start (int) – Start position of the region
  • stop (int) – End position of the region
Returns:

True if too large

Return type:

bool
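
A size check of this kind might look like the sketch below. The limit `MAX_REGION_SIZE` and its value are hypothetical; the documentation does not state the actual maximum query size or whether the comparison is inclusive.

```python
# Hypothetical limit; the real maximum is not given in this documentation.
MAX_REGION_SIZE = 100_000


def region_too_large(start: int, stop: int, max_size: int = MAX_REGION_SIZE) -> bool:
    """Return True if the span from start to stop exceeds max_size."""
    return stop - start > max_size


if __name__ == "__main__":
    print(region_too_large(100, 200))
    print(region_too_large(1, 1_000_000))
```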

order_vep_by_csq(annotation_list: list)[source]

Will add “major_consequence” to each annotation and order by severity.

Parameters:annotation_list (list) – VEP annotations (as dict)
Returns:annotations ordered by major consequence severity
Return type:list
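
The ordering step can be sketched as follows. The ranking list is a short, hypothetical stand-in for the full consequence order, and the annotation key "Consequence" is an assumption about how the VEP annotation dicts are laid out.

```python
# Hypothetical, abbreviated severity ranking (most severe first).
CSQ_ORDER = ["stop_gained", "missense_variant", "synonymous_variant", "intron_variant"]
RANK = {csq: index for index, csq in enumerate(CSQ_ORDER)}


def order_by_severity(annotation_list: list) -> list:
    """Tag each annotation with its worst consequence, then sort by it."""
    for annotation in annotation_list:
        # A consequence string may hold several terms joined by "&".
        consequences = annotation["Consequence"].split("&")
        annotation["major_consequence"] = min(consequences, key=RANK.__getitem__)
    return sorted(annotation_list, key=lambda a: RANK[a["major_consequence"]])


if __name__ == "__main__":
    data = [
        {"Consequence": "intron_variant"},
        {"Consequence": "missense_variant&stop_gained"},
    ]
    for item in order_by_severity(data):
        print(item["major_consequence"])
```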
parse_dataset(dataset: str, ds_version: str = None)[source]

Check/parse if the dataset name is in the beacon form (reference:dataset:version).

Parameters:
  • dataset (str) – short name of the dataset
  • ds_version (str) – the dataset version
Returns:

(dataset, version)

Return type:

tuple
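
Splitting a beacon-style name could be sketched like this. The helper name `split_beacon_name` is hypothetical, the example names and version string are illustrative only, and letting the version embedded in the name take precedence over `ds_version` is an assumption.

```python
def split_beacon_name(dataset: str, ds_version: str = None) -> tuple:
    """Return (dataset, version) from 'reference:dataset:version' or a plain name."""
    parts = dataset.split(":")
    if len(parts) == 3:
        # Assumption: the embedded version overrides the ds_version argument.
        _reference, dataset, ds_version = parts
    return dataset, ds_version


if __name__ == "__main__":
    print(split_beacon_name("GRCh37:SweGen:20180409"))
    print(split_beacon_name("SweGen"))
```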

parse_region(region: str)[source]

Parse a region with either one or two positions.

Parameters:region (str) – region, e.g. 3-100-200 or 3-100
Returns:(chrom, start, stop)
Return type:tuple
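
A minimal sketch of such a parser is shown below. The name `split_region` is hypothetical, and treating a single position as a region whose end equals its start is an assumption.

```python
def split_region(region: str) -> tuple:
    """Parse 'chrom-start-stop' or 'chrom-pos' into (chrom, start, stop)."""
    parts = region.split("-")
    if len(parts) not in (2, 3):
        raise ValueError(f"unable to parse region {region!r}")
    chrom = parts[0]
    start = int(parts[1])
    # Assumption: with one position, the region covers that single base.
    stop = int(parts[2]) if len(parts) == 3 else start
    return chrom, start, stop


if __name__ == "__main__":
    print(split_region("3-100-200"))
    print(split_region("3-100"))
```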
remove_extraneous_information(variant: dict)[source]

Remove information that is not used in the frontend from a variant.

Parameters:variant (dict) – variant data from database
remove_extraneous_vep_annotations(annotation_list: list)[source]

Remove annotations with low-impact consequences (less than intron variant).

Parameters:annotation_list (list) – VEP annotations (as dict)
Returns:VEP annotations with higher impact
Return type:list
worst_csq_from_csq(csq: str)[source]

Find worst consequence in a possibly &-filled consequence string.

Parameters:csq (str) – string of consequences, separated with & (if multiple)
Returns:the worst consequence
Return type:str
worst_csq_from_list(csq_list: list)[source]

Choose the worst consequence.

Parameters:csq_list (list) – list of consequences
Returns:the worst consequence
Return type:str
worst_csq_index(csq_list: list)[source]

Find the index of the worst consequence.

Corresponds to the lowest value (index) from CSQ_ORDER_DICT.

Parameters:csq_list (list) – consequences
Returns:index in CSQ_ORDER_DICT of the worst consequence
Return type:int
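
The three worst-consequence helpers above fit together as in this sketch. The list `CSQ_ORDER` is a short, hypothetical excerpt ordered from most to least severe, and the function names are shortened stand-ins for illustration rather than the module's own implementations.

```python
# Hypothetical excerpt of the severity order (most severe first).
CSQ_ORDER = [
    "transcript_ablation",
    "splice_acceptor_variant",
    "stop_gained",
    "missense_variant",
    "synonymous_variant",
    "intron_variant",
]
CSQ_ORDER_DICT = {csq: index for index, csq in enumerate(CSQ_ORDER)}


def worst_index(csq_list: list) -> int:
    """Lowest index in CSQ_ORDER_DICT, i.e. the most severe consequence."""
    return min(CSQ_ORDER_DICT[csq] for csq in csq_list)


def worst_from_list(csq_list: list) -> str:
    return CSQ_ORDER[worst_index(csq_list)]


def worst_from_string(csq: str) -> str:
    """Handle a consequence string with terms joined by '&'."""
    return worst_from_list(csq.split("&"))


if __name__ == "__main__":
    print(worst_from_string("missense_variant&splice_acceptor_variant"))
```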
worst_csq_with_vep(annotation_list: list)[source]

Choose the vep annotation with the most severe consequence.

Add a “major_consequence” field for that annotation.

Parameters:annotation_list (list) – VEP annotations
Returns:the annotation with the most severe consequence
Return type:dict

Documentation

Set up a development system

In order to set up a minimal database system for development:

  1. Install Docker (and docker-compose, if it is not included in the installation).

  2. Start the server and database:

    ` $ docker-compose up `

  3. Add test data:

    ` $ psql -h localhost -U postgres swefreq -f test/data/browser_test_data.sql `

The test data contains all data required for the browser tests.

Importing data

The data import system can be found at scripts/importer, and helper scripts are available in scripts/.

Merge accounts for Elixir AAI

It is possible to keep the same dataset permissions across several logins (e.g. you are a dataset admin with your institutional account and want to log in with your ORCID account and still be an admin).

To merge the accounts:

  1. Log in to the Perun Identity consolidator at https://perun.elixir-czech.cz/fed/gui using the account that has admin access.
  2. Go to the authentication tab and click identity consolidator >>.
  3. Log in with your second account.