Welcome to jsonldschema’s documentation!

The JSONLDschema Python package supports the creation and use of machine-actionable and FAIR (Findable, Accessible, Interoperable and Reusable) metadata models expressed as JSON-schemas for JSON-LD data.

The approach relies on representing the metadata models as JSON-schemas with additional JSON-LD context files to provide semantic annotations to the model.

This python package can be used in combination with some visualisation tools:

We have used this approach in several metadata models, such as the Data Tag Suite (DATS) model for data discovery and multiple metadata models generated from Minimum Information Requirements.

How to compare a set of JSON-Schemas

Code documentation

How to compare and merge a set of JSON-LD-schemas: code documentation

[Figure: class diagram for semDiff.compareEntities (classes_semDiff_compareEntities.png)]
class compareEntities.EntityCoverage(schema_a, context_a, schema_b, context_b)[source]

A class that computes the overlap between two JSON schemas based on the semantic values taken from their context files. This operation is not commutative: to find out whether the schema/context pairs are equivalent, run both semDiff(s_a, c_a, s_b, c_b) and semDiff(s_b, c_b, s_a, c_a).

Parameters:
  • schema_a – the content of the first schema
  • context_a – the context content bound to the first schema
  • schema_b – the content of the second schema
  • context_b – the context content bound to the second schema
_EntityCoverage__build_context_dict(schema_input)

A private method that associates each field in a schema with its semantic value in the context and reverses the result

Parameters:schema_input
Return sorted_values:
 a dictionary of semantic values and their corresponding field
Return ignored_fields:
 a list of fields that were ignored due to having no semantic value in the context file
static _EntityCoverage__compute_context_coverage(context1, context2)

Private method that compares the fields from the two schemas based on their semantic values

Parameters:
  • context1 – the final output of __build_context_dict() for the first schema
  • context2 – the final output of __build_context_dict() for the second schema
Return local_overlap_value:
 a namedtuple containing relative and absolute coverage
Return overlap_output:
 a dictionary that associates fields in schema 1 with their semantic twins in schema 2
Return unmatched_fields:
 a dictionary of all fields of the second schema that haven’t been matched in the first schema
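
The reverse mapping and the coverage comparison described above can be sketched with plain dictionaries. This is a minimal illustration, not the package's implementation: the function names, the sample schemas and the choice to express relative coverage as a percentage of the first schema's semantic values are all assumptions.

```python
from collections import namedtuple

# Hypothetical result type; the field names are assumptions.
Coverage = namedtuple("Coverage", ["relative_coverage", "absolute_coverage"])


def build_context_dict(schema, context):
    """Map each semantic value to the schema fields annotated with it."""
    by_semantic_value = {}
    ignored_fields = []
    for field in schema.get("properties", {}):
        semantic_value = context["@context"].get(field)
        if semantic_value is None:
            ignored_fields.append(field)  # no annotation in the context
        else:
            by_semantic_value.setdefault(semantic_value, []).append(field)
    return by_semantic_value, ignored_fields


def compute_coverage(context1, context2):
    """Compare two reversed context dictionaries by semantic value."""
    overlaps = {}
    for semantic_value, fields in context1.items():
        if semantic_value in context2:
            overlaps[semantic_value] = (fields, context2[semantic_value])
    unmatched = {k: v for k, v in context2.items() if k not in context1}
    relative = round(100 * len(overlaps) / len(context1), 2) if context1 else 0.0
    return Coverage(relative, len(overlaps)), overlaps, unmatched


# Illustrative inputs (made up for this sketch).
schema_a = {"properties": {"name": {}, "birthDate": {}}}
context_a = {"@context": {"name": "sdo:name", "birthDate": "sdo:birthDate"}}
schema_b = {"properties": {"fullName": {}, "email": {}, "birthPlace": {}, "nickname": {}}}
context_b = {"@context": {"fullName": "sdo:name", "email": "sdo:email",
                          "birthPlace": "sdo:birthPlace"}}

dict_a, ignored_a = build_context_dict(schema_a, context_a)
dict_b, ignored_b = build_context_dict(schema_b, context_b)
forward, forward_overlaps, forward_unmatched = compute_coverage(dict_a, dict_b)
backward, backward_overlaps, backward_unmatched = compute_coverage(dict_b, dict_a)
```

With these inputs the two directions give different relative coverage, which is why both comparisons are needed before calling two schema/context pairs equivalent.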

static _EntityCoverage__process_field(field_name, field_value, context, comparator)

Private method that retrieves the semantic value of a given field from the given context and adds it to the output

Parameters:
  • field_name – the name of the given field
  • field_value – the value of the given field
  • context – the context from which to retrieve the semantic value
  • comparator – the output of __build_context_dict()
Return comparator:
 a dictionary of semantic values and corresponding fields from the given schema and context

__init__(schema_a, context_a, schema_b, context_b)[source]

Initialize self. See help(type(self)) for accurate signature.

__weakref__

list of weak references to the object (if defined)

[Figure: class diagram for semDiff.compareNetwork (classes_semDiff_compareNetwork.png)]
class compareNetwork.NetworkCoverage(networks_array)[source]

This class computes the coverage of entities (schemas) between two networks (sets of schemas) by comparing the semantic base type of each schema.

Parameters:networks_array – an array containing the two networks to compare
static _NetworkCoverage__compute_coverage(network_a, network_b)

Private method that computes the coverage between two networks

Parameters:
  • network_a – the output of __process_network for the first network
  • network_b – the output of __process_network for the second network
Return output:
 an array containing the twinned entities, the number of processed entities and the number of twins between both networks

static _NetworkCoverage__process_network(network)

Private method that retrieves the base type of each entity in a given network for later comparison

Parameters:network – a dictionary of schemas and their context (the network itself)
Return network_output:
 a dictionary of schemas and their base type retrieved from the context
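
The two private steps above (extract a base type per entity, then pair entities that share one) can be sketched as follows. This is only an illustration under stated assumptions: the network layout ({name: {"schema": ..., "context": ...}}), the lookup of the base type through the schema "title", and the function names are hypothetical and not taken from the package.

```python
def process_network(network):
    """Return {schema_name: base_type}, skipping entities without an annotation."""
    base_types = {}
    for schema_name, entity in network.items():
        title = entity["schema"]["title"]
        # Assumption: the context maps the schema title to its semantic base type.
        base_type = entity["context"]["@context"].get(title)
        if base_type is not None:
            base_types[schema_name] = base_type
    return base_types


def compute_network_coverage(network_a, network_b):
    """Pair entities from both networks that share the same base type."""
    twins = [
        (name_a, name_b)
        for name_a, type_a in network_a.items()
        for name_b, type_b in network_b.items()
        if type_a == type_b
    ]
    # twinned entities, number of processed entities, number of twins
    return [twins, len(network_a), len(twins)]


# Illustrative networks (made up for this sketch).
first_network = {
    "person_schema.json": {"schema": {"title": "Person"},
                           "context": {"@context": {"Person": "sdo:Person"}}},
    "place_schema.json": {"schema": {"title": "Place"},
                          "context": {"@context": {"Place": "sdo:Place"}}},
}
second_network = {
    "individual_schema.json": {"schema": {"title": "Individual"},
                               "context": {"@context": {"Individual": "sdo:Person"}}},
    "dataset_schema.json": {"schema": {"title": "Dataset"},
                            "context": {"@context": {"Dataset": "sdo:Dataset"}}},
}

coverage = compute_network_coverage(process_network(first_network),
                                    process_network(second_network))
```
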
__init__(networks_array)[source]

Initialize self. See help(type(self)) for accurate signature.

__weakref__

list of weak references to the object (if defined)

[Figure: class diagram for semDiff.fullDiff (classes_semDiff_fullDiff.png)]
class fullDiff.FullDiffGenerator(first_network, second_network)[source]

A full diff generator that uses regular expressions to recompose the context file URLs. It resolves the two given networks of schemas and the two given networks of contexts, then proceeds to the actual comparison.

__init__(first_network, second_network)[source]
Parameters:
  • first_network (dict) – the first network to compare from
  • second_network (dict) – the second network to compare against
__weakref__

list of weak references to the object (if defined)

class fullDiff.FullSemDiff(contexts, network_1, network_2)[source]

A class that computes the coverage at entity level and extracts ‘semantic synonyms’ (named twins in the code) between two networks. It will then compute the coverage at attribute level between ‘semantic synonyms’.

__init__(contexts, network_1, network_2)[source]

The class constructor

Parameters:
  • contexts – an array containing the two context networks to use
  • network_1 – a dictionary containing the first set of schemas
  • network_2 – a dictionary containing the second set of schemas
__weakref__

list of weak references to the object (if defined)

class fullDiff.FullSemDiffMultiple(networks)[source]

A class that computes the coverage at entity level and extracts ‘semantic synonyms’ (named twins in the code) between multiple networks. It will then compute the coverage at attribute level between ‘semantic synonyms’.

__init__(networks)[source]

Initialize self. See help(type(self)) for accurate signature.

__weakref__

list of weak references to the object (if defined)

[Figure: class diagram for semDiff.mergeEntities (classes_semDiff_mergeEntities.png)]
class mergeEntities.EntityMerge(schema1, context1, schema2, context2)[source]

A class that merges two schemas based on their semantic annotations

Parameters:
  • schema1 – dictionary of the first schema
  • context1 – dictionary of the first context as {“@context”:{}}
  • schema2 – dictionary of the second schema
  • context2 – dictionary of the second context as {“@context”:{}}
__init__(schema1, context1, schema2, context2)[source]

Initialize self. See help(type(self)) for accurate signature.

__weakref__

list of weak references to the object (if defined)

class mergeEntities.MergeEntityFromDiff(overlaps)[source]

A class that merges network2 into network1 based on overlaps from FullDiff

Parameters:overlaps – a variable containing
__init__(overlaps)[source]

Initialize self. See help(type(self)) for accurate signature.

__weakref__

list of weak references to the object (if defined)

add_schema(schema_name)[source]

Adds the schema to the merge

Parameters:schema_name
Returns:
find_references(field)[source]

Finds $ref at the root, in items or in allOf, anyOf, oneOf, adds the schema/context to the merge and changes the reference names

Parameters:field (dict) – a schema field
Returns:
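
The search for $ref in the locations listed above can be sketched as a small recursive function. This is a simplified illustration that only collects the references; adding schemas to the merge and renaming references are left out, and the function name and sample fields are assumptions.

```python
def collect_references(field):
    """Collect $ref values found at the root, in items, or in allOf/anyOf/oneOf."""
    references = []
    if "$ref" in field:
        references.append(field["$ref"])
    items = field.get("items")
    if isinstance(items, dict):
        # items may itself hold a $ref or a combination keyword
        references.extend(collect_references(items))
    for keyword in ("allOf", "anyOf", "oneOf"):
        for option in field.get(keyword, []):
            if isinstance(option, dict):
                references.extend(collect_references(option))
    return references


# Illustrative fields (made up for this sketch).
array_field = {
    "type": "array",
    "items": {"anyOf": [{"$ref": "a_schema.json#"}, {"$ref": "b_schema.json#"}]},
}
simple_field = {"$ref": "c_schema.json#"}
plain_field = {"type": "string"}

array_refs = collect_references(array_field)
simple_refs = collect_references(simple_field)
plain_refs = collect_references(plain_field)
```
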
modify_references()[source]

Modify the $ref names

Returns:
save(base_url)[source]

Saves the merge to disk and replaces the “id” attribute with the given base URL + schema name

Parameters:base_url
Returns:
validate_output()[source]

Validates the output of the merge

Returns:

Code example

How to compare a set of JSON schemas: code example

The following code demonstrates how to compare sets of JSON schemas using the classes defined in this Python package.

# In order to compare two sets of schemas, you can use different classes depending
# on whether your networks are resolved or not


# First case scenario, your networks are not resolved
# You will need to provide a dictionary of regex term/switch that will help
# translate the schemas IDs into contexts IDs
def compare_unresolved_network():

    # import the corresponding class
    from semDiff.fullDiff import FullDiffGenerator
    import json

    # set your main schemas URL
    MIACME_schema_url = "https://w3id.org/mircat/miacme/schema/miacme_schema.json"
    MyFlowCyt_schema_url = "https://w3id.org/mircat/miflowcyt/schema/miflowcyt_schema.json"

    # set the regex dictionary
    regex = {
        "/schema": "/context/obo",
        "_schema.json": "_obo_context.jsonld"
    }

    # Prepare the two networks
    MIACME_network = {
        "name": "MIACME",
        "regex": regex,
        "url": MIACME_schema_url
    }
    MyFlowCyt_network = {
        "name": "MyFlowCyt",
        "regex": regex,
        "url": MyFlowCyt_schema_url
    }

    # Run the comparison
    overlaps = FullDiffGenerator(MIACME_network, MyFlowCyt_network)
    print(json.dumps(overlaps, indent=4))


# Second case scenario, your networks are already resolved
def compare_resolved_network(network1, network2):

    # import the corresponding class
    from semDiff.fullDiff import FullSemDiffMultiple
    import json

    # prepare the input
    prepared_input = [
        {
            "name": network1['name'],
            "schemas": network1['schemas'],
            "contexts": network1['contexts']
        },
        {
            "name": network2['name'],
            "schemas": network2['schemas'],
            "contexts": network2['contexts']
        }
    ]

    # run the comparison
    overlaps = FullSemDiffMultiple(prepared_input)
    print(json.dumps(overlaps, indent=4))

How to merge a set of JSON schemas: code examples

How to use classes to merge sets of JSON schemas

# In order to merge two sets of schemas, you first need to get the output of the semantic
# diff comparators. You can then pass that result as an input to the merge class.


def merge_sets():
    # import the corresponding class
    import json
    from semDiff.fullDiff import FullSemDiffMultiple
    from semDiff.mergeEntities import MergeEntityFromDiff

    # Load your inputs (the with statement closes each file automatically)
    with open('../tests/fullDiffOutput/network1.json', 'r') as networkFile:
        network1 = json.load(networkFile)
    with open('../tests/fullDiffOutput/network2.json', 'r') as networkFile:
        network2 = json.load(networkFile)

    # Prepare the input for semantic diff
    prepared_input = [
        {
            "name": network1['name'],
            "schemas": network1['schemas'],
            "contexts": network1['contexts']
        },
        {
            "name": network2['name'],
            "schemas": network2['schemas'],
            "contexts": network2['contexts']
        }
    ]

    # Run the diff
    overlaps = FullSemDiffMultiple(prepared_input)

    # Prepare the merging input
    merging = {
        "network1": overlaps.networks[0],
        "network2": overlaps.networks[1],
        "overlaps": overlaps.output[0][0],
        "fields_to_merge": overlaps.ready_for_merge[0]
    }

    merged_schema = MergeEntityFromDiff(merging)  # Merge
    merged_schema.save('https://example.com/')  # Save the new schema set to the disk

CEDAR Templating Tools

Cedar utilities

Code documentation for the CEDAR utilities

The CEDAR client provides links to the CEDAR API functionality, such as get, post and update operations on templates, template elements, folders, instances, etc.

class client.ClientBase[source]

The base class for all client classes

Warning

Do not use

Parameters:db (StorageEngine) – the storage engine
class client.FullSemDiffClient[source]

Resolves the two given networks and compares their semantic values based on bound context files

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.InstanceValidatorClient[source]

Validates a given instance against a given schema

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.MergeEntitiesClient[source]

Resolves a network from given URL and validates each schema

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.NetworkCompilerClient[source]

Resolves all references and sub references for a given schema URL

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.NetworkValidatorClient[source]

Resolves a network from given URL and validates each schema

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.Schema2ContextClient[source]

Fully resolves a schema set from a given URL and creates the context template for each given ontology

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.SchemaValidatorClient[source]

Validates a schema with the jsonschema library

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
client.create_client()[source]

Simple function that instantiates the app and creates the bridge to the API

Returns:the falcon app
client.max_body(limit)[source]

Simple function to limit the size of the request

Parameters:limit (int) – the maximum size of the request
Returns:

The schema2cedar classes help you transform your JSON Schema draft-04 schemas into CEDAR-compatible schemas

class schema2cedar.Schema2CedarBase[source]

The base converter class

Warning

This class should not be used! Use its children for converting to cedar template and template elements

static json_pretty_dump(json_object, output_file)[source]

Dump a given json in the given file

Parameters:
  • json_object (dict) – the input JSON to dump
  • output_file (TextIOWrapper) – the file to dump the JSON to
Returns:the dumping result
Return type:string

static set_context()[source]

Set the base context for a given template

Returns:the base context
static set_prop_context(schema)[source]

Set the required context for the properties attribute of the given schema

Parameters:schema – an input JSON schema
Returns:the properties context required by CEDAR
static set_properties_base_item()[source]

Set the base properties required by CEDAR

Returns:the base property dictionary
static set_property_context()[source]

Set the base context for a given template

Returns:the base context
static set_required_item(schema)[source]

Set the required items that a CEDAR schema needs for a given schema

Parameters:schema – the input schema
Returns:the dictionary of required items
static set_stripped_properties(schema)[source]

Set the properties of a given schema

Parameters:schema – the input schema
Returns:a dictionary of properties
static set_sub_context(schema)[source]

Set the context required by CEDAR for each individual attribute/field for a given schema

Parameters:schema – the input schema
Returns:the dictionary of required context for each field
static set_template_element_property_minimals(sub_context, schema)[source]

Set the minimal elements of the properties attributes of a given schema and its sub-context

Parameters:
  • sub_context – the schema sub-context
  • schema – the input schema
Returns:

(dict) a dictionary of the required properties for CEDAR conversion

class schema2cedar.Schema2CedarTemplate[source]

Schema-to-template converter. Use this class to convert a schema into a template.

convert_template(input_json_schema)[source]

Method to convert a given schema into a CEDAR template

Parameters:input_json_schema – the input JSON schema
Returns:the schema converted into a template
class schema2cedar.Schema2CedarTemplateElement[source]

Schema to TemplateElement converter.

Warning

Should only be used to convert schemas to template elements. If you want to convert a schema to a template, use Schema2CedarTemplate (it will automatically create nested template elements for you).

convert_template_element(input_json_schema, **kwargs)[source]

Method to convert a given schema into a CEDAR template element

Parameters:
  • input_json_schema – the input schema
  • kwargs – optional parameter to provide the field name referencing that schema
Returns:

the schema converted to a CEDAR template element

find_sub_specs(schema, sub_spec_container)[source]

Inspect a given schema to find and load its schema dependencies

Parameters:
  • schema – the input schema
  • sub_spec_container – a container that will hold the dependencies
Return sub_spec_container:
 the filled container with the schema dependencies

load_sub_spec(path_to_load, parent_schema, field_key)[source]

Load the given sub schema into memory

Parameters:
  • path_to_load – path to the sub schema
  • parent_schema – the parent schema that this sub-schema is referenced from
  • field_key – the parent schema field name that this sub-schema is referenced from
Returns:

the string containing the loaded JSON schema

Cedar Usage

Code usage for the CEDAR utilities

The main function of the CEDAR utilities is to allow a user to transform a JSON-Schema draft 4 network into a CEDAR schema network.

# import the modules
import json
import os
from cedar import schema2cedar, client

# get your api key, folder_id and user_id from the configuration file:
configfile_path = os.path.join(os.path.dirname(__file__), "../../tests/test_config.json")
if not os.path.exists(configfile_path):
    # stop here: the rest of the script cannot run without the config file
    raise FileNotFoundError("Please, create the config file.")
# the with statement closes the file automatically
with open(configfile_path) as config_data_file:
    config_json = json.load(config_data_file)

production_api_key = config_json["production_key"]
folder_id = config_json["folder_id"]
user_id = config_json["user_id"]

# Create your schema or load it from a file
# Note that all sub-schemas will be instantiated and uploaded automatically as TemplateElements
schema = {
          "id": "https://example.com/test1_main_schema.json",
          "$schema": "http://json-schema.org/draft-04/schema",
          "title": "Test case 1 for unit testing main schema",
          "description": "JSON-schema representing the first schema of the first "
                         "network used by JSONLD-SCHEMA merging unit tests.",
          "type": "object",
          "_provenance": {
              "url": "http://w3id.org/mircat/miaca/provenance.json"
          },
          "properties": {
              "@context": {
                  "description": "The JSON-LD context",
                  "anyOf": [
                      {
                          "type": "string"
                      },
                      {
                          "type": "object"
                      },
                      {
                          "type": "array"
                      }
                  ]
              },
              "@id": {
                  "description": "The JSON-LD identifier",
                  "type": "string",
                  "format": "uri"
              },
              "@type": {
                  "description": "The JSON-LD type",
                  "type": "string",
                  "enum": [
                      "Test1Main"
                  ]
              }
          }
      }

# instantiate the client
client = client.CEDARClient()

# instantiate the converter
template = schema2cedar.Schema2CedarTemplate(production_api_key, folder_id, user_id)

# Run the conversion
output_schema = template.convert_template(schema)
validation_response, validation_message = client.validate_template(
    "production",
    template.production_api_key,
    json.loads(output_schema))

# push to server
response = client.create_template(
                "production",
                template.production_api_key,
                template.folder_id,
                output_schema)
print(response.json())

Utility for preparing function inputs and contexts

Utilities for preparing the input for the different functions

schema2context.create_and_save_contexts(mapping, semantic_types, write_to_file)[source]

Generates the context files for each schema in the given network and writes these files to the disk

Parameters:
  • mapping (dict) – a file containing a mapping dict {“schemaName”: “schemaURL”}
  • semantic_types (dict) – a mapping dict of ontologies {“ontologyName”: “Ontology URL”}
  • write_to_file (str) – the directory absolute path to output the variables
Returns:

the resolved contexts

schema2context.create_context_template(schema, semantic_types, name)[source]

Create the context template

Parameters:
  • schema – the schema for which to build the context
  • semantic_types – the schema base type
  • name – the schema name
Returns:

a dictionary representing the schema context
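
As a rough illustration of what a context template could look like, the sketch below builds one skeleton per ontology, with every field pre-filled with the ontology prefix so that the term can be completed by hand. The output layout, the treatment of semantic_types as an {“ontologyName”: “Ontology URL”} dictionary, and the function name are assumptions made for this example, not the package's documented behaviour.

```python
def build_context_template(schema, semantic_types, name):
    """Build one context skeleton per ontology for the given schema."""
    contexts = {}
    for prefix, base_url in semantic_types.items():
        # Declare the ontology prefix and a placeholder for the schema itself.
        terms = {prefix: base_url, name: prefix + ":"}
        for field in schema.get("properties", {}):
            if not field.startswith("@"):  # skip JSON-LD keywords such as @id
                terms[field] = prefix + ":"
        contexts[prefix] = {"@context": terms}
    return contexts


# Illustrative input (made up for this sketch).
person_schema = {
    "properties": {
        "@id": {"type": "string"},
        "name": {"type": "string"},
        "age": {"type": "integer"},
    }
}
templates = build_context_template(person_schema, {"sdo": "https://schema.org/"}, "Person")
```
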

schema2context.create_context_template_from_url(schema_url, semantic_types)[source]

Create a context template from the given URL

Parameters:
  • schema_url (str) – the schema URL
  • semantic_types (dict) – a dictionary with {“ontologyName”:”ontologyBaseURL”}
Returns:

a dictionary with a context variable for easy ontology

schema2context.create_network_context(mapping, semantic_types)[source]

Generates the context files for each schema in the given network

Parameters:
  • mapping (dict) – a file containing a mapping dict {“schemaName”: “schemaURL”}
  • semantic_types (dict) – a mapping dict of ontologies {“ontologyName”: “Ontology URL”}
Returns:

the resolved contexts

schema2context.generate_context_mapping(schema_url, regex_input)[source]

Resolves all schemas from given schema URL and creates the context mapping

Parameters:
  • schema_url (str) – a schema URL
  • regex_input (dict) – keys are the regex to locate and value the replace value
Returns:

a context mapping

schema2context.generate_context_mapping_dict(schema_url, regex_input, network_name)[source]

Generates the mapping dictionary used by full diff

Parameters:
  • schema_url (str) – the url of the main schema
  • regex_input (dict) – a set of regex to indicate how to transform schemas URL to contexts URL
  • network_name (str) – the name of the current network
Returns:

the mapping dictionary of schemas and contexts

schema2context.generate_contexts_from_regex(schema_url, regex_input)[source]

Creates the context URL for the given schema url based on given regex

Parameters:
  • schema_url (str) – a schema URL
  • regex_input (dict) – keys are the regex to locate and value the replace value
Returns:

a context URL
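
The translation from a schema URL to a context URL can be sketched with re.sub, applying each pattern/replacement pair in turn. The function name below is hypothetical; the URL and the regex dictionary are the ones used in the comparison example earlier in this documentation.

```python
import re


def context_url_from_regex(schema_url, regex_input):
    """Apply every pattern/replacement pair to the schema URL, in order."""
    context_url = schema_url
    for pattern, replacement in regex_input.items():
        context_url = re.sub(pattern, replacement, context_url)
    return context_url


regex = {
    "/schema": "/context/obo",
    "_schema.json": "_obo_context.jsonld",
}
miacme_context_url = context_url_from_regex(
    "https://w3id.org/mircat/miacme/schema/miacme_schema.json", regex
)
# miacme_context_url is now
# "https://w3id.org/mircat/miacme/context/obo/miacme_obo_context.jsonld"
```

Note that the keys are treated as regular expressions, so a character such as "." matches any character unless it is escaped.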

schema2context.generate_labels_from_contexts(contexts, labels)[source]

Generate labels from given context using OLS

Parameters:
  • contexts (dict) – a dictionary containing contexts associated to schema names
  • labels (dict) – pre-existing labels to avoid triggering twice the same query
Returns:

labels

schema2context.get_json_from_url(json_url)[source]

Gets the content of a json file from its URL - it can be a schema or a context file, or any other json file.

Parameters:json_url – a URL for a json file (e.g. a schema or a context file)
Returns:a dictionary with the json content
schema2context.prepare_input(schema_url, network_name)[source]

Resolves all references from a given schema and creates the output for create_network_context

Parameters:
  • schema_url (str) – url of the schema
  • network_name (str) – the name of the network
Returns:

a TextIOWrapper with the location of the mapping file

schema2context.process_schema_name(name)[source]

Extract the name out of a schema composite name by removing unnecessary strings

Parameters:name – a schema name
Returns:a string representing the processed schema

compile_schema.get_name(schema_url)[source]

Extract the item name from its URL

Parameters:schema_url – the URL of the schema
Return name:the name of the schema (eg: ‘item_schema.json’)
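
One way to extract a file name such as ‘item_schema.json’ from a schema URL is to take the last path segment. This is a sketch using only the standard library; the function name and the example URL are assumptions, and the package may proceed differently.

```python
import posixpath
from urllib.parse import urlparse


def name_from_url(schema_url):
    """Return the last path segment of the URL, ignoring query and fragment."""
    return posixpath.basename(urlparse(schema_url).path)


item_name = name_from_url("https://example.com/schemas/item_schema.json#")
```
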
compile_schema.resolve_reference(schema_url)[source]

Load and decode the schema from a given URL

Parameters:schema_url – the URL to the schema
Returns:an exception or a decoded json schema as a dictionary
compile_schema.resolve_schema_references(schema, loaded_schemas, schema_url=None, refs=None)[source]

Resolves and replaces json-schema $refs with the appropriate dict. Recursively walks the given schema dict, converting every instance of $ref in a ‘properties’ structure with a resolved dict. This modifies the input schema and also returns it.

Parameters:
  • schema – the schema dict
  • loaded_schemas – a recursive dictionary that stores the path of already loaded schemas to prevent circularity issues
  • refs – a dict of <string, dict> which forms a store of referenced schemata
  • schema_url – the URL of the schema
Returns:

schema
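
The recursive replacement of $ref in ‘properties’, together with the protection against circular references, can be sketched with an in-memory store instead of remote URLs. The function name, the store layout and the sample schemas are assumptions made for this illustration; as in the description above, the input is modified in place and also returned.

```python
def resolve_properties(schema_name, store, in_progress=None):
    """Replace each $ref under 'properties' with the referenced schema dict."""
    in_progress = set() if in_progress is None else in_progress
    in_progress.add(schema_name)  # schemas on the current resolution path
    schema = store[schema_name]
    for field_name, field in list(schema.get("properties", {}).items()):
        target = field.get("$ref", "").rstrip("#") if isinstance(field, dict) else ""
        # A reference back to a schema on the current path is left untouched,
        # otherwise the recursion would never end.
        if target and target not in in_progress:
            schema["properties"][field_name] = resolve_properties(
                target, store, in_progress
            )
    in_progress.discard(schema_name)
    return schema


# Illustrative store with a circular reference (made up for this sketch).
schema_store = {
    "person_schema.json": {
        "properties": {
            "name": {"type": "string"},
            "address": {"$ref": "address_schema.json#"},
        }
    },
    "address_schema.json": {
        "properties": {
            "city": {"type": "string"},
            "resident": {"$ref": "person_schema.json#"},
        }
    },
}
resolved_person = resolve_properties("person_schema.json", schema_store)
```
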


prepare_fulldiff_input.load_context(context)[source]

Load the context variable from the given URL mapping

Parameters:context – a mapping of context URL
Returns:a context variable
prepare_fulldiff_input.prepare_input(schema_1_url, schema_2_url, mapping_1, mapping_2)[source]

Function to help preparing the full_diff input

Parameters:
  • schema_1_url – url of the first schema
  • schema_2_url – url of the second schema
  • mapping_1 – a mapping to contexts
  • mapping_2 – a mapping to contexts
Returns:

a fully prepared variable with all resolved references ready to be used by full_diff

prepare_fulldiff_input.resolve_network_url(schema_url)[source]

Function that triggers the resolve_schema_ref function

Parameters:schema_url – a schema URL
Returns:a fully resolved network
prepare_fulldiff_input.resolve_schema_ref(schema, resolver, network)[source]

Recursively resolves the references in the schemas and adds them to the network

Warning

Use resolve_network_url instead

Parameters:
  • schema – the schema to resolve
  • resolver – the refResolver object
  • network – the network to add the schemas to
Returns:

a fully processed network with resolved ref

Validating a single schema and a set of schemas:

How to validate JSON-schemas and instances: code documentation

jsonschema_validator.validate_instance(schemapath, schemafile, instancepath, instancefile, error_printing, store)[source]

Validate a JSON instance against a JSON schema.

Parameters:
  • schemapath – the path to the schema directory
  • schemafile – the name of the schema file
  • instancepath – the path of the instance directory
  • instancefile – the name of the instance file
  • error_printing – the error log
  • store – a store required by RefResolver
Returns:

errors

jsonschema_validator.validate_instance_against_schema(instance, resolver, schema)[source]

Simple function to validate an instance against a schema on the fly

Parameters:
  • instance (dict) – the JSON instance to validate
  • resolver (RefResolver) – the resolver object used by Draft4Validator
  • schema (dict) – the root schema to validate against
Returns:

Draft4Validator.validate

jsonschema_validator.validate_schema(path, schema_file_name)[source]

Validate a JSON schema given the folder/path and file name of the schema file.

Parameters:
  • path – the path to the schema directory
  • schema_file_name – the name of the schema in that directory
Returns:

True or False

jsonschema_validator.validate_schema_file(schema_file)[source]

Validate a JSON schema given the schema file.

Parameters:schema_file – a string giving the schema file location
Returns:True
class miflowcyt_validate.FlowRepoClient(mapping, client_id, number_of_items)[source]

A class that provides functionality to download experiments from the FlowRepository (https://flowrepository.org/), transform the XML into JSON and validate the JSON instances against their JSON schema. The transformation from XML to JSON relies on the JSONBender library (https://github.com/Onyo/jsonbender).

get_all_experiments(max_number, accessible_ids)[source]

Grab all experiments from the API for the given number

Parameters:
  • max_number (int) – the number of items to retrieve
  • accessible_ids (list) – the ids that this user can fetch
Returns:

the experiments XMLs

static get_mapping(mapping_file_name)[source]

Build the mapping dictionary based on the given mapping file

Parameters:mapping_file_name – the name of the mapping file
Return mapping:the mapping of the fields
get_user_content_id()[source]

Return all IDs found in the user content XML

Returns:a list of all IDs that were identified in the variable returned by the API
grab_experiment_from_api(item_identifier)[source]

Retrieve the experimental metadata and return it as an XML document object

Parameters:item_identifier – the item identifier that should be retrieved
Returns:the XML document object
inject_context()[source]

Transform the myflowcyt JSON into a JSON-LD by injecting @context and @type keywords

Returns:a JSON-LD of the myflowcyt JSON
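
Injecting the two JSON-LD keywords into a plain JSON instance can be sketched as below. The function name, the context URL and the type name are placeholders chosen for this example; the actual method takes no arguments and works on the data held by the client.

```python
def inject_json_ld_keywords(instance, context_url, type_name):
    """Return a copy of the instance with @context and @type added first."""
    return {"@context": context_url, "@type": type_name, **instance}


# Illustrative instance and hypothetical context URL.
experiment = {"primaryContact": "Jane Doe", "purpose": "Example experiment"}
experiment_ld = inject_json_ld_keywords(
    experiment,
    "https://example.com/context/experiment_context.jsonld",
    "Experiment",
)
```

The original dictionary is left untouched, so the plain JSON remains available for comparison with the JSON-LD version.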

make_validation()[source]

Method to run the mapping for the given number of items

Returns:a dictionary containing the list of errors for all processed items
preprocess_content(content)[source]

Preprocess the XML into a JSON that is compliant with the schema.

Parameters:content (str) – str containing the XML
Returns:a JSON schema cleaned from residual artifacts

static validate_instance_from_file(instance, item_id, schema_name)[source]

Method to output the extracted JSON into a file and validate it against the given schema

Parameters:
  • instance – the instance to output into a file
  • item_id – the instance ID needed to create the file name
  • schema_name – the schema to check against
Return errors:

a list of fields that have an error for this instance

Using the MiFlowCyt Tool: code examples

Using the API:

# The MiFlowCyt functionality provides access to the flow repository endpoint.
# From there it gathers metadata about flow cytometry experiments as XML.
# It will then transform these XML into a JSON and inject attributes to
# obtain a final JSON-LD that can be validated against a schema

# To proceed, you must first create a variable that holds the path to a mapping file
# and your user API key from the config file
# You also need to import the FlowRepoClient class from the miflowcyt_validate file of
# the validate module
import os
import json
from validate.miflowcyt_validate import FlowRepoClient

# Get the path to the mapping file
map_file = os.path.join(os.path.dirname(__file__),
                        "../tests/data/MiFlowCyt/experiment_mapping.json")

base_schema = "experiment_schema.json"  # Get the name of the schema

# Load your API KEY from the config.json file
configfile_path = os.path.join(os.path.dirname(__file__), "../tests/test_config.json")
with open(configfile_path) as config_data_file:
    config_json = json.load(config_data_file)  # the file is closed when the block exits
apiKey = config_json["flowrepo_userID"]
items = 1  # Initialize the number of instances to gather from the API

client = FlowRepoClient(map_file, apiKey, items)  # initialize the class
instances = client.inject_context()  # inject the attributes to obtain a JSON-LD
print(json.dumps(instances, indent=4))  # Enjoy

Using the API client for the functionality of this package

Code documentation

API client: code documentation

The JSONLDschema package offers an API client that provides most of the library functionality through a RESTful service.

Here we include the documentation of each of the Python classes and properties for the API client.

class client.ClientBase[source]

The base class for all client classes.

Warning: Do not use

Parameters:db (StorageEngine) – the storage engine
class client.FullSemDiffClient[source]

Resolves the two given networks and compares their semantic values based on bound context files

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.InstanceValidatorClient[source]

Validates a given instance against a given schema

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.MergeEntitiesClient[source]

Merges two given schemas

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.NetworkCompilerClient[source]

Resolves all references and sub references for a given schema URL

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.NetworkValidatorClient[source]

Resolves a network from given URL and validates each schema

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.Schema2ContextClient[source]

Fully resolves a schema set from a given URL and creates the context template for each given ontology

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
class client.SchemaValidatorClient[source]

Validates a schema with the jsonschema library

on_get(req, resp)[source]

Process the get request

Parameters:
  • req – the user request
  • resp – the server response
client.create_client()[source]

Simple function that instantiates the app and creates the bridge to the API

Returns:the falcon app
client.max_body(limit)[source]

Simple function to limit the size of the request

Parameters:limit (int) – the maximum size of the request
Returns:
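One common way to write such a limiter is a function that returns a hook closed over the limit. The sketch below is an assumption about the general pattern, not the package's code: the four-argument hook signature is modelled on Falcon's before-hooks, while BodyTooLarge and the stand-in request objects are hypothetical.

```python
from types import SimpleNamespace


class BodyTooLarge(Exception):
    """Hypothetical error raised when a request body exceeds the limit."""


def max_body(limit):
    """Return a hook that rejects requests whose body exceeds `limit` bytes."""
    def hook(req, resp, resource, params):
        length = req.content_length
        if length is not None and length > limit:
            raise BodyTooLarge(
                "Request body is %d bytes, the limit is %d" % (length, limit))
    return hook


# Stand-in request objects exposing only the attribute the hook reads
check = max_body(1024)
check(SimpleNamespace(content_length=512), None, None, {})  # accepted

try:
    check(SimpleNamespace(content_length=4096), None, None, {})
    rejected = False
except BodyTooLarge:
    rejected = True
print(rejected)
```

In a web framework the exception would normally be an HTTP error type so that the client receives a proper status code.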

class utility.StorageEngine[source]

This class is the middle layer that binds the API calls to the actual python code

create_context(user_input)[source]

Resolves a network and creates the associated context file templates

Parameters:user_input (dict) – a dict that should contain the “schema_url” and “vocab” attributes. “vocab” should contain the ontology names as keys and their base URLs as values
Returns:a dict containing the context files of all schemas in the network, for each given vocabulary
create_full_sem_diff(user_input)[source]

Compares two networks based on their semantic values

Parameters:user_input (dict) – a dictionary containing the network_1, network_2 and a mapping of all schemas to their context files
Returns:a list of siblings
merge_entities(user_input)[source]

Merge two given schemas

Parameters:user_input (dict ({schema_url_1; schema_url_2})) – contains the URLs of the two schemas to merge
Returns:a merged schema
resolve_network(schema)[source]

Resolves all references of a given schema

Parameters:schema (dict) – a json containing the schema_url attribute
Returns:the resolved network
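Resolving a network starts with finding every `$ref` in a schema. The sketch below covers only that first step, offline, on a hypothetical schema fragment. The collect_references helper is not part of the package, and a full resolver would additionally fetch each target and repeat the walk.

```python
def collect_references(schema):
    """Return the sorted, de-duplicated `$ref` targets found in a schema."""
    found = set()

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "$ref" and isinstance(value, str):
                    found.add(value)
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(schema)
    return sorted(found)


# Hypothetical schema fragment, for illustration only
schema = {
    "properties": {
        "creator": {"$ref": "person_schema.json#"},
        "licenses": {"type": "array", "items": {"$ref": "license_schema.json#"}},
        "contributors": {
            "anyOf": [{"$ref": "person_schema.json#"},
                      {"$ref": "organization_schema.json#"}]
        },
    }
}
references = collect_references(schema)
print(references)
```

Collecting targets into a set means a schema referenced from several places is only listed once.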
validate_instance(user_input)[source]

Validates an instance against a schema

Parameters:user_input (dict) – a dictionary containing the schema_url and instance_url attributes
Returns:a validation str or a list of errors
validate_network(user_input)[source]

Resolves a network and validates all of its schemas using Draft4Validator

Parameters:user_input (basestring) – a schema URL
Returns:a dictionary of all schemas with a string that gives information on whether the schema is valid or not
validate_schema(user_input)[source]

Validate a schema against its draft using Draft4Validator

Parameters:user_input (basestring) – a schema URL
Returns:a string that gives information on whether the schema is valid or not (should return a boolean or a dict containing both variables)

Code example

Accessing the JSONLDschema API for functions available in this package

Using the API:

# In order to use the JSONLDschema API, you first need to start it from a separate Python script
# You can then run queries on the different endpoints

# Let's first create a client
from wsgiref import simple_server
from api_client.client import create_client


if __name__ == '__main__':
    app = create_client()  # Create the client
    # Run the client forever
    httpd = simple_server.make_server('localhost', 8001, app)
    httpd.serve_forever()

# From there, all endpoints should be accessible.
# For more details on these endpoints, look at the client example below.

import requests
import json
import os


class MircatClient:
    """
    A simple client example plugged on the api_client
    :param port: the port to target
    :type port: int
    :param client_url: the url of the api_client
    :type client_url: basestring
    """

    def __init__(self, client_url, port):
        self.headers = {
            "Content-Type": "application/json",
        }
        self.port = port
        self.base_URL = client_url
        self.request_base_url = self.base_URL + ":" + str(self.port)

    def create_context(self):
        """ Method to create contexts for the given network
        :return: a variable containing a context for each vocabulary and schema
        """
        extra_url = "/create_context"
        test_input = {
            "schema_url": "https://w3id.org/dats/schema/access_schema.json",
            "vocab":  {
                "obo": "http://purl.obolibrary.org/obo/",
                "sdo": "http://schema.org"
            }
        }
        response = requests.get(self.request_base_url + extra_url,
                                data=json.dumps(test_input),
                                headers=self.headers)

        return response.text

    def resolve_network(self):
        extra_url = "/resolve_network"
        test_input = {"schema_url": "https://w3id.org/dats/schema/access_schema.json"}
        response = requests.get(self.request_base_url + extra_url,
                                data=json.dumps(test_input),
                                headers=self.headers)
        return response.text

    def make_full_sem_diff(self):
        extra_url = "/semDiff"

        # Both networks live in the same test data directory
        path = os.path.join(os.path.dirname(__file__), "../tests/data")
        with open(os.path.join(path, "dats.json"), 'r') as dats_file:
            # Load the first network; the with block closes the file
            network1 = json.load(dats_file)

        with open(os.path.join(path, "miaca.json"), 'r') as miaca_file:
            # Load the second network; the with block closes the file
            network2 = json.load(miaca_file)

        test_input = {
            "network_1": network1["schemas"],
            "network_2": network2["schemas"],
            "mapping": [network1["contexts"], network2["contexts"]]
        }
        response = requests.get(self.request_base_url + extra_url,
                                data=json.dumps(test_input),
                                headers=self.headers)
        return response.text

    def validate_schema(self):
        extra_url = "/validate/schema"
        schema_url = "https://w3id.org/dats/schema/access_schema.json"

        response = requests.get(self.request_base_url + extra_url,
                                data=json.dumps(schema_url),
                                headers=self.headers)

        return response.text

    def validate_instance(self):
        extra_url = "/validate/instance"

        user_input = {
            "schema_url": "https://w3id.org/dats/schema/activity_schema.json",
            "instance_url": "https://w3id.org/mircat/miflowcyt/schema/sample_schema.json"
        }

        response = requests.get(self.request_base_url + extra_url,
                                data=json.dumps(user_input),
                                headers=self.headers)

        return response.text

    def validate_network(self):
        extra_url = "/validate/network"
        schema_url = "https://w3id.org/dats/schema/activity_schema.json"
        response = requests.get(self.request_base_url + extra_url,
                                data=json.dumps(schema_url),
                                headers=self.headers)

        return response.text

    def merge_entities(self):
        extra_url = '/merge'
        urls = {
            "schema_url_1": "https://w3id.org/dats/schema/person_schema.json",
            "schema_url_2": "https://w3id.org/dats/schema/person_schema.json",
            "context_url_1": "https://raw.githubusercontent.com/"
                             "datatagsuite/context/master/obo/person_obo_context.jsonld",
            "context_url_2": "https://raw.githubusercontent.com/"
                             "FAIRsharing/mircat/master/miaca/context/"
                             "obo/source_obo_context.jsonld"
        }
        response = requests.get(self.request_base_url + extra_url,
                                data=json.dumps(urls),
                                headers=self.headers)

        return response.text


if __name__ == '__main__':
    client = MircatClient("http://localhost", 8001)
    # print(client.create_context())
    # print(client.resolve_network())
    # print(client.make_full_sem_diff())
    # print(client.validate_schema())
    # print(client.validate_instance())
    # print(client.validate_network())
    print(client.merge_entities())
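Each method of the example client returns the raw response text. Since the StorageEngine methods are documented as returning dictionaries, lists or plain strings, a small decoding helper can be convenient on the caller's side. The sketch below is one possible approach; parse_response and both sample payloads are hypothetical.

```python
import json


def parse_response(text):
    """Decode a response body as JSON, falling back to the raw text."""
    try:
        return json.loads(text)
    except ValueError:
        # Not JSON: return the plain string unchanged
        return text


# Hypothetical response bodies, for illustration only
print(parse_response('{"schema_url": "example_schema.json"}'))
print(parse_response("plain text message"))
```

json.JSONDecodeError is a subclass of ValueError, so catching ValueError covers malformed JSON as well as plain-text bodies.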

Indices and tables