CZ URN:NBN API¶
Python API for the Czech URN:NBN resolver (documentation).
What is URN:NBN¶
URN:NBN is a system for registering and assigning special codes to electronic publications (a code may look like this: urn:nbn:cz:edep-00000s). A code can then be resolved (translated) to metadata about the publication and/or to a pointer to an actual digital instance of the publication (a file).
The system works much like the magnet links used by popular torrent programs and websites. Once you have the URN, you can query independent resolvers (the torrent trackers in this analogy), which then point you to the libraries (users) that store a copy of the document (file).
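To make the shape of such a code concrete, here is a minimal sketch that splits a URN:NBN string into its parts. The layout urn:nbn:<country>:<registrar>-<document> is an assumption read off the example code above, not taken from the resolver specification, and split_urn_nbn is a hypothetical helper that is not part of this package.

```python
import re

# Assumed layout: urn:nbn:<country>:<registrar>-<document>.
# This is inferred from the example code, not from the resolver specification.
URN_NBN_PATTERN = re.compile(
    r"^urn:nbn:(?P<country>[a-z]{2}):"
    r"(?P<registrar>[a-z0-9]+)-(?P<document>[a-z0-9]+)$",
    re.IGNORECASE,
)


def split_urn_nbn(urn_nbn):
    """Split a URN:NBN string into its parts, or raise ValueError."""
    match = URN_NBN_PATTERN.match(urn_nbn.strip())
    if match is None:
        raise ValueError("Not a URN:NBN string: %r" % urn_nbn)

    # Lowercase everything so that later comparisons are case-insensitive.
    return {key: value.lower() for key, value in match.groupdict().items()}


parts = split_urn_nbn("urn:nbn:cz:edep-00000s")
print(parts["country"], parts["registrar"], parts["document"])
```

The three parts correspond to the country_code, registrar_code and document_code arguments of the URN_NBN structure described later in this documentation.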
cz-urnnbn-api¶
cz-urnnbn-api is a Python package for working with the Czech URN:NBN resolver. It allows you to register documents, add new digital instances, and resolve URN:NBN strings back to the URLs of the systems where the document is stored.
- Warning:
- The package does not cover the whole API, both because the API is complex and because the package was created for the E-deposit project, which does not need the full functionality. The package is open source, and pull requests are welcome.
Package structure¶
The package itself is split into multiple files.
API¶
api submodule¶
Functions for interaction with URN:NBN API.
- cz_urnnbn_api.api.is_valid_reg_code(reg_code='edep')¶
Check whether reg_code is a valid registrar code.
Parameters: reg_code (str) – Registrar’s code. Returns: True, if the reg_code is valid. Return type: bool
- cz_urnnbn_api.api.iter_registrars()¶
Iterate over all registrars.
Yields: obj – Registrar instance with basic information.
- cz_urnnbn_api.api.get_registrar_info(reg_code)¶
Get detailed information about the registrar with the given reg_code.
Parameters: reg_code (str) – Code identifier of the registrar. Returns: Registrar instance with all available information. Return type: obj
- cz_urnnbn_api.api.register_document_obj(xml_composer, reg_code='edep')¶
Register document in mode BY_RESOLVER - let the resolver give you URN:NBN code.
Parameters: - xml_composer (obj) – Instance of the MonographComposer.
- reg_code (str, default settings.REG_CODE) – Registrar’s code.
Returns: Return type: obj
- cz_urnnbn_api.api.register_document(xml, reg_code='edep')¶
Register document in mode BY_RESOLVER - let the resolver give you URN:NBN code.
Parameters: - xml (str) – XML, which will be used for registration. See xml_composer for details.
- reg_code (str, default settings.REG_CODE) – Registrar’s code.
Returns: Return type: obj
- cz_urnnbn_api.api.register_digital_instance_obj(urn_nbn, digital_instance)¶
Register the digital_instance object as a new digital instance of the document for the given urn_nbn.
Parameters: Returns: DigitalInstance with more information.
Return type: obj
- cz_urnnbn_api.api.register_digital_instance(urn_nbn, url, digital_library_id, format=None, accessibility=None)¶
Compose and register new digital instance of document.
Parameters: - urn_nbn (str) – URN:NBN identifier of registered document.
- url (str) – URL of the digital instance.
- digital_library_id (str) – ID of the digital library.
- format (str, def. None) – Format of the instance - pdf for example.
- accessibility (str, def. None) – Is registration necessary to access this instance?
Returns: DigitalInstance with more information.
Return type: obj
- cz_urnnbn_api.api.get_digital_instances(urn_nbn)¶
Get list of DigitalInstance objects for given urn_nbn.
DigitalInstances are pointers to the DigitalLibrary where the instance of the document is stored. There should always be a link to the document in the url property.
Returns: DigitalInstance objects or blank list. Return type: list
xml_composer submodule¶
This module contains composers to allow creation of XML for URN:NBN, so you don’t have to create XML by hand.
- class cz_urnnbn_api.xml_composer.MonographComposer(**kwargs)¶
Bases: kwargs_obj.kwargs_obj.KwargsObj
Composition class for Monograph publications.
- title¶
str, required – Title of the publication.
- subtitle¶
str – Subtitle of the publication.
- ccnb¶
str – CCNB number.
- isbn¶
str – ISBN string. You should validate this first.
- other_id¶
str – Useful for UUID and so on.
- document_type¶
str – Electronic? Scan?
- digital_born¶
bool – Was the publication digitally born, or is it scan?
- author¶
str – Author of the publication.
- publisher¶
str – Publisher’s name.
- place¶
str – Place where the publication was published (usually city).
- year¶
str – Year when the publication was published.
- format¶
str, required – PDF? EPUB?
- to_xml_dict()¶
Compose hierarchical structure from ordered dicts, which will hold the XML.
Returns: Structure from ordered dicts. Return type: odict
- to_xml()¶
Convert itself to XML string.
Returns: XML. Return type: str
- class cz_urnnbn_api.xml_composer.MultiMonoComposer(**kwargs)¶
Bases: cz_urnnbn_api.xml_composer.MonographComposer
Composition class for Multi monograph XMLs for URN:NBN.
- title¶
str, required – Title of the publication.
- volume_title¶
str, required – Title of the whole volume.
- subtitle¶
str – Subtitle of the publication.
- ccnb¶
str – CCNB number.
- isbn¶
str – ISBN string. You should validate this first.
- other_id¶
str – Useful for UUID and so on.
- document_type¶
str – Electronic? Scan?
- digital_born¶
bool – Was the publication digitally born, or is it scan?
- author¶
str – Author of the publication.
- publisher¶
str – Publisher’s name.
- place¶
str – Place where the publication was published (usually city).
- year¶
str – Year when the publication was published.
- format¶
str, required – PDF? EPUB?
- to_xml_dict()¶
Convert itself to XML string.
Returns: XML. Return type: str
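The composers above first build a nested structure of ordered dicts and then serialize it to XML. The following is only a sketch of that two-step idea using the standard library's xml.etree.ElementTree. The element names (record, title, format) and the dict_to_element helper are hypothetical and do not reproduce the XML schema the resolver actually expects.

```python
from collections import OrderedDict
import xml.etree.ElementTree as ET


def dict_to_element(tag, content):
    """Recursively convert nested dicts to an ElementTree element."""
    element = ET.Element(tag)

    if isinstance(content, dict):
        # Each key becomes a child element; OrderedDict keeps the order.
        for child_tag, child_content in content.items():
            element.append(dict_to_element(child_tag, child_content))
    else:
        # Leaves are stored as the text of the element.
        element.text = str(content)

    return element


# Hypothetical structure; the real composers use the resolver's own schema.
record = OrderedDict([
    ("title", "Title of the book"),
    ("format", "pdf"),
])

xml = ET.tostring(dict_to_element("record", record), encoding="unicode")
print(xml)
```

Keeping the intermediate dict separate from the final string makes it possible to inspect or adjust the structure before it is serialized.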
xml_convertor submodule¶
This module contains convertors for converting MODS to XML required by URN:NBN project.
- cz_urnnbn_api.xml_convertor.pick_only_text(fn)¶
- class cz_urnnbn_api.xml_convertor.MonographPublication(mods_xml)¶
Bases: object
This class accepts MODS monographic data, which it can then convert to XML for URN:NBN.
- get_title(*args, **kwargs)¶
- get_subtitle(*args, **kwargs)¶
Returns: Author’s name. Return type: str
- get_form()¶
Returns: Form of the book (electronic source and so on). Return type: str
- get_place()¶
Returns: Place where the book was released. Return type: str
- get_publisher(*args, **kwargs)¶
- get_year(*args, **kwargs)¶
- get_identifier(*args, **kwargs)¶
- get_ccnb(*args, **kwargs)¶
- get_isbn(*args, **kwargs)¶
- get_uuid(*args, **kwargs)¶
- compose()¶
Convert self to nested ordered dicts, which may be serialized to XML using xmltodict module.
Returns: XML parsed to ordered dicts. Return type: OrderedDict
- add_format(file_format)¶
Add information about file_format to the internal XML dict.
Parameters: file_format (str) – PDF, jpeg, etc.
- to_xml()¶
Convert itself to XML unicode string.
Returns: XML. Return type: unicode
- class cz_urnnbn_api.xml_convertor.MonographVolume(mods_xml)¶
Bases: cz_urnnbn_api.xml_convertor.MonographPublication
Conversion of Multi-monograph data to XML required by URN:NBN.
- get_volume_title(*args, **kwargs)¶
- compose()¶
Convert self to nested ordered dicts, which may be serialized to XML using xmltodict module.
Returns: XML parsed to ordered dicts. Return type: OrderedDict
- cz_urnnbn_api.xml_convertor.convert_mono_xml(mods_xml, file_format)¶
Convert MODS monograph record to XML, which is required by URN:NBN resolver.
Parameters: mods_xml (str) – MODS volume XML. Returns: XML for URN:NBN resolver. Return type: str Raises: ValueError – If the required data (author, title) can’t be found in the MODS.
- cz_urnnbn_api.xml_convertor.convert_mono_volume_xml(mods_volume_xml, file_format)¶
Convert MODS monograph, multi-volume record to XML, which is required by URN:NBN resolver.
Parameters: mods_volume_xml (str) – MODS volume XML. Returns: XML for URN:NBN resolver. Return type: str Raises: ValueError – If the required data (author, title) can’t be found in the MODS.
settings submodule¶
This module contains all the global variables needed by the package.
The module can also read user-defined data from two paths:
- $HOME/_SETTINGS_PATH
- /etc/_SETTINGS_PATH
See _SETTINGS_PATH for details.
Note
If the first path is found, the other one is ignored.
Example of the configuration file ($HOME/edeposit/urnnbn.json):
{
"EXPORT_DIR": "/somedir/somewhere"
}
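The lookup order described above can be sketched as follows. This is an illustration of the "first path found wins" rule, not the package's own code. SETTINGS_PATH is a stand-in for _SETTINGS_PATH, with its value guessed from the example file name, and both function names are hypothetical.

```python
import json
import os

# Stand-in for the package's _SETTINGS_PATH; the value is an assumption
# based on the example file name ($HOME/edeposit/urnnbn.json).
SETTINGS_PATH = "edeposit/urnnbn.json"


def candidate_paths(settings_path=SETTINGS_PATH, home=None):
    """Return the two locations in lookup order: $HOME first, then /etc."""
    if home is None:
        home = os.path.expanduser("~")

    return [
        os.path.join(home, settings_path),
        os.path.join("/etc", settings_path),
    ]


def load_user_settings(paths):
    """Load JSON from the first existing path; later paths are ignored."""
    for path in paths:
        if os.path.exists(path):
            with open(path) as config_file:
                return json.load(config_file)

    return {}  # no configuration file found, keep the defaults


user_settings = load_user_settings(candidate_paths())
```

The resulting dictionary is the kind of value that substitute_globals() expects as config_dict.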
Attributes¶
- cz_urnnbn_api.settings.USERNAME = ''¶
Username for the URN:NBN resolver website
- cz_urnnbn_api.settings.PASSWORD = ''¶
Password for the URN:NBN resolver website
- cz_urnnbn_api.settings.REG_CODE = 'edep'¶
Registration code (used in API)
- cz_urnnbn_api.settings.URL = 'https://resolver-test.nkp.cz/api/v3/'¶
URL of the URN:NBN resolver
- cz_urnnbn_api.settings.REG_URL = 'https://resolver-test.nkp.cz/api/v3/registrars/'¶
- cz_urnnbn_api.settings.get_all_constants()¶
Get list of all uppercase, non-private globals (doesn’t start with _).
Returns: Uppercase names defined in globals() (variables from this module). Return type: list
- cz_urnnbn_api.settings.substitute_globals(config_dict)¶
Set global variables to values defined in config_dict.
Parameters: config_dict (dict) – dictionary with data, which are used to set globals. Note
config_dict has to be a dictionary, otherwise it is ignored. Variables that are not already in globals, whose type is not listed in _ALLOWED (str, int, float), or whose name starts with _ are also silently ignored.
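The filtering rules from the note can be illustrated with a small sketch that works on a plain dictionary instead of the module's globals. substitute_values and ALLOWED_TYPES are hypothetical names, and checking the type of the new value (rather than the existing one) is one possible reading of the note, not a confirmed detail of the package.

```python
# Mirrors the types named in the note above (str, int, float).
ALLOWED_TYPES = (str, int, float)


def substitute_values(namespace, config_dict):
    """Overwrite existing public keys in namespace with config values."""
    if not isinstance(config_dict, dict):
        return namespace  # anything but a dict is ignored

    for key, value in config_dict.items():
        if key.startswith("_"):
            continue  # private names are never overwritten
        if key not in namespace:
            continue  # unknown names are silently skipped
        if not isinstance(value, ALLOWED_TYPES):
            continue  # only plain str/int/float values are accepted

        namespace[key] = value

    return namespace


settings = {"USERNAME": "", "REG_CODE": "edep", "_SETTINGS_PATH": "x"}
substitute_values(
    settings,
    {"USERNAME": "someone", "UNKNOWN": 1, "_SETTINGS_PATH": "y"},
)
print(settings["USERNAME"])
```

Silently skipping unknown keys keeps a shared configuration file usable by several packages, at the cost of hiding typos in key names.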
Catalog structure¶
- class cz_urnnbn_api.api_structures.catalog.Catalog¶
Bases: cz_urnnbn_api.api_structures.catalog.Catalog
Class used for representing information about the catalogs to which the URN:NBN points.
- uid¶
str – ID of the catalog.
- name¶
str – Name of the catalog.
- created¶
str – ISO 8601 date string.
- url_prefix¶
str – ..
DigitalInstance structure¶
- class cz_urnnbn_api.api_structures.digital_instance.DigitalInstance(url, digital_library_id, **kwargs)¶
Bases: kwargs_obj.kwargs_obj.KwargsObj
Container used to hold information about instances of the documents in a digital library - this is a pointer to the document in the digital library.
- uid¶
str – ID of the library.
- url¶
str – URL of the library.
- digital_library_id¶
str – ID of the digital library.
- active¶
bool, def. None – Is the record active?
- created¶
str, def. None – ISO 8601 string with date.
- deactivated¶
str, def. None – ISO 8601 string representation of date.
- format¶
str, def. None – Format of the book. jpg;pdf for example.
- accessibility¶
str, def. None – Free description of accessibility.
- to_xml()¶
Convert self to XML, which can be sent to the API to register a new digital instance.
Returns: UTF-8 encoded string with XML representation. Return type: str Raises: AssertionError – If url or digital_library_id is not set.
- static instance_from_xmldict(dict_tag)¶
Create DigitalInstance from nested dicts (result of xmltodict).
Parameters: dict_tag (dict) – Nested dicts. Returns: DigitalInstance object. Return type: obj
- static from_xml(xml)¶
Parse the XML string into DigitalInstance objects.
Parameters: xml (str) – Unicode/utf-8 XML. Returns: List of DigitalInstance objects. Return type: list
DigitalLibrary structure¶
Structure for representing information about digital libraries.
- class cz_urnnbn_api.api_structures.digital_library.DigitalLibrary(uid, name, description=None, url=None, created=None)¶
Bases: object
Container used to hold information about the given digital library, in which the document the URN:NBN points to is stored.
- url¶
str – URL of the library.
- uid¶
str – ID of the library.
- name¶
str – Name of the digital library.
- created¶
str – ISO 8601 string.
- description¶
str – Free text description of the library.
Modes structure¶
- class cz_urnnbn_api.api_structures.modes.Modes(by_resolver=False, by_registrar=False, by_reservation=False)¶
Bases: object
Container holding information about the modes which may be used by the registrar to register documents.
- by_resolver¶
bool – True if the mode can be used.
- by_registrar¶
bool – True if the mode can be used.
- by_reservation¶
bool – True if the mode can be used.
Registrar structure¶
- class cz_urnnbn_api.api_structures.registrar.Registrar(code, uid, name=None, description=None, created=None, modified=None, modes=None)¶
Bases: object
Class holding information about the Registrar.
- code¶
str – Code of the registrar. Each organization has its own.
- name¶
str – Full name of the registrar.
- created¶
str – ISO 8601 date string.
- modified¶
str – ISO 8601 date string.
- description¶
str – Description of the registrar.
- digital_libraries¶
list – List of DigitalLibrary instances with information about the digital libraries used by the registrar.
URN_NBN structure¶
- class cz_urnnbn_api.api_structures.urn_nbn.URN_NBN(value, status=None, country_code=None, registrar_code=None, document_code=None, digital_document_id=None, registered=None)¶
Bases: str
Class used to hold the URN:NBN string together with other information returned from the server.
Note
This class subclasses str, so URN:NBN string can be obtained painlessly.
- status¶
str – ACTIVE for example.
- registered¶
str – ISO 8601 date string.
- registrar_code¶
str – Identification of registrar.
- digital_document_id¶
str – ID of the document.
- static from_xmldict(xdom, tag_picker=<function <lambda>>)¶
Parse itself from xmldict structure.
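The idea of a string that also carries extra attributes can be sketched with a small str subclass. TaggedString is a hypothetical class written only to show the technique. It is not the package's URN_NBN implementation and takes just two of its arguments.

```python
class TaggedString(str):
    """Hypothetical str subclass carrying extra attributes."""

    def __new__(cls, value, status=None, registrar_code=None):
        # str is immutable, so the text has to be set in __new__,
        # not in __init__.
        obj = super(TaggedString, cls).__new__(cls, value)
        obj.status = status
        obj.registrar_code = registrar_code
        return obj


urn = TaggedString(
    "urn:nbn:cz:edep-00000s",
    status="ACTIVE",
    registrar_code="edep",
)

print(urn.upper())  # all str methods keep working
print(urn.status)   # and the extra information is still attached
```

Because the object is a real str, it can be passed to any function that expects a plain URN:NBN string. Methods such as upper() return an ordinary str without the extra attributes.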
tools submodule¶
Useful functions used by multiple structures.
- cz_urnnbn_api.api_structures.tools.both_set_and_different(first, second)¶
If either argument is unset (= None), return False. Otherwise return the result of the inequality comparison.
Returns: True if both arguments are set and different. Return type: bool
- cz_urnnbn_api.api_structures.tools.to_list(tag)¶
Put tag into a list if it isn’t a list/tuple already.
Parameters: tag (obj) – Anything. Returns: Tag. Return type: list
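Both helpers are small enough to be sketched from their descriptions. The versions below are re-implementations written from the text above, not copies of the package source. In particular, returning a tuple unchanged from to_list is one reading of "if it isn't a list/tuple already".

```python
def both_set_and_different(first, second):
    """Return True only if both values are set and differ."""
    if first is None or second is None:
        return False

    return first != second


def to_list(tag):
    """Wrap tag in a list unless it already is a list or tuple."""
    if isinstance(tag, (list, tuple)):
        return tag

    return [tag]


print(both_set_and_different("a", "b"))
print(to_list("single item"))
```

A helper like to_list is handy with parsed XML, where a repeated tag typically yields a list of items and a single tag yields just one item, so the caller can always iterate over the result.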
Usage example¶
Example¶
To register a new document, you can compose the XML from individual fields:
from cz_urnnbn_api import api
from cz_urnnbn_api import xml_composer

urn_nbn = api.register_document_obj(
    xml_composer.MonographComposer(
        title="Title of the book",
        author="Name of the author",
        format="pdf"
    )
)
or use MODS metadata, if you have them:
from cz_urnnbn_api import api, xml_convertor

urn_nbn = api.register_document(
    xml_convertor.convert_mono_xml(open("mods_metadata.xml").read(), "pdf")
)
Then you can add digital instances to your urn_nbn identifiers:
api.register_digital_instance(
    urn_nbn=urn_nbn,
    url="someurl - lets say kramerius",
    digital_library_id="to get this, look at get_registrar_info()",
    format="epub",
    accessibility="public"
)
For a list of allowed digital libraries, call get_registrar_info(), which returns a Registrar with the property Registrar.digital_libraries (a list of DigitalLibrary instances).
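Picking the right digital_library_id from that list could look roughly like the sketch below. So that it runs without network access, it uses namedtuple stand-ins that only mimic the attribute names of Registrar and DigitalLibrary. The library names, IDs and URLs are made up, and find_library_id is a hypothetical helper.

```python
from collections import namedtuple

# Offline stand-ins with the same attribute names as the documented
# structures; in real code the Registrar comes from get_registrar_info().
DigitalLibrary = namedtuple("DigitalLibrary", "uid name url")
Registrar = namedtuple("Registrar", "code digital_libraries")


def find_library_id(registrar, name):
    """Return the uid of the digital library with given name, or None."""
    for library in registrar.digital_libraries:
        if library.name == name:
            return library.uid

    return None


# Made-up data, used only to exercise the helper.
registrar = Registrar(
    code="edep",
    digital_libraries=[
        DigitalLibrary(uid="1", name="first library", url="http://example.com/1"),
        DigitalLibrary(uid="2", name="second library", url="http://example.com/2"),
    ],
)

print(find_library_id(registrar, "second library"))
```

The returned value is what would be passed as digital_library_id to register_digital_instance().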
Installation¶
The module is hosted on PyPI and can be easily installed using pip:
sudo pip install cz-urnnbn-api
Source code¶
The project is released under the MIT license. The source code can be found on GitHub:
Unittests¶
Almost every feature of the project is covered by unit tests. You can run them using the provided run_tests.sh script, which can be found in the root of the project.
The run_tests.sh script can run unit tests (-u switch), which don’t actively work with the online API, and integration tests (-i switch), which work only with the online API. Both kinds of tests can be run using the -a switch.
If you have any trouble, just add the --pdb switch at the end of your run_tests.sh command like this: ./run_tests.sh -a --pdb. This will drop you into the PDB shell.
Requirements¶
This script expects the pytest and fake-factory packages to be installed. If you don’t have them yet, they can be easily installed using the following command:
pip install --user pytest fake-factory
or for all users:
sudo pip install pytest fake-factory
Example¶
$ ./run_tests.sh -a
============================= test session starts ==============================
platform linux2 -- Python 2.7.6 -- py-1.4.26 -- pytest-2.6.4
plugins: cov
collected 29 items
tests/integration/test_api.py ....
tests/unit/test_rest.py ....
tests/unit/test_xml_composer.py .........
tests/unit/test_xml_convertor.py ..
tests/unit/api_structures/test_digital_instance.py .....
tests/unit/api_structures/test_modes.py ...
tests/unit/api_structures/test_tools.py ..
========================== 29 passed in 0.75 seconds ===========================