Welcome to JamDB’s documentation!¶
JamDB is a schema-less, immutable database that can optionally enforce a schema and stores provenance. It supports efficient full-text search, filtering by nested keys, and is accessible a REST API.
It has pluggable storage backends. It defaults to using both MongoDB and Elasticsearch.
The full BDD test suite lives in the features directory and describes in human-readable prose how each endpoint works. What it’s expecting for input and what it expects for output.
Contents:
Prerequisites¶
This tutorial assumes that the JamDB server you’re interacting with will be
at http://localhost:1212
. It also assumes access to a terminal shell on
either OSX or Linux with the curl
command installed and executable.
Installing JamDB¶
JamDB is written in Python using the Tornado Web Framework.
It requires MongoDB version >= 3.2 and Elasticsearch version 1.7.
Steps¶
- Clone the JamDB git repo
- Create a new virtual environment for JamDB called
jam
- Setup your new virtual environment for JamDB
- Once in your virtual env, change into the directory you cloned JamDB into and execute
python setup.py develop
- Then execute
pip install -r requirements.txt
- Once in your virtual env, change into the directory you cloned JamDB into and execute
- Install MongoDB and Elasticsearch
- Confirm they’re both running:
ps aux | grep -i 'elasticsearch\|mongod'
- Run
jam server
- In another terminal, run
curl http://localhost:1212/v1/namespaces/
to confirm you can connect to the server. The response should be:{"data": [], "links": {}, "meta": {"perPage": 50, "total": 0}}
Next you should see the Namespaces section for instructions on creating your first namespace.
Namespaces¶
A namespace is the equivalent of a database in MongoDB or PostgreSQL.
It will act as a top-level container for any collections you make and store permissions that apply to itself and cascade to anything inside of it.
Administrator privileges are required to make any modification to a namespace.
Creating a namespace¶
Currently, there is no way to create a namespace through the API. If you’re working with a remote instance of JamDB, contact the server administrator to create a namespace. If you’re running a local instance of JamDB, you can create a namespace by running jam create <namespace id> -u 'jam-ProgrammingLanguages:Programmers-Ash
.
We’ll be usingProgrammingLanguages
as the example namespace id for the rest of this document. Namespace ids are case-sensitive.
Once your namespace is setup, you’ll need to send the proper Authorization
header to access it.
Authorizing against a namespace¶
JamDB uses json web tokens, JWT for short, in the Authorization
header or the token
query string parameter.
There are three ways to acquire a JWT:
- Contact the server administrator and request a temporary token.
- Authenticate via the Auth Endpoint
- If you are running a JamDB server locally you can generate a token by running
jam token 'jam-ProgrammingLanguages:Programmers-Ash'
We’ll be usingmycooljwt
as the example JWT for the rest of this document.
Investigating a namespace¶
You can get information about your namespace by making an HTTP request using curl, Paw, or a similar program.
HTTP Request:
GET /v1/namespaces/ProgrammingLanguages HTTP/1.1
Authorization: mycooljwt
HTTP Response:
{
"data": {
"id": "ProgrammingLanguages",
"type": "namespaces",
"attributes": {
"name": "ProgrammingLanguages",
"permissions": {
"jam-ProgrammingLanguages:Programmers-Ash": "ADMIN"
}
},
"meta": {/*...*/},
"relationships": {/*...*/}
}
}
Permissions may be different depending on how you got your JWT.
Namespace Permissions¶
Giving other users permissions to a namespace is easy.
We can update our namespace in two ways.
We can use jsonpatch to add just the field we want.
HTTP Request:
PATCH /v1/namespaces/ProgrammingLanguages HTTP/1.1
Authorization: mycooljwt
Content-Type: Content-Type: application/vnd.api+json; ext=jsonpatch
[{"op": "add", "path": "/permissions/jam-ProgrammingLanguages:Programmers-*", "value": "READ"}]
HTTP Response:
{
"data": {
"id": "ProgrammingLanguages",
"type": "namespaces",
"attributes": {
"name": "ProgrammingLanguages",
"permissions": {
"jam-ProgrammingLanguages:Programmers-*": "READ",
"jam-ProgrammingLanguages:Programmers-Ash": "ADMIN"
}
},
"meta": {/*...*/},
"relationships": {/*...*/}
}
}
Many jsonpatch objects may be sent at once.
HTTP Request:
PATCH /v1/namespaces/ProgrammingLanguages HTTP/1.1
Authorization: mycooljwt
Content-Type: Content-Type: application/vnd.api+json; ext=jsonpatch
[
{"op": "add", "path": "/permissions/jam-ProgrammingLanguages:Programmers-*", "value": "READ"},
{"op": "add", "path": "/permissions/jam-ProgrammingLanguages:Programmers-Misty", "value": "ADMIN"},
{"op": "add", "path": "/permissions/jam-ProgrammingLanguages:Programmers-Brock", "value": "ADMIN"}
]
HTTP Response:
{
"data": {
"id": "ProgrammingLanguages",
"type": "namespaces",
"attributes": {
"name": "ProgrammingLanguages",
"permissions": {
"jam-ProgrammingLanguages:Programmers-*": "READ",
"jam-ProgrammingLanguages:Programmers-Ash": "ADMIN",
"jam-ProgrammingLanguages:Programmers-Misty": "ADMIN",
"jam-ProgrammingLanguages:Programmers-Brock": "ADMIN",
}
},
"meta": {/*...*/},
"relationships": {/*...*/}
}
}
Or we can just PATCH up our updated data and let the JamDB server figure it out.
This is potentially a destructive action. Any existing permissions will be completely replaced. If you want to do a partial update use the JSONPatch method above.
HTTP Request:
PATCH /v1/namespaces/ProgrammingLanguages HTTP/1.1
Authorization: mycooljwt
{
"data": {
"id": "ProgrammingLanguages",
"type": "namespaces",
"attributes": {
"permissions": {
"jam-ProgrammingLanguages:Programmers-*": "READ",
"jam-ProgrammingLanguages:Programmers-Ash": "ADMIN"
}
}
}
}
HTTP Response:
{
"data": {
"id": "ProgrammingLanguages",
"type": "namespaces",
"attributes": {
"permissions": {
"jam-ProgrammingLanguages:Programmers-*": "READ",
"jam-ProgrammingLanguages:Programmers-Ash": "ADMIN"
}
},
"meta": {/*...*/},
"relationships": {/*...*/}
}
}
Collections are the next step in the documentation.
Collections¶
A collection is a bucket for arbitrary data. It’s the equivalent of a table in a SQL or NoSQL database. It may enforce a schema on its data. It also may extend the permissions of the namespace.
Creating a collection¶
To create a collection we just have to POST the data about our collection to our namespace’s collections endpoint.
HTTP Request:
POST /v1/namespaces/ProgrammingLanguages/collections HTTP/1.1
Authorization: mycooljwt
{
"data": {
"id": "Functional",
"type": "collections",
"attributes": {}
}
}
HTTP Response:
{
"data": {
"id": "ProgrammingLanguages.Functional",
"type": "collections",
"attributes": {
"permissions": {
"jam-ProgrammingLanguages:Programmers-Ash": "ADMIN"
}
},
"meta": {/*...*/},
"relationships": {/*...*/}
}
}
Please Note:
- We have been given ADMIN access to this collection because we created it.
- The id has been extended to
ProgrammingLanguages.Functional
because theFunctional
collection belongs to theProgrammingLanguages
namespace.- The full id (
ProgrammingLanguages.Functional
) or the truncated id (Functional
) may be used when sending update requests.- The truncated id will be used for the rest of this document.
Now our fellow programmers are free to browse through the Functional collection, which we will add information into later.
It’s a lot of work to load all this data into our collection by ourselves. Let’s get some help!
Adding collection permissions¶
We want to give a couple of our friends access to insert data into this collection but we don’t want to grant them access to all of our collection.
Using collection-level permissions, we can do just that.
Collections can be updated the same way that namespace are, either POSTing or PATCHing data.
Please note:
- The JSONPatching format is a bit nicer to look at so we’ll be using that method for the rest of this document.
- Keep in mind that you could just as easily PATCH the updated document instead.
PATCH /v1/namespaces/ProgrammingLanguages/collections/Functional HTTP/1.1
Authorization: mycooljwt
[
{"op": "add", "path": "/permissions/jam-ProgrammingLanguages:Programmers-Gary", "value": "CREATE, UPDATE"},
{"op": "add", "path": "/permissions/jam-ProgrammingLanguages:Programmers-ProfessorOak", "value": "CREATE, UPDATE"},
{"op": "add", "path": "/permissions/jam-ProgrammingLanguages:Programmers-ProfessorBirch", "value": "CREATE, UPDATE"}
]
{
"data": {
"id": "ProgrammingLanguages.Functional",
"type": "collections",
"attributes": {
"permissions": {
"jam-ProgrammingLanguages:Programmers-Ash": "ADMIN",
"jam-ProgrammingLanguages:Programmers-Gary": "CU",
"jam-ProgrammingLanguages:Programmers-ProfessorOak": "CU",
"jam-ProgrammingLanguages:Programmers-ProfessorBirch": "CU"
}
},
"meta": {/*...*/},
"relationships": {/*...*/}
}
}
Please note:
- Our permissions got compressed from CREATE, UPDATE to CU. This is the format JamDB stores permissions. CREATE, UPDATE and CU are equivalent. We could have set Gary’s, Professor Oak’s, and Professor Birch’s permissions to CU but CREATE, UPDATE is a bit easier to read.
- Remember that we gave
jam-ProgrammingLanguages:Programmers-*
READ permissions earlier.- Whenever Gary, Professor Oak, or Professor Birch access the Functional collection they will have that permission added to their CREATE, UPDATE permissions.
While we trust our friends, we may want to enforce data validation.
We are going to leverage the power of JSONSchema and JamDB’s schema validation for this.
Note: For the sake of length and readability we are going to use an abbreviated schema. The actual Functional schema is much longer because we’re huge nerds.
HTTP Request:
PATCH /v1/namespaces/ProgrammingLanguages/collections/Functional HTTP/1.1
Authorization: mycooljwt
[
{
"op": "add",
"path": "/schema",
"value": {
"type": "jsonschema",
"schema": {
"id": "/",
"type": "object",
"properties": {
"name": {
"id": "name",
"type": "string"
},
"type": {
"id": "type",
"type": "string"
},
"Number": {
"id": "Number",
"type": "integer"
},
"Interpreted": {
"id": "Interpreted",
"type": "boolean"
}
},
"required": [
"name",
"type",
"Number",
"Interpreted"
]
}
}
}
]
Please Note:
schema.type
must be set to the type of the schema. The actual schema lives atschema.schema
. This is so that JamDB may support other forms of schema validation in the future. Currently JSONSchema is the only supported validator.$
are illegal in JamDB key names.- Make sure not to use
$schema
or$ref
in your JSONSchema.
HTTP Response:
{
"data": {
"id": "ProgrammingLanguages.Functional",
"type": "collections",
"attributes": {
"permissions": {
"jam-ProgrammingLanguages:Programmers-Ash": "ADMIN",
"jam-ProgrammingLanguages:Programmers-Gary": "CU",
"jam-ProgrammingLanguages:Programmers-ProfessorOak": "CU",
"jam-ProgrammingLanguages:Programmers-ProfessorBirch": "CU"
},
"schema": {
"type": "jsonschema",
"schema": {
"id": "/",
"type": "object",
"properties": {
"type": {
"id": "type",
"type": "string"
},
"Number": {
"id": "Number",
"type": "integer"
},
"Interpreted": {
"id": "Interpreted",
"type": "boolean"
}
},
"required": [
"name",
"type",
"Number",
"Interpreted"
]
}
}
},
"meta": {/*...*/},
"relationships": {/*...*/}
}
}
Documents would be a good place to continue on to.
Documents¶
A document is any JSON object with a string identifier that lives in a collection.
Strings, numbers, and arrays are all valid JSON but the root of a document must be a JSON object.
Time for the fun part: Filling out the functional collection!
Creating Documents¶
Documents are created like anything else: by POSTing to functional collection’s documents endpoint.
HTTP Request:
POST /v1/namespaces/ProgrammingLanguages/collections/Functional/documents HTTP/1.1
Authorization: mycooljwt
{
"data": {
"id": "Clojure",
"type": "documents",
"attributes": {
"Number": 35,
"Interpreted": true,
"type": "JVM"
}
}
}
HTTP Response:
{
"data": {
"id": "ProgrammingLanguages.Functional.Clojure",
"type": "documents",
"attributes": {
"Number": 35,
"Interpreted": true,
"type": "JVM"
},
"meta": {/*...*/},
"relationships": {/*...*/}
}
}
For this next portion, we’re going to assume that our friends have filled out the rest of the Functional collection for us. Such nice friends.
Filtering, Pagination, and Sorting¶
Now that we have all our data loaded up, let’s search it. We’ll start with finding all entries of the type JVM
.
Filtering¶
- Filtering is available on the
documents
endpoint- The query string parameter is
filter[{key}]={value}
{key}
is the key that you want to filter on{value}
is the value that you want to filter the key by.
s are used to separate keys when referring to a nested object,filter[nested.keys.like.this]=value
Page size¶
- To save space we’ll be using a page size of 2
- Page size may be anywhere between 0 and 100, inclusive, and defaults to 50
- The query string parameter is
page[size]={value}
HTTP Request:
GET /v1/namespaces/ProgrammingLanguages/collections/Functional/documents?filter[type]=JVM&page[size]=2 HTTP/1.1
Authorization: mycooljwt
HTTP Response:
{
"data": [
{
"id": "ProgrammingLanguages.Functional.Clojure",
"type": "documents",
"attributes": {
"Number": 35,
"Interpreted": true,
"type": "JVM"
},
"meta": {/*...*/},
"relationships": {/*...*/}
}, {
"id": "ProgrammingLanguages.Functional.Haskell",
"type": "documents",
"attributes": {
"Number": 60,
"Interpreted": false,
"type": "native"
},
"meta": {/*...*/},
"relationships": {/*...*/}
},
],
"links": {/*...*/}
}
Sorting¶
Next let’s find the entry with the highest Number that is an JVM type.
This can be achieved by filtering on type and then sorting on Number.
Please note:
- Sorts may be done ascending or descending by prefixing the key you wish to sort on with
+
or-
, respectively- Sort order defaults to ascending
- Sort defaults to id
- If you want to sort on id, descending use
sort=-ref
. This is subject to change- The query string parameter is
sort={order}{value}
HTTP Request:
GET /v1/namespaces/ProgrammingLanguages/collections/Functional/documents?filter[type]=JVM&page[size]=1&sort=Number HTTP/1.1
Authorization: mycooljwt
HTTP Response:
{
"data": [
{
"id": "ProgrammingLanguages.Functional.Elixir",
"type": "documents",
"attributes": {
"Number": 90,
"Interpreted": false,
"type": "erlang"
},
"meta": {/*...*/},
"relationships": {/*...*/}
}
],
"links": {/*...*/}
}
Searching¶
Finally, Gary is trying to remember the id of a specific entry but only remembers that is ends with “oq”
What an excellent opportunity for us to tap into JamDB’s Elastic Search API.
Please note:
- The power of elasticsearch’s query string syntax is exposed as the
q
query string parameter on the _search endpoint.- In accordance with the other query string parameters, to query the id of a document use the
ref
key instead of id.- The query string parameter is
q={url_escaped_elasticsearch_query}
HTTP Request:
GET /v1/namespaces/ProgrammingLanguages/collections/Functional/documents?q=ref:*oq HTTP/1.1
Authorization: mycooljwt
HTTP Response:
{
"data": [
{
"id": "ProgrammingLanguages.Functional.Coq",
"type": "documents",
"attributes": {
"Number": 106,
"Interpreted": true,
"type": "OCaml"
},
"meta": {/*...*/},
"relationships": {/*...*/}
}
],
"links": {/*...*/}
}
Authentication¶
JamDB uses Json Web Tokens for authentication.
Authenticating¶
JamDB allows authentication through many providers. Currently osf
and self
are the only available providers.
A user may authenticate to JamDB by sending a properly formatted POST
request to Jam’s auth endpoint, /v1/auth
POST /v1/auth HTTP/1.1
{
"data": {
"type": "users",
"attributes": {
"provider": ...
...
}
}
}
Note
The elements of attributes
have been left blank in this example as they vary per provider.
The following sections will cover what each provider needs to properly authenticate
A successful authentication request will return the following data
{
"data": {
"id": "<type>-<provider>-<id>",
"type": "users",
"attributes": {
"id": "<id>",
"type": "<type>",
"provider": "<provider>",
"token": "<jwt>",
}
}
}
data.id
is the user id it will be matched against user selectors to calculate it’s permissions.
data.attributes.id
is the provider specific id for this user.
data.attributes.type
is the type of user for this user.
data.attributes.provider
is the provider that was used to authenticate as this user.
data.attributes.token
is the jwt used to authorize requests to JamDB
OSF¶
You will need an OSF account and an OAuth2 access token to authenticate via the OSF provider.
You may sign up for an account at osf.io.
To acquire an access token you may either generate a personal access token in user settings or via an OAuth2 authorization flow of an OSF app.
GET /v1/auth HTTP/1.1
{
"data": {
"type": "users",
"attributes": {
"provider": "osf",
"access_token": "<token>",
}
}
}
Authorizing¶
Authorization may be provided for an HTTP request in either the Authorization
header or the token
query parameter.
Note: TheAuthorization
header takes precedence over thetoken
query parameter
GET /v1/namespaces/ProgrammingLanguages HTTP/1.1
Authorization: mycooljwt
PUT /v1/namespaces/ProgrammingLanguages?token=mycooljwt HTTP/1.1
User Ids¶
User Ids are made of three parts separated by -
s.
<type>-<provider>-<id>
Note:*
,-
and.
are illegal characters in user ids
Type¶
Currently there are 3 types, user
, anon
, and jam
.
user
indicates that the user was authenticated via a 3rd party service, such as the OSF, Google, or even Facebook.
anon
indicates that the user simply requested a token to access JamDB, anyone may be a anon user.
jam
indicates that the user was authenticated via a collection existing in jam.
Provider¶
A provider is simply the service that was used to authenticate.
In the case of the user
type this may be osf
, google
, facebook
, etc.
anon
users do not have a provider.
For the jam
user type, provider is the namespace and collection that the user “logged into” separated by a :
. ie ProgrammingLanguages:Functional
Id¶
An id is any given string used by their provider to identify a user.
User Selectors¶
Selector | Meaning |
---|---|
* |
Matches ALL users, authenticated or not |
<type>-* |
Matches all authenticated users with the type
<type> |
<type>-<provide
r>-* |
Matches all users of the given type that have
authenticated via <provider> |
<type>-<provide
r>-<id> |
Matches an exact user |
User Selectors¶
Objective | Selector |
---|---|
Match everyone | * |
Match all users authenticated via OSF | user-osf-* |
Match all users authenticated via a 3rd party service | user-* |
Match anonymous users | anon-* |
Match a specific user | user-osf-juwia |
Data Modeling¶
JamDB is non-relational. A document is any JSON object with a string identifier that lives in a collection.
Strings, numbers, and arrays are all valid JSON but the root of a document must be a JSON object.
Your data model is up to you. You can use IDs to create pseudo-relationships.
+--------------------------------------------------------+
|AuthorizationNamespace |
+--------------------------------------------------------+
| |
| +---------------+ |
| |UserCollection | |
| +---------------+ |
| |Id +----------+ |
| |Name | | |
| |FavoriteColor | | |
| |ShirtSize | | |
| |Gender | +--------------------+ |
| |Email | ||UserGroupCollection| |
| +---------------+ |--------------------+ |
| ||UserId | |
| +---------------------------+GroupId | |
| |GroupCollection|| | | |
| +----------------| +-------------------+ |
| |Id ++ |
| |Name | |
| +---------------+ |
| |
+--------------------------------------------------------+
Contributing¶
We welcome contributions via [GitHub](https://github.com/CenterForOpenScience/jamdb).
Before submitting your pull request, please make sure that all unit tests are passing, by running the command below:
behave
To preview changes to the documentation, install the requirements in dev-requirements.txt, then run the following:
cd docs
make html
Limitations¶
Due to some of the underlying technologies, limits exist on the amount and types of data that may be stored.
Imposed by JamDB¶
- Namespace, collection, and document IDs may not excede 64 characters
Elasticsearch¶
- String values may not exceed 32766 bytes
- The string values
Infinity
and-Infinity
may not be used where other documents would have a numeric value
API Semantics¶
Notes¶
- ids are must match the regex [\d\w-]{3,64}
Routes list¶
/v1/namespaces
/v1/namespaces/<namespace_id>
/v1/namespaces/<namespace_id>/collections
/v1/namespaces/<namespace_id>/collections/<collection_id>
/v1/namespaces/<namespace_id>/collections/<collection_id>/_search
/v1/namespaces/<namespace_id>/collections/<collection_id>/documents
/v1/namespaces/<namespace_id>/collections/<collection_id>/documents/<document_id>
/v1/namespaces/<namespace_id>/collections/<collection_id>/documents/<document_id>/history
/v1/namespaces/<namespace_id>/collections/<collection_id>/documents/<document_id>/history/<history_id>
/v1/id/namespaces
/v1/id/namespaces/<namespace_id>
/v1/id/namespaces/<namespace_id>/collections
/v1/id/collections/<namespace_id>.<collection_id>
/v1/id/collections/<namespace_id>.<collection_id>/_search
/v1/id/collections/<namespace_id>.<collection_id>/documents
/v1/id/documents/<namespace_id>.<collection_id>.<document_id>
/v1/id/documents/<namespace_id>.<collection_id>.<document_id>/history
/v1/id/history/<namespace_id>.<collection_id>.<document_id>.<history_id>
Extensions¶
JSONPatch¶
JamDB implements JSONAPI’s jsonpatch extension as described here.
Example payloads and responses may be seen here, here, or here.
Deviations¶
JSONPatch is not currently supported with bulk operations.
Bulk¶
JamDB implements JSONAPI’s bulk extension as described here.
Example payloads and responses may be seen here.
Deviations¶
Bulk deletes are not currently supported.
Bulk operations are not transactional.
If a document creation fails for any reason it will not impede the creation of other documents.
The failure will be returned in the errors
key of the response JSON corresponding to it’s index in the POSTed data
field.
The behavior is demonstrated here.