python-sjsclient¶
Python bindings for the Spark Job Server API.
Features¶
- Supports Spark Jobserver 0.6.0+
Library Installation¶
$ pip install python-sjsclient
Getting started¶
First create a client instance:
>>> from sjsclient import client
>>> sjs = client.Client("http://JOB_SERVER_URL:PORT")
Uploading a jar to Spark Jobserver:
>>> import os
>>> jar_file_path = os.path.join("path", "to", "jar")
>>> with open(jar_file_path, 'rb') as f:
...     jar_blob = f.read()
>>> app = sjs.apps.create("test_app", jar_blob)
Uploading a python egg to Spark Jobserver:
>>> from sjsclient import app
>>> egg_file_path = os.path.join("path", "to", "egg")
>>> with open(egg_file_path, 'rb') as f:
...     egg_blob = f.read()
>>> python_app = sjs.apps.create("test_python_app", egg_blob, app.AppType.PYTHON)
Listing available apps:
>>> for app in sjs.apps.list():
... print(app.name)
...
test_app
my_streaming_app
Creating an adhoc job:
>>> test_app = sjs.apps.get("test_app")
>>> class_path = "spark.jobserver.VeryShortDoubleJob"
>>> config = {"test_config": "test_config_value"}
>>> job = sjs.jobs.create(test_app, class_path, conf=config)
>>> print("Job Status:", job.status)
Job Status: STARTED
Polling for job status:
>>> import time
>>> job = sjs.jobs.create(...)
>>> while job.status != "FINISHED":
...     time.sleep(2)
...     job = sjs.jobs.get(job.jobId)
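The polling loop above can be wrapped in a reusable helper with a timeout. This is a sketch only: `wait_for_job` is an illustrative name, not part of the library, and the `fetch` parameter stands in for `sjs.jobs.get`.

```python
import time

def wait_for_job(fetch, job_id, timeout=60.0, interval=2.0):
    """Poll fetch(job_id) until the job reaches a terminal status
    ('FINISHED' or 'ERROR') or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        job = fetch(job_id)
        if job.status in ("FINISHED", "ERROR"):
            return job
        time.sleep(interval)
    raise RuntimeError("job %s did not finish within %ss" % (job_id, timeout))
```

Against a live server this would be called as `wait_for_job(sjs.jobs.get, job.jobId)`.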
Getting job config:
>>> config = {"test_config": "test_config_value"}
>>> job = sjs.jobs.create(test_app, class_path, conf=config)
>>> job_config = job.get_config()
>>> print("test_config value:", job_config["test_config"])
test_config value: test_config_value
Listing jobs:
>>> for job in sjs.jobs.list():
... print(job.jobId)
...
8c5bd52f-6486-44ee-9ac3-a8327ee40494
24b67573-3115-49c7-983c-d0eff0499b71
99c8be9e-a0ec-42dd-8a2c-9a8680bc5051
bb82f712-d4b4-43a4-8e4d-e4bb272e85db
Limiting jobs list:
>>> for job in sjs.jobs.list(limit=1):
... print(job.jobId)
...
8c5bd52f-6486-44ee-9ac3-a8327ee40494
Creating a named context:
>>> ctx_config = {'num-cpu-cores': '1', 'memory-per-node': '512m'}
>>> ctx = sjs.contexts.create("test_context", ctx_config)
Running a job in a named context:
>>> test_app = sjs.apps.get("test_app")
>>> test_ctx = sjs.contexts.get("test_context")
>>> config = {"test_config": "test_config_value"}
>>> job = sjs.jobs.create(test_app, class_path, ctx=test_ctx, conf=config)
>>> print("Job Status:", job.status)
Job Status: STARTED
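The steps above compose into a small end-to-end submission routine. The sketch below assumes a reachable Spark Jobserver behind `sjs`; `submit_in_context` and the names `example_app` and `wc_context` are illustrative, not conventions required by the server.

```python
def submit_in_context(sjs, jar_path, class_path, ctx_name="wc_context",
                      conf=None):
    """Upload a jar, create a named context, and submit a job to it.

    `sjs` is an sjsclient Client instance; the app and context names
    here are examples only.
    """
    with open(jar_path, "rb") as f:
        app = sjs.apps.create("example_app", f.read())
    ctx = sjs.contexts.create(ctx_name, {"num-cpu-cores": "1",
                                         "memory-per-node": "512m"})
    return sjs.jobs.create(app, class_path, ctx=ctx, conf=conf)
```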
Documentation¶
Discussion list¶
spark-jobserver google group: https://groups.google.com/forum/#!forum/spark-jobserver
Requirements¶
- Python >= 2.7.0
License¶
python-sjsclient is offered under the Apache 2 license.
Source code¶
The latest development version is available in a GitHub repository: https://github.com/spark-jobserver/python-sjsclient
Contents:
Client API Reference¶
Client¶
- class sjsclient.client.Client(endpoint, auth=None)[source]
Bases: object
Client for Spark Job Server
App¶
- class sjsclient.app.App(manager, attrs=None)[source]
Bases: sjsclient.base.Resource
An App is a Spark application.
- name = None
Name of the App
- time = None
App creation time
- class sjsclient.app.AppManager(client)[source]
Bases: sjsclient.base.ResourceManager
Manage App resources.
- base_path = 'binaries'
- create(name, app_binary, app_type='java')[source]
Create an app.
Parameters: - name – Descriptive name of application
- app_binary – Application binary
- app_type – App type, for example java or python, default: java
Return type: App
- delete(name)[source]
Delete a specific App.
Parameters: name – The name of the App to delete.
- get(name)[source]
Get a specific App.
Parameters: name – The name of the App to get. Return type: App
- list()[source]
Lists Apps.
- resource_class
alias of App
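Uploading from a path can be wrapped so the file handle is closed promptly; a minimal sketch, where `upload_app` is an illustrative name and `apps` stands in for `sjs.apps`:

```python
def upload_app(apps, name, path, app_type="java"):
    """Read a binary from disk and register it via AppManager.create.

    `apps` is an AppManager (e.g. sjs.apps); the file is closed as
    soon as its bytes have been read.
    """
    with open(path, "rb") as f:
        return apps.create(name, f.read(), app_type)
```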
Job¶
- class sjsclient.job.Job(manager, attrs=None)[source]
Bases: sjsclient.base.Resource
A Spark job.
- classpath = None
Main Java class path
- context = None
Context name
- delete()[source]
Delete job.
- duration = None
Time taken by the job to finish
- get_config()[source]
Get job configuration.
- jobId = None
Job ID
- result = None
Response from Spark.
- status = None
Job status
- class sjsclient.job.JobConfig[source]
Bases: dict
A Spark job config dictionary.
- class sjsclient.job.JobManager(client)[source]
Bases: sjsclient.base.ResourceManager
Manage Job resources.
- base_path = 'jobs'
- create(app, class_path, conf=None, ctx=None, sync=False)[source]
Create a Spark job.
Parameters: - app – Instance of App
- class_path – Main class path of the Spark job.
- conf – Configuration dictionary
- ctx – Instance of Context
- sync – Set to True for synchronous job creation
Return type: Job
- delete(job_id)[source]
Delete a specific Job.
Parameters: job_id – The jobId of the Job to delete.
- get(job_id)[source]
Get a specific Job. This returns more information than create.
Parameters: job_id – The jobId of the Job to get. Return type: Job
- get_config(job_id)[source]
Get job configuration.
Parameters: job_id – The jobId of the Job to get. Return type: JobConfig
- resource_class
alias of Job
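With `sync=True`, `create` blocks until the job completes, so the returned Job already carries its `result`. A thin wrapper sketch; `run_sync` is an illustrative name, and synchronous execution support on the server side is assumed:

```python
def run_sync(jobs, app, class_path, conf=None, ctx=None):
    """Submit a job synchronously and return its result.

    `jobs` is a JobManager (e.g. sjs.jobs); with sync=True the create
    call blocks, so job.result is populated on return.
    """
    job = jobs.create(app, class_path, conf=conf, ctx=ctx, sync=True)
    return job.result
```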
- class sjsclient.job.JobStatus[source]
Bases: object
A helper class holding job status constants
- ERROR = 'ERROR'
- FINISHED = 'FINISHED'
- RUNNING = 'RUNNING'
Context¶
- class sjsclient.context.Context(manager, attrs=None)[source]
Bases: sjsclient.base.Resource
A Spark context.
- delete()[source]
Delete context.
- class sjsclient.context.ContextManager(client)[source]
Bases: sjsclient.base.ResourceManager
Manage Context resources.
- base_path = 'contexts'
- create(name, params=None)[source]
Create a Spark context.
Parameters: - name – Descriptive name of context
- params – Dictionary of context parameters
Return type: Context
- delete(name)[source]
Delete a specific Context.
Parameters: name – The name of the Context to delete.
- get(name)[source]
Get a specific Context.
Parameters: name – The name of the Context to get. Return type: Context
- resource_class
alias of Context
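A common pattern is to reuse a named context when it already exists and create it otherwise. A sketch, where `ensure_context` is an illustrative helper and the broad `except` is a placeholder for the client's real not-found exception, which this reference does not name:

```python
def ensure_context(contexts, name, params=None):
    """Return the named context, creating it if lookup fails.

    `contexts` is a ContextManager (e.g. sjs.contexts). The broad
    except is a placeholder; narrow it to the client's actual
    not-found exception in real code.
    """
    try:
        return contexts.get(name)
    except Exception:
        return contexts.create(name, params)
```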