dbgdb¶
This is a library containing Luigi tasks to help you get data into (and out of) your GIS databases. The library also comes with a command-line interface based on Click that can be helpful for running tasks individually.
Getting Started¶
Prerequisites¶
Using Tasks¶
This library contains a number of Luigi tasks that you can use in your own pipelines. These include:
dbgdb.tasks.postgres.load.PgLoadTask
- for loading data (like file geodatabases) into your PostgreSQL database;
dbgdb.tasks.postgres.extract.PgExtractTask
- for extracting data from your PostgreSQL database; and
dbgdb.tasks.postgres.drop.PgDropSchemaTask
- if you need to drop an import schema because you’re starting over.
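As a rough illustration of using one of these tasks from your own code, the sketch below builds a PgLoadTask and hands it to Luigi's local scheduler. It assumes the task accepts the url, schema, and indata parameters listed in the API documentation; the helper names (load_task_params, run_load) and the default values are hypothetical and only borrowed from the documented defaults of dbgdb.ogr.postgres.load.

```python
from pathlib import Path

# Assumed default, copied from the documented signature of dbgdb.ogr.postgres.load.
DEFAULT_URL = "postgresql://postgres@localhost:5432/postgres"


def load_task_params(indata, schema="imports", url=DEFAULT_URL):
    """Collect the three documented PgLoadTask parameters as strings."""
    return {"url": url, "schema": schema, "indata": str(Path(indata))}


def run_load(indata, schema="imports", url=DEFAULT_URL):
    """Build a PgLoadTask and run it on Luigi's local scheduler."""
    # Imported lazily so load_task_params() works without Luigi or dbgdb installed.
    import luigi
    from dbgdb.tasks.postgres.load import PgLoadTask

    task = PgLoadTask(**load_task_params(indata, schema, url))
    # local_scheduler=True plays the same role as the CLI's -l parameter.
    return luigi.build([task], local_scheduler=True)
```

Calling run_load("/path/to/your/data.gdb", schema="myschema") would then be roughly equivalent to the load subcommand described in the command-line section.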
Using the Command Line¶
Most of the commands you run with the command-line interface (CLI) create Luigi tasks which are then submitted to a Luigi scheduler.
Note
The -l parameter indicates that the tasks should be run using the local scheduler. The examples listed below use this parameter. If you want to submit tasks to the Luigi daemon (luigid) you can simply omit this parameter.
Getting Help¶
dbgdb has its own command-line help.
dbgdb --help
Loading Data¶
You can load a file geodatabase with the load subcommand.
dbgdb -l load --schema myschema /path/to/your/data.gdb
Extracting Data¶
You can extract all of the data within a schema to an output file with the extract subcommand.
dbgdb -l extract --schema myschema /path/to/your/exported/data.db
Note
At the moment, we can export to GeoPackage or Spatialite formats. Support for ESRI File Geodatabases (gdb) is still in the works.
Dropping a Schema¶
If the target schema for your load command already exists, you may notice Luigi reporting there was nothing to do because, from the task’s perspective, the work has already been done. If you need to drop a schema, you can use the drop subcommand.
dbgdb -l drop schema myschema
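If you find yourself dropping and reloading a schema repeatedly, the two commands can be scripted. The sketch below only assembles the argument lists for the drop and load commands shown above; the helper names are hypothetical, and the commands are printed rather than executed.

```python
import shlex


def dbgdb_command(*args, local=True):
    """Build a dbgdb argument list, adding -l for the local scheduler."""
    cmd = ["dbgdb"]
    if local:
        cmd.append("-l")
    cmd.extend(str(arg) for arg in args)
    return cmd


def reload_commands(schema, gdb_path, local=True):
    """Return the drop command followed by the load command for one schema."""
    return [
        dbgdb_command("drop", "schema", schema, local=local),
        dbgdb_command("load", "--schema", schema, gdb_path, local=local),
    ]


if __name__ == "__main__":
    for cmd in reload_commands("myschema", "/path/to/your/data.gdb"):
        # Print a shell-safe rendering of each command.
        print(shlex.join(cmd))
```

To actually run the commands, each list could be passed to subprocess.run(cmd, check=True) so that a failed drop stops the script before the load starts. Passing local=False omits -l and submits the tasks to luigid instead.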
API Documentation¶
dbgdb¶
Data pipeline utilities to help you get data into and out of your database.
dbgdb.db.postgres¶
This module contains utility functions to help when working with PostgreSQL databases.
-
dbgdb.db.postgres.
connect
(url: str, dbname: str = None, autocommit: bool = False)[source]¶ Create a connection to a Postgres database.
Parameters: - url – the Postgres instance URL
- dbname – the target database name (if it differs from the one specified in the URL)
- autocommit – Set the autocommit flag on the connection?
Returns: a psycopg2 connection
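To make the dbname parameter concrete, the sketch below shows one way a database name could be swapped into an instance URL using only the standard library. This is an illustration of the idea, not the library's implementation; with_dbname and open_connection are hypothetical helpers.

```python
from urllib.parse import urlsplit, urlunsplit


def with_dbname(url, dbname=None):
    """Return url with its database (path) component replaced by dbname."""
    if dbname is None:
        return url
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path="/" + dbname))


def open_connection(url, dbname=None):
    """Open an autocommit connection using the documented connect() signature."""
    # Imported lazily so with_dbname() works without dbgdb installed.
    from dbgdb.db.postgres import connect

    return connect(url, dbname=dbname, autocommit=True)
```

For example, with_dbname("postgresql://postgres@localhost:5432/postgres", "gis") points the same instance URL at a database named gis.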
-
dbgdb.db.postgres.
create_db
(url: str, dbname: str, admindb: str = 'postgres')[source]¶ Create a database on a Postgres instance.
Parameters: - url – the Postgres instance URL
- dbname – the name of the database
- admindb – the name of an existing (presumably the main) database
Returns:
-
dbgdb.db.postgres.
create_extensions
(url: str)[source]¶ Create the necessary database extensions.
Parameters: url – the URL of the database instance
-
dbgdb.db.postgres.
create_schema
(url: str, schema: str)[source]¶ Create a schema in the database.
Parameters: - url – the URL of the database instance
- schema – the name of the schema
-
dbgdb.db.postgres.
db_exists
(url: str, dbname: str = None, admindb: str = 'postgres') → bool[source]¶ Does a given database on a Postgres instance exist?
Parameters: - url – the Postgres instance URL
- dbname – the name of the database to test
- admindb – the name of an existing (presumably the main) database
Returns: True if the database exists, otherwise False
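A common use of db_exists is to guard create_db so that a database is only created once. The following is a minimal sketch under that assumption; ensure_database is a hypothetical helper, and the exists and create arguments exist only so the logic can be exercised without a running Postgres instance.

```python
def ensure_database(url, dbname, exists=None, create=None):
    """Create dbname if it is missing; return True if it was created."""
    if exists is None or create is None:
        # Fall back to the documented dbgdb functions when no stand-ins are given.
        from dbgdb.db.postgres import create_db, db_exists

        exists = exists or db_exists
        create = create or create_db
    if exists(url, dbname):
        return False
    create(url, dbname)
    return True
```

Called as ensure_database(url, "gis"), this would use db_exists and create_db with their default admindb of 'postgres'.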
-
dbgdb.db.postgres.
drop_schema
(url: str, schema: str)[source]¶ Drop a schema from the database.
Parameters: - url – the URL of the database instance
- schema – the name of the schema
dbgdb.cli¶
This is the entry point for the command-line interface (CLI) application. It can be used as a handy facility for running tasks from a command line.
Note
To learn more about Click visit the project website. There is also a very helpful tutorial video.
To learn more about running Luigi, visit the Luigi project’s Read-The-Docs page.
-
class
dbgdb.cli.
Info
[source]¶ Bases:
object
This is an information object that can be used to pass data between CLI functions.
-
dbgdb.cli.
pass_info
(f)¶
dbgdb.ogr.postgres¶
This module contains wrapper functions for work that would generally be handled by OGR/GDAL.
-
class
dbgdb.ogr.postgres.
OgrDrivers
[source]¶ Bases:
enum.Enum
These are the supported OGR drivers.
-
GeoPackage
= 'GPKG'¶
-
PostGIS
= 'PostgreSQL'¶
-
Spatialite
= 'SQLITE'¶
-
-
dbgdb.ogr.postgres.
extract
(outdata: pathlib.Path, schema: str = 'imports', url: str = 'postgresql://postgres@localhost:5432/postgres', driver: dbgdb.ogr.postgres.OgrDrivers = <OgrDrivers.Spatialite: 'SQLITE'>)[source]¶ Extract a schema from a PostgreSQL database to an output file (GeoPackage or Spatialite).
Parameters: - outdata – the path to the output
- schema – the schema to export
- url – the URL of the Postgres database instance
- driver – the OGR driver to use
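The driver argument decides the output format. One way to choose it is from the output file's extension, as sketched below. The suffix mapping is an assumption made for illustration (the library may select drivers differently), and OutputDriver is a local stand-in that mirrors the two file-based OgrDrivers members so the mapping can be tried without dbgdb installed.

```python
from enum import Enum
from pathlib import Path


class OutputDriver(Enum):
    """Local stand-in mirroring the file-based OgrDrivers members."""
    GeoPackage = "GPKG"
    Spatialite = "SQLITE"


# Assumed mapping from file suffix to driver; not taken from the library.
SUFFIX_TO_DRIVER = {
    ".gpkg": OutputDriver.GeoPackage,
    ".db": OutputDriver.Spatialite,
    ".sqlite": OutputDriver.Spatialite,
}


def driver_for(outdata):
    """Pick a driver from the output path's suffix, or raise ValueError."""
    suffix = Path(outdata).suffix.lower()
    try:
        return SUFFIX_TO_DRIVER[suffix]
    except KeyError:
        raise ValueError(f"no export driver assumed for {suffix!r}") from None


def extract_schema(outdata, schema="imports"):
    """Call the documented extract() with a driver chosen by suffix."""
    # Imported lazily so driver_for() works without dbgdb or GDAL installed.
    from dbgdb.ogr.postgres import OgrDrivers, extract

    driver = OgrDrivers[driver_for(outdata).name]
    extract(Path(outdata), schema=schema, driver=driver)
```

A path ending in .gdb raises ValueError here, which matches the note that file geodatabase export is not yet supported.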
-
dbgdb.ogr.postgres.
load
(indata: pathlib.Path, url: str = 'postgresql://postgres@localhost:5432/postgres', schema: str = 'imports', overwrite: bool = True, progress: bool = True, use_copy: bool = True, driver: dbgdb.ogr.postgres.OgrDrivers = <OgrDrivers.PostGIS: 'PostgreSQL'>)[source]¶ Load a file geodatabase (GDB) into a Postgres database.
Parameters: - indata – the path to the file geodatabase
- url – the URL of the Postgres instance
- schema – the target schema
- overwrite – Overwrite existing data?
- progress – Show progress?
- use_copy – Use COPY instead of INSERT when loading?
- driver – the OGR driver to use
dbgdb.targets.postgres¶
This module contains PostgreSQL targets.
-
class
dbgdb.targets.postgres.
PgSchemaTarget
(url: str, schema: str, dne: bool = False)[source]¶ Bases:
luigi.target.Target
This is a target that represents a schema in a PostgreSQL database.
dbgdb.tasks.postgres.drop¶
This module contains the PgDropSchemaTask task which you can use to drop a schema.
-
class
dbgdb.tasks.postgres.drop.
PgDropSchemaTask
(*args, **kwargs)[source]¶ Bases:
luigi.task.Task
This task drops a schema from a database instance.
Variables: - url – the URL of the target database
- schema – the target schema
-
output
() → dbgdb.targets.postgres.PgSchemaTarget[source]¶ This task returns a
PgSchemaTarget
that points to the schema being dropped. Returns: the PostgreSQL schema target
-
schema
= <luigi.parameter.Parameter object>¶
-
url
= <luigi.parameter.Parameter object>¶
dbgdb.tasks.postgres.extract¶
This module contains the PgExtractTask task which you can use to extract data from your database instance.
-
class
dbgdb.tasks.postgres.extract.
PgExtractTask
(*args, **kwargs)[source]¶ Bases:
luigi.task.Task
This task extracts data from a database instance to an output file.
Variables: - url – the URL of the target database
- schema – the target schema
- outdata – the path to which data should be exported
- driver – the driver to use for exporting data
-
driver
= <luigi.parameter.EnumParameter object>¶
-
outdata
= <luigi.parameter.Parameter object>¶
-
output
() → luigi.local_target.LocalTarget[source]¶ This task returns a
LocalTarget
that points to the exported output file. Returns: the local file target
-
schema
= <luigi.parameter.Parameter object>¶
-
url
= <luigi.parameter.Parameter object>¶
dbgdb.tasks.postgres.load¶
This module contains the PgLoadTask task which you can use to load a file geodatabase into your database instance.
-
class
dbgdb.tasks.postgres.load.
PgLoadTask
(*args, **kwargs)[source]¶ Bases:
luigi.task.Task
This task loads a file geodatabase into a database instance.
Variables: - url – the URL of the target database
- schema – the target schema
- indata – the path to the input data
-
indata
= <luigi.parameter.Parameter object>¶
-
output
() → dbgdb.targets.postgres.PgSchemaTarget[source]¶ This task returns a
PgSchemaTarget
that points to the target schema where the data was loaded. Returns: the PostgreSQL schema target
-
schema
= <luigi.parameter.Parameter object>¶
-
url
= <luigi.parameter.Parameter object>¶
Python Module Dependencies¶
The requirements.txt file contains this project’s module dependencies. You can install these dependencies using pip.
pip install -r requirements.txt
requirements.txt¶
addict>=2.1.3,<3
click>=6.7,<7
luigi>=2.7.5,<3
luijo>=0.0.17,<0.1
mock>=2.0.0,<3
parameterized>=0.6.1,<1
pip-check-reqs>=2.0.1,<3
psycopg2-binary>=2.7.4,<3
pylint>=1.8.4,<2
pytest>=3.4.0,<4
pytest-cov>=2.5.1,<3
pytest-pythonpath>=0.7.2,<1
setuptools>=38.4.0
Sphinx==1.7.2
sphinx-rtd-theme==0.3.0
testing.postgresql>=1.3.0,<2
tox>=3.0.0,<4
twine>=1.11.0,<2