Pachyderm is a Data Lake -- a place to dump and process gigantic data sets. Pachyderm is inspired by the Hadoop ecosystem but shares no code with it. Instead, we leverage the container ecosystem to provide the broad functionality of Hadoop with the ease of use of Docker.
Pachyderm offers the following core functionality:

- Virtually limitless storage for any data.
- Virtually limitless processing power using any tools.
- Tracking of data history, provenance and ownership (version control for data).
- Automatic processing of new data as it’s ingested (streaming).
- Chaining processes together (pipelining).

.. image:: http://readthedocs.org/projects/pachyderm/badge/?version=latest
   :target: http://docs.pachyderm.io/en/latest/?badge=latest
   :alt: Documentation Status