Description

Pachyderm is a Data Lake -- a place to dump and process gigantic data sets. Pachyderm is inspired by the Hadoop ecosystem but shares no code with it. Instead, we leverage the container ecosystem to provide the broad functionality of Hadoop with the ease of use of Docker.

Pachyderm offers the following core functionality:

Virtually limitless storage for any data.
Virtually limitless processing power using any tools.
Tracking of data history, provenance and ownership (version control for data).
Automatic processing on new data as it’s ingested (streaming).
Chaining processes together (pipelining).

Repository

https://github.com/pachyderm/pachyderm.git

Last Built

1 day, 19 hours ago (passed)

Home Page

https://pachyderm.io

Tags

Data Analytics, Big Data, Data Science

Project Privacy Level

Public

Short URLs

pachyderm.readthedocs.io
pachyderm.rtfd.io

Default Version

latest

'latest' Version

master