Description

Large quantities of data are increasingly available thanks to initiatives sponsoring the collection of large-scale data and efforts to increase the publication of already collected datasets. As a result, progress in research is increasingly limited by the speed at which we can organize and analyze data. To help improve researchers' ability to quickly access and analyze data, we have developed software that designs database structures for these datasets and then downloads the data, pre-processes it, and installs it into major database management systems and file formats (currently MySQL, PostgreSQL, SQLite, Microsoft Access, XML, and JSON). Once the Data Retriever has loaded the data into the database, it is easy to connect to the database using standard tools (e.g., MS Access, Filemaker, etc.). The Data Retriever can download and install small datasets in seconds and large datasets in minutes. The program also cleans up known issues with the datasets and automatically restructures them into a format appropriate for standard database management systems. Automating this process reduces the time a user needs to get most large datasets up and running by hours, and in some cases days.
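
The workflow described above (design a table structure, clean known issues, load into a database) can be illustrated with a minimal sketch using only the Python standard library. This is not the Data Retriever's own code or API: the sample data, the `MISSING` markers, and the `infer_type` and `load_csv` helpers are all hypothetical, and an inline string stands in for the download step.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a downloaded dataset file.
RAW_CSV = """species,site,count,mass_g
DM,A1,12,43.5
PP,A2,NA,17.2
DO,A1,3,
"""

# Assumed missing-value markers for this sample; real datasets vary.
MISSING = {"", "NA"}


def infer_type(values):
    """Pick the narrowest SQLite type that fits every non-missing value."""
    present = [v for v in values if v not in MISSING]
    for sql_type, cast in (("INTEGER", int), ("REAL", float)):
        try:
            for v in present:
                cast(v)
            return sql_type
        except ValueError:
            continue
    return "TEXT"


def load_csv(conn, table, text):
    """Design a table from CSV text, clean missing values, and load the rows."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]

    # "Design the database structure": one inferred type per column.
    types = [infer_type(col) for col in zip(*data)]
    schema = ", ".join(f'"{name}" {t}' for name, t in zip(header, types))
    conn.execute(f'CREATE TABLE "{table}" ({schema})')

    # "Clean up known issues": turn missing-value markers into NULL.
    cleaned = [[None if v in MISSING else v for v in row] for row in data]

    # "Install": bulk insert; SQLite column affinity converts numeric strings.
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', cleaned)
    conn.commit()
    return dict(zip(header, types))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(load_csv(conn, "surveys", RAW_CSV))
    print(conn.execute('SELECT SUM("count") FROM surveys').fetchone()[0])
```

Once the table exists, any tool that speaks SQL can query it, which is the point of loading the data into a standard database rather than leaving it in its raw form.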

Repository

https://github.com/weecology/retriever.git

Project Slug

retriever

Last Built

1 year ago (passed)

Home Page

http://data-retriever.org

Tags

data, data-retrieval, data-science, dataset, datasets, python

Short URLs

retriever.readthedocs.io
retriever.rtfd.io

Default Version

latest

'latest' Version

main