Description

A crawler based on PhantomJS. Allows discovery of dynamic content and supports custom scrapers. For all your ajaxy crawling & scraping needs.

* Parallel crawling/scraping via Phantom pooling.
* Custom-defined link discovery.
* Custom-defined runners (scrape, test, validate, etc.).
* Can follow redirects (and because it's based on PhantomJS, JavaScript redirects will be followed as well as <meta> redirects).
* Streaming.
* Resilient to PhantomJS crashes.
* Ignores page errors.

Repository

https://github.com/crawlkit/crawlkit.git

Project Slug

crawlkit

Last Built

6 years, 10 months ago (build passed)

Tags

axe, crawling, phantomjs, scraper

Short URLs

crawlkit.readthedocs.io
crawlkit.rtfd.io

Default Version

latest

'latest' Version

master