Welcome to Django-ProxyList’s documentation!¶
django-proxylist-for-grab is a reusable app for maintaining an up-to-date (through checking) list of proxy servers.
Contents:
Introduction¶
Checking proxies¶
Proxy checking is the core of django-proxylist-for-grab. To perform a check you need three things:
- A checker: It sends a request through the proxy.
- A proxy: Of course ...
- A mirror: A special page that reflects the checker's request back and lets us see how a third-party page would see us.
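The role of the mirror can be illustrated with a minimal sketch. It assumes a hypothetical mirror that returns JSON containing the peer address and the request headers it received; the function name `classify`, the JSON shape, and the labels "transparent", "anonymous" and "elite" are common proxy terminology chosen for illustration, not part of django-proxylist-for-grab.

```python
# Hypothetical sketch: none of these names come from django-proxylist-for-grab.
import json
import re

# Headers that commonly reveal that a request passed through a proxy.
PROXY_HEADERS = ("via", "x-forwarded-for", "forwarded", "x-real-ip")


def _mentions(value, ip):
    """Return True if `ip` appears as a whole token in a header value."""
    return ip in re.split(r"[,;=\s]+", value)


def classify(mirror_body, real_ip):
    """Classify a proxy from what the mirror reported.

    mirror_body: JSON text from the (assumed) mirror, shaped like
                 {"remote_addr": "...", "headers": {...}}.
    real_ip:     the checker's own outbound IP address.
    """
    seen = json.loads(mirror_body)
    headers = {k.lower(): v for k, v in seen["headers"].items()}

    # The proxy leaked our address, either as the peer or inside a header.
    if seen["remote_addr"] == real_ip:
        return "transparent"
    if any(_mentions(v, real_ip) for v in headers.values()):
        return "transparent"
    # Our address is hidden, but the request is visibly proxied.
    if any(name in headers for name in PROXY_HEADERS):
        return "anonymous"
    # Nothing in the reflected request points back at us or at a proxy.
    return "elite"
```

In this sketch the checker would fetch the mirror URL through the proxy and pass the response body, together with its own outbound IP, to `classify`.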
Dependencies¶
django-proxylist-for-grab depends on:
Packages¶
- django-celery
- django-countries
- pygeoip
- grab
Backend¶
Cache¶
The checking machinery uses Django's cache framework with the default cache, but you can change this behaviour by setting the PROXYLIST_CACHE variable.
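As a hedged illustration, the settings fragment below defines a second cache and points the checker at it. It assumes that PROXYLIST_CACHE holds the name of an entry in Django's CACHES setting; the alias "proxylist" and the file path are made up for the example.

```python
# settings.py (fragment)
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
    },
    # Hypothetical dedicated cache for the proxy checker.
    "proxylist": {
        "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
        "LOCATION": "/var/tmp/proxylist_cache",
    },
}

# Assumption: the value is a cache alias defined in CACHES above.
PROXYLIST_CACHE = "proxylist"
```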
Database¶
django-proxylist-for-grab does not depend on any particular database backend, but if you have a large list of proxies and want to check it at short intervals you should avoid SQLite.
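For example, a PostgreSQL configuration might look like the sketch below. The database name, user and password are placeholders, and PostgreSQL is only one possible alternative to SQLite. Older Django releases spell the engine `django.db.backends.postgresql_psycopg2`.

```python
# settings.py (fragment): placeholder credentials, adjust to your environment.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "proxylist",
        "USER": "proxylist",
        "PASSWORD": "change-me",
        "HOST": "127.0.0.1",
        "PORT": "5432",
    }
}
```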
Installation¶
Installing the package¶
django-proxylist-for-grab can be easily installed using pip:
$ pip install django-proxylist-for-grab
Configuration¶
After that, add django-proxylist-for-grab to the INSTALLED_APPS list in your Django settings file.
INSTALLED_APPS = (
...
'proxylist',
...
)
django-proxylist-for-grab has a list of variables that you can configure through Django's settings file. You can see the entire list in Advanced configuration.
Use case¶
Command reference¶
update_proxies¶
Add new proxies from one or more files.
$ python manage.py update_proxies [file1] <file2> <...>
Advanced configuration¶
PROXYLIST_CACHE_TIMEOUT
Maximum number of seconds to maintain a lock in the cache framework.
Default: 0
PROXYLIST_CONNECTION_TIMEOUT
Number of seconds to wait for a connection to open before cancelling the attempt and generating an error.
Default: 30
PROXYLIST_ERROR_DELAY
Number of seconds to add to each check if the last one produced an error.
Default: 300
PROXYLIST_GEOIP_PATH
Path to GeoIP data file.
Default: /usr/share/GeoIP/GeoIP.dat
PROXYLIST_MAX_CHECK_INTERVAL
Maximum number of seconds until the next check if the last one was successful.
Default: 900
PROXYLIST_MIN_CHECK_INTERVAL
Minimum number of seconds until the next check if the last one was successful.
Default: 300
PROXYLIST_OUTIP_INTERVAL
Number of seconds between outbound IP checks (per worker). If you have a fixed IP address you can set this value to 0 (infinity).
Default: 300
PROXYLIST_USER_AGENT
User-Agent for requests.
Default: Django-ProxyList 1.0.0
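The variables above can be gathered into a single settings fragment. The sketch below uses the documented defaults, except for two overrides that are marked in the comments and chosen purely for illustration.

```python
# settings.py (fragment): documented defaults unless a comment says otherwise.
PROXYLIST_CACHE_TIMEOUT = 0
PROXYLIST_CONNECTION_TIMEOUT = 10   # override: give up sooner than the default 30
PROXYLIST_ERROR_DELAY = 300
PROXYLIST_GEOIP_PATH = "/usr/share/GeoIP/GeoIP.dat"
PROXYLIST_MAX_CHECK_INTERVAL = 900
PROXYLIST_MIN_CHECK_INTERVAL = 300
PROXYLIST_OUTIP_INTERVAL = 0        # override: fixed IP address (default is 300)
PROXYLIST_USER_AGENT = "Django-ProxyList 1.0.0"
```

Keeping PROXYLIST_MIN_CHECK_INTERVAL less than or equal to PROXYLIST_MAX_CHECK_INTERVAL follows from their descriptions as the lower and upper bounds on the delay after a successful check.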