|
1 | 1 | Redis-based components for Scrapy
|
2 | 2 | =================================
|
3 | 3 |
|
4 |
| -This is a initial work on Scrapy-Redis integration, not production-tested. |
5 |
| -Use it at your own risk! |
| 4 | +This project attempts to provide Redis-backed components for Scrapy. |
6 | 5 |
|
7 | 6 | Features:
|
8 | 7 |
|
9 | 8 | * Distributed crawling/scraping
|
| 9 | + You can start multiple spider instances that share a single redis queue. |
| 10 | + Best suited for broad multi-domain crawls. |
10 | 11 | * Distributed post-processing
|
| 12 | + Scraped items get pushed into a redis queue, meaning that you can start as |
| 13 | + many post-processing processes as needed, all sharing the items queue. |
11 | 14 |
|
12 | 15 | Requirements:
|
13 | 16 |
|
14 |
| -* Scrapy >= 0.13 (development version) |
| 17 | +* Scrapy >= 0.14 |
15 | 18 | * redis-py (tested on 2.4.9)
|
16 |
| -* redis server (tested on 2.2-2.4) |
| 19 | +* redis server (tested on 2.4-2.6) |
17 | 20 |
|
18 | 21 | Available Scrapy components:
|
19 | 22 |
|
@@ -149,6 +152,32 @@ Then:
|
149 | 152 | redis-cli lpush myspider:start_urls http://google.com
|
150 | 153 |
|
151 | 154 |
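The `redis-cli lpush` command above can also be issued from Python. The following is a minimal sketch, not part of the project: `push_start_urls` is a hypothetical helper, and `client` is assumed to be any object exposing an `lpush(name, value)` method, such as a redis-py `redis.Redis()` instance.

```python
def push_start_urls(client, key, urls):
    """Push each URL onto the Redis list `key` and return how many were pushed.

    `client` is assumed to expose an `lpush(name, value)` method, as
    redis-py clients do. This helper is illustrative, not a project API.
    """
    count = 0
    for url in urls:
        # Same effect as: redis-cli lpush <key> <url>
        client.lpush(key, url)
        count += 1
    return count


# Usage against a real server (assumes redis-py is installed and a Redis
# server is listening on localhost:6379):
#
#     import redis
#     push_start_urls(redis.Redis(), "myspider:start_urls", ["http://google.com"])
```

Passing the client in as an argument keeps the helper independent of how the connection is created, so it can be exercised without a running server.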
|
| 155 | +Changelog |
| 156 | +--------- |
| 157 | + |
| 158 | +0.5 |
| 159 | + * Added `REDIS_URL` setting to support a Redis connection string. |
| 160 | + * Added `SCHEDULER_IDLE_BEFORE_CLOSE` setting to prevent the spider from closing |
| 161 | + too quickly when the queue is empty. The default value is zero, which keeps |
| 162 | + the previous behavior. |
| 163 | + |
| 164 | +0.4 |
| 165 | + * Added `RedisSpider` and `RedisMixin` classes as building blocks for spiders |
| 166 | + to be fed through a redis queue. |
| 167 | + * Added redis queue stats. |
| 168 | + * Let the encoder handle the item as it comes instead of converting it to a dict. |
| 169 | + |
| 170 | +0.3 |
| 171 | + * Added support for different queue classes. |
| 172 | + * Changed requests serialization from `marshal` to `cPickle`. |
| 173 | + |
| 174 | +0.2 |
| 175 | + * Improved backward compatibility. |
| 176 | + * Added example project. |
| 177 | + |
| 178 | +0.1 |
| 179 | + * Initial version. |
| 180 | + |
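As a rough illustration of the two settings introduced in 0.5, a project's `settings.py` might contain something like the fragment below. The values are assumptions for illustration only: the URL points at a hypothetical local server, and the idle value is assumed to be a number of seconds, which the changelog does not state.

```python
# settings.py (sketch; all values are illustrative assumptions)

# Connection string for the Redis server (hypothetical local instance).
REDIS_URL = "redis://localhost:6379"

# Assumed to be a wait time in seconds before the spider may close on an
# empty queue. Per the changelog, the default of 0 keeps the previous behavior.
SCHEDULER_IDLE_BEFORE_CLOSE = 10
```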
152 | 181 |
|
153 | 182 | .. image:: https://d2weczhvl823v0.cloudfront.net/darkrho/scrapy-redis/trend.png
|
154 | 183 | :alt: Bitdeli badge
|