Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Crawlab | 10,521 | | | | a year ago | 1 | March 03, 2019 | 58 | bsd-3-clause | Go |
Distributed web crawler admin platform for managing spiders, regardless of language or framework. | ||||||||||
Scrapy Redis | 5,546 | 176 | 21 | 9 months ago | 18 | July 26, 2022 | 29 | mit | Python | |
Redis-based components for Scrapy. | ||||||||||
Haipproxy | 5,384 | 1 | 2 years ago | 7 | June 18, 2018 | 44 | mit | Python | ||
Highly available distributed IP proxy pool, powered by Scrapy and Redis | ||||||||||
Distribute_crawler | 3,176 | | | | 8 years ago | | | 26 | | Python |
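A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite: a MongoDB cluster provides the underlying storage, Redis handles distribution, and Graphite displays crawler status. | ||||||||||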
Scrapy Cluster | 1,137 | 18 | 2 | a year ago | 15 | December 23, 2020 | 17 | mit | Python | |
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster. | ||||||||||
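Python Spider | 680 | | | | 3 years ago | | | | apache-2.0 | Python |
Douban Movie Top 250; scraping JSON data from Douyu; scraping images of women; Taobao; Youyuan; using CrawlSpider to scrape basic profile information from the Hongniang matchmaking site, plus distributed crawling of Hongniang with storage in Redis; small crawler demos; Selenium; scraping Duodian; building APIs with Django; scraping Youyuan profile information; simulated login to Zhihu, GitHub, and Tuchong; scraping the full Duodian mall site; scraping historical articles from WeChat official accounts; scraping articles shared in WeChat groups or by WeChat friends; using itchat to monitor articles shared by specified WeChat official accounts | ||||||||||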
Spiderman | 498 | | | | 2 years ago | | | 3 | mit | Python |
A general-purpose distributed crawler framework based on scrapy-redis | ||||||||||
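Zi5book | 183 | | | | 6 years ago | | | | | Python |
Crawls all Kindle e-books on book.zi5.me, categorized by author and book title; each book comes in mobi and epub formats, and the whole site is crawled in a distributed fashion | ||||||||||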
Zhihuspider | 149 | | | | 7 years ago | | | | mit | Python |
Distributed Zhihu crawler (Scrapy, Redis) | ||||||||||
Crawlerproject | 147 | | | | 3 years ago | | | 20 | | Python |
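Crawler projects: Lianjia (plain / Scrapy), Hupu, Wikipedia, Baidu Maps API, Fangtianxia (distributed crawler), WeChat official accounts (crawled through a proxy pool) | ||||||||||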