Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Proxy_pool | 19,442 | | | | 4 months ago | | | 273 | mit | Python |
Python ProxyPool for web spider | ||||||||||
Pyspider | 15,943 | 30 | 2 | | 10 months ago | 17 | April 18, 2018 | 297 | apache-2.0 | Python |
A powerful spider (web crawler) system in Python. | ||||||||||
Crawlab | 10,521 | | | | 4 months ago | 1 | March 03, 2019 | 58 | bsd-3-clause | Go |
Distributed web crawler admin platform for spider management, regardless of language or framework. | ||||||||||
Scrapy Redis | 5,438 | 176 | 21 | | 5 months ago | 18 | July 26, 2022 | 29 | mit | Python |
Redis-based components for Scrapy. | ||||||||||
Haipproxy | 5,384 | 1 | | | a year ago | 7 | June 18, 2018 | 44 | mit | Python |
:sparkling_heart: Highly available distributed IP proxy pool, powered by Scrapy and Redis | ||||||||||
Proxypool | 5,154 | | | | 4 months ago | | | 40 | mit | Python |
An efficient ProxyPool with Getter, Tester and Server | ||||||||||
Distribute_crawler | 3,176 | | | | 7 years ago | | | 26 | | Python |
A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite: a MongoDB cluster for underlying storage, Redis for distributing the crawl, and Graphite for displaying crawler status. | ||||||||||
Anemone | 1,615 | 385 | 34 | | 4 years ago | 23 | May 30, 2012 | 55 | mit | Ruby |
Anemone web-spider framework | ||||||||||
Scrapy Cluster | 1,137 | 18 | 2 | | 6 months ago | 15 | December 23, 2020 | 17 | mit | Python |
This Scrapy project uses Redis and Kafka to create a distributed, on-demand scraping cluster. | ||||||||||
Listed Company News Crawl And Text Analysis | 689 | | | | a year ago | | | 5 | mit | Python |
Crawls historical news texts for listed companies (individual stocks) from Sina Finance, NBD, JRJ, cnstock.com, and stcn.com; performs text analysis and feature extraction; trains classifiers such as SVM and random forest; then classifies newly crawled news data. | ||||||||||
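Several of the projects above (Scrapy Redis, Scrapy Cluster, Distribute_crawler) distribute a Scrapy crawl over Redis. A minimal sketch of how scrapy-redis plugs into an existing Scrapy project: the scheduler and dupefilter class paths are scrapy-redis's documented defaults, while the spider name, `redis_key`, and Redis URL below are hypothetical values for illustration.

```python
# settings.py -- route scheduling and request deduplication through Redis,
# so multiple crawler processes can share a single request queue.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"
SCHEDULER_PERSIST = True              # keep the queue across restarts
REDIS_URL = "redis://localhost:6379"  # assumed local Redis instance

# spiders/my_spider.py -- a spider fed from Redis instead of start_urls
from scrapy_redis.spiders import RedisSpider

class MySpider(RedisSpider):
    name = "my_spider"                  # hypothetical spider name
    redis_key = "my_spider:start_urls"  # Redis list the crawl is seeded
                                        # from, e.g. via LPUSH in redis-cli

    def parse(self, response):
        yield {"url": response.url,
               "title": response.css("title::text").get()}
```

Every worker started with these settings pulls from the same Redis queue, so scaling out is a matter of launching more identical processes.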