Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Funpyspidersearchengine | 862 | 2 years ago | 3 | mit | Python | |||||
Word2vec personalized search (results tailored to each user) + Scrapy 2.3.0 (data crawling) + ElasticSearch 7.9.1 (data storage and external RESTful API) + Django 3.1.1 search | ||||||||||
Z_knowledge_graph | 487 | 4 years ago | TSQL | |||||||
Building a knowledge graph from scratch | ||||||||||
Freshonions Torscraper | 313 | 4 years ago | 22 | agpl-3.0 | Python | |||||
Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion | ||||||||||
Gopa | 281 | 3 years ago | 6 | May 19, 2021 | 11 | other | Go | |||
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn | ||||||||||
Java Spider | 276 | 3 years ago | 6 | Java | ||||||
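A hands-on Java crawler framework built on top of the webmagic framework. It can already crawl news content from Tencent, Sohu, Toutiao (integrated as a separate feature), and other sources, and works with elasticsearch to provide automated crawling. It is already in production use. | ||||||||||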
Itsy | 168 | 6 | 9 years ago | 3 | October 05, 2015 | 5 | Clojure | |||
A threaded web-spider written in Clojure | ||||||||||
Hot Samer | 59 | 3 years ago | 4 | JavaScript | ||||||
hot-samer | ||||||||||
Uappexplorer | 41 | 6 years ago | 8 | gpl-3.0 | JavaScript | |||||
Moved to GitLab | ||||||||||
Deadpool | 22 | 4 years ago | Python | |||||||
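A crawler application built with celery as its core framework. Crawl tasks can be added flexibly, and crawlers for multiple sites can run at the same time. All components natively support large-scale concurrency and distribution, and together with celery's native distributed invocation this enables large-scale concurrent crawling. | ||||||||||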
Newscrawler | 13 | 2 years ago | 13 | mit | Python | |||||
News crawler |