Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Sparkler | 401 | a year ago | 55 | apache-2.0 | Java | |||||
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark. | ||||||||||
Fxdesktopsearch | 168 | 5 months ago | 19 | apache-2.0 | Java | |||||
A JavaFX based desktop search application. | ||||||||||
Nutch Htmlunit | 122 | 9 years ago | 1 | apache-2.0 | Java | |||||
An extension plugin based on Apache Nutch and Htmlunit for crawling and parsing AJAX pages | ||||||||||
Crawl Anywhere | 98 | 7 years ago | 38 | other | PHP | |||||
Crawl-Anywhere - Web Crawler and document processing pipeline with Solr integration. | ||||||||||
Roboto | 63 | 15 | 2 | 7 years ago | 42 | August 24, 2014 | 12 | JavaScript | ||
A web crawler/scraper/spider for nodejs | ||||||||||
Mturk Tracker | 35 | 6 years ago | 1 | other | Python | |||||
Deprecated - Software for gathering historical data from the Amazon Mechanical Turk service | ||||||||||
Darkwebbot | 22 | 7 years ago | other | Python | ||||||
Dark Web Crawler for crawling the hidden onion sites and indexing them in Solr | ||||||||||
Policyfeed | 18 | 12 years ago | 7 | agpl-3.0 | JavaScript | |||||
Government news aggregator | ||||||||||
Rsparkler | 10 | 6 years ago | R | |||||||
RsparkleR provides an R interface for launching virtual machines and deploying Sparkler | ||||||||||
Nutch Crawler | 9 | 9 years ago | 2 | apache-2.0 | Java | |||||
Apache Nutch fork tuned for web services and data discovery.