Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|---|---|---|---|---|
Fscrawler | 1,279 | 1 | | | 3 months ago | 5 | January 10, 2022 | 145 | apache-2.0 | Java | Elasticsearch File System Crawler (FS Crawler). |
Sparkler | 401 | | | | a year ago | | | 55 | apache-2.0 | Java | Spark-Crawler: an Apache Nutch-like crawler that runs on Apache Spark. |
Memex Explorer | 106 | | | | 8 years ago | | | 67 | bsd-2-clause | Python | Viewers for statistics and dashboarding of Domain Search Engine data. |
Harvester | 59 | | | | 7 years ago | | | 3 | gpl-3.0 | JavaScript | Web crawling and document processing through a usable interface. |
Leechcrawler | 8 | | | | 2 years ago | | | 2 | bsd-3-clause | Java | Incremental crawling capabilities for Apache Tika. Crawls content from file systems, HTTP(S) sources (web crawling), IMAP(S) servers, or your own arbitrary data sources. LeechCrawler offers additional Tika parsers that provide these crawling capabilities. |