Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Hakrawler | 4,120 | | | | 3 months ago | 11 | February 22, 2021 | 9 | gpl-3.0 | Go |
Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application | ||||||||||
Trafilatura | 2,447 | 66 | 3 months ago | 39 | November 29, 2023 | 66 | gpl-3.0 | Python | ||
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments | ||||||||||
Discv4 Dns Lists | 63 | | | | 3 months ago | | | 2 | | |
Rec A Sketch | 42 | | | | 7 years ago | | | | mit | JavaScript |
content discovery... IN 3D | ||||||||||
Crawlkit | 23 | 6 | 5 | 7 years ago | 34 | May 23, 2016 | 1 | mit | JavaScript | |
A crawler based on Phantom. Allows discovery of dynamic content and supports custom scrapers. | ||||||||||
Domain_discovery_tool_deprecated | 23 | | | | 7 years ago | | | 21 | | JavaScript |
Seed acquisition tool to bootstrap focused crawlers | ||||||||||
Block Crawler | 21 | | | | 6 years ago | | | 3 | mit | JavaScript |
🕸️ discovery tool for legally restricted or censored HTTP resources (code 451 / RFC7725) | ||||||||||
Ndcrawl | 19 | | | | 7 years ago | | | 1 | mit | Python |
CDP/LLDP Network Discovery Crawler via Python/Netmiko | ||||||||||
Content Discovery Hit Lists | 11 | | | | 7 years ago | | | | gpl-3.0 | Roff |
This repository contains hit lists to use for web application content discovery. | ||||||||||
Nutch Indexer Discovery | 9 | | | | 6 years ago | | | 1 | | Java |
Watson Discovery Service indexing plugin for Apache Nutch |