Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Ayakashi | 177 | 2 | a year ago | 40 | June 29, 2023 | 8 | other | TypeScript | ||
:zap: Ayakashi.io - The next generation web scraping framework | ||||||||||
Aws Pdf Textract Pipeline | 148 | | | | 5 months ago | | | 5 | mit | TypeScript |
:mag: Data pipeline for crawling PDFs from the Web and transforming their contents into structured data using AWS Textract. Built with AWS CDK + TypeScript | ||||||||||
Crawl Anywhere | 98 | | | | 7 years ago | | | 38 | other | PHP |
Crawl-Anywhere - Web Crawler and document processing pipeline with Solr integration. | ||||||||||
Dotnetcrawler | 63 | | | | 5 years ago | | | | | C# |
DotnetCrawler is a straightforward, lightweight web crawling/scraping library for .NET Core that outputs to Entity Framework Core. It is designed like other strong crawler libraries such as WebMagic and Scrapy, but can be extended to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c |