Aws Pdf Textract Pipeline

🔍 A data pipeline for crawling PDFs from the web and transforming their contents into structured data using AWS Textract. Built with AWS CDK and TypeScript.
Alternatives To Aws Pdf Textract Pipeline
| Project Name | Stars | Downloads / Repos Using This / Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|---|
| Ayakashi | 177 | 2 | a year ago | 40 | June 29, 2023 | 8 | other | TypeScript | ⚡ Ayakashi.io - The next generation web scraping framework |
| Aws Pdf Textract Pipeline | 148 | | 5 months ago | | | 5 | MIT | TypeScript | 🔍 Data pipeline for crawling PDFs from the web and transforming their contents into structured data using AWS Textract. Built with AWS CDK + TypeScript |
| Crawl Anywhere | 98 | | 7 years ago | | | 38 | other | PHP | Crawl-Anywhere - Web crawler and document processing pipeline with Solr integration. |
| Dotnetcrawler | 63 | | 5 years ago | | | | | C# | DotnetCrawler is a straightforward, lightweight web crawling/scraping library for .NET Core with Entity Framework Core output. It is designed like other strong crawler libraries such as WebMagic and Scrapy, but can be extended to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c |
Categories: TypeScript, Amazon Web Services, Pipeline, Stack, Lambda Functions, Serverless, Jest, S3, CloudFormation, Web Crawler, DynamoDB, Puppeteer, SNS